Graviton3 support – intermittent crash/coredump #5797

Open
arasmussen opened this issue Apr 8, 2024 · 1 comment

@arasmussen

What's going wrong?

  • The pm2 daemon crashes intermittently (a coredump is logged in /var/log/messages).

How could we reproduce this issue?

  • It doesn't reproduce consistently. Roughly 10% of our deploys (pm2 startOrReload) end in a crash/coredump.
  • We just moved from Graviton2 to Graviton3, so I suspect the crash is caused by an incompatibility with Graviton3.

Supporting information

--- PM2 report ----------------------------------------------------------------
Date                 : Mon Apr 08 2024 18:32:37 GMT+0000 (Coordinated Universal Time)
===============================================================================
--- Daemon -------------------------------------------------
pm2d version         : 5.3.0
node version         : 18.17.0
node path            : /home/ec2-user/.nvm/versions/node/v18.17.0/bin/pm2
argv                 : /home/ec2-user/.nvm/versions/node/v18.17.0/bin/node,/home/ec2-user/.nvm/versions/node/v18.17.0/lib/node_modules/pm2/lib/Daemon.js
argv0                : node
user                 : ec2-user
uid                  : 1000
gid                  : 1000
uptime               : 57min
===============================================================================
--- CLI ----------------------------------------------------
local pm2            : 5.3.0
node version         : 18.17.0
node path            : /home/ec2-user/.nvm/versions/node/v18.17.0/bin/pm2
argv                 : /home/ec2-user/.nvm/versions/node/v18.17.0/bin/node,/home/ec2-user/.nvm/versions/node/v18.17.0/bin/pm2,report
argv0                : node
user                 : ec2-user
uid                  : 1000
gid                  : 1000
===============================================================================
--- System info --------------------------------------------
arch                 : arm64
platform             : linux
type                 : Linux
cpus                 : unknown
cpus nb              : 4
freemem              : 14946652160
totalmem             : 16449142784
home                 : /home/ec2-user
===============================================================================
--- PM2 list -----------------------------------------------
┌────┬───────────┬─────────────┬─────────┬─────────┬──────────┬────────┬──────┬───────────┬──────────┬──────────┬──────────┬──────────┐
│ id │ name      │ namespace   │ version │ mode    │ pid      │ uptime │ ↺    │ status    │ cpu      │ mem      │ user     │ watching │
└────┴───────────┴─────────────┴─────────┴─────────┴──────────┴────────┴──────┴───────────┴──────────┴──────────┴──────────┴──────────┘
===============================================================================
--- Daemon logs --------------------------------------------
/home/ec2-user/.pm2/pm2.log last 20 lines:
PM2        | 2024-04-08T17:26:16: PM2 log: pid=149369 msg=failed to kill - retrying in 100ms
PM2        | 2024-04-08T17:26:16: PM2 log: pid=137480 msg=failed to kill - retrying in 100ms
PM2        | 2024-04-08T17:26:16: PM2 log: Process with pid 149369 still alive after 30000ms, sending it SIGKILL now...
PM2        | 2024-04-08T17:26:17: PM2 log: Process with pid 137480 still alive after 30000ms, sending it SIGKILL now...
PM2        | 2024-04-08T17:34:44: PM2 log: ===============================================================================
PM2        | 2024-04-08T17:34:45: PM2 log: --- New PM2 Daemon started ----------------------------------------------------
PM2        | 2024-04-08T17:34:45: PM2 log: Time                 : Mon Apr 08 2024 17:34:45 GMT+0000 (Coordinated Universal Time)
PM2        | 2024-04-08T17:34:45: PM2 log: PM2 version          : 5.3.0
PM2        | 2024-04-08T17:34:45: PM2 log: Node.js version      : 18.17.0
PM2        | 2024-04-08T17:34:45: PM2 log: Current arch         : arm64
PM2        | 2024-04-08T17:34:45: PM2 log: PM2 home             : /home/ec2-user/.pm2
PM2        | 2024-04-08T17:34:45: PM2 log: PM2 PID file         : /home/ec2-user/.pm2/pm2.pid
PM2        | 2024-04-08T17:34:45: PM2 log: RPC socket file      : /home/ec2-user/.pm2/rpc.sock
PM2        | 2024-04-08T17:34:45: PM2 log: BUS socket file      : /home/ec2-user/.pm2/pub.sock
PM2        | 2024-04-08T17:34:45: PM2 log: Application log path : /home/ec2-user/.pm2/logs
PM2        | 2024-04-08T17:34:45: PM2 log: Worker Interval      : 30000
PM2        | 2024-04-08T17:34:45: PM2 log: Process dump file    : /home/ec2-user/.pm2/dump.pm2
PM2        | 2024-04-08T17:34:45: PM2 log: Concurrent actions   : 2
PM2        | 2024-04-08T17:34:45: PM2 log: SIGTERM timeout      : 1600
PM2        | 2024-04-08T17:34:45: PM2 log: ===============================================================================
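
For anyone trying to line these crashes up with deploy times, here is a minimal sketch that pulls the timestamp of every "New PM2 Daemon started" banner out of a pm2.log excerpt like the one above. The function name `daemon_start_times` is hypothetical, and the regex assumes the banner and timestamp format shown in this report. A daemon start is only a hint, not proof of a crash, since a manual `pm2 kill` or `pm2 update` also produces the banner.

```python
import re
from datetime import datetime
from pathlib import Path

# Matches the banner PM2 writes when a new daemon starts, in the format seen
# in the log excerpt above (assumption: other PM2 versions may differ), e.g.
# "2024-04-08T17:34:45: PM2 log: --- New PM2 Daemon started ---..."
DAEMON_START = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}): PM2 log: --- New PM2 Daemon started"
)


def daemon_start_times(log_text: str) -> list[datetime]:
    """Return the timestamp of every daemon-start banner found in log_text."""
    return [
        datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        for match in DAEMON_START.finditer(log_text)
    ]


if __name__ == "__main__":
    # Default log location taken from the report above (PM2 home is ~/.pm2).
    log_path = Path.home() / ".pm2" / "pm2.log"
    if log_path.exists():
        for started in daemon_start_times(log_path.read_text(errors="replace")):
            print(started.isoformat())
```

Comparing the printed timestamps against the coredump entries in /var/log/messages and against deploy times should show whether each unexpected daemon start follows a `pm2 startOrReload`.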

@arasmussen arasmussen changed the title Graviton3 support Graviton3 support – crash/coredump Apr 8, 2024
@arasmussen arasmussen changed the title Graviton3 support – crash/coredump Graviton3 support – intermittent crash/coredump Apr 8, 2024
@arasmussen arasmussen changed the title Graviton3 support – intermittent crash/coredump Graviton 3 support – intermittent crash/coredump Apr 8, 2024
@arasmussen
Author

Just migrated our instances back to Graviton2 (from m7g to m6g) and confirmed that we cannot reproduce the issue there.

@arasmussen arasmussen changed the title Graviton 3 support – intermittent crash/coredump Graviton3 support – intermittent crash/coredump Apr 8, 2024