Nodes appear to shut off without submitting power state change and uptime report #2332

Open
scottyeager opened this issue May 16, 2024 · 0 comments
Labels
type_bug Something isn't working

A farmer reported that their farmerbot often attempts RMB communication with nodes that appear to be in the standby state. Upon investigation, I found that the nodes appear to shut off without setting their power state or submitting a final uptime report.

It's somewhat inconclusive, because the nodes simply cease to submit any logs, but it seems rather unlikely to me that nodes would just happen to become unresponsive exactly when they are supposed to power down due to farmerbot.

Here are two examples:

Node 5751

NodeUptimeReported(uptime=336, timestamp=1715600454, event_index=2)
Node booted at 1715600118

PowerStateChanged(state='Up', timestamp=1715600460, event_index=32)

PowerTargetChanged(target='Down', timestamp=1715602740, event_index=32)

PowerTargetChanged(target='Up', timestamp=1715627700, event_index=32)

NodeUptimeReported(uptime=386, timestamp=1715628528, event_index=7)

From farmerbot logs:

2:36PM (CEST) ERR error="failed to update node 5751 with error: failed to get node 5751 statistics from rmb with error: context deadline exceeded"

And the node logs (these are the final messages received before the node's next boot):

2024-05-13 14:19:06	[+] noded: 2024-05-13T12:19:06Z info got power target change event node=5620
2024-05-13 14:19:00	[-] powerd: 2024/05/13 12:19:00 Connecting to wss://03.tfchain.grid.tf/...
2024-05-13 14:19:00	[+] noded: 2024-05-13T12:19:00Z info got power target change event node=5751
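
To line the chain events up against the log lines, it helps to convert the Unix timestamps to UTC and to derive boot times from the uptime reports (report timestamp minus uptime). The following is only a rough sketch using the Node 5751 values quoted above; the helper names `utc` and `boot_time` are my own and not part of any grid tooling.

```python
from datetime import datetime, timezone

def utc(ts: int) -> str:
    """Format a Unix timestamp (in seconds) as an ISO-8601 UTC string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

def boot_time(report_ts: int, uptime: int) -> int:
    """Boot time implied by an uptime report: report timestamp minus uptime."""
    return report_ts - uptime

# Values copied from the Node 5751 events in this issue.
first_boot = boot_time(1715600454, 336)   # first NodeUptimeReported
target_down = 1715602740                  # PowerTargetChanged(target='Down')
target_up = 1715627700                    # PowerTargetChanged(target='Up')
next_boot = boot_time(1715628528, 386)    # next NodeUptimeReported

print("first boot :", utc(first_boot))
print("target Down:", utc(target_down))
print("target Up  :", utc(target_up))
print("next boot  :", utc(next_boot))
print("seconds from target Up to next boot:", next_boot - target_up)
```

With these values the Down target converts to 2024-05-13T12:19:00Z, the same second as the last `noded` line for node 5751, and the derived first boot time agrees with the "Node booted at 1715600118" figure.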

Node 2103

NodeUptimeReported(uptime=306, timestamp=1715598222, event_index=12)
Node booted at 1715597916

PowerStateChanged(state='Up', timestamp=1715598234, event_index=41)

PowerTargetChanged(target='Down', timestamp=1715600658, event_index=17)

PowerTargetChanged(target='Up', timestamp=1715627694, event_index=37)

NodeUptimeReported(uptime=329, timestamp=1715628318, event_index=27)

From farmerbot logs:

1:59PM ERR error="failed to update node 2103 with error: failed to get node 2103 statistics from rmb with error: context deadline exceeded"

And the node logs:

2024-05-13 13:45:27	[+] redis: 2237:M 13 May 2024 11:45:27.196 * Background saving terminated with success
2024-05-13 13:45:27	[+] redis: 7562:C 13 May 2024 11:45:27.097 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
2024-05-13 13:45:27	[+] redis: 7562:C 13 May 2024 11:45:27.097 * DB saved on disk
2024-05-13 13:45:27	[+] redis: 2237:M 13 May 2024 11:45:27.095 * Background saving started by pid 7562
2024-05-13 13:45:27	[+] redis: 2237:M 13 May 2024 11:45:27.095 * 100 changes in 300 seconds. Saving...
2024-05-13 13:44:18	[+] noded: 2024-05-13T11:44:18Z info got power target change event node=2103
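
The pattern in both examples is a Down target that is never followed by a `PowerStateChanged` to Down before the next Up target. A minimal sketch of how one might scan a node's event history for that pattern is below. The `(event_name, value, timestamp)` tuple layout and the function name are assumptions made for illustration, not the format of any real chain client, and the check only covers the power state, not the missing final uptime report.

```python
from typing import List, Optional, Tuple

# Hypothetical event layout: (event_name, value, timestamp).
Event = Tuple[str, object, int]

def unconfirmed_power_downs(events: List[Event]) -> List[Tuple[int, int]]:
    """Return (down_target_ts, up_target_ts) pairs for which no
    PowerStateChanged('Down') was seen between the two target changes."""
    pending_down: Optional[int] = None
    gaps: List[Tuple[int, int]] = []
    for name, value, ts in sorted(events, key=lambda e: e[2]):
        if name == "PowerTargetChanged" and value == "Down":
            pending_down = ts
        elif name == "PowerStateChanged" and value == "Down":
            pending_down = None  # the node confirmed its shutdown
        elif name == "PowerTargetChanged" and value == "Up":
            if pending_down is not None:
                gaps.append((pending_down, ts))
            pending_down = None
    return gaps

# Events copied from the Node 2103 example in this issue.
NODE_2103_EVENTS: List[Event] = [
    ("NodeUptimeReported", 306, 1715598222),
    ("PowerStateChanged", "Up", 1715598234),
    ("PowerTargetChanged", "Down", 1715600658),
    ("PowerTargetChanged", "Up", 1715627694),
    ("NodeUptimeReported", 329, 1715628318),
]

for down_ts, up_ts in unconfirmed_power_downs(NODE_2103_EVENTS):
    print(f"Down target at {down_ts} never confirmed; next Up target at {up_ts}")
```

For a node that shuts down cleanly, I would expect a `PowerStateChanged` to Down inside that window, in which case the function returns an empty list.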