Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extreme CPU usage when sync has hung #1634

Closed
teor2345 opened this issue Jan 25, 2021 · 1 comment
Closed

Extreme CPU usage when sync has hung #1634

teor2345 opened this issue Jan 25, 2021 · 1 comment
Labels
C-bug Category: This is a bug

Comments

@teor2345
Copy link
Contributor

teor2345 commented Jan 25, 2021

Version

zebrad a merge of PRs #1620 and #1622, around 2021-01-22.

Platform

Linux ... 5.4.89 #1-NixOS SMP Tue Jan 12 19:16:25 UTC 2021 x86_64 GNU/Linux

Description

The zebrad syncer hangs when there are no peers (#1633), with a single core using 100% CPU.

I tried this:

Running zebrad on mainnet and testnet.

I expected to see this happen:

zebrad uses a small amount of CPU, even if it has no peers.

Instead, this happened:

The ChainSync component stopped logging output, until a peer connected to the zebrad Zcash listener port, but then it went back to hanging after a few minutes.

The 100% CPU usage continued during this time.

Analysis

This bug seems related to the network hangs in #1435.
It could be caused by buffer misuse in #1593, or another underlying cause of the syncer hang in #1633.

It might also be caused by FuturesUnordered bugs in tokio-rs/tokio#3493 (comment) , mitigated by rust-lang/futures-rs#2333.

Commands

zebrad start

Logs

Jan 25 12:53:22.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 12:54:37.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 12:55:52.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 12:56:40.267  INFO {zebrad="a49f4943" net="Main"}:peer{addr=127.0.0.1:40296}:msg_as_req{msg=getheaders}:state: zebra_state::service: responding to peer GetBlocks or GetHeaders final_height=Height(1119240) response_len=158 chain_tip_height=Height(1122481)
stop_height=None intersection_height=Some(Height(1119082))
Jan 25 12:56:46.288  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 12:56:46.288  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 12:57:47.289  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 12:57:47.289  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Height(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 12:57:47.290  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 1 peers to answer requests
Jan 25 12:57:53.291  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 12:57:53.291  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 12:58:54.292  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 12:58:54.292  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Height(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 12:58:54.292  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 1 peers to answer
 requests
Jan 25 12:59:00.294  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 12:59:00.294  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 13:00:01.295  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 13:00:01.295  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Heigh
t(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 13:00:01.295  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 1 peers to answer
 requests
Jan 25 13:00:07.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 13:00:07.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 13:01:08.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 13:00:07.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 13:01:08.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 13:01:08.298  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Heigh
t(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 13:02:07.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:03:22.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:04:37.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:05:52.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:07:07.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:08:22.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:09:37.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
@teor2345 teor2345 added C-bug Category: This is a bug S-needs-triage Status: A bug report needs triage labels Jan 25, 2021
@teor2345 teor2345 added this to the 2021 Sprint 2 milestone Jan 25, 2021
@teor2345 teor2345 added this to No Estimate in Effort Affinity Grouping via automation Jan 25, 2021
@teor2345 teor2345 added this to To Do in 🦓 via automation Jan 25, 2021
@teor2345 teor2345 moved this from No Estimate to S - 3 in Effort Affinity Grouping Jan 25, 2021
@mpguerra mpguerra modified the milestones: 2021 Sprint 2, 2021 Sprint 3 Jan 25, 2021
@teor2345 teor2345 added P-Low and removed S-needs-triage Status: A bug report needs triage labels Jan 28, 2021
@teor2345 teor2345 modified the milestones: 2021 Sprint 3, 2021 Sprint 4 Feb 8, 2021
@mpguerra mpguerra removed this from the 2021 Sprint 4 milestone Feb 22, 2021
@teor2345
Copy link
Contributor Author

teor2345 commented Mar 4, 2021

The sync doesn't hang any more, and I haven't seen this CPU issue for a while.

@teor2345 teor2345 closed this as completed Mar 4, 2021
🦓 automation moved this from To Do to Done Mar 4, 2021
Effort Affinity Grouping automation moved this from S - 3 to Done Mar 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug
Projects
No open projects
Development

No branches or pull requests

2 participants