Get rid of dual mutex in blockchain.go #1099

Merged
merged 7 commits into from Feb 11, 2022

Contributor

@hqjang-pepper hqjang-pepper commented Jan 3, 2022

Proposed changes

  • This PR removes chainmu in blockchain.go.
  • mu is supposed to protect the internal state from corruption, while chainmu is supposed to protect the chain from multiple concurrent modifications.
  • I found that functions such as Rollback or SetHead can race with chain insertion, because they take a different mutex from the insertion path yet can be called at the same time.

Related Ethereum PR: https://github.com/ethereum/go-ethereum/pull/18436/files

Types of changes

Please put an x in the boxes related to your change.

  • Bugfix
  • New feature or enhancement
  • Others

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING GUIDELINES doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes ($ make test)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Related issues

  • Please leave the issue numbers or links related to this PR here.

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc...

@jeongkyun-oh
Contributor

@hqjang-pepper Could you please check why the test cases failed?

@hqjang-pepper hqjang-pepper self-assigned this Jan 4, 2022
@hqjang-pepper
Contributor Author

The failing checks are not directly caused by the changes in this PR.
I had no problems in local tests, so it is hard to reason about.
I need some time to figure out how to deal with it.

Contributor

@ehnuje ehnuje left a comment

LGTM. However, have you found the root cause of the timeout issue in the CircleCI tests?

@aidan-kwon aidan-kwon added this to the v1.8.0 milestone Jan 21, 2022
@hqjang-pepper
Contributor Author

I missed removing the mutex in WriteBlockWithState.
Leaving it in place causes a race condition when InsertChain is called.
The fix was applied in e93aae4.
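
The thread does not show the contents of e93aae4, so the following is only a sketch under stated assumptions: that InsertChain holds the merged mutex while it calls WriteBlockWithState, and that the lock behaves like Go's `sync.Mutex`, which is not reentrant. Under those assumptions a leftover `Lock` inside the inner function would block forever, which would be consistent with the reported test timeout. The `toyChain` type and its lowercase methods are hypothetical stand-ins, and `TryLock` requires Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"sync"
)

// toyChain is a hypothetical stand-in for the real BlockChain type.
type toyChain struct {
	mu     sync.Mutex
	blocks []uint64
}

// writeBlock assumes the caller already holds c.mu, so it must not lock again.
func (c *toyChain) writeBlock(n uint64) {
	c.blocks = append(c.blocks, n)
}

// insertChain takes the single mutex once and writes every block under it.
func (c *toyChain) insertChain(nums []uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, n := range nums {
		c.writeBlock(n)
	}
}

// nestedLockWouldBlock reports whether a second Lock on c.mu, taken while the
// lock is already held, would have to wait. TryLock is used instead of Lock
// so the demonstration returns instead of hanging.
func (c *toyChain) nestedLockWouldBlock() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.mu.TryLock() {
		c.mu.Unlock()
		return false
	}
	return true
}

func main() {
	c := &toyChain{}
	c.insertChain([]uint64{1, 2, 3})
	fmt.Println("blocks written:", len(c.blocks))
	fmt.Println("nested lock would block:", c.nestedLockWouldBlock())
}
```

A common way to avoid this is the split shown here: the exported entry point takes the lock, and the inner helper documents that it expects the lock to be held.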

@ehnuje
Contributor

ehnuje commented Jan 25, 2022

@hqjang-pepper If so, why didn't it cause any problem in the local test? Was it due to a timing issue?

@hqjang-pepper
Contributor Author

@ehnuje Sorry for the late response.
There was no timeout issue when I first tested with make test, which differed from the CircleCI results. I'm not sure whether the absence of a timeout failure was due to a timing issue. But when I checked again, the timeout issue also occurred in my local environment, and I found the cause by debugging.

@kjhman21
Collaborator

kjhman21 commented Feb 4, 2022

@hqjang-pepper This requires heavy testing, especially because it changes where the locks are acquired. How did you conduct the testing?

Contributor Author

@hqjang-pepper hqjang-pepper left a comment

I conducted a load test on two versions, before and after the mutex integration.
After sending enough requests (more than 1m), the TPS difference between the two versions was 1~10.
That is not a big difference, so we decided to merge this PR.

6 participants