Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix non-deterministic map iteration #12693

Closed
wants to merge 7 commits into from
Closed

fix non-deterministic map iteration #12693

wants to merge 7 commits into from

Conversation

bruce-wayne2
Copy link
Contributor

Description

We should ensure that events emitted are in a deterministic order on any node


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title
  • added ! to the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • followed the guidelines for building modules
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • included comments for documenting Go code
  • updated the relevant documentation or specification
  • reviewed "Files changed" and left comments if necessary
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed ! in the type prefix if API or client breaking change
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic
  • reviewed API design and naming
  • reviewed documentation is accurate
  • reviewed tests and test coverage
  • manually tested (if applicable)

@bruce-wayne2 bruce-wayne2 requested a review from a team as a code owner July 22, 2022 04:35
Copy link
Contributor

@alexanderbez alexanderbez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I don't understand why this needs to be done. Events are not merkle-ized or kept in state.
  2. If there is sufficient reason to sort them, at least use generics to get the keys from the map.

@peterbourgon
Copy link

peterbourgon commented Jul 22, 2022

  • I don't understand why this needs to be done. Events are not merkle-ized or kept in state.

Agreed. And, even more, as events are not part of consensus, their ordering is by definition nondeterministic. So consumers must not rely on any specific ordering. (Consumers also must not assume that events are an accurate representation of consensus state. But I guess that's a different issue altogether.)

@alexanderbez
Copy link
Contributor

Yeah, so I'm not necessarily completely opposed to this, but just seems unnecessary...

@bruce-wayne2
Copy link
Contributor Author

The previous event sending method was in the form of key-value, so it was ordered. Now it is changed to the reflection method, but it has become disordered, and some programs may have abnormal errors.

For example this project: https://github.com/CosmWasm/wasmd/blob/82130463ad6d72548aabd7a2e7daf0d2a9b645dc/x/wasm/keeper/msg_dispatcher.go#L131

The event is passed to the contract here. If the implementation of the contract is to process the event, there will be a problem of inconsistent gas consumption.

@alexanderbez
Copy link
Contributor

Ok, I'm not very clear on the interaction between WASM and event consumption, but I don't see this as too much overhead so I'm cool with getting it in. Can we do two things prior to merge:

  1. Define a generics function to sort keys
  2. Add a changelog entry

:)

@peterbourgon
Copy link

peterbourgon commented Jul 26, 2022

This PR does not impact ordering of events, it impacts ordering of attributes within a given event.

  1. Is that the intent? If so, can the title and changelog entries be updated accordingly?
  2. Does it make sense that the order of event attributes is capable of affecting behavior at a determinism level?

edit: The provided test asserts that the "amount" attr will come before the "denom" attr:

			attrs := em.Events()[0].Attributes
 			s.Require().Len(attrs, 2)
 			s.Require().Equal(attrs[0].Key, "amount")
 			s.Require().Equal(attrs[1].Key, "denom")

If that lexicographical ordering of attributes by key is an invariant provided by Tendermint, then it should be documented accordingly — is it? If so, where? If it's not documented, then is it not the case that code which relies on attribute order is buggy?

@bruce-wayne2
Copy link
Contributor Author

This PR does not impact ordering of events, it impacts ordering of attributes within a given event.

  1. Is that the intent? If so, can the title and changelog entries be updated accordingly?
  2. Does it make sense that the order of event attributes is capable of affecting behavior at a determinism level?

edit: The provided test asserts that the "amount" attr will come before the "denom" attr:

			attrs := em.Events()[0].Attributes
 			s.Require().Len(attrs, 2)
 			s.Require().Equal(attrs[0].Key, "amount")
 			s.Require().Equal(attrs[1].Key, "denom")

If that lexicographical ordering of attributes by key is an invariant provided by Tendermint, then it should be documented accordingly — is it? If so, where? If it's not documented, then is it not the case that code which relies on attribute order is buggy?

This problem is closely related to the implementation of the wasm virtual machine: wasm passes the event to the contract. If the contract wants to get the event that it is interested in, the gas consumption will be inconsistent due to the order of the key values in the event. The specific details may still require wasm team to explain.

CosmWasm/wasmd#910

@peterbourgon
Copy link

The approach in the linked issue is fundamentally unsound, because events are not part of consensus. If you need deterministic execution, you have to use consensus state, you cannot use events.

@tac0turtle
Copy link
Member

@bruce-wayne2 were you able to replicate your bug with this change?

@ValarDragon
Copy link
Contributor

Arguably, isn't it a feature that events not get used in any determinism sensitive context, and places of non determinism help highlight this?

E.g. wasm using events is a bug

@peterbourgon
Copy link

wasm using events is a bug

💯 — events aren't, and can never be, reliably deterministic. Code which expects events to be deterministic is broken.

@alexanderbez
Copy link
Contributor

Arguably, isn't it a feature that events not get used in any determinism sensitive context, and places of non determinism help highlight this?

IMO yes!

@carter-ya
Copy link

carter-ya commented Jul 29, 2022

@marbar3778

We now sort before emit events, this problem no longer occurs

@Vizualni
Copy link
Contributor

Which cosmos-sdk versions are vulnerable?

Copy link
Collaborator

@robert-zaremba robert-zaremba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utack. I think it's always better to have determinism wherever possible.

s.Require().Equal(attrs[1].Key, "denom")
}
})
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be good to add rapid / fuzzy test

@iramiller
Copy link
Contributor

Arguably, isn't it a feature that events not get used in any determinism sensitive context, and places of non determinism help highlight this?

IMO yes!

In the words of Steve Jobs: You're holding it wrong.

Events are critical for many different integrations and they are often the first piece of information people reach for. When problems like this crop up it's worth asking if the users of the software are trying to tell you something about what they need. It would be worth while to seriously scope what it would take to offer determinism guarantees for events and consider adding it in future releases.

@peterbourgon
Copy link

peterbourgon commented Jul 29, 2022

Arguably, isn't it a feature that events not get used in any determinism sensitive context, and places of non determinism help highlight this?

IMO yes!

In the words of Steve Jobs: You're holding it wrong.

Events are critical for many different integrations and they are often the first piece of information people reach for. When problems like this crop up it's worth asking if the users of the software are trying to tell you something about what they need. It would be worth while to seriously scope what it would take to offer determinism guarantees for events and consider adding it in future releases.

What specifically do you mean when you ask for events? Do you mean a stream of messages, produced by a Cosmos full node, and consumed by a separate, likely out-of-process, subscriber? Assuming so, what happens when that subscriber is slow to consume the event stream? Or disconnects and re-connects at an older e.g. height?

If events must be reliable and deterministic, the producer is obliged to tolerate, and compensate for, any delivery pathology exhibited by a consumer. This isn't a tractable problem. You can't let a slow event receiver impact ABCI or consensus or whatever else. If you don't control your subscribers, then events have to be best-effort, which means they can't be deterministic. (For my assumed definition of deterministic, at least.)

I mean, they can be, I guess! But an events API that's deterministic and doesn't allow bad consumers to impact your core control loops or whatever is, in the end, more or less the same as the existing state query API. It certainly ain't a Websocket, for example. So, practically infeasible.

@iramiller
Copy link
Contributor

Assuming so, what happens when that subscriber is slow to consume the event stream?

When you need to reliably deliver events yes your node will slow due to back pressure. When event delivery fails your node should fault and exit. Not every node is a validator. Not every node is current, some are in catch up. If the node's job is to deliver event data reliably then when that is not possible it should fault and exit.

At times I feel like the SDK team forgets they are building an SDK, not the final blockchain app. Users of the SDK will bring their own requirements and infrastructure to the table as well.

@alexanderbez
Copy link
Contributor

alexanderbez commented Jul 30, 2022

Which cosmos-sdk versions are vulnerable?

SDK versions are not vulnerable. I want to make that clear. This is a misunderstanding in how events work. They were never documented or intended to be deterministic in their ordering or delivery.

Events are critical for many different integrations and they are often the first piece of information people reach for. When problems like this crop up it's worth asking if the users of the software are trying to tell you something about what they need.

Yes, absolutely! Relayers, explorers, clients, etc.. are all consumers of events and rightfully so! The State Machine, however, is no such a place for event consumption. As @ValarDragon pointed out, CW should be utilizing Proto *Response data structures instead.

@peterbourgon
Copy link

peterbourgon commented Jul 31, 2022

When you need to reliably deliver events yes your node will slow due to back pressure. When event delivery fails your node should fault and exit. Not every node is a validator. Not every node is current, some are in catch up. If the node's job is to deliver event data reliably then when that is not possible it should fault and exit.

Is a node's primary responsibility to the network which it joins, or the clients which call the endpoints it serves?

At times I feel like the SDK team forgets they are building an SDK, not the final blockchain app. Users of the SDK will bring their own requirements and infrastructure to the table as well.

But the SDK is not actually an SDK, right? When you build a Cosmos app, you just provide the "business logic" so to speak, but Cosmos handles all of the external interactions. So you're not really importing Cosmos into your app, you're basically injecting your app into Cosmos. That makes it more akin a framework, or application container, or something. You don't get to define the rules of the game, you have follow the rules that the SDK establishes.

@tac0turtle
Copy link
Member

tac0turtle commented Jul 31, 2022

Thank you everyone for chiming in. Events are by their nature non deterministic. They always have been, we did try to put them into consensus but had push back from various parts of the ecosystem. This is still on the table but this needs to be enabled on the tendermint side, sdk storing events in consensus would duplicate state.

Secondly, this issue seems to be focused on authz events due to their nature of odd ordering. This isn't a system wide issue, this is based on the Juno halt and other things users have run into in relation to authz. Seems like authz does not have proper fuzz testing in a distributed environment as we saw with the time issue.

At times I feel like the SDK team forgets they are building an SDK, not the final blockchain app. Users of the SDK will bring their own requirements and infrastructure to the table as well.

I don't think its fair to show up on a random pr and say this. We welcome people to open issues in relation to things they are experiencing in the sdk. We have always been open to discuss things to better understand user needs, if you don't make your needs known to us then we cant know what they are, we are trying to keep up with an ever growing ecosystem and their needs. Tendermint has said many times events are nondeterministic and we treat them the same way, this is the first time I'm hearing people wanting this feature other than Robert.

@iramiller
Copy link
Contributor

So you're not really importing Cosmos into your app, you're basically injecting your app into Cosmos.

And every serious network maintains a fork of the Cosmos SDK in order to maintain more control over how their network functions than the out of the box Cosmos SDK implementation provides. This is kind of a shame as in many cases this is only required due to decisions to favor types vs interfaces, intervals over public access making extension without a fork impossible.

Future refactors like the aborted middleware system of v0.46 are going to be a wonderful thing for asserting more control over how a network builds on top of the SDK.

I'm going to stop replying here now as I am sure those working this issue are quite tired of the slightly off topic discussion in their feeds.

@codecov
Copy link

codecov bot commented Aug 1, 2022

Codecov Report

Merging #12693 (0af0802) into main (dc95e33) will increase coverage by 9.03%.
The diff coverage is 100.00%.

❗ Current head 0af0802 differs from pull request most recent head 46cf0a8. Consider uploading reports for the commit 46cf0a8 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #12693      +/-   ##
==========================================
+ Coverage   56.29%   65.32%   +9.03%     
==========================================
  Files         639      689      +50     
  Lines       54457    70711   +16254     
==========================================
+ Hits        30657    46195   +15538     
- Misses      21356    21962     +606     
- Partials     2444     2554     +110     
Impacted Files Coverage Δ
types/events.go 84.65% <100.00%> (+0.35%) ⬆️
x/bank/keeper/msg_server.go 19.27% <0.00%> (-55.43%) ⬇️
x/distribution/client/common/common.go 15.78% <0.00%> (-40.47%) ⬇️
x/auth/keeper/params.go 82.35% <0.00%> (-17.65%) ⬇️
x/bank/types/send_authorization.go 55.55% <0.00%> (-13.41%) ⬇️
store/v2alpha1/multi/migration.go 60.97% <0.00%> (-6.38%) ⬇️
x/auth/module.go 48.14% <0.00%> (-3.86%) ⬇️
x/distribution/module.go 59.43% <0.00%> (-3.57%) ⬇️
x/evidence/module.go 42.50% <0.00%> (-3.45%) ⬇️
x/capability/module.go 61.03% <0.00%> (-3.35%) ⬇️
... and 89 more

@bruce-wayne2 bruce-wayne2 deleted the merge-main branch August 1, 2022 04:05
@bruce-wayne2 bruce-wayne2 restored the merge-main branch August 1, 2022 04:06
@mergify
Copy link
Contributor

mergify bot commented Aug 1, 2022

⚠️ The sha of the head commit of this PR conflicts with #12781. Mergify cannot evaluate rules on this PR. ⚠️

@aaronc
Copy link
Member

aaronc commented Aug 1, 2022

We have discussed (optional) deterministic events along with a number of other event improvements. These things are on our radar but are whole epics in and of themselves. Echoing what @marbar3778 said, if you have specific needs, priorities, etc. let us know and we will take these into consideration as we plan things.

@alexanderbez
Copy link
Contributor

Just pointing this out again -- we should start using *Response data types more effectively. I.e. actually using them.

mergify bot pushed a commit that referenced this pull request Aug 17, 2022
## Description

We should ensure that events emitted are in a deterministic order on any node.


Sorry, the previous [PR-12693](#12693) was closed due to wrong operation.

Close: #12693 



---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [x] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [x] updated the relevant documentation or specification
- [x] reviewed "Files changed" and left comments if necessary
- [x] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
mergify bot pushed a commit that referenced this pull request Aug 17, 2022
## Description

We should ensure that events emitted are in a deterministic order on any node.

Sorry, the previous [PR-12693](#12693) was closed due to wrong operation.

Close: #12693

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [x] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [x] updated the relevant documentation or specification
- [x] reviewed "Files changed" and left comments if necessary
- [x] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 6ed11b8)

# Conflicts:
#	CHANGELOG.md
mergify bot pushed a commit that referenced this pull request Aug 17, 2022
## Description

We should ensure that events emitted are in a deterministic order on any node.

Sorry, the previous [PR-12693](#12693) was closed due to wrong operation.

Close: #12693

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [x] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [x] updated the relevant documentation or specification
- [x] reviewed "Files changed" and left comments if necessary
- [x] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 6ed11b8)

# Conflicts:
#	CHANGELOG.md
tac0turtle pushed a commit that referenced this pull request Aug 18, 2022
* feat: deterministic map iteration (#12781)

## Description

We should ensure that events emitted are in a deterministic order on any node.

Sorry, the previous [PR-12693](#12693) was closed due to wrong operation.

Close: #12693

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [x] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [x] updated the relevant documentation or specification
- [x] reviewed "Files changed" and left comments if necessary
- [x] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 6ed11b8)

# Conflicts:
#	CHANGELOG.md

* resolve conflict (#12947)

* go mod tidy

* fix test

Co-authored-by: bruce-wayne2 <97930236+bruce-wayne2@users.noreply.github.com>
Co-authored-by: Julien Robert <julien@rbrt.fr>
MissingNO57 pushed a commit to fetchai/cosmos-sdk that referenced this pull request Sep 16, 2022
* feat: deterministic map iteration (#12781)

## Description

We should ensure that events emitted are in a deterministic order on any node.

Sorry, the previous [PR-12693](cosmos/cosmos-sdk#12693) was closed due to wrong operation.

Close: #12693
MissingNO57 added a commit to fetchai/cosmos-sdk that referenced this pull request Sep 27, 2022
* fix: update x/mint parameter validation (backport #12384) (#12396)

* chore: optimize get last completed upgrade (#12268)

* feat: improve GetLastCompletedUpgrade

* rename

* use var block

* fix: Simulation is not deterministic due to GenTx (backport #12374) (#12437)

* fix: use go install instead of go get in makefile (#12435)

* chore: fumpt sdk v45 series #12442

* feat: Move AppModule.BeginBlock and AppModule.EndBlock to extension interfaces (backport #12603) (#12638)

* feat: Move AppModule.BeginBlock and AppModule.EndBlock to extension interfaces (#12603)

## Description
Most modules right now have a no-op for AppModule.BeginBlock and AppModule.EndBlock. We should move these methods off the AppModule interface so we have less deadcode, and instead move them to extension interfaces.

1. I added `BeginBlockAppModule` and `EndBlockAppModule` interface.
2. Remove the dead-code from modules that do no implement them
3. Add type casting in the the module code to use the new interface

Closes: #12462

* feat: add message index event attribute to authz message execution (backport #12668) (#12673)

* chore(store): upgrade iavl to v0.19.0 (backport #12626) (#12697)

* feat: Add `GetParamSetIfExists` to prevent panic on breaking param changes (#12724)

* feat: Add convenience method for constructing key to access account's balance for a given denom (backport #12674) (#12745)

* feat: Add convenience method for constructing key to access account's balance for a given denom (#12674)

This PR adds a convenience method for constructing the key necessary to query for the account's balance of a given denom.

I ran into this issue since we are using ABCI query now to perform balance requests because we are also requesting merkle proofs for the returned balance [here](https://github.com/celestiaorg/celestia-node/pull/911/files#diff-0ee31f5a7bd88e9f758e6bebdf3ee36365519e55a451098d9638c39afe5eac42R144).

It would be nice to have a definitive convenience method for constructing the key.

[Ref.](github.com/celestiaorg/celestia-node/pull/911)

(cherry picked from commit a1777a8)

# Conflicts:
#	CHANGELOG.md
#	x/bank/legacy/v043/store.go
#	x/bank/types/keys.go

* Update CHANGELOG.md

* fix conflict

Co-authored-by: rene <41963722+renaynay@users.noreply.github.com>
Co-authored-by: Marko <marbar3778@yahoo.com>

* chore: bump tm in 0.45.x (#12784)

* bump tendermint version

* add changelog entry

* replace on jhump

* updates

* updates

* updates

Co-authored-by: Aleksandr Bezobchuk <aleks.bezobchuk@gmail.com>

* chore: 0.45.7 changelog prep (#12821)

* prepare for release

* modify release notes

* docs(staking): typo in staking/state (backport #12834) (#12836)

* fix(docs): typo in staking/state (#12834)

(cherry picked from commit fe89212)

# Conflicts:
#	x/staking/spec/01_state.md

* updates

Co-authored-by: Ari Rubinstein <arirubinstein@users.noreply.github.com>
Co-authored-by: Julien Robert <julien@rbrt.fr>

* chore: fee payer event (backport #12850) (#12856)

* chore: changelog update (backport #12859) (#12862)

* fix: Use fixed length hex for pointer at FwdCapabilityKey (backport #11737) (#12818)

* fix: Use fixed length hex for pointer at FwdCapabilityKey (backport #11737)

* Update CHANGELOG.md

* add comments and unit tests

* feat: deterministic map iteration (backport #12781) (#12944)

* feat: deterministic map iteration (#12781)

## Description

We should ensure that events emitted are in a deterministic order on any node.

Sorry, the previous [PR-12693](cosmos/cosmos-sdk#12693) was closed due to wrong operation.

Close: #12693

* chore: bump tendermint to `0.34.21` and iavl to `0.19.1` (#12970)

* perf: Amortize clearing unsorted cache entries (Juno genesis fix) (backport #12885) (#12961)

* perf: Amortize clearing unsorted cache entries (Juno genesis fix) (#12885)

This change fixes a bounty by the Juno team.  Juno's invariant checks took 10 hours during their most recent chain halt. This PR cuts that down to 30 seconds.  See https://github.com/CosmosContracts/bounties#improve-speed-of-invariant-checks.

The root problem is deep in the `can-withdraw` invariant check, which calls this repeatedly: https://github.com/cosmos/cosmos-sdk/blob/main/x/distribution/keeper/store.go#L337.  Iterators have a chain of parents and in this case creates an iterator from the `cachekv` store.  For the genesis file, it has a cache of 500,000+ unsorted entries, which are sorted as strings here: https://github.com/cosmos/cosmos-sdk/blob/main/store/cachekv/store.go#L314.  Each delegation from `can-withdraw` uses this cache and many of the cache checks miss or are a very small range.  This means very few entries get removed from the unsorted cache and they have to be re-sorted on the next call.  With a full cache it takes about 180ms on my machine to sort them.

This change introduce a minimum number of entries that will get processed and removed from the unsorted list. It's set at the same value that directs the code to sort them in the first place.  This ensures the unsorted values get removed in a relative short amount of time, and amortizes the cost to ensure an individual check does not have to process the entire cache.

## Benchmarks
On running the benchmarks included in this change produces:
```shell
name                    old time/op    new time/op    delta
LargeUnsortedMisses-32     21.2s ± 9%      0.0s ± 1%   -99.91%  (p=0.000 n=20+17)

name                    old alloc/op   new alloc/op   delta
LargeUnsortedMisses-32    1.64GB ± 0%    0.00GB ± 0%   -99.83%  (p=0.000 n=19+19)

name                    old allocs/op  new allocs/op  delta
LargeUnsortedMisses-32     20.0k ± 0%     41.1k ± 0%  +105.23%  (p=0.000 n=19+20)
```

## Invariant checks results
This is what the invariant checks for Juno look like with this change (on a Hetzner AX101):

```shell
INF starting node with ABCI Tendermint in-process
4:11PM INF Starting multiAppConn service impl=multiAppConn module=proxy
4:11PM INF Starting localClient service connection=query impl=localClient module=abci-client
4:11PM INF Starting localClient service connection=snapshot impl=localClient module=abci-client
4:11PM INF Starting localClient service connection=mempool impl=localClient module=abci-client
4:11PM INF Starting localClient service connection=consensus impl=localClient module=abci-client
4:11PM INF Starting EventBus service impl=EventBus module=events
4:11PM INF Starting PubSub service impl=PubSub module=pubsub
4:11PM INF Starting IndexerService service impl=IndexerService module=txindex
4:11PM INF ABCI Handshake App Info hash= height=0 module=consensus protocol-version=0 software-version=v9.0.0-36-g8fd6f16
4:11PM INF ABCI Replay Blocks appHeight=0 module=consensus stateHeight=0 storeHeight=0
4:12PM INF asserting crisis invariants inv=1/11 module=x/crisis name=gov/module-account
4:12PM INF asserting crisis invariants inv=2/11 module=x/crisis name=distribution/nonnegative-outstanding
4:12PM INF asserting crisis invariants inv=3/11 module=x/crisis name=distribution/can-withdraw
4:12PM INF asserting crisis invariants inv=4/11 module=x/crisis name=distribution/reference-count
4:12PM INF asserting crisis invariants inv=5/11 module=x/crisis name=distribution/module-account
4:12PM INF asserting crisis invariants inv=6/11 module=x/crisis name=bank/nonnegative-outstanding
4:12PM INF asserting crisis invariants inv=7/11 module=x/crisis name=bank/total-supply
4:12PM INF asserting crisis invariants inv=8/11 module=x/crisis name=staking/module-accounts
4:12PM INF asserting crisis invariants inv=9/11 module=x/crisis name=staking/nonnegative-power
4:12PM INF asserting crisis invariants inv=10/11 module=x/crisis name=staking/positive-delegation
4:12PM INF asserting crisis invariants inv=11/11 module=x/crisis name=staking/delegator-shares
4:12PM INF asserted all invariants duration=28383.559601 height=4136532 module=x/crisis
```

## Alternatives
There is another PR which fixes this problem for the Juno genesis file cosmos/cosmos-sdk#12886. However, because of its concurrent nature, it happens to hit a large range relatively early, clearing the unsorted entries and allowing the rest of the checks to not sort it.

(cherry picked from commit 4fc1f73)

# Conflicts:
#	CHANGELOG.md

* fix conflict

Co-authored-by: blazeroni <blazeroni@gmail.com>
Co-authored-by: Julien Robert <julien@rbrt.fr>
Co-authored-by: Marko <marbar3778@yahoo.com>

* fix: proper error when parsing telemetry configuration (backport #12981) (#12999)

* ci: fix release notes not populated by goreleaser (#13019)

Co-authored-by: Julien Robert <julien@rbrt.fr>

* fix: missing return statement in BaseApp.Query (backport #13046) (#13050)

* fix: missing return statement in BaseApp.Query (#13046)

## Description

Closes: #13040

* chore: v0.45.8 release changelog (#13053)

* chore: v0.45.8 release changelog

* fix issue name

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Robert Zaremba <robert@zaremba.ch>
Co-authored-by: Adu <foriteration@gmail.com>
Co-authored-by: Julien Robert <julien@rbrt.fr>
Co-authored-by: Jacob Gadikian <jacobgadikian@gmail.com>
Co-authored-by: Federico Kunze Küllmer <31522760+fedekunze@users.noreply.github.com>
Co-authored-by: rene <41963722+renaynay@users.noreply.github.com>
Co-authored-by: Marko <marbar3778@yahoo.com>
Co-authored-by: Aleksandr Bezobchuk <aleks.bezobchuk@gmail.com>
Co-authored-by: Ari Rubinstein <arirubinstein@users.noreply.github.com>
Co-authored-by: yihuang <huang@crypto.com>
Co-authored-by: blazeroni <blazeroni@gmail.com>
dudong2 pushed a commit to dudong2/lbm-sdk that referenced this pull request Nov 3, 2022
* feat: deterministic map iteration (#12781)

We should ensure that events emitted are in a deterministic order on any node.

Sorry, the previous [PR-12693](cosmos/cosmos-sdk#12693) was closed due to wrong operation.

Close: #12693

---

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [x] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [x] updated the relevant documentation or specification
- [x] reviewed "Files changed" and left comments if necessary
- [x] confirmed all CI checks have passed

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 6ed11b8)

* resolve conflict (#12947)

* go mod tidy

* fix test

Co-authored-by: bruce-wayne2 <97930236+bruce-wayne2@users.noreply.github.com>
Co-authored-by: Julien Robert <julien@rbrt.fr>
omritoptix pushed a commit to dymensionxyz/RDK that referenced this pull request Nov 29, 2022
* feat: deterministic map iteration (#12781)

We should ensure that events emitted are in a deterministic order on any node.

Sorry, the previous [PR-12693](cosmos/cosmos-sdk#12693) was closed due to wrong operation.

Close: #12693

---

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [x] added `!` to the type prefix if API or client breaking change
- [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#pr-targeting))
- [x] provided a link to the relevant issue or specification
- [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/main/docs/building-modules)
- [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/main/CONTRIBUTING.md#testing)
- [x] added a changelog entry to `CHANGELOG.md`
- [x] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [x] updated the relevant documentation or specification
- [x] reviewed "Files changed" and left comments if necessary
- [x] confirmed all CI checks have passed

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)

(cherry picked from commit 6ed11b8)

* resolve conflict (#12947)

* go mod tidy

* fix test

Co-authored-by: bruce-wayne2 <97930236+bruce-wayne2@users.noreply.github.com>
Co-authored-by: Julien Robert <julien@rbrt.fr>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants