
Add consensus_decode_from_finite_reader optimization #1023

Merged
merged 1 commit into rust-bitcoin:master on Jun 1, 2022

Conversation

dpc
Contributor

@dpc dpc commented May 29, 2022

As things are right now, memory exhaustion protection in Decodable
is based on checking input-decoded lengths against arbitrary limits,
and ad-hoc wrapping collection deserialization in Take.

The problems with that are two-fold:

  • Potential consensus bugs due to incorrect limits.
  • Performance degradation when decoding nested structures,
    due to recursive Take<Take<..>> readers.

This change introduces a systematic approach to the problem.

A concept of a "size-limited reader" is introduced: the input reader is
capped at an enforced limit, so oversized data simply runs out of input
and deserialization fails.

Memory exhaustion protection is now achieved by capping allocations
to reasonable values, while still allowing the underlying collections
to grow to accommodate rare but legitimately oversized data (at a tiny
performance cost), combined with the input data size limit.

A set of simple rules allows avoiding recursive Take wrappers.

Fix #997
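
A minimal, self-contained sketch of the chunked, capped-allocation read described above (the names mirror the `read_bytes_from_finite_reader` helper and `ReadBytesFromFiniteReaderOpts` struct that appear in the diff excerpts below, but the body is simplified and returns `io::Result` instead of the crate's `encode::Error`):

    use std::io;

    /// Options for the sketch; these mirror `ReadBytesFromFiniteReaderOpts`.
    struct ReadBytesFromFiniteReaderOpts {
        len: usize,
        chunk_size: usize,
    }

    /// Read `opts.len` bytes from a reader that is already size-limited
    /// (e.g. wrapped in `Take`). Allocation grows by at most `chunk_size`
    /// per iteration, so a maliciously large `len` cannot force a huge
    /// up-front allocation; a reader that ends early makes `read_exact`
    /// fail, aborting deserialization instead of exhausting memory.
    fn read_bytes_from_finite_reader<D: io::Read>(
        mut d: D,
        mut opts: ReadBytesFromFiniteReaderOpts,
    ) -> io::Result<Vec<u8>> {
        // A zero chunk size would make the loop below spin forever.
        assert_ne!(opts.chunk_size, 0);

        let mut ret = vec![];
        while opts.len > 0 {
            let chunk = opts.len.min(opts.chunk_size);
            let start = ret.len();
            ret.resize(start + chunk, 0u8);
            d.read_exact(&mut ret[start..])?;
            opts.len -= chunk;
        }
        Ok(ret)
    }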

/// it returns for OOM protection. See [`Decodable::consensus_decode_from_finite_reader`].
fn consensus_decode_bytes_from_finite_reader<D: io::Read>(mut d: D, mut len: usize) -> Result<Vec<u8>, Error> {
// there should never be any byte vectors longer than this,
// at least in real block data, but if they are we'll just re-allocate
Contributor Author

True?

Contributor Author

Not true, because network messages use it to decode the payload.

Contributor Author

MAX_VEC_SIZE it is then.

src/consensus/encode.rs Outdated
@dpc
Contributor Author

dpc commented May 29, 2022

OK. I have to benchmark this on a laptop with a lot of noise, so I ran the bench multiple times and picked the best results for both master and this PR:

old:
test blockdata::block::benches::bench_block_deserialize                 ... bench:   1,381,111 ns/iter (+/- 18,765)
test blockdata::transaction::benches::bench_transaction_deserialize     ... bench:         133 ns/iter (+/- 5)

test blockdata::block::benches::bench_block_deserialize                 ... bench:   1,431,350 ns/iter (+/- 187,241)
test blockdata::transaction::benches::bench_transaction_deserialize     ... bench:         177 ns/iter (+/- 4)

test blockdata::block::benches::bench_block_deserialize                 ... bench:   1,364,857 ns/iter (+/- 22,960)
test blockdata::transaction::benches::bench_transaction_deserialize     ... bench:         127 ns/iter (+/- 6)


new:
test blockdata::block::benches::bench_block_deserialize                 ... bench:   1,112,535 ns/iter (+/- 45,702)
test blockdata::transaction::benches::bench_transaction_deserialize     ... bench:         145 ns/iter (+/- 3)

test blockdata::block::benches::bench_block_deserialize                 ... bench:   1,093,602 ns/iter (+/- 21,832)
test blockdata::transaction::benches::bench_transaction_deserialize     ... bench:          97 ns/iter (+/- 2)

test blockdata::block::benches::bench_block_deserialize                 ... bench:   1,133,440 ns/iter (+/- 36,818)
test blockdata::transaction::benches::bench_transaction_deserialize     ... bench:         146 ns/iter (+/- 2)

@dpc dpc marked this pull request as ready for review May 29, 2022 05:44
 #[inline]
 fn consensus_decode<D: io::Read>(d: D) -> Result<Self, encode::Error> {
-    Ok(Script(Decodable::consensus_decode(d)?))
+    Self::consensus_decode_from_finite_reader(d.take(MAX_VEC_SIZE as u64))
 }
Member

In ae28a50:

Curious why this isn't just a default impl on the trait? This code looks like it's repeated for every individual type.

Member

Ah, you already have a default impl going the other way.

FWIW I think it would be ok to have both methods default-impl'd to call each other, with doccomments instructing trait implementors to implement one. I believe that std does this somewhere, though I don't remember where..
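
For reference, a rough sketch of the mutually-defaulted shape being suggested here; `MAX_VEC_SIZE` is the crate's real constant, while the `Error` alias, bounds, and doc wording are placeholders rather than the code that was eventually pushed:

    use std::io::{self, Read};

    const MAX_VEC_SIZE: usize = 4_000_000; // the crate's allocation cap
    type Error = io::Error;                // stand-in for consensus::encode::Error

    pub trait Decodable: Sized {
        /// Decode from a reader already known to be size-limited (a buffer of
        /// known length, or a reader wrapped in `Take`). Implementors must
        /// override at least one of the two methods, or the defaults recurse.
        fn consensus_decode_from_finite_reader<D: Read>(d: D) -> Result<Self, Error> {
            // A finite reader can always be decoded the ordinary way.
            Self::consensus_decode(d)
        }

        /// Decode from an arbitrary, possibly unbounded reader.
        fn consensus_decode<D: Read>(d: D) -> Result<Self, Error> {
            // Cap the reader once at the top level and defer to the
            // finite-reader path, so nested fields never need their own `Take`.
            Self::consensus_decode_from_finite_reader(d.take(MAX_VEC_SIZE as u64))
        }
    }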

Contributor Author

I think I like it. I'll give it a try.

Contributor Author

Pushed

@@ -167,7 +167,7 @@ pub fn deserialize<T: Decodable>(data: &[u8]) -> Result<T, Error> {
 /// doesn't consume the entire vector.
 pub fn deserialize_partial<T: Decodable>(data: &[u8]) -> Result<(T, usize), Error> {
     let mut decoder = Cursor::new(data);
-    let rv = Decodable::consensus_decode(&mut decoder)?;
+    let rv = Decodable::consensus_decode_from_finite_reader(&mut decoder)?;
Member

In 3a87775:

I'm a liiitle tempted to change the above line to Cursor::new(&data[..MAX_VEC_SIZE]) but maybe that's a little too weird/surprising.

Contributor Author

I was thinking about it. It would give a little bit of protection if someone somehow passed a huge vector in here. But then, if the caller allows malicious input to create huge vectors, they have kind of already lost anyway. And the ability to decode/encode huge data might be useful for someone who can trust it, e.g. Vec<Block> stored in the file system or something. Arbitrary limits might get in the way.
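
A hypothetical sketch of that trusted-input case, using a single Block for brevity (`deserialize` and `Block` are the crate's real items; the helper and file handling are made up for illustration):

    use bitcoin::consensus::encode::deserialize;
    use bitcoin::Block;
    use std::path::Path;

    // Data read from the caller's own storage: possibly very large, but
    // trusted, so no artificial size cap should get in the way of decoding it.
    fn load_trusted_block(path: &Path) -> Result<Block, Box<dyn std::error::Error>> {
        let bytes = std::fs::read(path)?;
        let block: Block = deserialize(&bytes)?;
        Ok(block)
    }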

apoelstra previously approved these changes May 30, 2022
@apoelstra apoelstra left a comment

ACK 888a1ec

@tcharding
Member

I did a bunch of docs fixes (mainly grammar); if you'd like to use them, I pushed a patch to branch dpc-issue-997-2 on my fork.

I had one question, just out of interest: is there a reason you use such a short line length in the comments?

While cleaning up I've been favouring line length 100 for comments (though I try not to touch them if they are at least 80 and changing them would not make them noticeably nicer).

@dpc
Contributor Author

dpc commented May 31, 2022

I had one question, just out of interest: is there a reason you use such a short line length in the comments?

I typically let rustfmt deal with it, so I wasn't thinking much about it. I'm so used to typing out garbage that rustfmt just fixes for me, that I have a hard enough time with the code, and wasn't even thinking about comments. :D

I did a bunch of docs fixes (mainly grammar); if you'd like to use them, I pushed a patch to branch dpc-issue-997-2 on my fork.

Noted. Will do. Thanks a lot!

@tcharding
Member

I typically let rustfmt deal with it, so I wasn't thinking much about it. I'm so used to typing out garbage that rustfmt just fixes for me, that I have a hard enough time with the code, and wasn't even thinking about comments. :D

ah, no sweat. We should have that fixed up soon hopefully :)

tcharding previously approved these changes May 31, 2022
@tcharding tcharding left a comment

All my comments are nits only.

ACK 888a1ec

chunk_size: usize,
}

/// Read `len` bytes from reader, where `len` could potentially be malicious
Member

This comment is stale; there is no len argument now. This PR was a little confusing to review because the last patch modifies code in the first patch.

src/consensus/encode.rs Outdated
fn read_bytes_from_finite_reader<D: io::Read>(mut d: D, mut opts: ReadBytesFromFiniteReaderOpts) -> Result<Vec<u8>, Error> {
let mut ret = vec![];

debug_assert_ne!(opts.chunk_size, 0);
Member

A question, please, since we have been discussing debug asserts a bit and I'm still not totally understanding when to use what. This debug statement is defensive programming against insane input with chunk size 0 causing an infinite loop, right? What made you choose to use a debug assert instead of either
A. a regular assert
B. returning an error

Contributor Author

I wasn't really sure myself what to do. chunk_size should always be a constant, and this whole function is kind of internal, so I thought checking in debug_ is good enough to spot some silly mistake. I'm still not sure about it. I wish I could make chunk_size a const of some kind.

Contributor Author

I just turned it into assert_ne. The compiler can probably optimize it out anyway.

io::Cursor::new(&data),
ReadBytesFromFiniteReaderOpts { len: data.len(), chunk_size }
).unwrap(),
&data
Member

Why references here? This works too

            assert_eq!(
                read_bytes_from_finite_reader(
                    io::Cursor::new(&data),
                    ReadBytesFromFiniteReaderOpts { len: data.len(), chunk_size }
                ).unwrap(),
                data
            );

Contributor Author

Natural possessiveness, I guess. 🤷

@dpc dpc dismissed stale reviews from tcharding and apoelstra via af2c62f May 31, 2022 01:51
src/network/message.rs Outdated
@tcharding
Member

LGTM, can you squash this all down into a single commit please?

@apoelstra apoelstra left a comment

ACK 082e185

@apoelstra
Member

No comment on choice of assert ;)

@tcharding tcharding left a comment

ACK 082e185

@apoelstra apoelstra merged commit 58a62c0 into rust-bitcoin:master Jun 1, 2022
@sanket1729 sanket1729 left a comment

post merge ACK. Thanks for doing this cleanly

@dpc
Contributor Author

dpc commented Jun 3, 2022

Bad news. Somehow, somewhere I've lost e6a3ec41bb9b3f072453665338059879d982eada. No idea when and why.

Oh, it doesn't even show up.

commit e6a3ec41bb9b3f072453665338059879d982eada (issue-997)
Reflog: HEAD@{87} (Dawid Ciężarkiewicz <dpc@dpc.pw>)
Reflog message: commit (amend): Forward default `consensus_decode` to `consensus_decode_from_finite_reader`
Author: Dawid Ciężarkiewicz <dpc@dpc.pw>
Date:   Mon May 30 18:14:09 2022 -0700

    Forward default `consensus_decode` to `consensus_decode_from_finite_reader`

I'll introduce it in another PR. You might want to take a second look to make sure I didn't mess up something else, just in case. I'm really sorry; no idea right now how it happened.

Edit: I pushed it again, it should show up as 80d8cc2. I intended for it to be in there; it would have made this PR even smaller.

@dpc
Contributor Author

dpc commented Jun 12, 2022

@RCasatta since I know you're using these traits in blocks_iterator and care about performance there, you might want to be aware of this and #1035.

ChallengeDev210 pushed a commit to ChallengeDev210/rust-bitcoin that referenced this pull request Aug 1, 2022
…te_reader` optimization

082e185 Add `consensus_decode_from_finite_reader` optimization (Dawid Ciężarkiewicz)


ACKs for top commit:
  apoelstra:
    ACK 082e185
  tcharding:
    ACK 082e185

Tree-SHA512: fa04b62a4799c9a11c5f85ec78a18fa9c2cd4819c83a0d6148fbb203c6fa15c2689cb0847e612b35b8c285a756d81690b31a9bede4486b845f0c16b9fcc6d097
TheBlueMatt added a commit that referenced this pull request Nov 2, 2022
a0489d4 fuzz: use travis-fuzz.sh in CI (Andrew Poelstra)
4c6f9b3 fuzz: remove mysteriously-not-necessary quotes from gh action script (Andrew Poelstra)
7baa21c fuzz: disable features in honggfuzz (Andrew Poelstra)
e003e57 Add `consensus_decode_from_finite_reader` optimization (Dawid Ciężarkiewicz)

Pull request description:

  Backport for #1023. Required for #997. Addresses issues like #1359

ACKs for top commit:
  tcharding:
    ACK a0489d4
  TheBlueMatt:
    ACK a0489d4.

Tree-SHA512: 9145d9666e35ae77598aaecf89222c7234637b57ded39b69fbb93ee9ce01c6d7c938b36a2d86319ba84155f2424e524386593d6c0d7af449be6bd118f729fd64
Successfully merging this pull request may close these issues.

Block/tx consensus decoding bug