Efficient Ethereum Log Queries #56

hkalodner · 2019-08-30T23:26:01Z

Currently there is no efficient trustless way to query for ethereum events output by Arbitrum contracts. I'll begin by reviewing how logs work in ethereum (which are the compilation target of events) and what properties they're expected to have.

Ethereum Logs

A log in Ethereum is a list of up to 4 32 byte values referred to as topics along with an arbitrary sized data field. An ethereum block contains a bloom filter which contains each topic within each log output by any transaction within the block. The correctness of this bloom filter is enforced by consensus. If a user wishes to find all logs with a certain topic within a certain range of block heights, they first download all of the block headers and log bloom filters for those blocks and then do an inclusion check on each of the bloom filters. Next they download the receipts for every transaction from each of the blocks which matches the bloom filter. Finally they search each of those transactions for a matching log output.

When contracts output matching events relatively infrequently, this process can be performed much more efficiently than having to check every single transaction. This process is also trustless since bloom filters are guaranteed to have no false negatives. Note that one might attempt to solve this problem by asking an untrusted node to just send you all events matching a given query. It would be easy to verify that all events returned correctly match the query, but it would be impossible to guarantee

Arbitrum Logs

The Arbitrum VM supports two forms of verifiable output. Messages which transfer data and value which must post their data on chain to guarantee data availability to non-validator parties, and logs which are just a commitment to data from the VM and only show up in an accumulator attached to each assertion.

Currently a log entry is created for every transaction processed which essentially specifies the receipt for that transaction which includes the return data and a list of logged events. In order to find all events matching a given query, you must look at every single logged value. This eliminates the bloom filter optimization that Ethereum provides making it much more expensive to filter events.

Suggested changes

Arbitrum could reach a similar level of performance if Arbitrum VM's log receipts for blocks of transactions rather than single transactions. One way to accomplish this would be to log a block of transactions along with a bloom filter periodically, say once per 2 seconds. This would move Arbitrum VM's even closer to mimicking Ethereum's functionality which we have already done in many other ways.

This would allow light clients to calculate events matching a given query efficiently while maintaining an AnyTrust validity guarantee.

edfelten · 2020-01-03T11:53:06Z

The expected setup is that some node has a full list of the log outputs and can supply the result of a query. What we want is for the Arbitrum VM code to emit periodically an accumulator hash of the log history, in such a way that the result of a query can come with a concise proof of correctness, verifiable by a client who only knows a recent accumulator hash.

edfelten · 2020-01-23T19:40:27Z

Propose a design roughly like this:

VM publishes a simple hash-chain* of log items, each with a sequence number.

(If we prefer, the hash-chain* structure can really be a structure like a Merkle mountain range, which allows log-time proof of contents at any sequence number. Or we can stick with simpler linear chain.)

After every N log items, VM also publishes a summary block. Summary blocks are also hash-chained* together. Each summary block includes a Bloom timestamp data structure.

Bloom timestamp data structure is like a Bloom filter, except that instead of having a boolean in each slot, each slot instead records the latest summary block number whose log-item group touches that Bloom filter slot. If a topic hits several Bloom filter slots, we can take the min of the Bloom timestamp slots it touches, and we know that the topic doesn't appear in groups before that one.

hkalodner added enhancement New feature or request arb-compiler-evm labels Aug 30, 2019

hkalodner added this to the Alpha 3 milestone Aug 30, 2019

hkalodner added this to To do in Alpha 3 Aug 30, 2019

hkalodner removed this from To do in Alpha 3 Sep 27, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Efficient Ethereum Log Queries #56

Efficient Ethereum Log Queries #56

hkalodner commented Aug 30, 2019

edfelten commented Jan 3, 2020

edfelten commented Jan 23, 2020

Efficient Ethereum Log Queries #56

Efficient Ethereum Log Queries #56

Comments

hkalodner commented Aug 30, 2019

Ethereum Logs

Arbitrum Logs

Suggested changes

edfelten commented Jan 3, 2020

edfelten commented Jan 23, 2020