Support filtering & decoding transaction input for txs dataset with --function-signature #145

Open
wants to merge 4 commits into
base: main

cool-mestorf
Contributor

Motivation

Closes #140.

Solution

Most of the code is copied from log_decoder, which creates a lot of duplication; some of it could probably be abstracted. Refactoring to remove the duplicate code, and documentation, are still TBD. A review of the current code structure would be very helpful.

PR Checklist

  • Added Tests
  • Added Documentation
  • Breaking changes

@sslivkoff
Member

hi. just looking through this now

tx decoding would be a great feature. the broad strokes of the code look correct. basing the initial implementation off of the log_decoder code makes a lot of sense. longer term it would be good to migrate all of the decoding to something cleaner with alloy

can you write out some example cli commands that demonstrate the decoding functionality? would like to see the behavior 1) with no function arguments 2) with function arguments 3) with the same function across multiple contracts 4) with unnamed function arguments

the PR has good scope for the initial version. but some future roadmap items to keep in mind, in case they influence current tradeoff decisions: 1) want to be able to provide multiple function signatures instead of just one 2) want to input/fetch entire contract ABIs instead of providing individual signatures manually, so you could do things like decode all txs in the chain history 3) it would be nice to separate the filtering functionality from the decoding functionality. if you can only enter a single function signature, it makes sense to leave them combined, but in the future I think we want to move beyond single function signatures. of course this will make everything more complicated because the current approach would need separate columns for each function. so probably ok to punt that for now
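
As an illustration of roadmap item 2, the following Python sketch (not part of the PR, which is written in Rust) shows how canonical function signatures could be derived from a standard contract ABI JSON, so that signatures would not need to be typed by hand. The helper names `canonical_type` and `function_signatures` are hypothetical, and fetching the ABI from a block explorer is deliberately left out.

```python
import json
from typing import Any, Dict, List


def canonical_type(param: Dict[str, Any]) -> str:
    """Return the canonical ABI type string of one parameter entry."""
    type_name = param["type"]
    if type_name.startswith("tuple"):
        # Tuples are written as "(t1,t2,...)" plus any array suffix such as "[]".
        inner = ",".join(canonical_type(c) for c in param.get("components", []))
        return "(" + inner + ")" + type_name[len("tuple"):]
    return type_name


def function_signatures(abi_json: str) -> List[str]:
    """List canonical signatures for every function entry in an ABI JSON string."""
    signatures = []
    for entry in json.loads(abi_json):
        if entry.get("type") != "function":
            continue  # skip events, errors, constructors, etc.
        arg_types = ",".join(canonical_type(p) for p in entry.get("inputs", []))
        signatures.append(entry["name"] + "(" + arg_types + ")")
    return signatures


# A tiny hand-written ABI with one function and one event.
example_abi = json.dumps([
    {"type": "function", "name": "transfer",
     "inputs": [{"name": "to", "type": "address"},
                {"name": "value", "type": "uint256"}]},
    {"type": "event", "name": "Transfer", "inputs": []},
])
print(function_signatures(example_abi))
```

Each resulting signature could then be handled the same way as a value passed to --function-signature.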

@cool-mestorf
Contributor Author

Thanks for the comments. I generally agree with them and will post some example commands and their outputs soon. Proposals 1 and 3 look very reasonable; I will look over my code and try to separate the filtering and decoding functionality so that multiple function signatures can be given as input. Proposal 2 seems a bit trickier. I will look for references in foundry, etc. on fetching the ABI or contract code from etherscan, or on providing a path to a JSON ABI or contract code.

@cool-mestorf
Contributor Author

Example commands filtering & parsing ERC20 transfers - I used Arbitrum blocks because Ethereum has too many transfers in a single block

  1. without the --function-signature flag (no function-signature filter applied)
block_number,transaction_index,transaction_hash,nonce,from_address,to_address,value_binary,value_string,value_f64,input,gas_limit,gas_used,gas_price,transaction_type,max_priority_fee_per_gas,max_fee_per_gas,success,chain_id
166322086,0,0x4c3fdc37d131a66b43be4026785b4fe480810a1a7818768915a2c6a01fe01397,0,0x00000000000000000000000000000000000a4b05,0x00000000000000000000000000000000000a4b05,0x0000000000000000000000000000000000000000000000000000000000000000,0,0.0,0x6bf6a42d0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000120b1c70000000000000000000000000000000000000000000000000000000009e9dfa60000000000000000000000000000000000000000000000000000000000000001,0,0,0,106,,,true,42161
166322086,1,0x7233be0a295a6e001ffaa95d814837dd4854fa5b803c3cc6ff032759b49a6645,4,0x76bf3a4f7215736947e17bb3ae105ec8b0a459b1,0xff970a61a04b1ca14834a43f5de4533ebddb5cc8,0x0000000000000000000000000000000000000000000000000000000000000000,0,0.0,0xa9059cbb00000000000000000000000041d3d33156ae7c62c094aae2995003ae63f587b300000000000000000000000000000000000000000000000000000000003dc88e,628000,505696,100000000,2,100000000,100000000,true,42161
166322086,2,0x1aa7de79251129874b9be99c86d905aadab84693f3627801a8a1370c8985eb80,3208,0x1a717fd4b642a290d7a54632b1b96e3859422fe5,0x0000000000000000000000000000000001664799,0x0000000000000000000000000000000000000000000000000000000000000000,0,0.0,0x646174613a6170706c69636174696f6e2f6a736f6e2c7b2270223a22724152422d3230222c226f70223a226d696e74222c227469636b223a2272415242222c22736f6c7574696f6e223a22307839373131356635333233346332356461346261653631326434313163666132616465626162323666373331633730393334353166303633323836313337626666222c22616d74223a223130303030227d,1000000,698343,100000000,2,0,100000000,true,42161
166322086,3,0xeda43fd508295778dd90b5da664ec042c7dbe7222cf437ec1b64a3f6a81cd090,43,0xcd00b7d9b23fd2f3b200683a1bd1525a8feeeeb7,0xc20a5f8c060371e6c326302819da52adfc8a57d3,0x000000000000000000000000000000000000000000000000001744eb7ba40a8a,6549702646696586,6549702646696586.0,0x,430054,315157,100000000,2,0,135000000,true,42161
166322086,4,0x476fd25c8d23b5afb5ff0a616866f57861d80b2183b95fc4257b29e622efdc91,249,0xd2739d594375f7a00009871a1e760664758d9a26,0x1231deb6f5749ef6ce6943a275a1d3e7486f4eae,0x000000000000000000000000000000000000000000000000000221b262dd8000,600000000000000,600000000000000.0,0x4630a0d8fc9b5f7350997a624cb7ac9402077d014f8ab70edba2fb4c60fd91e3bf80680b00000000000000000000000000000000000000000000000000000000000000c00000000000000000000000000000000000000000000000000000000000000100000000000000000000000000d2739d594375f7a00009871a1e760664758d9a26000000000000000000000000000000000000000000000000000000000015c6490000000000000000000000000000000000000000000000000000000000000160000000000000000000000000000000000000000000000000000000000000000f6a756d7065722e65786368616e67650000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002a30783030303030303030303030303030303030303030303030303030303030303030303030303030303000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000200000000000000000000000001111111254eeb25477b68fb85ed929f73a9605820000000000000000000000001111111254eeb25477b68fb85ed929f73a9605820000000000000000000000000000000000000000000000000000000000000000000000000000000000000000af88d065e77c8cc2239327c5edb3a432268e5831000000000000000000000000000000000000000000000000000221b262dd800000000000000000000000000000000000000000000000000000000000000000e0000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000a8e449022e000000000000000000000000000000000000000000000000000221b262dd8000000000000000000000000000000000000000000000000000000000000015c616000000000000000000000000000000000000000000000000000000000000006000000000000000000000000000000000000000000000000000000000000000014000000000000000000000006f38e884725a116c9c7fbf208e79fe8828a2595f2e9b30120000000000000000000000000000000000000000
00000000,2270085,1539258,100000000,2,100000000,100000000,true,42161
166322086,5,0xc48669df1dbce283b582022062f7c454100f04cdc65e3537490811e6e31599b2,891425,0x9b64203878f24eb0cdf55c8c6fa7d08ba0cf77e5,0xfd086bc7cd5c481dcc9c85ebe478a1c0b69fcbb9,0x0000000000000000000000000000000000000000000000000000000000000000,0,0.0,0xa9059cbb0000000000000000000000005926e274367794db439f049e6ff4a9d9ecbde35e000000000000000000000000000000000000000000000000000000002234b028,1427965,508457,1100000000,0,,,true,42161
  2. with --function-signature "transfer(address,uint)" - because the human-readable parser does not assign names to the arguments, the calldata decoder gives each argument a placeholder name based on its index (code)
block_number,transaction_index,transaction_hash,nonce,from_address,to_address,value_binary,value_string,value_f64,input,gas_limit,gas_used,gas_price,transaction_type,max_priority_fee_per_gas,max_fee_per_gas,success,chain_id,param__arg_0,param__arg_1_binary,param__arg_1_string,param__arg_1_f64
166322086,1,0x7233be0a295a6e001ffaa95d814837dd4854fa5b803c3cc6ff032759b49a6645,4,0x76bf3a4f7215736947e17bb3ae105ec8b0a459b1,0xff970a61a04b1ca14834a43f5de4533ebddb5cc8,0x0000000000000000000000000000000000000000000000000000000000000000,0,0.0,0xa9059cbb00000000000000000000000041d3d33156ae7c62c094aae2995003ae63f587b300000000000000000000000000000000000000000000000000000000003dc88e,628000,0,100000000,2,100000000,100000000,true,42161,0x41d3d33156ae7c62c094aae2995003ae63f587b3,0x00000000000000000000000000000000000000000000000000000000003dc88e,4049038,4049038.0
166322086,5,0xc48669df1dbce283b582022062f7c454100f04cdc65e3537490811e6e31599b2,891425,0x9b64203878f24eb0cdf55c8c6fa7d08ba0cf77e5,0xfd086bc7cd5c481dcc9c85ebe478a1c0b69fcbb9,0x0000000000000000000000000000000000000000000000000000000000000000,0,0.0,0xa9059cbb0000000000000000000000005926e274367794db439f049e6ff4a9d9ecbde35e000000000000000000000000000000000000000000000000000000002234b028,1427965,505696,1100000000,0,,,true,42161,0x5926e274367794db439f049e6ff4a9d9ecbde35e,0x000000000000000000000000000000000000000000000000000000002234b028,573878312,573878312.0
  3. with --function-signature "transfer(address to, uint value)" - column names are taken from the given function signature
block_number,transaction_index,transaction_hash,nonce,from_address,to_address,value_binary,value_string,value_f64,input,gas_limit,gas_used,gas_price,transaction_type,max_priority_fee_per_gas,max_fee_per_gas,success,chain_id,param__to,param__value_binary,param__value_string,param__value_f64
166322086,1,0x7233be0a295a6e001ffaa95d814837dd4854fa5b803c3cc6ff032759b49a6645,4,0x76bf3a4f7215736947e17bb3ae105ec8b0a459b1,0xff970a61a04b1ca14834a43f5de4533ebddb5cc8,0x0000000000000000000000000000000000000000000000000000000000000000,0,0.0,0xa9059cbb00000000000000000000000041d3d33156ae7c62c094aae2995003ae63f587b300000000000000000000000000000000000000000000000000000000003dc88e,628000,0,100000000,2,100000000,100000000,true,42161,0x41d3d33156ae7c62c094aae2995003ae63f587b3,0x00000000000000000000000000000000000000000000000000000000003dc88e,4049038,4049038.0
166322086,5,0xc48669df1dbce283b582022062f7c454100f04cdc65e3537490811e6e31599b2,891425,0x9b64203878f24eb0cdf55c8c6fa7d08ba0cf77e5,0xfd086bc7cd5c481dcc9c85ebe478a1c0b69fcbb9,0x0000000000000000000000000000000000000000000000000000000000000000,0,0.0,0xa9059cbb0000000000000000000000005926e274367794db439f049e6ff4a9d9ecbde35e000000000000000000000000000000000000000000000000000000002234b028,1427965,505696,1100000000,0,,,true,42161,0x5926e274367794db439f049e6ff4a9d9ecbde35e,0x000000000000000000000000000000000000000000000000000000002234b028,573878312,573878312.0
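
To make the decoded columns above easier to check by hand, here is a minimal Python sketch (not the PR's Rust code) that applies the same filter-then-decode idea to a single `input` value. The function name `decode_transfer_input` is hypothetical; `0xa9059cbb` is the 4-byte selector of `transfer(address,uint256)`, and `uint` in a signature is an alias for `uint256`. The output keys mirror the column names shown in example 2.

```python
from typing import Dict, Optional

TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")  # transfer(address,uint256)


def decode_transfer_input(input_hex: str) -> Optional[Dict[str, object]]:
    """Return decoded transfer arguments, or None if the calldata does not match."""
    hex_body = input_hex[2:] if input_hex.startswith("0x") else input_hex
    data = bytes.fromhex(hex_body)
    # Filtering: the first 4 bytes of calldata must equal the selector.
    if data[:4] != TRANSFER_SELECTOR:
        return None
    # Two static arguments occupy exactly two 32-byte words.
    if len(data) != 4 + 2 * 32:
        return None
    to_word, value_word = data[4:36], data[36:68]
    value = int.from_bytes(value_word, "big")
    return {
        "param__arg_0": "0x" + to_word[12:].hex(),  # address = low 20 bytes
        "param__arg_1_binary": "0x" + value_word.hex(),
        "param__arg_1_string": str(value),
        "param__arg_1_f64": float(value),
    }


# Rebuild the input of transaction_index 1 from its parts instead of pasting it.
row_1_input = (
    "0xa9059cbb"
    + "41d3d33156ae7c62c094aae2995003ae63f587b3".rjust(64, "0")
    + "3dc88e".rjust(64, "0")
)
decoded = decode_transfer_input(row_1_input)
print(decoded["param__arg_0"])         # 0x41d3d33156ae7c62c094aae2995003ae63f587b3
print(decoded["param__arg_1_string"])  # 4049038
```

Inputs with a different selector, such as the `0x6bf6a42d...` call at transaction_index 0, return `None` and would be dropped by the filter.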

@cool-mestorf
Contributor Author

I've fixed some code to handle function signatures with anonymous argument names.

Also, some considerations for future expansion of this feature: I think the current structure keeps filtering (code) and decoding (code) separate. If multiple function signatures are allowed, schema.calldata_decoder would have to become a vector of calldata decoders, or rather a hashmap from function signature to calldata decoder. The extract step would pull the function signature out of the calldata, and the transform step would use it to find the right decoder in the hashmap. Column names would also have to change to avoid collisions between argument names from different function signatures. The easiest way would be to prefix column names with the human-readable signature or the raw function signature (selector), e.g. transfer(address,uint)__param__arg_0 or 0xa9059cbb__param__arg_0.
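
The hashmap idea can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the PR's Rust implementation: `DECODERS`, `decode_address_uint256`, and `decode_row` are hypothetical names, and only two well-known ERC20 selectors with the same `(address, uint256)` layout are registered.

```python
from typing import Callable, Dict

# Hypothetical decoder type: takes the ABI-encoded arguments (calldata without
# the 4-byte selector) and returns {argument_name: value}.
ArgDecoder = Callable[[bytes], Dict[str, object]]


def decode_address_uint256(args: bytes) -> Dict[str, object]:
    """Decode two static 32-byte words as (address, uint256), index-based names."""
    return {
        "arg_0": "0x" + args[12:32].hex(),
        "arg_1": int.from_bytes(args[32:64], "big"),
    }


# Keyed by the raw 4-byte selector so lookups need no signature parsing.
DECODERS: Dict[bytes, ArgDecoder] = {
    bytes.fromhex("a9059cbb"): decode_address_uint256,  # transfer(address,uint256)
    bytes.fromhex("095ea7b3"): decode_address_uint256,  # approve(address,uint256)
}


def decode_row(input_hex: str) -> Dict[str, object]:
    """Return selector-prefixed columns, or {} when no decoder matches."""
    hex_body = input_hex[2:] if input_hex.startswith("0x") else input_hex
    data = bytes.fromhex(hex_body)
    selector, args = data[:4], data[4:]   # "extract" step
    decoder = DECODERS.get(selector)      # "transform" step: pick the decoder
    # Both registered decoders expect at least two 32-byte words.
    if decoder is None or len(args) < 64:
        return {}
    prefix = "0x" + selector.hex()
    # Prefixing with the selector keeps arg_0 of different functions apart.
    return {f"{prefix}__param__{name}": value
            for name, value in decoder(args).items()}


transfer_input = (
    "0xa9059cbb"
    + "41d3d33156ae7c62c094aae2995003ae63f587b3".rjust(64, "0")
    + "3dc88e".rjust(64, "0")
)
print(decode_row(transfer_input))
```

A real multi-signature schema would also need to fill the columns of non-matching functions with nulls for each row, which this sketch does not attempt.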

I have two concerns: 1. how to support calldata with an unknown human-readable signature but a known schema, and 2. whether there is a need to filter only, without parsing the calldata into separate columns.

@hrik2001

Hey, sorry for dropping by randomly. But really looking forward to this feature. When do you folks think that this would be merged?

Development

Successfully merging this pull request may close these issues.

Support for filtering txs dataset with function signature, as well as optional calldata decoding
3 participants