[vm] Use prod configs for aptos #13317

Merged: 7 commits merged on May 23, 2024

Contributor

@georgemitenkov georgemitenkov commented May 17, 2024

Description

This PR cherry-picks a simple change from #13276 so that Aptos production configs live in one place and are shared across the codebase. In particular:

Commit 1: introduce aptos_prod_..._config() functions to have a single way of creating production configurations.
Commit 2: move VM randomness config to the same crate to have all configs in a single place.
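
The idea behind the two commits can be illustrated with a minimal sketch. Every type, function name, and value below is hypothetical and chosen for illustration only; none of them is taken from the PR. The point is simply that all call sites go through one constructor, so production settings cannot drift apart between crates.

```rust
// Hypothetical sketch: every name and value here is illustrative, not from the PR.

/// Stand-in for the on-chain feature flags that production configs depend on.
#[derive(Clone, Debug, Default)]
pub struct Features {
    pub paranoid_type_checks: bool,
}

/// Stand-in for a bytecode verifier configuration.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct VerifierConfig {
    pub max_loop_depth: Option<usize>,
}

/// Stand-in for a VM configuration that embeds the verifier configuration.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct VmConfig {
    pub verifier: VerifierConfig,
    pub paranoid_type_checks: bool,
}

/// The one place where production verifier settings are decided.
pub fn prod_verifier_config(_features: &Features) -> VerifierConfig {
    // Placeholder limit, not a real production value.
    VerifierConfig { max_loop_depth: Some(5) }
}

/// The one place where production VM settings are decided.
pub fn prod_vm_config(features: &Features) -> VmConfig {
    VmConfig {
        verifier: prod_verifier_config(features),
        paranoid_type_checks: features.paranoid_type_checks,
    }
}

/// Two hypothetical call sites (say, a node and an offline tool). Both
/// delegate to the shared constructor instead of building configs by hand.
pub fn config_for_node(features: &Features) -> VmConfig {
    prod_vm_config(features)
}

pub fn config_for_tool(features: &Features) -> VmConfig {
    prod_vm_config(features)
}
```

With this shape, changing a production setting means editing one function, and any caller that still builds a config by hand stands out in review.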

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify)

How Has This Been Tested?

Key Areas to Review

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

trunk-io bot commented May 17, 2024

⏱️ 107h 37m total CI duration on this PR
Job Cumulative Duration Recent Runs
replay-mainnet / replay-verify (15) 6h 8m 🟩
replay-testnet / replay-verify (18) 6h 6m 🟩
replay-mainnet / replay-verify (16) 5h 44m 🟩
replay-mainnet / replay-verify (18) 5h 18m 🟩
replay-mainnet / replay-verify (11) 4h 56m 🟩
replay-testnet / replay-verify (16) 4h 38m 🟩
replay-mainnet / replay-verify (12) 4h 33m 🟩
replay-mainnet / replay-verify (17) 4h 24m 🟩
replay-mainnet / replay-verify (10) 3h 16m 🟩
windows-build 2h 44m 🟩🟩🟩🟩
replay-mainnet / replay-verify (14) 2h 43m 🟩
rust-targeted-unit-tests 2h 40m 🟩🟩🟩🟩 (+3 more)
replay-testnet / replay-verify (10) 2h 32m 🟩
replay-testnet / replay-verify (9) 2h 28m 🟩
replay-mainnet / replay-verify (3) 2h 21m 🟩
replay-testnet / replay-verify (0) 2h 3m 🟩
replay-testnet / replay-verify (12) 2h 3m 🟩
replay-mainnet / replay-verify (13) 1h 54m 🟩
replay-testnet / replay-verify (8) 1h 52m 🟩
replay-testnet / replay-verify (15) 1h 50m 🟩
replay-testnet / replay-verify (11) 1h 48m 🟩
rust-smoke-tests 1h 46m 🟩🟥🟩🟩
execution-performance / single-node-performance 1h 42m 🟩🟩🟩🟩
replay-mainnet / replay-verify (9) 1h 41m 🟩
replay-mainnet / replay-verify (6) 1h 36m 🟩
rust-move-tests 1h 30m 🟥🟩🟩🟩 (+4 more)
replay-mainnet / replay-verify (8) 1h 30m 🟩
replay-mainnet / replay-verify (4) 1h 29m 🟩
replay-mainnet / replay-verify (2) 1h 24m 🟩🟩
replay-mainnet / replay-verify (7) 1h 19m 🟩
replay-mainnet / replay-verify (5) 1h 18m 🟩
replay-testnet / replay-verify (6) 1h 18m 🟩
replay-testnet / replay-verify (1) 1h 17m 🟩
replay-mainnet / replay-verify (0) 1h 15m 🟩🟩
replay-testnet / replay-verify (2) 1h 14m 🟩
replay-testnet / replay-verify (17) 1h 13m 🟩
replay-testnet / replay-verify (14) 1h 12m 🟩
replay-testnet / replay-verify (13) 1h 3m 🟥🟩
rust-move-unit-coverage 59m 🟩🟩🟩🟩
replay-testnet / replay-verify (4) 57m 🟥🟩
replay-testnet / replay-verify (5) 56m 🟩
replay-testnet / replay-verify (3) 55m 🟩
replay-testnet / replay-verify (7) 52m 🟩
rust-lints 44m 🟥🟥🟥🟩 (+3 more)
forge-framework-upgrade-test / forge 43m 🟥🟩🟩
forge-e2e-test / forge 41m 🟩🟩🟩
replay-mainnet / replay-verify (1) 40m 🟩🟩
forge-compat-test / forge 39m 🟩🟩🟩
rust-images / rust-all 39m 🟩🟥🟩🟩
cli-e2e-tests / run-cli-tests 38m 🟥🟥🟥
run-tests-main-branch 30m 🟩🟩🟩🟩 (+3 more)
rust-build-cached-packages 25m 🟩🟩🟩🟩
test-target-determinator 17m 🟩🟩🟩🟩
execution-performance / test-target-determinator 17m 🟩🟩🟩🟩
check 17m 🟩🟩🟩🟩
check-dynamic-deps 13m 🟩🟩🟩🟩🟩 (+4 more)
general-lints 12m 🟩🟩🟩🟩 (+3 more)
semgrep/ci 4m 🟩🟩🟩🟩🟩 (+4 more)
node-api-compatibility-tests / node-api-compatibility-tests 3m 🟩🟩🟩
file_change_determinator 2m 🟩🟩🟩🟩🟩 (+4 more)
file_change_determinator 1m 🟩🟩🟩🟩🟩 (+4 more)
file_change_determinator 50s 🟩🟩🟩🟩
permission-check 27s 🟩🟩🟩🟩🟩 (+4 more)
permission-check 27s 🟩🟩🟩🟩🟩 (+4 more)
permission-check 27s 🟩🟩🟩🟩🟩 (+4 more)
permission-check 20s 🟩🟩🟩🟩🟩 (+4 more)
permission-check 15s 🟩🟩🟩🟩
determine-test-metadata 14s 🟩🟩
determine-docker-build-metadata 6s 🟩🟩🟩🟩

🚨 4 jobs on the last run were significantly faster/slower than expected

Job Duration vs 7d avg Delta
cli-e2e-tests / run-cli-tests 11m 7m +66%
rust-move-tests 12m 8m +48%
execution-performance / single-node-performance 25m 19m +37%
test-target-determinator 4m 3m +31%


@georgemitenkov georgemitenkov changed the title from "[vm] use prod configs for aptos" to "[vm] Use prod configs for aptos" on May 17, 2024
@georgemitenkov georgemitenkov enabled auto-merge (squash) May 20, 2024 22:13

types/src/on_chain_config/aptos_features.rs (outdated)
// For historical reasons, we support still < gas version 5, but if a new caller don't specify
// the gas version, we default to 5, which was introduced in late '22.
let gas_feature_version = gas_feature_version_opt.unwrap_or(5);
if gas_feature_version < 5 {
Contributor

Hm, is there any invariant that we can check here? It seems the only thing we need from that parameter is whether it's None or < 5. What happens if some large number is provided? Couldn't we just pass a boolean instead?

Contributor Author

Updated to not take the gas feature version into account.
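
To make the reviewer's question concrete, here is a small sketch of the pattern in the excerpt next to the boolean alternative. The function names are hypothetical, and what the `< 5` branch actually does in the real code is not shown in this thread, so the sketch only models which branch is taken.

```rust
// Hypothetical sketch: function names are illustrative, not from the PR.

/// Mirrors the excerpt: a missing version defaults to 5, and only versions
/// below 5 take the legacy branch.
fn takes_legacy_branch(gas_feature_version_opt: Option<u64>) -> bool {
    let gas_feature_version = gas_feature_version_opt.unwrap_or(5);
    gas_feature_version < 5
}

/// The reviewer's alternative: the caller states the intent directly, so
/// there is no numeric range left to validate.
fn takes_legacy_branch_from_flag(is_legacy_gas_version: bool) -> bool {
    is_legacy_gas_version
}
```

Under this reading, `None`, `Some(5)`, and any arbitrarily large value all behave identically, which is why the parameter carries only one bit of information.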

types/src/vm/configs.rs (outdated)
let enable_invariant_violation_check_in_swap_loc =
!timed_features.is_enabled(TimedFeatureFlag::DisableInvariantViolationCheckInSwapLoc);

let mut type_max_cost = 0;
Contributor

Is the max cost actually 0, or is 0 used as None here? The ones below make sense if "not enabled" means the byte/base cost is actually 0.

Contributor Author

Yes, it basically ensures that pseudo gas metering in type-to-tag conversion is not charged if the feature is not enabled.
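
One way to read this answer is sketched below. It assumes a meter that accumulates a base cost plus a per-byte cost for each node visited and fails once the running total exceeds the maximum. The struct, the method names, and the non-zero values are placeholders for illustration, not the real implementation or real production numbers.

```rust
// Hypothetical sketch: names and non-zero values are placeholders.

#[derive(Debug, PartialEq)]
struct LimitExceeded;

/// A toy pseudo gas meter for type-to-tag conversion.
struct PseudoMeter {
    cost: u64,
    max_cost: u64,
    base_cost: u64,
    byte_cost: u64,
}

impl PseudoMeter {
    /// All three parameters are zero unless the feature is enabled.
    fn new(feature_enabled: bool) -> Self {
        let (max_cost, base_cost, byte_cost) = if feature_enabled {
            (5000, 100, 1)
        } else {
            (0, 0, 0)
        };
        Self { cost: 0, max_cost, base_cost, byte_cost }
    }

    /// Charge for one node whose name is `name_len` bytes long.
    fn charge_node(&mut self, name_len: u64) -> Result<(), LimitExceeded> {
        let amount = self
            .base_cost
            .saturating_add(self.byte_cost.saturating_mul(name_len));
        self.cost = self.cost.saturating_add(amount);
        if self.cost > self.max_cost {
            Err(LimitExceeded)
        } else {
            Ok(())
        }
    }
}
```

Under these assumptions, a maximum of 0 is not a limit of zero in practice: with zero base and byte costs the running total never leaves 0, the comparison `0 > 0` is false, and nothing is ever rejected.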

@@ -1388,12 +1386,12 @@ impl AptosVM {

/// Deserialize a module bundle.
fn deserialize_module_bundle(&self, modules: &ModuleBundle) -> VMResult<Vec<CompiledModule>> {
let max_version = get_max_binary_format_version(self.features(), None);
Contributor Author

@vgao1996 this is a bug, right? We default to 5 here because we pass None, but we should instead select 5, 6, or 7 based on feature flags.

Contributor

Still looking at the code, but one thing I noticed here is that the Option passed in is NOT the binary format version, but the gas feature version.
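
For reference, selecting the maximum binary format version purely from feature flags could look roughly like the sketch below. The flag names and the struct are hypothetical; only the candidate versions 5, 6, and 7 come from the thread. As the reply above notes, the `Option` in the real signature carries the gas feature version, so this sketch deliberately has no such parameter.

```rust
// Hypothetical sketch: flag and type names are illustrative, not from the PR.

/// Stand-in for the on-chain feature flags relevant to bytecode versions.
#[derive(Clone, Copy, Debug, Default)]
struct Features {
    binary_format_v6: bool,
    binary_format_v7: bool,
}

/// Pick the highest version whose flag is enabled, falling back to 5.
fn max_binary_format_version(features: &Features) -> u32 {
    if features.binary_format_v7 {
        7
    } else if features.binary_format_v6 {
        6
    } else {
        5
    }
}
```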

codecov bot commented May 22, 2024

Codecov Report

Attention: Patch coverage is 88.72180%, with 15 lines in your changes missing coverage. Please review.

Project coverage is 33.0%. Comparing base (d6ce1b9) to head (4239576).
Report is 1 commit behind head on main.

Files Patch % Lines
...kup-cli/src/backup_types/state_snapshot/restore.rs 0.0% 4 Missing ⚠️
aptos-move/aptos-vm/src/aptos_vm.rs 25.0% 3 Missing ⚠️
types/src/on_chain_config/aptos_features.rs 75.0% 3 Missing ⚠️
third_party/move/move-vm/runtime/src/runtime.rs 33.3% 2 Missing ⚠️
...ptos-move/aptos-resource-viewer/src/module_view.rs 0.0% 1 Missing ⚠️
third_party/move/move-vm/runtime/src/config.rs 66.6% 1 Missing ⚠️
third_party/move/move-vm/runtime/src/loader/mod.rs 85.7% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main   #13317    +/-   ##
========================================
  Coverage    32.9%    33.0%            
========================================
  Files        1768     1764     -4     
  Lines      339264   339051   -213     
========================================
- Hits       111949   111917    -32     
+ Misses     227315   227134   -181     


Contributor

@vgao1996 vgao1996 left a comment

Yeah I feel like we should just run replay verify

@georgemitenkov georgemitenkov enabled auto-merge (squash) May 23, 2024 14:32

Contributor

✅ Forge suite compat success on 3ffe0986b5fe4acb76544ae7ae85d73b91a6a411 ==> 4239576e05ab2cdf9e169c16e79f411e850b1ec3

Compatibility test results for 3ffe0986b5fe4acb76544ae7ae85d73b91a6a411 ==> 4239576e05ab2cdf9e169c16e79f411e850b1ec3 (PR)
1. Check liveness of validators at old version: 3ffe0986b5fe4acb76544ae7ae85d73b91a6a411
compatibility::simple-validator-upgrade::liveness-check : committed: 6515.889965929541 txn/s, latency: 4911.816745880862 ms, (p50: 4800 ms, p90: 7200 ms, p99: 8700 ms), latency samples: 252480
2. Upgrading first Validator to new version: 4239576e05ab2cdf9e169c16e79f411e850b1ec3
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 2617.5822523732086 txn/s, latency: 11866.865119266055 ms, (p50: 12900 ms, p90: 14600 ms, p99: 15900 ms), latency samples: 109000
3. Upgrading rest of first batch to new version: 4239576e05ab2cdf9e169c16e79f411e850b1ec3
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 3341.025610142515 txn/s, latency: 9156.100827165868 ms, (p50: 8800 ms, p90: 14100 ms, p99: 14300 ms), latency samples: 137820
4. upgrading second batch to new version: 4239576e05ab2cdf9e169c16e79f411e850b1ec3
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 6408.734424403526 txn/s, latency: 5107.058919845692 ms, (p50: 4800 ms, p90: 8300 ms, p99: 9400 ms), latency samples: 233300
5. check swarm health
Compatibility test for 3ffe0986b5fe4acb76544ae7ae85d73b91a6a411 ==> 4239576e05ab2cdf9e169c16e79f411e850b1ec3 passed
Test Ok

Contributor

✅ Forge suite realistic_env_max_load success on 4239576e05ab2cdf9e169c16e79f411e850b1ec3

two traffics test: inner traffic : committed: 7905.876760287385 txn/s, latency: 4955.465840573947 ms, (p50: 4800 ms, p90: 6300 ms, p99: 12400 ms), latency samples: 3420520
two traffics test : committed: 100.07740781385272 txn/s, latency: 1879.9189655172413 ms, (p50: 1900 ms, p90: 2100 ms, p99: 2800 ms), latency samples: 1740
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.217, avg: 0.206", "QsPosToProposal: max: 0.339, avg: 0.244", "ConsensusProposalToOrdered: max: 0.471, avg: 0.427", "ConsensusOrderedToCommit: max: 0.377, avg: 0.357", "ConsensusProposalToCommit: max: 0.811, avg: 0.784"]
Max round gap was 1 [limit 4] at version 1605905. Max no progress secs was 4.654881 [limit 15] at version 1605905.
Test Ok

Contributor

✅ Forge suite framework_upgrade success on 3ffe0986b5fe4acb76544ae7ae85d73b91a6a411 ==> 4239576e05ab2cdf9e169c16e79f411e850b1ec3

Compatibility test results for 3ffe0986b5fe4acb76544ae7ae85d73b91a6a411 ==> 4239576e05ab2cdf9e169c16e79f411e850b1ec3 (PR)
Upgrade the nodes to version: 4239576e05ab2cdf9e169c16e79f411e850b1ec3
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1143.0574929172499 txn/s, submitted: 1144.6039564755067 txn/s, failed submission: 1.546463558256813 txn/s, expired: 1.546463558256813 txn/s, latency: 2581.358629686896 ms, (p50: 1900 ms, p90: 4700 ms, p99: 9900 ms), latency samples: 103480
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1149.445380769666 txn/s, submitted: 1152.5281844728795 txn/s, failed submission: 3.082803703213663 txn/s, expired: 3.082803703213663 txn/s, latency: 2590.111743295019 ms, (p50: 2100 ms, p90: 4500 ms, p99: 9300 ms), latency samples: 104400
5. check swarm health
Compatibility test for 3ffe0986b5fe4acb76544ae7ae85d73b91a6a411 ==> 4239576e05ab2cdf9e169c16e79f411e850b1ec3 passed
Upgrade the remaining nodes to version: 4239576e05ab2cdf9e169c16e79f411e850b1ec3
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1183.7857794639474 txn/s, submitted: 1185.1665560340844 txn/s, failed submission: 1.380776570136797 txn/s, expired: 1.380776570136797 txn/s, latency: 2687.1052877138413 ms, (p50: 2100 ms, p90: 4800 ms, p99: 9900 ms), latency samples: 102880
Test Ok

@georgemitenkov georgemitenkov merged commit 6582ad5 into main May 23, 2024
53 of 54 checks passed
@georgemitenkov georgemitenkov deleted the george/aptos-prod-configs branch May 23, 2024 17:26
5 participants