Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ID Consistency in Fixtures / Snapshots. #1025

Merged
merged 13 commits into from Feb 12, 2024

Conversation

plypaul
Copy link
Contributor

@plypaul plypaul commented Feb 5, 2024

Description

I found some cases where there were run-to-run ID inconsistencies in some fixture objects. Previously, the test fixtures included a ConsistentIdObjectRepository that handled this, but there have since been a few additions. This PR adds / migrates existing fixtures to use mf_engine_test_fixture_mapping (may be renamed) that is a more comprehensive version of ConsistentIdObjectRepository. This will make snapshots generated using those fixtures (added in later PRs) consistent.

To recap:

  • Many tests compare snapshots of objects / SQL queries.
  • Those objects with IDs may have been created through a fixture, and IDs are assigned sequentially in the order of creation.
  • When running tests, the order of execution of fixtures for a given scope can vary from run to run.
  • When the run order changes, IDs for objects may change since the IDs are assigned sequentially.
  • ID changes will cause unexpected snapshot diffs.
  • To resolve this, we create a single session-level fixture that creates all affected objects in a consistent order.

@cla-bot cla-bot bot added the cla:yes label Feb 5, 2024
@plypaul plypaul force-pushed the plypaul--88.1--split-multi-hop-manifest branch from 4cf39db to 0ec9b86 Compare February 5, 2024 19:33
@plypaul plypaul force-pushed the plypaul--88.2--improve-snapshot-id-consistency branch 5 times, most recently from 7dad28b to d9613a0 Compare February 5, 2024 22:44
@plypaul plypaul changed the title Improved Consistency of IDs in Snapshots Improved ID Consistency in Snapshots Feb 5, 2024
@plypaul plypaul force-pushed the plypaul--88.2--improve-snapshot-id-consistency branch from d9613a0 to 118b647 Compare February 6, 2024 21:38
@plypaul plypaul changed the title Improved ID Consistency in Snapshots Improve ID Consistency in Snapshots Feb 6, 2024
@plypaul plypaul added the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Feb 6, 2024
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS February 6, 2024 21:47 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS February 6, 2024 21:47 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS February 6, 2024 21:47 — with GitHub Actions Inactive
@plypaul plypaul temporarily deployed to DW_INTEGRATION_TESTS February 6, 2024 21:47 — with GitHub Actions Inactive
@github-actions github-actions bot removed the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Feb 6, 2024
@plypaul plypaul changed the title Improve ID Consistency in Snapshots Improve ID Consistency in Fixtures / Snapshots. Feb 7, 2024
@plypaul plypaul marked this pull request as ready for review February 9, 2024 00:14
Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things:

  1. I can't load the snapshots except the fop few. Those look fine, so I'm trusting your assertion about the remainder being limited to number changes is correct.
  2. I do not see how this solves the problem you were having, in part because I'm not clear on where you were encountering inconsistencies.
  3. We might need to do even more here because xdist doesn't do test-suite-execution-level fixtures

All that said, the general cleanup in here is worth a lot - the fixtures are cleaner and better organized, so we should merge this and see if we still need to do more later.

data_sets: OrderedDict[str, SemanticModelDataSet]
) -> OrderedDict[str, ReadSqlSourceNode]:
"""Return a mapping from the name of the semantic model to the dataflow plan node that reads from it."""
# Moved from model_fixtures.py.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the note, please do clean these up before merging.

return data_sets


@pytest.fixture(scope="session")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To resolve this, we create a single session-level fixture that creates all affected objects in a consistent order.

Note: the "session" scope marker works on a single-instance local run. The problem we might run into here is xdist spawns multiple sessions, each with its own fixture loadout, so there's no guarantee that there will be exactly 1 of these fixtures. Relevant issues: pytest-dev/pytest-xdist#783 pytest-dev/pytest-xdist#271

Now, this might not make a difference in practice - xdist has some concrete splitting logic for tests, so all tests within a module are likely to be grouped into the same process, which means the ID generators are initialized consistently for those blocks of tests. If that logic changes, or if the number of process instances changes such that tests are split into more (or less) granular blocks, the IDs might change as well.

The thing I'm not certain of is if any of this really matters to us - due to the changes in #1015 every time we create a query execution plan we re-set the IDs regardless, so I just don't know if there's any practical impact on IDs here. I assume there is, otherwise you wouldn't bother with this, but if there was you should know that this might not be a long-term robust solution.

Also, since this is apparently not part of the stack with #1015 (or maybe that one was moved on top of this somewhere?) it's kind of hard for me to reason about how those two things will work together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replied as a top-level comment.

Also, since this is apparently not part of the stack with #1015 (or maybe that one was moved on top of this somewhere?) it's kind of hard for me to reason about how those two things will work together.

Should have outlined this better, but yes, this PR is part of the same stack and is under #1015.

semantic_manifest_lookup=cyclic_join_semantic_manifest_lookup,
time_spine_source_node=consistent_id_object_repository.cyclic_join_time_spine_source_node,
)
return mf_engine_test_fixture_mapping[SemanticManifestName.CYCLIC_JOIN_MANIFEST].dataflow_plan_builder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this is much nicer, thank you.

@@ -3,7 +3,8 @@
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Mapping, OrderedDict, Sequence, Tuple
from typing import Dict, Mapping, OrderedDict, Sequence
from typing import Tuple
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, but whatever....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been using the IDE to add imports, so maybe a bug there? I'll update.


logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class ConsistentIdObjectRepository:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huzzah!

@@ -19,7 +19,7 @@ class IdNumberSpace:
CONSISTENT_ID_REPOSITORY = 10000


@pytest.fixture(autouse=True)
@pytest.fixture(autouse=True, scope="function")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, the function-scope here may be what imposes id number consistency, but this shouldn't be a change in behavior since scope defaults to function so I'm not really sure what you were running into or what this fixes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was left over from some testing iteration - I'll remove.

@@ -45,19 +46,21 @@ def query_parser( # noqa: D
def extended_date_dataflow_plan_builder( # noqa: D
mf_engine_test_fixture_mapping: Mapping[SemanticManifestName, MetricFlowEngineTestFixture]
) -> DataflowPlanBuilder:
# Scope needs to be function as the DataflowPlanBuilder contains state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we explicitly tag these with "function" then?

Comment on lines 18 to 37
start_value: int

@staticmethod
def for_test_start() -> IdNumberSpace:
"""Before each test run."""
return IdNumberSpace(0)

@staticmethod
def for_block(block_index: int, block_size: int = 2000, first_block_offset: int = 10000) -> IdNumberSpace:
"""Used to define fixed-size blocks of ID number spaces.

This is useful for creating unique / non-overlapping IDs for different groups of objects.
"""
if not block_index >= 0:
raise RuntimeError(f"block_index should be >= 0. Got: {block_index}")
if not block_size > 1:
raise RuntimeError(f"block_size should be > 1. Got: {block_size}")
if not first_block_offset >= 0:
raise RuntimeError(f"first_block_offset should be >= 0. Got: {first_block_offset}")
return IdNumberSpace(first_block_offset + block_size * block_index)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future, can we please avoid mixing large mechanical changes like renames with structural or functional changes (or even other renames)?

It's hard to convey just how much harder it is to review this commit as a result, and in other contexts this kind of thing leads to preventable incidents because reviewers get confused about the intersection of these changes and fail to recognize when some small thing is out of place in the mass of mechanical updates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've been meaning to avoid such cases as I do get that it makes for a more challenging review.

Comment on lines +55 to +62
AMBIGUOUS_RESOLUTION_MANIFEST = SemanticManifestSetupPropertySet(
semantic_manifest_name="ambiguous_resolution_manifest",
id_number_space=IdNumberSpace.for_block(0),
)
# Not including CONFIG_LINTER_MANIFEST as it has intentional errors for running validations.
CYCLIC_JOIN_MANIFEST = SemanticManifestSetupPropertySet(
semantic_manifest_name="cyclic_join_manifest", id_number_space=IdNumberSpace.for_block(1)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this refines the consistency further, but I'm still not sure if this solves the consistency problem altogether.

@plypaul
Copy link
Contributor Author

plypaul commented Feb 9, 2024

I do not see how this solves the problem you were having, in part because I'm not clear on where you were encountering inconsistencies.

The issue I saw was similar to the following:

  • There are 2 fixtures (fixture_0 and fixture_1) that assign IDs.
  • When all tests are run in a testing session, fixture_0 runs first, followed by fixture_1. The ID for the object in fixture0 is id_0 and the ID for the object in fixture1 is id_1.
  • When running a single test that only depends on fixture1, the ID for the object is now id_0 since fixture_0 didn't run.

We might need to do even more here because xdist doesn't do test-suite-execution-level fixtures

There shouldn't need to be anything special done for xdist since it's effectively the same as running a subset of tests.

Base automatically changed from plypaul--88.1--split-multi-hop-manifest to main February 12, 2024 19:13
@plypaul plypaul force-pushed the plypaul--88.2--improve-snapshot-id-consistency branch from 118b647 to 71326ca Compare February 12, 2024 21:36
@plypaul plypaul merged commit 2ade7a8 into main Feb 12, 2024
9 checks passed
@plypaul plypaul deleted the plypaul--88.2--improve-snapshot-id-consistency branch February 12, 2024 21:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants