Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable comparing migrations to implementation in tests. #724

Closed
Luttik opened this issue Aug 11, 2020 · 16 comments
Closed

Enable comparing migrations to implementation in tests. #724

Luttik opened this issue Aug 11, 2020 · 16 comments

Comments

@Luttik
Copy link

Luttik commented Aug 11, 2020

I would love to have the option to check, in my ci pipeline, if the migrations are up to date with the implementation of the models in the code.

We are almost there
I feel like we can get quite close to this. We can spin up an SQLite and apply the migration to the database, we can even call command.revision(alembic_cfg, autogenerate=True). However, that, of course, creates a migration file instead of returning a concise list of changes to the python runtime.

My search for solutions
It seems like the generation of changes happens somewhere in alembic.autogenerate.compare._populate_migration_script but since this is a protected method I am hesitant to develop against it.

What I would love
I would absolutely love it if there would be some documentation on how to properly test alembic:

  • Assert that upgrading works
  • Assert that downgrading works
  • Assert that migrations are up-to-date with the current codebase. (IMO the most important one)

Of course, the documentation would actually require a good interface to do those things with. But I do believe this would eliminate a significant set of production database issues.

@Luttik Luttik added the requires triage New issue that requires categorization label Aug 11, 2020
@zzzeek zzzeek added documentation question usage and API questions and removed requires triage New issue that requires categorization labels Aug 11, 2020
@zzzeek
Copy link
Member

zzzeek commented Aug 11, 2020

hi there -

The third part of this is accomplished using the compare_metadata() API function documented at https://alembic.sqlalchemy.org/en/latest/api/autogenerate.html#getting-diffs https://alembic.sqlalchemy.org/en/latest/api/autogenerate.html#alembic.autogenerate.compare_metadata

This approach is in widespread use already for the use case you describe, for example here's how many Openstack components use it based on a library function in the "oslo.db" library: https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/test_migrations.py#L567 .

it's not too hard to mention the compare_metadata() function in the recipes section in some way.

For the "assert upgrading / downgrading" works case, openstack has a stepwise utility here as well that does what they call a "Snakewalk": https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/test_migrations.py#L39 , however this is implemented in an abstract way in oslo.db as there are projects that use it either with Alembic or SQLAlchemy-migrate. the idea of snakewalk includes that every single migration is tested individually. openstack Nova does this though they are using sqlalchemy-migrate, for every new migration they have to write a new test as you can see here : https://github.com/openstack/nova/blob/master/nova/tests/unit/db/test_migrations.py#L296 those numbers like "_check_231" "_check_232" etc. are actual version numbers.

Within Alembic itself I'm not comfortable releasing and maintaining the "snakewalk" utility as it is quite complicated and Alembic does not currently have a lot of development resources, but maybe this can be of use. I'm not really sure how to approach documenting this from Alembic as I don't have a simple standalone example that can illustrate what these test suites do.

@Luttik
Copy link
Author

Luttik commented Aug 13, 2020

@zzzeek thanks for the extensive approach. Although it is not yet completely clear to me. What I want is something like this. Again I feel like the presented code snippets get me 90% of the way but that the last 10% is obscured.

def test_migrations_up_to_date(request: FixtureRequest):
    # Use a temp database to not need to affect external systems.
    engine = create_engine('sqlite://')

    # I need the existing migrations to compare them to the the autogenerated models
    # I believe this is the way to obtain those??
    os.chdir(Path(request.session.fspath))
    config: Config = Config('alembic.ini')

    # ... todo: somehow connect the config to the sqlite engine
    # so I have a database that is up-to-date with the current migrations.
    upgrade(config, 'head')

    # Create a comparison between the current and the autogenerated migrations
    mc = MigrationContext.configure(engine.connect())
    diff = compare_metadata(mc, OrmBase.metadata)

    # Assert that there are no differences between the current and the autogenerated migrations
    assert not diff

@zzzeek
Copy link
Member

zzzeek commented Aug 13, 2020

OK here's how to run a command, like upgrade:

https://alembic.sqlalchemy.org/en/latest/api/commands.html

so I can paste these together:

from alembic.config import Config
from alembic import command
from myapplication import OrmBase
from alembic.migration import MigrationContext
from alembic.autogenerate import compare_metadata

def test_thing():
    alembic_cfg = Config("/path/to/yourapp/alembic.ini")
    with engine.begin() as connection:
        alembic_cfg.attributes['connection'] = connection
        command.upgrade(alembic_cfg, "head")

        mc = MigrationContext.configure(connection)

        diff = compare_metadata(mc, OrmBase.metadata)
        assert not diff

try that.

@Luttik
Copy link
Author

Luttik commented Aug 13, 2020

@zzzeek Thanks, that was enough info to get me there.

One thing that I had to change from my pretty default migrations/env.py (or for many alembic/env.py was replacing the (I believe standard) run_migrations_online() method with this:

def get_default_connection() -> Connection:
    connectable = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
        connect_args=dict(
            sslmode="verify-ca",
            sslrootcert=environ.get("DB_SSL_CA"),
            sslcert=environ.get("DB_SSL_CERT"),
            sslkey=environ.get("DB_SSL_KEY"),
        ),
    )

    print(f"using {connectable.engine.url=}")

    with connectable.connect() as connection:
        return connection


def run_migrations_online() -> None:
    """Run migrations in 'online' mode.

    In this scenario, we need to create an Engine
    and associate a connection with the context.

    """
    connection = config.attributes.get('connection', None) or get_default_connection()
    context.configure(connection=connection, target_metadata=target_metadata)

    with context.begin_transaction():
        context.run_migrations()

@zzzeek Or am I wrong again and should I have utilized the run_migrations_offline method in some way?

tl;dr it works for me now but I needed to do a small workaround.

@zzzeek
Copy link
Member

zzzeek commented Aug 14, 2020

the purpose of the env.py script is explicitly so that it can be customized for the application, so that you were able to dive in there and do whatever you had to in order to "make it work" is a feature. In this case I forgot to finish reading my own docs, which is that yes when that "connection" is stuck in there we need to do the recipe at https://alembic.sqlalchemy.org/en/latest/cookbook.html#connection-sharing .

The run_migrations_offline() function is not used unless you use --sql mode which is less common.

@Luttik
Copy link
Author

Luttik commented Aug 14, 2020

Yes the env.py file was a stroke of genius.

@ziima
Copy link
Contributor

ziima commented Oct 19, 2020

Would it be possible to add --check option to the revision command to return a non-zero exit code if a non-empty migration is generated?

I admit that it's inspired by Django's makemigrations --check, see https://docs.djangoproject.com/en/3.1/ref/django-admin/#django-admin-makemigrations.

@zzzeek
Copy link
Member

zzzeek commented Oct 19, 2020

that sounds more like a fixture you want to have in your test suite. these are commonplace and are constructed using the approach at https://alembic.sqlalchemy.org/en/latest/api/autogenerate.html#alembic.autogenerate.compare_metadata . basically when you run your test suite, include a test that does compare_metadata and fails if any diffs are generated.

@ziima
Copy link
Contributor

ziima commented Oct 20, 2020

In my point of view, alembic is external tool for the application, so migration checks shouldn't be part of application unittests. I place them on the same level as static analysis tools, such as mypy or flake8. Hence I'd like to have a simple command to run the checks.

I found this tool https://pypi.org/project/alembic-autogen-check/ that seems to do the job. But in my opinion it would be still worth having it as part of alembic CLI.

@zzzeek
Copy link
Member

zzzeek commented Oct 20, 2020

In my point of view, alembic is external tool for the application, so migration checks shouldn't be part of application unittests. I place them on the same level as static analysis tools, such as mypy or flake8. Hence I'd like to have a simple command to run the checks.

oh, well I'd say we are both correct, in that flake8 and mypy are certainly run as part of test suites, but they are typically as top level targets in a tox.ini file, so you're right, this would be nice to have....

I found this tool https://pypi.org/project/alembic-autogen-check/ that seems to do the job.

and... we have it! someone's already done this, so use that.

But in my opinion it would be still worth having it as part of alembic CLI.

why? it only means it won't be maintained as well, Alembic is not actually keeping up with our existing issue load very well right now. it's just a line in your requirements file and using plugins and extensions for testing things so that the work of tool maintanance is spread among more people is common.

it looks like you've put a few issues on there so let's see if @4Catalyzer is responsive.

@zzzeek
Copy link
Member

zzzeek commented Nov 18, 2020

OK re-upping as someone else is asking about this, @4Catalyzer does not seem to be responsive to issues and the project hasn't been updated since march so I am more comfortable evaluating proposals for the "check" feature to be included in alembic.

@ziima
Copy link
Contributor

ziima commented Nov 25, 2020

I'm glad to hear that. Here's my proposal: add --check option to revision command. It would print the differences in migrations (using logging, since alembic has no verbosity) and return non-zero exit code, if it found any. It would be only available when --autogenerate is used. Other command options are applied accordingly.

I kind of lost in the branching, but I believe the revision command fails if unexpected branching occurs.

@zzzeek
Copy link
Member

zzzeek commented Nov 26, 2020

I think that's fine. just as an FYI I dont know if I agree with something else you proposed at 4Catalyzer/alembic-autogen-check#12 because most database migrations aren't compatible with SQLite.

@ziima
Copy link
Contributor

ziima commented Nov 27, 2020

I meant it as an optional usage. But we can resolve that when we get there.

@zzzeek zzzeek added autogenerate - detection command interface feature motivated volunteers requested a feature that has noone to implement; can reopen a 'wontfix' high priority and removed documentation question usage and API questions labels Jan 8, 2021
openstack-mirroring pushed a commit to openstack/openstack that referenced this issue Nov 3, 2021
* Update nova from branch 'master'
  to 9c2d9fd1eb5bbfd32fb99f367d0da1d1de51c330
  - Merge "db: De-duplicate list of removed table columns"
  - db: De-duplicate list of removed table columns
    
    We have a policy of removing fields from SQLAlchemy models at least one
    cycle before we remove the underlying database columns. This can result
    in a discrepancy between the state that our newfangled database
    migration tool, alembic, sees and what's actually in the database. We
    were ignoring these removed fields (and one foreign key constraint) in
    two different locations for both databases: as part of the alembic
    configuration and as part of the tests we have to ensure our migrations
    are in sync with our models (note: the tests actually use the alembic
    mechanism to detect the changes [1]). De-duplicate these.
    
    [1] sqlalchemy/alembic#724 (comment)
    
    Change-Id: I978b4e44cf7f522a70cc5ca76e6d6f1a985d5469
    Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
openstack-mirroring pushed a commit to openstack/nova that referenced this issue Nov 3, 2021
We have a policy of removing fields from SQLAlchemy models at least one
cycle before we remove the underlying database columns. This can result
in a discrepancy between the state that our newfangled database
migration tool, alembic, sees and what's actually in the database. We
were ignoring these removed fields (and one foreign key constraint) in
two different locations for both databases: as part of the alembic
configuration and as part of the tests we have to ensure our migrations
are in sync with our models (note: the tests actually use the alembic
mechanism to detect the changes [1]). De-duplicate these.

[1] sqlalchemy/alembic#724 (comment)

Change-Id: I978b4e44cf7f522a70cc5ca76e6d6f1a985d5469
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
@sqla-tester
Copy link
Collaborator

Nathan Louie has proposed a fix for this issue in the main branch:

add check command for upgrade diffs https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/4295

1 similar comment
@sqla-tester
Copy link
Collaborator

Nathan Louie has proposed a fix for this issue in the main branch:

add check command for upgrade diffs https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/4295

AnishReddyRavula pushed a commit to ChameleonCloud/nova that referenced this issue Jan 17, 2024
We have a policy of removing fields from SQLAlchemy models at least one
cycle before we remove the underlying database columns. This can result
in a discrepancy between the state that our newfangled database
migration tool, alembic, sees and what's actually in the database. We
were ignoring these removed fields (and one foreign key constraint) in
two different locations for both databases: as part of the alembic
configuration and as part of the tests we have to ensure our migrations
are in sync with our models (note: the tests actually use the alembic
mechanism to detect the changes [1]). De-duplicate these.

[1] sqlalchemy/alembic#724 (comment)

Change-Id: I978b4e44cf7f522a70cc5ca76e6d6f1a985d5469
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants