Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When running pytest on a database with --reuse-db where python manage.py migrate was already run, cause failure on old runpython #756

Open
allan-simon opened this issue Aug 19, 2019 · 14 comments

Comments

@allan-simon
Copy link

allan-simon commented Aug 19, 2019

I explain myself,

I have a project with a lot of migrations, with 3 important migrations for our case (in that order):

  1. a migration that add the field foo to the model Bar
  2. a migration that has a RunPython, doing some computation based on the field foo
  3. a migration that remove the field foo

now if i first do

python manage.py migrate

I finish with a database without field foo => normal and expected

if I then run pytest --reuse-db I will got SQL failure because pytest re-runs all migrations, but without actually replaying the changes on the database , but only running the RunPython, so the RunPython of 2) is run on database were the field has already been removed

is it a known behaviour , if so why ?

I would have thought that it would see migrations has been run , and will not execute anything, runpython included

@allan-simon allan-simon changed the title When running pytest on a database where python manage.py migrate was already run, cause failure on old runpython When running pytest on a database with --reuse-db where python manage.py migrate was already run, cause failure on old runpython Aug 19, 2019
@allan-simon
Copy link
Author

note that it does not seem to be deterministic (I've removed --reuse-db and it passed once in CI, and now it does not work anymore, and our CI are on aws spot instances, which mean they are recycled pretty often, so we don't have any dangling state between two CI run)

@blueyed
Copy link
Contributor

blueyed commented Aug 20, 2019

if I then run pytest --reuse-db I will got SQL failure because pytest re-runs all migrations

Use --no-migrations then with --reuse-db (but I would assume that's the case already)?

Use -vvv to see when migrations are run.

I also suggest trying to debug this locally, where you can better add breakpoints etc.. ;)
(but for CI you could also add printing of course - hopefully -vvv gives more insight though)
(#725 adds a django_debug_sql ini option that might be useful - I've just rebased it on master)

@allan-simon
Copy link
Author

I will try -vvv, for --no-migrations, no I didn't run it because I was not aware of the options (I don't know how i missed it ^^' )

@blueyed
Copy link
Contributor

blueyed commented Aug 20, 2019

I would expect --reuse-db to not cause any migrations to be run.
If that's the case (maybe even only for the RunPython ones) it looks like a bug / something to improve I would assume.

What Django version are you using, and what kind of DB?

@allan-simon
Copy link
Author

allan-simon commented Aug 20, 2019

postgresql 10 and django 1.11 , python3.6

@blueyed
Copy link
Contributor

blueyed commented Aug 20, 2019

Also note that it will be run for the first usage of --reuse-db (since there is none to reuse).

This is the current behavior:

diff --git i/tests/test_db_setup.py w/tests/test_db_setup.py
index 0b3d516..ba96b14 100644
--- i/tests/test_db_setup.py
+++ w/tests/test_db_setup.py
@@ -406,3 +406,17 @@ class Migration(migrations.Migration):
         )
         assert result.ret == 0
         result.stdout.fnmatch_lines(["*mark_migrations_run*"])
+
+        # Runs migration with first "--reuse-db" usage.
+        result = testdir.runpytest_subprocess(
+            "--reuse-db", "--tb=short", "-v", "-s"
+        )
+        assert result.ret == 0
+        assert "mark_migrations_run" in result.stdout.str()
+
+        # Does not run them on second "--reuse-db" usage.
+        result = testdir.runpytest_subprocess(
+            "--reuse-db", "--tb=short", "-v", "-s"
+        )
+        assert result.ret == 0
+        assert "mark_migrations_run" not in result.stdout.str()

(with this test: https://github.com/blueyed/pytest-django/blob/37b9216655baa21ae4ccf0b28069c07eb4473cb3/tests/test_db_setup.py#L356-L408)

@allan-simon
Copy link
Author

ok right now i'm focusing on the "without --reuse-db" as here the behaviour should be always consistent (i.e DROP database, and recreate everything) , and even here I'm seeing the bug i'm describing

what is weird is that, the RunPython seems to be run in the correct order of their migrations, and then pytest starts to run the actual tests, but then all the following tests are reporting "ERROR" with the same stacktrace that is in the migration as if the migration was run N time , but the print I added in RunPython code directly only appears once at migration time not at test time

@allan-simon
Copy link
Author

currently (still without --reuse-db) I seem to be able to trigger the bug if

  1. i clean everything (I use a docker for that, to be sure to start from scratch)
  2. i run the tests once (without --reuse-db) , => I control+c the test , but at least the first tests where executing correctly
  3. I re-run the tests, now i see something weird , the database seems correctly drop/recreated according to this output :
acts/tests.py::DuplicateTest::test_duplicate_products Creating test database for alias 'read-only' ('test_vagrant')...
Got an error creating the test database: database "test_vagrant" already exists

Destroying old test database for alias 'read-only' ('test_vagrant')...

however at the migration step, I see far less migration being applied than I actually have (they are still run in order though) then migrations steps stop and the tests start to run, and the first tests are in Errror, and actually they contains the stack trace of the last migration that run.

i.e it tries to run migrations, one of these migrations fails with an exception, (due to the diff postgres schema / django model at the point in time) it goes directly to running the test and only there it shows the exceptions (and duplicated)

what is even weirder is that some tests still succeed (but a very low minority)

@allan-simon
Copy link
Author

I start to wonder if it's due to the fact I have two database, "default" and "read-only" (in test they both actually correspond to the same database) and I wonder if there's some kind of race condition that RunPython are not run on "read-only" on which the drop database is done (but pytest-django do the magic test_%s ) but run on "default" that is still in database "vagrant" that has not been dropped ?

@allan-simon
Copy link
Author

It seems to be that during the migrations sometimes "default" is chosen, sometimes "readonly" is chosen

if I add in lib/python3.6/site-packages/django/db/models/query.py (line 156)

  class QuerySet(object):                                                                                                                                                                                                                    
      """                                                                                                                                                                                                                                    
      Represents a lazy database lookup for a set of objects.                                                                                                                                                                                
      """                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                             
      def __init__(self, model=None, query=None, using=None, hints=None):                                                                                                                                                                    
          self.model = model                                                                                                                                                                                                                 
          self._db = using                                                                                                                                                                                                                   
          print(using)                

I got for the following migration :

def copy_from_userworkflow_to_invoice(apps, schema_editor):                                                                                                                                                                                  
    db_alias = schema_editor.connection.alias                                                                                                                                                                                                
    UserWorkflow = apps.get_model('useracts', 'UserWorkflow')                                                                                                                                                                                
    for uw in UserWorkflow.objects.using(db_alias).filter(invoice__isnull=False):                                                                                                                                                            
        invoice = uw.invoice                                                                                                                                                                                                                 
        invoice.userworkflow = uw                                                                                                                                                                                                            
        invoice.save()                                                                                                                                                                                                                       
                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                             
def copy_from_invoice_to_userworkflow(apps, schema_editor):                                                                                                                                                                                  
    db_alias = schema_editor.connection.alias                                                                                                                                                                                                
    Invoice = apps.get_model('billing', 'Invoice')                                                                                                                                                                                           
    for invoice in Invoice.objects.using(db_alias).filter(userworkflow__isnull=False):                                                                                                                                                       
        uw = invoice.userworkflow                                                                                                                                                                                                            
        uw.invoice = invoice                                                                                                                                                                                                                 
        uw.save()                                                                                                                                                                                                                            
                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                             
class Migration(migrations.Migration):                                                                                                                                                                                                       
                                                                                                                                                                                                                                             
    dependencies = [                                                                                                                                                                                                                         
        ('useracts', '0009_auto_20150616_1256'),                                                                                                                                                                                             
        ('billing', '0001_initial'),                                                                                                                                                                                                         
    ]                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                             
    operations = [                                                                                                                                                                                                                           
        migrations.AddField(                                                                                                                                                                                                                 
            model_name='invoice',                                                                                                                                                                                                            
            name='userworkflow',                                                                                                                                                                                                             
            field=models.ForeignKey(related_name='userworkflows', on_delete=django.db.models.deletion.SET_NULL, to='useracts.UserWorkflow', null=True),                                                                                      
            preserve_default=True,                                                                                                                                                                                                           
        ),                                                                                                                                                                                                                                   
        migrations.RunPython(                                                                                                                                                                                                                
            copy_from_userworkflow_to_invoice,                                                                                                                                                                                               
            copy_from_invoice_to_userworkflow                                                                                                                                                                                                
        ),                                                                                                                                                                                                                                   
    ]  

the following output

None
None
default
None
None
None

if I run a second time i got

None
None
read-only
None
None
None

@allan-simon
Copy link
Author

ok ok so I think i finally found the solution
https://pytest-django.readthedocs.io/en/latest/database.html#tests-requiring-multiple-databases
which points to
https://docs.djangoproject.com/en/dev/topics/testing/advanced/#topics-testing-primaryreplica

so using ['TEST']['MIRROR'] = 'default' seems to fix the problem

however I think ( I don't know how easy to it is to implement) that if pytest-django does explicitly not support several databases it should:

  1. warns you if you have a DATABASES with several keys
  2. if several databases are defined pick 'default' in a deterministic fashion (and or rely on a config to use a specific other one)

what do you think ?

anyway thanks for your help, it really helped me to be on the right track :)

@blueyed
Copy link
Contributor

blueyed commented Aug 20, 2019

I've not read up on your comments yet, but do you use transactional_db in your tests?

@allan-simon
Copy link
Author

git grep transactional_db returns 0 result in my codebase

@blueyed
Copy link
Contributor

blueyed commented Aug 20, 2019

git grep transactional_db returns 0 result in my codebase

There are other ways to use it, e.g. with a live_server.

However, your investigation appears to be spot on.

Not sure currently though what we can do about it, and if Django would behave the same (i.e. then it would not really be a bug in pytest-django, but just something we could help with, but maybe we're using Django's internals not correctly here).

See also #76 and https://github.com/pytest-dev/pytest-django/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+multiple+databases.

It would be great to have this improved/fixed indeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants