
Unexpected order of tests using parameterized fixtures #2846

Closed
rbierbasz opened this issue Oct 17, 2017 · 23 comments
Labels
topic: parametrize (related to @pytest.mark.parametrize), type: bug (problem that needs to be addressed)

Comments

@rbierbasz

rbierbasz commented Oct 17, 2017

Consider the following code:

import pytest

@pytest.fixture(scope="function", params=(1, 2, 3))
def f1(request):
    pass
    
@pytest.fixture(scope="function", params=('a', 'b', 'c'))
def f2(request):
    pass

def test(f1, f2):
    pass

The order of the tests meets my expectations:

============================= test session starts =============================
platform win32 -- Python 2.7.12, pytest-3.2.3, py-1.4.34, pluggy-0.4.0 -- c:\users\rbierbasz\.virtualenvs\core\scripts\python.exe
cachedir: .cache
rootdir: C:\Users\rbierbasz\repos\galaxy-core, inifile:
collected 9 items

test_order.py::test[1-a] PASSED
test_order.py::test[1-b] PASSED
test_order.py::test[1-c] PASSED
test_order.py::test[2-a] PASSED
test_order.py::test[2-b] PASSED
test_order.py::test[2-c] PASSED
test_order.py::test[3-a] PASSED
test_order.py::test[3-b] PASSED
test_order.py::test[3-c] PASSED

========================== 9 passed in 0.04 seconds ===========================
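(Editorial note: the function-scope order above is simply the Cartesian product of the two parameter lists, with f1 varying slowest. A minimal illustration, not pytest internals:)

```python
import itertools

# Illustration only: reproduce the function-scope test ids shown above
# as the plain Cartesian product of the two parameter tuples.
f1_params = (1, 2, 3)
f2_params = ('a', 'b', 'c')
expected_ids = [f"test[{a}-{b}]"
                for a, b in itertools.product(f1_params, f2_params)]
print(expected_ids[:3])  # ['test[1-a]', 'test[1-b]', 'test[1-c]']
```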

But when I changed fixtures' scope from function to module:

import pytest

@pytest.fixture(scope="module", params=(1, 2, 3))
def f1(request):
    pass
    
@pytest.fixture(scope="module", params=('a', 'b', 'c'))
def f2(request):
    pass

def test(f1, f2):
    pass

the order seems to become random:

============================= test session starts =============================
platform win32 -- Python 2.7.12, pytest-3.2.3, py-1.4.34, pluggy-0.4.0 -- c:\users\rbierbasz\.virtualenvs\core\scripts\python.exe
cachedir: .cache
rootdir: C:\Users\rbierbasz\repos\galaxy-core, inifile:
collected 9 items

test_order.py::test[1-a] PASSED
test_order.py::test[1-b] PASSED
test_order.py::test[2-b] PASSED
test_order.py::test[2-a] PASSED
test_order.py::test[2-c] PASSED
test_order.py::test[1-c] PASSED
test_order.py::test[3-c] PASSED
test_order.py::test[3-b] PASSED
test_order.py::test[3-a] PASSED

========================== 9 passed in 0.04 seconds ===========================

The same happens for class and session scopes. It leads to fixtures being set up and torn down more times than necessary.

@nicoddemus
Member

It definitely seems like a bug, thanks for the report.

Our parametrization code has some long-standing bugs; it would be nice to focus on that part of the code when we get the chance.

@nicoddemus added the topic: parametrize and type: bug labels Oct 18, 2017
@bilderbuchi
Contributor

Have you got any pointers where to start looking?

@nicoddemus
Member

I would start debugging here:

pytest/_pytest/python.py

Lines 731 to 732 in 5631a86

def parametrize(self, argnames, argvalues, indirect=False, ids=None,
                scope=None):

And here:

pytest/_pytest/fixtures.py

Lines 166 to 192 in 5631a86

def reorder_items(items):
    argkeys_cache = {}
    for scopenum in range(0, scopenum_function):
        argkeys_cache[scopenum] = d = {}
        for item in items:
            keys = OrderedDict.fromkeys(get_parametrized_fixture_keys(item, scopenum))
            if keys:
                d[item] = keys
    return reorder_items_atscope(items, set(), argkeys_cache, 0)

def reorder_items_atscope(items, ignore, argkeys_cache, scopenum):
    if scopenum >= scopenum_function or len(items) < 3:
        return items
    items_done = []
    while 1:
        items_before, items_same, items_other, newignore = \
            slice_items(items, ignore, argkeys_cache[scopenum])
        items_before = reorder_items_atscope(
            items_before, ignore, argkeys_cache, scopenum + 1)
        if items_same is None:
            # nothing to reorder in this scope
            assert items_other is None
            return items_done + items_before
        items_done.extend(items_before)
        items = items_same + items_other
        ignore = newignore

@ghost

ghost commented Oct 24, 2017

Hi,
I would like to provide a fix for this, but want to make sure I understood the problem correctly.

From preliminary research it seems the test order is determined by the reorder_items function in fixtures.py, which invokes reorder_items_atscope, which in turn calls slice_items.

If the comments in the code hold true, the desired order of tests that use parametrized fixtures with different scopes is as follows: session > module > class > function?
Is this the only requirement for determining the test order?

I am asking because the current solution is not easy to grasp and seems somewhat overly complicated if we just want tests ordered by their scope.

@nicoddemus Any clarification on the subject would be more than welcome.

@nicoddemus
Member

@kchomski-reef thanks for the interest.

You are half right: the purpose is to reorder test items within a scope to minimize fixture setup/teardown, when those fixtures are parametrized.

For example, consider this session fixture:

@pytest.fixture(scope='session', params=[1, 2])
def session_fix(request):
    # let's pretend this actually does some heavy weight setup/teardown 
    return request.param

Consider this test file:

def test_normal_1(): pass

def test_uses_fix_1(session_fix): pass
def test_uses_fix_2(session_fix): pass

def test_normal_2(): pass

If we don't do any reordering, the tests would execute in this order:

test_normal_1
test_uses_fix_1[session_fix(1)]
test_uses_fix_1[session_fix(2)]
test_uses_fix_2[session_fix(1)]
test_uses_fix_2[session_fix(2)]
test_normal_2

IOW, this would cause session_fix to be created with parameter 1 for test_uses_fix_1, then be destroyed so a new session_fix can be created with parameter 2 and execute test_uses_fix_1 again with the new fixture. The same happens for test_uses_fix_2 and for every other test which uses those fixtures.

A key fact to understand all this, and one that is not clear in the documentation, is that at any given time only a single instance of session_fix can exist; that's why it needs to be destroyed between test function calls that request it with different parameters.

The reorder algorithm thus tries to reorder the tests to minimize fixture setup/teardown:

test_normal_1
test_uses_fix_1[session_fix(1)]
test_uses_fix_2[session_fix(1)]
test_uses_fix_1[session_fix(2)]
test_uses_fix_2[session_fix(2)]
test_normal_2

This order now allows us to create session_fix(1) once, execute test_uses_fix_1 and test_uses_fix_2 with that fixture, then create session_fix(2) and execute the tests again, and so on.

The same applies to all scopes other than function.

Let me know if something is not clear in my explanation.
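(Editorial note: the savings can be sketched with a toy count of fixture creations, given the constraint above that only one instance of session_fix can exist at a time. This is an illustration, not pytest code:)

```python
# Toy illustration (not pytest internals): count how many times the
# session fixture must be created for a sequence of requested params,
# assuming only one instance of it can exist at any time.
def creations(param_sequence):
    created = 0
    current = object()  # sentinel: no fixture instance exists yet
    for param in param_sequence:
        if param != current:
            created += 1  # previous instance torn down, new one set up
            current = param
    return created

unordered = [1, 2, 1, 2]  # test_uses_fix_1[1], [2], then test_uses_fix_2[1], [2]
reordered = [1, 1, 2, 2]  # after pytest's reordering
print(creations(unordered), creations(reordered))  # 4 2
```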

@bilderbuchi
Contributor

From a first look at the code, you could try this hunch of mine: the source of the non-determinism you saw could be that dicts are being iterated.
This call could perhaps use sorted() wrapped around the generator expression in the argument; otherwise the contents of a dict are fed to OrderedDict in arbitrary order:

newargkeys = OrderedDict.fromkeys(k for k in argkeys if k not in ignore)

Alternatively, you could replace this {} (which becomes the argkeys above afaict) with an OrderedDict:
argkeys_cache[scopenum] = d = {}

@bilderbuchi
Contributor

OK, I just put my money where my mouth is and tested this myself - it doesn't work. 😞
Also, I observe the exact same (wrong) order with the repro above, and it stays the same on successive pytest runs, so it's probably not a dict issue as I guessed above.

@ghost

ghost commented Oct 25, 2017

@nicoddemus Thank you for such a detailed explanation.
I'll try to do some more debugging and I'll get back to you if I come up with a possible solution or if I have more questions.

@nicoddemus
Member

Sounds good, thanks!

@ghost

ghost commented Oct 25, 2017

@nicoddemus: After reading your explanation, looking deeper into the code, and re-checking the OP's examples, there doesn't seem to be a bug at all.

Taking into account what you wrote:

the purpose is to reorder test items within a scope to minimize fixture setup/teardown, when those fixtures are parametrized.

and

A key fact to understand all this and that is not clear in the documentation is that at any given time only a single instance of session_fix can exist, that's why it needs to be destroyed between each test function call which parametrizes it.

the 2nd example posted by the OP seems to be perfectly valid: the order seems random at first, but a closer look shows that the test order is optimal for avoiding unnecessary fixture setups and teardowns:

# OP 2nd example results
test_order.py::test[1-a] PASSED
test_order.py::test[1-b] PASSED
test_order.py::test[2-b] PASSED
test_order.py::test[2-a] PASSED
test_order.py::test[2-c] PASSED
test_order.py::test[1-c] PASSED
test_order.py::test[3-c] PASSED
test_order.py::test[3-b] PASSED
test_order.py::test[3-a] PASSED

f1 is created 4 times and f2 is created 6 times for a total of 10.

If we would like to have the same order for class, module and session fixtures as we have for function scoped fixtures:

# OP 1st example results
test_order.py::test[1-a] PASSED
test_order.py::test[1-b] PASSED
test_order.py::test[1-c] PASSED
test_order.py::test[2-a] PASSED
test_order.py::test[2-b] PASSED
test_order.py::test[2-c] PASSED
test_order.py::test[3-a] PASSED
test_order.py::test[3-b] PASSED
test_order.py::test[3-c] PASSED

then it's not optimal, because f1 is created only 3 times but f2 is created 9 times, for a total of 12. That's fine for function-scoped fixtures, because they get set up and torn down for every test anyway, so we simply don't care, but for higher-scoped fixtures the current solution works perfectly fine and as expected.
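(Editorial note: the two counts, 10 vs. 12, can be checked with a small simulation, assuming a module-scoped fixture is torn down and recreated whenever its parameter changes. This is a sketch, not pytest code:)

```python
# Sketch: count fixture setups for a given test order, assuming only one
# instance of each module-scoped fixture can exist at a time.
def count_setups(order):
    setups = 0
    current = {}  # currently live parameter per fixture
    for f1, f2 in order:
        for name, value in (("f1", f1), ("f2", f2)):
            if current.get(name) != value:
                current[name] = value
                setups += 1  # old instance torn down, new one created
    return setups

product_order = [(a, b) for a in (1, 2, 3) for b in "abc"]  # OP's 1st example
pytest_order = [(1, "a"), (1, "b"), (2, "b"), (2, "a"), (2, "c"),
                (1, "c"), (3, "c"), (3, "b"), (3, "a")]      # OP's 2nd example
print(count_setups(product_order), count_setups(pytest_order))  # 12 10
```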

@bilderbuchi
Contributor

Ha, awesome reasoning! Didn't see the forest for the trees!

@nicoddemus
Member

nicoddemus commented Oct 25, 2017

@kchomski-reef indeed your reasoning seems correct to me, thanks for taking a look into this.

@rbierbasz-gog what do you think?

@rbierbasz
Author

@kchomski-reef thanks for the great insight. I haven't checked the reorder_items code yet, but your reasoning seems correct. I just think the assumption this functionality is based on (that all parametrized fixtures have the same weight) is naive. In the real world the f1 fixture from my example can have a much more expensive setup/teardown, or there can be more fixtures depending on it. It would be great if I could control the order somehow. I tried indirect parametrization:

import itertools
import pytest

f1_params = (1, 2, 3)
@pytest.fixture(scope="module", params=f1_params)
def f1(request):
    pass
    
f2_params = ('a', 'b', 'c')
@pytest.fixture(scope="module", params=f2_params)
def f2(request):
    pass

@pytest.mark.parametrize('f1,f2', itertools.product(f1_params, f2_params), indirect=True)
def test(f1, f2):
    pass

But it breaks the fixtures' scopes (#570). The only working solution for me now is moving one parametrized fixture to a higher scope:

import pytest

@pytest.fixture(scope="session", params=(1, 2, 3))
def f1_meta(request):
    return request.param

@pytest.fixture(scope="module")
def f1(f1_meta):
    pass
    
@pytest.fixture(scope="module", params=('a', 'b', 'c'))
def f2(request):
    pass

def test(f1, f2):
    pass

But it seems hacky and it's not scalable.
Is there any way of achieving this? Some hook, maybe?

@ghost

ghost commented Oct 26, 2017

@rbierbasz-gog I totally agree. I came to the same conclusion while investigating this issue.

It would be really nice to be able to indicate the importance/weight of a fixture and have tests reordered based on it. Something like an additional keyword argument with a default value, maybe.

@nicoddemus do you think it is a possible feature candidate?

@nicoddemus
Member

I think we can cook up a hook that allows users to completely customize fixture ordering; we just have to think of a good interface for that.

Anybody up for opening an issue with an initial proposal for discussion?

@rbierbasz
Author

Setting aside a new hook for the moment: pytest_collection_modifyitems allows reordering tests, which is enough for my needs:

def pytest_collection_modifyitems(session, config, items):
    items.sort(key=lambda x: x._genid)

It's still hacky (accessing a private attribute of the Item object) 😄
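(Editorial note: a slightly less fragile variant of the same hack could key on the public nodeid instead of the private _genid, extracting the bracketed parameter id. The param_id helper below is hypothetical, not a pytest API:)

```python
# conftest.py sketch: sort parametrized items by the bracketed id in the
# public nodeid rather than the private _genid attribute.
def param_id(nodeid):
    # "test_order.py::test[1-a]" -> "1-a"; unparametrized items sort first
    if "[" in nodeid:
        return nodeid[nodeid.index("[") + 1:-1]
    return ""

def pytest_collection_modifyitems(session, config, items):
    items.sort(key=lambda item: param_id(item.nodeid))
```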

@bilderbuchi
Contributor

In any case I think this issue can be closed.

@nicoddemus
Member

Closing then.

Anybody up for opening an issue with an initial proposal for discussion?

That invitation still stands though! We would need a few use cases to come up with a hook that is useful for those cases.

@satishmovvar

@nicoddemus @kchomski-reef, All,

I seem to have a use case where I want to control how tests get created with parametrized fixtures. The main reason I need this feature is that my test results depend on the previous test's result. At times I need to save a previous test's result and then compare against it.

I want to sequence the tests in a predictable order, as originally described in this issue, even when the fixtures' scopes are at class/module level.

OP 1st example results

test_order.py::test[1-a] PASSED
test_order.py::test[1-b] PASSED
test_order.py::test[1-c] PASSED
test_order.py::test[2-a] PASSED
test_order.py::test[2-b] PASSED
test_order.py::test[2-c] PASSED
test_order.py::test[3-a] PASSED
test_order.py::test[3-b] PASSED
test_order.py::test[3-c] PASSED

@satishmovvar

satishmovvar commented Jan 26, 2018

Below is a more concrete scenario of why my tests need to run in a particular sequence.
I am testing the volume function on an amplifier. I need to check that a change in volume level has changed the output gain, and I have different speaker outputs to test across.
I store the output level measured by the previous test and compare it with the next test. So the order of fixture usage/test execution is very important, and I want to control it so that I can predict the outcome.

  1. setInputVoltage & create ListenerObj
  2. setInputFreq
  3. setVolume
  4. listenAudio
    4.a check the output freq of the audio and compare with the input (test all channels - varChannel)
    4.b if a previous volumeLevel is present, test it (test all channels - varChannel),
        else go to the next test and compare against the previous setting
@pytest.fixture(scope="module", autouse=True)
def listener(request):
    # listener object is capable of listening (Listener class assumed to exist)
    listener_obj = Listener()
    request.module.listener = listener_obj
    return listener_obj

@pytest.fixture(scope="module", params=[110, 220], autouse=True)
def setInputVoltage(request):
    # set input voltage
    return request.param

@pytest.fixture(scope="class", params=[100, 200, 500], autouse=True)
def setInputFreq(request):
    # set input frequency
    return request.param

@pytest.fixture(scope="class", params=[10, 20, 30, 40])
def setVolume(request):
    # set volume
    return request.param

@pytest.fixture(scope="class")
def listenAudio(request, listener):
    # measures the output of all channels as a list
    listener.listen()
    def fin():
        listener.store()
    request.addfinalizer(fin)

@pytest.fixture(scope="function", params=[1, 2, 3])
def varChannel(request):
    # declared as a fixture to create multiple tests, one per channel
    return request.param

def test_amplifierIsOn():
    pass  # placeholder

@pytest.mark.usefixtures("listenAudio", "setVolume")
class Test_AudioLevel:
    def test_outputFreq(self, setInputFreq):
        assert ...  # compare the output frequency of the audio with the input

    def test_volumeLevel(self, setVolume, varChannel):
        if listener.storedData:
            assert listener.currentdata[varChannel] > listener.storedData[varChannel]

@gavincyi

gavincyi commented Apr 12, 2018

Totally agree with @rbierbasz-gog. An equal-weighted setup/teardown cost for all fixtures is not a pragmatic assumption, and users would like to customize the ordering if they really care about performance.

Could we have one more parameter, e.g. priority (between 0 and 100), for users to customize the setup/teardown cost of a fixture? For example,

@pytest.fixture(scope="session", params=[1, 2, 3], priority=100)
def f1(request):
    pass

@pytest.fixture(scope="session", params=['a', 'b', 'c'], priority=50)
def f2(request):
    pass

def test_f(f1, f2):
    pass

The expected order is then

test_order.py::test[1-a] PASSED
test_order.py::test[1-b] PASSED
test_order.py::test[1-c] PASSED
test_order.py::test[2-a] PASSED
test_order.py::test[2-b] PASSED
test_order.py::test[2-c] PASSED
test_order.py::test[3-a] PASSED
test_order.py::test[3-b] PASSED
test_order.py::test[3-c] PASSED
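(Editorial note: one way such a priority could translate into an ordering is to treat the highest-priority fixture as the outermost loop of the parameter product, so its value changes least often. A standalone sketch of the proposed semantics; priority is not a real pytest.fixture argument:)

```python
import itertools

# Sketch of the proposal: the higher a fixture's priority, the less often
# its value changes, i.e. it becomes an outer loop of the product.
def order_by_priority(fixture_params, priorities):
    names = sorted(fixture_params, key=lambda n: priorities[n], reverse=True)
    return [dict(zip(names, combo))
            for combo in itertools.product(*(fixture_params[n] for n in names))]

order = order_by_priority({"f1": (1, 2, 3), "f2": ("a", "b", "c")},
                          {"f1": 100, "f2": 50})
print(order[0], order[-1])  # {'f1': 1, 'f2': 'a'} {'f1': 3, 'f2': 'c'}
```

This reproduces the expected order above: f1 (priority 100) varies slowest, f2 (priority 50) fastest.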

@nicoddemus
Member

@satishmovvar and gavincyi perhaps we should create a new issue?

@gavincyi

Good point to open a new issue rather than stay stuck in a closed ticket. #3393 created.
