This repository has been archived by the owner on Nov 21, 2018. It is now read-only.

modify compare.py to consider patches #24

Open
wants to merge 1 commit into base: master

Conversation

nikomatsakis
Contributor

This is not smart enough to track the time for each incremental change separately. I didn't do that because it would have been harder. =) But also because I think it'd be better to retool so that both `process.sh` and `compare.py` use the same code (i.e., so we are measuring the same thing that the server is).

cc @nnethercote

@nnethercote
Contributor

I tried the patch. I get this error on regex-0.1.80.

```
Traceback (most recent call last):
  File "./compare.py", line 82, in <module>
    times2 = run_test(dir, rustc2)
  File "./compare.py", line 41, in run_test
    make(patches, make_env)
  File "./compare.py", line 25, in make
    subprocess.check_call('make all%s > /dev/null 2>&1' % patch, env=make_env, shell=True)
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'make all@030-compile_one > /dev/null 2>&1' returned non-zero exit status 2
```
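For context, the failing call corresponds to a build loop in `compare.py` along these lines (a sketch reconstructed from the traceback; the `@`-suffix naming of patch targets is inferred from the failing command, not confirmed):

```python
import subprocess

def make(patches, make_env):
    # One base `make all`, then one `make all@<patch>` per patch,
    # e.g. `make all@030-compile_one` (the command that fails above).
    for patch in [''] + patches:  # e.g. patches = ['@030-compile_one']
        subprocess.check_call('make all%s > /dev/null 2>&1' % patch,
                              env=make_env, shell=True)
```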

TBH, I don't like the new "patches" design. Previously things were simple: for every benchmark you could do `make clean; make; make touch; make` and measure the final `make`. I have local scripts for running the benchmarks under Cachegrind and DHAT that depend on this; they are now also broken for syntex. I think it's better if each benchmark just has one measurement coming out of it.
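For reference, that old per-benchmark measurement loop can be sketched as follows (wall-clock timing here is purely illustrative; the actual local scripts wrap Cachegrind and DHAT instead):

```python
import subprocess
import time

def time_final_make(benchmark_dir):
    # Old model: clean, full build, touch, then measure only the rebuild.
    for step in ('make clean', 'make', 'make touch'):
        subprocess.check_call(step, cwd=benchmark_dir, shell=True)
    start = time.time()
    subprocess.check_call('make', cwd=benchmark_dir, shell=True)
    return time.time() - start
```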

@nikomatsakis
Contributor Author

@nnethercote

> I get this error on regex-0.1.80.

Huh, I thought it was working. I'll check it out.

> Previously things were simple: for every benchmark you could do `make clean; make; make touch; make` and measure the final `make`.

Yes, but that model did not gracefully handle the needs of incremental compilation, nor did it model the cumulative effects of stepping through a series of patches. In any case, the resulting model is still quite simple: each test still has only one measurement. What has changed is that there is no longer one test per directory.

Regarding the Python script, I think I'll investigate rewriting it to just be a front-end for `process.sh`. My feeling is that we ought to run `process.sh` for the tests, save the results into some temporary directory, and then compare those. That way there is only one way to gather data, and any changes will be propagated automatically. (It also avoids measuring things like the time it takes for `make` to get started. I don't think our compilation is fast enough for that to matter, but it might if we start adding runtime benchmarks that report the results of `cargo bench`, which I want to do.)
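A rough sketch of what that front-end could look like (the `process.sh` invocation, the `RUSTC` environment variable, and the output layout are all hypothetical; the script's real interface may differ):

```python
import os
import subprocess
import tempfile

def gather(rustc, out_dir):
    # Hypothetical: let process.sh do all the measuring, saving results
    # under out_dir, so compare.py and the server share one data path.
    env = dict(os.environ, RUSTC=rustc)
    subprocess.check_call(['./process.sh', out_dir], env=env)

def compare(rustc1, rustc2):
    dir1, dir2 = tempfile.mkdtemp(), tempfile.mkdtemp()
    gather(rustc1, dir1)
    gather(rustc2, dir2)
    # Diff the saved result files here; the format depends on process.sh.
    return dir1, dir2
```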
