Incremental analysis: caching Mutations for multiple runs #1085
Comments
The problem here is that the code and tests are interdependent. Tests are not always at fault for an escaped mutation. Sometimes the code isn't written well enough to be properly tested by that same test. Another problem is that a change in any part of the code may cause a new mutation to escape, just as it may cause an old mutation to be caught. Now, suppose there is a cache keyed per mutation, per line of code. Under what terms are we going to invalidate it? OK, we can invalidate it if either the test or the subject changes. But what if a mutation is affected by another file, which is technically not covered by the test? This is a great idea, but I have too many questions and too few answers, so I'm not even sure where we should begin to implement it.
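To make the "invalidate if either the test or the subject changes" rule concrete, here is a minimal Python sketch of a cache key for one mutation. It is not Infection's code: the function name, its parameters, and the choice of SHA-256 are all assumptions made for illustration. It also does not answer the objection about files the test does not cover; it only shows the simple part of the rule.

```python
import hashlib


def mutation_cache_key(mutator: str, line: int, subject_source: str,
                       test_sources: list[str]) -> str:
    """Hypothetical helper: digest that changes when the mutation,
    the subject file, or any covering test changes."""
    digest = hashlib.sha256()
    # Sort the test sources so their order does not affect the key.
    parts = [mutator, str(line), subject_source, *sorted(test_sources)]
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so adjacent parts cannot blur together.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

A cached result would be reused only when the key computed for the current run matches the stored key; any edit to the subject or to a covering test produces a different key and forces the mutation to run again.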
From a 10,000-foot view, I would assume code coverage becomes the determining factor. That is to say, if a file has changed since the last run, Infection would automatically check it against the code coverage and run all mutations that touch that file. I believe this would be the most sensible way to invalidate the cache. Infection is already smart enough to skip files based on their coverage. I think it would be simple enough to just 'add' the unchanged files to the skipped files, as if they weren't covered.
I just took a quick look at the code. I'm not highly knowledgeable about the ins and outs, so there may be a better place, but a brief look at the system, using the concept I mentioned above, makes me think that the The only problem with this would be ensuring that the mutation gets generated if another file changed and this file is covered by the same test.
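The coverage-based rule discussed above, including the case where another file covered by the same test has changed, could be sketched roughly as follows. This is an illustration in Python rather than Infection's implementation; `files_to_remutate`, the shape of the `coverage` map (test file path to the set of source files it executes), and the example paths are all hypothetical.

```python
def files_to_remutate(coverage: dict[str, set[str]],
                      changed: set[str]) -> set[str]:
    """Hypothetical helper: return the source files whose cached
    mutation results should be treated as stale."""
    # A test is affected if it changed itself or covers a changed file.
    affected_tests = {
        test for test, covered in coverage.items()
        if test in changed or covered & changed
    }
    stale: set[str] = set()
    for test in affected_tests:
        # Every file exercised by an affected test is re-mutated,
        # even if that file itself did not change.
        stale |= coverage[test]
    # Changed source files with no covering test are still reported.
    stale |= {path for path in changed if path not in coverage}
    return stale


coverage = {
    "tests/ATest.php": {"src/A.php", "src/B.php"},
    "tests/CTest.php": {"src/C.php"},
}
stale = files_to_remutate(coverage, {"src/A.php"})
```

Here `src/B.php` is re-mutated along with `src/A.php` because both are covered by the same test. The sketch follows only one hop through the coverage map; it does not catch effects from files that no test of the subject executes, which is the harder case raised earlier in the thread.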
I think there are only two or three things that are cacheable, unfortunately:
Anything else is not cacheable. IMO for large codebases, if you want the score, you compute it in a nightly build that can take hours or dozens of hours, but otherwise Infection should most certainly be used incrementally. For now this can be done by restricting it to the changed source files only. Maybe this can be improved, though.
@theofidry, how do you suggest restricting it? Perhaps I'm missing something or not seeing what you are seeing.
with the |
@theofidry , Thank You. I will look into that. An additional thought did occur to me: since the |
Yes maybe we could have an |
Several ideas for inspiration from @hcoles #1549 (comment)
Is your feature request related to a problem? Please describe.
When running on very large projects, it can take a long time to generate the mutations, in addition to running them. As an example, I have a project that has roughly 380 files and 26,867 lines of code. This project takes approximately 20 minutes to run from start to finish. I have a second project with roughly 5,000 files and 900,000 lines of code. While I'm not sure of the exact time the second project would take, a rough calculation puts us at over 4 hours for an Infection run.
Describe the solution you'd like
PHP-CS-Fixer has the ability to specify a cache file. This cache file tracks the paths of all the files in the project and their last modified times. When you run PHP-CS-Fixer a second time, it compares the modified times of the files to what is stored in the cache file. If a file has changed, it is checked again.
I believe a similar feature would be useful in Infection. I think it would be useful to actually have two caches in Infection. The first cache would be for generating mutations. The system would only process a file to generate mutations if the file has changed since the last time Infection ran.
The second place a cache would be useful is when running the mutations. The system would only run mutations for files that have changed since the last time Infection was run.
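As a rough illustration of the modified-time cache described above, here is a small Python sketch. It is neither PHP-CS-Fixer's nor Infection's actual implementation: the function names, the JSON cache format, and the nanosecond timestamps are assumptions chosen for the example.

```python
import json
import os
from pathlib import Path


def snapshot(paths):
    """Map each path to its last-modified time in nanoseconds."""
    return {str(p): os.stat(p).st_mtime_ns for p in paths}


def changed_files(paths, cache_file):
    """Hypothetical helper: return the paths whose modified time differs
    from the cached value, then refresh the cache file."""
    cache_path = Path(cache_file)
    # A missing cache file means every file counts as changed.
    previous = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    current = snapshot(paths)
    changed = sorted(p for p, mtime in current.items()
                     if previous.get(p) != mtime)
    cache_path.write_text(json.dumps(current))
    return changed
```

On the first run every file is reported as changed; on later runs only files whose modified time moved are returned, and only those would be handed to mutation generation and execution. A real tool would probably refresh the cache only after the run succeeds, and might compare content hashes instead of timestamps, since a fresh checkout can change modified times without changing content.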
Describe alternatives you've considered
Haven't really thought that far ahead.