Specify groups argument #797
so you want |
lib/parallel_tests/grouper.rb
Outdated
end

specified_items, items = items.partition do |item, _size|
  specified_specs.any? { |pattern| item =~ /#{pattern}/ }
should be == since docs don't talk about regex ?
- specified_specs.any? { |pattern| item =~ /#{pattern}/ }
+ specified_specs.any? { |specified_spec| item == specified_spec }
yes thanks for catching will make that change
lib/parallel_tests/grouper.rb
Outdated
@@ -28,7 +28,31 @@ def in_even_groups_by_size(items, num_groups, options = {})
  raise 'Number of isolated processes must be less than total the number of processes'
end

if isolate_count >= 1
if options[:specify_groups]
  specify_spec_processes = options[:specify_groups].split('|')
all _spec_ -> _test_ since this is a general base class and not only rspec
maybe just do the ',' split here too so there is only 1 value going into the function:
options[:specify_groups].split('|').map { |group| group.split(',') }
and then to get all use all.flatten(1)
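A minimal sketch of that suggestion, assuming the option arrives as a single string with '|' between groups and ',' between files. The helper name parse_specify_groups and the file names are made up for illustration and are not part of the gem:

```ruby
# Hypothetical helper (not part of parallel_tests): turn the raw option
# string into one nested array, one inner array per specified group.
def parse_specify_groups(raw)
  raw.split('|').map { |group| group.split(',') }
end

groups = parse_specify_groups('spec/a_spec.rb,spec/b_spec.rb|spec/c_spec.rb')
all_specified = groups.flatten(1) # every specified file, in the order given

p groups.count   # number of specified groups
p groups
p all_specified
```

With a single nested structure the group count and the flat list of files both come from the same value.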
i edited this a bit in my subsequent commit thats coming, let me know what you think
lib/parallel_tests/grouper.rb
Outdated
end

if (specified_specs - specified_items.map(&:first)).any?
  raise 'Could not find all specs from --specify-spec-processes in main selected files & folders'
should say which to make it easier to debug
ok will add
lib/parallel_tests/grouper.rb
Outdated
specify_spec_processes.each_with_index do |specify_spec_process, i|
  groups[i] = specify_spec_process.split(',')
end
return groups if specify_spec_processes.count == num_groups
need to check that no files were left out:
raise "forgot something ?" if (items - specified).any?
good point will add
lib/parallel_tests/grouper.rb
Outdated
return groups if specify_spec_processes.count == num_groups
group_features_by_size(items_to_group(items), groups[specify_spec_processes.count..-1])
# Don't sort all the groups, only sort the ones not specified in specify_groups
sorted_groups = groups[specify_spec_processes.count..-1].map { |g| g[:items].sort }
extract variable from specify_spec_processes.count..-1 and reuse
should talk about that it disables all other sorting/grouping and ideally raise at the end of option parsing if a non-working combo is set
@grosser I made all the edits asked. Please resolve the ones that you're happy with. There's still the outstanding question of there possibly being lots of files passed in to
Actually, just realized I missed one comment
I DON'T want other specs not specified in The way I was envisioning it, for those select processes, one would have full control of those processes/the specs run in them. Even if they were chunked and happened to be a little light, in comparison to the other processes, they'd still stand alone. so if the directory structure is huge with 1000s of specs
and you ran
The groups would look like
If you think there should be an option to shuffle those other specs into these groups, if they are looking light, let me know, happy to add that kind of thing in
lib/parallel_tests/grouper.rb
Outdated
@@ -28,7 +28,39 @@ def in_even_groups_by_size(items, num_groups, options = {})
  raise 'Number of isolated processes must be less than total the number of processes'
end

if isolate_count >= 1
if options[:specify_groups]
lots of code here ... break it out into a method ?
will do 👍
lib/parallel_tests/grouper.rb
Outdated
specified_items_found, items = items.partition do |item, _size|
  all_specified_tests.any? { |specified_spec| item == specified_spec }
end
- specified_items_found, items = items.partition do |item, _size|
-   all_specified_tests.any? { |specified_spec| item == specified_spec }
- end
+ specified_items_found = items.select! { |item, _size| all_specified_tests.include?(item) } || []
Don't we want them partitioned? I support using partition because it extracts the specified tests from the items array, so we can order the specified tests as they are specified and then order the remaining items in the remaining parallel processes.
By using the ! we'd be losing the other tests that aren't specified in specified_groups, the ones that will be ordered in the remaining parallel processes
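To make the difference concrete, here is a small sketch comparing the two approaches on an array of [file, size] pairs. The helper names and file names are hypothetical and only serve the comparison:

```ruby
# Hypothetical helpers for illustration only.

# partition returns two new arrays: the matches and everything else.
def split_with_partition(items, specified)
  items.partition { |item, _size| specified.include?(item) }
end

# select! filters the receiver in place, so the non-matching pairs are gone.
# It returns nil when nothing was removed, hence the `|| []` in the suggestion.
def split_with_select_bang(items, specified)
  found = items.select! { |item, _size| specified.include?(item) } || []
  [found, items]
end

items = [['spec/a_spec.rb', 10], ['spec/b_spec.rb', 20], ['spec/c_spec.rb', 30]]
specified = ['spec/b_spec.rb']

found, rest = split_with_partition(items.dup, specified)
p found # the specified pair
p rest  # the two unspecified pairs, still available for regular grouping

found, rest = split_with_select_bang(items.dup, specified)
p found # the specified pair
p rest  # also only the specified pair; the unspecified ones were dropped
```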
lib/parallel_tests/grouper.rb
Outdated
  all_specified_tests.any? { |specified_spec| item == specified_spec }
end

specified_specs_not_found = all_specified_tests - specified_items_found.map(&:first)
- specified_specs_not_found = all_specified_tests - specified_items_found.map(&:first)
+ specified_specs_not_found = all_specified_tests - specified_items_found.keys
I don't think we can do .keys here because specified_items_found is an array of arrays with filename and size like this:
[["spec/test1.rb", 1], ["spec/test2.rb", 2], ["spec/test3.rb", 3]]
it will be a hash with the suggested change above
true...dang I think I need to update some of the unit tests...I don't think their data structures match what they would be at this stage...
I take it back, I think items at this point should be an array of arrays like this
[["spec/parallel_tests/cli_spec.rb", 15103],
["spec/parallel_tests/cucumber/failure_logger_spec.rb", 1676],
["spec/parallel_tests/cucumber/runner_spec.rb", 2508],
["spec/parallel_tests/cucumber/scenarios_spec.rb", 7811],
["spec/parallel_tests/grouper_spec.rb", 5028],
["spec/parallel_tests/pids_spec.rb", 706],
["spec/parallel_tests/rspec/failures_logger_spec.rb", 409],
["spec/parallel_tests/rspec/logger_base_spec.rb", 815],
["spec/parallel_tests/rspec/runner_spec.rb", 6801],
["spec/parallel_tests/rspec/runtime_logger_spec.rb", 3358],
["spec/parallel_tests/rspec/summary_logger_spec.rb", 429],
["spec/parallel_tests/spinach/runner_spec.rb", 403],
["spec/parallel_tests/tasks_spec.rb", 7345],
["spec/parallel_tests/test/runner_spec.rb", 21006],
["spec/parallel_tests/test/runtime_logger_spec.rb", 1245],
["spec/rails_spec.rb", 1464]]
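A short sketch of why .map(&:first) is the safer call here, assuming items may be either an array of [file, size] pairs or a hash of file to size. The helper name names_not_found is hypothetical:

```ruby
# Hypothetical helper: which specified files were not found among the
# collected items? map(&:first) works for an array of pairs and for a hash,
# because Hash#map also yields [key, value] pairs.
def names_not_found(all_specified_tests, specified_items_found)
  all_specified_tests - specified_items_found.map(&:first)
end

as_array = [['spec/test1.rb', 1], ['spec/test2.rb', 2], ['spec/test3.rb', 3]]
as_hash  = { 'spec/test1.rb' => 1, 'spec/test2.rb' => 2, 'spec/test3.rb' => 3 }
wanted   = ['spec/test1.rb', 'spec/test4.rb']

p names_not_found(wanted, as_array)
p names_not_found(wanted, as_hash)
p as_array.respond_to?(:keys) # Array has no #keys, so .keys would raise NoMethodError
```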
lib/parallel_tests/grouper.rb
Outdated
@@ -28,7 +28,39 @@ def in_even_groups_by_size(items, num_groups, options = {})
  raise 'Number of isolated processes must be less than total the number of processes'
end

if isolate_count >= 1
if options[:specify_groups]
  specify_test_process_groups = options[:specify_groups].split('|')
prefer a single data structure:
- specify_test_process_groups = options[:specify_groups].split('|')
+ specify_test_process_groups = options[:specify_groups].split('|').map { |g| g.split(',') }
We need them separate because we need to have a count of the number of groups, and to be able to differentiate the groups. We only will use all_specified_tests to extract them from the items variable
lib/parallel_tests/grouper.rb
Outdated
  raise "Could not find #{specified_specs_not_found} from --specify-groups in the main selected files & folders"
end

if specify_test_process_groups.count == num_groups && items.flatten.any?
not sure why we'd need to flatten
- if specify_test_process_groups.count == num_groups && items.flatten.any?
+ if specify_test_process_groups.count == num_groups && items.any?
Since items is an array of arrays, just in case it shows up like [[]] instead of [] when empty, that was what I was thinking.
But maybe when it's empty it'd just be [], so no need to flatten?
yeah, might be good as is
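For reference, a quick check of how any? behaves on the shapes discussed above. The helper leftover_items? is a hypothetical name for the items.any? variant:

```ruby
# Hypothetical helper: true when any [file, size] pairs remain ungrouped.
def leftover_items?(items)
  items.any?
end

p(leftover_items?([]))                        # => false
p(leftover_items?([['spec/a_spec.rb', 10]]))  # => true
p([[]].any?)                                  # => true, an empty inner array is still truthy
p([[]].flatten.any?)                          # => false

# partition on an empty array gives two plain empty arrays, not [[]]
found, rest = [].partition { |item, _size| item == 'spec/a_spec.rb' }
p(found, rest)
```

So flatten only matters if items could ever be [[]]; the leftover half returned by partition is a plain [] when nothing remains.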
lib/parallel_tests/grouper.rb
Outdated
  groups[i] = specify_test_process.split(',')
end

return groups if specify_test_process_groups.count == num_groups
should fail if there are no groups left, but tests left ?
actually I think the return early above will capture that scenario, this one:
if specify_test_process_groups.count == num_groups && items.flatten.any?
THIS return early scenario is the case where the files collected from the main files/folders you pass in exactly match the ones in specify_groups. That case should be allowed, and we return early with the organized specify_groups as your groups
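Putting the checks from this thread together, a rough sketch of the validation order might look like the following. This is not the PR's code: the method name, the ArgumentError class, and the wording of the second message are assumptions made for the example.

```ruby
# Hypothetical sketch of the checks discussed in this thread.
# items is an array of [file, size] pairs; specify_groups is the raw
# option string with '|' between groups and ',' between files.
def assign_specified_groups(items, num_groups, specify_groups)
  specified_groups = specify_groups.split('|').map { |g| g.split(',') }
  all_specified = specified_groups.flatten(1)

  # Pull the specified files out, keeping the rest for regular grouping.
  found, remaining = items.partition { |item, _size| all_specified.include?(item) }

  # Name the missing files to make the failure easier to debug.
  not_found = all_specified - found.map(&:first)
  unless not_found.empty?
    raise ArgumentError,
          "Could not find #{not_found} from --specify-groups in the main selected files & folders"
  end

  # Every process is already taken by a specified group, yet tests are left over.
  if specified_groups.count == num_groups && remaining.any?
    raise ArgumentError, "No groups left for #{remaining.map(&:first)} (message wording is illustrative)"
  end

  # When remaining is empty and the counts match, the caller can return
  # specified_groups as the final groups.
  [specified_groups, remaining]
end

items = [['spec/a_spec.rb', 10], ['spec/b_spec.rb', 20], ['spec/c_spec.rb', 30]]
groups, remaining = assign_specified_groups(items, 3, 'spec/b_spec.rb,spec/a_spec.rb')
p groups    # specified files in the order given
p remaining # pairs still to be distributed across the other processes
```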
lib/parallel_tests/grouper.rb
Outdated
sorted_groups = groups[regular_group_starting_process..-1].map { |g| g[:items].sort }
groups[regular_group_starting_process..-1] = sorted_groups
- sorted_groups = groups[regular_group_starting_process..-1].map { |g| g[:items].sort }
- groups[regular_group_starting_process..-1] = sorted_groups
+ groups[regular_group_starting_process..-1].each { |g| g[:items].sort! }
hmmm not a big fan that we have mixed types in the groups then, so maybe better to have the same type in there from the beginning ?
maybe start by extracting the range to a variable and I'll take a closer look when you think it's ready again
I was just copying how the other groups do this step, at the end of their blocks
instead of
sorted_groups = groups[regular_group_starting_process..-1].map { |g| g[:items].sort }
groups[regular_group_starting_process..-1] = sorted_groups
it could be 1 line
groups[regular_group_starting_process..-1].map! { |g| g[:items].sort }
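One caveat worth checking with the variants above: slicing with a range returns a new array, so map! on the slice does not change groups itself. A small sketch, assuming the specified groups are plain arrays of file names and the regular groups are hashes with :items and :size (the data and helper names are made up):

```ruby
# Hypothetical data: one specified group (plain array, order preserved)
# followed by two regular groups (hashes with :items and :size).
def build_groups
  [
    ['spec/z_spec.rb', 'spec/a_spec.rb'],
    { items: ['spec/d_spec.rb', 'spec/b_spec.rb'], size: 2 },
    { items: ['spec/e_spec.rb', 'spec/c_spec.rb'], size: 2 }
  ]
end

REGULAR_RANGE = 1..-1 # hypothetical stand-in for regular_group_starting_process..-1

# map! only changes the temporary slice; groups is returned unchanged.
def sort_with_slice_map!(groups)
  groups[REGULAR_RANGE].map! { |g| g[:items].sort }
  groups
end

# each + sort! sorts inside the shared hashes; regular groups stay hashes.
def sort_with_each!(groups)
  groups[REGULAR_RANGE].each { |g| g[:items].sort! }
  groups
end

# Assigning back replaces the hashes with sorted arrays of file names.
def sort_with_assignment(groups)
  groups[REGULAR_RANGE] = groups[REGULAR_RANGE].map { |g| g[:items].sort }
  groups
end

p sort_with_slice_map!(build_groups)
p sort_with_each!(build_groups)
p sort_with_assignment(build_groups)
```

Only the assignment form leaves every group as an array of file names, which bears on the mixed-types concern raised above.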
At my organization, we heavily use this gem. We sometimes run with 40-50 parallel processes with giant VMs in AWS.
Our test suite has a lot of data-intensive tests. We have plans to refactor these at an architecture level, but at the moment, it is a fact of life.
When using this gem, it's possible for these data-intensive tests to be shuffled so they run alongside each other. This makes them a bit more flaky but, more importantly, it makes them take much longer when they are competing with each other for data.
I wanted to use --single and --isolate-n to help with this issue; however, the specs in --single are still shuffled into parallel processes by size. There is not full control over them, so it was still possible for these tests to collide with each other.
Maybe this is an anti-pattern, but I thought it would be a good idea to create a feature to fully control a subset of parallel processes and which tests run in them (and in what order), while still respecting the other grouping schemes for the other tests.
Let me know if you think this would be a good feature to have! Happy to make edits and such.