
New directives syntax #819

Draft · wants to merge 34 commits into main
Conversation

b-butler
Member

Description

This PR removes and renames the supported directives in favor of a more generic structure with less ambiguous settings that maps more closely onto modern scheduler requests. One example is renaming np to n_processes, which is more self-documenting. We also anchor most other directives, such as n_threads_per_process, to n_processes as the base unit.
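
For illustration, a rough before/after sketch (values here are hypothetical; the directive names are the ones mentioned in this description):

    # Old-style directives (illustrative values only):
    directives = {"np": 8}

    # New-style directives: n_processes replaces np and serves as the anchor
    # for the other requests.
    directives = {
        "n_processes": 8,
        "n_threads_per_process": 2,  # defined relative to n_processes
    }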

Motivation and Context

There have been multiple issues arising from confusing directive behavior or naming (np versus nranks, for instance). This change clears them up, simplifies the logic to support various workflows, and appropriately disallows requests that are unlikely to work on a scheduler.


This is useful for resource aggregation across groups/bundles. We use these directives internally to help with scheduling. Even when templates are used, working with per-process values first allows more control over the resource request, and computing the totals later is cheap and simple.
Uses new directives and -per-task options.
    directives["gpus"] = directives["processes"] * directives["gpus_per_process"]
    if (memory := directives["memory_per_cpu"]) is not None:
        directives["memory"] = directives["cpus"] * memory
    else:


Georgia Tech's SLURM scheduler also uses "memory_per_gpu", so if this is the appropriate place, do we need to add the below here as well?

    elif (memory := directives["memory_per_gpu"]) is not None:
        directives["memory"] = directives["gpus"] * memory

Member


memory_per_gpu is not a directive. In this new scheme, processes is the base unit on which gpus and memory are based. If your users need a specific amount of memory per GPU, they could set:

    directives = {"processes": N, "gpus_per_process": G, "memory_per_process": memory_per_gpu * G}
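
For instance, with hypothetical numbers plugged into the pattern above:

    # Hypothetical example: 2 processes, 2 GPUs each, 16 GiB needed per GPU.
    N, G = 2, 2
    memory_per_gpu = 16  # GiB per GPU (a local variable, not a directive)
    directives = {
        "processes": N,
        "gpus_per_process": G,
        "memory_per_process": memory_per_gpu * G,  # 32 GiB per process
    }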

Member


Wait, I got that wrong. The directive is (currently) memory_per_cpu, so you would have to calculate it from the number of CPUs per GPU.

I originally suggested memory_per_cpu because it provides a 1:1 mapping to SLURM. However, @b-butler explained that the complexities of bundling and groups essentially require computing an aggregate memory and then dividing to issue the proper SLURM request. Thus, we could change the user-facing directive to memory_per_process.

In either case, the information will be there for the Georgia Tech template to produce --mem-per-gpu if that is a hard requirement.
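
A sketch of the calculation described above (all names and values hypothetical): translating a per-GPU memory requirement into the memory_per_cpu directive by accounting for the CPU:GPU ratio of the request.

    cpus_per_process = 8
    gpus_per_process = 1
    memory_per_gpu = 16  # GiB required per GPU
    memory_per_cpu = memory_per_gpu * gpus_per_process / cpus_per_process  # 2.0 GiB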



--mem-per-gpu is a variable in Georgia Tech's submission script. We can enter any custom value as long as we can set the values in the header block (see for example https://github.com/bcrawford39GT/signac_numpy_tutorial/blob/main/signac_numpy_tutorial/project/templates/phoenix.sh) without signac auto-placing a default --mem option that cannot be removed. This is the main issue: if we add anything, the default --mem option is also added, causing two conflicting options. Maybe there could be a variable to not auto-print --mem?

I think the memory_per_process option would be fine instead of --mem-per-cpu and --mem-per-gpu, as long as we can access the value in the header block (same example) for manipulation if needed, and signac does not write --mem automatically. Alternatively, maybe you are saying memory_per_process will be automatically written to --mem?

If supporting the original --mem-per-cpu and --mem-per-gpu is not feasible, that is fine. I was under the impression it was OK, not an issue, and in progress.

We will try to adapt to whatever is feasible. I am just trying to understand the viable options and direction.

Member


The new default template will always write --mem-per-cpu to the SLURM script, never --mem. We can put this in a {% block memory %} so that child scripts can override it if needed.
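
A minimal jinja2 sketch of that override mechanism (template names, variables, and values here are hypothetical, not flow's actual templates):

    import jinja2

    templates = {
        "base_slurm.sh": (
            "#!/bin/bash\n"
            "{% block memory %}"
            "#SBATCH --mem-per-cpu={{ memory_per_cpu }}\n"
            "{% endblock %}"
        ),
        # A child script overrides only the memory block.
        "phoenix.sh": (
            "{% extends 'base_slurm.sh' %}"
            "{% block memory %}"
            "#SBATCH --mem-per-gpu={{ memory_per_gpu }}\n"
            "{% endblock %}"
        ),
    }
    env = jinja2.Environment(loader=jinja2.DictLoader(templates))
    print(env.get_template("phoenix.sh").render(memory_per_gpu="16G"))
    # -> #!/bin/bash
    # -> #SBATCH --mem-per-gpu=16G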



OK. As long as the user also has an option to stop the default writing of the new --mem-per-cpu, that should be fine. We just cannot specify it more than once, or the job will error out as over-specified.

Member


I wish we didn't need memory directives at all, but we do. A number of clusters that flow supports default to a ridiculously small amount of memory per rank, so users need the ability to request more. It is the responsibility of the user to request an appropriate amount that does not result in extra charges.

The logic that flow needs to both a) combine memory requests across bundles/groups and b) provide the --mem-per-cpu needed for srun to correctly launch jobs is complex.
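
A rough sketch of the aggregation half of that problem (directive names and numbers are hypothetical): sum memory across bundled operations, then express the total as the per-CPU value that --mem-per-cpu expects.

    import math

    bundle = [
        {"cpus": 4, "memory": 8.0},   # total GiB requested by each operation
        {"cpus": 2, "memory": 16.0},
    ]
    total_cpus = sum(op["cpus"] for op in bundle)
    total_memory = sum(op["memory"] for op in bundle)
    mem_per_cpu = math.ceil(total_memory / total_cpus)  # 4 GiB per CPU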


I agree. We can work with whatever final solution works for everyone. I just wanted to state the barriers so you are aware.


I wish memory was free too, on GPUs as well! Then there would be no need to buy commercial cards.

@b-butler
Member Author

Sorry for the delay; I have uploaded my current changes.

Some things that need to be done:

  • Regenerate the template test standard (the generation script currently fails and needs to be debugged)
  • Add support for OpenMP, which requires modifying the launcher syntax
  • Test on clusters to validate correctness
  • Add new tests of the expected behavior

Once debugged, the code will support homogeneous MPI, heterogeneous non-MPI, and mixed MPI/non-MPI submissions.
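
For illustration, hypothetical per-operation shapes for those cases (directive names follow this PR's draft style and are not a final API):

    homogeneous_mpi = {"processes": 16, "threads_per_process": 1}  # launched via MPI
    non_mpi_threaded = {"processes": 1, "threads_per_process": 8}  # launched directly
    # A mixed submission bundles operations like these two in one job.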

@joaander mentioned this pull request on May 28, 2024.