Community calls

This page holds (temporarily) the agenda and minutes of the bi-weekly community conference calls.

July 12, 2022

Call skipped due to low participation.

Participants

Agenda

Frequency of this meeting
- Suggestion to reduce to monthly.
ReFrame repository has been moved to https://github.com/reframe-hpc/reframe
The CSCS checks have been separated from the main repo, in https://github.com/eth-cscs/cscs-reframe-tests
The GPU Burn test is now a library test, as of #2503
ReFrame 3.11.2 released (Release Notes)
ReFrame 3.12.0 released.
Latest features and bugfixes:
- Allow setting fixture variables from the command line #2515
- Add --mode option to GitLab CI pipeline command #2514
- Check that PBS output is written back to working directory before setting the job as completed #2519
- Working on making tests container-runtime agnostic #2396
- Performance improvements in test case generation #2544
Upcoming plans (https://github.com/reframe-hpc/reframe/milestone/81):
- Move towards ReFrame 4.0 (Tentative backlog)
- Support more flexible ways of configuration #1725
- Convert more CSCS tests to library tests

May 3, 2022

Participants

Vasileios Karakasis (CSCS)
Kenneth Hoste (HPC-UGent)
Victor Holanda (CSCS)
Carlos Rosales-Fernandez (AWS)
Theofilos Manitaras (CSCS)

Agenda

Development updates

ReFrame 3.11.0 released on April 13.
Key new features:
- New --distribute option that allows distributing single-node jobs over a set of nodes. It can also be combined with the -J option, for example to submit jobs to fill a reservation: --distribute=all -J reservation=cool. The current valid partition is always taken into account.
- Extended syntax for valid_systems and valid_prog_environs that allows selecting systems and environments based on features and properties.
- New CustomBuild build backend that delegates the building of test code entirely to users. If you use it, be aware of the side effects of your build scripts!
- Explicitly mark variables and parameters as loggable.
- New library tests merged in.
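To illustrate selection based on features and properties, here is a minimal sketch that matches a constraint string against the features and key/value properties of each partition. The `+feature` and `%key=value` token spelling, the partition names and the helper functions are assumptions made for this example. It is not ReFrame's implementation.

```python
def parse_spec(spec):
    """Split a spec such as '+gpu %arch=zen2' into required features
    and required key/value properties (hypothetical notation)."""
    required, props = set(), {}
    for token in spec.split():
        if token.startswith('+'):
            required.add(token[1:])
        elif token.startswith('%'):
            key, _, value = token[1:].partition('=')
            props[key] = value
        else:
            raise ValueError(f'unrecognised token: {token!r}')
    return required, props


def matches(spec, features, properties):
    """Return True if a partition satisfies every constraint in spec."""
    required, props = parse_spec(spec)
    return (required <= features and
            all(properties.get(k) == v for k, v in props.items()))


# Hypothetical partitions: name -> (features, properties)
partitions = {
    'cluster:cpu': ({'infiniband'}, {'arch': 'zen2'}),
    'cluster:gpu': ({'gpu', 'infiniband'}, {'arch': 'zen2'}),
}

selected = [name for name, (feats, props) in partitions.items()
            if matches('+gpu %arch=zen2', feats, props)]
print(selected)  # ['cluster:gpu']
```

With this approach a test states what it needs rather than naming systems, so adding a new partition with the right features requires no change to the test.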
Future directions:
- Move the repo out of eth-cscs domain and separate the CSCS tests.
- Continue work on test libraries
- Backlog for 3.12 (tentative): https://github.com/eth-cscs/reframe/projects/36

Action Items

Set up a separate meeting with EESSI community on defining common systems/environment properties and features (Victor will make a Doodle and post it in the #confcalls channel).

April 5, 2022

Participants

Vasileios Karakasis (CSCS)
Åke Sandgren (Umeå Univ.)
Rafael Sarmiento (CSCS)
Eirini Koutsaniti (CSCS)
Theofilos Manitaras (CSCS)
Carlos Rosales Fernandez (Amazon)

Agenda

Development updates

The OSU microbenchmarks have been merged as library tests.
Almost done with the extended syntax of valid_systems and valid_prog_environs (https://github.com/eth-cscs/reframe/pull/2479)
- We had to reimplement how valid systems/environments are selected in order to make it work with fixtures
- The implementation also fixes the bug with the --skip-{system|prgenv}-check options when using fixtures.
Still WIP: Distributing a set of tests over multiple nodes (https://github.com/eth-cscs/reframe/pull/2458)
v3.11.0 is planned for Wed. 13/4, since we need to have the two major features above merged.
April 19 call will be skipped.
Åke: When do you plan to split the repo and the site-specific tests?
- We do plan to focus on it as soon as 3.11.0 is out.

March 22, 2022

Participants

Vasileios Karakasis (CSCS)
Victor Holanda (CSCS)
Theofilos Manitaras (CSCS)
Simon Branford (Univ. of Birmingham)

Agenda

We will delay 3.11.0 for two weeks (work got stuck due to limited availability of the team), but an rc release will be done today.
Draft PRs
- Syntax extensions for valid_systems and valid_prog_environs: https://github.com/eth-cscs/reframe/pull/2479
- OSU microbenchmarks library test (https://github.com/eth-cscs/reframe/pull/2421)
  - Still requires a bit of fine tuning, but it will soon be ready to merge.
- Generating node-pinned tests (https://github.com/eth-cscs/reframe/pull/2458)
  - We needed to address some limitations on how we can dynamically generate tests
  - https://github.com/eth-cscs/reframe/pull/2470
  - https://github.com/eth-cscs/reframe/pull/2474

February 22, 2022

Attendees

Vasileios Karakasis (CSCS)
Theofilos Manitaras (CSCS)
Eirini Koutsaniti (CSCS)
Jg Piccinali (CSCS)
Kenneth Hoste (HPC-UGent)
Åke Sandgren (Umeå Univ)
Rafael Sarmiento (CSCS)
Carlos Rosales (Amazon)
Richard Henwood (Arm)
Simon Branford (Univ. of Birmingham)

Agenda

We will skip 3.10.2 and target 3.11.0 for March 22; two dev releases in-between.
- Bug fixes
  - Fixed weird behaviour when overriding hooks within the same test (https://github.com/eth-cscs/reframe/pull/2436)
  - Fixed sub-configuration selection when running tests (https://github.com/eth-cscs/reframe/pull/2438)
  - Do not set up Spack shell support (https://github.com/eth-cscs/reframe/pull/2424)
- Enhancements
  - Control which attributes, variables or parameters can be logged (https://github.com/eth-cscs/reframe/pull/2428); the current behaviour can cause problems with Logstash and lead to lost records.
  - Remove pipeline timings from output.
- OSU library test and the associated CSCS tests PR (under review): https://github.com/eth-cscs/reframe/pull/2421
- Next sprint: https://github.com/eth-cscs/reframe/milestone/76
Community feedback
- Extension of the valid_systems and valid_prog_environs syntax is still work in progress. What if we supported basic compiler abstractions as in Spack here?
  - Vasileios: There are no plans for compiler auto-detection and auto-generation of the environments configuration section.
  - Kenneth: this could quickly become a time-consuming task, since compiler versions, etc. are also relevant
  - Kenneth: this seems like an opportunity for a common Python library that could be leveraged by ReFrame, Spack, EasyBuild, ...
    - kind of similar to archspec (cf. the -mtune & co. options that archspec knows about), but compiler flags for OpenMP are out of scope there...
  - Richard: Delegate the compilation task fully onto Spack and use the compiler info to generate the ReFrame config on-the-fly. Then ReFrame tests are monkey-patched to parametrise them over the various specs.
- Use cases of running a test session continuously until a time limit is reached: https://github.com/eth-cscs/reframe/issues/619
  - could be used for burn-in testing, simulate user workload, ...
  - also related to exploring range of combinations for multi-node tests, since often not enough tests are generated to actually fill a system
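The "run until a time limit is reached" use case can be sketched as a simple loop around a session runner. This is only an illustration of the idea discussed in the issue: `repeat_until` and its parameters are hypothetical names, and `run_session` stands in for whatever actually launches a test session.

```python
import time


def repeat_until(time_limit, run_session, clock=time.monotonic):
    """Call run_session() repeatedly until time_limit seconds have elapsed.

    A session that has already started is never interrupted, so the total
    duration may exceed the limit by up to one session's length.
    """
    start = clock()
    completed = 0
    while clock() - start < time_limit:
        run_session()
        completed += 1
    return completed


# Stand-in for a real test session: sleep briefly.
count = repeat_until(0.05, lambda: time.sleep(0.01))
print(f'completed {count} sessions')
```

The clock is injectable so that the loop can be exercised without waiting for real time to pass.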
Meeting frequency
AOB

February 8, 2022

Attendees

Vasileios Karakasis (CSCS)
Victor Holanda (CSCS)
Theofilos Manitaras (CSCS)
Jg Piccinali (CSCS)
Stefan Wolfsheimer (SURF)
Kenneth Hoste (HPC-UGent)
Åke Sandgren (Umeå Univ.)
Ben Fulton (Indiana Univ.)
Caspar van Leeuwen (SURF)
Rafael Sarmiento (CSCS)
Carlos Rosales (Amazon)

Agenda

Development updates
- ReFrame 3.10.0 is out: https://github.com/eth-cscs/reframe/releases/tag/v3.10.0
- ReFrame 3.10.1 planned for today: https://github.com/eth-cscs/reframe/milestone/74?closed=1
- Next sprint: https://github.com/eth-cscs/reframe/milestone/75
- Added new labels to tag each issue with the framework part it refers to
- We plan to migrate the repo under github.com/reframe-hpc.
Community feedback on use cases
- Do you use or plan to use ReFrame to test and deploy software stack, e.g., using Spack/EasyBuild?
  - Feedback: This is an interesting feature for both Spack and EasyBuild for exploring different build configurations, but it's not likely to be used for deploying the software stack.
- Towards relaxing valid_systems and valid_prog_environs: https://github.com/eth-cscs/reframe/issues/1987
  - The key challenge here is to also integrate the resources that can be defined in the configuration, which are currently accessed through extra_resources inside the test.
  - There are three types of system-related attributes: features, key/value properties and scheduler resources.
- Submit single node job automatically on every node of a reframe partition: https://github.com/eth-cscs/reframe/issues/2334
  - would be very useful to find "bad nodes" in a given reservation
  - automatically submit a separate copy of a test to each node
  - for now, nothing combinatorial (explodes quickly after 2 nodes...)
  - combinations could be selected by picking N out of M possibilities at random, or by striding through the set of 100 nodes (1-10, 11-20, etc.)
    - selection mechanism is really needed when running 16-node tests out of 100 available nodes
  - Caspar: could tests somehow indicate that they want to use flexible allocation?
    - example: gpuburn to check thermal throttling of GPUs ("hardware test")
    - tests that aim to validate working software are probably less interesting to run with flexible allocation
    - idea: --flex-alloc-singlenode=idle:testXYZ,testABC => only run these 2 specific single node tests across all nodes
  - Theo: Should the tests in such a scenario share a single stage directory so as to avoid redundant builds?
  - Åke: This case should be addressed by fixtures, where the build part of the test is a fixture and you only dynamically parametrise the run test.
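The node-selection ideas discussed above (the combinatorial explosion, random picks of N out of M, and strided groups) can be sketched in a few lines. The node names and helper functions below are made up for illustration and do not correspond to any ReFrame option or API.

```python
import math
import random


def strided_groups(nodes, size):
    """Consecutive, non-overlapping groups (nodes 1-10, 11-20, ...).
    Leftover nodes that do not fill a whole group are dropped."""
    return [nodes[i:i + size]
            for i in range(0, len(nodes) - size + 1, size)]


def random_groups(nodes, size, count, seed=None):
    """Pick `count` groups of `size` distinct nodes at random."""
    rng = random.Random(seed)
    return [sorted(rng.sample(nodes, size)) for _ in range(count)]


# Hypothetical reservation of 100 nodes
nodes = [f'nid{i:03d}' for i in range(1, 101)]

# Exhaustive combinations grow quickly with the group size.
print(math.comb(100, 2))   # 4950 possible node pairs
print(math.comb(100, 16))  # far too many to run one job per combination

groups = strided_groups(nodes, 10)
print(len(groups), groups[0][0], groups[0][-1])  # 10 nid001 nid010

# Five random 16-node groups; a fixed seed makes the choice reproducible.
for group in random_groups(nodes, 16, count=5, seed=0):
    print(group[0], '...', group[-1])
```

Strided groups cover every node exactly once (apart from any remainder), which suits hunting for bad nodes, whereas random groups sample different node combinations across repeated runs.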
Maintenance of scheduler backends
AOB

January 11, 2022

Agenda

Welcome and introductions
- Briefly introduce yourself: where are you using (or planning to use) ReFrame?
Development status
- Team & contributions
  - Core team (@ekouts, @rsarm, @teojgo, @vkarak, @victorusu)
  - Contributions are more than welcome!
- Development model
  - Release train model: A new release every two weeks; releases are not delayed; whatever is ready and merged gets released
  - Semantic versioning: <major>.<minor>.<patch>
    - Patch-level bumps (every two weeks): bug fixes and new features (no deprecations)
    - Minor version bumps (every 6–8 weeks): introduction of major features (deprecations are allowed, but backward compatibility is ensured)
    - Major version bumps: backward compatibility may be broken.
- Upcoming major features scheduled for 3.10.
  - Asynchronous builds (https://github.com/eth-cscs/reframe/pull/2194)
  - New test naming scheme (https://github.com/eth-cscs/reframe/pull/2355)
Outlook for HPC Test library
- Proof-of-concept in hpctestlib/ (documentation: https://reframe-hpc.readthedocs.io/en/stable/hpctestlib.html)
- Continue with creating library tests from our microbenchmarks
- Still unclear: community contributions, library location (different repo?), moving to stable
Discuss issues that need resolution (feature requests, bugs)
Discuss interesting use cases
