Currently, an NU user is assigned at random to one of 4 different login nodes. For bridge mode to work, the idmtools-slurm-bridge utility has to be started on each of these nodes, because processes are not shared between nodes.
But this creates a few issues:
Users do not know that they need to run the idmtools-slurm-bridge utility on each login node.
Even if they do start idmtools-slurm-bridge on each node, the slurm-bridge.pid file is stored in the same location for all nodes (the default is ~/.idmtools/singularity-bridge/). As a result, the second time idmtools-slurm-bridge is run, on a different node, it asks you to delete the existing slurm-bridge.pid even though no bridge process has ever run on that node, which is really confusing for users.
My workaround for the second issue was to delete the pid file on the new node anyway. I ended up starting idmtools-slurm-bridge on each node (4 nodes total), but my slurm-bridge.pid only holds the PID of the last one started.
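One possible way around the shared pid file, sketched here purely as an illustration, would be to include the hostname in the pid file name and to check whether the recorded PID is alive on the current node. This is not how idmtools-slurm-bridge is implemented; the function names and the `slurm-bridge.<host>.pid` naming are assumptions made for this sketch, and the liveness check relies on POSIX `os.kill` semantics.

```python
import os
import socket
from pathlib import Path


def host_pid_path(base_dir, hostname=None):
    """Return a pid file path that is unique per login node.

    Hypothetical naming scheme: slurm-bridge.<hostname>.pid
    """
    host = hostname or socket.gethostname()
    return Path(base_dir).expanduser() / f"slurm-bridge.{host}.pid"


def pid_is_running(pid):
    """Check whether a process with this PID exists on the current node (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 only probes for existence; nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pid_file_is_stale(pid_file):
    """True if the pid file is missing, unreadable, or names a dead process."""
    try:
        pid = int(Path(pid_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return True
    return not pid_is_running(pid)


if __name__ == "__main__":
    # Default directory taken from the issue text above.
    path = host_pid_path("~/.idmtools/singularity-bridge/")
    print(path, "stale" if pid_file_is_stale(path) else "in use")
```

With a scheme like this, each of the 4 nodes would get its own pid file, so starting the bridge on a second node would not prompt about a PID that belongs to a different machine.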
We should add documentation stating that users need to connect to the same node. At NU, users currently cannot control this, but in the future we may be able to document a way, or work with the sysadmins, to guarantee SSH access to a specific node.