Logging in to Supercomputing Wales
Overview
Teaching: 20 min
Exercises: 15 min
Objectives
- Understand how to log in to the Supercomputing Wales hubs.
- Understand the difference between the login node and each cluster’s head node.
Your username is your institutional ID prefixed by
b. for Bangor users,
c. for Cardiff users and
s. for Swansea users. External collaborators will have a username beginning with
Aberystwyth and Swansea users (and their external collaborators) should log in to the Swansea Sunbird system by typing:
$ ssh email@example.com
Bangor and Cardiff users (and their external collaborators) should log in to the Cardiff Hawk system by typing:
$ ssh username@hawklogin.cf.ac.uk
If you use Windows and haven’t installed the Git bash shell, you can instead use PuTTY
and enter either
hawklogin.cf.ac.uk in the hostname box.
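The username and hostname rules above can be captured in a small shell helper. This is a minimal sketch, not an official Supercomputing Wales tool: the function names scw_username and scw_ssh_config_entry, the Host alias hawk and the example ID jbloggs are all hypothetical. It uses only the prefixes and the Hawk hostname that appear in the text, so sites whose prefix is not listed are rejected rather than guessed.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not part of any SCW tooling.

# Build an SCW username from a site name and an institutional ID,
# using only the prefixes listed in the lesson text.
scw_username() {
    local site="$1" id="$2"
    case "$site" in
        bangor)  printf 'b.%s\n' "$id" ;;
        cardiff) printf 'c.%s\n' "$id" ;;
        swansea) printf 's.%s\n' "$id" ;;
        *)
            echo "no prefix known for site: $site" >&2
            return 1
            ;;
    esac
}

# Print an OpenSSH client config entry for the Cardiff Hawk login node.
# "hawk" is an assumed alias; append the output to ~/.ssh/config to use it.
scw_ssh_config_entry() {
    local user="$1"
    printf 'Host hawk\n'
    printf '    HostName hawklogin.cf.ac.uk\n'
    printf '    User %s\n' "$user"
}

# Example with a made-up institutional ID.
scw_ssh_config_entry "$(scw_username cardiff jbloggs)"
```

With an entry like this in ~/.ssh/config, typing ssh hawk is equivalent to typing ssh c.jbloggs@hawklogin.cf.ac.uk.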
The figures below may still change, and some may have been sourced from out-of-date documents.
|Partition|Number of Nodes|Cores per node|RAM|Other|
|---|---|---|---|---|
|Swansea GPU|4|40|382GB|2x Nvidia V100 (5120 core, 16GB RAM)|
|Swansea Data Lake|1|72|1500GB|Installed with Swansea system, and intended for e.g. Hadoop and Elastic Stack users. Not integrated with the main Sunbird system; contact Support or your RSE team for access details.|
|Cluster|Number of Nodes|Cores per node|RAM|Other|
|---|---|---|---|---|
|Cardiff Compute AMD|64|64|256GB|AMD EPYC CPUs, not fully operational|
|Cardiff High Memory|26|40|382GB| |
|Cardiff GPU|13|40|382GB|2x Nvidia P100 (3584 core, 16GB RAM)|
|Cardiff Data Lake|2|?|?|Will be installed later. Intended for Hadoop and Elastic Stack users.|
Aberystwyth and Swansea users are expected to use the Swansea system and will need to make a case for why they would need to use the Cardiff system. Bangor and Cardiff users are expected to use Cardiff, and external users are expected to use the same system as the owner of the project of which they are a member.
Slurm is the management software used on Supercomputing Wales. It lets you submit (and monitor or cancel) jobs to the cluster and chooses where to run them.
Other clusters might run different job management software such as LSF, Sun Grid Engine or Condor, although they all operate along similar principles.
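To make the submit, monitor and cancel cycle concrete, here is a minimal sketch of a batch job. The helper name write_hello_job, the file name hello_job.sh, the job name and the resource values are illustrative assumptions. The development partition is the one shown in the sinfo output in this lesson. Any further options your project may need, such as an account code, are not covered by the text and are left out here.

```shell
#!/usr/bin/env bash
# Write a minimal Slurm batch script to the given path.
# File name, job name and resource values are illustrative assumptions.
write_hello_job() {
    local path="$1"
    cat > "$path" <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --partition=development
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

echo "Hello from $(hostname)"
EOF
}

write_hello_job hello_job.sh

# On a login node you would then run (not executed here):
#   sbatch hello_job.sh      # submit the job; Slurm prints its job ID
#   squeue -u "$USER"        # monitor your queued and running jobs
#   scancel <jobid>          # cancel a job using the ID from sbatch
```

The #SBATCH lines are read by Slurm as options when the script is submitted; to bash they are ordinary comments.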
How busy is the cluster?
The sinfo command tells us the state of the cluster: which nodes are available, how busy they are and what state they are in.
Clusters are sometimes divided up into partitions. This might separate some nodes which are different to the others (e.g. they have more memory, GPUs or different processors).
$ sinfo
PARTITION    AVAIL  TIMELIMIT   NODES  STATE   NODELIST
compute*     up     3-00:00:00  1      fail    scs0042
compute*     up     3-00:00:00  1      drain*  scs0004
compute*     up     3-00:00:00  2      mix     scs[0018,0065]
compute*     up     3-00:00:00  86     alloc   scs[0001-0003,0005-0017,0019-0035,0043-0046,0049-0064,0066-0072,0097-0122]
compute*     up     3-00:00:00  32     idle    scs[0036-0041,0047-0048,0073-0096]
development  up     30:00       1      fail    scs0042
development  up     30:00       1      drain*  scs0004
development  up     30:00       2      mix     scs[0018,0065]
development  up     30:00       86     alloc   scs[0001-0003,0005-0017,0019-0035,0043-0046,0049-0064,0066-0072,0097-0122]
development  up     30:00       32     idle    scs[0036-0041,0047-0048,0073-0096]
gpu          up     2-00:00:00  4      idle    scs[2001-2004]
- The * after compute means that this is the default partition.
- AVAIL tells us if the partition is available.
- TIMELIMIT is the maximum run time for jobs in the partition, given as days-hours:minutes:seconds.
- NODES is the number of nodes in this partition in this particular state.
- STATE describes what these nodes are doing:
  - drain means that the node will become unavailable once the jobs currently running on it complete.
  - down means that the node is powered off or otherwise unavailable for use.
  - alloc means that the node is fully allocated; all CPU cores are in use running users’ software.
  - mix means that some of the CPU cores on a node are allocated to a user, and others are available for use.
  - idle means that the node is not currently allocated, and is available for use.
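The column descriptions above are enough to summarise sinfo output with a short script. The following is a minimal sketch that totals the NODES column per STATE for one partition. The function name nodes_by_state is hypothetical, and it assumes the default six-column layout shown in this lesson. The sample text is copied from the sinfo output above so that the example runs without access to the cluster.

```shell
#!/usr/bin/env bash
# Count nodes per state for one partition, reading sinfo-style text on stdin.
# Assumes the default column order:
# PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
nodes_by_state() {
    local partition="$1"
    awk -v part="$partition" '
        NR == 1 { next }                 # skip the header line
        {
            name = $1
            sub(/\*$/, "", name)         # drop the default-partition marker
            if (name == part) count[$5] += $4
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Sample rows copied from the sinfo output in this lesson.
sample='PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
compute* up 3-00:00:00 1 fail scs0042
compute* up 3-00:00:00 1 drain* scs0004
compute* up 3-00:00:00 2 mix scs[0018,0065]
compute* up 3-00:00:00 86 alloc scs[0001-0003,0005-0017,0019-0035,0043-0046,0049-0064,0066-0072,0097-0122]
compute* up 3-00:00:00 32 idle scs[0036-0041,0047-0048,0073-0096]
gpu up 2-00:00:00 4 idle scs[2001-2004]'

printf '%s\n' "$sample" | nodes_by_state compute
```

On the cluster itself the same function can read live output, for example sinfo | nodes_by_state compute.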
Logging into Supercomputing Wales
If you haven’t already:
- In your web browser go to My Supercomputing Wales and log in with your university username and password.
- Click on “Reset SCW Password” and choose a new password for logging into the HPC. Your username is displayed in the “Account summary” box on the main page. It’s usually s. followed by your normal university username.
- Log in to hawklogin.cf.ac.uk using your SSH client.
- Run the sinfo command to see how busy things are.
- Try running sinfo --long. What extra information does this give?
Key Points
- Use ssh hawklogin.cf.ac.uk to log in to the system.
- sinfo shows partitions and how busy they are.
- slurmtop shows another view of how busy the system is.