The SW (Linux) Cluster
The Centre for Advanced Computing operates a cluster of x86-based multicore machines running Linux. This page explains the essential features of this cluster and is meant as a basic guide to its usage.
Type of Hardware
This cluster consists of x86 multicore nodes made by Dell and IBM (both based on Intel Xeon X5670 or E7-4860 processors). All nodes run CentOS Linux and share a file system. Access is handled by Grid Engine. The server nodes are called sw0011...sw0054.
Why these Systems?
The main emphasis of these systems is high floating-point performance for a modest number of processes or threads. Since commercial software such as Fluent and Abaqus offers support for Linux only, this cluster was originally acquired to offer recent versions of these software packages. In addition, the higher single-core performance of these nodes allows for an efficient use of license seats, which are usually priced per core.
Who Should Use This Cluster?
The software cluster runs the Linux operating system and should be used by anyone who wants to run applications that are available on that platform. Runs that require more than 32 GB of memory need to request this explicitly to avoid mis-scheduling.
We suggest you use this cluster if:
This cluster might not be suitable if
If you think your application could run more efficiently on these machines, please contact us (email@example.com) to discuss any concerns and let us assist you in getting started.
Note that we have to enforce dedicated cores or CPUs to avoid sharing and context switching overheads. No "overloading" can be allowed.
Using the Cluster
[Hartmut.Hartmut-HP] → ssh firstname.lastname@example.org
|-----------------------------------------------------------------|
| This system is for the use of authorized users only.            |
| Individuals using this computer system without authority, or in |
| excess of their authority, are subject to having all of their   |
| activities on this system monitored and recorded by system      |
| personnel.                                                      |
|                                                                 |
| In the course of monitoring individuals improperly using this   |
| system, or in the course of system maintenance, the activities  |
| of authorized users may also be monitored.                      |
|                                                                 |
| Anyone using this system expressly consents to such monitoring  |
| and is advised that if such monitoring reveals possible         |
| evidence of criminal activity, system personnel may provide the |
| evidence of such monitoring to law enforcement officials.       |
|-----------------------------------------------------------------|
Last login: Thu May 12 16:07:11 2016 from officefw
This is sflogin0, a dedicated login host on the HPCVL grid
SunFire E2900, 24 x 1.8 GHz UltraSPARC-IV+, 192 GB RAM
------------------------------------------------------------------
* Please contact HPCVL User Support with questions about usage
  http://www.hpcvl.org/contact-us Email email@example.com
* For system status updates, see https://www.hpcvl.org/protectedpages/users-section
* Available packages can be listed with "use -l"
hasch@sflogin0$
The file systems for all of our clusters are shared, so you will be using the same home directory as when you are using the M9000 servers or the standard login node sfnode0. swlogin1 can be used for compilation, program development, and testing only, not for production jobs.
Intel Compiler Suite
The recommended compiler is the Intel Compiler Suite. It includes compilers for Fortran, C, and C++, as well as MPI and OpenMP support, debuggers, and a development suite. This software resides in /opt/ics. The versions are:
This compiler suite needs to be activated before use. The command is
In many cases, especially for public domain software, the preferable compilers are the GNU C/C++/Fortran compilers. The system version of these is:
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-188.8.131.52/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC)
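As a quick check of which GNU compiler a shell session picks up, a small helper along the following lines may be useful. This is only a sketch: the function name show_compiler is made up for illustration and is not something installed on the cluster.

```shell
# Report which compiler binary is first on PATH and print its version line.
# show_compiler is a hypothetical helper name, not a system command.
show_compiler() {
    local cc="${1:-gcc}"
    if command -v "$cc" >/dev/null 2>&1; then
        echo "$cc found at: $(command -v "$cc")"
        # The first line of --version output names the compiler release.
        "$cc" --version | head -n 1
    else
        echo "$cc not found on PATH"
    fi
}

show_compiler gcc
show_compiler gfortran
```

Running the helper before and after switching to a different compiler set shows whether the switch took effect in the current shell.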
No special activation is needed to use these, as they reside in a system directory. A newer version of this compiler set is available in /opt/gcc-4.8.3 and can be accessed using the command
If MPI is required, it can be loaded through
For applications that cannot be re-compiled (for instance, because the source code is not accessible), a pre-compiled Linux version needs to be obtained (an x64 build for Red Hat will do the trick).
As mentioned earlier, program runs for user and application software on the login node are allowed only for test purposes or if interactive use is unavoidable. In the latter case, please get in touch to let us know what you need. Production jobs must be submitted through the Grid Engine load scheduler.
You need to add the following two lines to your script for your job to be scheduled to the Linux SW cluster exclusively:
#$ -q abaqus.q
#$ -l qname=abaqus.q
The queue name abaqus.q derives from Abaqus, the first software package that was (and still is) run on this cluster.
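For orientation, a minimal submission script that contains the two queue lines might look like the sketch below. The job name, the output file names, and the commented-out program are placeholders chosen for illustration. The other directives (-S, -cwd, -N, -o, -e) are standard Grid Engine options rather than settings confirmed for this cluster.

```shell
#!/bin/bash
# Minimal Grid Engine job script sketch for the Linux SW cluster.
# Run the job under bash and start in the submission directory.
#$ -S /bin/bash
#$ -cwd
# Job name and output files (hypothetical names).
#$ -N sw_test
#$ -o sw_test.out
#$ -e sw_test.err
# The two lines that restrict the job to the Linux SW cluster.
#$ -q abaqus.q
#$ -l qname=abaqus.q

# NSLOTS is set by Grid Engine for parallel jobs; default to 1 elsewhere.
slots="${NSLOTS:-1}"
echo "Running on $(uname -n) with ${slots} slot(s)"

# Replace the placeholder below with the actual program to run.
# ./my_program
```

Assuming the script is saved as sw_test.sh, it would be submitted with "qsub sw_test.sh". Because the directive lines start with "#", the same file also runs as an ordinary shell script for a quick syntax check.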
Note that your jobs will run on dedicated threads, i.e. typically up to 12 processes can be scheduled to a single node. The Grid Engine will do the scheduling, i.e. there is no way for the user to determine which processes run on which cores.
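To get a feeling for how many nodes a job will occupy at the very least, the process count can be divided by the per-node limit and rounded up. The sketch below assumes the typical figure of 12 processes per node mentioned above. The function name nodes_needed is hypothetical, and the actual placement remains up to Grid Engine.

```shell
# Estimate the minimum number of nodes a job spans, assuming at most
# 12 processes per node unless a different limit is given.
# nodes_needed is a hypothetical helper, not a Grid Engine command.
nodes_needed() {
    local procs="$1"
    local per_node="${2:-12}"
    # Integer ceiling division: (a + b - 1) / b
    echo $(( (procs + per_node - 1) / per_node ))
}

nodes_needed 12    # prints 1
nodes_needed 30    # prints 3
```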
General information about using HPCVL facilities can be found in our FAQ pages. We also supply user support (please send email to firstname.lastname@example.org or contact us directly), so if you experience problems, we can assist you.