Difference between revisions of "HowTo:adf"
Revision as of 16:29, 26 May 2016
ADF stands for "Amsterdam Density Functional" and denotes a package of programs that uses Density Functional Theory (DFT) for electronic and molecular structure calculations. The package is geared towards chemists and physicists with an interest in the structure of molecules and solids.
The ADF package consists of two main components:
Unlike most other molecular/solid/electronic structure codes, ADF employs "Slater-type" basis sets, i.e., functions with exponential behaviour, which are more suitable for the description of chemical systems than the more commonly employed "Gaussian-type" ones. The downside is that they cause computational difficulties, which can be circumvented by numerical integration. Since DFT depends largely on numerical integration anyhow, the "Slater approach" is particularly well suited for DFT code.
ADF is arguably the best DFT code available at this time for transition metal compounds and solids.
Location of the program and setup
The present version of ADF is 2016.101. The programs in the ADF package reside in /opt/adf. To use ADF on our machines, it is required that you read our licensing agreement and sign a statement. You will then be made a member of a Unix group adf, which enables you to run the software.
ADF requires the sourcing of a setup script to function properly:
This script sets environment variables that are necessary for proper program execution and are used by the system to find executables and data files such as basis sets. Among others, SCMLICENSE is used by the license manager of the program to find a machine-specific license file.
The above settings are best applied through a call to usepackage on our system. Issuing the command
One of the settings is the environment variable SCM_TMPDIR, which is required to redirect the temporary files that ADF uses to the proper scratch space, presently
where hpcXXXX stands for your username. If for some reason ADF does not terminate normally (e.g. a job gets cancelled), it leaves behind large scratch files which you may have to delete manually. To check if such files exist, type
ls -lt /scratch/hpcXXXX
Usually the scratch files are in sub-directories that start with kid_. Once you have determined that the scratch files are no longer needed (because the program that used them is not running any more), you can delete them by typing
rm -r /scratch/hpcXXXX/kid_*
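The manual check-and-delete steps above could be wrapped in two small shell functions. This is only a sketch: the function names list_kid_dirs and clean_kid_dirs are hypothetical and not part of ADF or of our system, and the sketch does not check whether an ADF job is still running, so that remains your responsibility.

```shell
# Sketch only: both function names are hypothetical helpers, not ADF tools.

# Print the kid_* sub-directories found directly inside a scratch directory.
list_kid_dirs() {
    local scratch_dir="$1"    # for example /scratch/hpcXXXX
    find "$scratch_dir" -maxdepth 1 -type d -name 'kid_*' | sort
}

# Remove the kid_* sub-directories of a scratch directory.
# Refuses to run on an empty or missing path, so a mistyped argument
# cannot turn into a deletion somewhere else.
clean_kid_dirs() {
    local scratch_dir="$1"
    if [ -z "$scratch_dir" ] || [ ! -d "$scratch_dir" ]; then
        echo "no such directory: $scratch_dir" >&2
        return 1
    fi
    # Does NOT verify that no ADF job is still using the files.
    find "$scratch_dir" -maxdepth 1 -type d -name 'kid_*' -exec rm -r {} +
}
```

After sourcing these definitions, calling list_kid_dirs /scratch/hpcXXXX shows what would be affected, and clean_kid_dirs /scratch/hpcXXXX removes it.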
Running ADF from a command line
Once program usage is set up through the "use" command, the program(s) can be run:
adf <in >out
title benzene BP/SZ bondorders tol=0.05

define
  cc=1.38476576
  ccc=120.0
  dih=0.0
  hc=1.07212846
  hcc= 120.0
  dih2=180.0
end

atoms Z-matrix
  C 0 0 0
  C 1 0 0 cc
  C 2 1 0 cc ccc
  C 3 2 1 cc ccc dih
  C 4 3 2 cc ccc dih
  C 5 4 3 cc ccc dih
  H 2 1 3 hc hcc dih2
  H 3 2 4 hc hcc dih2
  H 4 3 5 hc hcc dih2
  H 5 4 3 hc hcc dih2
  H 6 5 4 hc hcc dih2
  H 1 2 3 hc hcc dih2
end

basis
  Type SZ
  Core None
end

symmetry NOSYM

xc
  gga becke perdew
end

bondorder tol=0.05 printall

noprint sfo
The input consists of several units, separated by blank lines, each starting with a keyword and ending with the statement END. For instance, the atoms in a molecule may be specified by issuing the keyword atoms, followed by one line per atom giving the atom name and its "Z-matrix" relative coordinates, and closing with end (case insensitive).
Note: It is absolutely essential to have a good idea about the size and complexity of your calculations before you start an ADF job. Many of the methods have terrible scaling properties, i.e. the computational cost grows very quickly with the number of electrons, degrees of freedom, or number of basis functions used. We suggest you start with a small basis set and a cheap method, and then slowly increase those parameters.
Submitting (parallel) ADF jobs
In most cases, you will run ADF in batch mode.
Production jobs are submitted to our systems via the Grid Engine, which is load-balancing software. For details, read our Grid Engine FAQ. For an ADF batch job, this means that rather than issuing the above commands directly, you wrap them into a Grid Engine batch script. Here is an example of such a batch script:
#! /bin/bash
#$ -S /bin/bash
#$ -V
#$ -cwd
#$ -M MyEmailAdress@whatever.com
#$ -m be
#$ -o STD.out
#$ -e STD.err
#$ -pe shm.pe 12
adf -n $NSLOTS <sample.adf >sample.log
This script needs to be adapted by replacing the relevant items, such as the email address and the file names. Through the -V directive it passes your current environment variables on to the job (make sure you issued a "use adf" statement before submitting), and then starts the program. The lines in the script that start with #$ are interpreted by the Grid Engine load-balancing software as directives for the execution of the program.
For instance the line "#$ -m be" tells the Grid Engine to notify the user via email when the job has started and when it is finished, while the line beginning with "#$ -M" tells the Grid Engine about the email address of the user.
The -o and -e lines determine where the standard output and the standard error are redirected. Since the job is going to be executed in batch, no terminal is available as a default for these.
The ADF package is able to execute on several processors simultaneously in a distributed-memory fashion. This means that some tasks, such as the calculation of a large number of matrix elements or numerical integrations, may be done in a fraction of the time they take on a single CPU. For this, the processors on the cluster need to be able to communicate. To this end ADF uses MPI (Message Passing Interface), a well-established communication system.
Because ADF uses a specific version of the parallel system MPI (ClusterTools 7), executing the use adf command will also cause the system to "switch" to that version, which might have an impact on jobs that you are running from the same shell later. To undo this effect, you need to type use ct8 when you are finished using ADF and want to return to the production version of MPI (ClusterTools 8).
ADF parallel jobs that are to be submitted to Grid Engine will use the MPI parallel environment and queues already defined for the user.
Our sample script contains a line that determines the number of parallel processes to be used by ADF. The Grid Engine will start the MPI parallel environment (PE) with a given number of slots that you specify by modifying that line:
#$ -pe shm.pe <number of processes>
where number of processes must be replaced by an actual number (for instance, 12 in our example above). Grid Engine then sets the environment variable NSLOTS to that value, and NSLOTS is used in the "adf" line of the sample script. This way, the system allocates exactly the number of processors that the adf run uses, and no mismatch can occur.
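To make the role of NSLOTS concrete, the following sketch assembles the "adf" line of the sample script from that variable. The function name adf_command_line is hypothetical, and the fallback to a single process when NSLOTS is unset (that is, outside a Grid Engine job) is an assumption made for illustration, not documented ADF or Grid Engine behaviour.

```shell
# Sketch only: adf_command_line is a hypothetical helper.
# It prints the command that the batch script would run, without running it.
adf_command_line() {
    local input="$1"     # ADF input file, e.g. sample.adf
    local output="$2"    # log file, e.g. sample.log
    # NSLOTS is set by Grid Engine inside a batch job.
    # Assumption for this sketch: use 1 process if it is not set.
    local nproc="${NSLOTS:-1}"
    echo "adf -n $nproc <$input >$output"
}
```

Inside a job submitted with "#$ -pe shm.pe 12", the function prints the command with -n 12, so the process count requested from Grid Engine and the one passed to adf stay in step.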
Once properly modified, the script (let's call it "adf.sh") can be submitted to the Grid Engine by typing
The advantage of submitting jobs via load-balancing software is that the software automatically finds the required resources and puts the job onto a node that has a low load. This helps the job execute faster. Note that the usage of Grid Engine for all production jobs on HPCVL clusters is mandatory. Production jobs that are submitted outside of the load-balancing software will be terminated by the system administrator.
Luckily, there is an easier way to do all this: we supply a small Perl script that can be called directly and asks a few basic questions, such as the name of the job to be submitted and the number of processes to be used. Simply type
and answer the questions. The script expects an ADF input file with the file extension .adf to be present and does everything else automatically. It is meant for simple ADF job submissions; more complex submissions are better done manually.
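For readers who prefer to see what such a helper has to do, here is a minimal sketch in bash. It is not the Perl script mentioned above: the function name make_adf_batch_script and the naming of the output files after the job are assumptions, and the email directives of the sample script are left out for brevity.

```shell
# Sketch only: make_adf_batch_script is a hypothetical helper, not the
# site's Perl script. It writes a Grid Engine batch script "<job>.sh"
# using the same directives as the sample script.
make_adf_batch_script() {
    local job="$1"       # job name; an input file "<job>.adf" must exist
    local nprocs="$2"    # number of parallel processes (slots)
    if [ ! -f "$job.adf" ]; then
        echo "input file $job.adf not found" >&2
        return 1
    fi
    # \$ keeps a literal dollar sign in the generated script.
    cat > "$job.sh" <<EOF
#! /bin/bash
#\$ -S /bin/bash
#\$ -V
#\$ -cwd
#\$ -o $job.out
#\$ -e $job.err
#\$ -pe shm.pe $nprocs
adf -n \$NSLOTS <$job.adf >$job.log
EOF
}
```

For example, with benzene.adf in the working directory, make_adf_batch_script benzene 12 writes benzene.sh, which can then be submitted in the same way as a hand-written script.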
We require users of ADF to sign a statement in which they state that they are informed about the terms of the license before they are included in the Unix group named "adf". Please fax the completed statement to (613) 533-2015 or scan and email it to firstname.lastname@example.org.