This is a quick introduction to the usage of the screening software PyRx that is installed on our clusters. It is meant as an initial pointer to more detailed information. It also explains a few specific details about local usage.
What is PyRx?
PyRx is a Virtual Screening software for Computational Drug Discovery that can be used to screen libraries of compounds against potential drug targets. It is a GUI built on a large body of established open-source software, including AutoDock Vina and AutoDock 4, both of which come up later on this page.
Version, Location and Access
The installed versions of the program are 0.9.8 and 0.9.4 (somewhat modified), and both are available on the Linux platform as 64-bit builds. The relevant executables are in /global/software/PyRx/0.9.8 and /global/software/PyRx/0.9.4. Documentation can be found at the main PyRx site.
You can run PyRx only on the CAC login nodes. From there, the setup for PyRx is very simple. You only need to type:
module load PyRx/098
This will enter the proper directory into your PATH and off you go.
Issuing the command
will pop up the GUI. All operations are performed from within that interface. At a minimum, you will have to specify a macromolecule and at least one compound that you want to "dock". These molecules can be specified in several formats, such as pdb, pdbq, cif, and mol2. You can import or load molecules from the File menu:
File -> Load Molecule
File -> Import ...
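Before importing a large batch of files, it can save time to check which ones have a file extension matching the formats listed above. The following is only a minimal sketch: the function names are made up for illustration, the extension list covers just the four formats named on this page (PyRx may accept others), and the check looks at lower-case file names only, not at file contents.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file name ends in one of the
# extensions named on this page (pdb, pdbq, cif, mol2).
# Only the name is checked, not the contents, and the match is case-sensitive.
is_supported_molecule() {
    case "$1" in
        *.pdb|*.pdbq|*.cif|*.mol2) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print every regular file in a directory
# (default: current directory) whose extension is not in the list above.
list_unsupported() {
    dir="${1:-.}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        is_supported_molecule "$f" || printf '%s\n' "$f"
    done
}
```

Calling list_unsupported on a directory of ligands prints the files that would probably need converting before they can be loaded.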
The actual analysis is performed using the various tabs of the GUI. As an example, we outline the steps using the "Vina Wizard", which runs a program called "Autodock Vina" for the analysis:
Vina Wizard -> Start Here -> (select /global/software/pyrx/0.9.4/bin/vina) -> Start (highlight Ligands and Macromolecule(s)) -> Forward (adjust values for Search Space) -> Forward (check results in bottom window)
There's of course a lot more to it. But the authors of the software claim that it is intuitive enough that you can figure anything out while doing it. Your mileage may vary.
Production runs (cluster mode)
NOTE: This section is obsolete and needs a complete overhaul.
If you are screening hundreds (or even thousands) of molecules using PyRx, the time required may be too long for interactive usage. PyRx offers a basic interface to a scheduler, but the default settings are too non-specific to work with our systems.
For the Vina Wizard, we have provided a work-around that allows you to work through a large number of runs using the machines on the SW cluster in parallel. Before you try this, please read through our Slurm wiki page to learn how jobs are submitted to our production clusters.
The procedure for this starts off the same as for the interactive approach:
Vina Wizard -> Start Here -> (select Cluster(Portable Batch System)) -> Start (highlight Ligands and Macromolecule(s)) -> Forward (adjust values for Search Space) -> Forward
However, because the "Cluster" setting was selected in this case, the program does not actually run any docking software. Instead, it generates a large number of scripts in the directory
~/.PyRx_workspace/Macromolecules/MACRO
where "MACRO" stands for the name of the macromolecule you are using, and "~" is short for the name of your home directory. To run the actual analysis on our cluster, you now need to go into that directory and execute a "perl" script that we have provided for this purpose:
cd ~/.PyRx_workspace/Macromolecules/MACRO
PyRxVinaArray.pl
This will generate two new sub-directories, "jobs" and "logs", copy the scripts mentioned earlier, then produce a job for our scheduler, "Grid Engine", and submit it. Using the qstat command, you should then see something like:
$ qstat
job-ID prior   name       user    state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
952371 0.50734 runVina.sh hpcXXXX r     11/17/2015 12:59:29 firstname.lastname@example.org 8     64
952371 0.50734 runVina.sh hpcXXXX r     11/17/2015 12:48:59 email@example.com              8     60
952371 0.50734 runVina.sh hpcXXXX r     11/17/2015 12:58:29 firstname.lastname@example.org 8     62
952371 0.50734 runVina.sh hpcXXXX r     11/17/2015 12:58:59 email@example.com              8     63
952371 0.50734 runVina.sh hpcXXXX r     11/17/2015 13:05:59 firstname.lastname@example.org 8     65
952371 0.50734 runVina.sh hpcXXXX r     11/17/2015 13:09:29 email@example.com              8     66
952371 0.50734 runVina.sh hpcXXXX qw    11/17/2015 09:30:03                                8     67-511:1
As you can see, it's working on 6 "Vina" jobs simultaneously, with 8 processors each for a total of 48.
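The same count can be read off automatically rather than by eye. The sketch below is a hypothetical helper, not part of the scripts we provide. It assumes the qstat column layout shown in the listing above: the state is the fifth whitespace-separated field, and for running tasks the slot count is the ninth. Other Grid Engine configurations, or job names containing spaces, would break this assumption.

```shell
#!/bin/sh
# Hypothetical helper: read qstat output on standard input and report
# how many array tasks are in state "r" and how many slots they use.
# Assumes field 5 is the state and field 9 is the slot count for running
# tasks, as in the listing above. Header, separator, and "qw" lines are
# skipped because their fifth field is not "r".
summarize_running() {
    awk '$5 == "r" { tasks++; slots += $9 }
         END { printf "%d running tasks, %d slots\n", tasks, slots }'
}
```

It would be used as "qstat | summarize_running".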
Once the "qstat" command no longer shows anything, the analyses are finished, and you can go back to your PyRx GUI:
-> Forward (check results in bottom window)
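Instead of re-running qstat by hand, the wait for an empty listing can be scripted. This is a minimal sketch under stated assumptions: wait_for_queue is a made-up name, the polling interval is arbitrary, and the loop relies on qstat printing nothing once none of your jobs are left, as described above. If you have other unrelated jobs in the queue, the loop will wait for those too.

```shell
#!/bin/sh
# Hypothetical helper: poll a status command until it prints nothing.
#   $1  seconds to sleep between polls (default: 60)
#   $2  status command to run (default: qstat)
wait_for_queue() {
    interval="${1:-60}"
    status_cmd="${2:-qstat}"
    # $status_cmd is left unquoted on purpose so that a command with
    # arguments, passed as one string, is split into words.
    while [ -n "$($status_cmd)" ]; do
        sleep "$interval"
    done
}
```

For example, "wait_for_queue 120" checks every two minutes and returns once the listing is empty, at which point the results can be inspected in the GUI.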
Note that this works only for the analysis with Vina. If you want to do something similar with a different analysis (for instance Autodock4), please get in touch with us. We can probably come up with a solution.