PredictProtein Machine Image

Revision as of 12:33, 27 July 2015 by Andrea (talk | contribs) (How do I install anything else on the machine?)


The PredictProtein Machine Image (PPMI) is a self-contained solution for protein feature prediction. The image contains everything you need to get started: a fully functional Debian system and a wide selection of prediction methods. Supporting databases are downloaded separately. The image may be used for small-scale analysis on a single machine (real or virtual) or for large-scale, high-throughput analysis on an arbitrary number of server instances in the cloud. You may extend the image with packages (such as gridengine-client) and tools as needed.

Most prediction methods on the image are integrated into predictprotein(1). Please refer to the manual page of predictprotein (man predictprotein) for a list of the included methods.

The following methods are provided in addition:

  • snapfun(1) - a method for evaluating effects of single amino acid substitutions on protein function


The PredictProtein Machine Image can be downloaded from the PredictProtein website.

How to Run the PredictProtein Machine Image

Please choose the help page matching your virtualization solution.

In general, you have to:

  1. Download a PredictProtein Machine Image (PPMI).
  2. Prepare the databases. We recommend that you:
    1. Download a database image.
    2. Provide a drive or volume with at least 20 GB of free space.
    3. Uncompress the database archive from the rostlab-data image onto that drive or volume.
    4. Make the 'data' directory of the archive available (e.g. by bind-mounting it) at /usr/share/rostlab-data/data.
  3. Boot the machine image and type man ppmi in a terminal. Also read the manual pages of predictprotein and snapfun (man predictprotein, man snapfun) and run the examples therein.
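The last database step above (making the 'data' directory available) might look roughly like the following sketch. DATA_VOLUME is a hypothetical mount point for the drive holding the uncompressed archive, and both helper functions are illustrative names, not tools shipped on the image. The script only prints the bind-mount command, so nothing is changed until you run that command yourself.

```shell
#!/bin/sh
# Sketch only. DATA_VOLUME is a hypothetical location of the uncompressed
# database archive; adjust it to your own setup.
DATA_VOLUME="${DATA_VOLUME:-/mnt/ppdata}"
TARGET=/usr/share/rostlab-data/data   # target path named in the steps above

# Print the bind-mount command instead of running it.
bind_mount_cmd() {
    printf 'sudo mount --bind %s %s\n' "$1/data" "$2"
}

# Succeeds if directory $1 exists and contains at least one entry.
is_populated() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

bind_mount_cmd "$DATA_VOLUME" "$TARGET"

if is_populated "$TARGET"; then
    echo "databases visible at $TARGET"
else
    echo "$TARGET is empty or missing; mount the data directory first"
fi
```

Running the script again after the bind mount is in place is a quick way to confirm that the databases are visible at the expected path.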


Here is a quick example showing how to run PredictProtein from the image:

 predictprotein --seqfile /usr/share/predictprotein/example/tquick.fasta --output-dir /tmp/$USER/pp

The above command will write the output to /tmp/$USER/pp.
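For repeated runs, the command above can be wrapped in a small helper. This is a minimal sketch: run_pp is a hypothetical function name, and it relies only on the two options shown in the example (--seqfile and --output-dir). It checks that the input is readable and that predictprotein is on the PATH before starting, then lists the output directory.

```shell
#!/bin/sh
# Sketch only: run_pp is a hypothetical helper, not a tool shipped on the image.
# It passes on only the two options used in the example above.
run_pp() {
    seqfile=$1
    outdir=$2
    if [ ! -r "$seqfile" ]; then
        echo "cannot read sequence file: $seqfile" >&2
        return 1
    fi
    if ! command -v predictprotein >/dev/null 2>&1; then
        echo "predictprotein not found; run this on the booted image" >&2
        return 127
    fi
    predictprotein --seqfile "$seqfile" --output-dir "$outdir" || return $?
    ls -l "$outdir"   # show what was written
}

# Example call, mirroring the command above:
# run_pp /usr/share/predictprotein/example/tquick.fasta "/tmp/$USER/pp"
```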

Find out more: run man predictprotein and man snapfun in a terminal.

Output format

Each Rost Lab method gives a brief description of its output format in its manual page. For further details, please refer to the references and to the source or script of the tool.


References

  • Kajan, L., Yachdav, G., Steinegger, M., Mirdita, M., Vicedo, E. and Rost, B. (2012). PredictProtein Machine Image Server for Virtualized and Cloud Computing. NAR Web Server Issue (submitted), XXX(X) XXX-X.

When PPMI is based on Bio-Linux:

  • Field, D., Tiwari, B., Booth, T., Houten, S., Swan, D., Bertrand, N. and Thurston, M. (2006). Open software for biologists: from famine to feast. Nature Biotechnology 24, 801-803.

If you find the PPMI and the tools it contains useful, please cite:

  • PPMI, see above
  • for the Bio-Linux-based version: Bio-Linux, see above
  • PredictProtein, see REFERENCES on the man page predictprotein(1)
  • the references for the tools you used, see REFERENCES on the man page of each tool


How do I download the PredictProtein Machine Image?

The PredictProtein Machine Image is available from the PredictProtein website. Note that registration is required.

What do I need to run the PredictProtein Machine Image?

You need:

  • At least 20 GB of free disk space
  • At least 2 GB of free memory
  • Virtualization software such as KVM/QEMU or VMware, unless you boot your machine directly into the image
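On a Linux host, the requirements above can be checked with a short script. This is a sketch under stated assumptions: IMAGE_DIR is a hypothetical directory where you intend to store the image, the helper function names are illustrative, and the presence of /dev/kvm is used only as a rough hint that KVM is usable.

```shell
#!/bin/sh
# Sketch only: checks the disk and memory figures listed above on a Linux host.
# IMAGE_DIR is a hypothetical directory where the image would be stored.
IMAGE_DIR="${IMAGE_DIR:-$HOME}"

# Free space, in kB, on the filesystem holding directory $1.
free_disk_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Memory available to new processes, in kB. MemAvailable is missing on
# older kernels; MemFree is then used as a conservative estimate.
free_mem_kb() {
    awk '/^MemAvailable:/ { a = $2 } /^MemFree:/ { f = $2 }
         END { print (a != "" ? a : f) }' /proc/meminfo
}

# Succeeds if $1 (in kB) is at least $2 GB.
at_least_gb() {
    [ "$1" -ge $(( $2 * 1024 * 1024 )) ]
}

at_least_gb "$(free_disk_kb "$IMAGE_DIR")" 20 ||
    echo "less than 20 GB free in $IMAGE_DIR"
at_least_gb "$(free_mem_kb)" 2 ||
    echo "less than 2 GB of memory available"
if [ -e /dev/kvm ]; then
    echo "KVM appears to be available"
fi
```

The script prints nothing for a requirement that is met, so an empty output (apart from the KVM hint) means both numeric checks passed.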

How do I install the PredictProtein Machine Image?

No installation is required. See the How to Run the PredictProtein Machine Image section for instructions.

What protein features are predicted by methods included on the PredictProtein Machine Image?

The list of features predicted by methods on the PredictProtein Machine Image keeps growing as we develop and refine our methods. Please open the ppmi(7) man page on the image in order to see what is included.

The list of features at present:

  • disordered regions
  • effects of amino acid substitutions
  • protein-protein interaction sites
  • sub-cellular localization
  • multiple sequence alignment
  • evolutionary profiles
  • sequence motifs
  • low-complexity regions
  • nuclear localization signals
  • regions lacking regular structure
  • unstructured loops
  • bacterial transmembrane beta barrels
  • protein secondary structure
  • solvent accessibility
  • globular regions
  • transmembrane helices
  • coiled-coil regions
  • structural switch regions
  • disulfide bonds

How do I contact the maintainer?

email assistant <at> rostlab <dot> org

Is it safe to use the PredictProtein Machine Image in a public cloud?

Using the PredictProtein Machine Image in a public cloud is as safe as using any properly made machine image. However, you must first change the password of the user account that comes with the downloaded machine image.

Please do not use the PredictProtein Machine Image in a public cloud without changing the default password.

How do I install anything else on the machine?

Your system is configured for apt-get, including the Rost Lab repository. The configuration assumes that the current version is 'stable', which is not true at the time of writing. For now, you have to:

  1. Change the sources lists:
    1. /etc/apt/sources.list
    2. /etc/apt/sources.list.d/rostlab.list
  2. Update your package list and install the Rost Lab keyring:
    • sudo apt-get update
    • sudo apt-get install rostlab-debian-keyring
    • sudo apt-get update

Then you can use apt-get install to add packages you need.
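Changing the sources lists by hand is error-prone, so the edit in step 1 could be scripted along the following lines. This is only a sketch: retarget_suite is a hypothetical helper, and SUITE is a placeholder for the release name your image actually tracks, which this page does not state. The helper assumes plain "deb URI suite components" lines without bracketed options, and it keeps a backup of each file it changes.

```shell
#!/bin/sh
# Sketch only: retarget_suite is a hypothetical helper. In file $1 it rewrites
# the suite field (third column) of deb and deb-src lines from $2 to $3,
# keeping the original as $1.bak. Lines with bracketed options such as
# "deb [arch=amd64] ..." are not handled.
retarget_suite() {
    file=$1
    old=$2
    new=$3
    cp "$file" "$file.bak" || return 1
    awk -v old="$old" -v new="$new" '
        ($1 == "deb" || $1 == "deb-src") &&
        ($3 == old || index($3, old "/") == 1) {
            # Keep any suffix such as "/updates" after the suite name.
            $3 = new substr($3, length(old) + 1)
        }
        { print }
    ' "$file.bak" > "$file"
}

# Example use, run as root; SUITE is a placeholder you must set yourself:
# retarget_suite /etc/apt/sources.list stable "$SUITE"
# retarget_suite /etc/apt/sources.list.d/rostlab.list stable "$SUITE"
```

After both files are updated, continue with the apt-get commands listed in step 2.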

Copyright and License



The most recent version of the license can be found at /usr/share/doc/ppvmi/copyright on the image itself.

Commercial users, or users to whom the above license does not apply, should contact Biosof Sales.