How to Run the PredictProtein Machine Image on OpenNebula

From Rost Lab Open


This procedure assumes that:

  1. you have an OpenNebula account, e.g. at
  2. you have at least 30 GB of temporary storage space to download the machine and data images
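
The storage assumption can be checked before downloading. The following is a minimal sketch, not part of the original procedure: the function name has_free_space and the DOWNLOAD_DIR variable are hypothetical, and the check treats 1 GB as 1024 * 1024 KiB blocks as reported by df.

```shell
#!/bin/sh
# Sketch: check that the download directory has enough free space.
# has_free_space and DOWNLOAD_DIR are illustrative names, not part of PredictProtein.

# Succeeds when directory $1 has at least $2 GB available.
has_free_space() {
    dir=$1
    min_gb=$2
    # df -P prints one line per file system; -k reports 1 KiB blocks.
    # Column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

if has_free_space "${DOWNLOAD_DIR:-.}" 30; then
    echo "enough space for the machine and data images"
else
    echo "less than 30 GB free in ${DOWNLOAD_DIR:-.}"
fi
```

Set DOWNLOAD_DIR to the directory you plan to download into; without it, the current directory is checked.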

Preparing the PredictProtein Machine Image for First Use

  1. Download the PredictProtein Machine Image to your local drive. You will need the Bio-Linux-based version in KVM QEMU (Generic Raw Format) format.
    • Choose the Debian-based version if you do not need a graphical user interface. This image is only half the size of the Bio-Linux-based image.
  2. Download a database image to your local drive.
    • The 'raw' format and the tarball are almost the same. You can mount the raw format and unpack it to a formatted drive or you can upload the tarball and then unpack it from there.
  3. Log in to your OpenNebula management interface (e.g. at LRZ)
  4. Upload the main PP image to the OpenNebula cloud
    1. go to Virtual Resources -> Images
    2. use (+) to add an image from uploaded data (Type OS)
  5. Prepare for PP data: use either of these methods:
    • Upload PP raw data image to OpenNebula (test) cloud
      1. go to Virtual Resources -> Images
      2. use (+) to add an image from uploaded data (Type DATABLOCK)
    • Give the template you make next a larger storage disk and copy the tarball onto that disk.
  6. Create a template for the virtual machine
    1. go to Virtual Resources -> Templates
    2. Create a new template (give it a name) with these properties:
      • on General: Don't use more virtual CPUs than real ones
      • on Storage: Choose the main PP image created/uploaded above, specify 'vda' as a 'target'
      • on Storage: add a second disk, choose volatile and give it 100 GB of space; you will unpack the PP data to that disk
      • on Storage: add a third disk, choose the empty data image you created above for storing results
      • if you use the raw data image: on Storage: add another disk and choose the PP data image created/uploaded above, no need to specify a target
      • on Input/Output: specify keymap 'us'
  7. Instantiate the virtual machine
    1. go to Virtual Resources -> Virtual Machines
    2. Hit (+) to create a new virtual machine
    3. Choose your template from the list and hit "create"
    • Alternatively you can directly instantiate from Virtual Resources -> Templates -> Instantiate
  8. Configure the running machine
    1. note the IP of your instance
    2. log into your instance (with the Bio-Linux image you first have to connect via VNC and set a password for ppuser)
    3. to check what has been loaded, you can do 'cat /proc/partitions' and e.g. 'sudo fdisk -l /dev/vdc'
    4. run the following commands:
      1. sudo mkdir /mnt/resultStore
      2. sudo mount /dev/vdc /mnt/resultStore
      3. sudo mount /dev/vdb /mnt/local-storage/
      4. sudo mkdir /mnt/local-storage/rostlab-data
      5. sudo chown -R ppuser:ppuser /mnt/local-storage/rostlab-data/
      6. cd /mnt/local-storage/rostlab-data
        • tarball version:
          1. scp user@host:/path/rostlab-data.txz .
          2. tar -xvJf rostlab-data.txz
        • raw image version:
          1. tar -xvJf /dev/vdd (assuming that the raw data is on /dev/vdd)
      7. sudo mount --bind /mnt/local-storage/rostlab-data /usr/share/rostlab-data
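
The configuration commands in step 8 can be collected into one script. This is a sketch under assumptions, not an official PredictProtein tool: the function names, the DRY_RUN switch and the RESULT_DEV, DATA_DEV and DATA_SOURCE variables are hypothetical. The device names are the ones used in the steps above and may differ on your instance, so check 'cat /proc/partitions' first. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: mount the result and data disks and unpack the PP data.
# All names below are illustrative; devices follow the steps above.

RESULT_DEV=${RESULT_DEV:-/dev/vdc}    # disk for storing results
DATA_DEV=${DATA_DEV:-/dev/vdb}        # volatile disk that receives the PP data
DATA_SOURCE=${DATA_SOURCE:-/dev/vdd}  # raw data image, or path to rostlab-data.txz
DATA_DIR=/mnt/local-storage/rostlab-data

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mount both disks and create the data directory owned by ppuser.
prepare_mounts() {
    run sudo mkdir -p /mnt/resultStore
    run sudo mount "$RESULT_DEV" /mnt/resultStore
    run sudo mount "$DATA_DEV" /mnt/local-storage/
    run sudo mkdir -p "$DATA_DIR"
    run sudo chown -R ppuser:ppuser "$DATA_DIR"
}

# Unpack the xz-compressed tar archive (tarball or raw device) into DATA_DIR.
unpack_data() {
    run tar -xJf "$1" -C "$DATA_DIR"
}

# Make the data visible under /usr/share/rostlab-data.
bind_data() {
    run sudo mount --bind "$DATA_DIR" /usr/share/rostlab-data
}

prepare_mounts
unpack_data "$DATA_SOURCE"
bind_data
```

For the tarball version, copy rostlab-data.txz to the instance first and point DATA_SOURCE at that file. Reviewing the dry-run output before running with DRY_RUN=0 helps catch a device that was attached under a different name.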

Using the PredictProtein Machine Image

  1. Open a terminal and run 'man ppmi' to get started. You will find usage examples in the manual pages referenced there, e.g.:
    • man predictprotein
    • man snapfun
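
To confirm that the manual pages are installed on your instance before reading them, a short check such as the following may help. It is a sketch only: check_man_pages is a hypothetical helper name, and it relies on 'man -w', which prints the location of a page and fails if the page cannot be found.

```shell
#!/bin/sh
# Sketch: report which of the given manual pages are installed.
# check_man_pages is an illustrative name, not a PredictProtein command.
check_man_pages() {
    for page in "$@"; do
        # man -w exits non-zero when no page with that name exists.
        if man -w "$page" >/dev/null 2>&1; then
            echo "$page: available"
        else
            echo "$page: not found"
        fi
    done
}

check_man_pages ppmi predictprotein snapfun
```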