Hauptseminar 'Protein and Disease (WS13/14)'

Type  Seminar (2 SWS)
ECTS 4.0
Lecturer Burkhard Rost
Time Monday, 12:00 - 13:30
Room MI 01.09.034
Language English

Application / Registration

Application is organised centrally for all bioinformatics seminars. After you have been assigned to our seminar, we will distribute the topics.


Topics are related to the research interests of the group: protein sequence analysis, sequence-based predictions, protein structure prediction and analysis, and interaction networks.


The pre-meeting will be held on July 26th at 10 am in room MI 01.09.034.

The rules and hints for preparing the seminar that were discussed in the pre-meeting are also summarised in our Checklist and on these slides.

Final Schedule

Topics (Supervisor):


21.10. Iva Ivanova 1000 genomes, Neandertals, and Chimps (A. Dong)
28.10. Michael Schneider G-protein coupled receptors: a major class of drug targets (E. Kloppmann)
4.11. Panda Raharja-Liu Normal mode analysis/Elastic network models (E. Kloppmann)
11.11. Quirin Heiß Predicting subcellular localization using functional hierarchies (T. Goldberg)
18.11. Angela Hempfer Hunting disease SNPs (A. Dong)
25.11. Benjamin Ölke Robustness and evolvability of proteins (M. Hecht)
2.12. Johannes Rest Intrinsically Disordered Proteins in Human Diseases (E. Vicedo)

Jonas Rädle Protein 3D structure prediction from evolutionary sequence variation (T. Hopf)
Yann Spöri Membrane protein 3D structure and function prediction from genomic sequencing (T. Hopf)

13.1. Silvana Wolf Nuclear import and sorting of proteins (T. Goldberg)
20.1. Maria Schelling HIV mutational pathways (L. Richter)
27.1. cancelled  
3.2. Peter Kreitmaier Improved localization prediction from evolutionary profiles (T. Goldberg)

Conditional Random Fields for Named-Entity Recognition (J. Cejuela)
Influence of alignments on prediction of effects of mutations (A. Schafferhans)

Time Slots:
21.10., 28.10., 4.11., 11.11., 18.11., 25.11., 2.12., 9.12., 16.12., 13.1., 20.1., 27.1., 3.2.

Description of Topics

Influence of alignments on prediction of effects of mutations

Dr. Andrea Schafferhans

Methods predicting protein features based on sequence often rely on multiple sequence alignments. This talk shall summarise recent studies on the effect of different input alignments for predicting the effect of mutations.


  • Hicks,S. et al. (2011) Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Human Mutation, 32, 661-8.
  • Acharya,V. and Nagarajaram,H.A. (2012) Hansa: an automated method for discriminating disease and neutral human nsSNPs. Human Mutation, 33, 332-7.
  • Masica,D.L. et al. (2012) Phenotype-optimized sequence ensembles substantially improve prediction of disease-causing mutation in cystic fibrosis. Human Mutation.


G-protein coupled receptors: a major class of drug targets

Dr. Edda Kloppmann

G-protein coupled receptors (GPCRs) comprise a large and important group of integral membrane receptors that activate signalling cascades upon receiving signals from outside the cell. GPCRs are involved in numerous diseases and account for approximately 40% of all pharmaceutical drugs. This talk shall introduce the structure and function of GPCRs and their role in (computational) drug design.


  • tba


Normal mode analysis/Elastic network models

Dr. Edda Kloppmann

Several experimental techniques can provide information on the structure and dynamics of proteins. However, experimental methods are often time-consuming and do not provide a complete picture of the dynamic properties of proteins. Structural bioinformatics can complement experimental methods. Normal mode analysis (NMA) has been used successfully to study large global motions of proteins. Elastic network models (ENMs) significantly reduce the memory requirements for NMA. This talk shall introduce NMA with a particular focus on ENMs and their application in structural biology.


  • B Brooks & M Karplus. Harmonic dynamics of proteins: Normal modes and fluctuations in bovine pancreatic trypsin inhibitor. PNAS (1983) 80: 6571-6575.
  • M M Tirion. Large Amplitude Elastic Motions in Proteins from a Single-Parameter, Atomic Analysis. Phys Rev Lett (1996) 77: 1905-1908.
  • K Hinsen. Analysis of Domain Motions by Approximate Normal Mode Calculations. Proteins (1998) 33: 417-429.
  • S A Wieninger, E H Serpersu & G M Ullmann. ATP binding enables broad antibiotic selectivity of aminoglycoside phosphotransferase(3')-IIIa: an elastic network analysis. J Mol Biol (2011) 409: 450-465.
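
As a starting point for preparing this topic, the following minimal sketch shows how one simple ENM variant (a Gaussian-network-style model) turns a structure into a connectivity matrix: every pair of nodes closer than a cutoff is joined by an identical spring. The function name, the 7.0 Å cutoff and the toy coordinates are assumptions made for illustration; they are not taken from the papers listed above. The normal modes themselves would be obtained from the eigenvectors of this matrix, which is not done here.

```python
import math


def kirchhoff_matrix(coords, cutoff=7.0):
    """Build the connectivity (Kirchhoff) matrix of a simple elastic network.

    coords: list of (x, y, z) positions, e.g. one C-alpha atom per residue.
    cutoff: distance in Angstrom below which two nodes are joined by a spring
            (7.0 is an assumed value chosen for this sketch).
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                # Off-diagonal entries mark a contact between nodes i and j.
                gamma[i][j] = gamma[j][i] = -1
                # Diagonal entries count the contacts of each node.
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Toy chain of four nodes spaced 3.8 Angstrom apart along the x axis.
toy_coords = [(3.8 * k, 0.0, 0.0) for k in range(4)]
gamma = kirchhoff_matrix(toy_coords)
```

Only the node positions and one cutoff enter the model, which is why ENMs are so much cheaper than NMA with a full atomic force field.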


1000 genomes, Neandertals, and Chimps

Dr. Arthur Dong

These are landmark papers that offer fascinating stories of human evolution. Our focus here is to understand what makes us human (as distinguished from Chimps and Neandertals) and to understand the common variation among human populations. Such common SNPs provide a background for the investigation of rare, disease-causing SNPs.


  • A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (28 October 2010)
  • A Draft Sequence of the Neandertal Genome. Science 7 May 2010: Vol. 328 no. 5979 pp. 710-722
  • Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437, 69-87 (1 September 2005)


Hunting disease SNPs

Dr. Arthur Dong

The first paper is a broad overview of common diseases where genome-wide SNP hunting is possible. The second is really a biology paper, but has a nice bioinformatics part to extend the experimental results.


  • Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661-678 (7 June 2007)
  • IFITM3 restricts the morbidity and mortality associated with influenza. Nature 484, 519–523 (26 April 2012)


Intrinsically Disordered Proteins in Human Diseases

Diplom. Biol. Esmeralda Vicedo

Intrinsically disordered proteins (IDPs) lack stable tertiary and/or secondary structures under physiological conditions in vitro. IDPs are involved in regulation, signaling, and control, and their functions are tuned via alternative splicing and posttranslational modifications. Numerous IDPs are associated with human diseases, including cancer, cardiovascular disease, amyloidoses, neurodegenerative diseases, and diabetes. Overall, intriguing interconnections among intrinsic disorder, cell signaling, and human diseases suggest that protein conformational diseases may result not only from protein misfolding, but also from misidentification, missignaling, and unnatural or nonnative folding.


  • Intrinsically Disordered Proteins in Human Diseases: Introducing the D2 Concept; Uversky VN, Oldfield CJ, Dunker AK; Annu Rev Biophys. 2008;37:215-46.
  • Intrinsically disordered proteins from A to Z; Uversky VN; The International Journal of Biochemistry & Cell Biology. 2011;43:1090-1103.

Protein 3D structure prediction from evolutionary sequence variation

Thomas Hopf

The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Yet, a major challenge is to distinguish true residue coevolution from the noisy set of observed correlations.

This talk should outline the concept of correlated mutation analysis to infer evolutionary constraints. Starting from the limitations of local statistical models, it should introduce the global maximum entropy model by Marks et al., and show how this model can be used to compute protein 3D structures from sequence alone.

  • Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, et al. (2011) Protein 3D Structure Computed from Evolutionary Sequence Variation. PLoS ONE 6(12): e28766
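
To make the notion of a "local" statistical model concrete, the sketch below computes the mutual information between two columns of a multiple sequence alignment, one commonly used pairwise covariation measure. It is an illustrative assumption, not the method of Marks et al.: the function name and the toy alignment are made up, and no sequence weighting, gap handling or pseudocounts are applied.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    msa: list of equal-length aligned sequences (strings).
    """
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)  # residue counts in column i
    col_j = Counter(seq[j] for seq in msa)  # residue counts in column j
    pairs = Counter((seq[i], seq[j]) for seq in msa)  # joint counts
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts to limit rounding.
        mi += p_ab * math.log(p_ab * n * n / (col_i[a] * col_j[b]))
    return mi


# Toy alignment: columns 0 and 1 change together, column 2 is conserved.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
mi_covarying = column_mutual_information(toy_msa, 0, 1)
mi_conserved = column_mutual_information(toy_msa, 0, 2)
```

Because each column pair is scored in isolation, such a measure cannot tell direct couplings from correlations that arise indirectly through a third position, which is the limitation the global model is meant to address.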

Membrane protein 3D structure and function prediction from genomic sequencing

Thomas Hopf

Up to 30% of all human proteins are integral membrane molecules which play vital roles in cell-cell communication, tissue organization and transport. Yet, despite their outstanding relevance as drug targets and considerable advances in experimental structure determination, most membrane protein 3D structures remain unknown.

Building upon the previous topic (Protein 3D structure prediction from evolutionary sequence variation), this talk should introduce the adaptation of EVfold to the prediction of alpha-helical transmembrane proteins. In addition to the method details, the talk should outline the use of EVfold_membrane to predict the structure of unsolved membrane proteins, and how to learn about their oligomerization, functional sites, conformational changes and the impact of genetic variation.

  • T. A. Hopf, L. J. Colwell, R. Sheridan, B. Rost, C. Sander & D. S. Marks (2012). Three-dimensional structures of membrane proteins from genomic sequencing. Cell. 149 (7), 1607-21


Predicting subcellular localization using functional hierarchies

Tatyana Goldberg

Identification of a protein’s subcellular localization is an important step towards elucidating its function. In this seminar, a machine-learning-based method for predicting localization in prokaryotes and eukaryotes shall be presented. The method is original in incorporating a hierarchical ontology of subcellular localization classes. Furthermore, it uses predicted features such as the secondary structure of a protein, and evolutionary information in the form of sequence profiles, to improve prediction accuracy considerably.


  • Alberts B, Bray D, Lewis J, Raff M, Roberts K, Watson JD. Molecular Biology of the Cell. New York: Garland Science, 2002; Chapter: "Intracellular compartments and protein sorting".

Improved localization prediction from evolutionary profiles

Tatyana Goldberg



Nuclear import and sorting of proteins

Tatyana Goldberg

Quantitative experimental analyses of the nuclear interior reveal morphologically distinct membrane-less compartments. The translocation of proteins from the cytosol into the nucleus and their subsequent association with the nuclear sub-compartments represent two distinct levels of cellular regulation. At the first level, nuclear import and export of proteins is largely governed by the nuclear pore complexes and specific cargo molecules. In contrast, at the second level of regulation, the mechanism of protein sorting into nuclear sub-compartments is not well understood. This seminar shall introduce prediction models for nuclear protein import and sorting, and shed light on the underlying mechanisms of translocation.

  • Mehdi AM, Sehgal MS, Kobe B, Bailey TL, Bodén M (2011). A probabilistic model of nuclear import of proteins. Bioinformatics 27(9):1239-46. www.ncbi.nlm.nih.gov/pubmed/21372083
  • Bauer DC, Willadsen K, Buske FA, Lê Cao KA, Bailey TL, Dellaire G, Bodén M (2011). Sorting the nuclear proteome. Bioinformatics 27(13):i7-14. www.ncbi.nlm.nih.gov/pubmed/21685104

Conditional Random Fields for Named-Entity Recognition

Juan Miguel Cejuela

Conditional random fields (CRFs) are popular methods in named-entity recognition (NER) and in sequential labeling tasks in general. This talk shall present CRF models and their advantages in comparison to other popular models such as hidden Markov models (HMMs). An example case will focus on the recognition of protein names.

  • Charles Sutton and Andrew McCallum. An Introduction to Conditional Random Fields for Relational Learning. In Lise Getoor and Ben Taskar, editors, Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning), chapter 4.
  • John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01.
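
The sketch below illustrates only the decoding step of a linear-chain sequence labeler on a protein-name example: Viterbi search over additive emission and transition scores, which is how a trained linear-chain CRF picks its best label sequence. Everything specific here is an assumption for illustration: the B/I/O label set, the single capitalisation feature, and the hand-set scores, which a real CRF would learn from annotated text.

```python
def viterbi(tokens, labels, emission, transition):
    """Return the highest-scoring label sequence under a linear-chain model.

    emission(token, label) and transition(prev_label, label) return additive
    scores, standing in for the weighted feature sums of a linear-chain CRF.
    """
    best = [{lab: emission(tokens[0], lab) for lab in labels}]
    back = []
    for tok in tokens[1:]:
        scores, pointers = {}, {}
        for lab in labels:
            # Best previous label for ending in `lab` at this position.
            prev = max(labels, key=lambda p: best[-1][p] + transition(p, lab))
            scores[lab] = best[-1][prev] + transition(prev, lab) + emission(tok, lab)
            pointers[lab] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


LABELS = ["O", "B", "I"]  # outside, begin, inside a protein name (assumed scheme)


def emission(token, label):
    # Hypothetical feature: capitalised tokens look like part of a name.
    name_like = token[0].isupper()
    inside_name = label in ("B", "I")
    return 1.0 if name_like == inside_name else -1.0


def transition(prev, label):
    # Hand-set scores: "I" must continue a name, and names prefer to extend.
    if label == "I":
        return -10.0 if prev == "O" else 0.5
    if label == "B" and prev != "O":
        return -0.5
    return 0.0


tags = viterbi(["the", "Heat", "Shock", "Protein", "binds"], LABELS, emission, transition)
```

An HMM can be decoded with the same algorithm; the practical difference lies in how the scores are obtained, since CRF features may be arbitrary, overlapping functions of the observed tokens.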




HIV mutational pathways

Lothar Richter

HIV is the causative agent of AIDS, which is still not curable. A lot of research has been done to understand the underlying molecular mechanisms and to elucidate the basis of acquired resistance to anti-viral drugs. The talk should cover some computational approaches to detect and predict mutation pathways leading to resistance against various types of anti-viral agents. The given literature should serve as a starting point for exploring more recent developments in the field.

  • Lawyer et al. AIDS Research and Therapy 2011, 8:26. http://www.aidsrestherapy.com/content/8/1/26
  • Richter, L, Augustin, R, and Kramer, S (2009). Finding Relational Associations in HIV Resistance Mutation Data. In: Proceedings of the 19th International Conference on Inductive Logic Programming (ILP 2009), ed. by Luc De Raedt, Springer Verlag.


Sequence/structure-based strategy for in-silico protein function prediction

Frank Wallrapp

The number of available protein sequences has increased exponentially with the advent of high-throughput genomic sequencing, creating a significant challenge for functional annotation. Since common in-silico function prediction tools only provide functional clues, there is a need for new strategies that predict the specific molecular function of unknown enzymes using a general, more direct method. This talk shall introduce a novel sequence/structure-based strategy for protein function prediction and critically evaluate its concept and applicability in reference to other state-of-the-art computational approaches.

  • Grant, M. A. (2010). Integrating computational protein function prediction into drug discovery initiatives. (D. Gurwitz, Ed.) Drug Development Research, 72(1), 4–16. doi:10.1002/ddr.20397
  • Kalyanaraman, C., & Jacobson, M. P. (2010). Studying Enzyme−Substrate Specificity in Silico: A Case Study of the Escherichia coli Glycolysis Pathway. Biochemistry, 49(19), 4003–4005. doi:10.1021/bi100445g
  • Wallrapp, F. H., Pan, J.-J., Ramamoorthy, G., Almonacid, D. E., Hillerich, B. S., Seidel, R., et al. (2013). Prediction of function for the polyprenyl transferase subgroup in the isoprenoid synthase superfamily. Proceedings of the National Academy of Sciences of the United States of America, 110(13), E1196–E1202. doi:10.1073/pnas.1300632110