Seminar 'Proteins and Disease' WS14/15

Type: Seminar (2 SWS)
ECTS: 4.0
Lecturer: Burkhard Rost
Time: Monday, 12:00 - 13:30
Room: MI 01.09.034
Language: English

Application / Registration

Application is organised centrally for all bioinformatics seminars. After you have been assigned to our seminar, we will distribute the topics.

Content

Topics related to the research interests of the group: protein sequence analysis, sequence based predictions, protein structure prediction and analysis; interaction networks.

Pre-meeting

The pre-meeting will be held on Friday, July 25, at 11 am in Room MI 01.09.034.

The rules and hints for preparing the seminar, as discussed in the pre-meeting, are also summarised in our Checklist and on these slides.

Final Schedule

Date | Topic | Supervisor | Student
Oct 13 | Conditional random fields for named entity recognition | Juan Miguel Cejuela | Anna-Kathrin Kopetzki
Oct 20 | Predicting the effect of mutations on protein–protein interactions | Cancelled
Oct 27 | Robustness and evolvability of proteins | Maximilian Hecht | Valérie Marot-Lassauzaie
Nov 3 | Prediction of drug-drug interactions through protein-protein-interaction networks | Tobias Hamp | Maria Wörheide
Nov 10 | Potassium channels | Edda Kloppmann | Joseph Schneider
Nov 17 | Three-dimensional reconstruction of protein networks | Andrea Schafferhans | René Schneider
Nov 24 | Protein disorder--a breakthrough invention of evolution? | Tatyana Goldberg | Kerstin Dörner
Dec 1 | Nuclear import and sorting of proteins | Tatyana Goldberg | Meshal Ansari
Dec 8 | Mass-spectrometry-based draft of the human proteome | Edda Kloppmann | Philippe Barias
Dec 15 | Structure comparisons and structure searches with TopMatch and TopSearch | Andrea Schafferhans | Simon Weck
Christmas break
Jan 12 | HIV Mutational Pathways | Lothar Richter | Andre Ofner
Jan 19 | Protein localization prediction from evolutionary profiles | Tatyana Goldberg | Carsten Uhlig
Jan 26 | PolyPhobius: Prediction of transmembrane helices in protein sequences | Cancelled

Description of Topics

Conditional random fields for mutation mention recognition (NER applied to mutations)

Juan Miguel Cejuela

We will look into one specific problem of text mining and research discovery: the recognition of mutations in publications. Simple mentions of mutations, such as SNPs like "A32G", are easy to recognize with regular expressions or rule-based approaches (see, for example, the system MutationFinder). However, more complex mentions that use natural language still pose an unsolved challenge (e.g. "a homozygous complex mutation in Family B that consisted of a deletion of GC at codon 48 (exon 2) with the insertion of 25 bp (c.143–144delGCins25)"). Moreover, implementations and evaluations of mutation finders on full-text articles are scarce to nonexistent. In this seminar, we will look into the latest state-of-the-art systems for mutation mention recognition [2]. These systems are machine-learning-based and typically apply Conditional Random Fields (CRFs) [1] for named-entity recognition. We will look at the challenges of recognizing natural-language mentions and different types of mutations: single substitutions, multiple substitutions, insertions, deletions, translocations, etc. Finally, we will discuss the importance of building successful text mining systems for mutation recognition. Note that this seminar may require some programming (in Java or C++) to gain hands-on experience in constructing and applying CRFs.

1. Lafferty, J., McCallum, A., Pereira, F. (2001). "Conditional random fields: Probabilistic models for segmenting and labeling sequence data". Proc. 18th International Conf. on Machine Learning. Morgan Kaufmann. pp. 282–289.
2. Chih-Hsuan Wei, Bethany R. Harris, Hung-Yu Kao, and Zhiyong Lu. (2013). "tmVar: a text mining approach for extracting sequence variants in biomedical literature" Bioinformatics (2013) 29 (11)
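
As a minimal illustration of the rule-based baseline mentioned in the topic description, the following Python sketch matches simple substitution mentions of the form "wild-type residue, position, mutant residue" (e.g. "A32G"). The pattern and the names AMINO_ACIDS, SIMPLE_SUBSTITUTION and find_simple_mutations are assumptions made for this example; it is not the rule set of MutationFinder or tmVar.

```python
import re

# Assumption: one-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical pattern: wild-type residue, 1-based position, mutant residue.
SIMPLE_SUBSTITUTION = re.compile(
    rf"\b([{AMINO_ACIDS}])([1-9]\d*)([{AMINO_ACIDS}])\b"
)


def find_simple_mutations(text):
    """Return (wild_type, position, mutant) tuples for simple mentions."""
    return [
        (m.group(1), int(m.group(2)), m.group(3))
        for m in SIMPLE_SUBSTITUTION.finditer(text)
    ]


sentence = "The A32G variant was found together with a deletion of GC at codon 48."
mentions = find_simple_mutations(sentence)
print(mentions)
```

The sketch finds "A32G" but misses the natural-language deletion in the same sentence, which is the kind of mention that motivates the CRF-based systems discussed in the seminar.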

Prediction of drug-drug interactions through protein-protein-interaction networks

Tobias Hamp

Knowing how different drugs influence each other in the body is of obvious importance for preventing side effects and ensuring therapeutic effectiveness. However, classical detection techniques often fall short in this respect. This talk presents a new computational approach that leverages the data found in protein-protein interaction networks and other sources to predict previously unknown drug-drug interactions in humans.

Predicting the effect of mutations on protein–protein interactions

Tobias Hamp

Knowing whether - or even how - two proteins interact is not enough. In order to design new and better drug therapies, we need to be able to strengthen or weaken interactions. This seminar presents a recent independent assessment of computational methods that predict the effect of single nucleotide polymorphisms on binding affinity in protein-protein interactions.

• R. Moretti, D. Baker et al. (2013). Community-wide evaluation of methods for predicting the effect of mutations on protein–protein interactions. Proteins 81(11):1980-1987. http://dx.doi.org/10.1002/prot.24356

Protein localization prediction from evolutionary profiles

Identification of a protein’s subcellular localization is an important step towards elucidating its function. In this seminar, machine-learning-based methods for predicting localization in prokaryotes and eukaryotes shall be presented. The methods incorporate a hierarchical ontology of subcellular localization classes. The predictions are derived from evolutionary information (Loctree2/3) as well as from sequence-homology searches with BLAST (Loctree3).

Literature:

Nuclear import and sorting of proteins

Quantitative experimental analyses of the nuclear interior reveal morphologically distinct membrane-less compartments. The translocation of proteins from the cytosol into the nucleus and their subsequent association with the nuclear sub-compartments represent two distinct levels of cellular regulation. At the first level, nuclear import and export of proteins is largely governed by the nuclear pore complexes and specific cargo molecules. In contrast, at the second level of regulation, the mechanism of protein sorting into nuclear sub-compartments is not well understood. This seminar shall introduce prediction models for nuclear protein import and sorting, and shed light on the underlying mechanisms of translocation.

Literature:

• Mehdi AM, Sehgal MS, Kobe B, Bailey TL, Bodén M (2011). A probabilistic model of nuclear import of proteins. Bioinformatics 27(9):1239-46. www.ncbi.nlm.nih.gov/pubmed/21372083
• Bauer DC, Willadsen K, Buske FA, Lê Cao KA, Bailey TL, Dellaire G, Bodén M (2011). Sorting the nuclear proteome. Bioinformatics 27(13):i7-14. www.ncbi.nlm.nih.gov/pubmed/21685104

Protein disorder — a breakthrough invention of evolution?

Regions in proteins that do not adopt regular three-dimensional structures in isolation are called disordered regions. In this seminar, the functional and structural aspects of disordered proteins shall be discussed. Though only one literature source is provided, the student is expected to consult and cite additional sources in the presentation for a detailed understanding of protein disorder.

Literature:

• Schlessinger A, Schaefer C, Vicedo E, Schmidberger M, Punta M, Rost B  (2011). Protein disorder--a breakthrough invention of evolution? Curr Opin Struct Biol. Jun;21(3):412-8 http://www.ncbi.nlm.nih.gov/pubmed/21514145
• ...

Mass-spectrometry-based draft of the human proteome

Using mass spectrometry, researchers from TUM have produced an almost complete inventory of the human proteome. This information is now freely available in the ProteomicsDB database, which is a joint development of TUM and the software company SAP. The database includes information on, for example, the types, distribution, and abundance of proteins in various cells and tissues as well as in body fluids. The talk shall briefly introduce mass spectrometry and then focus on the results of the publication below and on ProteomicsDB.

Literature:

• Mathias Wilhelm, Judith Schlegl, Hannes Hahne, Amin Moghaddas Gholami, Marcus Lieberenz, Mikhail M. Savitski, Emanuel Ziegler, Lars Butzmann, Siegfried Gessulat, Harald Marx, Toby Mathieson, Simone Lemeer, Karsten Schnatbaum, Ulf Reimer, Holger Wenschuh, Martin Mollenhauer, Julia Slotta-Huspenina, Joos-Hendrik Boese, Marcus Bantscheff, Anja Gerstmair, Franz Faerber & Bernhard Kuster, Mass-spectrometry-based draft of the human proteome; Nature, DOI: 10.1038/nature13319

PolyPhobius: Prediction of transmembrane helices in protein sequences

PolyPhobius uses hidden Markov models (HMMs) to predict transmembrane helices in protein sequences. This talk shall introduce transmembrane proteins, HMMs, and sequence-based transmembrane helix prediction, using PolyPhobius as an example.

Literature:

• Alberts
• Bioinformatics
• Lukas Käll, Anders Krogh and Erik Sonnhammer. An HMM posterior decoder for sequence feature prediction that includes homology information. Bioinformatics, 21 (Suppl 1):i251-i257, June 2005.
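
To make the idea of HMM-based helix prediction concrete, here is a toy Python sketch that labels each residue as membrane helix ("M") or non-membrane ("-") with a two-state HMM. Everything in it is an assumption made for illustration: the state set, the probabilities, the crude HYDROPHOBIC residue set, and the use of Viterbi decoding. It is not the PolyPhobius model, which has a far richer architecture, includes homology information, and uses a posterior decoder (see the reference above).

```python
import math

# Assumption: a crude set of hydrophobic residues (one-letter codes).
HYDROPHOBIC = set("AILMFVWC")

# Hypothetical two-state model: "-" = non-membrane, "M" = membrane helix.
STATES = ("-", "M")
START = {"-": 0.9, "M": 0.1}
TRANS = {"-": {"-": 0.9, "M": 0.1}, "M": {"-": 0.1, "M": 0.9}}
P_HYDROPHOBIC = {"-": 0.2, "M": 0.8}  # P(residue is hydrophobic | state)


def emission(state, residue):
    """Probability of seeing a hydrophobic or non-hydrophobic residue."""
    p = P_HYDROPHOBIC[state]
    return p if residue in HYDROPHOBIC else 1.0 - p


def viterbi(sequence):
    """Return the most probable state path for the sequence (log space)."""
    if not sequence:
        return ""
    scores = {
        s: math.log(START[s]) + math.log(emission(s, sequence[0]))
        for s in STATES
    }
    backpointers = []
    for residue in sequence[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for reaching s at this position.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev]
                + math.log(TRANS[prev][s])
                + math.log(emission(s, residue))
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


print(viterbi("DDDDDLLLLLLLLLLDDDDD"))  # -----MMMMMMMMMM-----
```

With these toy parameters, a single hydrophobic residue is not enough to open a helix segment, because entering and leaving the membrane state costs more than one mismatched emission.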

Robustness and evolvability of proteins

Maximilian Hecht

Mutations are the catalysts of evolution. Phenotypes need to be robust against mutation in order to prevail. On the other hand, species need to be able to adapt their phenotypes to changing selection pressure. Therefore, robustness seems to be the opposite of evolvability. This topic is aimed at explaining the complex relationship between robustness and evolvability in proteins in the light of tolerating mutations at certain positions while being sensitive at others. The given literature is merely a starting point for further reading and should not be considered complete.

Literature:

• Draghi, J.A., et al. (2010) Mutational robustness can facilitate adaptation, Nature, 463, 353-355.
• Kowarsch, A., et al. (2010) Correlated mutations: a hallmark of phenotypic amino acid substitutions, PLoS Comput Biol, 6.
• McLaughlin, R.N., Jr., et al. (2012) The spatial architecture of protein function and adaptation, Nature, 491, 138-142.

Predicting functional effects of sequence variants

Maximilian Hecht

Elucidating the effects of naturally occurring genetic variation on the wild-type cellular function is one of the major challenges in personalized medicine. This talk shall explain how variant effects can be predicted and how this can help to further our understanding of naturally occurring variation and disease. The given literature is merely a starting point for further reading and should not be considered complete.

Literature:

• Bromberg, Y., & Rost, B. (2007). SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic acids research, 35(11), 3823-3835.
• Cline, M. S., & Karchin, R. (2011). Using bioinformatics to predict the functional impact of SNVs. Bioinformatics, 27(4), 441-448.
• Hecht, M., Bromberg, Y., & Rost, B. (2013). News from the protein mutability landscape. Journal of molecular biology, 425(21), 3937-3948.
• Bromberg, Y., Kahn, P. C. & Rost, B. (2013). Neutral and weakly nonneutral sequence variants may define individuality. Proc Natl Acad Sci U S A 110, 14255-60.

Potassium channels

Dr. Edda Kloppmann

Potassium channels form pores across membranes that are selective for K+ ions. They constitute a major class of ion channels and occur in most organisms.

Literature:

• Y Jiang, A Lee, J Chen, V Ruta, M Cadene, BT Chait & R MacKinnon. X-ray structure of a voltage-dependent K+ channel. Nature (2003) 423: 33–41.
• ..

Structure comparisons and structure searches with TopMatch and TopSearch

TopMatch and TopSearch are twin tools to compare/align protein structures and to search the database of protein structures for structural matches. The talk should give an introduction to the alignment method and the special features of comparing entire structure complexes.

Literature:

• Markus Wiederstein, Markus Gruber, Karl Frank, Francisco Melo, Manfred J. Sippl, Structure-Based Characterization of Multiprotein Complexes, Structure, Volume 22, Issue 7, 8 July 2014, Pages 1063-1070, ISSN 0969-2126, http://dx.doi.org/10.1016/j.str.2014.05.005, http://www.sciencedirect.com/science/article/pii/S0969212614001452
• Sippl, M.J., Wiederstein, M. Detection of spatial correlations in protein structures and molecular complexes (2012) Structure, 20 (4), pp. 718-728. http://www.sciencedirect.com/science/article/pii/S096921261200055X
• Andrea Schafferhans, Burkhard Rost, Taking Structure Searches to the Next Dimension, Structure, Volume 22, Issue 7, 8 July 2014, Pages 938-939, ISSN 0969-2126, http://dx.doi.org/10.1016/j.str.2014.06.007 (http://www.sciencedirect.com/science/article/pii/S0969212614001828, http://authors.elsevier.com/a/1PKIy3SNvbaSIa)

Three-dimensional reconstruction of protein networks provides insight into human genetic disease

A recent study shows how reconstructing protein-protein interactions in 3D and mapping genetic variations onto the structures may help to understand disease mechanisms. The talk should give an overview of the 3D reconstruction and the lessons learned from mapping variation.

HIV Mutational Pathways

Lothar Richter

HIV is the causative agent of AIDS, and the infection is still not curable. A lot of research has been done to understand the underlying molecular mechanisms and to elucidate the basis of acquired resistance to anti-viral drugs. The talk should cover some computational approaches to detect and predict mutation pathways leading to resistance against various types of anti-viral agents. The given literature should serve as a starting point for exploring more recent developments in the field.

• Lawyer et al. AIDS Research and Therapy 2011, 8:26 http://www.aidsrestherapy.com/content/8/1/26
• Richter, L, Augustin, R, and Kramer, S (2009). Finding Relational Associations in HIV Resistance Mutation Data. In: Proceedings of the 19th International Conference on Inductive Logic Programming (ILP 2009), ed. by Luc De Raedt, Springer Verlag