Seminar 'Proteins and Disease' WS15/16

Type  Seminar (2 SWS)
ECTS 4.0
Lecturer Burkhard Rost, Maximilian Hecht
Time Monday, 12:00 - 13:30
Room MI 01.09.034
Language English

Application / Registration

Application is organised centrally for all bioinformatics seminars. After you have been assigned to our seminar, we will distribute the topics.

Content

Topics related to the research interests of the group: protein sequence analysis, sequence-based predictions, protein structure prediction and analysis, and interaction networks.

Pre-meeting

The pre-meeting will be held on Jul 21st at 11 am in Room MI 01.09.034.

The rules and hints for preparing the seminar that were discussed in the pre-meeting are also summarised in our Checklist and on these slides (updated Jul 21st).

Final Schedule

Date | Topic | Supervisor | Student
Oct 19 | Biological Databases | Cejuela | Wentzig
Oct 26 | Predicting subcellular localization using functional hierarchies | Goldberg | Wirth
Oct 26 | Protein localization prediction from evolutionary profiles | Goldberg | Werner
Nov 2 | Protein disorder--a breakthrough invention of evolution? | Goldberg | Maier
Nov 2 | Mass-spectrometry-based draft of the human proteome | Kloppmann | Ren
Nov 9 | CRISPR/Cas | Reeb/Richter | Sturm
Nov 9 | Single Cell Sequencing | Reeb | Hadziahmetovic
Nov 16 | Robustness and evolvability of proteins | Hecht | Madin
Nov 16 | Predicting functional effects of sequence variants | Hecht | Zwiebel
Nov 23 | PolyPhobius: Prediction of transmembrane helices in protein sequences | Reeb | Giurgiu
Nov 23 | Observations in Fold Space | Schafferhans | de Motte
Nov 30 | HIV Mutational Pathways | Richter | Sigl
Nov 30 | Conditional random fields for named entity recognition | Cejuela | Gilicze

Description of Topics

This list is preliminary and will be extended by 1-3 more topics.

Conditional random fields for mutation mention recognition (NER applied to mutations)

Juan Miguel Cejuela

We will look into one specific problem of text mining and research discovery: the recognition of mutations in publications. Simple mentions of mutations, such as SNPs like "A32G", are easy to recognize with regular expressions or rule-based approaches (see, for example, the system MutationFinder). However, more complex mentions that use natural language still pose an unsolved challenge (e.g. "a homozygous complex mutation in Family B that consisted of a deletion of GC at codon 48 (exon 2) with the insertion of 25 bp (c.143–144delGCins25)"). Moreover, implementations and evaluations of mutation finders on full-text articles are scarce to nonexistent. In this seminar, we will look into the latest state-of-the-art systems for mutation mention recognition [2]. These systems are machine-learning-based and typically apply Conditional Random Fields (CRFs) [1] for named-entity recognition. We will look at the challenges of recognizing natural-language mentions and different types of mutations: single substitutions, multiple substitutions, insertions, deletions, translocations, etc. Finally, we will discuss the importance of building successful text mining systems for mutation recognition. Note that this seminar may require some programming (in Java or C++) to gain hands-on experience in constructing and applying CRFs.

  1. Lafferty, J., McCallum, A., Pereira, F. (2001). "Conditional random fields: Probabilistic models for segmenting and labeling sequence data". Proc. 18th International Conf. on Machine Learning. Morgan Kaufmann. pp. 282–289.
  2. Wei, C.-H., Harris, B. R., Kao, H.-Y., Lu, Z. (2013). "tmVar: a text mining approach for extracting sequence variants in biomedical literature". Bioinformatics 29 (11).
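
To illustrate why simple mentions are considered easy, here is a minimal Python sketch of a regular expression for substitutions written like "A32G". The pattern, the function name find_simple_substitutions, and the example sentences are assumptions made for this illustration; they are not the patterns used by MutationFinder or tmVar.

```python
import re

# One-letter amino-acid codes. Assumption: we only target protein-level
# point substitutions written as <wild type><position><mutant>, e.g. "A32G".
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

SIMPLE_SUBSTITUTION = re.compile(
    rf"\b([{AMINO_ACIDS}])([1-9][0-9]*)([{AMINO_ACIDS}])\b"
)


def find_simple_substitutions(text):
    """Return (wild_type, position, mutant) tuples for mentions like 'A32G'."""
    return [
        (m.group(1), int(m.group(2)), m.group(3))
        for m in SIMPLE_SUBSTITUTION.finditer(text)
    ]


# Hypothetical example sentence, invented for illustration.
simple_mention = "The A32G variant and the R175H variant were both observed."
# Fragment of the natural-language mention quoted in the topic description.
complex_mention = (
    "a deletion of GC at codon 48 (exon 2) with the insertion of 25 bp"
)

print(find_simple_substitutions(simple_mention))   # two substitutions found
print(find_simple_substitutions(complex_mention))  # nothing found
```

The pattern finds both simple mentions but returns nothing for the natural-language description, which is the gap that the machine-learning-based systems discussed in this seminar try to close.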

 

Biological Databases

Juan Miguel Cejuela

Huge volumes of primary data are archived in numerous open-access databases, and these volumes keep growing as new-generation technologies become more common in laboratories. This seminar shall give an overview of different databases, how to access them, and the problems associated with them.

  • Arthur M. Lesk. Introduction to bioinformatics (Third Edition) Oxford University Press

Predicting subcellular localization using functional hierarchies

Tatyana Goldberg

Identification of a protein’s subcellular localization is an important step towards elucidating its function. In this seminar, a machine-learning-based method for predicting localization in prokaryotes and eukaryotes shall be presented. The method is original in incorporating a hierarchical ontology of subcellular localization classes. Furthermore, it uses predicted features, such as the secondary structure of a protein, and evolutionary information in the form of sequence profiles to improve prediction accuracy considerably.

Literature:

  • Alberts B, Bray D, Lewis J, Raff M, Roberts K, Watson JD. Molecular Biology of the Cell. New York: Garland Science, 2002
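
As a rough illustration of what "evolutionary information in the form of sequence profiles" means, the following Python sketch computes per-column residue frequencies from a small multiple sequence alignment. The function name column_frequencies and the toy alignment are assumptions made for this example; profiles used by real predictors are typically more elaborate (for example with sequence weighting and pseudocounts), and none of that is shown here.

```python
from collections import Counter


def column_frequencies(alignment):
    """Per-column residue frequencies for equal-length aligned sequences.

    Gap characters ('-') are ignored when counting.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment if seq[i] != "-")
        total = sum(counts.values())
        # An all-gap column yields an empty dictionary.
        profile.append(
            {aa: n / total for aa, n in counts.items()} if total else {}
        )
    return profile


# Hypothetical toy alignment, invented for illustration only.
toy_alignment = ["MKVL", "MRVL", "MKIL", "MK-L"]
profile = column_frequencies(toy_alignment)

for position, freqs in enumerate(profile, start=1):
    print(position, freqs)
```

Each position is then described by a distribution over residues instead of a single letter, so a machine-learning method can see which positions are conserved and which vary across related sequences.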

 

Protein localization prediction from evolutionary profiles

Tatyana Goldberg

Identification of a protein’s subcellular localization is an important step towards elucidating its function. In this seminar, machine-learning-based methods for predicting localization in prokaryotes and eukaryotes shall be presented. The methods incorporate a hierarchical ontology of subcellular localization classes. The predictions are derived from evolutionary information (LocTree2/3) as well as from the powerful sequence homology-based BLAST (LocTree3).

Literature:



Protein disorder — a breakthrough invention of evolution?

Tatyana Goldberg

Regions in proteins that do not adopt regular three-dimensional structures in isolation are called disordered regions. In this seminar, the functional and structural aspects of disordered proteins shall be discussed. Though only one literature source is provided, the student is expected to use and refer to additional sources in the presentation for a detailed understanding of protein disorder.

Literature:

  • Schlessinger A, Schaefer C, Vicedo E, Schmidberger M, Punta M, Rost B (2011). Protein disorder--a breakthrough invention of evolution? Curr Opin Struct Biol. Jun;21(3):412-8 http://www.ncbi.nlm.nih.gov/pubmed/21514145
  • ...

 

Mass-spectrometry-based draft of the human proteome

Dr. Edda Kloppmann

Using mass spectrometry, researchers from TUM have produced an almost complete inventory of the human proteome. This information is now freely available in the ProteomicsDB database, which is a joint development of TUM and the software company SAP. The database includes information on, for example, the types, distribution, and abundance of proteins in various cells and tissues as well as in body fluids. The talk shall briefly introduce mass spectrometry and then focus on the results of the publication below and on ProteomicsDB.

Literature:

  • Mathias Wilhelm, Judith Schlegl, Hannes Hahne, Amin Moghaddas Gholami, Marcus Lieberenz, Mikhail M. Savitski, Emanuel Ziegler, Lars Butzmann, Siegfried Gessulat, Harald Marx, Toby Mathieson, Simone Lemeer, Karsten Schnatbaum, Ulf Reimer, Holger Wenschuh, Martin Mollenhauer, Julia Slotta-Huspenina, Joos-Hendrik Boese, Marcus Bantscheff, Anja Gerstmair, Franz Faerber & Bernhard Kuster, Mass-spectrometry-based draft of the human proteome; Nature, DOI: 10.1038/nature13319
  • http://www.tum.de/en/about-tum/news/press-releases/short/article/31545/

 

PolyPhobius: Prediction of transmembrane helices in protein sequences

Jonas Reeb

PolyPhobius uses hidden Markov models (HMMs) to predict transmembrane helices in protein sequences. This talk shall introduce transmembrane proteins, HMMs, and sequence-based transmembrane helix prediction, using PolyPhobius as an example.

Literature:

  • Alberts
  • Bioinformatics
  • Lukas Käll, Anders Krogh and Erik Sonnhammer. An HMM posterior decoder for sequence feature prediction that includes homology information. Bioinformatics, 21 (Suppl 1):i251-i257, June 2005.
  • Bernsel, A., Viklund, H., Falk, J., Lindahl, E., Von Heijne, G., & Elofsson, A. (2008). Prediction of membrane-protein topology from first principles. Proceedings of the National Academy of Sciences, 105(20), 7177–7181. doi:10.1073/pnas.0711151105
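
To give a feeling for how an HMM can label residues as membrane or non-membrane, here is a minimal Python sketch of Viterbi decoding on a two-state toy model. This is not the PolyPhobius model, which is far richer and, according to the cited paper, also includes homology information. The states, the hydrophobic residue set, and all probabilities below are made-up values for illustration only.

```python
import math

STATES = ("O", "M")  # O = non-membrane, M = membrane helix
HYDROPHOBIC = set("AILMFVW")  # crude residue grouping, an assumption

# All probabilities are invented illustration values, not trained parameters.
START = {"O": 0.9, "M": 0.1}
TRANS = {"O": {"O": 0.9, "M": 0.1}, "M": {"O": 0.1, "M": 0.9}}
EMIT = {"O": {"h": 0.3, "p": 0.7}, "M": {"h": 0.8, "p": 0.2}}


def viterbi(sequence):
    """Return the most probable state path for a sequence under the toy HMM."""
    if not sequence:
        return ""
    # Reduce each residue to 'h' (hydrophobic) or 'p' (everything else).
    obs = ["h" if aa in HYDROPHOBIC else "p" for aa in sequence]

    # Log-probability of the best path ending in each state so far.
    score = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    backpointers = []

    for symbol in obs[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            best_prev = max(
                STATES, key=lambda p: score[p] + math.log(TRANS[p][s])
            )
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][symbol])
            )
            pointer[s] = best_prev
        backpointers.append(pointer)
        score = new_score

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    return "".join(reversed(path))


# Hypothetical sequence: polar flanks around a hydrophobic stretch.
toy_sequence = "KDEKDE" + "L" * 10 + "KDEKDE"
print(toy_sequence)
print(viterbi(toy_sequence))
```

With these toy parameters, the ten-residue hydrophobic stretch is labelled M and the polar flanks are labelled O. Working in log space avoids numerical underflow on longer sequences.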

 

Robustness and evolvability of proteins

Maximilian Hecht

Mutations are the catalysts of evolution. Phenotypes need to be robust against mutation in order to prevail. On the other hand, species need to be able to adapt their phenotypes to changing selection pressure. Therefore, robustness seems to be the opposite of evolvability. This topic is aimed at explaining the complex relationship between robustness and evolvability in proteins in the light of tolerating mutations at certain positions while being sensitive at others. The given literature is merely a starting point for further reading and should not be considered complete.

Literature:

  • Draghi, J.A., et al. (2010) Mutational robustness can facilitate adaptation, Nature, 463, 353-355.
  • Kowarsch, A., et al. (2010) Correlated mutations: a hallmark of phenotypic amino acid substitutions, PLoS Comput Biol, 6.
  • McLaughlin, R.N., Jr., et al. (2012) The spatial architecture of protein function and adaptation, Nature, 491, 138-142.

CRISPR/Cas

Jonas Reeb / Lothar Richter

Clustered regularly interspaced short palindromic repeat (CRISPR) technology is derived from a microbial defense system. It builds on the remarkable ability of a short RNA to bring the endonuclease Cas9 to specific locations within complex genomes, and it has been developed to precisely edit the genome, to build toolkits for synthetic biology, and to monitor DNA in live cells. This seminar presents the underlying principles and possible applications.

  • http://www.cell.com/nucleus-CRISPR
  • Mali, P., Esvelt, K. M., & Church, G. M. (2013). Cas9 as a versatile tool for engineering biology. Nature Methods, 10(10), 957–963.
  • Sander, J. D., & Joung, J. K. (2014). CRISPR-Cas systems for editing, regulating and targeting genomes. Nature Biotechnology, 32(4), 347–55. doi:10.1038/nbt.2842
  • Hsu, P. D., Lander, E. S., & Zhang, F. (2014). Development and Applications of CRISPR-Cas9 for Genome Engineering. Cell, 157(6), 1262–1278. doi:10.1016/j.cell.2014.05.010

 

Single Cell Sequencing

Jonas Reeb

While genome approaches are in many cases being extended to metagenome approaches, there is also a specialization in the opposite direction. Single-cell sequencing acknowledges the diversity within tissues and cell populations. This talk will present this new approach.

  • Kolodziejczyk, A. A., Kim, J. K., Svensson, V., Marioni, J. C., & Teichmann, S. A. (2015). The Technology and Biology of Single-Cell RNA Sequencing. Molecular Cell, 58(4), 610–620. doi:10.1016/j.molcel.2015.04.005
  • Buettner, F., Natarajan, K. N., Casale, F. P., Proserpio, V., Scialdone, A., Theis, F. J., … Stegle, O. (2015). Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nature Biotechnology, 33(2). doi:10.1038/nbt.3102
  • Stegle, O., Teichmann, S. A., & Marioni, J. C. (2015). Computational and analytical challenges in single-cell transcriptomics. Nature Reviews Genetics, 16(3), 133–145. doi:10.1038/nrg3833

 

Predicting functional effects of sequence variants

Maximilian Hecht

Elucidating the effects of naturally occurring genetic variation on the wild-type cellular function is one of the major challenges in personalized medicine. This talk shall explain how variant effects can be predicted and how this can help to further our understanding of naturally occurring variation and disease. The given literature is merely a starting point for further reading and should not be considered complete.

Literature:

  • Bromberg, Y., & Rost, B. (2007). SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Research, 35(11), 3823-3835.
  • Cline, M. S., & Karchin, R. (2011). Using bioinformatics to predict the functional impact of SNVs. Bioinformatics, 27(4), 441-448.
  • Hecht, M., Bromberg, Y., & Rost, B. (2013). News from the protein mutability landscape. Journal of Molecular Biology, 425(21), 3937-3948.
  • Bromberg, Y., Kahn, P. C. & Rost, B. (2013). Neutral and weakly nonneutral sequence variants may define individuality. Proc Natl Acad Sci U S A 110, 14255-60.

 

Observations in Fold Space

Andrea Schafferhans

HIV Mutational Pathways

Lothar Richter

HIV is the causative agent of AIDS, and HIV infection is still not curable. A lot of research has been done to understand the underlying molecular mechanisms and to elucidate the basis of acquired resistance to anti-viral drugs. The talk should cover some computational approaches to detect and predict mutation pathways leading to resistance against various types of anti-viral agents. The given literature should serve as a starting point for exploring more recent developments in the field.

  • Lawyer et al. AIDS Research and Therapy 2011, 8:26 http://www.aidsrestherapy.com/content/8/1/26
  • Richter, L, Augustin, R, and Kramer, S (2009). Finding Relational Associations in HIV Resistance Mutation Data
    In: Proceedings of the 19th International Conference on Inductive Logic Programming (ILP 2009), ed. by Luc De Raedt, Springer Verlag