Large-scale analysis of thermostable, mammalian proteins provides insights into the intrinsically disordered proteome.

Title: Large-scale analysis of thermostable, mammalian proteins provides insights into the intrinsically disordered proteome.
Publication Type: Journal Article
Year of Publication: 2009
Authors: Galea CA, High AA, Obenauer JC, Mishra A, Park C-G, Punta M, Schlessinger A, Ma J, Rost B, Slaughter CA, Kriwacki RW
Journal: J Proteome Res
Date Published: 2009 Jan
Keywords: Animals; Chromatography, Liquid; Computational Biology; Databases, Protein; Fibroblasts; Mass Spectrometry; Mice; NIH 3T3 Cells; Protein Conformation; Protein Folding; Proteome; Proteomics; Software; Temperature; Time Factors

Intrinsically disordered proteins are predicted to be highly abundant and to play broad biological roles in eukaryotic cells. In particular, by virtue of their structural malleability and propensity to interact with multiple binding partners, disordered proteins are thought to be specialized for roles in signaling and regulation. However, these concepts are based on in silico analyses of translated whole genome sequences, not on large-scale analyses of proteins expressed in living cells. Therefore, whether these concepts broadly apply to expressed proteins is currently unknown. Previous studies have shown that heat treatment of cell extracts leads to partial enrichment of soluble, disordered proteins. On the basis of this observation, we sought to address the current dearth of knowledge about expressed, disordered proteins by performing a large-scale proteomics study of thermostable proteins isolated from mouse fibroblast cells. With the use of novel multidimensional chromatography methods and mass spectrometry, we identified a total of 1320 thermostable proteins from these cells. Further, we used a variety of bioinformatics methods to analyze the structural and biological properties of these proteins. Interestingly, more than 900 of these expressed proteins were predicted to be substantially disordered. These were divided into two categories, with 514 predicted to be predominantly disordered and 395 predicted to exhibit both disordered and ordered/folded features. In addition, 411 of the thermostable proteins were predicted to be folded. Despite the use of heat treatment (60 min at 98 °C) to partially enrich for disordered proteins, which might have been expected to select for small proteins, the sequences of these proteins exhibited a wide range of lengths (622 ± 555 residues, average length ± standard deviation, for disordered proteins and 569 ± 598 residues for folded proteins).
Computational structural analyses revealed several unexpected features of the thermostable proteins: (1) disordered domains and coiled-coil domains occurred together in a large number of disordered proteins, suggesting functional interplay between these domains; and (2) more than 170 proteins contained lengthy domains (>300 residues) known to be folded. Reference to Gene Ontology Consortium functional annotations revealed that, while disordered proteins play diverse biological roles in mouse fibroblasts, they exhibit heightened involvement in several functional categories, including cytoskeletal structure and cell movement, metabolic and biosynthetic processes, organelle structure, cell division, gene transcription, and ribonucleoprotein complexes. We believe that these results reflect the general properties of the mouse intrinsically disordered proteome (IDP-ome), although they also reflect the specialized physiology of fibroblast cells. Future large-scale identification of expressed, thermostable proteins from other cell types, grown under varied physiological conditions, will dramatically expand our understanding of the structural and biological properties of disordered eukaryotic proteins.

Alternate Journal: J. Proteome Res.
PubMed ID: 19067583
PubMed Central ID: PMC2760310
Grant List:
2R01CA082491 / CA / NCI NIH HHS / United States
5R21CA104568 / CA / NCI NIH HHS / United States
P30 CA021765-31 / CA / NCI NIH HHS / United States
R01 CA082491-08 / CA / NCI NIH HHS / United States