Subcellular Localization

LOCtree is a novel system of support vector machines (SVMs) that predicts the subcellular localization of proteins, and the DNA-binding propensity of nuclear proteins, by incorporating a hierarchical ontology of localization classes modeled on biological processing pathways. Biological similarities are taken from the description of cellular components provided by the Gene Ontology (GO) Consortium. The GO definitions have been simplified and tailored to the problem of protein sorting. Technically, the ontology is implemented as a decision tree with SVMs as the nodes. LOCtree was extremely successful at learning evolutionary similarities among subcellular localization classes and was significantly more accurate than traditional networks at predicting subcellular localization. Whenever available, LOCtree also reports predictions based on the following: (1) nuclear localization signals found by PredictNLS; (2) localization inferred from Prosite motifs and Pfam domains found in the protein; and (3) SWISS-PROT keywords associated with the protein. In the last two cases, localization is inferred using the entropy-based LOCkey algorithm.
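
The idea of a decision tree whose nodes are classifiers can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the tree layout, the node names, the feature names (`signal_peptide`, `retention_score`, `nls_score`, `mito_score`) and the 0.5 thresholds are hypothetical, and simple threshold functions stand in for the trained SVMs. It is not LOCtree's actual architecture or code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Features = Dict[str, float]


@dataclass
class Node:
    """Internal tree node: one binary decision made by one classifier."""
    name: str
    decide: Callable[[Features], bool]  # stand-in for a trained SVM
    if_true: Union["Node", str]         # subtree or final class label
    if_false: Union["Node", str]


def predict(root: Union[Node, str], features: Features) -> Tuple[str, List[Tuple[str, bool]]]:
    """Walk the tree from the root; return the leaf label and the decisions taken."""
    node = root
    path: List[Tuple[str, bool]] = []
    while isinstance(node, Node):
        branch = node.decide(features)
        path.append((node.name, branch))
        node = node.if_true if branch else node.if_false
    return node, path


# Hypothetical hierarchy: split off the secretory pathway first, then refine.
TREE = Node(
    name="secretory_vs_intracellular",
    decide=lambda f: f.get("signal_peptide", 0.0) > 0.5,
    if_true=Node(
        name="extracellular_vs_organelle",
        decide=lambda f: f.get("retention_score", 0.0) <= 0.5,
        if_true="extracellular",
        if_false="organelle",
    ),
    if_false=Node(
        name="nuclear_vs_rest",
        decide=lambda f: f.get("nls_score", 0.0) > 0.5,
        if_true="nucleus",
        if_false=Node(
            name="cytosol_vs_mitochondrion",
            decide=lambda f: f.get("mito_score", 0.0) <= 0.5,
            if_true="cytosol",
            if_false="mitochondrion",
        ),
    ),
)

if __name__ == "__main__":
    label, decisions = predict(TREE, {"signal_peptide": 0.1, "nls_score": 0.9})
    print(label, decisions)
```

Because each node only separates the classes below it, related localization classes share the early decisions, which is one way a hierarchy can mirror the sorting pathway.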

PredictNLS is an automated tool for the analysis and in silico determination of nuclear localization signals (NLSs). In NLS discovery mode, PredictNLS searches a query protein for known and potential NLSs from NLSdb to determine whether the protein is likely to be targeted to the nucleus. If the protein is predicted to be nuclear, the program also reports whether a known DNA-binding motif is found. In motif detection mode, the program helps you decide whether a sequence motif is likely to act as a nuclear localization signal. The PredictNLS website also documents the largest collection of experimentally determined NLSs.
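
Searching a query protein for signal motifs, as in the discovery mode described above, can be sketched with regular expressions. This is a minimal illustration and not the PredictNLS implementation: the motif table `MOTIFS`, its entry names and the function `find_nls` are hypothetical, and the two patterns are illustrative stand-ins rather than entries taken from NLSdb (the first is modeled on the well-known SV40 large T antigen signal PKKKRKV).

```python
import re
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    motif: str    # name of the matching pattern
    start: int    # 0-based start position in the sequence
    end: int      # end position (exclusive)
    matched: str  # matched residues


# Hypothetical motif table; a real search would load curated patterns instead.
MOTIFS: Dict[str, str] = {
    "sv40_like": r"PKKKRKV",       # modeled on the SV40 large T antigen NLS
    "basic_cluster": r"[KR]{4,}",  # illustrative run of basic residues
}


def find_nls(sequence: str, motifs: Dict[str, str] = MOTIFS) -> List[Hit]:
    """Return all motif matches in the sequence, ordered by start position."""
    seq = sequence.upper()
    hits = [
        Hit(name, m.start(), m.end(), m.group())
        for name, pattern in motifs.items()
        for m in re.finditer(pattern, seq)
    ]
    return sorted(hits, key=lambda h: (h.start, h.motif))


if __name__ == "__main__":
    for hit in find_nls("MAAPKKKRKVEDL"):
        print(hit)
```

In this sketch a protein would be flagged as a nuclear candidate whenever `find_nls` returns at least one hit; a production tool would also weigh how specific each motif is.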

LOCkey is a database of the subcellular localization of eukaryotic proteins, inferred using SWISS-PROT keywords. LOCkey was the first fully automated algorithm for inferring subcellular localization from database annotations. In benchmark tests, LOCkey outperformed semi-automated methods that rely on expert annotators.

NLSdb is a database of nuclear localization signals (NLSs) and of nuclear proteins targeted to the nucleus by NLS motifs. NLSdb contains over 12500 predicted nuclear proteins and over 1500 DNA-binding proteins from six entirely sequenced eukaryotic proteomes (human, mouse, fly, worm, thale cress and yeast).

ER/Golgi Localization: This work analyzes experimentally characterized endoplasmic reticulum (ER) and Golgi apparatus retrieval motifs and estimates their specificity, in order to classify proteins as localized to the ER or the Golgi. It further investigates inferring ER and Golgi localization by homology transfer, that is, from sequence similarity to proteins known to localize to the ER or the Golgi.
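
Estimating how specific a retrieval motif is for a compartment can be sketched as follows. The sketch assumes one plausible reading of "specificity": the fraction of motif-carrying proteins that are annotated to the target compartment. The function name `motif_specificity`, the regular expressions and the six toy sequences with their labels are all hypothetical and invented for illustration; only the C-terminal KDEL and dilysine (KKXX) signals themselves are well-known ER retrieval motifs.

```python
import re
from typing import List, Optional, Tuple

# C-terminal retrieval motifs written as regular expressions.
# KDEL: soluble ER proteins; KKXX: dilysine signal of ER membrane proteins.
RETRIEVAL_MOTIFS = {"KDEL": r"KDEL$", "KKXX": r"KK..$"}


def motif_specificity(
    proteins: List[Tuple[str, str]], pattern: str, target: str
) -> Optional[float]:
    """Fraction of motif-carrying proteins annotated with `target`.

    `proteins` is a list of (sequence, annotated localization) pairs.
    Returns None when no protein carries the motif.
    """
    carriers = [loc for seq, loc in proteins if re.search(pattern, seq)]
    if not carriers:
        return None
    return sum(loc == target for loc in carriers) / len(carriers)


# Toy data, invented for this example only.
TOY_PROTEINS = [
    ("MKLLSAAKDEL", "ER"),
    ("MTTPGAVKDEL", "ER"),
    ("MSSQRTLKDEL", "cytosol"),  # motif present, but not an ER protein
    ("MAVLGKKSA", "ER"),
    ("MPLTEKKQE", "Golgi"),
    ("MGGSAALLE", "cytosol"),
]

if __name__ == "__main__":
    for name, pattern in RETRIEVAL_MOTIFS.items():
        print(name, motif_specificity(TOY_PROTEINS, pattern, "ER"))
```

A motif whose score on experimentally annotated proteins is high could then be trusted more when it is used to assign a localization to an uncharacterized protein.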