Supplement for
Loopy proteins appear conserved in evolution
Jinfeng Liu, Hepan Tan & Burkhard Rost
TOC:
· Table 1S: Number of NORS proteins predicted under different thresholds
· Table 2S: NORS predicted in proteomes
· Table 3S: NORS involved in protein-protein interactions listed in DIP
· Table 4S: Comparison between NORS and 'natively disordered regions'
· References for Supplement (all also quoted in manuscript)
Table 1S: Number of NORS proteins predicted under different thresholds
| LenWin (a) | %sec (b) | accLen (c) | nNORS_PDB (d) | FP_PDB (e) | nNORS_Gen (f) | %NORS_Gen (g) |
|---|---|---|---|---|---|---|
| 50 | 8 | 10 | 45 | 20 | 41757 | 23.0 |
| 50 | 8 | 15 | 19 | 5 | 32313 | 17.8 |
| 50 | 8 | 20 | 11 | 1 | 24927 | 13.7 |
| 50 | 10 | 10 | 50 | 24 | 44971 | 24.7 |
| 50 | 10 | 15 | 24 | 7 | 34311 | 18.9 |
| 50 | 10 | 20 | 12 | 2 | 26134 | 14.4 |
| 50 | 12 | 10 | 63 | 35 | 48047 | 26.4 |
| 50 | 12 | 15 | 27 | 9 | 36080 | 19.8 |
| 50 | 12 | 20 | 13 | 3 | 27174 | 14.9 |
| 50 | 14 | 10 | 73 | 44 | 51547 | 28.3 |
| 50 | 14 | 15 | 29 | 10 | 38181 | 21.0 |
| 50 | 14 | 20 | 14 | 4 | 28392 | 15.6 |
| 60 | 8 | 10 | 20 | 5 | 33857 | 18.6 |
| 60 | 8 | 15 | 7 | 0 | 27266 | 15.0 |
| 60 | 8 | 20 | 4 | 0 | 21653 | 11.9 |
| 60 | 10 | 10 | 25 | 6 | 37538 | 20.6 |
| 60 | 10 | 15 | 10 | 0 | 29653 | 16.3 |
| 60 | 10 | 20 | 5 | 0 | 23170 | 12.7 |
| 60 | 12 | 10 | 36 | 12 | 41532 | 22.8 |
| 60 | 12 | 15 | 16 | 2 | 32207 | 17.7 |
| 60 | 12 | 20 | 6 | 0 | 24760 | 13.6 |
| 60 | 14 | 10 | 47 | 22 | 44123 | 24.3 |
| 60 | 14 | 15 | 19 | 3 | 33825 | 18.6 |
| 60 | 14 | 20 | 7 | 0 | 25768 | 14.2 |
| 70 | 8 | 10 | 18 | 3 | 29935 | 16.5 |
| 70 | 8 | 15 | 7 | 0 | 24585 | 13.5 |
| 70 | 8 | 20 | 4 | 0 | 19789 | 10.9 |
| 70 | 10 | 10 | 21 | 5 | 33609 | 18.5 |
| 70 | 10 | 15 | 9 | 0 | 27065 | 14.9 |
| 70 | 10 | 20 | 4 | 0 | 21411 | 11.8 |
| **70 | 12 | 10 | 23 | 5 | 36203 | 19.9 |
| 70 | 12 | 15 | 10 | 0 | 28791 | 15.8 |
| 70 | 12 | 20 | 4 | 0 | 22555 | 12.4 |
| 70 | 14 | 10 | 32 | 11 | 38243 | 21.0 |
| 70 | 14 | 15 | 13 | 2 | 30133 | 16.6 |
| 70 | 14 | 20 | 6 | 1 | 23450 | 12.9 |
| 80 | 8 | 10 | 15 | 4 | 26676 | 14.7 |
| 80 | 8 | 15 | 8 | 0 | 22323 | 12.3 |
| 80 | 8 | 20 | 3 | 0 | 18211 | 10.0 |
| 80 | 10 | 10 | 17 | 5 | 29841 | 16.4 |
| 80 | 10 | 15 | 9 | 0 | 24521 | 13.5 |
| 80 | 10 | 20 | 4 | 0 | 19667 | 10.8 |
| 80 | 12 | 10 | 20 | 5 | 31864 | 17.5 |
| 80 | 12 | 15 | 9 | 0 | 25885 | 14.2 |
| 80 | 12 | 20 | 4 | 0 | 20603 | 11.3 |
| 80 | 14 | 10 | 23 | 8 | 35380 | 19.5 |
| 80 | 14 | 15 | 11 | 1 | 28209 | 15.5 |
| 80 | 14 | 20 | 4 | 0 | 22135 | 12.2 |
| 90 | 8 | 10 | 8 | 1 | 23951 | 13.2 |
| 90 | 8 | 15 | 3 | 0 | 20263 | 11.1 |
| 90 | 8 | 20 | 1 | 0 | 16716 | 9.2 |
| 90 | 10 | 10 | 12 | 3 | 26239 | 14.4 |
| 90 | 10 | 15 | 5 | 0 | 21986 | 12.1 |
| 90 | 10 | 20 | 2 | 0 | 17897 | 9.8 |
| 90 | 12 | 10 | 14 | 3 | 28315 | 15.6 |
| 90 | 12 | 15 | 6 | 0 | 23442 | 12.9 |
| 90 | 12 | 20 | 2 | 0 | 18874 | 10.4 |
| 90 | 14 | 10 | 20 | 6 | 31272 | 17.2 |
| 90 | 14 | 15 | 9 | 1 | 25520 | 14.0 |
| 90 | 14 | 20 | 2 | 0 | 20271 | 11.1 |
| 100 | 8 | 10 | 4 | 1 | 21586 | 11.9 |
| 100 | 8 | 15 | 1 | 0 | 18533 | 10.2 |
| 100 | 8 | 20 | 1 | 0 | 15446 | 8.5 |
| 100 | 10 | 10 | 7 | 1 | 24088 | 13.2 |
| 100 | 10 | 15 | 3 | 0 | 20374 | 11.2 |
| 100 | 10 | 20 | 2 | 0 | 16718 | 9.2 |
| 100 | 12 | 10 | 11 | 2 | 26527 | 14.6 |
| 100 | 12 | 15 | 5 | 0 | 22183 | 12.2 |
| 100 | 12 | 20 | 2 | 0 | 17957 | 9.9 |
| 100 | 14 | 10 | 13 | 3 | 29108 | 16.0 |
| 100 | 14 | 15 | 6 | 0 | 23934 | 13.2 |
| 100 | 14 | 20 | 2 | 0 | 19150 | 10.5 |
(a) LenWin: length of the sequence window
(b) %sec: cutoff for the percentage of secondary structure
(c) accLen: cutoff for the minimum length of continuous exposed residues in the sequence window
(d) nNORS_PDB: number of NORS proteins predicted in PDBsub
(e) FP_PDB: number of false positives in the previous column
(f) nNORS_Gen: number of NORS proteins predicted in all 31 proteomes
(g) %NORS_Gen: percentage of NORS proteins in all 31 proteomes
** The threshold we used in the paper
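Read together, the three cutoffs suggest a sliding-window test. The sketch below is a minimal illustration under stated assumptions, not the implementation used for the paper. It assumes a per-residue secondary structure string (`H` for helix, `E` for strand, anything else counted as loop) and a per-residue list of exposure flags. It further assumes that a window qualifies when its secondary structure content is at most %sec and it contains at least accLen consecutive exposed residues. The names `longest_run` and `has_nors` are hypothetical; the defaults are taken from the starred row of Table 1S (LenWin 70, %sec 12, accLen 10).

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def has_nors(sec, exposed, len_win=70, max_sec_pct=12.0, acc_len=10):
    """Return True if any window of len_win residues passes both cutoffs.

    sec     -- string, one state per residue: 'H' (helix), 'E' (strand), else loop
    exposed -- sequence of booleans, True where a residue is predicted exposed

    Assumption: the %sec cutoff is an upper bound and the accLen cutoff
    is a lower bound on the longest exposed stretch inside the window.
    """
    if len(sec) != len(exposed):
        raise ValueError("sec and exposed must have the same length")
    # Proteins shorter than len_win yield no window and are never flagged.
    for start in range(len(sec) - len_win + 1):
        window_sec = sec[start:start + len_win]
        window_exp = exposed[start:start + len_win]
        n_regular = sum(1 for state in window_sec if state in "HE")
        sec_pct = 100.0 * n_regular / len_win
        if sec_pct <= max_sec_pct and longest_run(window_exp) >= acc_len:
            return True
    return False
```

With these defaults, a 70-residue stretch with 8 helical residues (about 11.4%) passes the %sec cutoff, while one with 9 helical residues (about 12.9%) does not.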
Table 2S: NORS predicted in proteomes
| Organism | Number of proteins | Percentage of NORS proteins | Percentage of residues in NORS |
|---|---|---|---|
| Archaebacteria | | | |
| Aeropyrum pernix K1 | 2694 | 13.1 | 6.8 |
| Archaeoglobus fulgidus | 2383 | 1.1 | 0.4 |
| Methanococcus jannaschii | 1735 | 0.6 | 0.3 |
| Methanobacterium thermoautotrophicum | 1871 | 1.8 | 0.7 |
| Pyrococcus abyssi | 1765 | 1.5 | 0.4 |
| Pyrococcus horikoshii | 2064 | 3.0 | 1.1 |
| Prokaryotes | | | |
| Aquifex aeolicus | 1522 | 1.4 | 0.4 |
| Bacillus subtilis | 4099 | 1.3 | 0.5 |
| Borrelia burgdorferi | 850 | 1.2 | 0.4 |
| Campylobacter jejuni | 1731 | 1.6 | 0.5 |
| Chlamydia pneumoniae | 1052 | 2.9 | 1.1 |
| Chlamydia trachomatis | 894 | 2.9 | 1.0 |
| Deinococcus radiodurans | 3103 | 4.8 | 1.9 |
| Escherichia coli | 4285 | 4.8 | 0.6 |
| Haemophilus influenzae | 1716 | 2.2 | 0.8 |
| Helicobacter pylori | 1788 | 1.2 | 0.4 |
| Mycoplasma genitalium | 470 | 3.0 | 0.9 |
| Mycoplasma pneumoniae | 677 | 3.7 | 1.4 |
| Mycobacterium tuberculosis | 3918 | 6.3 | 2.7 |
| Neisseria meningitidis | 2081 | 3.4 | 1.4 |
| Rickettsia prowazekii | 834 | 1.6 | 0.4 |
| Synechocystis PCC6803 | 3169 | 2.8 | 1.1 |
| Thermotoga maritima | 1846 | 1.6 | 0.5 |
| Treponema pallidum | 1031 | 4.1 | 1.3 |
| Ureaplasma urealyticum | 613 | 1.5 | 0.4 |
| Eukaryotes | | | |
| Arabidopsis thaliana | 25445 | 20.3 | 7.2 |
| Caenorhabditis elegans | 20011 | 17.5 | 6.8 |
| Drosophila melanogaster | 14333 | 27.1 | 10.8 |
| Saccharomyces cerevisiae | 6307 | 18.5 | 7.2 |
| Mus musculus | 28097 | 29.1 | 13.8 |
| Homo sapiens | 37313 | 30.2 | 14.9 |
Table 3S: NORS involved in protein-protein interactions listed in DIP [xxx 1]
| DIP entry | Interaction range | NORS region involved |
|---|---|---|
| 885 | yjr022w:ygl173c(1123-1500) | ygl173c(1280-1471) |
| 905 | ykl074c(41-141):ylr116w | ykl074c(20-201),ylr116w(296-476) |
| 1164 | yor122c:ynl271c(1239-1328) | ynl271c(28-110,225-315,1192-1363) |
| 1188 | ydr176w(1-373):ygl013c(813-1063) | ydr176w(157-237),ygl013c(881-1054) |
| 1189 | ydr176w(1-373):ybl005w(765-976) | ydr176w(157-237),ybl005w(850-961) |
| 1252 | ydl150w:ykr025w(62-200) | ykr025w(82-155) |
| 3245 | yjl124c:yal019w(432-950) | yal019w(442-517) |
| 3249 | yjl124c:ycr077c(95-350) | ycr077c(158-257) |
| 3255 | yjl124c:yel060c(58-300) | yel060c(15-185) |
| 3265 | yjl124c:ylr362w(172-350) | ylr362w(195-432) |
| 3284 | ybl026w:ydr440w(121-450) | ydr440w(65-195) |
| 3305 | ybl026w:yor191w(506-900) | yor191w(751-862) |
| 3307 | ybl026w:ypl115c(606-850) | ypl115c(738-884) |
| 3316 | ybl026w:ypr032w(15-300) | ypr032w(16-87) |
| 3317 | yer112w:ybr289w(360-650) | ybr289w(295-437) |
| 3327 | yer112w:ygl173c(853-1500) | ygl173c(872-986,1280-1471) |
| 3333 | ylr438c-a:ycr077c(168-350) | ycr077c(158-257) |
| 3338 | ylr438c-a:ykr099w(57-750) | ykr099w(456-723) |
| 3354 | yer112w:yol004w(492-650) | yol004w(522-670) |
| 3359 | yer146w:ycr077c(132-350) | ycr077c(158-257) |
| 3374 | yer146w:yjl084c(494-850) | yjl084c(528-983) |
| 3399 | ynl147w:ycr077c(153-350) | ycr077c(158-257) |
| 3414 | yjr022w:ygl028c(129-300) | ygl028c(124-304) |
| 3454 | ynl118c:ybl054w(161-450) | ybl054w(119-358) |
| 3459 | ynl118c:ygl173c(890-1450) | ygl173c(872-986,1280-1471) |
| 3461 | ynl118c:yhr186c(680-1350) | yhr186c(1076-1150) |
| 3699 | yor181w(169-402):ydr388w | yor181w(125-607) |
| 3905 | ydr388w(87-422):ymr192w; ydr388w(177-461):ymr192w | ydr388w(340-446) |
| 3994 | ydr515w(188-321):ydr388w | ydr515w(113-262) |
| 4017 | ydr362c(2-355):ydr388w | ydr362c(19-93) |
| 4036 | ypl181w(302-507):ydr388w; ypl181w(204-504):ydr388w | ypl181w(161-252,315-433) |
| 4048 | yor181w(169-402):ycr009c | yor181w(125-607) |
| 4051 | ymr192w(76-300):ycr009c; ymr192w(87-422):ycr009c | ymr192w(138-218) |
| 4100 | yfl008w(762-936):ygr089w(159-497) | ygr089w(295-449) |
| 4136 | ymr124w(249-416):yer149c(803-863) | ymr124w(263-373) |
| 4165 | yel043w(762-936):ygr089w(214-392) | yel043w(665-935),ygr089w(295-449) |
| 4209 | ypl155c(1-180):ypl124w(495-702) | ypl155c(1-163) |
Table 4S: Comparison between NORS and 'natively disordered regions' [xxx 2]
| SWISS-PROT ID [xxx 3] | Protein length | Native disorder regions (a) | NORS |
|---|---|---|---|
| SANT_PLAFW | 640 | 51-627 | 40-640 |
| SANT_PLAF7 | 593 | 68-557 | 38-593 |
| VG48_HSVSA | 797 | 413-720 | 427-741 |
| MLH_TETTH | 633 | 173-450 | 157-633 |
| CYL1_HUMAN | 598 | 238-492 | 212-343, 386-595 |
| NFH_MOUSE | 1087 | 523-761 | 403-1087 |
| RTOA_DICDI | 400 | 77-303 | 40-400 |
| SR75_HUMAN | 494 | 186-410 | 154-494 |
| RPB1_CRIGR | 467 | 1610-1830 (b) | 1-467 |
| LSTP_STAST | 480 | 56-223 | 26-233 |
| YHFI_SALTY | 416 | 57-210 | 55-259 |
| T2FA_DROME | 577 | 246-398 | 210-338 |
| H1_PEA | 265 | 100-252 | 115-257 |
| 110K_PLAKN | 296 | 135-283 | 99-296 |
| XYNA_RUMFL | 954 | 248-376 | 232-635 |
| VIT2_CHICK | 1850 | 1142-1266 | 1078-1370 |
| SNWA_DICDI | 685 | 398-521 | 356-514 |
| FHL1_YEAST | 936 | 800-923 | 718-936 |
| HYR1_CANAL | 937 | 621-741 | 373-520 |
| ANK2_HUMAN | 3924 | 1778-1897 | 1743-1958 |
a: taken from the publication of Romero et al. [xxx 2]
b: probably an error in the original paper; the reported region (1610-1830) lies beyond the listed protein length of 467 residues.
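One way to read Table 4S is to ask how much of each natively disordered region is also covered by a NORS region. The sketch below computes that fraction for inclusive residue ranges. The helper names `overlap` and `covered_fraction` are hypothetical, and the sketch assumes that the NORS regions listed for one protein do not overlap each other (which holds for the rows in the table).

```python
def overlap(a, b):
    """Number of residues shared by two inclusive ranges given as (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def covered_fraction(disorder, nors_regions):
    """Fraction of the disorder range that falls inside any NORS region.

    Assumption: the NORS regions are mutually non-overlapping, so their
    individual overlaps with the disorder range can simply be summed.
    """
    length = disorder[1] - disorder[0] + 1
    shared = sum(overlap(disorder, region) for region in nors_regions)
    return shared / length


# Three rows of Table 4S as examples.
print(covered_fraction((51, 627), [(40, 640)]))                          # SANT_PLAFW -> 1.0
print(round(covered_fraction((238, 492), [(212, 343), (386, 595)]), 3))  # CYL1_HUMAN -> 0.835
print(covered_fraction((621, 741), [(373, 520)]))                        # HYR1_CANAL -> 0.0
```

By this measure most rows show substantial coverage, while HYR1_CANAL shows none because its NORS region ends before the disordered region starts.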
References for Supplement
xxx 1 Xenarios, I., Salwinski, L., Duan, X. J., Higney, P., Kim, S. M. et al. (2002). DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucl. Acids Res., 30, 303-5.
xxx 2 Romero, P., Obradovic, Z., Kissinger, C. R., Villafranca, J. E., Garner, E. et al. (1998). Thousands of proteins likely to have long disordered regions. Pac. Symp. Biocomput., 437-448.
xxx 3 Bairoch, A. & Apweiler, R. (2000). The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucl. Acids Res., 28, 45-48.