Supplement for
Loopy proteins appear conserved in evolution

Jinfeng Liu, Hepan Tan & Burkhard Rost

 

 

TOC:

· Table 1S: Number of NORS proteins predicted under different thresholds
· Table 2S: NORS predicted in proteomes
· Table 3S: NORS involved in protein-protein interactions listed in DIP
· Table 4S: Comparison between NORS and 'natively disordered regions'
· References for Supplement (all also quoted in manuscript)


Table 1S: Number of NORS proteins predicted under different thresholds

 

  LenWin(a)  %Sec(b)  accLen(c)  nNORS_PDB(d)  FP_PDB(e)  nNORS_Gen(f)  %NORS_Gen(g)

  50         8        10         45            20         41757         23.0
  50         8        15         19            5          32313         17.8
  50         8        20         11            1          24927         13.7
  50         10       10         50            24         44971         24.7
  50         10       15         24            7          34311         18.9
  50         10       20         12            2          26134         14.4
  50         12       10         63            35         48047         26.4
  50         12       15         27            9          36080         19.8
  50         12       20         13            3          27174         14.9
  50         14       10         73            44         51547         28.3
  50         14       15         29            10         38181         21.0
  50         14       20         14            4          28392         15.6

  60         8        10         20            5          33857         18.6
  60         8        15         7             0          27266         15.0
  60         8        20         4             0          21653         11.9
  60         10       10         25            6          37538         20.6
  60         10       15         10            0          29653         16.3
  60         10       20         5             0          23170         12.7
  60         12       10         36            12         41532         22.8
  60         12       15         16            2          32207         17.7
  60         12       20         6             0          24760         13.6
  60         14       10         47            22         44123         24.3
  60         14       15         19            3          33825         18.6
  60         14       20         7             0          25768         14.2

  70         8        10         18            3          29935         16.5
  70         8        15         7             0          24585         13.5
  70         8        20         4             0          19789         10.9
  70         10       10         21            5          33609         18.5
  70         10       15         9             0          27065         14.9
  70         10       20         4             0          21411         11.8
**70         12       10         23            5          36203         19.9
  70         12       15         10            0          28791         15.8
  70         12       20         4             0          22555         12.4
  70         14       10         32            11         38243         21.0
  70         14       15         13            2          30133         16.6
  70         14       20         6             1          23450         12.9

  80         8        10         15            4          26676         14.7
  80         8        15         8             0          22323         12.3
  80         8        20         3             0          18211         10.0
  80         10       10         17            5          29841         16.4
  80         10       15         9             0          24521         13.5
  80         10       20         4             0          19667         10.8
  80         12       10         20            5          31864         17.5
  80         12       15         9             0          25885         14.2
  80         12       20         4             0          20603         11.3
  80         14       10         23            8          35380         19.5
  80         14       15         11            1          28209         15.5
  80         14       20         4             0          22135         12.2

  90         8        10         8             1          23951         13.2
  90         8        15         3             0          20263         11.1
  90         8        20         1             0          16716         9.2
  90         10       10         12            3          26239         14.4
  90         10       15         5             0          21986         12.1
  90         10       20         2             0          17897         9.8
  90         12       10         14            3          28315         15.6
  90         12       15         6             0          23442         12.9
  90         12       20         2             0          18874         10.4
  90         14       10         20            6          31272         17.2
  90         14       15         9             1          25520         14.0
  90         14       20         2             0          20271         11.1

  100        8        10         4             1          21586         11.9
  100        8        15         1             0          18533         10.2
  100        8        20         1             0          15446         8.5
  100        10       10         7             1          24088         13.2
  100        10       15         3             0          20374         11.2
  100        10       20         2             0          16718         9.2
  100        12       10         11            2          26527         14.6
  100        12       15         5             0          22183         12.2
  100        12       20         2             0          17957         9.9
  100        14       10         13            3          29108         16.0
  100        14       15         6             0          23934         13.2
  100        14       20         2             0          19150         10.5

 

(a) LenWin:     length of the sequence window
(b) %Sec:       cutoff for the percentage of secondary structure
(c) accLen:     cutoff for the minimum length of continuous exposed residues in the sequence window
(d) nNORS_PDB:  number of NORS proteins predicted in PDBsub
(e) FP_PDB:     number of false positives in the previous column
(f) nNORS_Gen:  number of NORS proteins predicted in all 31 proteomes
(g) %NORS_Gen:  percentage of NORS proteins in all 31 proteomes
**              the thresholds we used in the paper
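
To make the roles of the three thresholds in the footnotes concrete, the following is a minimal sketch of how a window scan with LenWin, %Sec and accLen could look. It is an illustration under stated assumptions, not the implementation used for the table: the function names, the input encoding (one secondary-structure letter per residue, with 'H' and 'E' counted as regular structure, plus one exposed/buried flag per residue) and the choice of a strict "less than" comparison for %Sec are all hypothetical.

```python
def longest_run(flags):
    """Return the length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def find_nors_windows(sec, exposed, len_win=70, pct_sec=12.0, acc_len=10):
    """Return 0-based start positions of windows that pass both cutoffs.

    sec      -- string with one state per residue; 'H' (helix) and 'E'
                (strand) count as secondary structure (assumed encoding)
    exposed  -- sequence of booleans, True for an exposed residue
    len_win  -- window length (LenWin)
    pct_sec  -- a window passes if its secondary structure content is
                strictly below this percentage (%Sec; strictness assumed)
    acc_len  -- minimum run of consecutive exposed residues (accLen)
    """
    if len(sec) != len(exposed):
        raise ValueError("sec and exposed must have the same length")
    starts = []
    for start in range(len(sec) - len_win + 1):
        window_sec = sec[start:start + len_win]
        n_regular = sum(1 for state in window_sec if state in "HE")
        if 100.0 * n_regular / len_win >= pct_sec:
            continue  # too much helix or strand in this window
        if longest_run(exposed[start:start + len_win]) >= acc_len:
            starts.append(start)
    return starts


def has_nors(sec, exposed, **thresholds):
    """A protein counts as a NORS protein if at least one window passes."""
    return bool(find_nors_windows(sec, exposed, **thresholds))
```

The defaults correspond to the row marked ** (LenWin 70, %Sec 12, accLen 10). Raising LenWin or accLen, or lowering %Sec, makes the criterion stricter, which matches the falling counts across the rows of the table.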


 

Table 2S: NORS predicted in proteomes

 

Organism                              Number of proteins  Percentage of NORS proteins  Percentage of residues in NORS

Archaebacteria
Aeropyrum pernix K1                   2694                13.1                         6.8
Archaeoglobus fulgidus                2383                1.1                          0.4
Methanococcus jannaschii              1735                0.6                          0.3
Methanobacterium thermoautotrophicum  1871                1.8                          0.7
Pyrococcus abyssi                     1765                1.5                          0.4
Pyrococcus horikoshii                 2064                3.0                          1.1

Prokaryotes
Aquifex aeolicus                      1522                1.4                          0.4
Bacillus subtilis                     4099                1.3                          0.5
Borrelia burgdorferi                  850                 1.2                          0.4
Campylobacter jejuni                  1731                1.6                          0.5
Chlamydia pneumoniae                  1052                2.9                          1.1
Chlamydia trachomatis                 894                 2.9                          1.0
Deinococcus radiodurans               3103                4.8                          1.9
Escherichia coli                      4285                4.8                          0.6
Haemophilus influenzae                1716                2.2                          0.8
Helicobacter pylori                   1788                1.2                          0.4
Mycoplasma genitalium                 470                 3.0                          0.9
Mycoplasma pneumoniae                 677                 3.7                          1.4
Mycobacterium tuberculosis            3918                6.3                          2.7
Neisseria meningitidis                2081                3.4                          1.4
Rickettsia prowazekii                 834                 1.6                          0.4
Synechocystis PCC6803                 3169                2.8                          1.1
Thermotoga maritima                   1846                1.6                          0.5
Treponema pallidum                    1031                4.1                          1.3
Ureaplasma urealyticum                613                 1.5                          0.4

Eukaryotes
Arabidopsis thaliana                  25445               20.3                         7.2
Caenorhabditis elegans                20011               17.5                         6.8
Drosophila melanogaster               14333               27.1                         10.8
Saccharomyces cerevisiae              6307                18.5                         7.2
Mus musculus                          28097               29.1                         13.8
Homo sapiens                          37313               30.2                         14.9

 

 

 


Table 3S: NORS involved in protein-protein interactions listed in DIP [xxx 1]

 

DIP entry  Interaction range                                   NORS region involved

885        yjr022w:ygl173c(1123-1500)                          ygl173c(1280-1471)
905        ykl074c(41-141):ylr116w                             ykl074c(20-201),ylr116w(296-476)
1164       yor122c:ynl271c(1239-1328)                          ynl271c(28-110,225-315,1192-1363)
1188       ydr176w(1-373):ygl013c(813-1063)                    ydr176w(157-237),ygl013c(881-1054)
1189       ydr176w(1-373):ybl005w(765-976)                     ydr176w(157-237),ybl005w(850-961)
1252       ydl150w:ykr025w(62-200)                             ykr025w(82-155)
3245       yjl124c:yal019w(432-950)                            yal019w(442-517)
3249       yjl124c:ycr077c(95-350)                             ycr077c(158-257)
3255       yjl124c:yel060c(58-300)                             yel060c(15-185)
3265       yjl124c:ylr362w(172-350)                            ylr362w(195-432)
3284       ybl026w:ydr440w(121-450)                            ydr440w(65-195)
3305       ybl026w:yor191w(506-900)                            yor191w(751-862)
3307       ybl026w:ypl115c(606-850)                            ypl115c(738-884)
3316       ybl026w:ypr032w(15-300)                             ypr032w(16-87)
3317       yer112w:ybr289w(360-650)                            ybr289w(295-437)
3327       yer112w:ygl173c(853-1500)                           ygl173c(872-986,1280-1471)
3333       ylr438c-a:ycr077c(168-350)                          ycr077c(158-257)
3338       ylr438c-a:ykr099w(57-750)                           ykr099w(456-723)
3354       yer112w:yol004w(492-650)                            yol004w(522-670)
3359       yer146w:ycr077c(132-350)                            ycr077c(158-257)
3374       yer146w:yjl084c(494-850)                            yjl084c(528-983)
3399       ynl147w:ycr077c(153-350)                            ycr077c(158-257)
3414       yjr022w:ygl028c(129-300)                            ygl028c(124-304)
3454       ynl118c:ybl054w(161-450)                            ybl054w(119-358)
3459       ynl118c:ygl173c(890-1450)                           ygl173c(872-986,1280-1471)
3461       ynl118c:yhr186c(680-1350)                           yhr186c(1076-1150)
3699       yor181w(169-402):ydr388w                            yor181w(125-607)
3905       ydr388w(87-422):ymr192w; ydr388w(177-461):ymr192w   ydr388w(340-446)
3994       ydr515w(188-321):ydr388w                            ydr515w(113-262)
4017       ydr362c(2-355):ydr388w                              ydr362c(19-93)
4036       ypl181w(302-507):ydr388w; ypl181w(204-504):ydr388w  ypl181w(161-252,315-433)
4048       yor181w(169-402):ycr009c                            yor181w(125-607)
4051       ymr192w(76-300):ycr009c; ymr192w(87-422):ycr009c    ymr192w(138-218)
4100       yfl008w(762-936):ygr089w(159-497)                   ygr089w(295-449)
4136       ymr124w(249-416):yer149c(803-863)                   ymr124w(263-373)
4165       yel043w(762-936):ygr089w(214-392)                   yel043w(665-935),ygr089w(295-449)
4209       ypl155c(1-180):ypl124w(495-702)                     ypl155c(1-163)

 


 

Table 4S: Comparison between NORS and 'natively disordered regions' [xxx 2]

 

SWISS-PROT ID [xxx 3]   Protein length  Native disorder regions a     NORS

SANT_PLAFW              640             51-627                        40-640
SANT_PLAF7              593             68-557                        38-593
VG48_HSVSA              797             413-720                       427-741
MLH_TETTH               633             173-450                       157-633
CYL1_HUMAN              598             238-492                       212-343, 386-595
NFH_MOUSE               1087            523-761                       403-1087
RTOA_DICDI              400             77-303                        40-400
SR75_HUMAN              494             186-410                       154-494
RPB1_CRIGR              467             1610-1830 b                   1-467
LSTP_STAST              480             56-223                        26-233
YHFI_SALTY              416             57-210                        55-259
T2FA_DROME              577             246-398                       210-338
H1_PEA                  265             100-252                       115-257
110K_PLAKN              296             135-283                       99-296
XYNA_RUMFL              954             248-376                       232-635
VIT2_CHICK              1850            1142-1266                     1078-1370
SNWA_DICDI              685             398-521                       356-514
FHL1_YEAST              936             800-923                       718-936
HYR1_CANAL              937             621-741                       373-520
ANK2_HUMAN              3924            1778-1897                     1743-1958

 

a: taken from the publication of Romero et al. [xxx 2]

b: probably an error in the original paper.
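
One simple way to read the comparison above is to count how many residues of each natively disordered region also fall inside a NORS region. The sketch below does this for range strings written as in the table (1-based, inclusive residue numbers). The supplement does not state how the comparison was quantified, so this overlap count is only an illustrative measure; the function names are hypothetical, and the ranges within one cell are assumed not to overlap each other.

```python
def parse_ranges(text):
    """Parse a cell such as '212-343, 386-595' into [(212, 343), (386, 595)]."""
    ranges = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        start, end = part.split("-")
        ranges.append((int(start), int(end)))
    return ranges


def overlap_length(ranges_a, ranges_b):
    """Count residues covered by both sets of inclusive ranges.

    Assumes the ranges within each set do not overlap one another;
    otherwise shared residues would be counted more than once.
    """
    total = 0
    for a_start, a_end in ranges_a:
        for b_start, b_end in ranges_b:
            # Inclusive overlap; a non-positive value means no shared residues.
            total += max(0, min(a_end, b_end) - max(a_start, b_start) + 1)
    return total


# VG48_HSVSA: disordered region 413-720, NORS 427-741
shared = overlap_length(parse_ranges("413-720"), parse_ranges("427-741"))
```

For VG48_HSVSA the two regions share residues 427-720. For HYR1_CANAL (621-741 against 373-520) the same function returns 0, since the two regions do not intersect.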

 

 


 

References for Supplement

 

xxx 1 Xenarios, I., Salwinski, L., Duan, X. J., Higney, P., Kim, S. M. et al. (2002). DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucl. Acids Res., 30, 303-305.

xxx 2 Romero, P., Obradovic, Z., Kissinger, C. R., Villafranca, J. E., Garner, E. et al. (1998). Thousands of proteins likely to have long disordered regions. Pac. Symp. Biocomput., 437-448.

xxx 3 Bairoch, A. & Apweiler, R. (2000). The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucl. Acids Res., 28, 45-48.