How to generate an HSSP file from an alignment

From Rost Lab Open

Disclaimer

Please note that the following instructions are an example only and have not been widely tested on other systems. It is assumed that all required (profphd) and recommended (blast2) packages are installed in the system space of the target machine. Deviating from these assumptions may result in abnormal operation. Please use discretion when applying this example.
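As a quick sanity check before starting, the installation assumption can be tested with a small shell sketch. The script paths are the ones used in the steps on this page. The function name check_tools is made up for illustration, and convert_seq is assumed to be found on the PATH, because step 2 passes it by bare name.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given programs are missing.
# Arguments containing a slash are treated as file paths;
# all others are looked up on the PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        case "$tool" in
            */*) [ -x "$tool" ] || { echo "missing: $tool"; missing=1; } ;;
            *)   command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool"; missing=1; } ;;
        esac
    done
    return "$missing"
}

# Paths as used on this page; convert_seq on the PATH is an assumption.
RG=/usr/share/librg-utils-perl
if check_tools "$RG/blast2saf.pl" "$RG/copf.pl" "$RG/hssp_filter.pl" convert_seq; then
    echo "all tools found"
fi
```

Any line starting with "missing:" points at a program that is not installed where this example expects it.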

Generating an HSSP profile

An HSSP profile is generated by converting a BLAST alignment. The following example shows how this is done for the PredictProtein server.

The programs used below are provided by the Debian/Ubuntu package 'librg-utils-perl'.

1. Convert BLAST output to Simple Alignment Format (SAF):

Either (recommended)

/usr/share/librg-utils-perl/blast2saf.pl fasta=<query_fasta_file> eSaf=1 \
 saf=<saf_formatted_file> <blast_output>

- or -

The following method is being phased out. blast2saf.pl (above) is in better shape and has more features, such as E-value filtering.

/usr/share/librg-utils-perl/blastpgp_to_saf.pl fileInBlast=<blast_output> fileInQuery=<sequence_file> \
 fileOutRdb=<alignment_formatted_as_table_output_file> fileOutSaf=<saf_formatted_file> \
 red=100 maxAli=3000 tile=0

2. Convert SAF format to HSSP:

/usr/share/librg-utils-perl/copf.pl <saf_formatted_file> formatIn=saf formatOut=hssp \
 fileOut=<hssp_formatted_file> exeConvertSeq=convert_seq 

3. Filter results to 80% redundancy:

/usr/share/librg-utils-perl/hssp_filter.pl red=80 <hssp_formatted_file> fileOut=<filtered_hssp_formatted_file>
  • blast2saf.pl can reduce the number of sequences in the alignment, for example with red=80 maxAli=3000 (maxAli only takes effect when red != 100).
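
The three steps can be chained in a small wrapper. The following is a minimal sketch, not a tested pipeline: the function name hssp_pipeline, the DRY_RUN switch, and the intermediate file names (<base>.saf, <base>.hssp, <base>_filtered.hssp) are illustrative choices and not part of librg-utils-perl. Script names and parameters are copied from the steps above. With DRY_RUN=1 (the default here) the commands are only printed, so the sketch can be inspected without the package installed.

```shell
#!/bin/sh
RG=/usr/share/librg-utils-perl
DRY_RUN=${DRY_RUN:-1}

# Print the command; execute it only when DRY_RUN is 0.
run() {
    echo "$*"
    if [ "$DRY_RUN" = 0 ]; then
        "$@"
    fi
}

# Usage: hssp_pipeline <query_fasta_file> <blast_output> <base>
# <base> is a hypothetical prefix for the intermediate and output files.
hssp_pipeline() {
    fasta=$1 blast=$2 base=$3
    # Step 1: BLAST output to SAF (recommended converter)
    run "$RG/blast2saf.pl" "fasta=$fasta" eSaf=1 "saf=$base.saf" "$blast" &&
    # Step 2: SAF to HSSP
    run "$RG/copf.pl" "$base.saf" formatIn=saf formatOut=hssp \
        "fileOut=$base.hssp" exeConvertSeq=convert_seq &&
    # Step 3: filter to 80% redundancy
    run "$RG/hssp_filter.pl" red=80 "$base.hssp" "fileOut=${base}_filtered.hssp"
}

# Example call with made-up file names
hssp_pipeline query.fasta query.blast query
```

Setting DRY_RUN=0 in the environment would execute the commands instead of only printing them. Because the steps are joined with &&, a failing step stops the chain.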