Workflow: retrieve sequence and perform pairwise alignment (sub-workflow process)
\"Perform pairwise alignment of protein sequences for pairs identified by structural similarity search. Step 1: retrieve sequence from blastdbcmd result Step 2: makeblastdb: ../Tools/14_makeblastdb.cwl Step 3: blastdbcmd: ../Tools/15_blastdbcmd.cwl Step 4: seqretsplit: ../Tools/16_seqretsplit.cwl Step 5: needle (Global alignment): ../Tools/17_needle.cwl Step 6: water (Local alignment): ../Tools/17_water.cwl\"
Inputs
ID | Type | Title | Doc |
---|---|---|---|
WATER_SCRIPT | File | water shell script (water) | water shell script |
NEEDLE_SCRIPT | File | needle shell script (needle) | needle shell script |
FOLDSEEK_EXTRACT_TSV | File [TSV] | foldseek extract tsv (foldseek easy-search) | foldseek extract tsv |
WATER_RESULT_DIR_NAME | String [Directory name] | water result directory name (water) | water result directory name |
NEEDLE_RESULT_DIR_NAME | String [Directory name] | needle result directory name (needle) | needle result directory name |
ENTRY_BATCH_HIT_SPECIES | File | entry batch file (blastdbcmd) | entry batch file for blastdbcmd |
ENTRY_BATCH_QUERY_SPECIES | File | entry batch file (blastdbcmd) | entry batch file for blastdbcmd |
INDEX_DIR_NAME_HIT_SPECIES | String [Directory name] | index directory name (makeblastdb) | BLAST index directory name for blastdbcmd |
INDEX_DIR_NAME_QUERY_SPECIES | String [Directory name] | index directory name (makeblastdb) | BLAST index directory name for blastdbcmd |
INPUT_FASTA_FILE_HIT_SPECIES | File [FASTA] | input fasta file (makeblastdb) | input FASTA file for makeblastdb. Retrieve files in advance from UniProt. |
ALIGNMENT_QUERY_COLUMN_NUMBER | Integer | alignment query column number (needle and water) | alignment column number (query species) for needle and water: the column of the TSV file given by FOLDSEEK_EXTRACT_TSV that holds the query IDs of the UniProt ID pairs |
ALIGNMENT_TARGET_COLUMN_NUMBER | Integer | alignment target column number (needle and water) | alignment column number (target species) for needle and water: the column of the TSV file given by FOLDSEEK_EXTRACT_TSV that holds the hit IDs of the UniProt ID pairs |
INPUT_FASTA_FILE_QUERY_SPECIES | File [FASTA] | input fasta file (makeblastdb) | input FASTA file for makeblastdb. Retrieve files in advance from UniProt. |
BLASTDBCMD_LOGFILE_NAME_HIT_SPECIES | String [File name] | logfile name (blastdbcmd) | logfile name. |
BLASTDBCMD_LOGFILE_NAME_QUERY_SPECIES | String [File name] | logfile name (blastdbcmd) | logfile name. |
BLASTDBCMD_RESULT_FILE_NAME_HIT_SPECIES | String [File name] | blastdbcmd result file name (blastdbcmd) | blastdbcmd result file name. |
SEQRETSPLIT_OUTPUT_DIR_NAME_HIT_SPECIES | String [Directory name] | output directory name (seqretsplit) | output directory name for seqretsplit |
BLASTDBCMD_RESULT_FILE_NAME_QUERY_SPECIES | String [File name] | blastdbcmd result file name (blastdbcmd) | blastdbcmd result file name. |
SEQRETSPLIT_OUTPUT_DIR_NAME_QUERY_SPECIES | String [Directory name] | output directory name (seqretsplit) | output directory name for seqretsplit |
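The two column-number inputs select which columns of the FOLDSEEK_EXTRACT_TSV file hold the query and hit UniProt IDs. A minimal sketch of that selection follows, assuming the column numbers are 1-based and the TSV has no header row (neither is stated on this page). The function names and the sample rows are made up for illustration. The second helper writes IDs one per line, which is the format blastdbcmd expects for an entry batch file.

```python
import csv
import io


def read_id_pairs(tsv_text, query_col, target_col):
    """Return (query_id, hit_id) tuples from TSV text.

    Column numbers are treated as 1-based (an assumption).
    Blank lines and lines starting with '#' are skipped.
    """
    pairs = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        pairs.append((row[query_col - 1], row[target_col - 1]))
    return pairs


def entry_batch_text(ids):
    """Format IDs one per line, dropping duplicates but keeping first-seen order."""
    return "".join(f"{seq_id}\n" for seq_id in dict.fromkeys(ids))


if __name__ == "__main__":
    # Made-up rows: query ID, hit ID, and one extra column.
    sample = "P12345\tQ67890\t0.95\nP12345\tQ11111\t0.80\n"
    pairs = read_id_pairs(sample, query_col=1, target_col=2)
    print(pairs)
    print(entry_batch_text(q for q, _ in pairs), end="")
    print(entry_batch_text(h for _, h in pairs), end="")
```

Keeping the pair list separate from the two de-duplicated ID lists matches the split in the inputs: the pairs drive needle and water, while each ID list corresponds to one of the ENTRY_BATCH_* files.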
Steps
ID | Runs | Label | Doc |
---|---|---|---|
blastdbcmd_hit_species | ../Tools/15_blastdbcmd.cwl (CommandLineTool) | blastdbcmd command for blastdbcmd execution | blastdbcmd command for blastdbcmd execution. Before executing, make sure all FASTA files (query species and target species) to be indexed are ready |
makeblastdb_hit_species | ../Tools/14_makeblastdb.cwl (CommandLineTool) | makeblastdb command for blastdbcmd execution | makeblastdb command for blastdbcmd execution. Before executing, make sure all FASTA files (query species and target species) to be indexed are ready |
seqretsplit_hit_species | ../Tools/16_seqretsplit.cwl (CommandLineTool) | seqretsplit command for splitting the FASTA file | seqretsplit command for splitting the FASTA file created by 15_blastdbcmd.cwl. Before executing, make sure the blastdbcmd result file has already been created by 15_blastdbcmd.cwl |
blastdbcmd_query_species | ../Tools/15_blastdbcmd.cwl (CommandLineTool) | blastdbcmd command for blastdbcmd execution | blastdbcmd command for blastdbcmd execution. Before executing, make sure all FASTA files (query species and target species) to be indexed are ready |
makeblastdb_query_species | ../Tools/14_makeblastdb.cwl (CommandLineTool) | makeblastdb command for blastdbcmd execution | makeblastdb command for blastdbcmd execution. Before executing, make sure all FASTA files (query species and target species) to be indexed are ready |
seqretsplit_query_species | ../Tools/16_seqretsplit.cwl (CommandLineTool) | seqretsplit command for splitting the FASTA file | seqretsplit command for splitting the FASTA file created by 15_blastdbcmd.cwl. Before executing, make sure the blastdbcmd result file has already been created by 15_blastdbcmd.cwl |
local_alignment_using_water | ../Tools/17_water.cwl (CommandLineTool) | water command for water execution | custom script for water execution. Before executing, make sure the split FASTA files have already been created by 16_seqretsplit.cwl |
global_alignment_using_needle | ../Tools/17_needle.cwl (CommandLineTool) | needle command for needle execution | custom script for needle execution. Before executing, make sure the split FASTA files have already been created by 16_seqretsplit.cwl |
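The "Before executing" notes in the Doc column imply an execution order that differs from the row order of the table. The sketch below encodes those notes as a dependency map and derives one valid order with Python's standard `graphlib` module. The edges are read off the Doc text and are an assumption: a CWL runner works out the real order from the data links between step inputs and outputs, which this page does not show.

```python
from graphlib import TopologicalSorter

# step ID -> steps that must finish first (inferred from the Doc column)
DEPENDENCIES = {
    "makeblastdb_query_species": set(),
    "makeblastdb_hit_species": set(),
    "blastdbcmd_query_species": {"makeblastdb_query_species"},
    "blastdbcmd_hit_species": {"makeblastdb_hit_species"},
    "seqretsplit_query_species": {"blastdbcmd_query_species"},
    "seqretsplit_hit_species": {"blastdbcmd_hit_species"},
    "global_alignment_using_needle": {
        "seqretsplit_query_species",
        "seqretsplit_hit_species",
    },
    "local_alignment_using_water": {
        "seqretsplit_query_species",
        "seqretsplit_hit_species",
    },
}


def execution_order(dependencies):
    """Return one ordering in which every step follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for position, step in enumerate(execution_order(DEPENDENCIES), start=1):
        print(position, step)
```

The query-species and hit-species branches share no edges until the alignment steps, so a runner is free to process them in parallel. The same holds for needle and water once both seqretsplit steps have finished.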
Outputs
ID | Type | Label | Doc |
---|---|---|---|
output_dir_hit_species | Directory | output directory (hit species) | |
output_water_result_dir | Directory | water result directory | |
output_dir_query_species | Directory | output directory (query species) | |
output_needle_result_dir | Directory | needle result directory | |
output_water_result_file | File[] | water result file | |
output_needle_result_file | File[] | needle result file | |
output_logfile_hit_species | File | logfile (hit species) | |
output_index_dir_hit_species | Directory | blast index directory (hit species) | |
output_logfile_query_species | File | logfile (query species) | |
output_index_file_hit_species | File | blast index file (hit species) | |
output_index_dir_query_species | Directory | blast index directory (query species) | |
output_index_file_query_species | File | blast index file (query species) | |
output_blastdbcmd_result_hit_species | File | blastdbcmd result (hit species) | |
output_split_fasta_files_hit_species | File[] | split fasta files (hit species) | |
output_blastdbcmd_result_query_species | File | blastdbcmd result (query species) | |
output_split_fasta_files_query_species | File[] | split fasta files (query species) | |
https://w3id.org/cwl/view/git/ad71cdbde9ec1af0f73c8dcee0bb16db8bc09584/Workflow/11_retrieve_sequence_wf.cwl