Workflow: retrieve sequence and perform pairwise alignment (sub-workflow process)

Fetched 2025-09-09 09:57:22 GMT

Perform pairwise alignment of protein sequences for pairs identified by structural similarity search.

Step 1: retrieve sequence from blastdbcmd result
Step 2: makeblastdb: ../Tools/14_makeblastdb.cwl
Step 3: blastdbcmd: ../Tools/15_blastdbcmd.cwl
Step 4: seqretsplit: ../Tools/16_seqretsplit.cwl
Step 5: needle (Global alignment): ../Tools/17_needle.cwl
Step 6: water (Local alignment): ../Tools/17_water.cwl
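
The ordering implied by this description can be sketched as a small dependency graph. This is a minimal illustration only: the step IDs are the ones listed in the Steps table of this page, but the dependencies are inferred from the step documentation (each species branch is assumed to run makeblastdb, then blastdbcmd, then seqretsplit, and both alignment steps are assumed to wait for the split FASTA files of both species). They are not parsed from the CWL file itself.

```python
from graphlib import TopologicalSorter

# Map each step ID to the steps it is assumed to depend on.
# These dependencies are inferred from the step docs, not read from the CWL file.
DEPENDS_ON = {
    "makeblastdb_hit_species": set(),
    "makeblastdb_query_species": set(),
    "blastdbcmd_hit_species": {"makeblastdb_hit_species"},
    "blastdbcmd_query_species": {"makeblastdb_query_species"},
    "seqretsplit_hit_species": {"blastdbcmd_hit_species"},
    "seqretsplit_query_species": {"blastdbcmd_query_species"},
    "global_alignment_using_needle": {
        "seqretsplit_hit_species",
        "seqretsplit_query_species",
    },
    "local_alignment_using_water": {
        "seqretsplit_hit_species",
        "seqretsplit_query_species",
    },
}


def execution_order(graph):
    """Return one valid execution order for the given dependency graph."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step in execution_order(DEPENDS_ON):
        print(step)
```

A CWL runner derives this ordering on its own from how step inputs and outputs are wired together, so the sketch is only a reading aid.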

Inputs

| ID | Type | Title | Doc |
|---|---|---|---|
| WATER_SCRIPT | File | water shell script (water) | water shell script |
| NEEDLE_SCRIPT | File | needle shell script (needle) | needle shell script |
| FOLDSEEK_EXTRACT_TSV | File [TSV] | foldseek extract tsv (foldseek easy-search) | foldseek extract TSV |
| WATER_RESULT_DIR_NAME | String [Directory name] | water result directory name (water) | water result directory name |
| NEEDLE_RESULT_DIR_NAME | String [Directory name] | needle result directory name (needle) | needle result directory name |
| ENTRY_BATCH_HIT_SPECIES | File | entry batch file (blastdbcmd) | Entry batch file for blastdbcmd |
| ENTRY_BATCH_QUERY_SPECIES | File | entry batch file (blastdbcmd) | Entry batch file for blastdbcmd |
| INDEX_DIR_NAME_HIT_SPECIES | String [Directory name] | index directory name (makeblastdb) | BLAST index directory name for blastdbcmd |
| INDEX_DIR_NAME_QUERY_SPECIES | String [Directory name] | index directory name (makeblastdb) | BLAST index directory name for blastdbcmd |
| INPUT_FASTA_FILE_HIT_SPECIES | File [FASTA] | input fasta file (makeblastdb) | Input FASTA file for makeblastdb. Download the file from UniProt in advance. |
| ALIGNMENT_QUERY_COLUMN_NUMBER | Integer | alignment query column number (needle and water) | Column number for the query species, used by needle and water. Identifies the column holding the query IDs of the UniProt ID pairs in the TSV file supplied via FOLDSEEK_EXTRACT_TSV. |
| ALIGNMENT_TARGET_COLUMN_NUMBER | Integer | alignment target column number (needle and water) | Column number for the target species, used by needle and water. Identifies the column holding the hit IDs of the UniProt ID pairs in the TSV file supplied via FOLDSEEK_EXTRACT_TSV. |
| INPUT_FASTA_FILE_QUERY_SPECIES | File [FASTA] | input fasta file (makeblastdb) | Input FASTA file for makeblastdb. Download the file from UniProt in advance. |
| BLASTDBCMD_LOGFILE_NAME_HIT_SPECIES | String [File name] | logfile name (blastdbcmd) | Log file name |
| BLASTDBCMD_LOGFILE_NAME_QUERY_SPECIES | String [File name] | logfile name (blastdbcmd) | Log file name |
| BLASTDBCMD_RESULT_FILE_NAME_HIT_SPECIES | String [File name] | blastdbcmd result file name (blastdbcmd) | blastdbcmd result file name |
| SEQRETSPLIT_OUTPUT_DIR_NAME_HIT_SPECIES | String [Directory name] | output directory name (seqretsplit) | Output directory name for seqretsplit |
| BLASTDBCMD_RESULT_FILE_NAME_QUERY_SPECIES | String [File name] | blastdbcmd result file name (blastdbcmd) | blastdbcmd result file name |
| SEQRETSPLIT_OUTPUT_DIR_NAME_QUERY_SPECIES | String [Directory name] | output directory name (seqretsplit) | Output directory name for seqretsplit |
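
The inputs listed above can be collected into a CWL input object. The following is a minimal sketch that builds such an object as JSON. The input IDs are taken from this page, but every path, file name, directory name, and column number is a hypothetical placeholder, and the helper names `cwl_file` and `build_job` are invented for illustration. Some inputs may have default values in the workflow and would then not need to be supplied.

```python
import json


def cwl_file(path):
    """Wrap a local path as a CWL File input value."""
    return {"class": "File", "path": path}


def build_job():
    """Return a CWL input object covering the inputs listed on this page.

    Every path, name, and column number here is a hypothetical placeholder.
    """
    job = {
        "FOLDSEEK_EXTRACT_TSV": cwl_file("foldseek_extract.tsv"),
        "ALIGNMENT_QUERY_COLUMN_NUMBER": 1,
        "ALIGNMENT_TARGET_COLUMN_NUMBER": 2,
        "NEEDLE_SCRIPT": cwl_file("needle.sh"),
        "WATER_SCRIPT": cwl_file("water.sh"),
        "NEEDLE_RESULT_DIR_NAME": "needle_result",
        "WATER_RESULT_DIR_NAME": "water_result",
    }
    # The query-species and hit-species branches take the same set of inputs.
    for species in ("QUERY_SPECIES", "HIT_SPECIES"):
        tag = species.lower()
        job.update({
            f"INPUT_FASTA_FILE_{species}": cwl_file(f"{tag}.fasta"),
            f"ENTRY_BATCH_{species}": cwl_file(f"{tag}_ids.txt"),
            f"INDEX_DIR_NAME_{species}": f"{tag}_index",
            f"BLASTDBCMD_RESULT_FILE_NAME_{species}": f"{tag}_blastdbcmd.fasta",
            f"BLASTDBCMD_LOGFILE_NAME_{species}": f"{tag}_blastdbcmd.log",
            f"SEQRETSPLIT_OUTPUT_DIR_NAME_{species}": f"{tag}_split",
        })
    return job


if __name__ == "__main__":
    print(json.dumps(build_job(), indent=2))
```

Saved to a file such as `job.json` (a hypothetical name), the object could be handed to a CWL runner together with the workflow file, for example as the second argument to `cwltool`.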

Steps

| ID | Runs | Label | Doc |
|---|---|---|---|
| blastdbcmd_hit_species | ../Tools/15_blastdbcmd.cwl (CommandLineTool) | blastdbcmd command for blastdbcmd execution | Runs blastdbcmd. Before executing, make sure all FASTA files to be indexed (query species and target species) are ready. |
| makeblastdb_hit_species | ../Tools/14_makeblastdb.cwl (CommandLineTool) | makeblastdb command for blastdbcmd execution | Runs makeblastdb to build the index used by blastdbcmd. Before executing, make sure all FASTA files to be indexed (query species and target species) are ready. |
| seqretsplit_hit_species | ../Tools/16_seqretsplit.cwl (CommandLineTool) | seqretsplit command for split fasta file | Runs seqretsplit to split the FASTA file created by 15_blastdbcmd.cwl. Before executing, make sure 15_blastdbcmd.cwl has already created the blastdbcmd result file. |
| blastdbcmd_query_species | ../Tools/15_blastdbcmd.cwl (CommandLineTool) | blastdbcmd command for blastdbcmd execution | Runs blastdbcmd. Before executing, make sure all FASTA files to be indexed (query species and target species) are ready. |
| makeblastdb_query_species | ../Tools/14_makeblastdb.cwl (CommandLineTool) | makeblastdb command for blastdbcmd execution | Runs makeblastdb to build the index used by blastdbcmd. Before executing, make sure all FASTA files to be indexed (query species and target species) are ready. |
| seqretsplit_query_species | ../Tools/16_seqretsplit.cwl (CommandLineTool) | seqretsplit command for split fasta file | Runs seqretsplit to split the FASTA file created by 15_blastdbcmd.cwl. Before executing, make sure 15_blastdbcmd.cwl has already created the blastdbcmd result file. |
| local_alignment_using_water | ../Tools/17_water.cwl (CommandLineTool) | water command for water execution | Custom script that runs water. Before executing, make sure 16_seqretsplit.cwl has already created the split FASTA files. |
| global_alignment_using_needle | ../Tools/17_needle.cwl (CommandLineTool) | needle command for needle execution | Custom script that runs needle. Before executing, make sure 16_seqretsplit.cwl has already created the split FASTA files. |
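
The two alignment steps work on ID pairs taken from the foldseek TSV. The sketch below shows one way such pairs could be read and turned into an EMBOSS needle command line. It rests on several assumptions: the column numbers are 1-based, the TSV has no header row, and the gap penalties are the EMBOSS defaults. The function names are invented for illustration, and the workflow's own needle and water shell scripts may invoke the tools differently.

```python
import csv
import io


def read_id_pairs(tsv_text, query_col, target_col):
    """Return (query_id, target_id) pairs from tab-separated text.

    Column numbers are treated as 1-based, which is an assumption.
    """
    pairs = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        pairs.append((row[query_col - 1], row[target_col - 1]))
    return pairs


def needle_command(query_fasta, target_fasta, outfile):
    """Build an EMBOSS needle command line using its default gap penalties."""
    return [
        "needle",
        "-asequence", query_fasta,
        "-bsequence", target_fasta,
        "-gapopen", "10.0",
        "-gapextend", "0.5",
        "-outfile", outfile,
    ]


if __name__ == "__main__":
    # Placeholder rows, not real foldseek output.
    sample = "queryA\thitA\nqueryB\thitB\n"
    for query_id, target_id in read_id_pairs(sample, 1, 2):
        # The <ID>.fasta naming is hypothetical.
        cmd = needle_command(
            f"{query_id}.fasta",
            f"{target_id}.fasta",
            f"{query_id}_{target_id}.needle",
        )
        print(" ".join(cmd))
```

Replacing `needle` with `water` in the command list would give the corresponding local alignment call, since both EMBOSS programs accept the same sequence and gap qualifiers used here.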

Outputs

| ID | Type | Label | Doc |
|---|---|---|---|
| output_dir_hit_species | Directory | output directory (hit species) | |
| output_water_result_dir | Directory | water result directory | |
| output_dir_query_species | Directory | output directory (query species) | |
| output_needle_result_dir | Directory | needle result directory | |
| output_water_result_file | File[] | water result file | |
| output_needle_result_file | File[] | needle result file | |
| output_logfile_hit_species | File | logfile (hit species) | |
| output_index_dir_hit_species | Directory | blast index directory (hit species) | |
| output_logfile_query_species | File | logfile (query species) | |
| output_index_file_hit_species | File | blast index file (hit species) | |
| output_index_dir_query_species | Directory | blast index directory (query species) | |
| output_index_file_query_species | File | blast index file (query species) | |
| output_blastdbcmd_result_hit_species | File | blastdbcmd result (hit species) | |
| output_split_fasta_files_hit_species | File[] | split fasta files (hit species) | |
| output_blastdbcmd_result_query_species | File | blastdbcmd result (query species) | |
| output_split_fasta_files_query_species | File[] | split fasta files (query species) | |
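
A CWL runner typically reports these outputs as a JSON object keyed by output ID. As a quick sanity check, the File[] outputs can be counted, for instance to see whether needle and water produced the same number of result files. The sketch below assumes that output format. The function name and the example object are hypothetical, and the example is heavily trimmed compared with what a real runner emits.

```python
def count_files(output_object):
    """Map each array-valued output ID to the number of File entries it holds."""
    counts = {}
    for output_id, value in output_object.items():
        if isinstance(value, list):
            counts[output_id] = sum(
                1 for item in value
                if isinstance(item, dict) and item.get("class") == "File"
            )
    return counts


if __name__ == "__main__":
    # Hypothetical, trimmed output object; real ones carry more fields.
    example = {
        "output_needle_result_file": [
            {"class": "File", "basename": "pair1.needle"},
            {"class": "File", "basename": "pair2.needle"},
        ],
        "output_water_result_file": [
            {"class": "File", "basename": "pair1.water"},
            {"class": "File", "basename": "pair2.water"},
        ],
        "output_logfile_hit_species": {"class": "File", "basename": "hit.log"},
    }
    print(count_files(example))
```

Single File and Directory outputs are skipped on purpose, since only the array-valued outputs vary in length with the number of ID pairs.
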
Permalink: https://w3id.org/cwl/view/git/ad71cdbde9ec1af0f73c8dcee0bb16db8bc09584/Workflow/11_retrieve_sequence_wf.cwl