Workflow: transcriptome_cleanup

Fetched 2023-01-24 06:02:28 GMT

This workflow detects and removes vector sequences, duplicates, and contamination from a transcriptome FASTA file.

Inputs

ID Type Title Doc
evalue Float
threads Integer
min_length Integer
vector_fsa File
trans_fsa_gz File
total_per_file Integer
vector_bp_cutoff Integer
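
The inputs above can be supplied to a CWL runner through a job file. The following is a minimal sketch only: the page gives no defaults or descriptions, so every value and file name is a hypothetical placeholder, and the meaning noted in each comment is inferred from the input name rather than documented.

```yaml
# job.yml: hypothetical inputs for transcriptome-cleanup.cwl
# All values and paths are placeholders, not defaults taken from the workflow.
evalue: 1.0e-5            # Float; presumably the BLAST e-value threshold
threads: 4                # Integer; presumably the number of BLAST threads
min_length: 200           # Integer; presumably the minimum sequence length to keep
vector_fsa:               # File; presumably a FASTA file of vector sequences
  class: File
  path: vectors.fsa       # hypothetical file name
trans_fsa_gz:             # File; presumably the gzip-compressed transcriptome FASTA
  class: File
  path: transcriptome.fsa.gz   # hypothetical file name
total_per_file: 10000     # Integer; presumably sequences per file after splitting
vector_bp_cutoff: 25      # Integer; presumably the minimum vector match length in bp
```

With a runner such as cwltool, a job file like this would typically be passed after the workflow document, for example `cwltool transcriptome-cleanup.cwl job.yml`.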

Steps

ID Runs Label Doc
split_fasta
../../tools/python/split-fasta.cwl (CommandLineTool)
split_fasta

Split a FASTA file into multiple files

vector_blastn
../../tools/blast/blastn.cwl (CommandLineTool)
BlastN

NCBI BlastN Nucleotide-Nucleotide BLAST

vector_blastdb
../../tools/blast/makeblastdb.cwl (CommandLineTool)
makeblastdb

NCBI makeblastdb

vector_removal
../../tools/python/vector-removal.cwl (CommandLineTool)
vector_removal

This tool detects vectors from a BLAST TSV file

collect_blastdb
../../tools/basic/files2dir.cwl (ExpressionTool)
files2dir

Group all input files in a directory

duplicate_blastn
../../tools/blast/blastn.cwl (CommandLineTool)
BlastN

NCBI BlastN Nucleotide-Nucleotide BLAST

duplicate_blastdb
../../tools/blast/makeblastdb.cwl (CommandLineTool)
makeblastdb

NCBI makeblastdb

duplicate_removal
../../tools/python/duplicate-removal.cwl (CommandLineTool)
duplicate_removal

This tool removes duplicate sequences

equal_seq_removal
../../tools/python/equal-removal.cwl (CommandLineTool)
equal_removal

This tool removes equal sequences

uncompress_no_vect
../../tools/basic/gzip.cwl (CommandLineTool)
gzip

Compress files

uncompress_noequal
../../tools/basic/gzip.cwl (CommandLineTool)
gzip

Compress files

collect_duplicate_blastdb
../../tools/basic/files2dir.cwl (ExpressionTool)
files2dir

Group all input files in a directory

Outputs

ID Type Label Doc
split_fasta_fsa File[]
Permalink: https://w3id.org/cwl/view/git/de6380d83f9209e95559e66cf64ded3bf0e410ea/workflows/Annotation/transcriptome-cleanup.cwl