Workflow: WGS and MT analysis for fastq files

Fetched 2024-04-20 03:25:35 GMT

rna / protein - qc, preprocess, filter, annotation, index, abundance

Inputs

ID            Type       Title  Doc
jobid         String
m5nrBDB       File
m5nrSCG       File
filterLn      Boolean
m5nrFull      File[]
maxAmbig      Integer
deviation     Float
m5rnaFull     File
sequences     File
m5rnaClust    File
m5rnaIndex    Directory
derepPrefix   Integer
filterAmbig   Boolean
m5rnaPrefix   String

Steps

ID Runs Label Doc
qcBasic
abundance abundance

abundance profiles from annotated files, for protein and/or rna

darkmatter
../Tools/extract_darkmatter.tool.cwl (CommandLineTool)
extract darkmatter

retrieve predicted proteins that have no similarity hits
>extract_darkmatter.py -i <input> -s <sim 1> -s <sim 2> -m <clust map 1> -m <clust map 2> -o <outName>

preProcess preprocess fasta

Remove reads from fasta files based on sequence stats. Return fasta files with reads passed and reads removed.

indexSimSeq index sim seq

create sorted / filtered similarity file with feature sequences, and index by md5

rnaAnnotate rna annotation

RNAs - predict, cluster, identify, annotate

protAnnotate protein annotation

Proteins - predict, filter, cluster, identify, annotate

dereplication
../Tools/dereplication.tool.cwl (CommandLineTool)
dereplication

Keep only one sequence from each set of sequences with identical prefixes
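
The dereplication step above can be illustrated with a minimal Python sketch. It is not the code behind dereplication.tool.cwl. The function name `dereplicate`, the record layout, and the choice to keep the first sequence seen for each prefix are all assumptions made for illustration; `prefix_len` is meant to play the role of the derepPrefix input, and the two returned lists loosely correspond to the dereplicationPassed and dereplicationRemoved outputs.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (sequence id, nucleotide sequence).
Record = Tuple[str, str]


def dereplicate(
    records: Iterable[Record], prefix_len: int
) -> Tuple[List[Record], List[Record]]:
    """Split records into (passed, removed) by sequence prefix.

    Assumption: the first record seen for each prefix is kept and every
    later record sharing that prefix is removed. Sequences shorter than
    prefix_len are compared on their full length.
    """
    seen = set()
    passed: List[Record] = []
    removed: List[Record] = []
    for seq_id, seq in records:
        key = seq[:prefix_len].upper()  # case-insensitive prefix comparison
        if key in seen:
            removed.append((seq_id, seq))
        else:
            seen.add(key)
            passed.append((seq_id, seq))
    return passed, removed


reads = [
    ("r1", "ACGTACGTAA"),
    ("r2", "ACGTACGTCC"),  # same first 8 bases as r1
    ("r3", "TTGTACGTAA"),
]
passed, removed = dereplicate(reads, prefix_len=8)
```

With these three hypothetical reads and a prefix length of 8, r1 and r3 land in `passed` and r2 lands in `removed`, because r2 shares its first eight bases with r1.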

Outputs

ID                     Type  Label  Doc
qcStatOut              File
seqBinOut              File
simSeqOut              File
rnaSimsOut             File
seqStatOut             File
protSimsOut            File
qcSummaryOut           File
adapterPassed          File
darkmatterOut          File
lcaProfileOut          File
md5ProfileOut          File
rnaFeatureOut          File
protFeatureOut         File
rnaClustMapOut         File
rnaClustSeqOut         File
sourceStatsOut         File
protClustMapOut        File
protClustSeqOut        File
preProcessPassed       File
preProcessRemoved      File
dereplicationPassed    File
dereplicationRemoved   File
protFilterFeatureOut   File
Permalink: https://w3id.org/cwl/view/git/f5839797da8209a9d3e441023f88130219751020/CWL/Workflows/wgs-noscreen-fasta.workflow.cwl