Workflow: WGS and MT analysis for fastq files
rna / protein - qc, preprocess, filter, annotation, index, abundance
This workflow is Open Source and may be reused according to the terms of:
BSD 2-clause "Simplified" License
Note that the tools invoked by the workflow may have separate licenses.
Inputs
ID | Type | Title | Doc |
---|---|---|---|
jobid | String | ||
m5nrBDB | File | ||
m5nrSCG | File | ||
filterLn | Boolean | ||
indexDir | Directory | ||
m5nrFull | File[] | ||
maxAmbig | Integer | ||
deviation | Float | ||
indexName | String (Optional) | ||
m5rnaFull | File | ||
sequences | File | ||
m5rnaClust | File | ||
m5rnaIndex | Directory | ||
derepPrefix | Integer | ||
filterAmbig | Boolean | ||
m5rnaPrefix | String |
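The inputs above map onto a CWL input object, where `File` and `Directory` values are small dictionaries with a `class` field. The sketch below builds such an object in Python and serializes it to JSON. Every path, file name, and parameter value in it is a hypothetical placeholder, not a default taken from the workflow, and the optional `indexName` is left out.

```python
import json


def cwl_file(path):
    """Return a CWL File object for a path."""
    return {"class": "File", "path": path}


def cwl_dir(path):
    """Return a CWL Directory object for a path."""
    return {"class": "Directory", "path": path}


def build_job(sequences, ref_root):
    """Build an input object for the workflow.

    All reference file names and numeric values below are illustrative
    assumptions; substitute the real reference data and settings.
    """
    return {
        "jobid": "example-job",                       # hypothetical job name
        "sequences": cwl_file(sequences),
        "m5nrBDB": cwl_file(f"{ref_root}/m5nr.bdb"),  # hypothetical file name
        "m5nrSCG": cwl_file(f"{ref_root}/m5nr.scg"),  # hypothetical file name
        "m5nrFull": [                                 # File[] takes a list
            cwl_file(f"{ref_root}/m5nr_part1.fasta"),
            cwl_file(f"{ref_root}/m5nr_part2.fasta"),
        ],
        "m5rnaFull": cwl_file(f"{ref_root}/m5rna.fasta"),
        "m5rnaClust": cwl_file(f"{ref_root}/m5rna.clust.fasta"),
        "m5rnaIndex": cwl_dir(f"{ref_root}/m5rna_index"),
        "m5rnaPrefix": "m5rna",                       # placeholder value
        "indexDir": cwl_dir(f"{ref_root}/bowtie2_index"),
        "filterLn": True,                             # placeholder value
        "deviation": 2.0,                             # placeholder value
        "filterAmbig": True,                          # placeholder value
        "maxAmbig": 5,                                # placeholder value
        "derepPrefix": 50,                            # placeholder value
    }


if __name__ == "__main__":
    job = build_job("reads.fasta", "/path/to/refs")
    with open("job.json", "w") as handle:
        json.dump(job, handle, indent=2)
```

A CWL runner that accepts JSON input objects could then be invoked along the lines of `cwltool wgs-fasta.workflow.cwl job.json`.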
Steps
ID | Runs | Label | Doc |
---|---|---|---|
qcBasic | qc-basic.workflow.cwl (Workflow) | ||
abundance | abundance-clca.workflow.cwl (Workflow) | abundance | abundance profiles from annotated files, for protein and/or rna |
orgScreen | organism-screening.workflow.cwl (Workflow) | screen out taxa | Remove sequences that align against a reference set using bowtie2. The references are preformatted (index files). |
darkmatter | ../Tools/extract_darkmatter.tool.cwl (CommandLineTool) | extract darkmatter | retrieve predicted proteins that have no similarity hits. Usage: `extract_darkmatter.py -i <input> -s <sim 1> -s <sim 2> -m <clust map 1> -m <clust map 2> -o <outName>` |
preProcess | preprocess-fasta.workflow.cwl (Workflow) | preprocess fasta | Remove reads from fasta files based on sequence stats. Return fasta files with reads passed and reads removed. |
indexSimSeq | index_sim_seq.workflow.cwl (Workflow) | index sim seq | create sorted / filtered similarity file with feature sequences, and index by md5 |
rnaAnnotate | rna-annotation.workflow.cwl (Workflow) | rna annotation | RNAs - predict, cluster, identify, annotate |
protAnnotate | protein-filter-annotation.workflow.cwl (Workflow) | protein annotation | Proteins - predict, filter, cluster, identify, annotate |
dereplication | ../Tools/dereplication.tool.cwl (CommandLineTool) | dereplication | Keep only one sequence from each set of sequences with identical prefixes |
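The dereplication step's description (keep one sequence per group sharing an identical prefix) can be illustrated with a minimal Python sketch. This is not the code of `dereplication.tool.cwl`. It assumes the first sequence seen for each prefix is the one kept and that comparison is case-insensitive, and the function name `dereplicate` is hypothetical. The `prefix_len` argument plays the role of the `derepPrefix` input, and the two returned lists correspond loosely to the `dereplicationPassed` and `dereplicationRemoved` outputs.

```python
def dereplicate(records, prefix_len):
    """Split records into (passed, removed) by sequence prefix.

    records: iterable of (seq_id, sequence) tuples.
    The first record seen for each prefix is kept (an assumption);
    later records with the same prefix are reported as removed.
    """
    seen = set()
    passed, removed = [], []
    for seq_id, seq in records:
        # Sequences shorter than prefix_len are keyed on their full length.
        key = seq[:prefix_len].upper()
        if key in seen:
            removed.append((seq_id, seq))
        else:
            seen.add(key)
            passed.append((seq_id, seq))
    return passed, removed


if __name__ == "__main__":
    reads = [("a", "ACGTAA"), ("b", "ACGTCC"), ("c", "TTTTAA")]
    kept, dropped = dereplicate(reads, prefix_len=4)
    print([seq_id for seq_id, _ in kept])
    print([seq_id for seq_id, _ in dropped])
```

With a prefix length of 4, reads `a` and `b` share the prefix `ACGT`, so only the first of them is kept.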
Outputs
ID | Type | Label | Doc |
---|---|---|---|
qcStatOut | File | ||
seqBinOut | File | ||
simSeqOut | File | ||
rnaSimsOut | File | ||
seqStatOut | File | ||
protSimsOut | File | ||
qcSummaryOut | File | ||
adapterPassed | File | ||
darkmatterOut | File | ||
lcaProfileOut | File | ||
md5ProfileOut | File | ||
rnaFeatureOut | File | ||
protFeatureOut | File | ||
rnaClustMapOut | File | ||
rnaClustSeqOut | File | ||
sourceStatsOut | File | ||
orgScreenPassed | File | ||
protClustMapOut | File | ||
protClustSeqOut | File | ||
preProcessPassed | File | ||
preProcessRemoved | File | ||
dereplicationPassed | File | ||
dereplicationRemoved | File | ||
protFilterFeatureOut | File |
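After a run, it can be useful to confirm that every output listed above was produced. The sketch below assumes the CWL runner emits an output object as JSON keyed by output ID (as `cwltool` does on standard output) and that this JSON was saved to a file. The helper name `missing_outputs` and the file name `outputs.json` are hypothetical.

```python
import json

# Output IDs copied from the Outputs table of this workflow.
EXPECTED_OUTPUTS = [
    "qcStatOut", "seqBinOut", "simSeqOut", "rnaSimsOut", "seqStatOut",
    "protSimsOut", "qcSummaryOut", "adapterPassed", "darkmatterOut",
    "lcaProfileOut", "md5ProfileOut", "rnaFeatureOut", "protFeatureOut",
    "rnaClustMapOut", "rnaClustSeqOut", "sourceStatsOut", "orgScreenPassed",
    "protClustMapOut", "protClustSeqOut", "preProcessPassed",
    "preProcessRemoved", "dereplicationPassed", "dereplicationRemoved",
    "protFilterFeatureOut",
]


def missing_outputs(output_object):
    """Return expected output IDs that are absent or null in the object."""
    return [name for name in EXPECTED_OUTPUTS
            if output_object.get(name) is None]


if __name__ == "__main__":
    # outputs.json is a hypothetical file holding the runner's output object.
    with open("outputs.json") as handle:
        result = json.load(handle)
    absent = missing_outputs(result)
    if absent:
        print("Missing outputs:", ", ".join(absent))
    else:
        print("All expected outputs are present.")
```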
Permalink:
https://w3id.org/cwl/view/git/4e4d2e674bde612f98f2b0370445f8b2a47587df/CWL/Workflows/wgs-fasta.workflow.cwl