Workflow: Metagenomics workflow
Workflow for Metagenomics from raw reads to annotated bins.
Steps:
- workflow_illumina_quality.cwl:
  - FastQC (control)
  - fastp (quality trimming)
  - kraken2 (taxonomy)
  - bbmap contamination filter
- SPAdes (Assembly)
- QUAST (Assembly quality report)
- BBmap (Read mapping to assembly)
- Contig binning (OPTIONAL)
Inputs
ID | Type | Title | Doc |
---|---|---|---|
memory | Integer (Optional) | memory usage (MB) | maximum memory usage in megabytes |
binning | Boolean (Optional) | Run binning workflow | Run with contig binning workflow |
threads | Integer (Optional) | number of threads | number of threads to use for computational processes |
identifier | String | identifier used | Identifier for this dataset used in this workflow |
run_gtdbtk | Boolean | Run GTDB-Tk | Run GTDB-Tk taxonomic bin classification when true |
deduplicate | Boolean (Optional) | Deduplicate reads | Remove exact duplicate reads with fastp |
destination | String (Optional) | Output Destination | Optional output destination used for cwl-prov reporting. |
pacbio_reads | File[] (Optional) | pacbio reads | file with PacBio reads locally |
nanopore_reads | File[] (Optional) | nanopore reads | file with Nanopore reads locally |
kraken_database | String | Kraken2 database | Absolute path to the Kraken2 database location |
filter_references | String[] | contamination reference file | bbmap reference fasta file paths for contamination filtering |
illumina_forward_reads | String[] | forward reads | forward sequence file path |
illumina_reverse_reads | String[] | reverse reads | reverse sequence file path |
use_reference_mapped_reads | Boolean | Keep mapped reads | Continue with reads mapped to the given reference |
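The inputs table can be turned into a job file for a CWL runner. The following is a minimal sketch, not part of the workflow itself: it treats every input not marked "(Optional)" as required (some may have defaults in the CWL file that the table does not show), it assumes forward and reverse read lists should have the same length because the reads are paired, and all paths and the identifier `sample_01` are hypothetical placeholders.

```python
import json

# Inputs listed without "(Optional)" in the table, with the Python type used here.
REQUIRED_INPUTS = {
    "identifier": str,
    "run_gtdbtk": bool,
    "kraken_database": str,
    "filter_references": list,
    "illumina_forward_reads": list,
    "illumina_reverse_reads": list,
    "use_reference_mapped_reads": bool,
}

# Inputs marked "(Optional)" in the table.
OPTIONAL_INPUTS = {
    "memory": int,
    "binning": bool,
    "threads": int,
    "deduplicate": bool,
    "destination": str,
    "pacbio_reads": list,
    "nanopore_reads": list,
}


def check_job(job):
    """Return a list of problems; an empty list means the job matches the table."""
    problems = []
    for name in REQUIRED_INPUTS:
        if name not in job:
            problems.append(f"missing required input: {name}")
    known = {**REQUIRED_INPUTS, **OPTIONAL_INPUTS}
    for name, value in job.items():
        if name not in known:
            problems.append(f"unknown input: {name}")
        elif type(value) is not known[name]:
            problems.append(
                f"{name}: expected {known[name].__name__}, got {type(value).__name__}"
            )
    # Assumption: paired reads, so both lists should be the same length.
    forward = job.get("illumina_forward_reads", [])
    reverse = job.get("illumina_reverse_reads", [])
    if isinstance(forward, list) and isinstance(reverse, list):
        if len(forward) != len(reverse):
            problems.append("forward and reverse read lists differ in length")
    return problems


# Hypothetical example values; replace with real locations.
example_job = {
    "identifier": "sample_01",
    "run_gtdbtk": False,
    "kraken_database": "/data/kraken2/db",
    "filter_references": ["/data/refs/host.fasta"],
    "illumina_forward_reads": ["/data/reads/sample_01_R1.fastq.gz"],
    "illumina_reverse_reads": ["/data/reads/sample_01_R2.fastq.gz"],
    "use_reference_mapped_reads": False,
    "threads": 8,
    "memory": 16000,
    "binning": True,
}

job_json = json.dumps(example_job, indent=2)
```

CWL runners accept job files in JSON as well as YAML, so `job_json` could be written to a file such as `job.json` and passed to a runner alongside the workflow file, for example `cwltool workflow_metagenomics_assembly.cwl job.json`.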
Steps
ID | Runs | Label | Doc |
---|---|---|---|
bbmap | ../bbmap/bbmap.cwl (CommandLineTool) | BBMap | Read filtering using BBMap against a (contamination) reference genome |
quast | ../quast/quast.cwl (CommandLineTool) | QUAST: Quality Assessment Tool for Genome Assemblies | Runs the Quality Assessment Tool for Genome Assemblies application |
spades | ../assembly/spades.cwl (CommandLineTool) | spades genomic assembler | Runs the spades assembler using a dataset file |
kraken2 | ../kraken2/kraken2.cwl (CommandLineTool) | Kraken2 metagenomics read classification | Kraken2 metagenomics read classification. |
kraken2_krona | ../krona/krona.cwl (CommandLineTool) | Krona | Visualization of Kraken2 report results. ktImportText -o $1 $2 |
compress_spades | ../bash/pigz.cwl (CommandLineTool) | compress a file multithreaded with pigz | |
kraken2_compress | ../bash/pigz.cwl (CommandLineTool) | compress a file multithreaded with pigz | |
workflow_binning | workflow_metagenomics_binning.cwl (Workflow) | Metagenomic Binning from Assembly | Workflow for Metagenomics from raw reads to annotated bins.<br />Summary<br />- MetaBAT2 (binning)<br />- CheckM (bin completeness and contamination)<br />- GTDB-Tk (bin taxonomic classification)<br />- BUSCO (bin completeness) |
workflow_quality | workflow_illumina_quality.cwl (Workflow) | Illumina read quality control, trimming and contamination filter. | **Workflow for Illumina paired read quality control, trimming and filtering.**<br />Multiple paired datasets will be merged into single paired dataset.<br />Summary:<br />- FastQC on raw data files<br />- fastp for read quality trimming<br />- BBduk for phiX and (optional) rRNA filtering<br />- Kraken2 for taxonomic classification of reads (optional)<br />- BBmap for (contamination) filtering using given references (optional)<br />- FastQC on filtered (merged) data<br /> |
sam_to_sorted_bam | ../samtools/sam_to_sorted-bam.cwl (CommandLineTool) | sam to sorted bam | samtools view -@ $2 -hu $1 \| samtools sort -@ $2 -o $3.bam |
contig_read_counts | ../samtools/samtools_idxstats.cwl (CommandLineTool) | samtools idxstats | samtools idxstats - reports alignment summary statistics |
quast_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
spades_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
binning_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
kraken2_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
sorted_bam_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Transforms the input files to a mentioned directory | |
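Two step descriptions in the table quote shell templates with positional placeholders (`$1`, `$2`, `$3`). As a rough illustration of what those templates expand to, the sketch below fills them in with shell-quoted values. The helper `fill_template` and the example file names are hypothetical, and the reading of the positions (for the samtools template: `$1` the SAM input, `$2` the thread count, `$3` the output prefix) is inferred from the samtools flags `-@` and `-o`, not stated in the table.

```python
import re
import shlex

# Templates copied from the Doc column of the steps table.
SAM_TO_SORTED_BAM = "samtools view -@ $2 -hu $1 | samtools sort -@ $2 -o $3.bam"
KRONA = "ktImportText -o $1 $2"


def fill_template(template, *args):
    """Replace $1, $2, ... with the shell-quoted positional arguments."""

    def substitute(match):
        position = int(match.group(1))
        if position < 1 or position > len(args):
            raise ValueError(f"no value supplied for ${position}")
        return shlex.quote(str(args[position - 1]))

    return re.sub(r"\$(\d+)", substitute, template)


# Hypothetical file names, for illustration only.
sorted_bam_cmd = fill_template(SAM_TO_SORTED_BAM, "mapped.sam", 4, "sample_01")
krona_cmd = fill_template(KRONA, "krona.html", "kraken2_report.txt")

print(sorted_bam_cmd)
print(krona_cmd)
```

Quoting each value with `shlex.quote` keeps paths containing spaces intact when the resulting string is handed to a shell; the pipe in the samtools template is left untouched so the two commands remain chained.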
Outputs
ID | Type | Label | Doc |
---|---|---|---|
bam_output | Directory (Optional) | BAM files | Mapping results in indexed BAM format |
quast_output | Directory | QUAST | QUAST analysis output folder |
spades_output | Directory | SPAdes | Metagenome assembly output by SPAdes |
binning_output | Directory (Optional) | Binning output | Binning output folders |
filtered_stats | Directory | Filtered statistics | Statistics on quality and preprocessing of the reads |
kraken2_output | Directory | Kraken2 reports | Kraken2 taxonomic classification reports |
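A CWL runner reports workflow outputs as a JSON object keyed by output ID, with `null` for optional outputs that were not produced. The sketch below shows one way to read such an object against the outputs table. The function `output_locations` and the `file:///results/...` locations are hypothetical, and the example object is hand-written for illustration rather than taken from a real run.

```python
import json

# Output IDs from the table; True marks those listed as "(Optional)".
OUTPUTS = {
    "bam_output": True,
    "quast_output": False,
    "spades_output": False,
    "binning_output": True,
    "filtered_stats": False,
    "kraken2_output": False,
}


def output_locations(output_object):
    """Map each output ID to its directory location, or None if an optional output is absent."""
    locations = {}
    for name, optional in OUTPUTS.items():
        value = output_object.get(name)
        if value is None:
            if not optional:
                raise KeyError(f"required output missing: {name}")
            locations[name] = None
            continue
        if value.get("class") != "Directory":
            raise ValueError(f"{name}: expected a Directory object")
        locations[name] = value.get("location") or value.get("path")
    return locations


# Hand-written example of a runner's output object (binning was not run).
example_output = json.loads("""
{
  "bam_output": {"class": "Directory", "location": "file:///results/BAM"},
  "quast_output": {"class": "Directory", "location": "file:///results/QUAST"},
  "spades_output": {"class": "Directory", "location": "file:///results/SPAdes"},
  "binning_output": null,
  "filtered_stats": {"class": "Directory", "location": "file:///results/stats"},
  "kraken2_output": {"class": "Directory", "location": "file:///results/kraken2"}
}
""")

locations = output_locations(example_output)
```

Treating `bam_output` and `binning_output` as possibly absent mirrors their "(Optional)" type in the table; downstream code should check for `None` before using them.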
https://w3id.org/cwl/view/git/b9097b82e6ab6f2c9496013ce4dd6877092956a0/cwl/workflows/workflow_metagenomics_assembly.cwl