Workflow: NonSpliced RNAseq workflow
Workflow for NonSpliced RNAseq data alignment with multiple aligners.

Steps:
- workflow_illumina_quality.cwl:
  - FastQC (control)
  - fastp (trimming)
- bowtie2 (read mapping)
- sam_to_sorted-bam
- featurecounts (transcript read counts)
- kallisto (transcript [pseudo]counts)

Inputs
| ID | Type | Title | Doc |
|---|---|---|---|
| gtf | File (Optional) | GTF file | GTF file location |
| memory | Integer (Optional) | Max memory | Maximum memory usage in megabytes |
| threads | Integer (Optional) | number of threads | Number of threads to use for computational processes |
| identifier | String | identifier used | Identifier for this dataset used in this workflow |
| destination | String (Optional) | Output Destination | Optional output destination used for cwl-prov reporting |
| filter_rrna | Boolean | Filter rRNA | Filter rRNA from reads if true |
| forward_reads | String[] | forward reads | Local forward sequence files |
| reverse_reads | String[] | reverse reads | Local reverse sequence files |
| bowtie2-indexfolder | Directory | bowtie2 index | Folder location of the bowtie2 index files |
| kallisto-indexfolder | Directory (Optional) | kallisto index | Folder location of the kallisto index file |
| contamination_references | String[] | contamination reference file | BBMap reference FASTA files for contamination filtering |
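
The inputs above can be supplied to a CWL runner as a job file. The following is a minimal sketch that assembles such a job object in Python and serializes it to JSON. Every path and value is a placeholder chosen for illustration, and the `REQUIRED` list simply mirrors the inputs that the table does not mark as optional; the workflow itself may define defaults that are not visible here.

```python
import json

# Placeholder values only: none of these paths or names come from the workflow repository.
job = {
    "identifier": "sample01",
    "threads": 4,
    "memory": 8000,  # megabytes, per the "memory" input description
    "filter_rrna": True,
    "forward_reads": ["/data/sample01_R1.fastq.gz"],
    "reverse_reads": ["/data/sample01_R2.fastq.gz"],
    "contamination_references": ["/refs/contaminants.fasta"],
    # Directory and File inputs are passed as CWL objects with a "class" field.
    "bowtie2-indexfolder": {"class": "Directory", "path": "/refs/bowtie2_index"},
    "kallisto-indexfolder": {"class": "Directory", "path": "/refs/kallisto_index"},
    "gtf": {"class": "File", "path": "/refs/annotation.gtf"},
}

# Inputs listed in the table without "(Optional)".
REQUIRED = [
    "identifier",
    "filter_rrna",
    "forward_reads",
    "reverse_reads",
    "bowtie2-indexfolder",
    "contamination_references",
]

# Basic sanity checks before handing the job to a runner.
missing = [key for key in REQUIRED if key not in job]
paired = len(job["forward_reads"]) == len(job["reverse_reads"])

job_json = json.dumps(job, indent=2)
```

Writing `job_json` to a file (for example `job.json`, an arbitrary name) and passing it to a CWL runner after the workflow file, as in `cwltool workflow_RNAseq_NonSpliced.cwl job.json`, would be one way to start a run.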
Steps
| ID | Runs | Label | Doc |
|---|---|---|---|
| bowtie2 | ../bowtie2/bowtie2_align_simple.cwl (CommandLineTool) | Bowtie2 alignment | Align reads to an indexed genome. Stripped-down version: paired-end reads and SAM output only. |
| kallisto | ../RNAseq/kallisto/kallisto_quant.cwl (CommandLineTool) | kallisto quantification | Pseudoalignment with the tool kallisto https://github.com/common-workflow-library/bio-cwl-tools/tree/release/Kallisto |
| featurecounts | ../RNAseq/featurecounts.cwl (CommandLineTool) | FeatureCounts | Transcript read counts. |
| workflow_quality | workflow_illumina_quality.cwl (Workflow) | Illumina read quality control, trimming and contamination filter. | **Workflow for Illumina paired read quality control, trimming and filtering.**<br />Multiple paired datasets will be merged into a single paired dataset.<br />Summary:<br />- FastQC on raw data files<br />- fastp for read quality trimming<br />- BBduk for phiX and (optional) rRNA filtering<br />- Kraken2 for taxonomic classification of reads (optional)<br />- BBmap for (contamination) filtering using given references (optional)<br />- FastQC on filtered (merged) data |
| sam_to_sorted-bam | ../samtools/sam_to_sorted-bam.cwl (CommandLineTool) | sam to sorted bam | `samtools view -@ $2 -hu $1 \| samtools sort -@ $2 -o $3.bam` |
| bowtie2_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Collects the input files into the specified directory | |
| kallisto_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Collects the input files into the specified directory | |
| featurecounts_files_to_folder | ../expressions/files_to_folder.cwl (ExpressionTool) | Collects the input files into the specified directory | |
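
The doc string of the `sam_to_sorted-bam` step above is a two-stage shell pipeline. The sketch below rebuilds that pipeline in Python, reading the positional parameters as `$1` = input SAM file, `$2` = thread count and `$3` = output prefix. That mapping is inferred from the command itself, not stated by the workflow, and the function names and file names are hypothetical.

```python
import shlex


def sam_to_sorted_bam_commands(sam_path, threads, prefix):
    """Return the argv lists for the two pipeline stages.

    Assumed mapping to the step's doc string: $1 = sam_path, $2 = threads, $3 = prefix.
    """
    # Stage 1: convert SAM to uncompressed BAM (-u), keeping the header (-h).
    view = ["samtools", "view", "-@", str(threads), "-hu", sam_path]
    # Stage 2: sort the stream read from stdin and write <prefix>.bam.
    sort = ["samtools", "sort", "-@", str(threads), "-o", f"{prefix}.bam"]
    return view, sort


def as_shell(view, sort):
    """Render both stages as a single shell pipeline string."""
    return " | ".join(shlex.join(cmd) for cmd in (view, sort))


# Example with placeholder file names.
view_cmd, sort_cmd = sam_to_sorted_bam_commands("sample01.sam", 4, "sample01_sorted")
pipeline = as_shell(view_cmd, sort_cmd)
print(pipeline)
# → samtools view -@ 4 -hu sample01.sam | samtools sort -@ 4 -o sample01_sorted.bam
```

Keeping the two stages as separate argument lists makes it possible to connect them with `subprocess.Popen` and a pipe instead of a shell, which avoids quoting problems with unusual file names.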
Outputs
| ID | Type | Label | Doc |
|---|---|---|---|
| bowtie2_output | Directory | bowtie2 output | bowtie2 mapping results folder. Contains sorted bam file, metrics file and mapping statistics (stdout). |
| filtered_stats | Directory | Filtered statistics | Statistics on quality and preprocessing of the reads |
| kallisto_output | Directory | kallisto output | kallisto results folder. Contains transcript abundances, run info and summary. |
| featurecounts_output | Directory | FeatureCounts output | FeatureCounts results folder. Contains readcounts, summary and mapping statistics (stdout). |
https://w3id.org/cwl/view/git/b9097b82e6ab6f2c9496013ce4dd6877092956a0/cwl/workflows/workflow_RNAseq_NonSpliced.cwl
