Workflow: NonSpliced RNAseq workflow

Fetched 2024-07-17 01:15:47 GMT

Workflow for NonSpliced RNAseq data alignment with multiple aligners.

Steps:
- workflow_illumina_quality.cwl:
    - FastQC (control)
    - fastp (trimming)
- bowtie2 (read mapping)
- sam_to_sorted-bam
- featurecounts (transcript read counts)
- kallisto (transcript [pseudo]counts)

Inputs

| ID | Type | Title | Doc |
|---|---|---|---|
| gtf | File (Optional) | GTF file | GTF file location |
| memory | Integer (Optional) | Max memory | Maximum memory usage in megabytes |
| threads | Integer (Optional) | number of threads | Number of threads to use for computational processes |
| identifier | String | identifier used | Identifier for this dataset used in this workflow |
| destination | String (Optional) | Output Destination | Optional output destination used for cwl-prov reporting |
| filter_rrna | Boolean | Filter rRNA | Filter rRNA from reads if true |
| forward_reads | String[] | forward reads | Forward sequence file(s), given as local paths |
| reverse_reads | String[] | reverse reads | Reverse sequence file(s), given as local paths |
| bowtie2-indexfolder | Directory | bowtie2 index | Folder location of the bowtie2 index files |
| kallisto-indexfolder | Directory (Optional) | kallisto index | Folder location of the kallisto index file |
| contamination_references | String[] | contamination reference file | bbmap reference fasta file for contamination filtering |
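
The inputs listed above map onto a CWL job (input object) file. The following is a minimal Python sketch that assembles such an object and prints it as JSON, which CWL runners such as cwltool accept alongside YAML. The helper name `build_job`, the sample identifier and every path are hypothetical placeholders. The required/optional split follows the listing above, and the one-to-one pairing check is an assumption based on the reads being paired.

```python
import json


def build_job(identifier, forward_reads, reverse_reads, bowtie2_index,
              contamination_references, filter_rrna=False, gtf=None,
              kallisto_index=None, threads=None, memory=None):
    """Assemble a CWL input object for the NonSpliced RNAseq workflow (sketch)."""
    # Assumption: paired-end data, so each forward file has one reverse partner.
    if len(forward_reads) != len(reverse_reads):
        raise ValueError("forward_reads and reverse_reads must have equal length")

    job = {
        "identifier": identifier,
        "filter_rrna": filter_rrna,
        # String[] inputs: plain path strings, not File objects.
        "forward_reads": list(forward_reads),
        "reverse_reads": list(reverse_reads),
        "contamination_references": list(contamination_references),
        # Directory inputs are objects tagged with their CWL class.
        "bowtie2-indexfolder": {"class": "Directory", "path": bowtie2_index},
    }
    # Optional inputs are only written when a value was supplied.
    if gtf is not None:
        job["gtf"] = {"class": "File", "path": gtf}
    if kallisto_index is not None:
        job["kallisto-indexfolder"] = {"class": "Directory", "path": kallisto_index}
    if threads is not None:
        job["threads"] = int(threads)
    if memory is not None:
        job["memory"] = int(memory)  # megabytes, per the input description
    return job


# Every value below is a made-up placeholder.
job = build_job(
    identifier="sample01",
    forward_reads=["/data/reads/sample01_1.fastq.gz"],
    reverse_reads=["/data/reads/sample01_2.fastq.gz"],
    bowtie2_index="/data/index/bowtie2",
    contamination_references=["/data/refs/contaminants.fasta"],
    filter_rrna=True,
    gtf="/data/annotation/genome.gtf",
    threads=4,
)
print(json.dumps(job, indent=2))
```

Saved to a file such as `job.json`, the object could then be passed to a runner, for example `cwltool workflow_RNAseq_NonSpliced.cwl job.json`.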

Steps

ID Runs Label Doc
bowtie2
../bowtie2/bowtie2_align_simple.cwl (CommandLineTool)
Bowtie2 alignment

Align reads to indexed genome. Stripped simple version; only paired end reads and sam output.

kallisto
../RNAseq/kallisto/kallisto_quant.cwl (CommandLineTool)
kallisto quantification

Pseudoalignment with the tool kallisto https://github.com/common-workflow-library/bio-cwl-tools/tree/release/Kallisto

featurecounts
../RNAseq/featurecounts.cwl (CommandLineTool)
FeatureCounts

Transcript read counts from the mapped reads.

workflow_quality Illumina read quality control, trimming and contamination filter.

**Workflow for Illumina paired read quality control, trimming and filtering.**
Multiple paired datasets will be merged into a single paired dataset.

Summary:
- FastQC on raw data files
- fastp for read quality trimming
- BBduk for phiX and (optional) rRNA filtering
- Kraken2 for taxonomic classification of reads (optional)
- BBmap for (contamination) filtering using given references (optional)
- FastQC on filtered (merged) data

**All tool CWL files and other workflows can be found here:**
Tools: https://git.wur.nl/unlock/cwl/-/tree/master/cwl
Workflows: https://git.wur.nl/unlock/cwl/-/tree/master/cwl/workflows

WorkflowHub: https://workflowhub.eu/projects/16/workflows?view=default

sam_to_sorted-bam
../samtools/sam_to_sorted-bam.cwl (CommandLineTool)
sam to sorted bam

samtools view -@ $2 -hu $1 | samtools sort -@ $2 -o $3.bam

bowtie2_files_to_folder
../expressions/files_to_folder.cwl (ExpressionTool)

Collects the input files into a directory with the given name.

kallisto_files_to_folder
../expressions/files_to_folder.cwl (ExpressionTool)

Collects the input files into a directory with the given name.

featurecounts_files_to_folder
../expressions/files_to_folder.cwl (ExpressionTool)

Collects the input files into a directory with the given name.
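
The command documented for the sam_to_sorted-bam step pipes `samtools view` into `samtools sort`. As a rough illustration, the same pipeline could be driven from Python as sketched below. Reading `$1` as the input SAM file, `$2` as the thread count and `$3` as the output prefix is inferred from the flags, and both function names are hypothetical. This is not the tool's actual CWL definition.

```python
import subprocess


def sorted_bam_commands(sam_path, threads, out_prefix):
    """Build the two argv lists of the documented samtools pipeline."""
    # -h keeps the header, -u emits uncompressed BAM for the next stage.
    view_cmd = ["samtools", "view", "-@", str(threads), "-hu", sam_path]
    # With no input file argument, samtools sort reads from stdin.
    sort_cmd = ["samtools", "sort", "-@", str(threads), "-o", f"{out_prefix}.bam"]
    return view_cmd, sort_cmd


def sam_to_sorted_bam(sam_path, threads, out_prefix):
    """Run `samtools view | samtools sort` and return the BAM path."""
    view_cmd, sort_cmd = sorted_bam_commands(sam_path, threads, out_prefix)
    view = subprocess.Popen(view_cmd, stdout=subprocess.PIPE)
    sort = subprocess.Popen(sort_cmd, stdin=view.stdout)
    # Close our copy of the pipe so `view` sees SIGPIPE if `sort` exits early.
    view.stdout.close()
    sort_rc = sort.wait()
    view_rc = view.wait()
    if view_rc != 0 or sort_rc != 0:
        raise RuntimeError(f"samtools failed (view={view_rc}, sort={sort_rc})")
    return f"{out_prefix}.bam"
```

Calling `sam_to_sorted_bam("sample01.sam", 4, "sample01")` would require samtools on the `PATH` and would write `sample01.bam`. The file names here are placeholders.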

Outputs

| ID | Type | Label | Doc |
|---|---|---|---|
| bowtie2_output | Directory | bowtie2 output | bowtie2 mapping results folder. Contains sorted bam file, metrics file and mapping statistics (stdout). |
| filtered_stats | Directory | Filtered statistics | Statistics on quality and preprocessing of the reads |
| kallisto_output | Directory | kallisto output | kallisto results folder. Contains transcript abundances, run info and summary. |
| featurecounts_output | Directory | FeatureCounts output | FeatureCounts results folder. Contains readcounts, summary and mapping statistics (stdout). |
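
For downstream use of the kallisto results, a small reader can be handy. kallisto quant normally writes its transcript abundances to a tab-separated `abundance.tsv` file with the columns `target_id`, `length`, `eff_length`, `est_counts` and `tpm`. The sketch below assumes the kallisto_output folder holds that file unchanged, which this page does not confirm. The function name and the two example transcripts are made up for illustration.

```python
import csv
import io


def read_kallisto_abundance(handle):
    """Map each target_id to its estimated counts and TPM."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["target_id"]: {
            "est_counts": float(row["est_counts"]),
            "tpm": float(row["tpm"]),
        }
        for row in reader
    }


# Made-up two-transcript example in the abundance.tsv column layout.
example = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "tx1\t1000\t850.0\t120\t600000\n"
    "tx2\t500\t350.0\t20\t400000\n"
)
abundances = read_kallisto_abundance(example)
print(abundances["tx1"]["tpm"])
```

For a real run, the same function could be given an open file handle, for example `open(path, newline="")`, where `path` points to the abundance file inside the output folder.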

Permalink: https://w3id.org/cwl/view/git/b9097b82e6ab6f2c9496013ce4dd6877092956a0/cwl/workflows/workflow_RNAseq_NonSpliced.cwl