Workflow: wf_trim_and_map_se.cwl

Fetched 2023-01-04 10:46:09 GMT

This workflow takes trimming parameters and demultiplexed reads as input and performs the following steps in order: trimx1, trimx2, fastq-sort, filter repeat elements, fastq-sort, genomic mapping, sort alignment, index alignment, namesort, PCR dedup, sort alignment, index alignment.

Inputs

ID Type Title Doc
read1 File
read_name String
a_adapters File
bam_suffix String
sort_names Boolean
trim_times String
dataset_name String
fastq_suffix String
trim_error_rate String
speciesGenomeDir Directory
repeatElementGenomeDir Directory
trimagain_overlap_length String
trimfirst_overlap_length String

Steps

ID Runs Label Doc
A_sort
sort.cwl (CommandLineTool)

This tool wraps samtools sort by coordinates (namesort flag is False by default). Usage: samtools sort [options...] [in.bam]

X_sort
sort.cwl (CommandLineTool)

This tool wraps samtools sort by coordinates (namesort flag is False by default). Usage: samtools sort [options...] [in.bam]

X_trim
trim_se.cwl (CommandLineTool)

This tool wraps cutadapt with default parameters set to single-end eCLIP processing defaults. Usage: cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq

A_index
samtools-index.cwl (CommandLineTool)

samtools-index.cwl is developed for the CWL consortium.

A_map_genome
star-genome.cwl (CommandLineTool)
X_sortlexico
namesort.cwl (CommandLineTool)

This tool wraps samtools sort, setting the by-name (-n) flag to be True by default. Usage: samtools sort -n <input.bam> <output.bam>

X_trim_again
trim_se.cwl (CommandLineTool)

This tool wraps cutadapt with default parameters set to single-end eCLIP processing defaults. Usage: cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq

A_map_repeats
star-repeatmapping.cwl (CommandLineTool)
get_a_adapters
file2stringArray.cwl (ExpressionTool)

Returns string array expression based on lines in a fasta file (SKIPS >).
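
As a rough illustration of what this step produces, here is a sketch in Python that returns the sequence lines of FASTA text as a list of strings, skipping header lines that start with `>`. The function name `fasta_to_string_array` is hypothetical, and dropping blank lines is an assumption; the actual ExpressionTool is written as a CWL expression and may treat blank lines differently.

```python
def fasta_to_string_array(text):
    """Return non-header lines of FASTA text as a list of strings.

    Lines starting with '>' are skipped. Blank lines are also dropped,
    which is an assumption rather than documented behavior.
    """
    result = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(">"):
            continue
        result.append(stripped)
    return result


if __name__ == "__main__":
    example = ">adapter_1\nACGTACGT\n>adapter_2\nTTGACCAA\n"
    print(fasta_to_string_array(example))
```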

index_rmdup_bam
samtools-index.cwl (CommandLineTool)

samtools-index.cwl is developed for the CWL consortium.

step_fastqc_trim
wf_fastqc.cwl (Workflow)

This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol)

X_barcodecollapsese
barcodecollapse_se.cwl (CommandLineTool)

This command deduplicates BAM files based on the first mapping coordinate and the UMI attached to each read. It assumes that the FASTQ files were processed with extract_umi.py before mapping, so the UMI is the last word of the read name, for example:

@HISEQ:87:00000000_AATT

where AATT is the UMI sequence.

Usage: umi_tools dedup -I infile.bam -S deduped.bam -L dedup.log
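
The idea described above can be sketched in Python as follows. This is a deliberately simplified, exact-match version: reads are grouped by chromosome, position, strand, and the UMI parsed from the end of the read name, and only the first read per group is kept. The `Read` tuple and the function names are hypothetical; umi_tools itself reads BAM records and supports more elaborate, error-aware grouping methods, so this is not its implementation.

```python
from collections import namedtuple

# Hypothetical, minimal stand-in for an aligned read.
Read = namedtuple("Read", ["name", "chrom", "pos", "strand"])


def umi_of(read_name):
    """Return the UMI, assumed to be the last '_'-separated word of the name."""
    return read_name.rsplit("_", 1)[-1]


def dedup(reads):
    """Keep the first read seen for each (chrom, pos, strand, UMI) key."""
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.pos, read.strand, umi_of(read.name))
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


if __name__ == "__main__":
    reads = [
        Read("HISEQ:87:00000000_AATT", "chr1", 100, "+"),
        Read("HISEQ:87:00000001_AATT", "chr1", 100, "+"),  # duplicate key
        Read("HISEQ:87:00000002_GGCC", "chr1", 100, "+"),  # different UMI
    ]
    for read in dedup(reads):
        print(read.name)
```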

A_sort_trimmed_fastq
fastqsort.cwl (CommandLineTool)

Sorts FASTQ files by their read name. Sorted fastq files are required to keep mapping steps deterministic.

Usage: fastq-sort --id FASTQ_FILE > STDOUT
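
To make the effect of sorting by read name concrete, here is a small Python sketch that parses FASTQ text into four-line records and orders them by identifier. It assumes strictly four lines per record and uses plain string comparison on the first word of the header line; the helper names are hypothetical, and the collation used by fastq-sort may differ in detail.

```python
def parse_fastq(text):
    """Split FASTQ text into (header, sequence, separator, quality) tuples.

    Assumes every record occupies exactly four lines.
    """
    lines = text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ text must contain a multiple of four lines")
    return [tuple(lines[i:i + 4]) for i in range(0, len(lines), 4)]


def sort_by_id(records):
    """Order records by the first word of the header line."""
    return sorted(records, key=lambda rec: rec[0].split()[0])


def to_fastq(records):
    """Serialize records back to FASTQ text."""
    return "".join("\n".join(rec) + "\n" for rec in records)


if __name__ == "__main__":
    unsorted_text = "@r2\nAC\n+\nII\n@r1\nGT\n+\nII\n"
    print(to_fastq(sort_by_id(parse_fastq(unsorted_text))), end="")
```

Because the output order depends only on the read names, the same set of reads always yields the same file regardless of input order, which is what keeps the downstream mapping steps deterministic.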

rename_mapped_genome
rename.cwl (CommandLineTool)
rename_mapped_repeats
rename.cwl (CommandLineTool)
step_gzip_sort_X_trim
gzip.cwl (CommandLineTool)
step_fastqc_trim_again
wf_fastqc.cwl (Workflow)

This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol)

rename_unmapped_repeats
rename.cwl (CommandLineTool)
A_sort_repunmapped_fastq
fastqsort.cwl (CommandLineTool)

Sorts FASTQ files by their read name. Sorted fastq files are required to keep mapping steps deterministic.

Usage: fastq-sort --id FASTQ_FILE > STDOUT

step_gzip_sort_X_trim_again
gzip.cwl (CommandLineTool)
step_gzip_sort_repunmapped_fastq
gzip.cwl (CommandLineTool)

Outputs

ID Type Label Doc
A_output_sorted_bam File
X_output_sorted_bam File
X_output_trim_again File[]
X_output_trim_first File[]
A_output_mapgenome_stats File
A_output_maprepeats_stats File
X_output_trim_again_metrics File
X_output_trim_first_metrics File
X_output_barcodecollapsese_bam File
A_output_sort_repunmapped_fastq File
A_output_mapgenome_star_settings File
X_output_trim_again_fastqc_stats File
X_output_trim_first_fastqc_stats File
A_output_maprepeats_star_settings File
X_output_trim_again_fastqc_report File
X_output_trim_first_fastqc_report File
X_output_barcodecollapsese_metrics File
A_output_mapgenome_mapped_to_genome File
A_output_maprepeats_mapped_to_genome File
Permalink: https://w3id.org/cwl/view/git/6b533898c395e9e5b9d0acc1587d0c68bc56abe0/cwl/wf_trim_and_map_se.cwl