Workflow: wf_trim_and_map_pe.cwl

Fetched 2023-01-05 07:31:29 GMT

This workflow takes trimming parameters and demultiplexed reads, and performs the following steps in order: trimx1, trimx2, fastq-sort, filter repeat elements, fastq-sort, genomic mapping, sort alignment, index alignment, namesort, PCR dedup, sort alignment, index alignment.


Inputs

ID Type Title Doc
read1 File
read2 File
A_adapters File
a_adapters File
g_adapters File
sort_names Boolean
trim_times String
trim_error_rate String
speciesGenomeDir Directory
a_adapters_default File
g_adapters_default File
repeatElementGenomeDir Directory
trimagain_overlap_length String
trimfirst_overlap_length String
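
A CWL runner accepts these inputs as a job object written in JSON or YAML. The following is a minimal sketch in Python that assembles such an object with keys matching the input IDs listed above. Every path and parameter value is a hypothetical placeholder, and whether all inputs are required is not stated in the listing.

```python
import json


def cwl_file(path):
    """Return a CWL File object for a path."""
    return {"class": "File", "path": path}


def cwl_dir(path):
    """Return a CWL Directory object for a path."""
    return {"class": "Directory", "path": path}


def build_job(read1, read2, genome_dir, repeat_dir, adapters):
    """Assemble a job object keyed by the workflow input IDs."""
    job = {
        "read1": cwl_file(read1),
        "read2": cwl_file(read2),
        "speciesGenomeDir": cwl_dir(genome_dir),
        "repeatElementGenomeDir": cwl_dir(repeat_dir),
        "sort_names": True,
        # These inputs are typed String in the table, so they are passed
        # as strings. The values are placeholders, not recommendations.
        "trim_times": "1",
        "trim_error_rate": "0.1",
        "trimfirst_overlap_length": "1",
        "trimagain_overlap_length": "5",
    }
    for input_id, path in adapters.items():
        job[input_id] = cwl_file(path)
    return job


# Hypothetical file layout, for illustration only.
job = build_job(
    "reads/sample.r1.fq.gz",
    "reads/sample.r2.fq.gz",
    "refs/genome_star_index",
    "refs/repeats_star_index",
    {
        "A_adapters": "adapters/A_adapters.fasta",
        "a_adapters": "adapters/a_adapters.fasta",
        "g_adapters": "adapters/g_adapters.fasta",
        "a_adapters_default": "adapters/a_adapters_default.fasta",
        "g_adapters_default": "adapters/g_adapters_default.fasta",
    },
)
print(json.dumps(job, indent=2))
```

The printed JSON can be saved to a file and passed to a CWL runner together with wf_trim_and_map_pe.cwl.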

Steps

ID Runs Label Doc
X_sort
sort.cwl (CommandLineTool)

This tool wraps samtools sort, sorting by coordinates (the namesort flag is False by default). Usage: samtools sort [options...] [in.bam]

X_trim
trim_pe.cwl (CommandLineTool)

This tool wraps cutadapt with default parameters set to paired-end eCLIP processing defaults. Usage: cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq
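
To make the usage line concrete, here is a minimal sketch that builds a paired-end cutadapt argument list in Python. The flags -a, -g, -A, -e, -n and -O are standard cutadapt options. Mapping the workflow inputs a_adapters, g_adapters, A_adapters, trim_error_rate, trim_times and the overlap lengths onto them is an assumption based on their names, not something the listing confirms.

```python
def cutadapt_pe_command(in1, in2, out1, out2,
                        adapters_a=(), adapters_g=(), adapters_A=(),
                        error_rate=None, times=None, overlap=None):
    """Build an argv list for a paired-end cutadapt run."""
    cmd = ["cutadapt"]
    for seq in adapters_a:      # 3' adapters on read 1
        cmd += ["-a", seq]
    for seq in adapters_g:      # 5' adapters on read 1
        cmd += ["-g", seq]
    for seq in adapters_A:      # 3' adapters on read 2
        cmd += ["-A", seq]
    if error_rate is not None:  # maximum allowed error rate
        cmd += ["-e", str(error_rate)]
    if times is not None:       # how many adapters to remove per read
        cmd += ["-n", str(times)]
    if overlap is not None:     # minimum adapter/read overlap
        cmd += ["-O", str(overlap)]
    cmd += ["-o", out1, "-p", out2, in1, in2]
    return cmd


# ADAPT1 and ADAPT2 are the placeholders from the usage line above;
# the numeric values are illustrative only.
cmd = cutadapt_pe_command(
    "in1.fastq", "in2.fastq", "out1.fastq", "out2.fastq",
    adapters_a=["ADAPT1"], adapters_A=["ADAPT2"],
    error_rate="0.1", times="1", overlap="1",
)
print(" ".join(cmd))
```

Returning a list rather than a shell string lets the result be passed directly to subprocess.run without quoting concerns.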

A_map_genome
star-genome.cwl (CommandLineTool)

X_sortlexico
namesort.cwl (CommandLineTool)

This tool wraps samtools sort, setting the by-name (-n) flag to be True by default. Usage: samtools sort -n <input.bam> <output.bam>

X_trim_again
trim_pe.cwl (CommandLineTool)

This tool wraps cutadapt with default parameters set to paired-end eCLIP processing defaults. Usage: cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq

A_map_repeats
star-repeatmapping.cwl (CommandLineTool)

get_A_adapters
file2stringArray.cwl (ExpressionTool)

Returns a string array built from the lines of a FASTA file, skipping header lines (those starting with >).
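
The behavior described here can be sketched in Python as follows. The actual tool is a CWL ExpressionTool, so this is only an analogue of the description. The function name is hypothetical, and dropping blank lines is an assumption.

```python
def fasta_to_string_array(text):
    """Return the sequence lines of FASTA text, skipping '>' header lines."""
    sequences = []
    for line in text.splitlines():
        line = line.strip()
        # Skip header lines; skipping blank lines is an assumption.
        if not line or line.startswith(">"):
            continue
        sequences.append(line)
    return sequences


# Made-up adapter records, for illustration only.
example = ">adapter_1\nACGTACGT\n>adapter_2\nTTGGCCAA\n"
print(fasta_to_string_array(example))  # ['ACGTACGT', 'TTGGCCAA']
```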

get_a_adapters
file2stringArray.cwl (ExpressionTool)

Returns a string array built from the lines of a FASTA file, skipping header lines (those starting with >).

get_g_adapters
file2stringArray.cwl (ExpressionTool)

Returns a string array built from the lines of a FASTA file, skipping header lines (those starting with >).

X_barcodecollapsepe
barcodecollapse_pe.cwl (CommandLineTool)

This tool wraps barcodecollapsepe.py, a paired-end PCR duplicate removal script. It reads a .bam file in which the barcode is the first field of the read name when split on ":", and merges reads that map to the same position and have the same barcode. It assumes that paired-end reads are adjacent in the output file (i.e. it needs unsorted BAMs). It also assumes there are no multimappers in the BAM file (otherwise behavior is undefined). Usage: python barcodecollapsepe.py --bam BAM --out_file OUT_FILE --metrics_file METRICS_FILE
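
The idea of collapsing by barcode and position can be sketched on plain Python records. This is not the code of barcodecollapsepe.py. The ReadPair type is hypothetical, and the choice of fields that define "same position" is an assumption made for illustration.

```python
from collections import namedtuple

# Hypothetical simplified record for one mapped read pair.
ReadPair = namedtuple("ReadPair", "name chrom start mate_start strand")


def barcode_of(read_name):
    """The barcode is the first ':'-separated field of the read name."""
    return read_name.split(":")[0]


def collapse_duplicates(pairs):
    """Keep the first pair seen for each (barcode, position) key."""
    seen = set()
    kept = []
    for pair in pairs:
        key = (barcode_of(pair.name), pair.chrom, pair.start,
               pair.mate_start, pair.strand)
        if key in seen:
            continue  # same barcode at the same position: PCR duplicate
        seen.add(key)
        kept.append(pair)
    return kept


# Made-up records, for illustration only.
pairs = [
    ReadPair("AAC:read1", "chr1", 100, 250, "+"),
    ReadPair("AAC:read2", "chr1", 100, 250, "+"),  # duplicate of read1
    ReadPair("GGT:read3", "chr1", 100, 250, "+"),  # different barcode
    ReadPair("AAC:read4", "chr1", 180, 300, "+"),  # different position
]
kept = collapse_duplicates(pairs)
print([p.name for p in kept])  # ['AAC:read1', 'GGT:read3', 'AAC:read4']
```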

step_fastqc_trim_R1
wf_fastqc.cwl (Workflow)

This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol)

step_fastqc_trim_R2
wf_fastqc.cwl (Workflow)

This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol)

A_sort_trimmed_fastq
fastqsort.cwl (CommandLineTool)

Sorts FASTQ files by their read name. Sorted fastq files are required to keep mapping steps deterministic.

Usage: fastq-sort --id FASTQ_FILE > STDOUT
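
Sorting by read name can be sketched in Python as below. This is not the fastq-sort implementation. It assumes four-line FASTQ records and uses plain string ordering of the read ID, which may differ from the ordering fastq-sort applies.

```python
def parse_fastq(text):
    """Split FASTQ text into 4-line records (header, sequence, '+', quality)."""
    lines = text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ text must contain a multiple of 4 lines")
    return [tuple(lines[i:i + 4]) for i in range(0, len(lines), 4)]


def read_id(record):
    """Return the ID: the header without '@', up to the first whitespace."""
    return record[0][1:].split()[0]


def sort_fastq_by_id(text):
    """Return FASTQ text with records ordered by read ID."""
    records = sorted(parse_fastq(text), key=read_id)
    if not records:
        return ""
    return "\n".join("\n".join(record) for record in records) + "\n"


# Made-up reads, for illustration only.
unsorted_text = "@r2 x\nGG\n+\nII\n@r1 y\nAA\n+\nII\n"
print(sort_fastq_by_id(unsorted_text), end="")
```

Because the output order depends only on the read IDs, two runs over the same reads yield identical input to the mapper, which is the determinism the step description refers to.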

rename_mapped_genome
rename.cwl (CommandLineTool)

rename_mapped_repeats
rename.cwl (CommandLineTool)

step_gzip_sort_X_trim
gzip.cwl (CommandLineTool)

get_a_adapters_default
file2stringArray.cwl (ExpressionTool)

Returns a string array built from the lines of a FASTA file, skipping header lines (those starting with >).

get_g_adapters_default
file2stringArray.cwl (ExpressionTool)

Returns a string array built from the lines of a FASTA file, skipping header lines (those starting with >).

A_sort_repunmapped_fastq
fastqsort.cwl (CommandLineTool)

Sorts FASTQ files by their read name. Sorted fastq files are required to keep mapping steps deterministic.

Usage: fastq-sort --id FASTQ_FILE > STDOUT

step_fastqc_trim_again_R1
wf_fastqc.cwl (Workflow)

This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol)

step_fastqc_trim_again_R2
wf_fastqc.cwl (Workflow)

This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol)

rename_unmapped_repeats_r1
rename.cwl (CommandLineTool)

rename_unmapped_repeats_r2
rename.cwl (CommandLineTool)

step_gzip_sort_X_trim_again
gzip.cwl (CommandLineTool)

step_gzip_sort_repunmapped_fastq
gzip.cwl (CommandLineTool)

Outputs

ID Type Label Doc
A_output_sorted_bam File
X_output_sorted_bam File
X_output_trim_again File[]
X_output_trim_first File[]
A_output_mapgenome_stats File
A_output_maprepeats_stats File
X_output_trim_again_metrics File
X_output_trim_first_metrics File
X_output_barcodecollapsepe_bam File
A_output_sort_repunmapped_fastq File[]
A_output_mapgenome_star_settings File
A_output_maprepeats_star_settings File
X_output_barcodecollapsepe_metrics File
A_output_mapgenome_mapped_to_genome File
X_output_trim_again_fastqc_stats_R1 File
X_output_trim_again_fastqc_stats_R2 File
X_output_trim_first_fastqc_stats_R1 File
X_output_trim_first_fastqc_stats_R2 File
A_output_maprepeats_mapped_to_genome File
X_output_trim_again_fastqc_report_R1 File
X_output_trim_again_fastqc_report_R2 File
X_output_trim_first_fastqc_report_R1 File
X_output_trim_first_fastqc_report_R2 File
Permalink: https://w3id.org/cwl/view/git/b2b95f58f96ee5b34bd9b342d0ecda63d135e278/cwl/wf_trim_and_map_pe.cwl