Workflow: wf_trim_and_map_pe.cwl
This workflow takes trimming parameters and demultiplexed reads and performs the following steps in order: trim (first pass), trim (second pass), fastq-sort, repeat-element filtering, fastq-sort, genomic mapping, alignment sort, alignment index, name sort, PCR deduplication, alignment sort, alignment index.
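A CWL runner needs a job file that supplies the reads and trimming parameters. The following is a minimal sketch in Python that writes one as JSON, using the input IDs and types from the Inputs table. Every path and parameter value here is a hypothetical placeholder, not a recommended setting, and the helper names (`file_obj`, `dir_obj`, `build_job`) are made up for illustration.

```python
import json


def file_obj(path):
    """Build a CWL File object for a job file."""
    return {"class": "File", "path": path}


def dir_obj(path):
    """Build a CWL Directory object for a job file."""
    return {"class": "Directory", "path": path}


def build_job():
    """Return a job dict keyed by the workflow's input IDs.

    All paths and values are placeholders (assumptions), chosen only to
    match the declared types: File, Directory, Boolean, String.
    """
    return {
        "read1": file_obj("sample.r1.fq.gz"),
        "read2": file_obj("sample.r2.fq.gz"),
        "A_adapters": file_obj("A_adapters.fasta"),
        "a_adapters": file_obj("a_adapters.fasta"),
        "g_adapters": file_obj("g_adapters.fasta"),
        "a_adapters_default": file_obj("a_adapters_default.fasta"),
        "g_adapters_default": file_obj("g_adapters_default.fasta"),
        "sort_names": True,
        # The table types these as String, so they are quoted here.
        "trim_times": "1",
        "trim_error_rate": "0.1",
        "trimfirst_overlap_length": "1",
        "trimagain_overlap_length": "5",
        "speciesGenomeDir": dir_obj("star_index_genome"),
        "repeatElementGenomeDir": dir_obj("star_index_repeats"),
    }


if __name__ == "__main__":
    # Print the job as JSON; redirect to a file to use it with a CWL runner.
    print(json.dumps(build_job(), indent=2))
```

The printed JSON can be saved and passed to a CWL runner together with wf_trim_and_map_pe.cwl.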
Inputs
ID | Type | Title | Doc |
---|---|---|---|
read1 | File | ||
read2 | File | ||
A_adapters | File | ||
a_adapters | File | ||
g_adapters | File | ||
sort_names | Boolean | ||
trim_times | String | ||
trim_error_rate | String | ||
speciesGenomeDir | Directory | ||
a_adapters_default | File | ||
g_adapters_default | File | ||
repeatElementGenomeDir | Directory | ||
trimagain_overlap_length | String | ||
trimfirst_overlap_length | String | ||
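The adapter inputs above are typed File, and the workflow's file2stringArray.cwl steps are documented as returning a string array from the lines of a FASTA file, skipping `>` lines. A minimal Python sketch of that conversion follows. It is not the ExpressionTool's actual code; the function name is hypothetical, and dropping blank lines and trimming whitespace are assumptions.

```python
def fasta_to_string_array(text):
    """Return the non-header lines of FASTA text as a list of strings.

    Lines starting with '>' are skipped. Blank lines are also dropped and
    surrounding whitespace is trimmed (both are assumptions of this sketch).
    """
    sequences = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(">"):
            continue
        sequences.append(line)
    return sequences


if __name__ == "__main__":
    # Hypothetical adapter file contents, for illustration only.
    example = ">adapter_1\nAACTTGTAGATCGGA\n>adapter_2\nAGGACCAAGATCGGA\n"
    print(fasta_to_string_array(example))
```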
Steps
ID | Runs | Label | Doc |
---|---|---|---|
X_sort | sort.cwl (CommandLineTool) | This tool wraps samtools sort by coordinates (namesort flag is False by default). Usage: samtools sort [options...] [in.bam] | |
X_trim | trim_pe.cwl (CommandLineTool) | This tool wraps cutadapt with default parameters set to paired-end eCLIP processing defaults. Usage: cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq | |
A_map_genome | star-genome.cwl (CommandLineTool) | ||
X_sortlexico | namesort.cwl (CommandLineTool) | This tool wraps samtools sort, setting the by-name (-n) flag to be True by default. Usage: samtools sort -n <input.bam> <output.bam> | |
X_trim_again | trim_pe.cwl (CommandLineTool) | This tool wraps cutadapt with default parameters set to paired-end eCLIP processing defaults. Usage: cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq | |
A_map_repeats | star-repeatmapping.cwl (CommandLineTool) | ||
get_A_adapters | file2stringArray.cwl (ExpressionTool) | Returns string array expression based on lines in a fasta file (SKIPS >). | |
get_a_adapters | file2stringArray.cwl (ExpressionTool) | Returns string array expression based on lines in a fasta file (SKIPS >). | |
get_g_adapters | file2stringArray.cwl (ExpressionTool) | Returns string array expression based on lines in a fasta file (SKIPS >). | |
X_barcodecollapsepe | barcodecollapse_pe.cwl (CommandLineTool) | This tool wraps barcodecollapsepe.py, a paired-end PCR duplicate removal script. It reads a .bam file in which the barcode is the first field of the read name when split on ":", and merges reads that map to the same position and share the same barcode. It assumes paired-end reads are adjacent in the file (i.e. it needs unsorted BAMs) and that the BAM file contains no multimappers (otherwise behavior is undefined). Usage: python barcodecollapsepe.py --bam BAM --out_file OUT_FILE --metrics_file METRICS_FILE | |
step_fastqc_trim_R1 | wf_fastqc.cwl (Workflow) | This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol) | |
step_fastqc_trim_R2 | wf_fastqc.cwl (Workflow) | This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol) | |
A_sort_trimmed_fastq | fastqsort.cwl (CommandLineTool) | Sorts FASTQ files by their read name. Sorted FASTQ files are required to keep mapping steps deterministic. | |
rename_mapped_genome | rename.cwl (CommandLineTool) | ||
rename_mapped_repeats | rename.cwl (CommandLineTool) | ||
step_gzip_sort_X_trim | gzip.cwl (CommandLineTool) | ||
get_a_adapters_default | file2stringArray.cwl (ExpressionTool) | Returns string array expression based on lines in a fasta file (SKIPS >). | |
get_g_adapters_default | file2stringArray.cwl (ExpressionTool) | Returns string array expression based on lines in a fasta file (SKIPS >). | |
A_sort_repunmapped_fastq | fastqsort.cwl (CommandLineTool) | Sorts FASTQ files by their read name. Sorted FASTQ files are required to keep mapping steps deterministic. | |
step_fastqc_trim_again_R1 | wf_fastqc.cwl (Workflow) | This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol) | |
step_fastqc_trim_again_R2 | wf_fastqc.cwl (Workflow) | This workflow takes in single-end reads, and performs the following steps in order: demux_se.cwl (does not actually demux for single end, but mirrors the paired-end processing protocol) | |
rename_unmapped_repeats_r1 | rename.cwl (CommandLineTool) | ||
rename_unmapped_repeats_r2 | rename.cwl (CommandLineTool) | ||
step_gzip_sort_X_trim_again | gzip.cwl (CommandLineTool) | ||
step_gzip_sort_repunmapped_fastq | gzip.cwl (CommandLineTool) | ||
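The X_barcodecollapsepe step is described as merging reads that map to the same position and carry the same barcode, where the barcode is the part of the read name left of the first ":". The idea can be sketched in plain Python as below. This is not the code of barcodecollapsepe.py: the `Pair` record, its fields, and the choice of key (chromosome, both mate starts, strand) are assumptions made for illustration, and the real script works on BAM records rather than tuples.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Pair(NamedTuple):
    """Hypothetical stand-in for one aligned read pair."""
    name: str     # read name, e.g. "BARCODE:rest_of_name"
    chrom: str
    start1: int   # mapped start of read 1
    start2: int   # mapped start of read 2
    strand: str


def barcode_of(read_name: str) -> str:
    """Barcode = text left of the first ':' in the read name."""
    return read_name.split(":")[0]


def collapse(pairs: Iterable[Pair]) -> Tuple[List[Pair], int]:
    """Keep the first pair seen for each (barcode, position) key.

    Returns the kept pairs in input order and the number removed.
    """
    seen = set()
    kept = []
    removed = 0
    for pair in pairs:
        key = (barcode_of(pair.name), pair.chrom,
               pair.start1, pair.start2, pair.strand)
        if key in seen:
            removed += 1      # same barcode at same position: duplicate
            continue
        seen.add(key)
        kept.append(pair)
    return kept, removed


if __name__ == "__main__":
    # Made-up records for illustration only.
    pairs = [
        Pair("AAC:read1", "chr1", 100, 250, "+"),
        Pair("AAC:read2", "chr1", 100, 250, "+"),  # duplicate of read1
        Pair("GGT:read3", "chr1", 100, 250, "+"),  # different barcode
        Pair("AAC:read4", "chr1", 500, 650, "+"),  # different position
    ]
    kept, removed = collapse(pairs)
    print([p.name for p in kept], removed)
```

Keying on the barcode as well as the position is what lets two distinct molecules that happen to map to the same place survive deduplication.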
Outputs
ID | Type | Label | Doc |
---|---|---|---|
A_output_sorted_bam | File | ||
X_output_sorted_bam | File | ||
X_output_trim_again | File[] | ||
X_output_trim_first | File[] | ||
A_output_mapgenome_stats | File | ||
A_output_maprepeats_stats | File | ||
X_output_trim_again_metrics | File | ||
X_output_trim_first_metrics | File | ||
X_output_barcodecollapsepe_bam | File | ||
A_output_sort_repunmapped_fastq | File[] | ||
A_output_mapgenome_star_settings | File | ||
A_output_maprepeats_star_settings | File | ||
X_output_barcodecollapsepe_metrics | File | ||
A_output_mapgenome_mapped_to_genome | File | ||
X_output_trim_again_fastqc_stats_R1 | File | ||
X_output_trim_again_fastqc_stats_R2 | File | ||
X_output_trim_first_fastqc_stats_R1 | File | ||
X_output_trim_first_fastqc_stats_R2 | File | ||
A_output_maprepeats_mapped_to_genome | File | ||
X_output_trim_again_fastqc_report_R1 | File | ||
X_output_trim_again_fastqc_report_R2 | File | ||
X_output_trim_first_fastqc_report_R1 | File | ||
X_output_trim_first_fastqc_report_R2 | File | ||
https://w3id.org/cwl/view/git/b2b95f58f96ee5b34bd9b342d0ecda63d135e278/cwl/wf_trim_and_map_pe.cwl