Workflow: STAR-RNA-Seq alignment and transcript/gene abundance workflow
Inputs
ID | Type | Title | Doc |
---|---|---|---|
strand | |||
refFlat | File | ||
reference | File | ||
unaligned | https://w3id.org/cwl/view/git/bfcb5ffbea3d00a38cc03595d41e53ea976d599d/definitions/types/sequence_data.yml#sequence_data[] | | Raw data from RNA sequencing; this custom type holds both the data file(s) and the readgroup information. The data may be either a single BAM file or a pair of FASTQ files. Readgroup information should be given as a series of key:value pairs, each separated by a space, which means that any space within a value must be double quoted. The first key must be ID; consult the read group description in the header section of the SAM file specification for the other, optional keys. An example element of the input array, written in YAML flow style: {readgroup: "ID:xxx PU:xxx SM:xxx LB:xxx PL:ILLUMINA CN:WUGSC", sequence: {fastq1: {class: File, path: /path/to/reads1.fastq}, fastq2: {class: File, path: /path/to/reads2.fastq}}} or, for BAM input, sequence: {bam: {class: File, path: /path/to/reads.bam}} |
cdna_fasta | File | ||
sample_name | String | ||
unzip_fastqs | Boolean (Optional) | ||
kallisto_index | File | ||
star_genome_dir | Directory | ||
agfusion_database | File | ||
trimming_adapters | File | ||
ribosomal_intervals | File | ||
fusioninspector_mode | |||
reference_annotation | File | ||
examine_coding_effect | Boolean (Optional) | ||
trimming_max_uncalled | Integer | ||
star_fusion_genome_dir | Directory | ||
trimming_min_readlength | Integer | ||
trimming_adapter_trim_end | String | ||
gene_transcript_lookup_table | File | ||
trimming_adapter_min_overlap | Integer | ||
agfusion_annotate_noncanonical | Boolean (Optional) |
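The readgroup format described for the unaligned input (space-separated key:value pairs, double quotes around values containing spaces, ID as the first key) can be checked before a job is submitted. The following is a minimal sketch, not the workflow's own parsing code: the function name parse_readgroup is hypothetical, and using shlex for the quoting rules is an assumption about how the quoted values are meant to be read.

```python
import shlex


def parse_readgroup(readgroup: str) -> dict:
    """Split a readgroup string into an ordered dict of key -> value.

    Assumption: double-quoted spans follow shell-like quoting, so
    shlex.split keeps a quoted value with spaces in a single field.
    """
    fields = shlex.split(readgroup)
    if not fields:
        raise ValueError("empty readgroup string")

    pairs = {}
    for position, field in enumerate(fields):
        # Split on the first colon only, so values may contain colons.
        key, sep, value = field.partition(":")
        if not sep or not key:
            raise ValueError(f"not a key:value pair: {field!r}")
        if position == 0 and key != "ID":
            raise ValueError("the first key must be ID")
        pairs[key] = value
    return pairs


# Example values are placeholders, not taken from a real run.
example = parse_readgroup(
    'ID:rg1 PU:unit1 SM:"tumor sample" LB:lib1 PL:ILLUMINA CN:WUGSC'
)
```

Running a check like this on each element of the unaligned array catches a missing ID key or an unquoted space before any alignment work starts.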
Steps
ID | Runs | Label | Doc |
---|---|---|---|
agfusion | ../tools/agfusion.cwl (CommandLineTool) | A tool that annotates STAR gene fusion predictions | |
kallisto | ../tools/kallisto.cwl (CommandLineTool) | Kallisto: Quant | |
mark_dup | ../tools/mark_duplicates_and_sort.cwl (CommandLineTool) | Mark duplicates and Sort | |
sort_bam | ../tools/samtools_sort.cwl (CommandLineTool) | samtools sort | |
index_bam | ../tools/index_bam.cwl (CommandLineTool) | samtools index | |
stringtie | ../tools/stringtie.cwl (CommandLineTool) | StringTie | |
index_cram | ../tools/index_cram.cwl (CommandLineTool) | samtools index cram | |
bam_to_cram | ../tools/bam_to_cram.cwl (CommandLineTool) | BAM to CRAM conversion | |
star_align_fusion | ../tools/star_align_fusion.cwl (CommandLineTool) | STAR: align reads to transcriptome | |
star_fusion_detect | ../tools/star_fusion_detect.cwl (CommandLineTool) | STAR-Fusion identify candidate fusion transcript | |
strandedness_check | ../tools/strandedness_check.cwl (CommandLineTool) | Runs how_are_we_stranded_here to determine RNA-seq data strandedness | Uses how_are_we_stranded_here, a Python package for testing strandedness. It runs Kallisto and RSeQC (infer_experiment.py) to check which direction reads align once mapped to transcripts. It first creates a Kallisto index (or uses a pre-made index) of your organism's transcriptome. It then maps a small subset of reads (default 200000) to the transcriptome and uses Kallisto's --genomebam argument to project the pseudoalignments onto the genome as a sorted BAM file. (Currently only Kallisto version 0.44.0 works well with how_are_we_stranded_here.) Finally, it runs RSeQC's infer_experiment.py to check which direction reads from the first and second pairs are aligned in relation to the transcript strand, and reports the likely strandedness of your data. |
transcript_to_gene | ../tools/transcript_to_gene.cwl (CommandLineTool) | Kallisto: TranscriptToGene | |
generate_qc_metrics | ../tools/generate_qc_metrics.cwl (CommandLineTool) | Picard: RNA Seq Metrics | |
cgpbigwig_bamcoverage | ../tools/bam_to_bigwig.cwl (CommandLineTool) | cgpBigWig Converting BAM to BigWig | |
sequence_to_trimmed_fastq | | sequence (bam or fastqs) to trimmed fastqs | |
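The last stage of the strandedness_check step, turning the two read-direction fractions into a strandedness call, can be illustrated as follows. This is a hedged sketch rather than the logic of how_are_we_stranded_here itself: the function name, the output labels, and both cutoffs (0.9 and 0.1) are hypothetical values chosen for illustration, and the inputs are assumed to be the fractions of reads whose first mate aligns to the same strand as the transcript and to the opposite strand.

```python
def call_strandedness(same_strand: float, opposite_strand: float,
                      stranded_cutoff: float = 0.9,
                      unstranded_band: float = 0.1) -> str:
    """Classify a library from two read-direction fractions.

    Both cutoffs are illustrative assumptions, not values taken
    from the workflow or from how_are_we_stranded_here.
    """
    total = same_strand + opposite_strand
    if total <= 0:
        raise ValueError("no reads with a determined direction")

    # Renormalise so reads of undetermined direction are ignored.
    same = same_strand / total

    if same >= stranded_cutoff:
        return "read1_same_strand"
    if 1.0 - same >= stranded_cutoff:
        return "read1_opposite_strand"
    if abs(same - 0.5) <= unstranded_band:
        return "unstranded"
    # Neither clearly stranded nor clearly balanced.
    return "undetermined"


# Placeholder fractions, not results from real data.
call_a = call_strandedness(0.02, 0.96)
call_b = call_strandedness(0.49, 0.48)
call_c = call_strandedness(0.70, 0.30)
```

An "undetermined" result is kept separate from "unstranded" so that an ambiguous library prompts a manual look instead of silently picking a strand setting.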
Outputs
ID | Type | Label | Doc |
---|---|---|---|
cram | File | ||
chart | File | ||
metrics | File | ||
final_bam | File | ||
strand_info | File[] | ||
gene_abundance | File | ||
fusion_evidence | File | ||
star_fusion_log | File | ||
star_fusion_out | File | ||
star_junction_out | File | ||
bamcoverage_bigwig | File | ||
star_fusion_abridge | File | ||
star_fusion_predict | File | ||
coding_region_effects | File (Optional) | ||
transcript_abundance_h5 | File | ||
fusioninspector_evidence | File[] (Optional) | ||
stringtie_transcript_gtf | File | ||
transcript_abundance_tsv | File | ||
annotated_fusion_predictions | Directory | ||
stringtie_gene_expression_tsv | File |
https://w3id.org/cwl/view/git/bfcb5ffbea3d00a38cc03595d41e53ea976d599d/definitions/pipelines/rnaseq_star_fusion.cwl