STAR-RNA-Seq alignment and transcript/gene abundance workflow

Fetched 2023-01-04 18:44:32 GMT

Verified with cwltool version 3.1.20221201130942

This workflow is Open Source and may be reused according to the terms of: MIT License

Note that the tools invoked by the workflow may have separate licenses.

Inputs

ID	Type	Title	Doc
strand
refFlat	File
gtf_file	File
cdna_fasta	File
sample_name	String
kallisto_index	File
reference_fasta	File
star_genome_dir	Directory
trimming_adapters	File
outsam_attrrg_line	String[]
ribosomal_intervals	File
instrument_data_bams	File[]
trimming_max_uncalled	Integer
star_fusion_genome_dir	Directory
trimming_min_readlength	Integer
trimming_adapter_trim_end	String
gene_transcript_lookup_table	File
trimming_adapter_min_overlap	Integer
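
A job file for the inputs listed above could be assembled roughly as follows. This is a minimal sketch, not the authors' configuration: every path and value is a placeholder, the helper names `cwl_file`, `cwl_dir` and `build_job` are made up for illustration, and `strand` is left out because its type is not shown in the table. The `class`/`path` layout for File and Directory values follows the CWL input object convention.

```python
import json


def cwl_file(path):
    """Wrap a path as a CWL File input value (hypothetical helper)."""
    return {"class": "File", "path": path}


def cwl_dir(path):
    """Wrap a path as a CWL Directory input value (hypothetical helper)."""
    return {"class": "Directory", "path": path}


def build_job():
    """Return an input object keyed by the workflow's input IDs.

    All paths and values below are placeholders, not recommended settings.
    """
    return {
        "refFlat": cwl_file("/path/to/refFlat.txt"),
        "gtf_file": cwl_file("/path/to/annotation.gtf"),
        "cdna_fasta": cwl_file("/path/to/cdna.fa"),
        "sample_name": "sample1",
        "kallisto_index": cwl_file("/path/to/kallisto.idx"),
        "reference_fasta": cwl_file("/path/to/reference.fa"),
        "star_genome_dir": cwl_dir("/path/to/star_genome"),
        "trimming_adapters": cwl_file("/path/to/adapters.fa"),
        # String[]: placeholder read-group line
        "outsam_attrrg_line": ["ID:1 SM:sample1 LB:lib1 PL:ILLUMINA"],
        "ribosomal_intervals": cwl_file("/path/to/ribosomal.interval_list"),
        # File[]: one entry per input BAM
        "instrument_data_bams": [cwl_file("/path/to/lane1.bam")],
        "trimming_max_uncalled": 300,
        "star_fusion_genome_dir": cwl_dir("/path/to/star_fusion_genome"),
        "trimming_min_readlength": 25,
        "trimming_adapter_trim_end": "RIGHT",
        "gene_transcript_lookup_table": cwl_file("/path/to/tx2gene.tsv"),
        "trimming_adapter_min_overlap": 7,
    }


if __name__ == "__main__":
    # cwltool accepts JSON as well as YAML job files.
    with open("job.json", "w") as handle:
        json.dump(build_job(), handle, indent=2)
```

The resulting file can then be passed to cwltool together with the workflow, for example `cwltool rnaseq_star_fusion.cwl job.json`.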

Steps

ID	Runs	Label	Doc
kallisto	../tools/kallisto.cwl (CommandLineTool)	Kallisto: Quant
mark_dup	../tools/mark_duplicates_and_sort.cwl (CommandLineTool)	Mark duplicates and Sort
sort_bam	../tools/samtools_sort.cwl (CommandLineTool)	samtools sort
index_bam	../tools/index_bam.cwl (CommandLineTool)	samtools index
stringtie	../tools/stringtie.cwl (CommandLineTool)	StringTie
index_cram	../tools/index_cram.cwl (CommandLineTool)	samtools index cram
bam_to_cram	../tools/bam_to_cram.cwl (CommandLineTool)	BAM to CRAM conversion
star_align_fusion	../tools/star_align_fusion.cwl (CommandLineTool)	STAR: align reads to transcriptome
star_fusion_detect	../tools/star_fusion_detect.cwl (CommandLineTool)	STAR-Fusion identify candidate fusion transcript
strandedness_check	../tools/strandedness_check.cwl (CommandLineTool)	runs how_are_we_stranded_here to determine RNAseq data strandedness	Uses how_are_we_stranded_here, a Python package for testing strandedness. Runs Kallisto and RSeQC (infer_experiment.py) to check which direction reads align once mapped to transcripts. It first creates a Kallisto index (or uses a pre-made index) of your organism's transcriptome. It then maps a small subset of reads (default 200000) to the transcriptome and uses Kallisto's --genomebam argument to project the pseudoalignments to the genome as a sorted BAM file. (Currently only Kallisto version 0.44.0 works well with how_are_we_stranded_here.) Finally, it runs RSeQC's infer_experiment.py to check which direction reads from the first and second pairs are aligned in relation to the transcript strand, and reports the likely strandedness of your data.
transcript_to_gene	../tools/transcript_to_gene.cwl (CommandLineTool)	Kallisto: TranscriptToGene
generate_qc_metrics	../tools/generate_qc_metrics.cwl (CommandLineTool)	Picard: RNA Seq Metrics
bam_to_trimmed_fastq	../subworkflows/bam_to_trimmed_fastq.cwl (Workflow)	bam to trimmed fastqs
cgpbigwig_bamcoverage	../tools/bam_to_bigwig.cwl (CommandLineTool)	cgpBigWig Converting BAM to BigWig
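
The final decision made by the strandedness_check step can be illustrated with a small sketch. It assumes the two orientation fractions that infer_experiment.py reports for paired-end data ("1++,1--,2+-,2-+" and "1+-,1-+,2++,2--") have already been parsed. The function name, the labels and both cutoffs are illustrative assumptions, not values taken from this workflow or confirmed against how_are_we_stranded_here.

```python
def classify_strandedness(frac_fr, frac_rf,
                          stranded_cutoff=0.9, unstranded_cutoff=0.6):
    """Turn two orientation fractions into a strandedness label.

    frac_fr: fraction of reads explained by "1++,1--,2+-,2-+"
    frac_rf: fraction of reads explained by "1+-,1-+,2++,2--"
    The cutoffs are assumptions chosen for illustration only.
    """
    for value in (frac_fr, frac_rf):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")

    top = max(frac_fr, frac_rf)
    if top >= stranded_cutoff:
        # One orientation clearly dominates: the library is stranded.
        return "forward" if frac_fr > frac_rf else "reverse"
    if top <= unstranded_cutoff:
        # Neither orientation dominates: reads come from both strands.
        return "unstranded"
    # In between, the evidence is too weak to call either way.
    return "undetermined"
```

For example, `classify_strandedness(0.02, 0.96)` falls in the "reverse" branch because the second orientation explains nearly all reads.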

Outputs

ID	Type	Label	Doc
cram	File
chart	File
metrics	File
strand_info	File[]
gene_abundance	File
fusion_evidence	File
star_fusion_log	File
star_fusion_out	File
star_junction_out	File
bamcoverage_bigwig	File
star_fusion_abridge	File
star_fusion_predict	File
transcript_abundance_h5	File
stringtie_transcript_gtf	File
transcript_abundance_tsv	File
stringtie_gene_expression_tsv	File
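
After a run, the output object that cwltool prints as JSON on standard output can be checked against the IDs listed above. The sketch below assumes that JSON was saved to a file (the name `outputs.json` is a placeholder) and the function name `missing_outputs` is made up for illustration.

```python
import json

# Output IDs copied from the Outputs table of this workflow.
EXPECTED_OUTPUTS = [
    "cram", "chart", "metrics", "strand_info", "gene_abundance",
    "fusion_evidence", "star_fusion_log", "star_fusion_out",
    "star_junction_out", "bamcoverage_bigwig", "star_fusion_abridge",
    "star_fusion_predict", "transcript_abundance_h5",
    "stringtie_transcript_gtf", "transcript_abundance_tsv",
    "stringtie_gene_expression_tsv",
]


def missing_outputs(output_object):
    """Return the expected output IDs that are absent or null."""
    return [key for key in EXPECTED_OUTPUTS
            if output_object.get(key) is None]


if __name__ == "__main__":
    # Placeholder file name: e.g. saved with `cwltool ... > outputs.json`.
    with open("outputs.json") as handle:
        missing = missing_outputs(json.load(handle))
    print("missing outputs:", missing if missing else "none")
```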

Permalink: https://w3id.org/cwl/view/git/389f6edccab082d947bee9c032f59dbdf9f7c325/definitions/pipelines/rnaseq_star_fusion.cwl