Explore Workflows

View already parsed workflows here or click here to add your own

Graph Name Retrieved From View
workflow graph downsample unaligned BAM and align

https://github.com/genome/analysis-workflows.git

Path: definitions/subworkflows/downsampled_alignment.cwl

Branch/Commit ID: 0c4f4e59c265eb22aed3d2d37b173cb5430773d2

workflow graph DESeq2 (LRT) - differential gene expression analysis using likelihood ratio test

Runs DESeq2 using LRT (Likelihood Ratio Test)
=============================================

The LRT examines two models for the counts, a full model with a certain number of terms and a reduced model, in which some of the terms of the full model are removed. The test determines if the increased likelihood of the data using the extra terms in the full model is more than expected if those extra terms are truly zero. The LRT is therefore useful for testing multiple terms at once, for example testing 3 or more levels of a factor at once, or all interactions between two variables.

The LRT for count data is conceptually similar to an analysis of variance (ANOVA) calculation in linear regression, except that in the case of the Negative Binomial GLM, we use an analysis of deviance (ANODEV), where the deviance captures the difference in likelihood between a full and a reduced model.

When one performs a likelihood ratio test, the p values and the test statistic (the stat column) are values for the test that removes all of the variables which are present in the full design and not in the reduced design. This tests the null hypothesis that all the coefficients from these variables and levels of these factors are equal to zero. The likelihood ratio test p values therefore represent a test of all the variables and all the levels of factors which are among these variables. However, the results table only has space for one column of log fold change, so a single variable and a single comparison is shown (among the potentially multiple log fold changes which were tested in the likelihood ratio test). This means that the p value is for the likelihood ratio test of all the variables and all the levels, while the log fold change is a single comparison from among those variables and levels.

**Technical notes**

1. At least two biological replicates are required for every compared category.
2. The metadata file describes relations between the compared experiments, for example

   ```
   ,time,condition
   DH1,day5,WT
   DH2,day5,KO
   DH3,day7,WT
   DH4,day7,KO
   DH5,day7,KO
   ```

   where `time, condition, day5, day7, WT, KO` should each be a single word (without spaces) and `DH1, DH2, DH3, DH4, DH5` correspond to the experiment aliases set in the **RNA-Seq experiments** input.
3. Design and reduced formulas should start with **~** and include categories or, optionally, their interactions from the metadata file header. See details in the DESeq2 manual [here](https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions) and [here](https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#likelihood-ratio-test).
4. The contrast should be set based on your metadata file header and available categories, in the form `Factor Numerator Denominator`, where `Factor` is a column name from the metadata file, `Numerator` is the category from the metadata file to be used as the numerator in the fold change calculation, and `Denominator` is the category from the metadata file to be used as the denominator in the fold change calculation. For example `condition WT KO`.
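The metadata file and contrast format described in the technical notes can be checked before a run. The following is a minimal sketch, not part of the workflow itself: the function names `load_metadata` and `parse_contrast` are hypothetical, and the replicate check counts rows per category within the contrast factor only, which is one possible reading of technical note 1.

```python
import csv
import io
from collections import Counter

# Example metadata copied from the workflow description.
METADATA = """,time,condition
DH1,day5,WT
DH2,day5,KO
DH3,day7,WT
DH4,day7,KO
DH5,day7,KO
"""


def load_metadata(text):
    """Return (factors, rows); rows maps experiment alias -> {factor: level}."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    factors = header[1:]  # the first header cell is empty (alias column)
    rows = {}
    for record in reader:
        if record:  # skip blank lines
            rows[record[0]] = dict(zip(factors, record[1:]))
    return factors, rows


def parse_contrast(contrast, factors, rows):
    """Split 'Factor Numerator Denominator' and check it against the metadata."""
    parts = contrast.split()
    if len(parts) != 3:
        raise ValueError("contrast must have the form 'Factor Numerator Denominator'")
    factor, numerator, denominator = parts
    if factor not in factors:
        raise ValueError(f"unknown factor: {factor}")
    # Assumed reading of note 1: each compared category needs >= 2 experiments.
    levels = Counter(row[factor] for row in rows.values())
    for level in (numerator, denominator):
        if levels[level] < 2:
            raise ValueError(f"category {level!r} needs at least two replicates")
    return factor, numerator, denominator


factors, rows = load_metadata(METADATA)
print(parse_contrast("condition WT KO", factors, rows))
```

A category that does not appear in the metadata has a count of zero, so it is rejected by the same replicate check.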

https://github.com/datirium/workflows.git

Path: workflows/deseq-lrt.cwl

Branch/Commit ID: 36fd18f11e939d3908b1eca8d2939402f7a99b0f

workflow graph gathered exome alignment and somatic variant detection for cle purpose

https://github.com/genome/analysis-workflows.git

Path: definitions/pipelines/somatic_exome_cle_gathered.cwl

Branch/Commit ID: cc3e7f1ccfdc7101c22bf88792608504eea7d53a

workflow graph count-lines13-wf.cwl

https://github.com/common-workflow-language/cwltool.git

Path: cwltool/schemas/v1.0/v1.0/count-lines13-wf.cwl

Branch/Commit ID: bbe20f54deea92d9c9cd38cb1f23c4423133d3de

workflow graph steplevel-resreq.cwl

https://github.com/common-workflow-language/cwltool.git

Path: cwltool/schemas/v1.0/v1.0/steplevel-resreq.cwl

Branch/Commit ID: 1eb6bfe3c77aebaf69453a669d21ae7a5a78056f

workflow graph miRNA-Seq miRDeep2 pipeline

A CWL workflow for discovering known or novel miRNAs from deep sequencing data using the miRDeep2 tool. The ExoCarta exosome database is also used for identifying exosome-related miRNAs, and TargetScan's organism-specific databases are used for identifying miRNA gene targets.

## __Outputs__

#### Primary Output files:

- mirs_known.tsv, detected known mature miRNAs, "Known miRNAs" tab
- mirs_novel.tsv, detected novel mature miRNAs, "Novel miRNAs" tab

#### Secondary Output files:

- mirs_known_exocarta_deepmirs.tsv, list of detected miRNAs also in ExoCarta's exosome database, "Detected Exosome miRNAs" tab
- mirs_known_gene_targets.tsv, pre-computed gene targets of known mature mirs, downloadable
- known_mirs_mature.fa, known mature mir sequences, downloadable
- known_mirs_precursor.fa, known precursor mir sequences, downloadable
- novel_mirs_mature.fa, novel mature mir sequences, downloadable
- novel_mirs_precursor.fa, novel precursor mir sequences, downloadable

#### Reports:

- overview.md (input list, alignment & mir metrics), "Overview" tab
- mirdeep2_result.html, summary of mirdeep2 results, "miRDeep2 Results" tab

## __Inputs__

#### General Info

- Sample short name/Alias: unique name for sample
- Experimental condition: condition, variable, etc. name (e.g. "control" or "20C 60min")
- Cells: name of cells used for the sample
- Catalog No.: vendor catalog number if available
- Bowtie2 index: Bowtie2 index directory of the reference genome.
- Reference Genome FASTA: Reference genome FASTA file to be used for alignment.
- Genome short name: Name used for setting organism name, genus, species, and tax ID.
- Input FASTQ file: FASTQ file from a single-end miRNA sequencing run.

#### Advanced

- Adapter: Adapter sequence to be trimmed from miRNA sequence reads. (Default: TCGTAT)
- Threads: Number of threads to use for steps that support multithreading (Default: 4).

## Hints & Tips:

#### For the identification of novel miRNA candidates, the following may be used as a filtering guideline:

1. miRDeep score > 4 (some authors use 1)
2. no match with Rfam
3. a significant RNAfold result ("yes")
4. number of mature reads > 10
5. if applicable, the novel mir must be expressed in multiple samples

#### For filtering miRBase by organism.

| genome | organism | division | name | tree | NCBI-taxid |
| ---- | --- | --- | ----------- | ----------- | ----------- |
| hg19 | hsa | HSA | Homo sapiens | Metazoa;Bilateria;Deuterostoma;Chordata;Vertebrata;Mammalia;Primates;Hominidae | 9606 |
| hg38 | hsa | HSA | Homo sapiens | Metazoa;Bilateria;Deuterostoma;Chordata;Vertebrata;Mammalia;Primates;Hominidae | 9606 |
| mm10 | mmu | MMU | Mus musculus | Metazoa;Bilateria;Deuterostoma;Chordata;Vertebrata;Mammalia;Rodentia | 10090 |
| rn7 | rno | RNO | Rattus norvegicus | Metazoa;Bilateria;Deuterostoma;Chordata;Vertebrata;Mammalia;Rodentia | 10116 |
| dm3 | dme | DME | Drosophila melanogaster | Metazoa;Bilateria;Ecdysozoa;Arthropoda;Hexapoda | 7227 |

## __Data Analysis Steps__

1. The miRDeep2 Mapper module processes Illumina FASTQ output and maps it to the reference genome.
2. The miRDeep2 miRDeep2 module identifies known and novel (mature and precursor) miRNAs.
3. The ExoCarta database of miRNAs found in exosomes is then used to find overlap between mirs_known.tsv and exosome-associated miRNAs.
4. Finally, the TargetScan organism-specific miRNA gene target database is used to find overlap between mirs_known.tsv and gene targets.

## __References__

1. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245920
2. https://github.com/rajewsky-lab/mirdeep2
3. https://biocontainers.pro/tools/mirdeep2
4. https://www.mirbase.org/
5. http://exocarta.org/index.html
6. https://www.targetscan.org/vert_80/
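The novel-miRNA filtering guideline listed under Hints & Tips can be expressed as a simple predicate. This is only a sketch: the dictionary keys (`score`, `rfam_match`, `significant_fold`, `mature_reads`, `samples_expressed`) are hypothetical names chosen for illustration and do not correspond to the actual column names in mirs_novel.tsv. The thresholds are the ones quoted in the guideline.

```python
def passes_novel_filter(candidate, min_score=4.0, min_mature_reads=10,
                        min_samples=None):
    """Apply the filtering guideline to one novel miRNA candidate.

    `candidate` is a dict with hypothetical keys; adapt them to the real
    column names of the table being filtered.
    """
    if candidate["score"] <= min_score:           # 1. miRDeep score > 4
        return False
    if candidate["rfam_match"]:                   # 2. no match with Rfam
        return False
    if candidate["significant_fold"] != "yes":    # 3. significant fold ("yes")
        return False
    if candidate["mature_reads"] <= min_mature_reads:  # 4. mature reads > 10
        return False
    # 5. optional: expressed in multiple samples (only if requested)
    if min_samples is not None and candidate.get("samples_expressed", 0) < min_samples:
        return False
    return True


candidates = [
    {"id": "novel_1", "score": 5.2, "rfam_match": False,
     "significant_fold": "yes", "mature_reads": 42, "samples_expressed": 3},
    {"id": "novel_2", "score": 2.1, "rfam_match": False,
     "significant_fold": "yes", "mature_reads": 80, "samples_expressed": 3},
    {"id": "novel_3", "score": 6.0, "rfam_match": True,
     "significant_fold": "yes", "mature_reads": 15, "samples_expressed": 1},
]

kept = [c["id"] for c in candidates if passes_novel_filter(c, min_samples=2)]
print(kept)
```

Passing `min_score=1.0` reproduces the looser cutoff that, per the guideline, some authors use.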

https://github.com/datirium/workflows.git

Path: workflows/mirna-mirdeep2-se.cwl

Branch/Commit ID: 57863b6131d8262c5ce864adaf8e4038401e71a2

workflow graph star-index.cwl

Generates indices for STAR v2.5.3a (03/17/2017).

https://github.com/datirium/workflows.git

Path: workflows/star-index.cwl

Branch/Commit ID: e284e3f6dff25037b209895c52f2abd37a1ce1bf

workflow graph scatter-valuefrom-wf1.cwl

https://github.com/common-workflow-language/cwl-v1.1.git

Path: tests/scatter-valuefrom-wf1.cwl

Branch/Commit ID: 50251ef931d108c09bed2d330d3d4fe9c562b1c3

workflow graph heatmap-prepare.cwl

Workflow runs the homer-make-tag-directory.cwl tool using scatter for the following inputs:

- bam_file
- fragment_size
- total_reads

`dotproduct` is used as the `scatterMethod`, so one element is taken from each array to construct each job:

1) bam_file[0] fragment_size[0] total_reads[0]
2) bam_file[1] fragment_size[1] total_reads[1]
...
N) bam_file[N] fragment_size[N] total_reads[N]

The `bam_file`, `fragment_size` and `total_reads` arrays must be in the same order.
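The `dotproduct` pairing described above behaves like an element-wise zip over the scattered arrays. The sketch below only illustrates that pairing in Python; it is not how a CWL runner is implemented, and the function name `dotproduct_jobs` and the example file names and values are made up for illustration.

```python
def dotproduct_jobs(**arrays):
    """Build one job per index by taking the i-th element of every array."""
    lengths = {len(values) for values in arrays.values()}
    if len(lengths) > 1:
        # With dotproduct, scattered arrays of different lengths are an error.
        raise ValueError("all scattered arrays must have the same length")
    names = list(arrays)
    return [dict(zip(names, values)) for values in zip(*arrays.values())]


# Hypothetical inputs: element i of each array must describe the same sample.
jobs = dotproduct_jobs(
    bam_file=["sample_a.bam", "sample_b.bam"],
    fragment_size=[150, 200],
    total_reads=[1000000, 2000000],
)
for job in jobs:
    print(job)
```

Because jobs are built purely by position, reordering one array without the others silently pairs a BAM file with the wrong fragment size and read count, which is why the three arrays need a consistent order.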

https://github.com/datirium/workflows.git

Path: tools/heatmap-prepare.cwl

Branch/Commit ID: 57863b6131d8262c5ce864adaf8e4038401e71a2

workflow graph workflow_input_sf_expr_v1_2.cwl

https://github.com/common-workflow-language/cwl-utils.git

Path: testdata/workflow_input_sf_expr_v1_2.cwl

Branch/Commit ID: 0ab1d42d10f7311bb4032956c4a6f3d2730d9507