Explore Workflows

View already parsed workflows here or click here to add your own

Graph Name Retrieved From View
workflow graph env-wf1.cwl

https://github.com/common-workflow-language/cwl-v1.2.git

Path: tests/env-wf1.cwl

Branch/Commit ID: e62f99dd79d6cb9c157cceb458f74200da84f6e9

workflow graph phase VCF

https://github.com/genome/analysis-workflows.git

Path: definitions/subworkflows/phase_vcf.cwl

Branch/Commit ID: c711498c04d6b8ddf92ddceb6219f074765f7993

workflow graph default-wf5.cwl

https://github.com/common-workflow-language/cwltool.git

Path: tests/wf/default-wf5.cwl

Branch/Commit ID: eba80916b5cde8bdbd56c077c94240ddf796a27b

workflow graph Build STAR indices

Workflow runs [STAR](https://github.com/alexdobin/STAR) v2.5.3a (03/17/2017) PMID: [23104886](https://www.ncbi.nlm.nih.gov/pubmed/23104886) to build indices for reference genome provided in a single FASTA file as fasta_file input and GTF annotation file from annotation_gtf_file input. Generated indices are saved in a folder with the name that corresponds to the input genome.

https://github.com/datirium/workflows.git

Path: workflows/star-index.cwl

Branch/Commit ID: 730b40bc403263b724399a952c0f3e2d28f13519

workflow graph DESeq2 (LRT) - differential gene expression analysis using likelihood ratio test

Runs DESeq2 using LRT (Likelihood Ratio Test)
=============================================

The LRT examines two models for the counts: a full model with a certain number of terms and a reduced model in which some of the terms of the full model are removed. The test determines if the increased likelihood of the data using the extra terms in the full model is more than expected if those extra terms are truly zero.

The LRT is therefore useful for testing multiple terms at once, for example testing 3 or more levels of a factor at once, or all interactions between two variables. The LRT for count data is conceptually similar to an analysis of variance (ANOVA) calculation in linear regression, except that in the case of the Negative Binomial GLM, we use an analysis of deviance (ANODEV), where the deviance captures the difference in likelihood between a full and a reduced model.

When one performs a likelihood ratio test, the p values and the test statistic (the stat column) are values for the test that removes all of the variables which are present in the full design and not in the reduced design. This tests the null hypothesis that all the coefficients from these variables and levels of these factors are equal to zero.

The likelihood ratio test p values therefore represent a test of all the variables and all the levels of factors which are among these variables. However, the results table only has space for one column of log fold change, so a single variable and a single comparison is shown (among the potentially multiple log fold changes which were tested in the likelihood ratio test). This means that the p value is for the likelihood ratio test of all the variables and all the levels, while the log fold change is a single comparison from among those variables and levels.

**Technical notes**

1. At least two biological replicates are required for every compared category.
2. The metadata file describes relations between compared experiments, for example

   ```
   ,time,condition
   DH1,day5,WT
   DH2,day5,KO
   DH3,day7,WT
   DH4,day7,KO
   DH5,day7,KO
   ```

   where `time, condition, day5, day7, WT, KO` should be single words (without spaces) and `DH1, DH2, DH3, DH4, DH5` correspond to the experiment aliases set in the **RNA-Seq experiments** input.
3. Design and reduced formulas should start with **~** and include categories or, optionally, their interactions from the metadata file header. See details in the DESeq2 manual [here](https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions) and [here](https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#likelihood-ratio-test).
4. The contrast should be set based on your metadata file header and available categories in the form `Factor Numerator Denominator`, where `Factor` is a column name from the metadata file, `Numerator` is the category from the metadata file to be used as the numerator in the fold change calculation, and `Denominator` is the category from the metadata file to be used as the denominator in the fold change calculation. For example, `condition WT KO`.
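The metadata layout and the `Factor Numerator Denominator` contrast form from the technical notes can be checked before a run with a short script. The following is a minimal sketch, not part of the workflow itself: the helper names `load_metadata`, `check_contrast`, and `replicate_counts` are hypothetical, and the sample data is the example table from the description.

```python
import csv
import io

# Example metadata copied from the workflow description.
METADATA = """,time,condition
DH1,day5,WT
DH2,day5,KO
DH3,day7,WT
DH4,day7,KO
DH5,day7,KO
"""


def load_metadata(text):
    """Parse the metadata CSV into {factor: {experiment_alias: category}}."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    factors = rows[0][1:]  # the first header cell is empty
    table = {factor: {} for factor in factors}
    for row in rows[1:]:
        alias, values = row[0], row[1:]
        for factor, value in zip(factors, values):
            table[factor][alias] = value
    return table


def check_contrast(contrast, table):
    """Validate a 'Factor Numerator Denominator' string against the metadata."""
    parts = contrast.split()
    if len(parts) != 3:
        raise ValueError("contrast needs exactly three space-separated words")
    factor, numerator, denominator = parts
    if factor not in table:
        raise ValueError(f"unknown factor: {factor}")
    categories = set(table[factor].values())
    for name in (numerator, denominator):
        if name not in categories:
            raise ValueError(f"{name} is not a category of {factor}")
    return factor, numerator, denominator


def replicate_counts(table, factor):
    """Count experiments per category, to check the two-replicate rule."""
    counts = {}
    for category in table[factor].values():
        counts[category] = counts.get(category, 0) + 1
    return counts


if __name__ == "__main__":
    table = load_metadata(METADATA)
    print(check_contrast("condition WT KO", table))
    print(replicate_counts(table, "condition"))
```

Counting replicates per category makes the first technical note easy to verify: any category with a count below two would violate the replicate requirement.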

https://github.com/datirium/workflows.git

Path: workflows/deseq-lrt.cwl

Branch/Commit ID: e99e80a2c19682d59947bde04a892d7b6d90091c

workflow graph GAT - Genomic Association Tester

GAT: Genomic Association Tester
==============================================

A common question in genomic analysis is whether two sets of genomic intervals overlap significantly. This question arises, for example, in the interpretation of ChIP-Seq or RNA-Seq data. The Genomic Association Tester (GAT) is a tool for computing the significance of overlap between multiple sets of genomic intervals. GAT estimates significance based on simulation.

GAT implements a sampling algorithm. Given a chromosome (workspace) and segments of interest, for example from a ChIP-Seq experiment, GAT creates randomized versions of the segments of interest falling into the workspace. These sampled segments are then compared to existing genomic annotations.

The sampling method is conceptually simple. Randomized samples of the segments of interest are created in a two-step procedure. Firstly, a segment size is selected from the same size distribution as the original segments of interest. Secondly, a random position is assigned to the segment. The sampling stops when exactly the same number of nucleotides have been sampled. To improve the speed of sampling, segment overlap is not resolved until the very end of the sampling procedure. Conflicts are then resolved by randomly removing and re-sampling segments until a covering set has been achieved.

Because the size of randomized segments is derived from the observed segment size distribution of the segments of interest, the actual segment sizes in the sampled segments are usually not exactly identical to the ones in the segments of interest. This is in contrast to a sampling method that permutes segment positions within the workspace.
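The two-step sampling idea described above can be illustrated with a small sketch. This is not GAT's implementation: the function name `sample_segments` is hypothetical, the workspace is assumed to be a single half-open interval, the input segments are assumed not to overlap each other, overlaps are rejected immediately rather than resolved at the end, and the last segment is trimmed so that the nucleotide total matches exactly.

```python
import random


def sample_segments(segments, workspace, rng, max_attempts=100000):
    """Draw randomized segments covering as many bases as `segments`.

    `segments` is a list of half-open (start, end) intervals and
    `workspace` is one half-open (start, end) interval.
    """
    ws_start, ws_end = workspace
    sizes = [end - start for start, end in segments]
    target = sum(sizes)
    if target > ws_end - ws_start:
        raise ValueError("segments cover more bases than the workspace holds")

    sampled = []
    remaining = target
    attempts = 0
    while remaining > 0:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("could not place all segments")
        # Step 1: draw a size from the observed size distribution
        # (simplification: trim the final segment to hit the exact total).
        size = min(rng.choice(sizes), remaining)
        # Step 2: assign a uniformly random position inside the workspace.
        start = rng.randrange(ws_start, ws_end - size + 1)
        end = start + size
        # Simplification: reject overlapping placements right away,
        # instead of resolving conflicts after sampling has finished.
        if any(start < e and s < end for s, e in sampled):
            continue
        sampled.append((start, end))
        remaining -= size
    return sorted(sampled)


if __name__ == "__main__":
    observed = [(10, 20), (50, 80), (200, 260)]
    print(sample_segments(observed, (0, 10000), random.Random(0)))
```

Because sizes are drawn with replacement from the observed lengths, the sampled segments generally differ in number and individual size from the input while covering the same number of nucleotides, which matches the contrast with position permutation noted in the description.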

https://github.com/datirium/workflows.git

Path: workflows/gat-run.cwl

Branch/Commit ID: e99e80a2c19682d59947bde04a892d7b6d90091c

workflow graph metabarcode (gene amplicon) analysis for fastq files

protein - qc, preprocess, annotation, index, abundance

https://github.com/MG-RAST/pipeline.git

Path: CWL/Workflows/metabarcode-fastq.workflow.cwl

Branch/Commit ID: 3e967f035c10a176b9457331df0b3374a8562b26

workflow graph CLE gold vcf evaluation workflow

https://github.com/genome/analysis-workflows.git

Path: definitions/subworkflows/vcf_eval_cle_gold.cwl

Branch/Commit ID: 3bebaf9b70331de9f4845e2223c55082f5a812fb

workflow graph gather AML trio outputs

https://github.com/genome/analysis-workflows.git

Path: definitions/pipelines/aml_trio_cle_gathered.cwl

Branch/Commit ID: 3f3b186da9bf82a5e2ae74ba27aef35a46174ebe

workflow graph dragen-germline-pipeline__4.2.4.cwl

https://github.com/umccr/cwl-ica.git

Path: workflows/dragen-germline-pipeline/4.2.4/dragen-germline-pipeline__4.2.4.cwl

Branch/Commit ID: 5516c2a252c9f167b83df99785de5d3451b65e00