Explore Workflows


Graph Name Retrieved From View
workflow graph WGS and MT analysis for fastq files

rna / protein - qc, preprocess, filter, annotation, index, abundance

https://github.com/MG-RAST/pipeline.git

Path: CWL/Workflows/wgs-noscreen-fastq.workflow.cwl

Branch/Commit ID: 81feefc84ec0faecf1ade718001d5f07610e616e

workflow graph GAT - Genomic Association Tester

GAT: Genomic Association Tester

A common question in genomic analysis is whether two sets of genomic intervals overlap significantly. This question arises, for example, in the interpretation of ChIP-Seq or RNA-Seq data. The Genomic Association Tester (GAT) is a tool for computing the significance of overlap between multiple sets of genomic intervals. GAT estimates significance based on simulation.

GAT implements a sampling algorithm. Given a chromosome (workspace) and segments of interest, for example from a ChIP-Seq experiment, GAT creates randomized versions of the segments of interest falling into the workspace. These sampled segments are then compared to existing genomic annotations.

The sampling method is conceptually simple. Randomized samples of the segments of interest are created in a two-step procedure. First, a segment size is selected from the same size distribution as the original segments of interest. Second, a random position is assigned to the segment. The sampling stops when exactly the same number of nucleotides has been sampled. To improve the speed of sampling, segment overlap is not resolved until the very end of the sampling procedure. Conflicts are then resolved by randomly removing and re-sampling segments until a covering set has been achieved.

Because the size of randomized segments is derived from the observed segment size distribution of the segments of interest, the actual segment sizes in the sampled segments are usually not exactly identical to the ones in the segments of interest. This is in contrast to a sampling method that permutes segment positions within the workspace.
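The sampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GAT's actual implementation: the function name `sample_segments`, the half-open `(start, end)` tuples, the truncation of the last segment to reach the exact nucleotide count, and the `max_tries` guard are all hypothetical choices made for this example.

```python
import random

def sample_segments(workspace, segments, rng, max_tries=10000):
    """Draw one randomized, non-overlapping version of `segments` inside `workspace`.

    Intervals are half-open (start, end) tuples. All names and the
    truncation / max_tries details are assumptions for illustration.
    """
    ws_start, ws_end = workspace
    sizes = [end - start for start, end in segments]
    target = sum(sizes)  # number of nucleotides that must be sampled

    # Step 1 and 2: pick a size from the observed distribution, then a position.
    sampled, covered = [], 0
    while covered < target:
        # Assumption: truncate the last segment so the total matches exactly.
        size = min(rng.choice(sizes), target - covered)
        start = rng.randrange(ws_start, ws_end - size + 1)
        sampled.append((start, start + size))
        covered += size

    # Overlaps are resolved only at the end: remove a clashing segment
    # at random and re-sample its position until no overlaps remain.
    for _ in range(max_tries):
        ordered = sorted(sampled)
        clashes = [b for a, b in zip(ordered, ordered[1:]) if a[1] > b[0]]
        if not clashes:
            return ordered
        victim = rng.choice(clashes)
        sampled.remove(victim)
        size = victim[1] - victim[0]
        start = rng.randrange(ws_start, ws_end - size + 1)
        sampled.append((start, start + size))
    raise RuntimeError("could not place segments without overlap")

if __name__ == "__main__":
    observed = [(100, 200), (500, 650), (900, 1000)]
    print(sample_segments((0, 10000), observed, random.Random(0)))
```

Because sizes are drawn with replacement from the observed distribution, the individual sampled sizes generally differ from the observed ones even though the total number of nucleotides is the same, which matches the contrast with position-permuting methods noted above.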

https://github.com/datirium/workflows.git

Path: workflows/gat-run.cwl

Branch/Commit ID: 7fb8a1ebf8145791440bc2fed9c5f2d78a19d04c

workflow graph PCA - Principal Component Analysis

Principal Component Analysis

Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables (entities each of which takes on various numerical values) into a set of values of linearly uncorrelated variables called principal components. The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using `eigen` on the covariance matrix. This is generally the preferred method for numerical accuracy.
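As a rough illustration of the SVD-based calculation described above, the following Python sketch centers (and optionally scales) a data matrix and derives the principal components from its singular value decomposition. It assumes NumPy is available; the function name `pca_svd` and its return values are hypothetical and are not taken from the workflow itself.

```python
import numpy as np

def pca_svd(data, scale=False):
    """PCA via SVD of the centered (optionally scaled) data matrix.

    Rows are observations, columns are variables. Returns the scores,
    the loadings (one principal axis per column), and the standard
    deviation of each principal component. Names are illustrative only.
    """
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=0)                 # center each variable
    if scale:
        x = x / x.std(axis=0, ddof=1)      # optional unit-variance scaling
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    scores = u * s                         # observations in component space
    sdev = s / np.sqrt(x.shape[0] - 1)     # component standard deviations
    return scores, vt.T, sdev

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scores, loadings, sdev = pca_svd(rng.normal(size=(20, 3)))
    print(sdev)
```

The squared values in `sdev` equal the eigenvalues of the sample covariance matrix, so the result agrees with the eigendecomposition route while never forming the covariance matrix explicitly.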

https://github.com/datirium/workflows.git

Path: workflows/pca.cwl

Branch/Commit ID: 10ce6e113f749c7bd725e426445220c3bdc5ddf1

workflow graph RNA-Seq pipeline paired-end stranded mitochondrial

A slightly modified version of the original [BioWardrobe's](https://biowardrobe.com) [PubMed ID:26248465](https://www.ncbi.nlm.nih.gov/pubmed/26248465) **RNA-Seq** basic analysis for **strand-specific paired-end** experiments. Additional steps were added to map data to the mitochondrial chromosome only and then merge the output.

Experiment files in [FASTQ](http://maq.sourceforge.net/fastq.shtml) format, either compressed or not, can be used. The current workflow should be used only with paired-end strand-specific RNA-Seq data. It performs the following steps:

1. `STAR` to align reads from the input FASTQ file according to the predefined reference indices; generate an unsorted BAM file and an alignment statistics file
2. `fastx_quality_stats` to analyze the input FASTQ file and generate a quality statistics file
3. `samtools sort` to generate a coordinate-sorted BAM(+BAI) file pair from the unsorted BAM file obtained in step 1 (after running STAR)
5. Generate a BigWig file based on the sorted BAM file
6. Map the input FASTQ file to predefined rRNA reference indices using Bowtie to determine the level of rRNA contamination; export the resulting statistics to a file
7. Calculate isoform expression levels for the sorted BAM file and GTF/TAB annotation file using the `GEEP` reads-counting utility; export the results to a file

https://github.com/datirium/workflows.git

Path: workflows/rnaseq-pe-dutp-mitochondrial.cwl

Branch/Commit ID: 91bb63948c0a264334b9007ef85f936768d90d11

workflow graph final_filtering

Final filtering

https://gitlab.bsc.es/lrodrig1/structuralvariants_poc.git

Path: structuralvariants/cwl/subworkflows/final_filtering.cwl

Branch/Commit ID: b62c7bfcf5eb7ac3c1ed06879200fdf5db947e4b

workflow graph indexing_bed

https://gitlab.bsc.es/lrodrig1/structuralvariants_poc.git

Path: structuralvariants/cwl/subworkflows/indexing_bed.cwl

Branch/Commit ID: de9cb009f8fe0c8d5a94db5c882cf21ddf372452

workflow graph genome-indices.cwl

Generates genome indices for STAR v2.5.3a (03/17/2017) & bowtie v1.2.0 (12/30/2016).

https://github.com/datirium/workflows.git

Path: workflows/genome-indices.cwl

Branch/Commit ID: cf107bc24a37883ef01b959fd89c19456aaecc02

workflow graph genome-kallisto-index.cwl

Generates a FASTA file with the DNA sequences for all transcripts in a GFF file and builds a kallisto index.

https://github.com/Barski-lab/workflows.git

Path: tools/genome-kallisto-index.cwl

Branch/Commit ID: 12edfc2207507e53c6b5bb21e50decb5535a12f7

workflow graph count-lines3-wf.cwl

https://github.com/common-workflow-language/cwltool.git

Path: cwltool/schemas/v1.0/v1.0/count-lines3-wf.cwl

Branch/Commit ID: e2ec740fccc81ff7071dcd607c5c158fbc0dfb90

workflow graph chipseq-gen-bigwig.cwl

https://github.com/datirium/workflows.git

Path: subworkflows/chipseq-gen-bigwig.cwl

Branch/Commit ID: ae2b231562822ed66b8e35e5452ae7f012416b2a