Explore Workflows


Graph Name Retrieved From View
workflow graph cnv_exomedepth

CNV ExomeDepth calling

https://gitlab.bsc.es/lrodrig1/structuralvariants_poc.git

Path: structuralvariants/cwl/subworkflows/cnv_exome_depth.cwl

Branch/Commit ID: 3f6a871f81f343cf81a345f73ff2eeac70804b8c

workflow graph Cell Ranger Build Reference Indices

Devel version of Cell Ranger Build Reference Indices pipeline
=============================================================

https://github.com/datirium/workflows.git

Path: workflows/cellranger-mkref.cwl

Branch/Commit ID: 7ced5a5259dbd8b3fc64456beaeffd44f4a24081

workflow graph PCA - Principal Component Analysis

Principal Component Analysis
--------------

Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables (entities each of which takes on various numerical values) into a set of values of linearly uncorrelated variables called principal components.

The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy.
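As an illustration of the SVD-based calculation described above, here is a minimal sketch in Python with NumPy. It is not the code used by `pca.cwl`; the function name `pca_svd` and its return values are assumptions chosen for this example.

```python
import numpy as np


def pca_svd(data, scale=False):
    """PCA via SVD of the centered (optionally scaled) data matrix.

    Hypothetical helper for illustration only.
    Rows are observations, columns are variables.
    """
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=0)              # center each variable
    if scale:
        x = x / x.std(axis=0, ddof=1)   # scale to unit sample variance
    # Thin SVD: x = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    sdev = s / np.sqrt(x.shape[0] - 1)  # standard deviation of each component
    rotation = vt.T                     # columns are the principal axes
    scores = u * s                      # observations projected onto the axes
    return sdev, rotation, scores


# Four observations lying exactly on the line y = 2x
sdev, rotation, scores = pca_svd([[1, 2], [2, 4], [3, 6], [4, 8]])
```

Because the sample points lie on a single line, all variance falls on the first component and the second standard deviation is numerically zero. The sign of each principal axis is arbitrary, so comparisons should ignore it.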

https://github.com/datirium/workflows.git

Path: workflows/pca.cwl

Branch/Commit ID: d6f58c383d0676269afb519399061191a1144a6a

workflow graph tt_blastn_wnode

https://github.com/ncbi/pgap.git

Path: task_types/tt_blastn_wnode.cwl

Branch/Commit ID: e0fb04a0d8bc648183c6b71d099ce7aea3c3b3ff

workflow graph bacterial_orthology_cond

https://github.com/ncbi/pgap.git

Path: bacterial_orthology/wf_bacterial_orthology_conditional.cwl

Branch/Commit ID: 54c5074587af001a44eccb4762a4cb25fa24cb3e

workflow graph Align reference proteins plane complete workflow, with miniprot

https://github.com/ncbi/pgap.git

Path: protein_alignment/wf_protein_alignment_miniprot.cwl

Branch/Commit ID: 54c5074587af001a44eccb4762a4cb25fa24cb3e

workflow graph timelimit4-wf.cwl

https://github.com/common-workflow-language/cwl-v1.2.git

Path: tests/timelimit4-wf.cwl

Branch/Commit ID: c7c97715b400ff2194aa29fc211d3401cea3a9bf

workflow graph Trim Galore RNA-Seq pipeline single-read strand specific

Note: should be updated

The original [BioWardrobe's](https://biowardrobe.com) [PubMed ID:26248465](https://www.ncbi.nlm.nih.gov/pubmed/26248465) **RNA-Seq** basic analysis for a **single-end** experiment. A corresponding input [FASTQ](http://maq.sourceforge.net/fastq.shtml) file has to be provided.

The current workflow should be used only with single-end RNA-Seq data. It performs the following steps:

1. Trim adapters from the input FASTQ file
2. Use STAR to align reads from the input FASTQ file according to the predefined reference indices; generate an unsorted BAM file and an alignment statistics file
3. Use fastx_quality_stats to analyze the input FASTQ file and generate a quality statistics file
4. Use samtools sort to generate a coordinate-sorted BAM(+BAI) file pair from the unsorted BAM file obtained in step 2 (after running STAR)
5. Generate a BigWig file from the sorted BAM file
6. Map the input FASTQ file to predefined rRNA reference indices using Bowtie to determine the level of rRNA contamination; export the resulting statistics to a file
7. Calculate isoform expression levels for the sorted BAM file and GTF/TAB annotation file using the GEEP reads-counting utility; export the results to a file

https://github.com/datirium/workflows.git

Path: workflows/trim-rnaseq-se-dutp.cwl

Branch/Commit ID: d6f58c383d0676269afb519399061191a1144a6a

workflow graph TgIF - Transgene Insertion Finder

TgIF (trans-gene insertion finder)
==============================================

The TgIF algorithm returns a list of probable insertion sites in a target organism. It requires the user to provide a FASTQ file of ONT (Oxford Nanopore Technologies) reads (-f), the reference FASTA of the trans-gene (Tg) vector containing the insertion sequence (-i), and the reference FASTA of the target organism (-r). The algorithm is tailored for ONT reads from a modified nCATS[1] (nanopore Cas9-targeted sequencing) enriched library; however, it will also produce informative results from a FASTQ derived from WGS (shotgun) sequencing libraries. The modified nCATS method is described here, and a brief overview can be found below.

The basic workflow of TgIF is alignment (using minimap2[2]) of reads (-f) to a combined reference of the Tg vector (containing the desired insertion sequence) and target organism (i.e. -i and -r are concatenated), followed by a search for valleys (or gaps) in the resulting pileup of reads that map to both references at MAPQ>=30. A starting position (p_s) of a valley is a position p where the depth d satisfies d_p = 0 and d_{p-1} > 0; an ending position (p_e) of a valley is a position p where d_p = 0 and d_{p+1} > 0; and a potential insertion scar is the gap between and including p_s and p_e.
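The valley definition above can be sketched in a few lines of Python. This is an illustrative reading of the description, not TgIF's own code. The function name `find_valleys`, the 0-based positions, and the assumption that the depth list has already been computed from reads passing the MAPQ>=30 filter are all choices made for this example.

```python
def find_valleys(depth):
    """Return (start, end) pairs of zero-depth valleys, inclusive, 0-based.

    Hypothetical helper for illustration only.
    A start is a position p with depth[p] == 0 and depth[p - 1] > 0.
    An end is a position p with depth[p] == 0 and depth[p + 1] > 0.
    Zero-depth runs touching either edge of the list are not flanked
    by coverage on both sides, so they are not reported.
    """
    valleys = []
    start = None
    for p in range(1, len(depth) - 1):
        if depth[p] == 0 and depth[p - 1] > 0:
            start = p                      # coverage drops to zero here
        if depth[p] == 0 and depth[p + 1] > 0 and start is not None:
            valleys.append((start, p))     # coverage resumes after p
            start = None
    return valleys


pileup = [3, 2, 0, 0, 0, 4, 5, 0, 6, 0, 0]
print(find_valleys(pileup))  # [(2, 4), (7, 7)]
```

In this sample the trailing zeros at positions 9 and 10 are skipped because no coverage follows them.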
Primary Output files:

- insertions_all.tsv, all probable insertion sites identified from the input fastq data
- insertions_filtered.tgif, filtered sites that are most probable based on logic (4) above
- reportsummary.md, summary of alignment metrics and insertion sites found

Secondary Output files:

- insertion_site_plots.tar, package of probable insertion site pileup plots
- alignment_files.tar.gz, contains bam/bai for visualizing aligned reads to reference genome and vector sequence
- primer3.tar, contains F/R primers for each filtered insertion site designed by primer3

Documents
==============================================

- github Page: https://github.com/jhuapl-bio/TgIF/tree/main

References
==============================================

- Gilpatrick, T. et al. Targeted nanopore sequencing with Cas9-guided adapter ligation. Nature Biotechnology 38, 433–438 (2020).
- Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34:3094-3100. doi:10.1093/bioinformatics/bty191
- O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, and 1000 Genome Project Data Processing Subgroup, The Sequence alignment/map (SAM) format and SAMtools, Bioinformatics (2009) 25(16) 2078-9 [19505943]
- R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
- H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
- Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M and Rozen SG. Primer3--new capabilities and interfaces. Nucleic Acids Res. 2012 Aug 1;40(15):e115.

https://github.com/datirium/workflows.git

Path: workflows/tgif.cwl

Branch/Commit ID: 93b844a80f4008cc973ea9b5efedaff32a343895

workflow graph module-1.cwl

https://github.com/mskcc/ACCESS-Pipeline.git

Path: workflows/module-1.cwl

Branch/Commit ID: 5bf88423593441e4bf6b432111160446cd8dcf13