Explore Workflows

View already parsed workflows here or click here to add your own

Graph Name Retrieved From View
workflow graph exomeseq-gatk4-02-variantdiscovery.cwl

https://github.com/bespin-workflows/exomeseq-gatk4.git

Path: subworkflows/exomeseq-gatk4-02-variantdiscovery.cwl

Branch/Commit ID: a243d20e040b0b4b6ed875e68c39fcaee2cd9620

workflow graph revsort.cwl

Reverse the lines in a document, then sort those lines.

https://github.com/common-workflow-language/cwltool.git

Path: tests/wf/revsort.cwl

Branch/Commit ID: 639229b1159cf484e70e52da10194561b3fad719

workflow graph indices-header.cwl

https://github.com/datirium/workflows.git

Path: metadata/indices-header.cwl

Branch/Commit ID: 2f0db4b3c515f91c5cfda19c78cf90d339390986

workflow graph CLIP-Seq pipeline for single-read experiment NNNNG

Cross-Linking ImmunoPrecipitation ================================= `CLIP` (`cross-linking immunoprecipitation`) is a method used in molecular biology that combines UV cross-linking with immunoprecipitation in order to analyse protein interactions with RNA or to precisely locate RNA modifications (e.g. m6A). (Uhl|Houwaart|Corrado|Wright|Backofen|2017)(Ule|Jensen|Ruggiu|Mele|2003)(Sugimoto|König|Hussain|Zupan|2012)(Zhang|Darnell|2011) (Ke| Alemu| Mertens| Gantman|2015) CLIP-based techniques can be used to map RNA binding protein binding sites or RNA modification sites (Ke| Alemu| Mertens| Gantman|2015)(Ke| Pandya-Jones| Saito| Fak|2017) of interest on a genome-wide scale, thereby increasing the understanding of post-transcriptional regulatory networks. The identification of sites where RNA-binding proteins (RNABPs) interact with target RNAs opens the door to understanding the vast complexity of RNA regulation. UV cross-linking and immunoprecipitation (CLIP) is a transformative technology in which RNAs purified from _in vivo_ cross-linked RNA-protein complexes are sequenced to reveal footprints of RNABP:RNA contacts. CLIP combined with high-throughput sequencing (HITS-CLIP) is a generalizable strategy to produce transcriptome-wide maps of RNA binding with higher accuracy and resolution than standard RNA immunoprecipitation (RIP) profiling or purely computational approaches. The application of CLIP to Argonaute proteins has expanded the utility of this approach to mapping binding sites for microRNAs and other small regulatory RNAs. Finally, recent advances in data analysis take advantage of cross-link–induced mutation sites (CIMS) to refine RNA-binding maps to single-nucleotide resolution. Once IP conditions are established, HITS-CLIP takes ~8 d to prepare RNA for sequencing. Established pipelines for data analysis, including those for CIMS, take 3–4 d. Workflow -------- CLIP begins with the in-vivo cross-linking of RNA-protein complexes using ultraviolet light (UV). Upon UV exposure, covalent bonds are formed between proteins and nucleic acids that are in close proximity. (Darnell|2012) The cross-linked cells are then lysed, and the protein of interest is isolated via immunoprecipitation. In order to allow for sequence specific priming of reverse transcription, RNA adapters are ligated to the 3' ends, while radiolabeled phosphates are transferred to the 5' ends of the RNA fragments. The RNA-protein complexes are then separated from free RNA using gel electrophoresis and membrane transfer. Proteinase K digestion is then performed in order to remove protein from the RNA-protein complexes. This step leaves a peptide at the cross-link site, allowing for the identification of the cross-linked nucleotide. (König| McGlincy| Ule|2012) After ligating RNA linkers to the RNA 5' ends, cDNA is synthesized via RT-PCR. High-throughput sequencing is then used to generate reads containing distinct barcodes that identify the last cDNA nucleotide. Interaction sites can be identified by mapping the reads back to the transcriptome.

https://github.com/datirium/workflows.git

Path: workflows/clipseq-se.cwl

Branch/Commit ID: 9bf0aa495735f8081bb5870cb32fc898b9e6eb22

workflow graph samples_fillout_workflow.cwl

Workflow to run GetBaseCountsMultiSample fillout on a number of samples, each with their own bam and maf files

https://github.com/mskcc/pluto-cwl.git

Path: cwl/samples_fillout_workflow.cwl

Branch/Commit ID: ba3ff09328cc646d7254b2d2ee0fbe1abca3d4ad

workflow graph cnv_codex

CNV CODEX calling

https://gitlab.bsc.es/lrodrig1/structuralvariants_poc.git

Path: structuralvariants/cwl/subworkflows/cnv_codex.cwl

Branch/Commit ID: de9cb009f8fe0c8d5a94db5c882cf21ddf372452

workflow graph picard_markduplicates

Mark duplicates

https://gitlab.bsc.es/lrodrig1/structuralvariants_poc.git

Path: structuralvariants/cwl/subworkflows/picard_markduplicates.cwl

Branch/Commit ID: a4a3547b9790e99a58424a0dfcb4e467a7691d6a

workflow graph Trim Galore ChIP-Seq pipeline paired-end

The original [BioWardrobe's](https://biowardrobe.com) [PubMed ID:26248465](https://www.ncbi.nlm.nih.gov/pubmed/26248465) **ChIP-Seq** basic analysis workflow for a **paired-end** experiment with Trim Galore. _Trim Galore_ is a wrapper around [Cutadapt](https://github.com/marcelm/cutadapt) and [FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data. A [FASTQ](http://maq.sourceforge.net/fastq.shtml) input file has to be provided. In outputs it returns coordinate sorted BAM file alongside with index BAI file, quality statistics for both the input FASTQ files, reads coverage in a form of BigWig file, peaks calling data in a form of narrowPeak or broadPeak files, islands with the assigned nearest genes and region type, data for average tag density plot (on the base of BAM file). Workflow starts with running fastx_quality_stats (steps fastx_quality_stats_upstream and fastx_quality_stats_downstream) from FASTX-Toolkit to calculate quality statistics for both upstream and downstream input FASTQ files. At the same time Bowtie is used to align reads from input FASTQ files to reference genome (Step bowtie_aligner). The output of this step is unsorted SAM file which is being sorted and indexed by samtools sort and samtools index (Step samtools_sort_index). Depending on workflow’s input parameters indexed and sorted BAM file could be processed by samtools rmdup (Step samtools_rmdup) to remove all possible read duplicates. In a case when removing duplicates is not necessary the step returns original input BAM and BAI files without any processing. If the duplicates were removed the following step (Step samtools_sort_index_after_rmdup) reruns samtools sort and samtools index with BAM and BAI files, if not - the step returns original unchanged input files. Right after that macs2 callpeak performs peak calling (Step macs2_callpeak). On the base of returned outputs the next step (Step macs2_island_count) calculates the number of islands and estimated fragment size. If the last one is less that 80 (hardcoded in a workflow) macs2 callpeak is rerun again with forced fixed fragment size value (Step macs2_callpeak_forced). If at the very beginning it was set in workflow input parameters to force run peak calling with fixed fragment size, this step is skipped and the original peak calling results are saved. In the next step workflow again calculates the number of islands and estimated fragment size (Step macs2_island_count_forced) for the data obtained from macs2_callpeak_forced step. If the last one was skipped the results from macs2_island_count_forced step are equal to the ones obtained from macs2_island_count step. Next step (Step macs2_stat) is used to define which of the islands and estimated fragment size should be used in workflow output: either from macs2_island_count step or from macs2_island_count_forced step. If input trigger of this step is set to True it means that macs2_callpeak_forced step was run and it returned different from macs2_callpeak step results, so macs2_stat step should return [fragments_new, fragments_old, islands_new], if trigger is False the step returns [fragments_old, fragments_old, islands_old], where sufix \"old\" defines results obtained from macs2_island_count step and sufix \"new\" - from macs2_island_count_forced step. The following two steps (Step bamtools_stats and bam_to_bigwig) are used to calculate coverage on the base of input BAM file and save it in BigWig format. For that purpose bamtools stats returns the number of mapped reads number which is then used as scaling factor by bedtools genomecov when it performs coverage calculation and saves it in BED format. The last one is then being sorted and converted to BigWig format by bedGraphToBigWig tool from UCSC utilities. Step get_stat is used to return a text file with statistics in a form of [TOTAL, ALIGNED, SUPRESSED, USED] reads count. Step island_intersect assigns genes and regions to the islands obtained from macs2_callpeak_forced. Step average_tag_density is used to calculate data for average tag density plot on the base of BAM file.

https://github.com/datirium/workflows.git

Path: workflows/trim-chipseq-pe.cwl

Branch/Commit ID: 7518b100d8cbc80c8be32e9e939dfbb27d6b4361

workflow graph consensus_maf.cwl

Workflow to merge a large number of maf files into a single consensus maf file for use with GetBaseCountsMultiSample

https://github.com/mskcc/pluto-cwl.git

Path: cwl/consensus_maf.cwl

Branch/Commit ID: ba3ff09328cc646d7254b2d2ee0fbe1abca3d4ad

workflow graph bwa_index

https://gitlab.bsc.es/lrodrig1/structuralvariants_poc.git

Path: structuralvariants/cwl/abstract_operations/subworkflows/bwa_index.cwl

Branch/Commit ID: 82e533a98a763a258bd841ed0032c79445478d56