Explore Workflows


Kraken2 Metagenomic pipeline paired-end

This workflow taxonomically classifies paired-end sequencing reads in FASTQ format, which have optionally been adapter-trimmed with TrimGalore, using Kraken2 and a user-selected pre-built database from a list of [genomic index files](https://benlangmead.github.io/aws-indexes/k2).

### __Inputs__

Kraken2 database for taxonomic classification:
- [Viral (0.5 GB)](https://genome-idx.s3.amazonaws.com/kraken/k2_viral_20221209.tar.gz), all RefSeq viral genomes
- [MinusB (8.7 GB)](https://genome-idx.s3.amazonaws.com/kraken/k2_minusb_20221209.tar.gz), standard minus bacteria (archaea, viral, plasmid, human1, UniVec_Core)
- [PlusPFP-16 (15.0 GB)](https://genome-idx.s3.amazonaws.com/kraken/k2_pluspfp_16gb_20221209.tar.gz), standard (archaea, bacteria, viral, plasmid, human1, UniVec_Core) + (protozoa, fungi & plant) capped at 16 GB (shrunk via random k-mer downselect)
- [EuPathDB46 (34.1 GB)](https://genome-idx.s3.amazonaws.com/kraken/k2_eupathdb48_20201113.tar.gz), eukaryotic pathogen genomes with contaminants removed (https://veupathdb.org/veupathdb/app)
- [16S_gg_13_5 (73 MB)](https://genome-idx.s3.amazonaws.com/kraken/16S_Greengenes13.5_20200326.tgz), Greengenes 16S rRNA database ([release 13.5](https://greengenes.secondgenome.com/?prefix=downloads/greengenes_database/gg_13_5/), 20200326)
- [16S_silva_138 (112 MB)](https://genome-idx.s3.amazonaws.com/kraken/16S_Silva138_20200326.tgz), SILVA 16S rRNA database ([release 138.1](https://www.arb-silva.de/documentation/release-1381/), 20200827)

Read 1 file:
- FASTA/Q input R1 from a paired-end library

Read 2 file:
- FASTA/Q input R2 from a paired-end library

Advanced Inputs Tab (Optional):
- Number of bases to clip from the 3p end
- Number of bases to clip from the 5p end

### __Outputs__

- k2db, an upstream database used by the kraken2 classifier

### __Data Analysis Steps__

1. Trimming the adapters with TrimGalore.
    - This step is particularly important when the reads are long and the fragments are short, which leaves sequencing adapters at the ends of reads. If the adapter is not removed, the read will not map. TrimGalore can recognize standard adapters, such as Illumina or Nextera/Tn5 adapters.
2. Generate quality control statistics of trimmed, unmapped sequence data.
3. (Optional) Clipping of 5' and/or 3' end by the specified number of bases.
4. Mapping reads to primary genome index with Bowtie.

### __References__

- Wood, D.E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol 20, 257 (2019). https://doi.org/10.1186/s13059-019-1891-0
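The optional 5p/3p clipping inputs listed above can be pictured with a small Python sketch. It is not taken from the workflow: the function name `clip_read` and its parameters are hypothetical, and the workflow itself delegates clipping to its own tools.

```python
from __future__ import annotations


def clip_read(sequence: str, quality: str,
              clip_5p: int = 0, clip_3p: int = 0) -> tuple[str, str]:
    """Drop clip_5p bases from the start and clip_3p bases from the end.

    Hypothetical helper for illustration only; it mirrors the idea of the
    "bases to clip" inputs, not the workflow's actual implementation.
    """
    if clip_5p < 0 or clip_3p < 0:
        raise ValueError("clip lengths must be non-negative")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")

    end = len(sequence) - clip_3p
    # If the two clips together cover the whole read, nothing is left.
    if end <= clip_5p:
        return "", ""
    # Bases and quality scores are trimmed with the same slice so they stay aligned.
    return sequence[clip_5p:end], quality[clip_5p:end]
```

For example, `clip_read("ACGTACGT", "IIIIHHHH", clip_5p=2, clip_3p=3)` keeps only the three middle bases and their quality characters.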

https://github.com/datirium/workflows.git

Path: workflows/kraken2-classify-pe.cwl

Branch/Commit ID: 93b844a80f4008cc973ea9b5efedaff32a343895

workflow_same_level.cwl#main_pipeline

Simulation steps pipeline

https://github.com/ILIAD-ocean-twin/application_package.git

Path: workflow_in_workflow/workflow_same_level.cwl

Branch/Commit ID: 9a0db98839bbc655e12d49f56c61deecd77ff14c

Packed ID: main_pipeline

workflow_same_level.cwl#second_pipeline

Simulation of 2 workflows

https://github.com/ILIAD-ocean-twin/application_package.git

Path: workflow_in_workflow/workflow_same_level.cwl

Branch/Commit ID: 9a0db98839bbc655e12d49f56c61deecd77ff14c

Packed ID: second_pipeline

pindel parallel workflow

https://github.com/genome/analysis-workflows.git

Path: definitions/subworkflows/pindel.cwl

Branch/Commit ID: 60edaf6f57eaaf02cda1a3d8cb9a825aa64a43e2

SetMirrorPanelAlignment

Derive mirror panel alignment parameters from measurements of the optical point-spread functions.

https://github.com/gammasim/workflows.git

Path: workflows/SetMirrorPanelAlignment.cwl

Branch/Commit ID: 789752af87eb190387ff2acb4c95c7a5cdb961e7

count-lines7-single-source-wf_v1_2.cwl

https://github.com/common-workflow-language/cwl-utils.git

Path: testdata/count-lines7-single-source-wf_v1_2.cwl

Branch/Commit ID: b76b039edb62dea76c43f173848cdc57e4b4aab7

bam-bedgraph-bigwig.cwl

This workflow converts an input BAM file into bigWig and bedGraph files. The input BAM file should be sorted by coordinates (required by the `bam_to_bedgraph` step).

If the `split` input is not provided, true is used by default. The default logic is implemented in the `valueFrom` field of the `split` input inside the `bam_to_bedgraph` step to avoid a possible bug in cwltool when setting default values for workflow inputs.

`scale` takes priority over `mapped_reads_number`. The latter is used to calculate the `-scale` parameter for `bedtools genomecov` (step `bam_to_bedgraph`) only when the `scale` input is not provided. All of this logic is implemented inside `bedtools-genomecov.cwl`.

`bigwig_filename` defines the output name only for the generated bigWig file. `bedgraph_filename` defines the output name for the generated bedGraph file and can influence the generated bigWig filename when `bigwig_filename` is not provided.

None of the workflow inputs and outputs have a `format` field, to avoid format incompatibility errors when the workflow is used as a subworkflow.
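The precedence rules described for `scale`, `mapped_reads_number`, and the two filename inputs can be sketched in Python. This is an illustration only: the real logic lives in the CWL expressions of `bedtools-genomecov.cwl`. The function names are hypothetical, and both the `1,000,000 / mapped_reads_number` formula and the `.bigWig` extension swap are assumptions, not confirmed by the description.

```python
import os
from typing import Optional


def resolve_scale(scale: Optional[float] = None,
                  mapped_reads_number: Optional[int] = None) -> Optional[float]:
    """Pick the value passed to `bedtools genomecov -scale` (hypothetical helper)."""
    # An explicit `scale` always wins over `mapped_reads_number`.
    if scale is not None:
        return scale
    # ASSUMPTION: reads-per-million style normalisation.
    if mapped_reads_number:
        return 1_000_000 / mapped_reads_number
    # Neither input given: no -scale parameter at all.
    return None


def resolve_bigwig_filename(bedgraph_filename: str,
                            bigwig_filename: Optional[str] = None) -> str:
    """Pick the bigWig output name (hypothetical helper)."""
    # An explicit `bigwig_filename` is used as-is.
    if bigwig_filename:
        return bigwig_filename
    # ASSUMPTION: otherwise reuse the bedGraph basename with a new extension.
    root, _ = os.path.splitext(bedgraph_filename)
    return root + ".bigWig"
```

With these assumptions, `resolve_scale(None, 2_000_000)` yields a scale factor of 0.5, while `resolve_scale(2.0, 2_000_000)` ignores the read count and returns 2.0.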

https://github.com/datirium/workflows.git

Path: tools/bam-bedgraph-bigwig.cwl

Branch/Commit ID: f3e44d3b0f198cf5245c49011124dc3b6c2b06fd

checkm_wnode

https://github.com/ncbi/pgap.git

Path: task_types/tt_checkm_wnode.cwl

Branch/Commit ID: 424a01693259a75641dc249d553235aa38a6ce23

kmer_build_tree

https://github.com/ncbi/pgap.git

Path: task_types/tt_kmer_build_tree.cwl

Branch/Commit ID: 68058b108cb5b0b72ebe244c42eefa2747e1d64a

Motif Finding with HOMER with custom background regions

Motif Finding with HOMER with custom background regions
---------------------------------------------------

HOMER contains a novel motif discovery algorithm that was designed for regulatory element analysis in genomics applications (DNA only, no protein). It is a differential motif discovery algorithm, which means that it takes two sets of sequences and tries to identify the regulatory elements that are specifically enriched in one set relative to the other. It uses ZOOPS scoring (zero or one occurrence per sequence) coupled with hypergeometric (or binomial) enrichment calculations to determine motif enrichment. HOMER also tries its best to account for sequence bias in the dataset. It was designed with ChIP-Seq and promoter analysis in mind, but can be applied to pretty much any nucleic acid motif-finding problem.

For more information please refer to:
-------------------------------------

[Official documentation](http://homer.ucsd.edu/homer/motif/)
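To make the combination of ZOOPS scoring and hypergeometric enrichment concrete, here is a minimal Python sketch. It is not HOMER's implementation: it uses exact substring matching instead of HOMER's motif models, ignores sequence-bias correction, and the function names are hypothetical.

```python
from math import comb
from typing import Sequence


def zoops_hits(sequences: Sequence[str], motif: str) -> int:
    """ZOOPS counting: each sequence contributes at most one hit."""
    return sum(1 for seq in sequences if motif in seq)


def hypergeom_enrichment_p(target_hits: int, target_total: int,
                           background_hits: int, background_total: int) -> float:
    """Upper-tail hypergeometric p-value for motif enrichment in the target set.

    Models drawing `target_total` sequences at random from the pooled
    target + background set and asks how likely it is to see at least
    `target_hits` motif-containing sequences by chance.
    """
    pool = target_total + background_total         # all sequences
    pool_hits = target_hits + background_hits      # all sequences with the motif
    max_hits = min(pool_hits, target_total)
    tail = sum(
        comb(pool_hits, k) * comb(pool - pool_hits, target_total - k)
        for k in range(target_hits, max_hits + 1)
    )
    return tail / comb(pool, target_total)


# Toy data, invented purely for illustration.
targets = ["ACGTGA", "TTACGT", "GGGGGG"]
background = ["TTTTTT", "CCCCCC", "ACGTAA", "GGGTTT"]
motif = "ACGT"

p_value = hypergeom_enrichment_p(
    zoops_hits(targets, motif), len(targets),
    zoops_hits(background, motif), len(background),
)
```

A small p-value would indicate that the motif occurs in more target sequences than expected from the pooled frequency. With toy sets this small, the result is far from significant.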

https://github.com/datirium/workflows.git

Path: workflows/homer-motif-analysis-bg.cwl

Branch/Commit ID: 93b844a80f4008cc973ea9b5efedaff32a343895