Explore Workflows

View already parsed workflows below, or add your own.

Graph Name Retrieved From View
workflow graph count-lines8-wf-noET.cwl

https://github.com/common-workflow-language/cwl-v1.2.git

Path: tests/count-lines8-wf-noET.cwl

Branch/Commit ID: c7c97715b400ff2194aa29fc211d3401cea3a9bf

workflow graph Bismark Methylation - pipeline for BS-Seq data analysis

Sequence reads are first cleaned of adapters and transformed into fully bisulfite-converted forward (C->T) and reverse (G->A conversion of the forward strand) versions before they are aligned to similarly converted versions of the genome (also C->T and G->A converted). Sequence reads that produce a unique best alignment from the four alignment processes against the bisulfite genomes (which run in parallel) are then compared to the normal genomic sequence, and the methylation state of all cytosine positions in the read is inferred. A read is considered to align uniquely if one alignment has a unique best alignment score (as reported by the AS:i field). If a read produces several alignments with the same number of mismatches or with the same alignment score (AS:i field), the read (or read pair) is discarded altogether. In the next step, the methylation call for every single C analysed is extracted. The position of every single C is written to a new output file, depending on its context (CpG, CHG or CHH), whereby methylated Cs are labelled as forward reads (+) and non-methylated Cs as reverse reads (-). The output of the methylation extractor is then transformed into a bedGraph and a coverage file. The bedGraph counts output is then used to generate a genome-wide cytosine report, which reports the counts for every single CpG (optionally every single cytosine) in the genome, irrespective of whether it was covered by any reads. Because this type of report is informative for cytosines on both strands, the output may be fairly large (~46 million CpG positions or >1.2 billion total cytosine positions in the human genome).
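The conversion and calling logic described above can be illustrated with a minimal Python sketch. This is not Bismark's code: the function names are hypothetical, only the forward strand is handled, and the read is assumed to be already aligned base-for-base to the genomic segment with no indels.

```python
from typing import List, Tuple


def bisulfite_convert_forward(seq: str) -> str:
    """In-silico C->T conversion (forward version of a read or genome)."""
    return seq.upper().replace("C", "T")


def bisulfite_convert_reverse(seq: str) -> str:
    """In-silico G->A conversion (reverse version of a read or genome)."""
    return seq.upper().replace("G", "A")


def cytosine_context(genome: str, pos: int) -> str:
    """Classify the forward-strand C at `pos` as CpG, CHG or CHH.

    H stands for A, C or T. Returns "unknown" when the context cannot be
    determined (end of sequence or an ambiguous base).
    """
    if genome[pos] != "C":
        raise ValueError("position does not hold a cytosine")
    nxt = genome[pos + 1:pos + 3]
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CpG"
    if len(nxt) == 2 and nxt[0] in "ACT":
        if nxt[1] == "G":
            return "CHG"
        if nxt[1] in "ACT":
            return "CHH"
    return "unknown"


def call_methylation(genome_segment: str, read: str) -> List[Tuple[int, str]]:
    """Compare an aligned read to the unconverted genome.

    A genomic C that is still C in the read is called methylated ("+");
    a genomic C read as T is called non-methylated ("-").
    """
    calls = []
    for i, (g, r) in enumerate(zip(genome_segment, read)):
        if g != "C":
            continue
        if r == "C":
            calls.append((i, "+"))
        elif r == "T":
            calls.append((i, "-"))
    return calls


genome = "ACGTCAGCTT"
read = "ACGTTAGCTT"  # the C at index 4 was converted to T
for position, call in call_methylation(genome, read):
    print(position, call, cytosine_context(genome, position))
```

Each call is paired with its sequence context, which is the information the methylation extractor uses to decide which output file a position belongs in.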

https://github.com/datirium/workflows.git

Path: workflows/bismark-methylation-se.cwl

Branch/Commit ID: a68821bf3a9ceadc3b2ffbb535d601d9a645b377

workflow graph bact_get_kmer_reference

https://github.com/ncbi/pgap.git

Path: task_types/tt_bact_get_kmer_reference.cwl

Branch/Commit ID: 708e141d99f6e5f30d9402d9f890562606a0d97e

workflow graph 16S metagenomic paired-end QIIME2 Analysis (differential abundance)

A workflow for processing multiple 16S samples from within the SciDAP platform via a QIIME2 pipeline.

## __Outputs__

#### Output files:

Primary output files:
- overview.md, list of inputs
- demux.qzv, summary visualizations of imported data
- alpha-rarefaction.qzv, plot of OTU rarefaction
- taxa-bar-plots.qzv, bar plot of the relative frequency of taxonomies
- table.qza, table of how many sequences are associated with each sample and with each feature (OTU)

Optional output files:
- pcoa-unweighted-unifrac-emperor.qzv, PCoA using unweighted unifrac method
- pcoa-bray-curtis-emperor.qzv, PCoA using bray curtis method
- heatmap.qzv, output from gneiss differential abundance analysis using unsupervised correlation-clustering method (this will define the partitions of microbes that commonly co-occur with each other using Ward hierarchical clustering)
- ancom-$LEVEL.qzv, output from ANCOM differential abundance analysis at family, genus, and species taxonomic levels (includes volcano plot)

## __Inputs__

#### General Info
- Sample short name/Alias: Used for samplename in downstream analyses. Ensure this is the same name used in the metadata samplesheet.
- metadata_file: Path to the TSV file containing experiment metadata. The first column must have the header "sample-id" with sample names exactly as they have been input into your SciDAP project. The remaining column headers are experiment-specific. NOTE: Custom Label parameter metadata must be INT data type.
- Metadata header name for PCoA axis label: Must be identical to one of the headers of the metadata file. Values under this metadata header must be INT. Required for PCoA analysis.
- Rarefaction normalization sampling depth: Required for differential abundance analyses (along with group and taxonomic level). This step will subsample the counts in each sample without replacement so that each sample in the resulting table has a total count of INT. If the total count for any sample is smaller than this value, that sample will be dropped from further analysis. We recommend making your choice by reviewing the rarefaction plot. Choose a value that is as high as possible (so you retain more sequences per sample) while excluding as few samples as possible.
- Metadata header name for differential abundance analyses: Required for differential abundance analyses (along with sampling depth and taxonomic level). Group/experimental condition column name from the sample metadata file. Must be identical to one of the headers of the sample metadata file. The corresponding column should have only two groups/conditions.
- Taxonomic level for differential abundance analysis: Required for differential abundance analyses (along with sampling depth and group). Collapses the OTU table at the taxonomic level of interest for differential abundance analysis with ANCOM. Default: Genus
- 16S samples for combined analysis: Upstream 16S samples for combined analysis. R1 and R2 fastq are used for generating the manifest file for data import to qiime2.
- Trim 5' of R1: Recommended if adapters are still on the input sequences. Trims the first J bases from the 5' end of each forward read.
- Trim 5' of R2: Recommended if adapters are still on the input sequences. Trims the first K bases from the 5' end of each reverse read.
- Truncate 3' of R1: Recommended if quality drops off along the length of the read. Clips the forward read starting M bases from the 5' end (before trimming).
- Truncate 3' of R2: Recommended if quality drops off along the length of the read. Clips the reverse read starting N bases from the 5' end (before trimming).
- Threads: Number of threads to use for steps that support multithreading.

### __Data Analysis Steps__
1. Import all sample read data, make a qiime artifact (demux.qza), and generate a summary visualization.
2. Denoising will detect and correct (where possible) errors in Illumina amplicon sequence data. This process will additionally filter any phiX reads (commonly present in marker gene Illumina sequence data) that are identified in the sequencing data, and will filter chimeric sequences.
3. Generate a phylogenetic tree for diversity analyses and rarefaction processing and plotting.
4. Taxonomy classification of amplicons. Performed using a Naive Bayes classifier trained on the Greengenes2 database "gg_2022_10_backbone_full_length.nb.qza".
5. If "Metadata header name for PCoA axis label" is provided, principal coordinates analysis (PCoA) will be performed using the unweighted unifrac and bray curtis methods. 3D plots are produced with PCo1, PCo2, and the provided axis label on the x, y, and z axes.
6. If the sampling depth and metadata header for differential analysis are provided, differential abundance analysis will be performed using Gneiss and ANCOM methods at the family, genus, and species taxonomic levels. An unsupervised hierarchical clustering heatmap (Gneiss) and volcano plot (ANCOM) are produced at the taxonomic level between the specified groups.

### __References__
1. Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, Alexander H, Alm EJ, Arumugam M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodríguez AM, Chase J, Cope EK, Da Silva R, Diener C, Dorrestein PC, Douglas GM, Durall DM, Duvallet C, Edwardson CF, Ernst M, Estaki M, Fouquier J, Gauglitz JM, Gibbons SM, Gibson DL, Gonzalez A, Gorlick K, Guo J, Hillmann B, Holmes S, Holste H, Huttenhower C, Huttley GA, Janssen S, Jarmusch AK, Jiang L, Kaehler BD, Kang KB, Keefe CR, Keim P, Kelley ST, Knights D, Koester I, Kosciolek T, Kreps J, Langille MGI, Lee J, Ley R, Liu YX, Loftfield E, Lozupone C, Maher M, Marotz C, Martin BD, McDonald D, McIver LJ, Melnik AV, Metcalf JL, Morgan SC, Morton JT, Naimey AT, Navas-Molina JA, Nothias LF, Orchanian SB, Pearson T, Peoples SL, Petras D, Preuss ML, Pruesse E, Rasmussen LB, Rivers A, Robeson MS, Rosenthal P, Segata N, Shaffer M, Shiffer A, Sinha R, Song SJ, Spear JR, Swafford AD, Thompson LR, Torres PJ, Trinh P, Tripathi A, Turnbaugh PJ, Ul-Hasan S, van der Hooft JJJ, Vargas F, Vázquez-Baeza Y, Vogtmann E, von Hippel M, Walters W, Wan Y, Wang M, Warren J, Weber KC, Williamson CHD, Willis AD, Xu ZZ, Zaneveld JR, Zhang Y, Zhu Q, Knight R, and Caporaso JG. 2019. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37: 852–857. https://doi.org/10.1038/s41587-019-0209-9
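The rarefaction behaviour described under the sampling depth input (subsample each sample without replacement to a fixed total, and drop samples whose total falls below it) can be sketched in plain Python. This is an illustration only, not QIIME 2's implementation: the function names, the nested-dict table layout, and the fixed seed are assumptions made for the example.

```python
import random
from typing import Dict, Optional

FeatureCounts = Dict[str, int]


def rarefy_sample(counts: FeatureCounts, depth: int,
                  rng: random.Random) -> Optional[FeatureCounts]:
    """Subsample one sample to exactly `depth` sequences without replacement.

    Returns None when the sample has fewer than `depth` sequences in total,
    signalling that it should be dropped.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand counts into one entry per observed sequence, then draw `depth`
    # of them without replacement.
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    picked = rng.sample(pool, depth)
    rarefied: FeatureCounts = {}
    for feature in picked:
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied


def rarefy_table(table: Dict[str, FeatureCounts], depth: int,
                 seed: int = 0) -> Dict[str, FeatureCounts]:
    """Rarefy every sample in a {sample: {feature: count}} table."""
    rng = random.Random(seed)
    result = {}
    for sample, counts in table.items():
        rarefied = rarefy_sample(counts, depth, rng)
        if rarefied is not None:
            result[sample] = rarefied
    return result


table = {
    "sample-1": {"otu-a": 5, "otu-b": 5},
    "sample-2": {"otu-a": 2},  # total of 2 is below the depth, so it is dropped
}
print(rarefy_table(table, depth=4))
```

The example shows the trade-off mentioned in the input description: a higher depth keeps more sequences per retained sample but excludes every sample whose total falls below it.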

https://github.com/datirium/workflows.git

Path: workflows/qiime2-aggregate.cwl

Branch/Commit ID: 93b844a80f4008cc973ea9b5efedaff32a343895

workflow graph count-lines11-null-step-wf-noET.cwl

https://github.com/common-workflow-language/cwl-v1.2.git

Path: tests/count-lines11-null-step-wf-noET.cwl

Branch/Commit ID: c7c97715b400ff2194aa29fc211d3401cea3a9bf

workflow graph Cell Ranger ARC Count Gene Expression + ATAC

Cell Ranger ARC Count Gene Expression + ATAC
============================================

https://github.com/datirium/workflows.git

Path: workflows/cellranger-arc-count.cwl

Branch/Commit ID: c6bfa0de917efb536dd385624fc7702e6748e61d

workflow graph directory.cwl

Inspect provided directory and return filenames. Generate a new directory and return it (including content).
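As a rough analogue of the two operations this description names, the following Python sketch lists the filenames in a directory and builds a new directory with content. It is not the CWL workflow itself, and the helper names and file contents are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def list_filenames(directory: Path) -> List[str]:
    """Return the sorted names of the regular files directly inside `directory`."""
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())


def make_directory_with_content(parent: Path, name: str,
                                files: Dict[str, str]) -> Path:
    """Create `parent/name`, write each {filename: text} entry, and return the path."""
    new_dir = Path(parent) / name
    new_dir.mkdir(parents=True, exist_ok=True)
    for filename, text in files.items():
        (new_dir / filename).write_text(text)
    return new_dir


# Usage: build a directory in a temporary location, then inspect it.
with tempfile.TemporaryDirectory() as tmp:
    created = make_directory_with_content(tmp, "out", {"b.txt": "2", "a.txt": "1"})
    print(list_filenames(created))  # ['a.txt', 'b.txt']
```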

https://github.com/common-workflow-language/cwltool.git

Path: tests/wf/directory.cwl

Branch/Commit ID: 3ed10d0ea7ac57550433a89a92bdbe756bdb0e40

workflow graph extract_readgroup_fastq_se_http.cwl

https://github.com/nci-gdc/gdc-dnaseq-cwl.git

Path: workflows/bamfastq_align/extract_readgroup_fastq_se_http.cwl

Branch/Commit ID: 3cb464a3a5c39cc060cd23d9c60918bc9ffb169b

workflow graph Filter single sample sv vcf from depth callers(cnvkit/cnvnator)

https://github.com/genome/analysis-workflows.git

Path: definitions/subworkflows/sv_depth_caller_filter.cwl

Branch/Commit ID: e59c77629936fad069007ba642cad49fef7ad29f

workflow graph kmer_seq_entry_extract_wnode

https://github.com/ncbi/pgap.git

Path: task_types/tt_kmer_seq_entry_extract_wnode.cwl

Branch/Commit ID: 0d9e6bb52eac0c209af3977aa779e39aaa432458