Explore Workflows

View already parsed workflows here or click here to add your own

Graph	Name	Retrieved From	View
	schemadef-wf.cwl	https://github.com/common-workflow-language/cwltool.git Path: cwltool/schemas/v1.0/v1.0/schemadef-wf.cwl Branch/Commit ID: 2ae8117360a3cd4909d9d3f2b35c30bfffb25d0a
	steplevel-resreq.cwl	https://github.com/common-workflow-language/cwltool.git Path: cwltool/schemas/v1.0/v1.0/steplevel-resreq.cwl Branch/Commit ID: e8b3565a008d95859fc44227987a54e6a53a8c29
	Single-Cell Multiome ATAC-Seq and RNA-Seq Filtering Analysis Removes low-quality cells from the outputs of the “Cell Ranger Count (RNA+ATAC)” and “Cell Ranger Aggregate (RNA+ATAC)” pipelines. The results of this workflow are used in the “Single-Cell RNA-Seq Dimensionality Reduction Analysis” and “Single-Cell ATAC-Seq Dimensionality Reduction Analysis” pipelines.	https://github.com/datirium/workflows.git Path: workflows/sc-multiome-filter.cwl Branch/Commit ID: 57863b6131d8262c5ce864adaf8e4038401e71a2
	varscan somatic workflow	https://github.com/genome/analysis-workflows.git Path: definitions/subworkflows/varscan.cwl Branch/Commit ID: a9133c999502acf94b433af8d39897e6c2cdf65f
	16S metagenomic paired-end QIIME2 Sample (preprocessing) A workflow for processing a single 16S sample via a QIIME2 pipeline. ## __Outputs__ #### Output files: - overview.md, list of inputs - demux.qzv, summary visualizations of imported data - alpha-rarefaction.qzv, plot of OTU rarefaction - taxa-bar-plots.qzv, relative frequency of taxonomies barplot ## __Inputs__ #### General Info - Sample short name/Alias: Used for samplename in downstream analyses. Ensure this is the same name used in the metadata samplesheet. - Environment: where the sample was collected - Catalog No.: catalog number if available (optional) - Read 1 FASTQ file: Read 1 FASTQ file from a paired-end sequencing run. - Read 2 FASTQ file: Read 2 FASTQ file that pairs with the input R1 file. - Trim 5' of R1: Recommended if adapters are still on the input sequences. Trims the first J bases from the 5' end of each forward read. - Trim 5' of R2: Recommended if adapters are still on the input sequences. Trims the first K bases from the 5' end of each reverse read. - Truncate 3' of R1: Recommended if quality drops off along the length of the read. Clips the forward read starting M bases from the 5' end (before trimming). - Truncate 3' of R2: Recommended if quality drops off along the length of the read. Clips the reverse read starting N bases from the 5' end (before trimming). - Threads: Number of threads to use for steps that support multithreading. ### __Data Analysis Steps__ 1. Generate FASTX quality statistics for visualization of unmapped, raw FASTQ reads. 2. Import the data, make a QIIME artifact (demux.qza), and summary visualization. 3. Denoising will detect and correct (where possible) Illumina amplicon sequence data. This process will additionally filter any phiX reads (commonly present in marker gene Illumina sequence data) that are identified in the sequencing data, and will filter chimeric sequences. 4. Generate a phylogenetic tree for diversity analyses and rarefaction processing and plotting. 5. Taxonomy classification of amplicons. Performed using a Naive Bayes classifier trained on the Greengenes2 database "gg_2022_10_backbone_full_length.nb.qza". ### __References__ 1. Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, Alexander H, Alm EJ, Arumugam M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodríguez AM, Chase J, Cope EK, Da Silva R, Diener C, Dorrestein PC, Douglas GM, Durall DM, Duvallet C, Edwardson CF, Ernst M, Estaki M, Fouquier J, Gauglitz JM, Gibbons SM, Gibson DL, Gonzalez A, Gorlick K, Guo J, Hillmann B, Holmes S, Holste H, Huttenhower C, Huttley GA, Janssen S, Jarmusch AK, Jiang L, Kaehler BD, Kang KB, Keefe CR, Keim P, Kelley ST, Knights D, Koester I, Kosciolek T, Kreps J, Langille MGI, Lee J, Ley R, Liu YX, Loftfield E, Lozupone C, Maher M, Marotz C, Martin BD, McDonald D, McIver LJ, Melnik AV, Metcalf JL, Morgan SC, Morton JT, Naimey AT, Navas-Molina JA, Nothias LF, Orchanian SB, Pearson T, Peoples SL, Petras D, Preuss ML, Pruesse E, Rasmussen LB, Rivers A, Robeson MS, Rosenthal P, Segata N, Shaffer M, Shiffer A, Sinha R, Song SJ, Spear JR, Swafford AD, Thompson LR, Torres PJ, Trinh P, Tripathi A, Turnbaugh PJ, Ul-Hasan S, van der Hooft JJJ, Vargas F, Vázquez-Baeza Y, Vogtmann E, von Hippel M, Walters W, Wan Y, Wang M, Warren J, Weber KC, Williamson CHD, Willis AD, Xu ZZ, Zaneveld JR, Zhang Y, Zhu Q, Knight R, and Caporaso JG. 2019. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37: 852–857. https://doi.org/10.1038/s41587-019-0209-9	https://github.com/datirium/workflows.git Path: workflows/qiime2-sample-pe.cwl Branch/Commit ID: 57863b6131d8262c5ce864adaf8e4038401e71a2
	Per-region pindel	https://github.com/genome/analysis-workflows.git Path: definitions/subworkflows/pindel_cat.cwl Branch/Commit ID: a9133c999502acf94b433af8d39897e6c2cdf65f
	revsort.cwl Reverse the lines in a document, then sort those lines.	https://github.com/common-workflow-language/cwltool.git Path: tests/wf/revsort.cwl Branch/Commit ID: 819c81af5449ec912bbbbead042ad66b8d3fd8d4
	runner.cwl	https://github.com/nci-gdc/gdc-dnaseq-cwl.git Path: workflows/fastq_readgroup_stats/runner.cwl Branch/Commit ID: b110a23e2efaaadfd4feca4f9e130946d1c5418d
	GSEApy - Gene Set Enrichment Analysis in Python Gene Set Enrichment Analysis is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes). GSEA requires as input an expression dataset, which contains expression profiles for multiple samples. While the software supports multiple input file formats for these datasets, the tab-delimited GCT format is the most common. The first column of the GCT file contains feature identifiers (gene ids or symbols in the case of data derived from RNA-Seq experiments). The second column contains a description of the feature; this column is ignored by GSEA and may be filled with “NA”s. Subsequent columns contain the expression values for each feature, with one sample's expression value per column. It is important to note that there are no hard and fast rules regarding how a GCT file's expression values are derived. The important point is that they are comparable to one another across features within a sample and comparable to one another across samples. Tools such as DESeq2 can be made to produce properly normalized data (normalized counts) which are compatible with GSEA.	https://github.com/datirium/workflows.git Path: workflows/gseapy.cwl Branch/Commit ID: f3e44d3b0f198cf5245c49011124dc3b6c2b06fd
	checkm_wnode	https://github.com/ncbi/pgap.git Path: task_types/tt_checkm_wnode.cwl Branch/Commit ID: 369e2b6c7f4db75099d258729dec1326f55d2cc5
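
The gseapy.cwl entry in the table above describes the column layout of an expression table: feature identifiers in the first column, an ignored description in the second, and one column of expression values per sample after that. The following is a minimal Python sketch of reading that layout, not code taken from the workflow. The function name `parse_expression_table`, the sample names, and the gene names are hypothetical, and the sketch assumes the table starts directly with a column header row; real GCT files also begin with a version line and a dimensions line, which are not handled here.

```python
import csv
import io


def parse_expression_table(text):
    """Split a tab-delimited expression table into ids, sample names, and values.

    Assumed layout (hypothetical helper, following the description above):
    column 1 = feature identifier, column 2 = description (ignored),
    remaining columns = one expression value per sample.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    samples = header[2:]  # skip the identifier and description headers
    ids = []
    values = []
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        ids.append(row[0])
        # row[1] is the description column, which GSEA ignores
        values.append([float(x) for x in row[2:]])
    return ids, samples, values


# Made-up data for illustration only.
example = (
    "Name\tDescription\tsampleA\tsampleB\n"
    "GENE1\tNA\t10.5\t12.0\n"
    "GENE2\tNA\t0.0\t3.25\n"
)
ids, samples, values = parse_expression_table(example)
print(ids, samples, values)
```

Keeping the values as one list per feature mirrors the row-per-feature layout, so each inner list lines up with `samples` by position.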