Explore Workflows

Graph	Name	Retrieved From	View
	advanced-header.cwl	https://github.com/datirium/workflows.git Path: metadata/advanced-header.cwl Branch/Commit ID: 3c11de851cdc030ef50ba795e7a2ecd957a69007
	Generate genome indices for STAR & bowtie. Creates indices for: * [STAR](https://github.com/alexdobin/STAR) v2.5.3a (03/17/2017) PMID: [23104886](https://www.ncbi.nlm.nih.gov/pubmed/23104886) * [bowtie](http://bowtie-bio.sourceforge.net/tutorial.shtml) v1.2.0 (12/30/2016) It performs the following steps: 1. Runs `STAR --runMode genomeGenerate` to generate indices from [FASTA](http://zhanglab.ccmb.med.umich.edu/FASTA/) and [GTF](http://mblab.wustl.edu/GTF2.html) input files; returns results as an array of files 2. Outputs indices as a [Directory](http://www.commonwl.org/v1.0/CommandLineTool.html#Directory) data type 3. Separates the chrNameLength.txt file from the Directory output 4. Runs `bowtie-build` to generate indices; requires a genome [FASTA](http://zhanglab.ccmb.med.umich.edu/FASTA/) file as input and returns results as a group of main and secondary files	https://github.com/datirium/workflows.git Path: workflows/genome-indices.cwl Branch/Commit ID: e238d1756f1db35571e84d72e1699e5d1540f10c
	mutect parallel workflow	https://github.com/genome/analysis-workflows.git Path: definitions/subworkflows/mutect.cwl Branch/Commit ID: e59c77629936fad069007ba642cad49fef7ad29f
	timelimit-wf.cwl	https://github.com/common-workflow-language/cwl-v1.2.git Path: tests/timelimit-wf.cwl Branch/Commit ID: ea9f8634e41824ac3f81c3dde698d5f0eef54f1b
	wgs alignment with qc	https://github.com/genome/analysis-workflows.git Path: definitions/pipelines/wgs_alignment.cwl Branch/Commit ID: 735be84cdea041fcc8bd8cbe5728b29ca3586a21
	iwdr_with_nested_dirs.cwl	https://github.com/common-workflow-language/cwltool.git Path: cwltool/schemas/v1.0/v1.0/iwdr_with_nested_dirs.cwl Branch/Commit ID: cd779a90a4336563dcf13795111f502372c6af83
	Bismark Methylation - pipeline for BS-Seq data analysis. Sequence reads are first cleaned from adapters and transformed into fully bisulfite-converted forward (C->T) and reverse read (G->A conversion of the forward strand) versions, before they are aligned to similarly converted versions of the genome (also C->T and G->A converted). Sequence reads that produce a unique best alignment from the four alignment processes against the bisulfite genomes (which are running in parallel) are then compared to the normal genomic sequence and the methylation state of all cytosine positions in the read is inferred. A read is considered to align uniquely if an alignment has a unique best alignment score (as reported by the AS:i field). If a read produces several alignments with the same number of mismatches or with the same alignment score (AS:i field), the read (or read-pair) is discarded altogether. In the next step, the methylation call is extracted for every single C analysed. The position of every single C is written out to a new output file, depending on its context (CpG, CHG or CHH), whereby methylated Cs are labelled as forward reads (+) and non-methylated Cs as reverse reads (-). The output of the methylation extractor is then transformed into a bedGraph and coverage file. The bedGraph counts output is then used to generate a genome-wide cytosine report which reports the number on every single CpG (optionally every single cytosine) in the genome, irrespective of whether it was covered by any reads or not. As this type of report is informative for cytosines on both strands, the output may be fairly large (~46mn CpG positions or >1.2bn total cytosine positions in the human genome).	https://github.com/datirium/workflows.git Path: workflows/bismark-methylation-se.cwl Branch/Commit ID: 4f48ee6f8665a34cdf96e89c012ee807f80c7a3d
	scatter-valuefrom-wf3.cwl#main	https://github.com/common-workflow-language/cwl-v1.2.git Path: tests/scatter-valuefrom-wf3.cwl Branch/Commit ID: ea9f8634e41824ac3f81c3dde698d5f0eef54f1b Packed ID: main
	exome alignment and somatic variant detection for cle purpose	https://github.com/genome/analysis-workflows.git Path: definitions/pipelines/cle_somatic_exome.cwl Branch/Commit ID: aba52e94b6d7470132d3c092c26d67e29d615300
	WGS Metagenomic pipeline paired-end. This workflow taxonomically classifies paired-end sequencing reads in FASTQ format for a SINGLE sample. Reads are first adapter trimmed with trimgalore and filtered using kneaddata with a bmtagger database. The resulting cleaned reads are classified using Kraken2 and a user-selected pre-built database from a list of [genomic index files](https://benlangmead.github.io/aws-indexes/k2). Unaligned reads are then classified using metaphlan4 with the mpa_vJan21_CHOCOPhlAnSGB_202103 database. The kraken2 report is used to generate a krona plot visualization of the abundance profile. Cleaned reads are also run through HUMANN3 using the uniref90 diamond database to produce a gene abundance report and metabolic pathway file. The latter is used for abundance coverage and functional assignment. ### __Inputs__ Kraken2 database for taxonomic classification: - Standard is recommended Read 1 file: - FASTA/Q input R1 from a paired-end library Read 2 file: - FASTA/Q input R2 from a paired-end library Number of threads for steps that support multithreading: - Number of threads for steps that support multithreading - default set to `4` Advanced Inputs Tab (Optional): - Number of bases to clip from the 3p end - Number of bases to clip from the 5p end ### __Outputs__ - kraken2 report (abundance profile) - krona plot (hierarchical visualization of taxonomic classifications) - various log files - metabolic pathway file - functional assignment ### __Data Analysis Steps__ 1. QC raw FASTQ files with fastQC and trimmomatic - OUTPUT1: trimmed FASTQ files 2. Filter human reads out of OUTPUT1 with the KneadData tool - OUTPUT2: filtered FASTQ files 3. Classify OUTPUT2 with kraken2 using “Standard” database (RefSeq archaea, bacteria, viral, plasmid, human, UniVec_Core) - OUTPUT3: taxonomic abundance profile - OUTPUT4: FASTQ files of unclassified reads - VISUALIZATION1: krakenreport to kronaplot 4. Attempt to classify OUTPUT4 with MetaPhlAn using “latest” database - OUTPUT5: taxonomic abundance profile of unclassified kraken2 reads 5. Classify OUTPUT2 with MetaPhlAn using “latest” database - OUTPUT6: final computed taxon abundances (listed one clade per line, tab-separated from the clade's relative abundance in percent) - format: https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-Workshop-on-Genomics-2023#13-metaphlan-output-files - used in the multi-sample workflow (https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-Workshop-on-Genomics-2023#15-analyzing-multiple-samples) 6. Use OUTPUT2 in Metagenome functional profiling/assignment with HUMAnN using “uniref : uniref90_diamond” database - database link: http://huttenhower.sph.harvard.edu/humann_data/uniprot/uniref_annotated/uniref90_annotated_v201901b_full.tar.gz - OUTPUT7: _genefamilies.tsv, contains the abundances of each gene family in the community in reads per kilobase (RPK) units - OUTPUT8: _pathabundance.tsv, lists the abundances of each pathway in the community, also in RPK units as described for gene families - OUTPUT9: normalized_genefamilies-cpm.tsv, contains the normalized abundances of each gene family in counts per million (CPM) units - OUTPUT10: rxn-cpm.tsv, regroups the CPM-normalized gene family abundance values to MetaCyc reaction (RXN) abundances - https://github.com/biobakery/MetaPhlAn/wiki/HUMAnN-Workshop-on-Genomics-2023#3-manipulating-humann-output-tables ### __References__ - McIver LJ, Abu-Ali G, Franzosa EA, Schwager R, Morgan XC, Waldron L, Segata N, Huttenhower C. bioBakery: a meta'omic analysis environment. Bioinformatics. 2018 Apr 1;34(7):1235-1237. PMID: 29194469 - Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999; 27(2):573–580. doi:10.1093/nar/27.2.573 - [Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4.](https://doi.org/10.1038/s41587-023-01688-w) Aitor Blanco-Miguez, Francesco Beghini, Fabio Cumbo, Lauren J. McIver, Kelsey N. Thompson, Moreno Zolfo, Paolo Manghi, Leonard Dubois, Kun D. Huang, Andrew Maltez Thomas, Gianmarco Piccinno, Elisa Piperni, Michal Punčochář, Mireia Valles-Colomer, Adrian Tett, Francesca Giordano, Richard Davies, Jonathan Wolf, Sarah E. Berry, Tim D. Spector, Eric A. Franzosa, Edoardo Pasolli, Francesco Asnicar, Curtis Huttenhower, Nicola Segata. Nature Biotechnology (2023)	https://github.com/datirium/workflows.git Path: workflows/wgs-metagenomics-pe.cwl Branch/Commit ID: 93b844a80f4008cc973ea9b5efedaff32a343895
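
Each "Retrieved From" cell in the table packs a repository URL, a file path, a commit ID and, for packed workflows, a packed ID into a single string. The following is a minimal Python sketch of splitting such a cell into its fields. It assumes the labels `Path:`, `Branch/Commit ID:` and `Packed ID:` always appear exactly as in the rows above and that no field contains whitespace. The `WorkflowSource` type and the `parse_retrieved_from` function are hypothetical names chosen for illustration, not part of any CWL tool.

```python
import re
from typing import NamedTuple, Optional


class WorkflowSource(NamedTuple):
    """Fields of one "Retrieved From" cell (hypothetical helper type)."""
    repo_url: str
    path: str
    commit_id: str
    packed_id: Optional[str]  # only present for packed workflows


# Assumption: labels appear in this fixed order and fields have no whitespace.
_RETRIEVED_FROM = re.compile(
    r"^(?P<repo>\S+)\s+Path:\s+(?P<path>\S+)"
    r"\s+Branch/Commit ID:\s+(?P<commit>\S+)"
    r"(?:\s+Packed ID:\s+(?P<packed>\S+))?\s*$"
)


def parse_retrieved_from(cell: str) -> WorkflowSource:
    """Split a "Retrieved From" cell into repo URL, path, commit and packed ID."""
    match = _RETRIEVED_FROM.match(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized 'Retrieved From' cell: {cell!r}")
    return WorkflowSource(
        repo_url=match["repo"],
        path=match["path"],
        commit_id=match["commit"],
        packed_id=match["packed"],  # None when the optional group is absent
    )
```

Calling `parse_retrieved_from` on the `scatter-valuefrom-wf3.cwl#main` row would yield a `packed_id` of `main`, while rows without a `Packed ID:` label leave that field as `None`.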
