Explore Workflows

Browse workflows that have already been parsed, or add your own.

Graph	Name	Retrieved From	View
	assm_assm_blastn_wnode	https://github.com/ncbi/pgap.git Path: task_types/tt_assm_assm_blastn_wnode.cwl Branch/Commit ID: b4a6e46405c08e0b14ad92f0ab38bcc4a69caa5c
	bact_get_kmer_reference	https://github.com/ncbi/pgap.git Path: task_types/tt_bact_get_kmer_reference.cwl Branch/Commit ID: 16e3915d2a357e2a861b30911c832e5ddc0c1784
	wgs alignment and somatic variant detection	https://github.com/genome/analysis-workflows.git Path: definitions/pipelines/somatic_wgs_nonhuman.cwl Branch/Commit ID: 788bdc99c1d5b6ee7c431c3c011eb30d385c1370
	RNA-Seq pipeline single-read stranded mitochondrial Slightly modified version of the original [BioWardrobe's](https://biowardrobe.com) [PubMed ID:26248465](https://www.ncbi.nlm.nih.gov/pubmed/26248465) RNA-Seq basic analysis for strand-specific single-read experiments. Additional steps were added to map data to the mitochondrial chromosome only and then merge the output. Experiment files in [FASTQ](http://maq.sourceforge.net/fastq.shtml) format, either compressed or not, can be used. This workflow should be used only with single-read strand-specific RNA-Seq data. It performs the following steps: 1. `STAR` to align reads from the input FASTQ file according to the predefined reference indices; generate an unsorted BAM file and an alignment statistics file 2. `fastx_quality_stats` to analyze the input FASTQ file and generate a quality statistics file 3. `samtools sort` to generate a coordinate-sorted BAM(+BAI) file pair from the unsorted BAM file obtained in step 1 (after running STAR) 5. Generate a BigWig file based on the sorted BAM file 6. Map the input FASTQ file to predefined rRNA reference indices using Bowtie to determine the level of rRNA contamination; export the resulting statistics to a file 7. Calculate isoform expression levels for the sorted BAM file and GTF/TAB annotation file using the `GEEP` reads-counting utility; export the results to a file	https://github.com/datirium/workflows.git Path: workflows/rnaseq-se-dutp-mitochondrial.cwl Branch/Commit ID: 92f1a6da9c4f85fb51340b01b32373a50fde0891
	Kraken2 Database installation pipeline This workflow downloads the user-selected pre-built kraken2 database from: https://benlangmead.github.io/aws-indexes/k2 ### __Inputs__ Select a pre-built Kraken2 database to download and use for metagenomic classification: - Available options comprised of various combinations of RefSeq reference genome sets: - [Viral (0.5 GB)](https://genome-idx.s3.amazonaws.com/kraken/k2_viral_20221209.tar.gz), all refseq viral genomes - [MinusB (8.7 GB)](https://genome-idx.s3.amazonaws.com/kraken/k2_minusb_20221209.tar.gz), standard minus bacteria (archaea, viral, plasmid, human1, UniVec_Core) - [PlusPFP-16 (15.0 GB)](https://genome-idx.s3.amazonaws.com/kraken/k2_pluspfp_16gb_20221209.tar.gz), standard (archaea, bacteria, viral, plasmid, human1, UniVec_Core) + (protozoa, fungi & plant) capped at 16 GB (shrunk via random kmer downselect) - [EuPathDB46 (34.1 GB)](https://genome-idx.s3.amazonaws.com/kraken/k2_eupathdb48_20201113.tar.gz), eukaryotic pathogen genomes with contaminants removed (https://veupathdb.org/veupathdb/app) - [16S_gg_13_5 (73 MB)](https://genome-idx.s3.amazonaws.com/kraken/16S_Greengenes13.5_20200326.tgz), Greengenes 16S rRNA database ([release 13.5](https://greengenes.secondgenome.com/?prefix=downloads/greengenes_database/gg_13_5/), 20200326) - [16S_silva_138 (112 MB)](https://genome-idx.s3.amazonaws.com/kraken/16S_Silva138_20200326.tgz), SILVA 16S rRNA database ([release 138.1](https://www.arb-silva.de/documentation/release-1381/), 20200827) ### __Outputs__ - k2db, an upstream database used by kraken2 classification tool - compressed_k2db_tar, compressed and tarred kraken2 database directory file for download and use outside of scidap ### __Data Analysis Steps__ 1. download selected pre-built kraken2 database. 2. make available as upstream source for kraken2 metagenomic taxonomic classification. ### __References__ - Wood, D.E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol 20, 257 (2019). https://doi.org/10.1186/s13059-019-1891-0	https://github.com/datirium/workflows.git Path: workflows/kraken2-databases.cwl Branch/Commit ID: 57863b6131d8262c5ce864adaf8e4038401e71a2
	rnaseq-se.cwl RNA-Seq basic analysis workflow for single-read experiment.	https://github.com/datirium/workflows.git Path: workflows/rnaseq-se.cwl Branch/Commit ID: 3ceeb2e90f49579369b2e10485908516348381a9
	kmer_ref_compare_wnode	https://github.com/ncbi/pgap.git Path: task_types/tt_kmer_ref_compare_wnode.cwl Branch/Commit ID: 68058b108cb5b0b72ebe244c42eefa2747e1d64a
	format_rrnas_from_seq_entry	https://github.com/ncbi/pgap.git Path: task_types/tt_format_rrnas_from_seq_entry.cwl Branch/Commit ID: 76a9637a06e2102645eae29aff10b6f7185892a5
	igv-report_maf_workflow.cwl Workflow to run GetBaseCountsMultiSample fillout on a number of samples, each with their own bam and maf files	https://github.com/mskcc/pluto-cwl.git Path: cwl/igv-report_maf_workflow.cwl Branch/Commit ID: 7eb2b0a4d37018142233d770595ac2e00376dab4
	samples_fillout_index_workflow.cwl Wrapper to run indexing on all bams before submitting for samples fillout. Includes secondary input channels to allow for including .bam files that do not have indexes. Also includes extra handling needed for files that might not meet the needs of the fillout workflow. NOTE: need v1.1 upgrade so we can do it all from a single channel with optional secondary files; https://www.commonwl.org/v1.1/CommandLineTool.html#SecondaryFileSchema	https://github.com/mskcc/pluto-cwl.git Path: cwl/samples_fillout_index_workflow.cwl Branch/Commit ID: 342e6f1f4f7a3839e579fbe96ccc8d6f7a61ac77
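
Each "Retrieved From" entry in the table above follows the same pattern: a Git repository URL, then `Path:`, then `Branch/Commit ID:`. The following is a minimal Python sketch of how such an entry could be split into its parts. The names `WorkflowSource`, `parse_retrieved_from`, and `raw_github_url` are hypothetical helpers invented for illustration and are not part of this site or of any CWL tool. The raw-file URL layout assumes the repository is hosted on GitHub.

```python
import re
from typing import NamedTuple


class WorkflowSource(NamedTuple):
    """Hypothetical container for one 'Retrieved From' entry."""
    repo_url: str
    path: str
    commit: str


# Matches "<repo URL> Path: <file path> Branch/Commit ID: <ref>"
_PATTERN = re.compile(
    r"^(?P<repo>\S+)\s+Path:\s+(?P<path>\S+)\s+Branch/Commit ID:\s+(?P<commit>\S+)$"
)


def parse_retrieved_from(field: str) -> WorkflowSource:
    """Split a 'Retrieved From' cell into repository, path, and commit."""
    match = _PATTERN.match(field.strip())
    if match is None:
        raise ValueError(f"unrecognized 'Retrieved From' entry: {field!r}")
    return WorkflowSource(match["repo"], match["path"], match["commit"])


def raw_github_url(src: WorkflowSource) -> str:
    """Build a raw-file URL; assumes (not guaranteed) a GitHub-hosted repo."""
    prefix = "https://github.com/"
    if not src.repo_url.startswith(prefix):
        raise ValueError("only GitHub repositories are handled in this sketch")
    slug = src.repo_url[len(prefix):]
    if slug.endswith(".git"):
        slug = slug[: -len(".git")]
    return f"https://raw.githubusercontent.com/{slug}/{src.commit}/{src.path}"


if __name__ == "__main__":
    entry = (
        "https://github.com/ncbi/pgap.git "
        "Path: task_types/tt_assm_assm_blastn_wnode.cwl "
        "Branch/Commit ID: b4a6e46405c08e0b14ad92f0ab38bcc4a69caa5c"
    )
    source = parse_retrieved_from(entry)
    print(source)
    print(raw_github_url(source))
```

Pinning to the commit ID rather than a branch name means the resolved file matches the version that was parsed for this listing.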