Explore Workflows

Browse the already-parsed workflows below, or add your own.

Graph	Name	Retrieved From	View
	steplevel-resreq.cwl	https://github.com/common-workflow-language/cwltool.git Path: cwltool/schemas/v1.0/v1.0/steplevel-resreq.cwl Branch/Commit ID: 1eb6bfe3c77aebaf69453a669d21ae7a5a78056f
	miRNA-Seq miRDeep2 pipeline

A CWL workflow for discovering known or novel miRNAs from deep sequencing data using the miRDeep2 tool. The ExoCarta exosome database is also used for identifying exosome-related miRNAs, and TargetScan's organism-specific databases are used for identifying miRNA gene targets.

## __Outputs__

#### Primary Output files:
- mirs_known.tsv, detected known mature miRNAs, "Known miRNAs" tab
- mirs_novel.tsv, detected novel mature miRNAs, "Novel miRNAs" tab

#### Secondary Output files:
- mirs_known_exocarta_deepmirs.tsv, list of detected miRNAs also in ExoCarta's exosome database, "Detected Exosome miRNAs" tab
- mirs_known_gene_targets.tsv, pre-computed gene targets of known mature mirs, downloadable
- known_mirs_mature.fa, known mature mir sequences, downloadable
- known_mirs_precursor.fa, known precursor mir sequences, downloadable
- novel_mirs_mature.fa, novel mature mir sequences, downloadable
- novel_mirs_precursor.fa, novel precursor mir sequences, downloadable

#### Reports:
- overview.md (input list, alignment & mir metrics), "Overview" tab
- mirdeep2_result.html, summary of mirdeep2 results, "miRDeep2 Results" tab

## __Inputs__

#### General Info
- Sample short name/Alias: unique name for the sample
- Experimental condition: name of the condition, variable, etc. (e.g. "control" or "20C 60min")
- Cells: name of the cells used for the sample
- Catalog No.: vendor catalog number, if available
- Bowtie2 index: Bowtie2 index directory of the reference genome.
- Reference Genome FASTA: Reference genome FASTA file to be used for alignment.
- Genome short name: Name used for setting organism name, genus, species, and tax ID.
- Input FASTQ file: FASTQ file from a single-end miRNA sequencing run.

#### Advanced
- Adapter: Adapter sequence to be trimmed from miRNA sequence reads. (Default: TCGTAT)
- Threads: Number of threads to use for steps that support multithreading (Default: 4).

## Hints & Tips:

#### For the identification of novel miRNA candidates, the following may be used as a filtering guideline:
1. miRDeep score > 4 (some authors use 1)
2. no match in Rfam
3. a significant RNAfold result ("yes")
4. number of mature reads > 10
5. if applicable, the novel mir must be expressed in multiple samples

#### For filtering mirbase by organism.

| genome | organism | division | name | tree | NCBI-taxid |
| ---- | --- | --- | ----------- | ----------- | ----------- |
| hg19 | hsa | HSA | Homo sapiens | Metazoa;Bilateria;Deuterostoma;Chordata;Vertebrata;Mammalia;Primates;Hominidae | 9606 |
| hg38 | hsa | HSA | Homo sapiens | Metazoa;Bilateria;Deuterostoma;Chordata;Vertebrata;Mammalia;Primates;Hominidae | 9606 |
| mm10 | mmu | MMU | Mus musculus | Metazoa;Bilateria;Deuterostoma;Chordata;Vertebrata;Mammalia;Rodentia | 10090 |
| rn7 | rno | RNO | Rattus norvegicus | Metazoa;Bilateria;Deuterostoma;Chordata;Vertebrata;Mammalia;Rodentia | 10116 |
| dm3 | dme | DME | Drosophila melanogaster | Metazoa;Bilateria;Ecdysozoa;Arthropoda;Hexapoda | 7227 |

## __Data Analysis Steps__
1. The miRDeep2 Mapper module processes Illumina FASTQ output and maps it to the reference genome.
2. The miRDeep2 miRDeep2 module identifies known and novel (mature and precursor) miRNAs.
3. The ExoCarta database of miRNAs found in exosomes is then used to find the overlap between mirs_known.tsv and exosome-associated miRNAs.
4. Finally, the TargetScan organism-specific miRNA gene target database is used to find the overlap between mirs_known.tsv and gene targets.

## __References__
1. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245920
2. https://github.com/rajewsky-lab/mirdeep2
3. https://biocontainers.pro/tools/mirdeep2
4. https://www.mirbase.org/
5. http://exocarta.org/index.html
6. https://www.targetscan.org/vert_80/	https://github.com/datirium/workflows.git Path: workflows/mirna-mirdeep2-se.cwl Branch/Commit ID: 57863b6131d8262c5ce864adaf8e4038401e71a2
	star-index.cwl Generates indices for STAR v2.5.3a (03/17/2017).	https://github.com/datirium/workflows.git Path: workflows/star-index.cwl Branch/Commit ID: e284e3f6dff25037b209895c52f2abd37a1ce1bf
	scatter-valuefrom-wf1.cwl	https://github.com/common-workflow-language/cwl-v1.1.git Path: tests/scatter-valuefrom-wf1.cwl Branch/Commit ID: 50251ef931d108c09bed2d330d3d4fe9c562b1c3
	heatmap-prepare.cwl The workflow runs the homer-make-tag-directory.cwl tool using scatter for the following inputs: bam_file, fragment_size, total_reads. `dotproduct` is used as the `scatterMethod`, so one element is taken from each array to construct each job: 1) bam_file[0] fragment_size[0] total_reads[0]; 2) bam_file[1] fragment_size[1] total_reads[1]; ... N) bam_file[N] fragment_size[N] total_reads[N]. The `bam_file`, `fragment_size` and `total_reads` arrays must be in the same order.	https://github.com/datirium/workflows.git Path: tools/heatmap-prepare.cwl Branch/Commit ID: 57863b6131d8262c5ce864adaf8e4038401e71a2
	workflow_input_sf_expr_v1_2.cwl	https://github.com/common-workflow-language/cwl-utils.git Path: testdata/workflow_input_sf_expr_v1_2.cwl Branch/Commit ID: 0ab1d42d10f7311bb4032956c4a6f3d2730d9507
	HS Metrics workflow	https://github.com/genome/analysis-workflows.git Path: definitions/subworkflows/hs_metrics.cwl Branch/Commit ID: 60edaf6f57eaaf02cda1a3d8cb9a825aa64a43e2
	Bismark Methylation - pipeline for BS-Seq data analysis. Sequence reads are first cleaned of adapters and transformed into fully bisulfite-converted forward (C->T) and reverse (G->A conversion of the forward strand) versions before they are aligned to similarly converted versions of the genome (also C->T and G->A converted). Sequence reads that produce a unique best alignment from the four alignment processes against the bisulfite genomes (which run in parallel) are then compared to the normal genomic sequence, and the methylation state of all cytosine positions in the read is inferred. A read is considered to align uniquely if an alignment has a unique best alignment score (as reported by the AS:i field). If a read produces several alignments with the same number of mismatches or with the same alignment score (AS:i field), the read (or read pair) is discarded altogether. In the next step, the methylation call for every single C analysed is extracted. The position of every single C is written to a new output file, depending on its context (CpG, CHG or CHH), whereby methylated Cs are labelled as forward reads (+) and non-methylated Cs as reverse reads (-). The output of the methylation extractor is then transformed into a bedGraph and a coverage file. The bedGraph counts output is then used to generate a genome-wide cytosine report, which reports the number on every single CpG (optionally every single cytosine) in the genome, irrespective of whether it was covered by any reads or not. As this type of report is informative for cytosines on both strands, the output may be fairly large (~46mn CpG positions or >1.2bn total cytosine positions in the human genome).	https://github.com/datirium/workflows.git Path: workflows/bismark-methylation-se.cwl Branch/Commit ID: ee66d03be8a7fd61367db40c37a973ff55ece4da
	gcaccess_from_list	https://github.com/ncbi/pgap.git Path: task_types/tt_gcaccess_from_list.cwl Branch/Commit ID: 3bec7182e39cb4af10ed8920639adfa78a28ed81
	RNA-Seq pipeline paired-end The original [BioWardrobe](https://biowardrobe.com) ([PubMed ID: 26248465](https://www.ncbi.nlm.nih.gov/pubmed/26248465)) basic RNA-Seq analysis for a paired-end experiment. The corresponding input [FASTQ](http://maq.sourceforge.net/fastq.shtml) files must be provided. This workflow should be used only with paired-end RNA-Seq data. It performs the following steps: 1. Use STAR to align reads from the input FASTQ files against the predefined reference indices; generate an unsorted BAM file and an alignment statistics file 2. Use fastx_quality_stats to analyze the input FASTQ files and generate quality statistics files 3. Use samtools sort to generate a coordinate-sorted BAM(+BAI) file pair from the unsorted BAM file obtained in step 1 (after running STAR) 4. Generate a BigWig file from the sorted BAM file 5. Map the input FASTQ files to the predefined rRNA reference indices using Bowtie to determine the level of rRNA contamination; export the resulting statistics to a file 6. Calculate isoform expression levels for the sorted BAM file and the GTF/TAB annotation file using the GEEP reads-counting utility; export the results to a file	https://github.com/datirium/workflows.git Path: workflows/rnaseq-pe.cwl Branch/Commit ID: f3e44d3b0f198cf5245c49011124dc3b6c2b06fd
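
The `dotproduct` scatter described in the heatmap-prepare.cwl entry can be pictured with a small Python sketch. It is not CWL and not how any CWL runner is implemented; the function name `dotproduct_jobs` and the sample values are made up for illustration. It only shows the pairing rule: the i-th element of every scattered array goes into the i-th job, which is why the arrays must have the same length and order.

```python
from typing import Dict, List, Sequence


def dotproduct_jobs(inputs: Dict[str, Sequence]) -> List[dict]:
    """Pair the i-th element of every scattered array into the i-th job."""
    lengths = {len(values) for values in inputs.values()}
    if len(lengths) > 1:
        # A dotproduct pairing is only defined for arrays of equal length.
        raise ValueError("dotproduct scatter requires arrays of equal length")
    names = list(inputs)
    return [dict(zip(names, row)) for row in zip(*inputs.values())]


# Hypothetical sample values, used only to show the pairing.
jobs = dotproduct_jobs({
    "bam_file": ["a.bam", "b.bam"],
    "fragment_size": [150, 200],
    "total_reads": [1000000, 2500000],
})
for job in jobs:
    print(job)
```

If the arrays were supplied in different orders, each job would silently combine a BAM file with another sample's fragment size and read count, which is why the entry asks for identical ordering.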
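
The filtering guideline for novel miRNA candidates in the miRNA-Seq miRDeep2 pipeline entry can be expressed as a simple predicate. The sketch below assumes a hypothetical `NovelCandidate` record; its field names are illustrative and are not the column names of any miRDeep2 output file. The thresholds are the ones listed in the guideline and are exposed as parameters, since the entry notes that some authors use a score cutoff of 1.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class NovelCandidate:
    # Hypothetical record; field names are illustrative only.
    name: str
    mirdeep_score: float
    rfam_match: bool          # True if the candidate matches an Rfam entry
    significant_fold: bool    # True if the fold significance flag is "yes"
    mature_reads: int
    samples_expressed: int = 1


def filter_novel(
    candidates: Iterable[NovelCandidate],
    min_score: float = 4.0,
    min_mature_reads: int = 10,
    min_samples: int = 1,
) -> List[NovelCandidate]:
    """Keep candidates that satisfy every criterion of the guideline."""
    return [
        c for c in candidates
        if c.mirdeep_score > min_score
        and not c.rfam_match
        and c.significant_fold
        and c.mature_reads > min_mature_reads
        and c.samples_expressed >= min_samples
    ]


# Made-up candidates to exercise each criterion.
candidates = [
    NovelCandidate("cand_a", 5.2, False, True, 25),
    NovelCandidate("cand_b", 3.0, False, True, 40),   # score too low
    NovelCandidate("cand_c", 6.0, True, True, 40),    # matches Rfam
    NovelCandidate("cand_d", 6.0, False, True, 10),   # reads not above 10
]
kept = [c.name for c in filter_novel(candidates)]
print(kept)
```

Setting `min_samples` above 1 applies the optional fifth criterion, that a novel mir be expressed in multiple samples.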
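
Two ideas from the Bismark Methylation entry lend themselves to a short illustration: the in-silico C->T and G->A conversions, and the classification of a cytosine by sequence context (CpG, CHG or CHH, where H is A, C or T). The following is a minimal sketch of those two definitions only. It is not Bismark's code, and the function names are assumptions made for this example.

```python
def bisulfite_convert(seq: str, version: str = "forward") -> str:
    """Return the fully converted sequence: C->T (forward) or G->A (reverse)."""
    seq = seq.upper()
    if version == "forward":
        return seq.replace("C", "T")
    if version == "reverse":
        return seq.replace("G", "A")
    raise ValueError("version must be 'forward' or 'reverse'")


def cytosine_context(seq: str, pos: int) -> str:
    """Classify the C at index pos as CpG, CHG or CHH (H = A, C or T)."""
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position does not hold a cytosine")
    following = seq[pos + 1:pos + 3]
    if following[:1] == "G":
        return "CpG"
    if len(following) == 2 and following[0] in "ACT":
        if following[1] == "G":
            return "CHG"
        if following[1] in "ACT":
            return "CHH"
    # Too close to the end of the sequence, or an ambiguous base such as N.
    return "unknown"


print(bisulfite_convert("ACGTC"))             # forward version
print(bisulfite_convert("ACGTC", "reverse"))  # reverse version
print(cytosine_context("ACAGT", 1))
```

Contexts are determined on the unconverted genomic sequence, which is consistent with the entry's statement that aligned reads are compared back to the normal genomic sequence before methylation states are inferred.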