Explore Workflows
Graph | Name | Retrieved From | View |
---|---|---|---|
bam-bedgraph-bigwig.cwl
Workflow converts an input BAM file into bigWig and bedGraph files. The input BAM file must be sorted by coordinates (required by the `bam_to_bedgraph` step). If the `split` input is not provided, it defaults to true. This default is implemented in the `valueFrom` field of the `split` input inside the `bam_to_bedgraph` step to avoid a possible bug in cwltool when setting default values for workflow inputs. `scale` takes priority over `mapped_reads_number`: the latter is used to calculate the `-scale` parameter for `bedtools genomecov` (step `bam_to_bedgraph`) only when the `scale` input is not provided. All of this logic is implemented inside `bedtools-genomecov.cwl`. `bigwig_filename` defines the output name only for the generated bigWig file. `bedgraph_filename` defines the output name for the generated bedGraph file and can also influence the generated bigWig filename when `bigwig_filename` is not provided. Workflow inputs and outputs have no `format` field, to avoid format incompatibility errors when the workflow is used as a subworkflow. |
https://github.com/datirium/workflows.git
Path: tools/bam-bedgraph-bigwig.cwl Branch/Commit ID: 1a46cb0e8f973481fe5ae3ae6188a41622c8532e |
||
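The input-resolution rules described for `bam-bedgraph-bigwig.cwl` can be sketched in Python as follows. This is a minimal illustration, not the code in `bedtools-genomecov.cwl`: the function names are hypothetical, and both the reads-per-million formula (`1000000 / mapped_reads_number`) and the rule for deriving the bigWig name from the bedGraph name are assumptions made for the example.

```python
import os
from typing import Optional


def resolve_split(split: Optional[bool]) -> bool:
    """Return the effective `split` value: true when the input is omitted."""
    return True if split is None else split


def resolve_scale(scale: Optional[float],
                  mapped_reads_number: Optional[int]) -> Optional[float]:
    """Pick the value passed to `bedtools genomecov -scale`.

    An explicit `scale` always wins; `mapped_reads_number` is consulted only
    when `scale` is missing. ASSUMPTION: the derived value is a
    reads-per-million factor (the listing does not state the formula).
    """
    if scale is not None:
        return scale
    if mapped_reads_number:
        return 1_000_000 / mapped_reads_number
    return None  # neither input given: no scaling value is derived


def resolve_bigwig_filename(bigwig_filename: Optional[str],
                            bedgraph_filename: Optional[str]) -> Optional[str]:
    """Choose the bigWig output name.

    ASSUMPTION: when only `bedgraph_filename` is given, the bigWig name
    reuses its stem with a `.bigWig` extension.
    """
    if bigwig_filename is not None:
        return bigwig_filename
    if bedgraph_filename is not None:
        stem, _ = os.path.splitext(bedgraph_filename)
        return stem + ".bigWig"
    return None  # leave naming to the underlying tool
```

For example, `resolve_scale(2.0, 500000)` returns the explicit `2.0`, while `resolve_scale(None, 2000000)` falls back to the value derived from the read count.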
CLIP-Seq pipeline for single-read experiment NNNNG
**Cross-Linking ImmunoPrecipitation** `CLIP` (`cross-linking immunoprecipitation`) is a method used in molecular biology that combines UV cross-linking with immunoprecipitation in order to analyse protein interactions with RNA or to precisely locate RNA modifications (e.g. m6A) (Uhl et al., 2017; Ule et al., 2003; Sugimoto et al., 2012; Zhang and Darnell, 2011; Ke et al., 2015). CLIP-based techniques can be used to map RNA-binding protein binding sites or RNA modification sites (Ke et al., 2015; Ke et al., 2017) of interest on a genome-wide scale, thereby increasing the understanding of post-transcriptional regulatory networks. The identification of sites where RNA-binding proteins (RNABPs) interact with target RNAs opens the door to understanding the vast complexity of RNA regulation. UV cross-linking and immunoprecipitation (CLIP) is a transformative technology in which RNAs purified from _in vivo_ cross-linked RNA-protein complexes are sequenced to reveal footprints of RNABP:RNA contacts. CLIP combined with high-throughput sequencing (HITS-CLIP) is a generalizable strategy to produce transcriptome-wide maps of RNA binding with higher accuracy and resolution than standard RNA immunoprecipitation (RIP) profiling or purely computational approaches. The application of CLIP to Argonaute proteins has expanded the utility of this approach to mapping binding sites for microRNAs and other small regulatory RNAs. Finally, recent advances in data analysis take advantage of cross-link–induced mutation sites (CIMS) to refine RNA-binding maps to single-nucleotide resolution. Once IP conditions are established, HITS-CLIP takes ~8 d to prepare RNA for sequencing. Established pipelines for data analysis, including those for CIMS, take 3–4 d. **Workflow** CLIP begins with the _in vivo_ cross-linking of RNA-protein complexes using ultraviolet light (UV). Upon UV exposure, covalent bonds are formed between proteins and nucleic acids that are in close proximity (Darnell, 2012). The cross-linked cells are then lysed, and the protein of interest is isolated via immunoprecipitation. In order to allow for sequence-specific priming of reverse transcription, RNA adapters are ligated to the 3' ends, while radiolabeled phosphates are transferred to the 5' ends of the RNA fragments. The RNA-protein complexes are then separated from free RNA using gel electrophoresis and membrane transfer. Proteinase K digestion is then performed in order to remove protein from the RNA-protein complexes. This step leaves a peptide at the cross-link site, allowing for the identification of the cross-linked nucleotide (König, McGlincy and Ule, 2012). After ligating RNA linkers to the RNA 5' ends, cDNA is synthesized via RT-PCR. High-throughput sequencing is then used to generate reads containing distinct barcodes that identify the last cDNA nucleotide. Interaction sites can be identified by mapping the reads back to the transcriptome. |
https://github.com/datirium/workflows.git
Path: workflows/clipseq-se.cwl Branch/Commit ID: 09267e79fd867aa68a219c69e6db7d8e2e877be2 |
||
annotator_sub_wf.cwl
This is a subworkflow of the main oxog_varbam_annotat_wf workflow - this is not meant to be run as a stand-alone workflow! |
https://github.com/icgc-tcga-pancancer/oxog-dockstore-tools.git
Path: annotator_sub_wf.cwl Branch/Commit ID: 6366ed398da10019b6d81a789291af6d909f28f4 |
||
Filter ChIP/ATAC peaks for Tag Density Profile or Motif Enrichment analyses
**Filters ChIP/ATAC peaks with the nearest genes assigned for Tag Density Profile or Motif Enrichment analyses** Tool filters output from any ChIP/ATAC pipeline to create a file with regions of interest for Tag Density Profile or Motif Enrichment analyses. Peaks with duplicated coordinates are discarded. |
https://github.com/datirium/workflows.git
Path: workflows/filter-peaks-for-heatmap.cwl Branch/Commit ID: c9e7f3de7f6ba38ee663bd3f9649e8d7dbac0c86 |
||
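The duplicate-coordinate rule mentioned for `filter-peaks-for-heatmap.cwl` could look roughly like the Python sketch below. The peak tuple layout and the function name are hypothetical, and the sketch assumes that "discarded" means keeping the first peak seen at each coordinate and dropping later repeats; the workflow itself may behave differently.

```python
from typing import Iterable, List, Tuple

# Hypothetical peak record: (chromosome, start, end, nearest_gene)
Peak = Tuple[str, int, int, str]


def drop_duplicate_peaks(peaks: Iterable[Peak]) -> List[Peak]:
    """Keep only the first peak for each (chromosome, start, end) triple.

    ASSUMPTION: later peaks with identical coordinates are the ones removed.
    Input order is preserved for the peaks that remain.
    """
    seen = set()
    unique: List[Peak] = []
    for peak in peaks:
        coords = peak[:3]
        if coords in seen:
            continue  # same coordinates already kept once
        seen.add(coords)
        unique.append(peak)
    return unique


example_peaks = [
    ("chr1", 100, 200, "GENE_A"),
    ("chr1", 100, 200, "GENE_B"),  # duplicated coordinates
    ("chr2", 50, 90, "GENE_C"),
]
filtered_peaks = drop_duplicate_peaks(example_peaks)
```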
RNA-Seq pipeline single-read stranded mitochondrial
Slightly modified version of the original [BioWardrobe's](https://biowardrobe.com) [PubMed ID:26248465](https://www.ncbi.nlm.nih.gov/pubmed/26248465) **RNA-Seq** basic analysis for a **strand specific single-read** experiment. Additional steps were added to map data to the mitochondrial chromosome only and then merge the output. Experiment files in [FASTQ](http://maq.sourceforge.net/fastq.shtml) format, either compressed or not, can be used. This workflow should be used only with single-read strand specific RNA-Seq data. It performs the following steps: 1. `STAR` aligns reads from the input FASTQ file according to the predefined reference indices and generates an unsorted BAM file and an alignment statistics file 2. `fastx_quality_stats` analyzes the input FASTQ file and generates a quality statistics file 3. `samtools sort` generates a coordinate-sorted BAM(+BAI) file pair from the unsorted BAM file obtained in step 1 (after running STAR) 5. Generate a BigWig file based on the sorted BAM file 6. Map the input FASTQ file to predefined rRNA reference indices using Bowtie to determine the level of rRNA contamination; export the resulting statistics to a file 7. Calculate isoform expression levels for the sorted BAM file and GTF/TAB annotation file using the `GEEP` reads-counting utility; export the results to a file |
https://github.com/datirium/workflows.git
Path: workflows/rnaseq-se-dutp-mitochondrial.cwl Branch/Commit ID: 581156366f91861bd4dbb5bcb59f67d468b32af3 |
||
Cut-n-Run pipeline paired-end
Experimental pipeline for Cut-n-Run analysis. Uses mapping results from the following experiment types: - `chipseq-pe.cwl` - `trim-chipseq-pe.cwl` - `trim-atacseq-pe.cwl` Note that the upstream analyses should not have duplicates removed. |
https://github.com/datirium/workflows.git
Path: workflows/trim-chipseq-pe-cut-n-run.cwl Branch/Commit ID: a839eb6390974089e1a558c49fc07b4c66c50767 |
||
group-isoforms-batch.cwl
Workflow runs the `group-isoforms.cwl` tool, scattering over the `isoforms_file` input. The `genes_filename` and `common_tss_filename` inputs are ignored. |
https://github.com/datirium/workflows.git
Path: tools/group-isoforms-batch.cwl Branch/Commit ID: 4ab9399a4777610a579ea2c259b9356f27641dcc |
||
group-isoforms-batch.cwl
Workflow runs the `group-isoforms.cwl` tool, scattering over the `isoforms_file` input. The `genes_filename` and `common_tss_filename` inputs are ignored. |
https://github.com/datirium/workflows.git
Path: tools/group-isoforms-batch.cwl Branch/Commit ID: 9850a859de1f42d3d252c50e15701928856fe774 |
||
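Scattering, as used by `group-isoforms-batch.cwl`, means running the same step once per element of an array input and collecting the results in order. The Python sketch below illustrates only that idea; `group_isoforms` is a hypothetical stand-in for one run of `group-isoforms.cwl`, and the output names it derives from the input name are assumptions for the example.

```python
from typing import Callable, Dict, List


def group_isoforms(isoforms_file: str) -> Dict[str, str]:
    """Hypothetical stand-in for a single run of group-isoforms.cwl.

    ASSUMPTION: output names are derived from the input file name, since the
    `genes_filename` and `common_tss_filename` inputs are ignored in batch mode.
    """
    stem = isoforms_file.rsplit(".", 1)[0]
    return {
        "genes_file": f"{stem}.genes.tsv",
        "common_tss_file": f"{stem}.common_tss.tsv",
    }


def scatter(step: Callable[[str], Dict[str, str]],
            values: List[str]) -> List[Dict[str, str]]:
    """Run `step` once per input value; results keep the input order."""
    return [step(value) for value in values]


batch_results = scatter(group_isoforms, ["sample_a.csv", "sample_b.csv"])
```

Each element of `batch_results` corresponds to one entry of the scattered input, which is why per-run output names cannot be supplied as single workflow-level values.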
scatter-wf4.cwl#main
|
https://github.com/common-workflow-language/cwltool.git
Path: cwltool/schemas/v1.0/v1.0/scatter-wf4.cwl Branch/Commit ID: 4700fbee9a5a3271eef8bc9ee595619d0720431b Packed ID: main |
||
PCA - Principal Component Analysis
**Principal Component Analysis** Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables (entities each of which takes on various numerical values) into a set of values of linearly uncorrelated variables called principal components. The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using `eigen` on the covariance matrix. This is generally the preferred method for numerical accuracy. |
https://github.com/datirium/workflows.git
Path: workflows/pca.cwl Branch/Commit ID: c9e7f3de7f6ba38ee663bd3f9649e8d7dbac0c86 |
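The SVD-based calculation described for the PCA workflow can be sketched with NumPy as shown below. This is an illustrative sketch, not the code behind `pca.cwl`: the function name, the `scale` flag, and the random example data are assumptions.

```python
import numpy as np


def pca_svd(data, scale=False):
    """PCA via singular value decomposition of the centered data matrix.

    Rows are observations, columns are variables. Returns the component
    scores, the loadings (one principal axis per column), and the variance
    explained by each component.
    """
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=0)             # center every variable
    if scale:
        x = x / x.std(axis=0, ddof=1)  # optionally scale to unit variance
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    scores = u * s                     # coordinates in component space
    loadings = vt.T
    explained_variance = s ** 2 / (x.shape[0] - 1)
    return scores, loadings, explained_variance


# Example data: 20 observations of 3 variables (arbitrary values).
rng = np.random.default_rng(0)
example_data = rng.normal(size=(20, 3))
scores, loadings, explained_variance = pca_svd(example_data)
```

The explained variances equal the eigenvalues of the sample covariance matrix, so the SVD route yields the same components as an eigendecomposition without forming the covariance matrix explicitly.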