Explore Workflows
Graph | Name | Retrieved From | View |
---|---|---|---|
|
schemadef-wf.cwl
|
Path: tests/schemadef-wf.cwl Branch/Commit ID: 1f3ef888d9ef2306c828065c460c1800604f0de4 |
|
|
Subworkflow to allow calling different SV callers which require bam files as inputs
|
Path: definitions/subworkflows/single_sample_sv_callers.cwl Branch/Commit ID: 51724b44c96e5fd849ae55b752865b80bc47d66c |
|
|
Nested workflow example
|
Path: tests/wf/nested.cwl Branch/Commit ID: a3d565bf8e630101d25d31804cfbceb0a0ba28de |
|
|
TgIF - Transgene Insertion Finder
TgIF (trans-gene insertion finder)
==============================================
The TgIF algorithm returns a list of probable insertion sites in a target organism. It requires the user to provide a FASTQ file of ONT (Oxford Nanopore Technologies) reads (-f), the reference FASTA of the trans-gene (Tg) vector containing the insertion sequence (-i), and the reference FASTA of the target organism (-r). The algorithm is tailored for ONT reads from a modified nCATS[1] (nanopore Cas9-targeted sequencing) enriched library; however, it will also produce informative results from a FASTQ derived from WGS (shotgun) sequencing libraries. The modified nCATS method is described here, and a brief overview can be found below.

The basic workflow of TgIF is alignment (using minimap2[2]) of reads (-f) to a combined reference of the Tg vector (containing the desired insertion sequence) and the target organism (i.e. -i and -r are concatenated), followed by a search for valleys (or gaps) in the resulting pileup of reads that map to both references at MAPQ>=30. A starting position (p_s) of a valley is a position p where the depth (d) satisfies d_p = 0 and d_(p-1) > 0; an ending position (p_e) of a valley is a position p where d_p = 0 and d_(p+1) > 0; and a potential insertion scar is the gap between and including p_s and p_e.

Primary Output files:
- insertions_all.tsv, all probable insertion sites identified from the input fastq data
- insertions_filtered.tgif, filtered sites that are most probable based on logic (4) above
- reportsummary.md, summary of alignment metrics and insertion sites found

Secondary Output files:
- insertion_site_plots.tar, package of probable insertion site pileup plots
- alignment_files.tar.gz, contains bam/bai for visualizing aligned reads to reference genome and vector sequence
- primer3.tar, contains F/R primers for each filtered insertion site designed by primer3

Documents
==============================================
- github Page: https://github.com/jhuapl-bio/TgIF/tree/main

References
==============================================
1. Gilpatrick, T. et al. Targeted nanopore sequencing with Cas9-guided adapter ligation. Nature Biotechnology 38, 433–438 (2020).
2. Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34:3094-3100. doi:10.1093/bioinformatics/bty191
3. O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.
4. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, and 1000 Genome Project Data Processing Subgroup, The Sequence alignment/map (SAM) format and SAMtools, Bioinformatics (2009) 25(16) 2078-9 [19505943]
5. R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
6. H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
7. Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M and Rozen SG. Primer3--new capabilities and interfaces. Nucleic Acids Res. 2012 Aug 1;40(15):e115.
|
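The valley definition in the TgIF description can be illustrated with a minimal Python sketch. This is not TgIF's actual implementation: the function name `find_valleys`, the 0-based list of per-base depths, and the single-contig input are all assumptions made for illustration. It applies only the two stated rules: a valley starts at a zero-depth position whose left neighbour is covered, and ends at a zero-depth position whose right neighbour is covered.

```python
from typing import List, Tuple


def find_valleys(depth: List[int]) -> List[Tuple[int, int]]:
    """Return (p_s, p_e) pairs for zero-depth runs flanked by coverage.

    `depth` is a hypothetical per-base depth list for one contig
    (0-based positions), e.g. parsed from a pileup of MAPQ-filtered reads.
    """
    valleys = []
    start = None
    for p, d in enumerate(depth):
        if d != 0:
            continue
        # Start of a valley: depth is 0 here and > 0 at the previous position.
        if p > 0 and depth[p - 1] > 0:
            start = p
        # End of a valley: depth is 0 here and > 0 at the next position.
        if start is not None and p + 1 < len(depth) and depth[p + 1] > 0:
            valleys.append((start, p))
            start = None
    return valleys


# Hypothetical depth profile with two covered-on-both-sides gaps.
example_depth = [3, 2, 0, 0, 0, 4, 5, 0, 1, 0, 0]
example_valleys = find_valleys(example_depth)
```

Zero-depth runs that touch either end of the contig are skipped in this sketch, because they lack a covered neighbour on one side and so never satisfy both conditions.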
Path: workflows/tgif.cwl Branch/Commit ID: 261c0232a7a40880f2480b811ed2d7e89c463869 |
|
|
format_rrnas_from_seq_entry
|
Path: task_types/tt_format_rrnas_from_seq_entry.cwl Branch/Commit ID: 2801ce53744a085580a8de91cd007c45146b51e8 |
|
|
count-lines6-wf.cwl
|
Path: cwltool/schemas/v1.0/v1.0/count-lines6-wf.cwl Branch/Commit ID: e9c83739a93fa0b18f8dea2f98b632a9e32725c9 |
|
|
Detect DoCM variants
|
Path: definitions/subworkflows/docm_germline.cwl Branch/Commit ID: 51724b44c96e5fd849ae55b752865b80bc47d66c |
|
|
MAnorm PE - quantitative comparison of ChIP-Seq paired-end data
What is MAnorm?
--------------
MAnorm is a robust model for quantitative comparison of ChIP-Seq data sets of TFs (transcription factors) or epigenetic modifications. You can use it for:
* Normalization of two ChIP-Seq samples
* Quantitative comparison (differential analysis) of two ChIP-Seq samples
* Evaluating the overlap enrichment of the protein binding sites (peaks)
* Elucidating underlying mechanisms of cell-type specific gene regulation

How MAnorm works?
----------------
MAnorm uses the common peaks of two samples as a reference to build the rescaling model for normalization. This rests on the empirical assumption that if a chromatin-associated protein has a substantial number of peaks shared in two conditions, the binding at these common regions will tend to be determined by similar mechanisms, and thus should exhibit similar global binding intensities across samples. The observed differences on common peaks are presumed to reflect the scaling relationship of ChIP-Seq signals between the two samples, which can then be applied to all peaks.

What do the inputs mean?
----------------
### General
**Experiment short name/Alias**
* short name for your experiment to identify it among the others

**ChIP-Seq PE sample 1**
* previously analyzed ChIP-Seq paired-end experiment to be used as Sample 1

**ChIP-Seq PE sample 2**
* previously analyzed ChIP-Seq paired-end experiment to be used as Sample 2

**Genome**
* Reference genome to be used for gene assigning

### Advanced
**Reads shift size for sample 1**
* This value is used to shift reads towards 3' direction to determine the precise binding site. Set as half of the fragment length. Default: 100

**Reads shift size for sample 2**
* This value is used to shift reads towards 3' direction to determine the precise binding site. Set as half of the fragment length. Default: 100

**M-value (log2-ratio) cutoff**
* Absolute M-value (log2-ratio) cutoff to define biased (differential binding) peaks. Default: 1.0

**P-value cutoff**
* P-value cutoff to define biased peaks. Default: 0.01

**Window size**
* Window size to count reads and calculate read densities. 2000 is recommended for sharp histone marks like H3K4me3 and H3K27ac, and 1000 for TFs or DNase-seq. Default: 2000
|
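To make the M-value and P-value cutoffs concrete, here is a small Python sketch using the standard MA-plot definitions: M is the log2 ratio of two read densities and A is their average log2 intensity. The helper names `ma_values` and `is_biased`, the use of already-normalized positive densities, and the inclusive comparisons against the cutoffs are assumptions for illustration. This is not MAnorm's own code and does not reproduce its normalization or P-value calculation.

```python
import math
from typing import Tuple


def ma_values(density1: float, density2: float) -> Tuple[float, float]:
    """Compute (M, A) for one peak from two positive read densities.

    M = log2(density1 / density2)        -> log2-ratio between samples
    A = 0.5 * log2(density1 * density2)  -> average log2 intensity
    """
    m = math.log2(density1 / density2)
    a = 0.5 * math.log2(density1 * density2)
    return m, a


def is_biased(m_value: float, p_value: float,
              m_cutoff: float = 1.0, p_cutoff: float = 0.01) -> bool:
    """Flag a peak as biased (differentially bound) under both cutoffs.

    Defaults mirror the documented defaults (1.0 and 0.01); whether the
    real tool uses strict or inclusive comparisons is an assumption here.
    """
    return abs(m_value) >= m_cutoff and p_value <= p_cutoff


# Hypothetical peak: density 8.0 in sample 1, 2.0 in sample 2.
m, a = ma_values(8.0, 2.0)
flag = is_biased(m, p_value=0.001)
```

With the default cutoff of 1.0, a peak must show at least a two-fold difference in density between the samples, in either direction, before the P-value is considered.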
Path: workflows/manorm-pe.cwl Branch/Commit ID: c5bae2ca862c764911b83d1f15ff6af4e2a0db28 |
|
|
scatter GATK HaplotypeCaller over intervals
|
Path: definitions/subworkflows/gatk_haplotypecaller_iterator.cwl Branch/Commit ID: e59c77629936fad069007ba642cad49fef7ad29f |
|
|
wf-loadContents2.cwl
|
Path: tests/wf-loadContents2.cwl Branch/Commit ID: c7c97715b400ff2194aa29fc211d3401cea3a9bf |