Explore Workflows


Graph Name Retrieved From View
workflow graph Motif Finding with HOMER with random background regions

Motif Finding with HOMER with random background regions
---------------------------------------------------

HOMER contains a novel motif discovery algorithm that was designed for regulatory element analysis in genomics applications (DNA only, no protein). It is a differential motif discovery algorithm, which means that it takes two sets of sequences and tries to identify the regulatory elements that are specifically enriched in one set relative to the other. It uses ZOOPS scoring (zero or one occurrence per sequence) coupled with hypergeometric (or binomial) enrichment calculations to determine motif enrichment. HOMER also tries to account for sequence bias in the dataset. It was designed with ChIP-Seq and promoter analysis in mind, but can be applied to almost any nucleic acid motif-finding problem.

Here is how we generate background for Motifs Analysis
-------------------------------------

1. Take an input file with regions in the form "chr" "start" "end"
2. Sort this regions file and remove duplicates
3. Extend each region by 20 kb in both directions
4. Merge all overlapping extended regions
5. Subtract the original (non-extended) regions from the extended ones
6. Randomly distribute the original (non-extended) regions within the regions obtained in the previous step
7. Get a FASTA file from these randomly distributed regions and use it as background

For more information please refer to:
-------------------------------------

[Official documentation](http://homer.ucsd.edu/homer/motif/)
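
The background-generation steps 1 to 6 above can be sketched in plain Python. This is only an illustrative sketch, not the workflow's actual implementation: the function names (`merge`, `subtract`, `background_regions`), the half-open interval convention, and the uniform choice among candidate slots are all assumptions, and chromosome-end clipping and the FASTA extraction of step 7 are left out.

```python
import random
from collections import defaultdict

FLANK = 20_000  # extension in bp applied to each side (step 3)

def merge(intervals):
    """Merge overlapping or touching (start, end) half-open intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]

def subtract(intervals, holes):
    """Remove `holes` from `intervals`; both must be sorted and merged."""
    result = []
    for start, end in intervals:
        cursor = start
        for h_start, h_end in holes:
            if h_end <= cursor or h_start >= end:
                continue  # hole does not touch the remaining part
            if h_start > cursor:
                result.append((cursor, h_start))
            cursor = max(cursor, h_end)
        if cursor < end:
            result.append((cursor, end))
    return result

def background_regions(regions, rng=None):
    """Place one random region per unique input region (steps 2 to 6).

    Hypothetical helper: regions are (chrom, start, end) tuples.
    Placed regions may overlap each other in this simplified version.
    """
    rng = rng or random.Random()
    by_chrom = defaultdict(set)
    for chrom, start, end in regions:
        by_chrom[chrom].add((start, end))  # the set drops duplicates
    background = []
    for chrom in sorted(by_chrom):
        original = sorted(by_chrom[chrom])
        extended = merge((max(0, s - FLANK), e + FLANK) for s, e in original)
        allowed = subtract(extended, merge(original))
        for s, e in original:
            length = e - s
            slots = [(a, b) for a, b in allowed if b - a >= length]
            if not slots:
                continue  # no flank is large enough for this region
            a, b = rng.choice(slots)
            offset = rng.randint(a, b - length)
            background.append((chrom, offset, offset + length))
    return background
```

Passing a seeded `random.Random` instance as `rng` makes the placement reproducible, which is convenient when comparing motif results between runs.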

https://github.com/datirium/workflows.git

Path: workflows/homer-motif-analysis.cwl

Branch/Commit ID: 4a5c59829ff8b9f3c843e66e3c675dcd9c689ed5

workflow graph readgroups_bam_to_readgroups_fastq_lists.cwl

https://github.com/nci-gdc/gdc-dnaseq-cwl.git

Path: workflows/bamfastq_align/readgroups_bam_to_readgroups_fastq_lists.cwl

Branch/Commit ID: 0495e3095182b2e1b4d6274833b3d2ce30347a4e

workflow graph align_merge_sas

https://github.com/ncbi/pgap.git

Path: task_types/tt_align_merge_sas.cwl

Branch/Commit ID: 4e2a295bb6c8b4982402ee80538a0cdb8ee6b6dd

workflow graph heatmap-prepare.cwl

Workflow runs the homer-make-tag-directory.cwl tool using scatter over the following inputs:

- bam_file
- fragment_size
- total_reads

`dotproduct` is used as the `scatterMethod`, so one element is taken from each array to construct each job:

1) bam_file[0] fragment_size[0] total_reads[0]
2) bam_file[1] fragment_size[1] total_reads[1]
...
N) bam_file[N] fragment_size[N] total_reads[N]

The `bam_file`, `fragment_size` and `total_reads` arrays must be in the same order.
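
The pairing performed by a `dotproduct` scatter can be illustrated with a short Python sketch. The helper name `dotproduct_jobs` and the sample values are hypothetical; a CWL runner does this internally, and the sketch only mirrors the element-by-element pairing described above.

```python
def dotproduct_jobs(**arrays):
    """Build one job per index by taking the i-th element of every array."""
    lengths = {len(values) for values in arrays.values()}
    if len(lengths) != 1:
        # A dotproduct scatter needs all scattered arrays to be the same length.
        raise ValueError("dotproduct scatter requires arrays of equal length")
    names = list(arrays)
    return [dict(zip(names, values)) for values in zip(*arrays.values())]

jobs = dotproduct_jobs(
    bam_file=["a.bam", "b.bam"],   # illustrative file names
    fragment_size=[150, 200],
    total_reads=[1_000_000, 2_000_000],
)
for job in jobs:
    print(job)
```

Because elements are matched purely by position, reordering one array without the others would silently pair a BAM file with the wrong fragment size and read count.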

https://github.com/datirium/workflows.git

Path: tools/heatmap-prepare.cwl

Branch/Commit ID: cbefc215d8286447620664fb47076ba5d81aa47f

workflow graph kmer_build_tree

https://github.com/ncbi/pgap.git

Path: task_types/tt_kmer_build_tree.cwl

Branch/Commit ID: 861d9baa067af98d794ba0ed4e43aa42e37d8a24

workflow graph assm_assm_blastn_wnode

https://github.com/ncbi/pgap.git

Path: task_types/tt_assm_assm_blastn_wnode.cwl

Branch/Commit ID: 8fb4ac7f5a66897206c7469101a471108b06eada

workflow graph tt_blastn_wnode

https://github.com/ncbi/pgap.git

Path: task_types/tt_blastn_wnode.cwl

Branch/Commit ID: f6950321e5c9ee733ad68a273d2ad8e802a6b982

workflow graph Cellranger aggr - aggregates data from multiple Cellranger runs

Devel version of Single-Cell Cell Ranger Aggregate
==================================================

Workflow calls the "cellranger aggr" command to combine output files from "cellranger count" (the molecule_info.h5 file from each run) into a single feature-barcode matrix containing all the data.

When combining multiple GEM wells, the barcode sequences for each channel are distinguished by a GEM well suffix appended to the barcode sequence. Each GEM well is a physically distinct set of GEM partitions, but draws barcode sequences randomly from the pool of valid barcodes, known as the barcode whitelist. To keep the barcodes unique when aggregating multiple libraries, we append a small integer identifying the GEM well to the barcode nucleotide sequence, and use that nucleotide sequence plus ID as the unique identifier in the feature-barcode matrix. For example, AGACCATTGAGACTTA-1 and AGACCATTGAGACTTA-2 are distinct cell barcodes from different GEM wells, despite having the same barcode nucleotide sequence.

This number, which tells us which GEM well this barcode sequence came from, is called the GEM well suffix. The numbering of the GEM wells reflects the order in which the GEM wells were provided in the "molecule_info_h5" and "gem_well_labels" inputs.

When combining data from multiple GEM wells, the "cellranger aggr" pipeline automatically equalizes the average read depth per cell between groups before merging. This approach avoids artifacts that may be introduced due to differences in sequencing depth. It is possible to turn off normalization or change the way normalization is done through the "normalization_mode" input. The "none" value may be appropriate if you want to maximize sensitivity and plan to deal with depth normalization in a downstream step.
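
The GEM well suffix scheme described above can be illustrated with a few lines of Python. This is a sketch of the naming convention only, not of what "cellranger aggr" does internally; the helper name `add_gem_well_suffix` and the assumption that input barcodes carry no suffix yet are illustrative.

```python
def add_gem_well_suffix(barcodes_per_well):
    """Append a 1-based GEM well index to every barcode sequence.

    `barcodes_per_well` is a list of barcode lists, given in the same
    order as the GEM wells were provided to the aggregation step.
    """
    combined = []
    for well_index, barcodes in enumerate(barcodes_per_well, start=1):
        combined.extend(f"{barcode}-{well_index}" for barcode in barcodes)
    return combined

# The same nucleotide sequence seen in two wells stays distinguishable.
unique_ids = add_gem_well_suffix([["AGACCATTGAGACTTA"], ["AGACCATTGAGACTTA"]])
print(unique_ids)
```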

https://github.com/datirium/workflows.git

Path: workflows/cellranger-aggr.cwl

Branch/Commit ID: 4a5c59829ff8b9f3c843e66e3c675dcd9c689ed5

workflow graph rnaseq-star-rsem-pe.cwl

https://github.com/pitagora-network/dat2-cwl.git

Path: workflow/rna-seq/rnaseq-star-rsem-pe/rnaseq-star-rsem-pe.cwl

Branch/Commit ID: 0cd20e1be620ae0817a1aa4286d73b78c89809f0

workflow graph assm_assm_blastn_wnode

https://github.com/ncbi/pgap.git

Path: task_types/tt_assm_assm_blastn_wnode.cwl

Branch/Commit ID: 90a321ecf2d049330bcf0657cc4d764d2c3f42dd