Explore Workflows
Graph | Name | Retrieved From | View |
---|---|---|---|
Motif Finding with HOMER with random background regions
Motif Finding with HOMER with random background regions --------------------------------------------------- HOMER contains a novel motif discovery algorithm that was designed for regulatory element analysis in genomics applications (DNA only, no protein). It is a differential motif discovery algorithm, which means that it takes two sets of sequences and tries to identify the regulatory elements that are specifically enriched in one set relative to the other. It uses ZOOPS scoring (zero or one occurrence per sequence) coupled with hypergeometric (or binomial) enrichment calculations to determine motif enrichment. HOMER also tries its best to account for sequence bias in the dataset. It was designed with ChIP-Seq and promoter analysis in mind, but can be applied to pretty much any nucleic acid motif finding problem. Here is how we generate the background for motif analysis ------------------------------------- 1. Take an input file with regions in the form "chr" "start" "end" 2. Sort this regions file and remove duplicates 3. Extend each region by 20 kb in both directions 4. Merge all overlapping extended regions 5. Subtract the original (non-extended) regions from the merged extended ones 6. Randomly distribute the original regions within the regions produced by the previous step 7. Extract a FASTA file from these randomly distributed regions and use it as the background For more information please refer to: ------------------------------------- [Official documentation](http://homer.ucsd.edu/homer/motif/) |
https://github.com/datirium/workflows.git
Path: workflows/homer-motif-analysis.cwl Branch/Commit ID: 4a5c59829ff8b9f3c843e66e3c675dcd9c689ed5 |
||
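The background-generation steps 2 to 6 described for the HOMER workflow can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's actual implementation: the function names (`merge`, `subtract`, `background_regions`), the half-open interval convention, the fixed random seed, and the choice to let shuffled regions overlap each other are all hypothetical. Chromosome ends are not clamped, and step 7 (FASTA extraction) is left out.

```python
import random

FLANK = 20_000  # extension on each side of a region (step 3)


def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(pool, holes):
    """Cut every interval in `holes` out of the merged intervals in `pool`."""
    holes = merge(holes)
    result = []
    for start, end in pool:
        cursor = start
        for h_start, h_end in holes:
            if h_end <= cursor or h_start >= end:
                continue  # this hole does not touch the remaining piece
            if h_start > cursor:
                result.append((cursor, h_start))
            cursor = max(cursor, h_end)
        if cursor < end:
            result.append((cursor, end))
    return result


def background_regions(regions, seed=0):
    """Map (chrom, start, end) regions to random nearby background regions."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    by_chrom = {}
    for chrom, start, end in sorted(set(regions)):  # step 2: sort, deduplicate
        by_chrom.setdefault(chrom, []).append((start, end))

    background = []
    for chrom, originals in by_chrom.items():
        # Steps 3-4: extend by FLANK on both sides, then merge overlaps.
        extended = merge((max(0, s - FLANK), e + FLANK) for s, e in originals)
        # Step 5: remove the original regions from the extended ones.
        free = subtract(extended, originals)
        # Step 6: place each original-sized region at a random free position.
        for s, e in originals:
            length = e - s
            slots = [(fs, fe) for fs, fe in free if fe - fs >= length]
            if not slots:
                continue  # no free interval is long enough for this region
            fs, fe = rng.choice(slots)
            new_start = rng.randint(fs, fe - length)
            background.append((chrom, new_start, new_start + length))
    return background
```

Each returned region has the same length as one input region and lies inside the 20 kb flanks, outside every original region.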
readgroups_bam_to_readgroups_fastq_lists.cwl
|
https://github.com/nci-gdc/gdc-dnaseq-cwl.git
Path: workflows/bamfastq_align/readgroups_bam_to_readgroups_fastq_lists.cwl Branch/Commit ID: 0495e3095182b2e1b4d6274833b3d2ce30347a4e |
||
align_merge_sas
|
https://github.com/ncbi/pgap.git
Path: task_types/tt_align_merge_sas.cwl Branch/Commit ID: 4e2a295bb6c8b4982402ee80538a0cdb8ee6b6dd |
||
heatmap-prepare.cwl
Workflow runs the homer-make-tag-directory.cwl tool using scatter over the following inputs - bam_file - fragment_size - total_reads `dotproduct` is used as the `scatterMethod`, so one element is taken from each array to construct each job: 1) bam_file[0] fragment_size[0] total_reads[0] 2) bam_file[1] fragment_size[1] total_reads[1] ... N) bam_file[N-1] fragment_size[N-1] total_reads[N-1] The `bam_file`, `fragment_size` and `total_reads` arrays must be in the same order. |
https://github.com/datirium/workflows.git
Path: tools/heatmap-prepare.cwl Branch/Commit ID: cbefc215d8286447620664fb47076ba5d81aa47f |
||
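The `dotproduct` pairing described for heatmap-prepare.cwl behaves like an element-wise zip over the scattered arrays. The sketch below mimics that pairing in Python; `dotproduct_jobs` is a hypothetical helper written for illustration, not part of any CWL runner, and the file names and numbers in the usage example are made up.

```python
def dotproduct_jobs(**arrays):
    """Build one job per index by taking the i-th element of every array."""
    lengths = {len(values) for values in arrays.values()}
    if len(lengths) > 1:
        # dotproduct only makes sense when all scattered arrays align.
        raise ValueError(f"scattered arrays differ in length: {sorted(lengths)}")
    names = list(arrays)
    return [dict(zip(names, row)) for row in zip(*arrays.values())]


jobs = dotproduct_jobs(
    bam_file=["a.bam", "b.bam"],
    fragment_size=[150, 200],
    total_reads=[1000, 2000],
)
for job in jobs:
    print(job)
```

Because the pairing is purely positional, reordering one array without the others silently attaches the wrong `fragment_size` or `total_reads` to a BAM file.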
kmer_build_tree
|
https://github.com/ncbi/pgap.git
Path: task_types/tt_kmer_build_tree.cwl Branch/Commit ID: 861d9baa067af98d794ba0ed4e43aa42e37d8a24 |
||
assm_assm_blastn_wnode
|
https://github.com/ncbi/pgap.git
Path: task_types/tt_assm_assm_blastn_wnode.cwl Branch/Commit ID: 8fb4ac7f5a66897206c7469101a471108b06eada |
||
tt_blastn_wnode
|
https://github.com/ncbi/pgap.git
Path: task_types/tt_blastn_wnode.cwl Branch/Commit ID: f6950321e5c9ee733ad68a273d2ad8e802a6b982 |
||
Cellranger aggr - aggregates data from multiple Cellranger runs
Devel version of Single-Cell Cell Ranger Aggregate ================================================== Workflow calls "cellranger aggr" command to combine output files from "cellranger count" (the molecule_info.h5 file from each run) into a single feature-barcode matrix containing all the data. When combining multiple GEM wells, the barcode sequences for each channel are distinguished by a GEM well suffix appended to the barcode sequence. Each GEM well is a physically distinct set of GEM partitions, but draws barcode sequences randomly from the pool of valid barcodes, known as the barcode whitelist. To keep the barcodes unique when aggregating multiple libraries, we append a small integer identifying the GEM well to the barcode nucleotide sequence, and use that nucleotide sequence plus ID as the unique identifier in the feature-barcode matrix. For example, AGACCATTGAGACTTA-1 and AGACCATTGAGACTTA-2 are distinct cell barcodes from different GEM wells, despite having the same barcode nucleotide sequence. This number, which tells us which GEM well this barcode sequence came from, is called the GEM well suffix. The numbering of the GEM wells will reflect the order that the GEM wells were provided in the "molecule_info_h5" and "gem_well_labels" inputs. When combining data from multiple GEM wells, the "cellranger aggr" pipeline automatically equalizes the average read depth per cell between groups before merging. This approach avoids artifacts that may be introduced due to differences in sequencing depth. It is possible to turn off normalization or change the way normalization is done through the "normalization_mode" input. The "none" value may be appropriate if you want to maximize sensitivity and plan to deal with depth normalization in a downstream step. |
https://github.com/datirium/workflows.git
Path: workflows/cellranger-aggr.cwl Branch/Commit ID: 4a5c59829ff8b9f3c843e66e3c675dcd9c689ed5 |
||
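The GEM well suffix scheme described for the Cell Ranger aggregation workflow can be illustrated with a few lines of Python. This is a sketch of the naming rule only, not of what "cellranger aggr" does internally: `aggregate_barcodes` is a hypothetical function, and stripping an existing `-N` suffix from the per-run barcodes before renumbering is an assumption.

```python
def aggregate_barcodes(barcodes_per_well):
    """Suffix each barcode with the 1-based position of its GEM well."""
    combined = []
    for well_index, barcodes in enumerate(barcodes_per_well, start=1):
        for barcode in barcodes:
            sequence = barcode.split("-")[0]  # drop any per-run suffix (assumption)
            combined.append(f"{sequence}-{well_index}")
    return combined


# Two GEM wells that happen to contain the same barcode sequence.
wells = [["AGACCATTGAGACTTA-1"], ["AGACCATTGAGACTTA-1"]]
print(aggregate_barcodes(wells))
```

The suffix follows the order in which the wells are supplied, which matches the statement that numbering reflects the order of the "molecule_info_h5" and "gem_well_labels" inputs.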
rnaseq-star-rsem-pe.cwl
|
https://github.com/pitagora-network/dat2-cwl.git
Path: workflow/rna-seq/rnaseq-star-rsem-pe/rnaseq-star-rsem-pe.cwl Branch/Commit ID: 0cd20e1be620ae0817a1aa4286d73b78c89809f0 |
||
assm_assm_blastn_wnode
|
https://github.com/ncbi/pgap.git
Path: task_types/tt_assm_assm_blastn_wnode.cwl Branch/Commit ID: 90a321ecf2d049330bcf0657cc4d764d2c3f42dd |