Workflow: Generate genome indices for STAR & bowtie
Creates indices for:

* [STAR](https://github.com/alexdobin/STAR) v2.5.3a (03/17/2017) PMID: [23104886](https://www.ncbi.nlm.nih.gov/pubmed/23104886)
* [bowtie](http://bowtie-bio.sourceforge.net/tutorial.shtml) v1.2.0 (12/30/2016)

It performs the following steps:

1. Runs `STAR --runMode genomeGenerate` to generate indices from [FASTA](http://zhanglab.ccmb.med.umich.edu/FASTA/) and [GTF](http://mblab.wustl.edu/GTF2.html) input files; returns the results as an array of files
2. Outputs the indices as a [Directory](http://www.commonwl.org/v1.0/CommandLineTool.html#Directory) data type
3. Separates the *chrNameLength.txt* file from the Directory output
4. Runs `bowtie-build` to generate indices; requires a genome [FASTA](http://zhanglab.ccmb.med.umich.edu/FASTA/) file as input and returns the results as a group of main and secondary files
Inputs
ID | Type | Title | Doc |
---|---|---|---|
fasta | File [FASTA] | Genome FASTA file | Reference genome FASTA file |
genome | String | Genome | Output files base string |
threads | Integer (Optional) | Number of threads to run tools | Number of threads for those steps that support multithreading |
genome_label | String (Optional) | Genome label | Genome label shown by the web UI |
annotation_tab | File [TSV] | Annotation file | Tab-separated annotation file |
genome_details | String (Optional) | Genome details | Genome details |
fasta_ribosomal | File (Optional) [FASTA] | Ribosomal DNA sequence FASTA file | Ribosomal DNA sequence FASTA file |
genome_description | String (Optional) | Genome description | Genome description shown by the web UI |
genome_sa_sparse_d | Integer (Optional) | Genome SA sparse (use 2 to decrease RAM usage) | default: 1 |
fasta_mitochondrial | File (Optional) [FASTA] | Mitochondrial chromosome sequence FASTA file | Mitochondrial chromosome sequence FASTA file |
input_annotation_gtf | File [GTF] | GTF input file | Annotation input file |
effective_genome_size | String | Effective genome size | MACS2 effective genome size: hs, mm, ce, dm, or a number, for example 2.7e9 |
genome_chr_bin_n_bits | Integer (Optional) | Genome Chr Bin NBits | If you are using a genome with a large (>5,000) number of references (chromosomes/scaffolds), you may need to reduce `--genomeChrBinNbits` to lower RAM consumption. The following scaling is recommended: `--genomeChrBinNbits` = min(18, log2[max(GenomeLength/NumberOfReferences, ReadLength)]). For example, for a 3 gigabase genome with 100,000 chromosomes/scaffolds, this is equal to 15. |
genome_sa_index_n_bases | Integer (Optional) | Length of the SA pre-indexing string | For small genomes, the parameter `--genomeSAindexNbases` must be scaled down, with a typical value of min(14, log2(GenomeLength)/2 - 1). For example, for a 1 megabase genome this is equal to 9; for a 100 kilobase genome it is equal to 7. |
limit_genome_generate_ram | Long (Optional) | Genome Generate RAM (31G default) | Maximum available RAM (bytes) for genome generation; integer > 0, default 31000000000 |
genome_sa_index_n_bases_mitochondrial | Integer (Optional) | Length (mitochondrial) of the SA pre-indexing string | For small genomes, the parameter `--genomeSAindexNbases` must be scaled down, with a typical value of min(14, log2(GenomeLength)/2 - 1). For example, for a 1 megabase genome this is equal to 9; for a 100 kilobase genome it is equal to 7. |
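The two scaling rules quoted in the `genome_chr_bin_n_bits` and `genome_sa_index_n_bases` descriptions can be evaluated with a few lines of Python. This is a minimal sketch, not part of the workflow: the function names are hypothetical, and rounding to the nearest integer is an assumption chosen because it reproduces the worked examples in the table (15, 9, and 7).

```python
import math


def genome_chr_bin_n_bits(genome_length: int, n_references: int, read_length: int) -> int:
    """Suggested --genomeChrBinNbits value:
    min(18, log2(max(GenomeLength / NumberOfReferences, ReadLength)))."""
    bits = math.log2(max(genome_length / n_references, read_length))
    # Assumption: round to the nearest integer, which matches the example in the text.
    return min(18, round(bits))


def genome_sa_index_n_bases(genome_length: int) -> int:
    """Suggested --genomeSAindexNbases value:
    min(14, log2(GenomeLength) / 2 - 1)."""
    value = math.log2(genome_length) / 2 - 1
    # Assumption: round to the nearest integer, which matches the examples in the text.
    return min(14, round(value))


if __name__ == "__main__":
    # 3 gigabase genome, 100,000 scaffolds, hypothetical 100 bp reads
    print(genome_chr_bin_n_bits(3_000_000_000, 100_000, 100))
    # 1 megabase and 100 kilobase genomes
    print(genome_sa_index_n_bases(1_000_000))
    print(genome_sa_index_n_bases(100_000))
```

For a mitochondrial chromosome, the same `genome_sa_index_n_bases` helper could be applied to the length of the mitochondrial FASTA sequence to pick a value for `genome_sa_index_n_bases_mitochondrial`.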
Steps
ID | Runs | Label | Doc |
---|---|---|---|
star_generate_indices | ../tools/star-genomegenerate.cwl (CommandLineTool) | Runs STAR genomeGenerate. Returns a directory with the index | |
bowtie_generate_indices | ../tools/bowtie-build.cwl (CommandLineTool) | Runs bowtie-build. Unsupported parameter: `-c` (reference sequences given on the command line as `<seq_in>`) | |
ribosomal_generate_indices | ../tools/bowtie-build.cwl (CommandLineTool) | Runs bowtie-build. Unsupported parameter: `-c` (reference sequences given on the command line as `<seq_in>`) | |
mitochondrial_generate_indices | ../tools/star-genomegenerate.cwl (CommandLineTool) | Runs STAR genomeGenerate. Returns a directory with the index | |
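To make the steps above more concrete, the following Python sketch assembles command lines of the kind these tools would run. It only builds argument lists and does not execute anything. The function names, file names, and the mapping from workflow inputs to STAR flags are assumptions for illustration; the actual wrappers in `../tools/` may pass different or additional arguments.

```python
from typing import List, Optional


def star_generate_cmd(
    fasta: str,
    genome_dir: str,
    gtf: Optional[str] = None,
    threads: Optional[int] = None,
    sa_sparse_d: Optional[int] = None,
    chr_bin_n_bits: Optional[int] = None,
    sa_index_n_bases: Optional[int] = None,
    limit_ram: Optional[int] = None,
) -> List[str]:
    """Build a STAR genomeGenerate argument list (hypothetical helper)."""
    cmd = [
        "STAR", "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
    ]
    # Optional flags are appended only when a value is supplied,
    # so STAR falls back to its own defaults otherwise.
    optional = [
        ("--sjdbGTFfile", gtf),
        ("--runThreadN", threads),
        ("--genomeSAsparseD", sa_sparse_d),
        ("--genomeChrBinNbits", chr_bin_n_bits),
        ("--genomeSAindexNbases", sa_index_n_bases),
        ("--limitGenomeGenerateRAM", limit_ram),
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def bowtie_build_cmd(fasta: str, index_base: str) -> List[str]:
    """Build a bowtie-build argument list: reference FASTA, then index base name."""
    return ["bowtie-build", fasta, index_base]


if __name__ == "__main__":
    # File and directory names below are placeholders.
    print(" ".join(star_generate_cmd("genome.fa", "star_idx", gtf="genes.gtf", threads=4)))
    print(" ".join(star_generate_cmd("chrM.fa", "mito_idx", sa_index_n_bases=6)))
    print(" ".join(bowtie_build_cmd("rdna.fa", "ribo_idx/rdna")))
```

Returning a list rather than a shell string keeps each argument separate, so the result could be handed to `subprocess.run` without quoting concerns.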
Outputs
ID | Type | Label | Doc |
---|---|---|---|
annotation | File [TSV] | Annotation file | Tab-separated annotation file |
genome_size | String | Effective genome size | MACS2 effective genome size: hs, mm, ce, dm, or a number, for example 2.7e9 |
chrom_length | File [Textual format] | Chromosome length file | Chromosome length file |
star_indices | Directory | STAR indices folder | Folder containing all STAR-generated indices |
annotation_gtf | File [GTF] | GTF input file | Annotation input file |
bowtie_indices | Directory | Bowtie indices folder | Folder containing all Bowtie-generated indices |
ribosomal_indices | Directory | Ribosomal DNA indices folder | Folder containing the Bowtie-generated ribosomal DNA indices |
mitochondrial_indices | Directory | Mitochondrial chromosome index folder | Mitochondrial chromosome index folder |
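Downstream tools often need the contents of the `chrom_length` output as a lookup table. The sketch below parses such a file under the assumption that it has the layout of STAR's *chrNameLength.txt*, with one reference per line as a name and a length separated by a tab. The function name and the sample data are hypothetical.

```python
from typing import Dict


def parse_chr_name_length(text: str) -> Dict[str, int]:
    """Parse 'name<TAB>length' lines into a {name: length} mapping."""
    lengths: Dict[str, int] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the last tab so the length is always the final field.
        name, length = line.rsplit("\t", 1)
        lengths[name] = int(length)
    return lengths


if __name__ == "__main__":
    # Sample content for illustration only.
    sample = "chr1\t248956422\nchrM\t16569\n"
    table = parse_chr_name_length(sample)
    print(table)
    print(sum(table.values()))  # total length of all listed references
```

The summed length from such a table is one possible input to the `genome_sa_index_n_bases` scaling rule when working with a small or custom genome.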
https://w3id.org/cwl/view/git/c602e3cdd72ff904dd54d46ba2b5146eb1c57022/workflows/genome-indices.cwl