Workflow: Generate genome index STAR RNA
This workflow builds genome indices for [STAR](https://github.com/alexdobin/STAR) v2.5.3a (03/17/2017), PMID: [23104886](https://www.ncbi.nlm.nih.gov/pubmed/23104886). It performs the following steps:
1. Runs `STAR --runMode genomeGenerate` to generate the indices from [FASTA](http://zhanglab.ccmb.med.umich.edu/FASTA/) and [GTF](http://mblab.wustl.edu/GTF2.html) input files, and returns the results as an array of files.
2. Transforms the array of files into the [Directory](http://www.commonwl.org/v1.0/CommandLineTool.html#Directory) data type.
3. Separates the *chrNameLength.txt* file as an output.
Inputs
ID | Type | Title | Doc |
---|---|---|---|
fasta | File [FASTA] | FASTA input file | Reference genome input FASTA file |
threads | Integer (Optional) | Number of threads to run tools | Number of threads for those steps that support multithreading |
genome_label | String (Optional) | Genome label | Genome label is used by web-ui to show label |
annotation_gtf | File [GTF] | GTF input file | Annotation input file |
annotation_tab | File [TSV] | Annotation file | Tab-separated annotation file |
genome_details | String (Optional) | Genome details | Genome details |
genome_description | String (Optional) | Genome description | Genome description is used by web-ui to show description |
genome_sa_sparse_d | Integer (Optional) | Use 2 to decrease needed RAM for STAR | int>0: suffix array sparsity, i.e. distance between indices: use bigger numbers to decrease needed RAM at the cost of mapping speed reduction |
genome_chr_bin_n_bits | Integer (Optional) | Genome Chr Bin NBits | If you are using a genome with a large (>5,000) number of references (chromosomes/scaffolds), you may need to reduce --genomeChrBinNbits to reduce RAM consumption. The following scaling is recommended: --genomeChrBinNbits = min(18, log2[max(GenomeLength/NumberOfReferences, ReadLength)]). For example, for a 3 gigaBase genome with 100,000 chromosomes/scaffolds, this is equal to 15. |
genome_sa_index_n_bases | Integer (Optional) | Length of the SA pre-indexing string | For small genomes, the parameter --genomeSAindexNbases must be scaled down, with a typical value of min(14, log2(GenomeLength)/2 - 1). For example, for a 1 megaBase genome, this is equal to 9; for a 100 kiloBase genome, this is equal to 7. |
limit_genome_generate_ram | Long (Optional) | | int>0: maximum available RAM (bytes) for genome generation; the STAR default is 31000000000 |
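The two scaling rules quoted in the table for `genome_chr_bin_n_bits` and `genome_sa_index_n_bases` can be evaluated with a few lines of Python. This is a minimal sketch, not part of the workflow: the function names are hypothetical, and rounding to the nearest integer is an assumption chosen because it reproduces the worked examples in the table (15, 9 and 7).

```python
import math


def chr_bin_n_bits(genome_length, n_references, read_length):
    """min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))).

    Rounding to the nearest integer is an assumption (see lead-in).
    """
    per_reference = genome_length / n_references
    return round(min(18, math.log2(max(per_reference, read_length))))


def sa_index_n_bases(genome_length):
    """min(14, log2(GenomeLength)/2 - 1), rounded to the nearest integer."""
    return round(min(14, math.log2(genome_length) / 2 - 1))


# Worked examples from the table (the read length of 100 is illustrative).
print(chr_bin_n_bits(3_000_000_000, 100_000, 100))  # 15
print(sa_index_n_bases(1_000_000))                  # 9
print(sa_index_n_bases(100_000))                    # 7
```

For a genome of about 3 gigabases the second rule is capped at 14, so `genome_sa_index_n_bases` only needs to be lowered for small genomes.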
Steps
ID | Runs | Label | Doc |
---|---|---|---|
star_generate_indices | ../tools/star-genomegenerate.cwl (CommandLineTool) | | Runs STAR in `genomeGenerate` mode. Returns a directory with the index |
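To illustrate what the `star_generate_indices` step amounts to on the command line, the sketch below assembles a `STAR --runMode genomeGenerate` argument list from values named after the workflow inputs. The mapping from workflow inputs to STAR options is an assumption based on the input descriptions, not taken from `star-genomegenerate.cwl`; the function name and the example paths are hypothetical. The code only builds the argument list and does not run STAR.

```python
def star_genome_generate_args(genome_dir, fasta, annotation_gtf,
                              threads=None,
                              genome_sa_sparse_d=None,
                              genome_chr_bin_n_bits=None,
                              genome_sa_index_n_bases=None,
                              limit_genome_generate_ram=None):
    """Build an argument list for STAR genome generation (assumed mapping)."""
    args = [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--sjdbGTFfile", annotation_gtf,
    ]
    # Optional inputs are passed only when a value was supplied,
    # so STAR's own defaults apply otherwise.
    optional = [
        ("--runThreadN", threads),
        ("--genomeSAsparseD", genome_sa_sparse_d),
        ("--genomeChrBinNbits", genome_chr_bin_n_bits),
        ("--genomeSAindexNbases", genome_sa_index_n_bases),
        ("--limitGenomeGenerateRAM", limit_genome_generate_ram),
    ]
    for flag, value in optional:
        if value is not None:
            args += [flag, str(value)]
    return args


# Hypothetical paths, for illustration only.
example_args = star_genome_generate_args(
    "star_indices", "genome.fa", "annotation.gtf",
    threads=4, genome_sa_index_n_bases=9,
)
print(" ".join(example_args))
```

The resulting list could be handed to `subprocess.run` on a machine where STAR is installed; keeping it as a list avoids shell quoting problems with file paths.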
Outputs
ID | Type | Label | Doc |
---|---|---|---|
annotation | File [TSV] | Annotation file | Tab-separated annotation file |
chrom_length | File [Textual format] | Chromosome length file | Chromosome length file |
star_indices | Directory | STAR indices folder | Folder which includes all STAR generated indices files |
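The `chrom_length` output (*chrNameLength.txt*) can be consumed by downstream scripts. The sketch below assumes each line holds a sequence name and its length separated by a tab, which is how STAR is understood to write this file; the function name is hypothetical and the two sample lines are illustrative rather than taken from a real run.

```python
def parse_chr_name_length(lines):
    """Parse 'name<TAB>length' lines into a dict of {name: length}."""
    lengths = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the last tab so the length is always the final field.
        name, length = line.rsplit("\t", 1)
        lengths[name] = int(length)
    return lengths


# Illustrative content; with a real file, pass an open file handle instead.
sample = ["chr1\t248956422\n", "chrM\t16569\n"]
print(parse_chr_name_length(sample))
```

With a real index, the same function can be called as `parse_chr_name_length(open(path))`, where `path` points at the `chrom_length` output file.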
https://w3id.org/cwl/view/git/bfa3843bcf36125ff258d6314f64b41336f06e6b/workflows/star-index.cwl