Workflow: GSEApy - Gene Set Enrichment Analysis in Python
GSEAPY: Gene Set Enrichment Analysis in Python
==============================================

Gene Set Enrichment Analysis is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes).

GSEA requires as input an expression dataset, which contains expression profiles for multiple samples. While the software supports multiple input file formats for these datasets, the tab-delimited GCT format is the most common. The first column of the GCT file contains feature identifiers (gene ids or symbols in the case of data derived from RNA-Seq experiments). The second column contains a description of the feature; this column is ignored by GSEA and may be filled with “NA”s. Subsequent columns contain the expression values for each feature, with one sample's expression value per column.

There are no hard and fast rules regarding how a GCT file's expression values are derived. What matters is that the values are comparable across features within a sample and comparable across samples. Tools such as DESeq2 can produce properly normalized data (normalized counts) that are compatible with GSEA.
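The column layout described above can be illustrated with a small parser. This is a minimal sketch, not part of the workflow: the `read_gct` helper and the sample data are hypothetical, and the two leading header lines (`#1.2` and the row/column counts) are assumed from the GCT 1.2 layout rather than stated in the text above.

```python
import csv
import io


def read_gct(handle):
    """Parse a GCT 1.2 stream into (sample_names, {feature_id: values}).

    Hypothetical helper for illustration; assumes the GCT 1.2 header layout.
    """
    reader = csv.reader(handle, delimiter="\t")
    version = next(reader)[0].strip()
    if version != "#1.2":
        raise ValueError(f"unsupported GCT version line: {version!r}")
    n_rows, n_cols = (int(x) for x in next(reader)[:2])
    header = next(reader)
    samples = header[2:]  # columns 1 and 2 hold the identifier and description
    rows = {}
    for record in reader:
        if not record:
            continue  # tolerate blank trailing lines
        name, _description, *values = record  # description is ignored, as in GSEA
        rows[name] = [float(v) for v in values]
    if len(rows) != n_rows or len(samples) != n_cols:
        raise ValueError("declared dimensions do not match the data")
    return samples, rows


# Made-up expression values, for illustration only.
example = (
    "#1.2\n"
    "2\t3\n"
    "NAME\tDescription\tctrl_1\tctrl_2\ttreat_1\n"
    "TP53\tNA\t10.5\t11.0\t20.25\n"
    "MYC\tNA\t5.0\t4.5\t9.75\n"
)
samples, rows = read_gct(io.StringIO(example))
```

Each value list in `rows` has one entry per sample, in the same order as `samples`.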
Inputs
ID | Type | Title | Doc |
---|---|---|---|
seed | Integer (Optional) | Random seed | Random seed. Default: None |
alias | String | Experiment short name/Alias | |
threads | Integer (Optional) | Number of threads | Number of threads for those steps that support multithreading |
graphs_count | Integer (Optional) | Number of top graphs produced | Number of top graphs produced. Default: 20 |
phenotypes_file | File [Textual format] | DESeq experiment | Input class vector (phenotype) file in CLS format. Same as GSEA |
ranking_metrics | https://w3id.org/cwl/view/git/799575ce58746813f066a665adeacdda252d8cab/workflows/gseapy.cwl#ranking_metrics/rankingmetrics (Optional) | Methods to calculate correlations of ranking metrics | Methods to calculate correlations of ranking metrics. Default: log2_ratio_of_classes |
permutation_type | https://w3id.org/cwl/view/git/799575ce58746813f066a665adeacdda252d8cab/workflows/gseapy.cwl#permutation_type/permutationtype (Optional) | Permutation type | Permutation type. Default: gene_set |
read_counts_file | File [GCT/Res format] | DESeq experiment | Input gene expression dataset file in txt or gct format. Same as GSEA |
gene_set_database | https://w3id.org/cwl/view/git/799575ce58746813f066a665adeacdda252d8cab/workflows/gseapy.cwl#gene_set_database/genesetdatabase | Gene set database | Gene set database |
max_gene_set_size | Integer (Optional) | Max size of input genes present in gene sets | Maximum number of input genes present in a gene set. Default: 500 |
min_gene_set_size | Integer (Optional) | Min size of input genes present in gene sets | Minimum number of input genes present in a gene set. Default: 15 |
permutation_count | Integer (Optional) | Number of random permutations | Number of random permutations, used to calculate the null distribution of enrichment scores (esnulls). Default: 1000 |
ascending_rank_sorting | Boolean (Optional) | Ascending rank metric sorting order | Ascending rank metric sorting order. Default: False |
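
The inputs above could be collected into a job object for a CWL runner along the following lines. This is a hedged sketch: the file names, the alias, and the `gene_set_database` value are hypothetical placeholders (the allowed enum values are not listed here), while the remaining values repeat the defaults from the table.

```python
import json


def file_input(path):
    """Build a CWL File object for a job file (standard CWL input shape)."""
    return {"class": "File", "path": path}


job = {
    "alias": "example_experiment",                         # hypothetical name
    "read_counts_file": file_input("expression.gct"),      # hypothetical path
    "phenotypes_file": file_input("phenotypes.cls"),       # hypothetical path
    "gene_set_database": "PLACEHOLDER_GENE_SET_DATABASE",  # must be a value of the workflow's enum
    # Optional inputs, set explicitly to the documented defaults.
    "ranking_metrics": "log2_ratio_of_classes",
    "permutation_type": "gene_set",
    "min_gene_set_size": 15,
    "max_gene_set_size": 500,
    "permutation_count": 1000,
    "graphs_count": 20,
    "ascending_rank_sorting": False,
}

# Basic sanity check before handing the job to a runner.
if job["min_gene_set_size"] > job["max_gene_set_size"]:
    raise ValueError("min_gene_set_size must not exceed max_gene_set_size")

job_json = json.dumps(job, indent=2)
```

The resulting `job_json` string can be saved to a file and passed to a CWL runner together with the workflow. `seed` and `threads` are omitted here, so the workflow's own handling of unset optional inputs applies.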
Steps
ID | Runs | Label | Doc |
---|---|---|---|
run_gseapy | ../tools/gseapy.cwl (CommandLineTool) | | GSEAPY: Gene Set Enrichment Analysis in Python |
convert_to_tsv | ../tools/custom-bash.cwl (CommandLineTool) | | Tool to run custom script set as `script` input with arguments from `param`. Default script runs sed command over the input file and exports results to the file with the same name as input's basename |
rename_enrichment_plots | ../tools/rename.cwl (CommandLineTool) | | Tool renames `source_file` to `target_filename`. Input `target_filename` should be set as string. If it's a full path, only basename will be used. If BAI file is present, it will be renamed too |
compress_enrichment_plots | ../tools/tar-compress.cwl (CommandLineTool) | | Compresses input directory to tar.gz |
rename_enrichment_heatmaps | ../tools/rename.cwl (CommandLineTool) | | Tool renames `source_file` to `target_filename`. Input `target_filename` should be set as string. If it's a full path, only basename will be used. If BAI file is present, it will be renamed too |
compress_enrichment_heatmaps | ../tools/tar-compress.cwl (CommandLineTool) | | Compresses input directory to tar.gz |
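
The compression steps package a directory of plots into a single tar.gz archive. The sketch below shows the same idea with Python's standard `tarfile` module. It is an illustration only, not the contents of `tar-compress.cwl`, and the directory and file names are hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def compress_directory(directory: Path, archive: Path) -> Path:
    """Write `directory` and its contents into a gzip-compressed tar archive."""
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps only the directory's own name inside the archive
        tar.add(directory, arcname=directory.name)
    return archive


with tempfile.TemporaryDirectory() as tmp:
    plots = Path(tmp) / "enrichment_plots"  # hypothetical directory name
    plots.mkdir()
    (plots / "gene_set_a.png").write_bytes(b"placeholder")  # stand-in for a plot
    archive = compress_directory(plots, Path(tmp) / "enrichment_plots.tar.gz")
    with tarfile.open(archive, "r:gz") as tar:
        members = sorted(tar.getnames())
```

Listing the archive members afterwards is a cheap way to confirm that every plot made it into the archive.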
Outputs
ID | Type | Label | Doc |
---|---|---|---|
gseapy_stderr_log | File [Textual format] | GSEApy stderr log | GSEApy stderr log |
gseapy_stdout_log | File [Textual format] | GSEApy stdout log | GSEApy stdout log |
gseapy_enrichment_plots | File | Compressed TAR with enrichment plots | Compressed TAR with enrichment plots |
gseapy_enrichment_report | File [TSV] | Enrichment report | Enrichment report |
gseapy_enrichment_heatmaps | File | Compressed TAR with enrichment heatmaps | Compressed TAR with enrichment heatmaps |
https://w3id.org/cwl/view/git/799575ce58746813f066a665adeacdda252d8cab/workflows/gseapy.cwl