Workflow: PrediXcan
Predict.py has been wrapped in CWL, based on the information at: https://github.com/hakyimlab/MetaXcan/wiki/Individual-level-PrediXcan:-introduction,-tutorials-and-manual

Here is a snippet from that page:

In the following, we focus on the individual-level implementation of PrediXcan. The method was originally implemented in this repository.

PrediXcan consists of two steps:
- Predict gene expression (or whatever biology the models predict) in a cohort with available genotypes
- Run associations to a trait measured in the cohort

The first step is implemented in Predict.py. The prediction models are trained and pre-compiled on specific data sets with their own human genome releases and variant definitions. We implemented a few rules to support variant matching from genotypes based on different variant definitions. In the following, mapping refers to the process of assigning a model variant to a genotype variant.

Originally, PrediXcan was applied to genes so we say "gene expression" a lot as it was the mechanism we initially studied. But conceptually, everything said here applies to any intermediate/molecular mechanism such as splicing or brain morphology. Whenever we say "gene", it generally could mean a splicing intron event, etc.
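The two steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Predict.py itself: it assumes a linear model in which predicted expression is the weighted sum of allele dosages, and it ignores covariates and kinship in the association step. The function names, weights, dosages, and trait values are all hypothetical.

```python
from statistics import mean


def predict_expression(weights, dosages):
    """Step 1 (sketch): weighted sum of dosages over variants found in both."""
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


def simple_association(predicted, trait):
    """Step 2 (sketch): least-squares slope of trait regressed on prediction."""
    mx, my = mean(predicted), mean(trait)
    sxx = sum((x - mx) ** 2 for x in predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(predicted, trait))
    return sxy / sxx


# Hypothetical model weights for one gene, keyed by variant id.
weights = {"rs1": 0.5, "rs2": -0.2}

# Hypothetical dosages (0 to 2) for three individuals.
cohort = [
    {"rs1": 0.0, "rs2": 2.0},
    {"rs1": 1.0, "rs2": 1.0},
    {"rs1": 2.0, "rs2": 0.0},
]
trait = [1.0, 2.0, 3.0]

predicted = [predict_expression(weights, d) for d in cohort]
slope = simple_association(predicted, trait)
```

Variants that appear in the model but not in the genotypes are simply skipped here, which is one possible way to handle unmatched variants; the mapping rules mentioned above govern what actually counts as a match.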
Inputs
ID | Type | Title | Doc |
---|---|---|---|
vcf_mode | String (Optional) | | "genotyped" is meant for phased, genotyped VCFs that contain counts of each allele at each chromosome pair. "imputed" will load the DS field as dosage; this is meant to work with imputed VCFs as generated by the Michigan Imputation Server. |
covariates | String | | Column names of any additional covariates you would like to account for. Enter covariates exactly as they appear in the phenotype file, with quotation marks around each name, separated by commas with no spaces. ex) "sex","age","PC1" |
model_db_path | File (Optional) | | Path to a SQLite file containing prediction models. |
output_prefix | String | | [REQUIRED] File name prefix for output files. |
vcf_genotypes | File[] (Optional) | | Pattern of VCF genotype files. |
kinship_matrix | File (Optional) | | A delimited text file with a .txt file extension or an R data file with a .RData file extension containing a matrix of size M × M, where rows and columns are the sample/subject IDs. |
phenotype_file | File (Optional) | | [REQUIRED] A delimited text file with a .txt file extension containing a matrix of size (M + 1) × (C + 1), where M >= N and is the number of samples for which covariate data is provided. |
model_db_snp_key | String (Optional) | | Optional. If provided, will load variant ids from an alternative column in the db. By default, PrediXcan uses rsids, and this works with Elastic Net models. For the more sophisticated MASHR models, --model_db_snp_key varID must be specified. |
prediction_output | String | | Specify output (and output type) of the predicted expression matrix. |
on_the_fly_mapping | String (Optional) | | Optional. Specify a pattern to build a variant id from genotype variant properties. e.g. --on_the_fly_mapping METADATA "chr{}_{}_{}_{}_b38" will take the genotype variant's chromosome, position, and alleles to build a variant id like chr1_123_A_G_b38. This will use the genotype properties, or if liftover is specified, the lifted coordinates. |
prediction_summary_output | String | | A separate file that will contain some additional information on the predictions (such as number of SNPs in the gene's models, number of SNPs used, etc). |
main_phenotype_of_interest | String | | [REQUIRED] A string value defining the column name of the phenotype of interest. Should be a dichotomous or continuous variable. Enter it exactly as it appears in the phenotype file, not surrounded by quotation marks. ex) main_interest |
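Two of the inputs above take strings whose format is easy to get wrong: covariates and on_the_fly_mapping. The following minimal sketch shows how those strings are put together, using only Python's built-in string formatting. The helper names are hypothetical, and the allele order (reference, then alternate) is an assumption; the text above only says the pattern is filled from the chromosome, position, and alleles.

```python
def format_covariates(names):
    """Quote each column name and join with commas, no spaces."""
    return ",".join('"{}"'.format(name) for name in names)


def build_variant_id(pattern, chromosome, position, ref_allele, alt_allele):
    """Fill the four {} slots of an on_the_fly_mapping pattern in order.

    Assumption: alleles are supplied as reference first, then alternate.
    """
    return pattern.format(chromosome, position, ref_allele, alt_allele)


# Value suitable for the covariates input.
covariates = format_covariates(["sex", "age", "PC1"])

# Variant id built from the pattern given in the on_the_fly_mapping doc.
variant_id = build_variant_id("chr{}_{}_{}_{}_b38", 1, 123, "A", "G")
```

With the example values from the table, this yields the covariate string "sex","age","PC1" and the variant id chr1_123_A_G_b38.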
Steps
There are no steps in this workflow
Outputs
ID | Type | Label | Doc |
---|---|---|---|
summary | File (Optional) | | |
Association_output | File | | |
https://w3id.org/cwl/view/git/5b49ef07b994963d190f4f508bc08e4bec8b8a0b/predixcan/predixcan_unpack.cwl