Workflow: Complex DAG

Fetched 2023-01-11 22:44:24 GMT

Non-linear combination of KnowEnG tools


Inputs

ID Type Title Doc
taxon String Subnetwork Species ID

the taxonomic id for the species of interest

pheno_file File Phenotypic File

spreadsheet of phenotypic data with samples as rows and phenotypes as columns

genomic_file File Genomic Spreadsheet File

spreadsheet of genomic data with samples as columns and genes as rows

gg_edge_type String Subnetwork Edge Type

the edge type keyword for the subnetwork of interest

pg_edge_types String[] Subnetwork Edge Type

the edge type keywords for the subnetworks of interest

num_bootstraps Integer Number of bootstraps

number of times to sample the data and repeat the analysis

correlation_method String Correlation Method

keyword for correlation metric, i.e. t_test or pearson

num_clusters_array Integer[] Number of clusters

number of subtypes to divide the samples into
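
The inputs listed above map onto a CWL job object, which a runner accepts as JSON or YAML. The sketch below builds such an object in Python and checks each value against the types in the table. All values are hypothetical placeholders (the file names, the edge type keywords, and the taxon ID `9606`, which is the NCBI taxonomy ID for human); none of them are defaults taken from the workflow. The helper names `file_input`, `matches`, and `check_job` are also illustrative.

```python
import json

# Input IDs and types, copied from the Inputs table above.
INPUT_TYPES = {
    "taxon": "String",
    "pheno_file": "File",
    "genomic_file": "File",
    "gg_edge_type": "String",
    "pg_edge_types": "String[]",
    "num_bootstraps": "Integer",
    "correlation_method": "String",
    "num_clusters_array": "Integer[]",
}


def file_input(path):
    """Return a CWL File object pointing at a local path."""
    return {"class": "File", "path": path}


def matches(value, cwl_type):
    """Check a Python value against one of the CWL types used above."""
    if cwl_type.endswith("[]"):
        item_type = cwl_type[:-2]
        return isinstance(value, list) and all(matches(v, item_type) for v in value)
    if cwl_type == "String":
        return isinstance(value, str)
    if cwl_type == "Integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if cwl_type == "File":
        return isinstance(value, dict) and value.get("class") == "File"
    return False


def check_job(job):
    """Return a list of problems; an empty list means the job looks well-formed."""
    problems = []
    for name, cwl_type in INPUT_TYPES.items():
        if name not in job:
            problems.append(f"missing input: {name}")
        elif not matches(job[name], cwl_type):
            problems.append(f"{name}: expected {cwl_type}")
    return problems


# Hypothetical example values, for illustration only.
job = {
    "taxon": "9606",
    "pheno_file": file_input("phenotypes.tsv"),
    "genomic_file": file_input("genomic.tsv"),
    "gg_edge_type": "EDGE_TYPE_KEYWORD",
    "pg_edge_types": ["EDGE_TYPE_A", "EDGE_TYPE_B"],
    "num_bootstraps": 5,
    "correlation_method": "pearson",
    "num_clusters_array": [2, 3, 4],
}

job_json = json.dumps(job, indent=2)
```

Writing `job_json` to a file such as `job.json` would give a job file that a CWL runner could take alongside `workflow.cwl`, assuming the edge type keywords are replaced with ones the Knowledge Network actually recognizes.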

Steps

ID Runs Label Doc
clean_g
sspp_runner.cwl (CommandLineTool)
KN Spreadsheet Preprocessor

Transforms user spreadsheet in preparation for KN analytics by removing noise, mapping gene names, and extracting metadata statistics

clean_p
sspp_runner.cwl (CommandLineTool)
KN Spreadsheet Preprocessor

Transforms user spreadsheet in preparation for KN analytics by removing noise, mapping gene names, and extracting metadata statistics

clean_pt
sspp_runner.cwl (CommandLineTool)
KN Spreadsheet Preprocessor

Transforms user spreadsheet in preparation for KN analytics by removing noise, mapping gene names, and extracting metadata statistics

ggkn_fetch
knf_runner.cwl (CommandLineTool)
Knowledge Network Fetcher

Retrieve appropriate subnetwork from KnowEnG Knowledge Network from AWS S3 storage

gokn_fetch
knf_runner.cwl (CommandLineTool)
Knowledge Network Fetcher

Retrieve appropriate subnetwork from KnowEnG Knowledge Network from AWS S3 storage

gp_netboot
gp_runner.cwl (CommandLineTool)
ProGENI

Network-guided gene prioritization method implementation by KnowEnG that ranks gene measurements by their correlation to observed phenotypes.

enrichments
workflow.gsc.cwl (Workflow)
GSC Paired Jobs

Serial combination of KnowEnG tools

gsc_go_drawr
gsc_runner.cwl (CommandLineTool)
Gene Set Characterization

Network-guided gene set characterization method implementation by KnowEnG that relates public gene sets to user gene sets

top10_gather
top10_runner.cwl (CommandLineTool)
top10

Get the 10 rows with the smallest value in the selected column

clustering_wf
workflow.sc.cwl (Workflow)
SC w/ Evaluation

Serial combination of KnowEnG tools
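
As an illustration of what the top10_gather step describes (the 10 rows with the smallest value in the selected column), here is a minimal Python sketch. It is not the code behind `top10_runner.cwl`; it assumes a tab-delimited table with a header row and a numeric selected column, and the function name `smallest_rows` is hypothetical.

```python
import csv
import heapq
import io


def smallest_rows(tsv_text, column, n=10):
    """Return the n rows with the smallest numeric value in `column`.

    Assumes tab-delimited text whose first line is a header row.
    Rows come back in ascending order of the selected column.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return heapq.nsmallest(n, reader, key=lambda row: float(row[column]))
```

`heapq.nsmallest` avoids sorting the whole table, which matters little at this scale but keeps the intent (select, not sort) visible.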

Outputs

ID Type Label Doc
gp_out File GP top100 Genes

Membership spreadsheet with phenotype columns and gene rows

g_c_out File Cleaned Genomic Spread

Spreadsheet with columns and row headers

p_c_out File Cleaned Pheno Spread

Spreadsheet with columns and row headers

p_t_out File Transposed Pheno Spread

Spreadsheet with columns and row headers

sc_ce_out File[] Cluster Eval Results

Table with results of statistical tests between cluster membership and phenotypes

gg_knf_out File GG KnowNet Edges

4 column format for subnetwork for single edge type and species

go_gsc_out File GO GSC Scores

Edge format file with first three columns (user gene set, public gene set, score)

go_knf_out File GO KnowNet Edges

4 column format for subnetwork for single edge type and species

sc_map_out File[] Cluster Membership

Assignment of samples to clusters

sc_ce_top10 File top10 clusters~pheno

file with 10 rows with the smallest value from the selected column

other_gsc_out File[] GSC Scores

Edge format file with first three columns (user gene set, public gene set, score)
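
The GSC score outputs are described as edge format files whose first three columns are user gene set, public gene set, and score. A minimal sketch of reading such a file in Python follows. It assumes tab-delimited lines with no header row, which the listing above does not state, and the function name `read_gsc_scores` is hypothetical.

```python
from collections import defaultdict


def read_gsc_scores(lines):
    """Group (public gene set, score) pairs by user gene set.

    Assumes each non-empty line is tab-delimited with at least three
    columns; any columns after the third are ignored.
    """
    scores = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        user_set, public_set, score = line.split("\t")[:3]
        scores[user_set].append((public_set, float(score)))
    return dict(scores)
```

The function accepts any iterable of lines, so an open file handle can be passed directly.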

Permalink: https://w3id.org/cwl/view/git/8ca52a8d2b76d91b7618032a22699c5be7d12c6c/code/workflow.cwl