CWL Workflow: Complex DAG

Fetched 2023-01-11 22:44:24 GMT

Verified with cwltool version 3.1.20221201130942

Non-linear combination of KnowEnG tools

Unknown workflow license, check source repository.

Inputs

ID	Type	Title	Doc
taxon	String	Subnetwork Species ID	the taxonomic id for the species of interest
pheno_file	File	Phenotypic File	spreadsheet of phenotypic data with samples as rows and phenotypes as columns
genomic_file	File	Genomic Spreadsheet File	spreadsheet of genomic data with samples as columns and genes as rows
gg_edge_type	String	Subnetwork Edge Type	the edge type keyword for the subnetwork of interest
pg_edge_types	String[]	Subnetwork Edge Type	the edge type keyword for the subnetwork of interest
num_bootstraps	Integer	Number of bootstraps	number of times to sample the data and repeat the analysis
correlation_method	String	Correlation Method	keyword for correlation metric, e.g. t_test or pearson
num_clusters_array	Integer[]	Number of clusters	number of subtypes to divide the samples into
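
As a minimal sketch, the inputs above could be supplied to a CWL runner as a job file. The example below builds one in Python and serializes it to JSON, which CWL runners accept alongside YAML. Only the input IDs and types come from the table; every value (taxon ID, edge type keywords, file paths, counts) is a hypothetical placeholder, and `file_input` and `build_job` are illustrative helper names.

```python
import json


def file_input(path):
    """Return a CWL File object for a local path."""
    return {"class": "File", "path": path}


def build_job(pheno_path, genomic_path):
    """Build a job object keyed by the workflow's input IDs.

    All values are placeholders chosen for illustration only.
    """
    return {
        "taxon": "9606",                                   # String (placeholder)
        "pheno_file": file_input(pheno_path),              # File
        "genomic_file": file_input(genomic_path),          # File
        "gg_edge_type": "EDGE_TYPE",                       # String (placeholder keyword)
        "pg_edge_types": ["EDGE_TYPE_A", "EDGE_TYPE_B"],   # String[] (placeholder keywords)
        "num_bootstraps": 5,                               # Integer (placeholder)
        "correlation_method": "pearson",                   # String; the table names t_test and pearson
        "num_clusters_array": [2, 3],                      # Integer[] (placeholder)
    }


job = build_job("pheno.tsv", "genomic.tsv")
job_json = json.dumps(job, indent=2)
print(job_json)
```

Saved to a file such as job.json (a hypothetical name), this object could then be passed to the runner together with the workflow, for example `cwltool workflow.cwl job.json`.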

Steps

ID	Runs	Label	Doc
clean_g	sspp_runner.cwl (CommandLineTool)	KN Spreadsheet Preprocessor	Transforms user spreadsheet in preparation for KN analytics by removing noise, mapping gene names, and extracting metadata statistics
clean_p	sspp_runner.cwl (CommandLineTool)	KN Spreadsheet Preprocessor	Transforms user spreadsheet in preparation for KN analytics by removing noise, mapping gene names, and extracting metadata statistics
clean_pt	sspp_runner.cwl (CommandLineTool)	KN Spreadsheet Preprocessor	Transforms user spreadsheet in preparation for KN analytics by removing noise, mapping gene names, and extracting metadata statistics
ggkn_fetch	knf_runner.cwl (CommandLineTool)	Knowledge Network Fetcher	Retrieve appropriate subnetwork from KnowEnG Knowledge Network from AWS S3 storage
gokn_fetch	knf_runner.cwl (CommandLineTool)	Knowledge Network Fetcher	Retrieve appropriate subnetwork from KnowEnG Knowledge Network from AWS S3 storage
gp_netboot	gp_runner.cwl (CommandLineTool)	ProGENI	Network-guided gene prioritization method implementation by KnowEnG that ranks gene measurements by their correlation to observed phenotypes.
enrichments	workflow.gsc.cwl (Workflow)	GSC Paired Jobs	Serial combination of KnowEnG tools
gsc_go_drawr	gsc_runner.cwl (CommandLineTool)	Gene Set Characterization	Network-guided gene set characterization method implementation by KnowEnG that relates public gene sets to user gene sets
top10_gather	top10_runner.cwl (CommandLineTool)	top10	Get the 10 rows with the smallest value in the selected column
clustering_wf	workflow.sc.cwl (Workflow)	SC w/ Evaluation	Serial combination of KnowEnG tools

Outputs

ID	Type	Label	Doc
gp_out	File	GP top100 Genes	Membership spreadsheet with phenotype columns and gene rows
g_c_out	File	Cleaned Genomic Spread	Spreadsheet with column and row headers
p_c_out	File	Cleaned Pheno Spread	Spreadsheet with column and row headers
p_t_out	File	Transposed Pheno Spread	Spreadsheet with column and row headers
sc_ce_out	File[]	Cluster Eval Results	Table with results of statistical tests between cluster membership and phenotypes
gg_knf_out	File	GG KnowNet Edges	4-column format for a subnetwork of a single edge type and species
go_gsc_out	File	GO GSC Scores	Edge format file with first three columns (user gene set, public gene set, score)
go_knf_out	File	GO KnowNet Edges	4-column format for a subnetwork of a single edge type and species
sc_map_out	File[]	Cluster Membership	Assignment of samples to clusters
sc_ce_top10	File	top10 clusters~pheno	File with the 10 rows with the smallest values in the selected column
other_gsc_out	File[]	GSC Scores	Edge format file with first three columns (user gene set, public gene set, score)
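
The outputs mix single File values with File[] arrays, so code that consumes the runner's result has to handle both shapes. The following is a rough sketch, assuming the result is the JSON object that a CWL runner such as cwltool prints on completion, with each file represented as an object carrying "class": "File" and a "path" or "location" field. The helper name `output_paths` and the sample result with its paths are invented for illustration.

```python
import json


def output_paths(result):
    """Map each output ID to a list of file paths, for File and File[] alike."""
    paths = {}
    for out_id, value in result.items():
        if value is None:
            # An output that produced nothing is reported as null
            paths[out_id] = []
            continue
        # Treat a single File as a one-element list
        items = value if isinstance(value, list) else [value]
        paths[out_id] = [
            f.get("path") or f.get("location")
            for f in items
            if isinstance(f, dict) and f.get("class") == "File"
        ]
    return paths


# Hypothetical result object; only the output IDs come from the table above
sample = json.loads("""
{
  "gp_out": {"class": "File", "path": "/tmp/out/gp.tsv"},
  "sc_map_out": [
    {"class": "File", "path": "/tmp/out/map_2.tsv"},
    {"class": "File", "location": "file:///tmp/out/map_3.tsv"}
  ],
  "sc_ce_top10": null
}
""")

for out_id, files in output_paths(sample).items():
    print(out_id, files)
```

Normalizing every output to a list keeps downstream code from branching on whether an ID such as gp_out (File) or sc_map_out (File[]) is being read.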

Permalink: https://w3id.org/cwl/view/git/8ca52a8d2b76d91b7618032a22699c5be7d12c6c/code/workflow.cwl