Workflow: protein annotation
Proteins - predict, cluster, identify, annotate
Inputs
ID | Type | Title | Doc |
---|---|---|---|
jobid | String | | |
m5nrBDB | File | | |
m5nrSCG | File | | |
m5nrFull | File[] | | |
sequences | File | | |
protIdentity | Float (Optional) | | |
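
The input IDs and types above map directly onto a CWL job order (the JSON or YAML object handed to a CWL runner). The following is a minimal sketch of building one in Python. The helper name `build_job_order` and all file paths are hypothetical; only the input IDs and their types come from the table, and `{"class": "File", "path": ...}` is the standard CWL representation of a file input.

```python
import json


def build_job_order(jobid, m5nr_bdb, m5nr_scg, m5nr_full, sequences,
                    prot_identity=None):
    """Return a CWL job-order dict keyed by the workflow's input IDs."""

    def file_obj(path):
        # Standard CWL File object; the path itself is supplied by the caller.
        return {"class": "File", "path": path}

    job = {
        "jobid": jobid,                                    # String
        "m5nrBDB": file_obj(m5nr_bdb),                     # File
        "m5nrSCG": file_obj(m5nr_scg),                     # File
        "m5nrFull": [file_obj(p) for p in m5nr_full],      # File[]
        "sequences": file_obj(sequences),                  # File
    }
    # protIdentity is optional, so it is only emitted when provided.
    if prot_identity is not None:
        job["protIdentity"] = float(prot_identity)
    return job


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    example = build_job_order(
        jobid="example_job",
        m5nr_bdb="refdata/m5nr.bdb",
        m5nr_scg="refdata/m5nr.scg",
        m5nr_full=["refdata/m5nr_1.fasta", "refdata/m5nr_2.fasta"],
        sequences="input/sequences.fasta",
    )
    print(json.dumps(example, indent=2))
```

Writing the printed JSON to a file gives a job order that could be passed to a CWL runner alongside the workflow file.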
Steps
ID | Runs | Label | Doc |
---|---|---|---|
catSims | ../Tools/cat.tool.cwl (CommandLineTool) | GNU cat | Concatenate FILE(s) to standard output |
sortSims | ../Tools/sort.tool.cwl (CommandLineTool) | GNU sort | Sort a text file based on the given field(s) |
superblat | ../Tools/superblat.tool.cwl (CommandLineTool) | superBLAT | Multi-threaded fast sequence search command-line tool, protein only. `superblat -fastMap -prot -out blast8 <database> <query> <output>` |
bleachSims | ../Tools/bleachsims.tool.cwl (CommandLineTool) | bleachsims | Filter a similarity file by E-value and number of hits. `bleachsims -s <input> -o <output> -m 20 -r 0 -c 3` |
protCluster | ../Tools/cdhit.tool.cwl (CommandLineTool) | CD-HIT | Cluster protein sequences, using the maximum available CPUs and memory. `cdhit -n 5 -d 0 -T 0 -M 0 -c 0.9 -i <input> -o <output>` |
protFeature | ../Tools/fraggenescan.tool.cwl (CommandLineTool) | FragGeneScan | Hidden Markov model for predicting prokaryotic coding regions. `run_FragGeneScan.pl --genome <input> --out <output> --complete 0 --train 454_30` |
annotateSims | ../Tools/sims_annotate.tool.cwl (CommandLineTool) | annotate sims | Create expanded annotated sims files from an input md5 sim file and the m5nr db. `sims_annotate.pl --verbose --in_sim <input> --in_scg <scgs> --ann_file <database> --format <seqFormat> --out_filter <outFilter> --out_expand <outExpand> -out_lca <outLca> --frag_num 5000` |
formatCluster | ../Tools/format_cluster.tool.cwl (CommandLineTool) | cluster file reformat | Reformat a cd-hit .clstr file into an mg-rast .mapping file. `format_cluster.pl --input <input> --output <output>` |
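
Several step descriptions above quote a command line with placeholders. As a rough illustration of how those templates expand, the sketch below fills the placeholders and returns an argument vector. The templates are copied from the Doc column; the `TEMPLATES` mapping and the `build_argv` helper are hypothetical, and the actual tool wrappers under `../Tools/` may assemble their commands differently.

```python
# Command templates taken from the step descriptions, with <name>
# placeholders rewritten as {name} for str.format.
TEMPLATES = {
    "superblat": "superblat -fastMap -prot -out blast8 {database} {query} {output}",
    "bleachSims": "bleachsims -s {input} -o {output} -m 20 -r 0 -c 3",
    "protCluster": "cdhit -n 5 -d 0 -T 0 -M 0 -c 0.9 -i {input} -o {output}",
    "protFeature": ("run_FragGeneScan.pl --genome {input} --out {output} "
                    "--complete 0 --train 454_30"),
    "formatCluster": "format_cluster.pl --input {input} --output {output}",
}


def build_argv(step, **paths):
    """Expand the template for `step` into a list of arguments.

    The template is split into tokens before substitution, so a path that
    contains spaces still ends up as a single argument.
    """
    template = TEMPLATES[step]
    return [token.format(**paths) for token in template.split()]


if __name__ == "__main__":
    # File names here are placeholders for illustration only.
    print(build_argv("formatCluster", input="proteins.clstr",
                     output="proteins.mapping"))
    # ['format_cluster.pl', '--input', 'proteins.clstr', '--output', 'proteins.mapping']
```

Returning a list instead of a single string means the result can be passed to `subprocess.run` without shell quoting.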
Outputs
ID | Type | Label | Doc |
---|---|---|---|
protLCAOut | File | | |
protSimsOut | File | | |
protExpandOut | File | | |
protFilterOut | File | | |
protFeatureOut | File | | |
protClustMapOut | File | | |
protClustSeqOut | File | | |
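
After a run, it can be useful to confirm that all seven outputs listed above were produced. The sketch below assumes the runner reports results as a JSON object keyed by output ID, which is how CWL runners such as cwltool print their output object. The `missing_outputs` helper and the sample data are hypothetical.

```python
import json

# Output IDs copied from the Outputs table.
EXPECTED_OUTPUTS = (
    "protLCAOut",
    "protSimsOut",
    "protExpandOut",
    "protFilterOut",
    "protFeatureOut",
    "protClustMapOut",
    "protClustSeqOut",
)


def missing_outputs(result):
    """Return the expected output IDs that are absent or null in `result`."""
    return [name for name in EXPECTED_OUTPUTS if result.get(name) is None]


if __name__ == "__main__":
    # A made-up, partial output object for illustration only.
    sample = json.loads(
        '{"protSimsOut": {"class": "File", "path": "out/prot.sims"},'
        ' "protLCAOut": null}'
    )
    print(missing_outputs(sample))
```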
https://w3id.org/cwl/view/git/7b1df2ecce5a8727f2c546c5baa45c919edd8a76/CWL/Workflows/protein-annotation.workflow.cwl