Workflow: Runs InterProScan on batches of sequences to retrieve functional annotations.
Inputs
ID | Type | Title | Doc |
---|---|---|---|
inputFile | File [FASTA] | Input file path | Optional, path to the FASTA file that should be loaded on Master startup. Alternatively, in CONVERT mode, the InterProScan 5 XML file to convert. |
databases | Directory (Optional) | | |
chunk_size | Integer (Optional) | | |
disableResidueAnnotation | Boolean (Optional) | Disables residue annotation | Optional, excludes sites from the XML and JSON output. |
catOutputFileName | String | | |
seqtype | https://w3id.org/cwl/view/git/93c7dee353f887e978ca8c5423a5c975c0796e40/workflows/InterProScan-v5-chunked-wf.cwl#seqtype/seqtype (Optional) | Sequence type | Optional, the type of the input sequences (dna/rna (n) or protein (p)). The default sequence type is protein. |
outputFormat | String[] | Output format | Optional, case-insensitive, comma-separated list of output formats. Supported formats are TSV, XML, JSON, GFF3, HTML and SVG. The defaults are TSV, XML and GFF3 for protein sequences, and GFF3 and XML for nucleotide sequences. |
applications | String[] (Optional) | Analysis | Optional, comma-separated list of analyses. If this option is not set, ALL analyses will be run. |
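As a rough illustration of how the inputs above might be supplied, the sketch below builds a CWL job object in Python and serialises it to JSON, which CWL runners accept as a job file. The helper name `build_job`, the file names and the example values (`1000`, `TSV`, `Pfam`) are assumptions chosen for illustration; they are not taken from the workflow itself. The optional `databases` and `seqtype` inputs are left out, so the documented default sequence type (protein) would apply.

```python
import json
from typing import Any, Dict, List, Optional


def build_job(
    fasta_path: str,
    cat_output_name: str,
    chunk_size: Optional[int] = None,
    output_formats: Optional[List[str]] = None,
    applications: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a job object keyed by the input IDs listed in the Inputs table."""
    job: Dict[str, Any] = {
        # CWL File inputs are objects with a class and a path.
        "inputFile": {"class": "File", "path": fasta_path},
        "catOutputFileName": cat_output_name,
        # outputFormat is declared as String[]; TSV is only an example value.
        "outputFormat": output_formats or ["TSV"],
    }
    # Optional inputs are added only when the caller provides them.
    if chunk_size is not None:
        job["chunk_size"] = chunk_size
    if applications:
        job["applications"] = applications
    return job


if __name__ == "__main__":
    # Hypothetical file names and values, for illustration only.
    job = build_job(
        "proteins.fasta",
        "interproscan_combined.tsv",
        chunk_size=1000,
        applications=["Pfam"],
    )
    print(json.dumps(job, indent=2))
```

Because `inputFile` is declared as `File [FASTA]`, a runner may also expect a `format` field on the file object; that detail is not shown in the table and is omitted here.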
Steps
ID | Runs | Label | Doc |
---|---|---|---|
split_seqs | ../utils/fasta_chunker.cwl (CommandLineTool) | split FASTA by number of records | based upon code by developers from EMBL-EBI |
run_interproscan | ../tools/InterProScan/InterProScan-v5-none_docker.cwl (CommandLineTool) | InterProScan: protein sequence classifier | InterProScan is the software package that allows sequences (protein and nucleic acid) to be scanned against InterPro's signatures. Signatures are predictive models, provided by several different databases, that make up the InterPro consortium. |
combine_interproscan_results | ../utils/concatenate.cwl (CommandLineTool) | Redirecting the Contents of Multiple Files into a Single File | The cat (short for “concatenate”) command is one of the most frequently used commands in Linux/Unix-like operating systems. It allows us to create single or multiple files, view the contents of a file, concatenate files and redirect output to the terminal or to files. |
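The three steps above follow a split, process, concatenate pattern. The sketch below illustrates that pattern in plain Python under stated assumptions: `chunk_fasta` splits by number of records in the spirit of `split_seqs`, `annotate` is a hypothetical placeholder standing in for the InterProScan run, and `run_chunked` joins the per-chunk results in order, as `cat` would. It is not the code of `fasta_chunker.cwl` or `concatenate.cwl`, whose internals are not shown on this page.

```python
from typing import Iterable, Iterator, List


def chunk_fasta(lines: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield lists of FASTA lines, each holding at most chunk_size records."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    chunk: List[str] = []
    records = 0
    for line in lines:
        if line.startswith(">"):
            # A new record begins; emit the current chunk if it is full.
            if records == chunk_size:
                yield chunk
                chunk, records = [], 0
            records += 1
        chunk.append(line)
    if chunk:
        yield chunk


def annotate(chunk: List[str]) -> List[str]:
    """Hypothetical stand-in for InterProScan: one placeholder row per record."""
    return [f"{line[1:].strip()}\tplaceholder" for line in chunk if line.startswith(">")]


def run_chunked(lines: Iterable[str], chunk_size: int) -> List[str]:
    """Process each chunk independently, then concatenate the results in order."""
    combined: List[str] = []
    for chunk in chunk_fasta(lines, chunk_size):
        combined.extend(annotate(chunk))
    return combined


if __name__ == "__main__":
    fasta = [">a", "MK", ">b", "MV", "LL", ">c", "MA"]
    for row in run_chunked(fasta, chunk_size=2):
        print(row)
```

Since each chunk is processed independently, a workflow runner is free to execute the per-chunk step in parallel; only the final concatenation needs all results.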
Outputs
ID | Type | Label | Doc |
---|---|---|---|
i5Annotations | File | | |
https://w3id.org/cwl/view/git/93c7dee353f887e978ca8c5423a5c975c0796e40/workflows/InterProScan-v5-chunked-wf.cwl