TransDecoder two-step workflow, running TransDecoder.LongOrfs (step 1) followed by TransDecoder.Predict (step 2)

Fetched 2025-07-11 08:03:40 GMT

Verified with cwltool version 3.1.20221201130942

This workflow is Open Source and may be reused according to the terms of: Apache License 2.0

Note that the tools invoked by the workflow may have separate licenses.

Inputs

ID	Type	Title	Doc
singleBestOnly	Boolean (Optional)
transcriptsFile	File [FASTA]
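
The two inputs above can be supplied to a CWL runner as a job object. The following is a minimal sketch, not part of the workflow itself: the helper name make_job and the file name transcripts.fasta are hypothetical, and the sketch omits a format field on the File object, which some runners may require if the workflow enforces the FASTA format.

```python
import json


def make_job(transcripts_path, single_best_only=None):
    """Build a CWL job object for the workflow's two inputs (hypothetical helper)."""
    # transcriptsFile is a required File input.
    job = {"transcriptsFile": {"class": "File", "path": transcripts_path}}
    # singleBestOnly is optional, so it is left out unless a value is given.
    if single_best_only is not None:
        job["singleBestOnly"] = bool(single_best_only)
    return job


if __name__ == "__main__":
    # "transcripts.fasta" is a placeholder file name, not taken from the workflow.
    print(json.dumps(make_job("transcripts.fasta", single_best_only=True), indent=2))
```

The printed JSON can be saved to a file and passed to a runner alongside the workflow, for example as the second argument to cwltool.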

Steps

ID	Runs	Label	Doc
extract_long_orfs	../tools/TransDecoder/TransDecoder.LongOrfs-v5.cwl (CommandLineTool)	TransDecoder.LongOrfs: Perl script that extracts the long open reading frames	TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed from RNA-Seq alignments to the genome using TopHat and Cufflinks. It identifies likely coding sequences based on the following criteria: (1) a minimum-length open reading frame (ORF) is found in a transcript sequence; (2) a log-likelihood score, similar to what is computed by the GeneID software, is > 0; (3) this coding score is greatest when the ORF is scored in the 1st reading frame, compared to scores in the other 2 forward reading frames; (4) if a candidate ORF is fully encapsulated by the coordinates of another candidate ORF, the longer one is reported, although a single transcript can still report multiple ORFs (allowing for operons, chimeras, etc.); (5) a PSSM is built, trained, and used to refine the start codon prediction; (6) optionally, the putative peptide has a match to a Pfam domain above the noise cutoff score. Full documentation is at https://github.com/TransDecoder/TransDecoder/wiki and releases can be downloaded from https://github.com/TransDecoder/TransDecoder/releases
predict_coding_regions	../tools/TransDecoder/TransDecoder.Predict-v5.cwl (CommandLineTool)	TransDecoder.Predict: Perl script that predicts the likely coding regions	TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed from RNA-Seq alignments to the genome using TopHat and Cufflinks. It identifies likely coding sequences based on the following criteria: (1) a minimum-length open reading frame (ORF) is found in a transcript sequence; (2) a log-likelihood score, similar to what is computed by the GeneID software, is > 0; (3) this coding score is greatest when the ORF is scored in the 1st reading frame, compared to scores in the other 2 forward reading frames; (4) if a candidate ORF is fully encapsulated by the coordinates of another candidate ORF, the longer one is reported, although a single transcript can still report multiple ORFs (allowing for operons, chimeras, etc.); (5) a PSSM is built, trained, and used to refine the start codon prediction; (6) optionally, the putative peptide has a match to a Pfam domain above the noise cutoff score. Full documentation is at https://github.com/TransDecoder/TransDecoder/wiki and releases can be downloaded from https://github.com/TransDecoder/TransDecoder/releases
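
Outside of CWL, the two steps correspond roughly to two command-line calls run in order. The sketch below only illustrates that ordering. It assumes both TransDecoder scripts are on the PATH, and it assumes that singleBestOnly maps to the --single_best_only option of TransDecoder.Predict, which the listing above does not confirm. The function names are hypothetical.

```python
import subprocess


def build_commands(transcripts, single_best_only=False):
    """Return the two command lines in execution order (hypothetical helper)."""
    # Step 1: extract long open reading frames from the transcripts.
    long_orfs = ["TransDecoder.LongOrfs", "-t", transcripts]
    # Step 2: predict likely coding regions, using the results of step 1.
    predict = ["TransDecoder.Predict", "-t", transcripts]
    if single_best_only:
        # Assumption: this is the flag the CWL input singleBestOnly controls.
        predict.append("--single_best_only")
    return [long_orfs, predict]


def run_workflow(transcripts, single_best_only=False):
    """Run both steps sequentially, stopping at the first failure."""
    for cmd in build_commands(transcripts, single_best_only):
        subprocess.run(cmd, check=True)
```

Step 2 depends on the intermediate files that step 1 writes, which is why the commands run strictly in sequence and check=True stops the run if the first call fails.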

Outputs

ID	Type	Label	Doc
bed_output	File
gff3_output	File
coding_regions	File
peptide_sequences	File
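
TransDecoder.Predict names its result files after the input transcripts file, adding a .transdecoder suffix and an extension. The sketch below shows one plausible mapping from the workflow's output IDs to those file names. The pairing of each output ID with an extension is an assumption inferred from the ID names, not something the listing above states, and the helper name is hypothetical.

```python
import os

# Assumed mapping from workflow output IDs to TransDecoder.Predict file suffixes.
SUFFIXES = {
    "bed_output": ".transdecoder.bed",
    "gff3_output": ".transdecoder.gff3",
    "coding_regions": ".transdecoder.cds",
    "peptide_sequences": ".transdecoder.pep",
}


def expected_outputs(transcripts_path):
    """Map each output ID to the file name expected for the given input."""
    # Only the base name of the input is used, not its directory.
    base = os.path.basename(transcripts_path)
    return {out_id: base + suffix for out_id, suffix in SUFFIXES.items()}
```

For an input named transcripts.fasta, this would give transcripts.fasta.transdecoder.pep for peptide_sequences, and similarly for the other three outputs.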

Permalink: https://w3id.org/cwl/view/git/e9bbe2917384efc75ba067db23612bc8e22f3f06/workflows/TransDecoder-v5-wf-2steps.cwl