Workflow: TransDecoder 2-step workflow, running TransDecoder.LongOrfs (step 1) followed by TransDecoder.Predict (step 2)
Inputs
ID | Type | Title | Doc |
---|---|---|---|
singleBestOnly | Boolean (Optional) | | |
transcriptsFile | File [FASTA] | | |
Steps
ID | Runs | Label | Doc |
---|---|---|---|
extract_long_orfs | ../tools/TransDecoder/TransDecoder.LongOrfs-v5.cwl (CommandLineTool) | TransDecoder.LongOrfs: Perl script, which extracts the long open reading frames |
TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed from RNA-Seq alignments to the genome using TopHat and Cufflinks.
TransDecoder identifies likely coding sequences based on the following criteria:
+ an open reading frame (ORF) of at least a minimum length is found in a transcript sequence
+ a log-likelihood score, similar to the one computed by the GeneID software, is > 0
+ the above coding score is greatest when the ORF is scored in the first reading frame,
as compared to scores in the other two forward reading frames
+ if a candidate ORF is fully encapsulated by the coordinates of another candidate ORF,
the longer one is reported. However, a single transcript can report multiple ORFs
(allowing for operons, chimeras, etc.)
+ a PSSM is built, trained, and used to refine the start codon prediction
+ optionally, the putative peptide has a match to a Pfam domain above the noise cutoff score |
predict_coding_regions | ../tools/TransDecoder/TransDecoder.Predict-v5.cwl (CommandLineTool) | TransDecoder.Predict: Perl script, which predicts the likely coding regions |
TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed from RNA-Seq alignments to the genome using TopHat and Cufflinks.
TransDecoder identifies likely coding sequences based on the following criteria:
+ an open reading frame (ORF) of at least a minimum length is found in a transcript sequence
+ a log-likelihood score, similar to the one computed by the GeneID software, is > 0
+ the above coding score is greatest when the ORF is scored in the first reading frame,
as compared to scores in the other two forward reading frames
+ if a candidate ORF is fully encapsulated by the coordinates of another candidate ORF,
the longer one is reported. However, a single transcript can report multiple ORFs
(allowing for operons, chimeras, etc.)
+ a PSSM is built, trained, and used to refine the start codon prediction
+ optionally, the putative peptide has a match to a Pfam domain above the noise cutoff score
Please visit https://github.com/TransDecoder/TransDecoder/wiki for full documentation.
Releases can be downloaded from https://github.com/TransDecoder/TransDecoder/releases |
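To make two of the criteria above concrete (the minimum-length ORF and the rule that an ORF fully enclosed by another is dropped in favor of the longer one), here is a minimal Python sketch. It is not TransDecoder's implementation: it scans only the forward strand, reports only complete ATG-to-stop ORFs, and does no coding-potential scoring. The function names `find_orfs` and `drop_contained` and the `min_aa` parameter are hypothetical names chosen for this illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_aa=100):
    """Return (start, end, frame) for complete ATG..stop ORFs on the forward strand.

    Coordinates are 0-based and end-exclusive; the span includes the stop codon.
    min_aa is the minimum number of codons before the stop (an assumed threshold).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_aa:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


def drop_contained(orfs):
    """Keep only ORFs that are not fully enclosed by a longer kept ORF."""
    kept = []
    # Visit the longest ORFs first so that enclosing ORFs are kept before
    # the shorter ORFs they contain are examined.
    for s, e, f in sorted(orfs, key=lambda o: o[1] - o[0], reverse=True):
        if not any(ks <= s and e <= ke for ks, ke, _ in kept):
            kept.append((s, e, f))
    return sorted(kept)
```

For example, `drop_contained(find_orfs(sequence, min_aa=100))` would list the surviving candidates for one transcript; several non-nested ORFs on the same transcript are all retained, matching the note about operons and chimeras.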
Outputs
ID | Type | Label | Doc |
---|---|---|---|
bed_output | File | | |
gff3_output | File | | |
coding_regions | File | | |
peptide_sequences | File | | |
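As a rough illustration of what the two steps amount to outside of CWL, the Python sketch below builds the two command lines in order. It assumes that `transcriptsFile` is passed through the `-t` option and that `singleBestOnly` corresponds to the `--single_best_only` flag of TransDecoder.Predict; the CWL tool wrappers may set further options not shown here. The helper names, and the mapping of the workflow outputs to `<transcripts file name>.transdecoder.*` files (in particular `coding_regions` to `.cds` and `peptide_sequences` to `.pep`), are assumptions made for this example rather than facts taken from the workflow definition.

```python
from pathlib import Path


def transdecoder_commands(transcripts_file, single_best_only=False):
    """Build argument lists for step 1 (LongOrfs) and step 2 (Predict)."""
    long_orfs = ["TransDecoder.LongOrfs", "-t", str(transcripts_file)]
    predict = ["TransDecoder.Predict", "-t", str(transcripts_file)]
    if single_best_only:
        # Assumed counterpart of the optional singleBestOnly workflow input.
        predict.append("--single_best_only")
    return [long_orfs, predict]


def expected_outputs(transcripts_file):
    """Guess the output file names, keyed by the workflow output IDs (assumed mapping)."""
    base = Path(transcripts_file).name
    return {
        "bed_output": f"{base}.transdecoder.bed",
        "gff3_output": f"{base}.transdecoder.gff3",
        "coding_regions": f"{base}.transdecoder.cds",
        "peptide_sequences": f"{base}.transdecoder.pep",
    }
```

Each argument list can be handed to `subprocess.run(cmd, check=True)` in sequence; step 2 has to run after step 1 because it relies on the intermediate files that TransDecoder.LongOrfs writes.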
https://w3id.org/cwl/view/git/b32b610cde3524e628d1c7370a21d167da19470c/workflows/TransDecoder-v5-wf-2steps.cwl