Workflow: contig construction and protein prediction

Fetched 2025-10-19 19:46:05 GMT

This workflow constructs metagenomic contigs and predicts protein sequences from those contigs. It executes two processes: contig construction and protein prediction. Related CWL files: ./Tools/01_megahit.cwl, ./Tools/02_rename.cwl, ./Tools/03_seqkit_stats.cwl, ./Tools/04_prodigal.cwl

Inputs

ID Type Title Doc
FASTQ_1 File Forward reads file (_1_trim.fastq.gz)

trimmed metagenomic forward reads file (_1_trim.fastq.gz)

FASTQ_2 File Reverse reads file (_2_trim.fastq.gz)

trimmed metagenomic reverse reads file (_2_trim.fastq.gz)

THREADS Integer Threads

Number of threads to use

MEGAHIT_OUTPUT_DIR String Output directory name (e.g., contig${sample_name})

Output directory name (e.g., contig${sample_name}). Do not create the directory before execution.
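
As an illustration of how the four inputs above map onto a CWL job object, the following Python sketch builds one and prints it as JSON, which CWL runners accept as a job file. The sample name and read paths are hypothetical. The directory-existence check only approximates the note above: it looks in the current directory, which may not be where the runner actually executes MEGAHIT.

```python
import json
from pathlib import Path
from typing import Any, Dict


def build_job(fastq_1: str, fastq_2: str, threads: int, outdir_name: str) -> Dict[str, Any]:
    """Build a CWL job object for the four workflow inputs."""
    if threads < 1:
        raise ValueError("THREADS must be a positive integer")
    # The input doc says the output directory must not exist before execution.
    # Assumption: checking relative to the current directory is good enough here.
    if Path(outdir_name).exists():
        raise FileExistsError(f"{outdir_name} already exists; choose another name")
    return {
        # CWL File inputs are objects with class "File" and a path.
        "FASTQ_1": {"class": "File", "path": fastq_1},
        "FASTQ_2": {"class": "File", "path": fastq_2},
        "THREADS": threads,
        "MEGAHIT_OUTPUT_DIR": outdir_name,
    }


if __name__ == "__main__":
    # Hypothetical sample name and read files, for illustration only.
    job = build_job(
        "sampleA_1_trim.fastq.gz",
        "sampleA_2_trim.fastq.gz",
        8,
        "contig_sampleA",
    )
    print(json.dumps(job, indent=2))
```

Saving the printed JSON as, say, job.json and passing it to a CWL runner together with megahit_prodigal_sw.cwl (for example: cwltool megahit_prodigal_sw.cwl job.json) would be one way to launch the workflow.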

Steps

ID Runs Label Doc
MEGAHIT
../Tools/01_megahit.cwl (CommandLineTool)
metagenome assembly process using MEGAHIT

Metagenome assembly process using MEGAHIT. Input files are the trimmed metagenomic FASTQ files obtained from 00_fastp.cwl; the outputs are the metagenomic contigs. Original script: https://github.com/RyoMameda/workflow/blob/main/03_assembly.sh

SEQKIT_STATS
../Tools/03_seqkit_stats.cwl (CommandLineTool)
seqkit stats process

Statistical analysis of metagenomic contigs using SeqKit. Checks values (such as N50, length ...) to confirm contig quality. Original script: https://github.com/RyoMameda/workflow/blob/main/03_assembly.sh

RENAME_CONTIG
../Tools/02_rename.cwl (CommandLineTool)
rename process

Renames the assembled contigs.

PROTEIN_PREDICTION
../Tools/04_prodigal.cwl (CommandLineTool)
prodigal process

Prediction of protein-coding sequences from metagenomic contigs using Prodigal.
Original script: https://github.com/RyoMameda/workflow/blob/main/04_prodigal.sh
Command: prodigal -i ${contig} -o ${output}.gbk -p meta -q -a ${output}.faa -d ${output}.fna
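
To make the Prodigal invocation quoted above easier to read, here is a minimal Python sketch that assembles the same argument list from a contig path and an output prefix. The function name and the example file names are hypothetical. The flags are taken verbatim from the command in the step description. The sketch only builds and prints the command; it does not run Prodigal.

```python
import shlex
from typing import List


def prodigal_command(contig: str, output: str) -> List[str]:
    """Return the Prodigal argument list quoted in the step description."""
    return [
        "prodigal",
        "-i", contig,            # input contigs (FASTA)
        "-o", f"{output}.gbk",   # gene coordinates output file
        "-p", "meta",            # metagenomic procedure
        "-q",                    # quiet mode
        "-a", f"{output}.faa",   # protein translations
        "-d", f"{output}.fna",   # nucleotide sequences of predicted genes
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(prodigal_command("named_contigs.fa", "sampleA")))
```

Passing the list to subprocess.run instead of printing it would execute the step directly, assuming prodigal is installed and on PATH.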

Outputs

ID Type Label Doc
CONTIG_STATS_FILE File stats file

Text file containing metagenomic contig stats

PREDICTED_PROTEINS File Output protein fasta file

Predicted protein sequences, written to the selected file

PREDICTED_CODING_DNA File Output dna fasta file

Protein-coding nucleotide sequences, written to the selected file

NAMED_CONTIG_FASTA_FILE File Output contigs fasta file

Named metagenomic contigs fasta file

MEGAHIT_CONTIG_DIRECTORY Directory Output directory of megahit

Output directory containing final metagenomic contigs fasta file (final.contigs.fa)
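
The SEQKIT_STATS step reports values such as N50 in CONTIG_STATS_FILE. For readers unfamiliar with the metric, the sketch below computes N50 from contig lengths using the usual definition: the length L such that contigs of length L or longer cover at least half of the total assembly length. The helper names are hypothetical, and this is not SeqKit's implementation, so edge-case behaviour may differ from the stats file.

```python
from typing import Iterable, List


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of the given contig lengths (0 for empty input)."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig where the running sum reaches half the total.
        if running * 2 >= total:
            return length
    return 0


if __name__ == "__main__":
    # Tiny made-up FASTA, for illustration only.
    records = [">contig_1", "ACGTAC", ">contig_2", "GG"]
    print(n50(fasta_lengths(records)))
```

To apply this to real data, the lines of final.contigs.fa in MEGAHIT_CONTIG_DIRECTORY could be passed to fasta_lengths in place of the made-up records.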

Permalink: https://w3id.org/cwl/view/git/1838569c1d6d3c15f58c254667d4c6258e67e5a6/Workflow/megahit_prodigal_sw.cwl