Workflow: EMG assembly for paired end Illumina

Fetched 2021-07-28 11:13:45 GMT

Inputs

ID                         Type          Title  Doc
covariance_model_database  File
mapseq_ref                 File [FASTA]
mapseq_taxonomies          File[]
assembly_mem_limit         Integer              in Gb
forward_reads              File [FASTQ]
reverse_reads              File [FASTQ]
fraggenescan_model         https://w3id.org/cwl/view/git/30397448563d06c342b25a3603c97b6fff7ba7d3/tools/FragGeneScan-model.yaml#model
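
The inputs above could be supplied as a CWL input object. The following is a minimal sketch in Python that builds such an object and serializes it to JSON. All file paths and the memory limit are hypothetical placeholders, and the value for fraggenescan_model is deliberately left as a placeholder string because its allowed values are defined in FragGeneScan-model.yaml, which is not reproduced here.

```python
import json


def cwl_file(path):
    """Return a CWL File object pointing at a local path."""
    return {"class": "File", "path": path}


def build_job(cm_db, mapseq_ref, mapseq_taxonomies, forward, reverse,
              mem_limit_gb, fraggenescan_model):
    """Assemble an input object keyed by the workflow's input IDs."""
    return {
        "covariance_model_database": cwl_file(cm_db),
        "mapseq_ref": cwl_file(mapseq_ref),
        # File[] input: a list of File objects
        "mapseq_taxonomies": [cwl_file(p) for p in mapseq_taxonomies],
        "assembly_mem_limit": mem_limit_gb,  # Integer, in Gb
        "forward_reads": cwl_file(forward),
        "reverse_reads": cwl_file(reverse),
        "fraggenescan_model": fraggenescan_model,
    }


# Every value below is a hypothetical placeholder, not taken from the workflow.
job = build_job(
    cm_db="models.cm",
    mapseq_ref="ref.fasta",
    mapseq_taxonomies=["ref.tax"],
    forward="reads_1.fastq",
    reverse="reads_2.fastq",
    mem_limit_gb=100,
    fraggenescan_model="<value allowed by FragGeneScan-model.yaml>",
)

print(json.dumps(job, indent=2))
```

The printed JSON could be saved to a file and passed to a CWL runner together with the workflow, once the placeholder model value is replaced with a real one.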

Steps

ID Runs Label Doc
fraggenescan
../tools/FragGeneScan1_20.cwl (CommandLineTool)
FragGeneScan: find (fragmented) genes in short reads

FragGeneScan is an application for finding (fragmented) genes in short reads. It can also be applied to predict prokaryotic genes in incomplete assemblies or complete genomes.

FragGeneScan was first released through the omics website (http://omics.informatics.indiana.edu/FragGeneScan/) in March 2010, where its old releases can still be found. FragGeneScan migrated to SourceForge in October 2013 (https://sourceforge.net/projects/fraggenescan/).

Version 1.20 can be downloaded here: https://sourceforge.net/projects/fraggenescan/files/

cmscan
../tools/infernal-cmscan.cwl (CommandLineTool)
search sequence(s) against a covariance model database

http://eddylab.org/infernal/Userguide.pdf

remove_asterisks_and_reformat
../tools/esl-reformat.cwl (CommandLineTool)
normalize to fasta

normalizes input sequences to FASTA with a fixed number of sequence characters per line using esl-reformat from https://github.com/EddyRivasLab/easel
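
To illustrate what this kind of normalization amounts to, here is a small Python sketch. It is not how esl-reformat is implemented; it only mimics the described effect. The removal of asterisks is an assumption taken from the step name, and the default line width of 60 is an arbitrary choice for the example.

```python
def normalize_fasta(text, width=60):
    """Rewrap FASTA text to a fixed line width, dropping '*' characters."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        else:
            chunks.append(line.replace("*", ""))
    if header is not None:
        records.append((header, "".join(chunks)))

    out = []
    for header, seq in records:
        out.append(header)
        # Emit the sequence in slices of at most `width` characters.
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)


example = ">p1\nMKV*\nLL\n>p2\nAAAA*\n"
print(normalize_fasta(example, width=3), end="")
```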

index_scaffolds
../tools/esl-sfetch-index.cwl (CommandLineTool)
index a sequence file for use by esl-sfetch

https://github.com/EddyRivasLab/easel

extract_SSUs
../tools/esl-sfetch-manyseqs.cwl (CommandLineTool)
extract by names from an indexed sequence file

https://github.com/EddyRivasLab/easel
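
The index-then-fetch pattern used by the index_scaffolds and extract_SSUs steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the esl-sfetch index format or implementation: the index here is just an in-memory mapping from sequence name to the byte offset of its header line, and the function names are hypothetical.

```python
import io


def build_index(handle):
    """Map each sequence name to the byte offset of its header line."""
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            fields = line[1:].split()
            if fields:
                # The name is the first whitespace-delimited token.
                index[fields[0].decode()] = offset
    return index


def fetch(handle, index, names):
    """Return the FASTA records for the given names, in the order requested."""
    out = []
    for name in names:
        handle.seek(index[name])
        record = [handle.readline()]  # header line
        while True:
            line = handle.readline()
            if not line or line.startswith(b">"):
                break
            record.append(line)
        out.append(b"".join(record))
    return b"".join(out)


# Hypothetical in-memory sequence file used only for demonstration.
scaffolds = io.BytesIO(b">s1 desc\nACGT\nAC\n>s2\nGGGG\n>s3\nTT\n")
idx = build_index(scaffolds)
print(fetch(scaffolds, idx, ["s3", "s1"]).decode(), end="")
```

Seeking to a stored offset avoids rescanning the whole file for each requested name, which is the point of building an index before extracting many sequences.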

assembly
../tools/metaspades.cwl (CommandLineTool)
metaSPAdes: de novo metagenomics assembler

https://arxiv.org/abs/1604.03071 http://cab.spbu.ru/files/release3.10.1/manual.html#meta

classify_SSUs
../tools/mapseq.cwl (CommandLineTool)
MAPseq

sequence read classification tools designed to assign taxonomy and OTU classifications to ribosomal RNA sequences. http://meringlab.org/software/mapseq/

interproscan
../tools/InterProScan5.21-60.cwl (CommandLineTool)
InterProScan: protein sequence classifier

Version 5.21-60 can be downloaded here: https://github.com/ebi-pf-team/interproscan/wiki/HowToDownload

Documentation on how to run InterProScan 5 can be found here: https://github.com/ebi-pf-team/interproscan/wiki/HowToRun

get_SSU_coords

Outputs

ID Type Label Doc
classifications File
pCDS File
scaffolds File
SSUs File
annotations File
Permalink: https://w3id.org/cwl/view/git/30397448563d06c342b25a3603c97b6fff7ba7d3/workflows/emg-assembly.cwl