Workflow: EMG core analysis

Fetched 2022-10-02 18:48:17 GMT

Inputs

ID Type Title Doc
mapseq_ref File [FASTA]
input_sequences File [FASTA]
mapseq_taxonomy File
go_summary_config File
sequencing_run_id String
fraggenescan_model https://w3id.org/cwl/view/git/7bb76f33bf40b5cd2604001cac46f967a209c47f/tools/FragGeneScan-model.yaml#model
ncRNA_other_models File[]
ncRNA_ribosomal_models File[]
ncRNA_other_model_clans File
ncRNA_ribosomal_model_clans File
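
The inputs listed above are what a CWL runner expects in the job order for this workflow. The following is a minimal sketch of assembling such a job order in Python. The helper names `cwl_file` and `build_job` are hypothetical and are not part of the workflow repository. The accepted values for `fraggenescan_model` are defined in FragGeneScan-model.yaml and are not listed on this page, so the sketch passes that value through unchanged.

```python
import json

# Input IDs grouped by the types shown in the Inputs table.
FILE_INPUTS = [
    "mapseq_ref", "input_sequences", "mapseq_taxonomy", "go_summary_config",
    "ncRNA_other_model_clans", "ncRNA_ribosomal_model_clans",
]
FILE_ARRAY_INPUTS = ["ncRNA_other_models", "ncRNA_ribosomal_models"]
OTHER_INPUTS = ["sequencing_run_id", "fraggenescan_model"]


def cwl_file(path):
    """Return a CWL File object as it appears in a job order."""
    return {"class": "File", "path": path}


def build_job(files, file_arrays, run_id, model):
    """Assemble a job order dict; raise ValueError if a listed input is missing.

    files:       maps each File input ID to a path
    file_arrays: maps each File[] input ID to a list of paths
    model:       value for fraggenescan_model (allowed values are defined
                 in FragGeneScan-model.yaml, not on this page)
    """
    missing = [k for k in FILE_INPUTS if k not in files]
    missing += [k for k in FILE_ARRAY_INPUTS if k not in file_arrays]
    if missing:
        raise ValueError("missing inputs: " + ", ".join(missing))
    job = {k: cwl_file(files[k]) for k in FILE_INPUTS}
    for k in FILE_ARRAY_INPUTS:
        job[k] = [cwl_file(p) for p in file_arrays[k]]
    job["sequencing_run_id"] = run_id
    job["fraggenescan_model"] = model
    return job


def dump_job(job):
    """Serialize the job order as JSON, which CWL runners accept as input."""
    return json.dumps(job, indent=2, sort_keys=True)
```

Writing the result of `dump_job` to a file gives a job order that can be passed to a CWL runner together with the workflow file. All paths are supplied by the caller; none are taken from this page.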

Steps

ID Runs Label Doc
ipr_stats
../tools/ipr_stats.cwl (CommandLineTool)
gather stats from InterProScan
orf_stats
../tools/orf_stats.cwl (CommandLineTool)
gather stats from ORF caller
extract_5Ss
../tools/esl-sfetch-manyseqs.cwl (CommandLineTool)
extract by names from an indexed sequence file

https://github.com/EddyRivasLab/easel

index_reads
../tools/esl-sfetch-index.cwl (CommandLineTool)
index a sequence file for use by esl-sfetch

https://github.com/EddyRivasLab/easel

extract_LSUs
../tools/esl-sfetch-manyseqs.cwl (CommandLineTool)
extract by names from an indexed sequence file

https://github.com/EddyRivasLab/easel

extract_SSUs
../tools/esl-sfetch-manyseqs.cwl (CommandLineTool)
extract by names from an indexed sequence file

https://github.com/EddyRivasLab/easel

classify_SSUs
../tools/mapseq.cwl (CommandLineTool)
MAPseq

A sequence read classification tool designed to assign taxonomy and OTU classifications to ribosomal RNA sequences. http://meringlab.org/software/mapseq/

get_5S_coords
ORF_prediction
orf_prediction.cwl (Workflow)
Find reads with predicted coding sequences above 60 AA in length
categorisation
../tools/create_categorisations.cwl (CommandLineTool)
categorise sequences
get_LSU_coords
get_SSU_coords
sequence_stats
../tools/qc-stats.cwl (CommandLineTool)
Post QC-ed input analysis of sequence file
find_other_ncRNAs
functional_analysis
functional analysis prediction with InterProScan
visualize_otu_counts
../tools/krona.cwl (CommandLineTool)
visualize using krona
find_ribosomal_ncRNAs
convert_otu_counts_to_hdf5
../tools/biom-convert.cwl (CommandLineTool)
convert_otu_counts_to_json
../tools/biom-convert.cwl (CommandLineTool)
remove_asterisks_and_reformat
../tools/esl-reformat.cwl (CommandLineTool)
normalize to fasta

Normalizes input sequences to FASTA with a fixed number of sequence characters per line, using esl-reformat from https://github.com/EddyRivasLab/easel

convert_classifications_to_otu_counts
../tools/mapseq2biom.cwl (CommandLineTool)
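
Several steps above (index_reads, extract_5Ss, extract_LSUs, extract_SSUs) follow an index-then-fetch pattern: a sequence file is indexed once, and named records are then retrieved from it. The sketch below illustrates that idea in plain Python. It is not Easel's implementation and does not reproduce the esl-sfetch index format; the function names `index_fasta` and `fetch_by_names` are hypothetical.

```python
import io


def index_fasta(handle):
    """Map each sequence name to the byte offset of its header line.

    handle must be a seekable binary file object positioned at the start.
    The name is the first whitespace-delimited token after '>'.
    """
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            fields = line[1:].split()
            if fields:
                index[fields[0].decode()] = offset
    return index


def fetch_by_names(handle, index, names):
    """Return the FASTA records for names, in the order requested."""
    records = []
    for name in names:
        handle.seek(index[name])  # jump straight to the header line
        lines = [handle.readline()]
        while True:
            line = handle.readline()
            if not line or line.startswith(b">"):
                break  # reached the next record or end of file
            lines.append(line)
        records.append(b"".join(lines).decode())
    return records


if __name__ == "__main__":
    demo = io.BytesIO(b">r1 example\nACGT\nAC\n>r2\nGGGG\n")
    idx = index_fasta(demo)
    print(fetch_by_names(demo, idx, ["r2"])[0], end="")
```

Storing offsets rather than sequences keeps the index small, so retrieval cost depends on the number of names requested rather than on the size of the file.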

Outputs

ID Type Label Doc
pCDS_seqs File
go_summary File
numberOrfs Integer
match_count Integer
qc_stats_gc File
stats_reads File
5S_sequences File
interproscan File
other_ncRNAs File
readsWithOrf File
LSU_sequences File
SSU_sequences File
predicted_CDS File
go_summary_slim File
qc_stats_gc_bin File
qc_stats_seq_len File
qc_stats_summary File
no_functions_seqs File
qc_stats_gc_pcbin File
qc_stats_nuc_dist File
numberReadsWithOrf Integer
ssu_classifications File
ssu_otu_counts_hdf5 File
ssu_otu_counts_json File
CDS_with_match_count Integer
qc_stats_seq_len_bin File
ssu_otu_visualization File
functional_annotations File
qc_stats_seq_len_pcbin File
reads_with_match_count Integer
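
The output types listed above can be used to sanity-check the result of a run. The sketch below assumes the output object is the JSON document that a CWL runner such as cwltool prints when a workflow finishes, in which File outputs appear as objects with `"class": "File"` and Integer outputs as plain numbers. The function name `check_outputs` is hypothetical.

```python
# Output IDs grouped by the types shown in the Outputs table.
INTEGER_OUTPUTS = {
    "numberOrfs", "match_count", "numberReadsWithOrf",
    "CDS_with_match_count", "reads_with_match_count",
}
FILE_OUTPUTS = {
    "pCDS_seqs", "go_summary", "qc_stats_gc", "stats_reads", "5S_sequences",
    "interproscan", "other_ncRNAs", "readsWithOrf", "LSU_sequences",
    "SSU_sequences", "predicted_CDS", "go_summary_slim", "qc_stats_gc_bin",
    "qc_stats_seq_len", "qc_stats_summary", "no_functions_seqs",
    "qc_stats_gc_pcbin", "qc_stats_nuc_dist", "ssu_classifications",
    "ssu_otu_counts_hdf5", "ssu_otu_counts_json", "qc_stats_seq_len_bin",
    "ssu_otu_visualization", "functional_annotations",
    "qc_stats_seq_len_pcbin",
}


def check_outputs(outputs):
    """Return a list of problems found in a parsed output object.

    An empty list means every listed output is present with the
    expected kind of value.
    """
    problems = []
    for name in sorted(FILE_OUTPUTS):
        value = outputs.get(name)
        if not (isinstance(value, dict) and value.get("class") == "File"):
            problems.append(name + ": expected a File object")
    for name in sorted(INTEGER_OUTPUTS):
        value = outputs.get(name)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(name + ": expected an integer")
    return problems
```

A typical use is to parse the runner's JSON output with `json.loads` and fail the pipeline if `check_outputs` returns a non-empty list.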
Permalink: https://w3id.org/cwl/view/git/7bb76f33bf40b5cd2604001cac46f967a209c47f/workflows/emg-core-analysis-v4.cwl