Workflow: EMG core analysis

Fetched 2019-11-20 11:19:33 GMT

Inputs

ID Type Title Doc
ncRNA_ribosomal_models File[]
input_sequences File [FASTA]
go_summary_config File
mapseq_ref File [FASTA]
ncRNA_ribosomal_model_clans File
sequencing_run_id String
mapseq_taxonomy File
ncRNA_other_models File[]
ncRNA_other_model_clans File
fraggenescan_model https://w3id.org/cwl/view/git/4d2e0a49a7581fb2a73c60a6833e8cb97a282e55/tools/FragGeneScan-model.yaml#model
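
The inputs listed above can be collected into a job object for a CWL runner. The following is a minimal sketch, not part of the workflow itself: the input IDs and their File / File[] / String types come from the table, while every file name, the run ID, and the `illumina_10` model value are hypothetical placeholders. The accepted values for `fraggenescan_model` are defined in the linked FragGeneScan-model.yaml, which is not reproduced here.

```python
import json

# Input IDs taken from the Inputs table; single File inputs and File[] inputs.
SINGLE_FILES = [
    "input_sequences",
    "go_summary_config",
    "mapseq_ref",
    "mapseq_taxonomy",
    "ncRNA_ribosomal_model_clans",
    "ncRNA_other_model_clans",
]
FILE_ARRAYS = ["ncRNA_ribosomal_models", "ncRNA_other_models"]


def cwl_file(path):
    """Return a CWL File object for a local path."""
    return {"class": "File", "path": path}


def build_job(paths, run_id, fraggenescan_model):
    """Build a job object covering the ten inputs of the workflow."""
    job = {name: cwl_file(paths[name]) for name in SINGLE_FILES}
    for name in FILE_ARRAYS:
        job[name] = [cwl_file(p) for p in paths[name]]
    job["sequencing_run_id"] = run_id
    job["fraggenescan_model"] = fraggenescan_model  # assumed value, see lead-in
    return job


# All paths below are hypothetical placeholders.
example_paths = {
    "input_sequences": "reads.fasta",
    "go_summary_config": "go_summary.config",
    "mapseq_ref": "mapseq_ref.fasta",
    "mapseq_taxonomy": "mapseq_taxonomy.txt",
    "ncRNA_ribosomal_model_clans": "ribosomal.claninfo",
    "ncRNA_other_model_clans": "other.claninfo",
    "ncRNA_ribosomal_models": ["SSU.cm", "LSU.cm"],
    "ncRNA_other_models": ["tRNA.cm"],
}
job = build_job(example_paths, "RUN_PLACEHOLDER", "illumina_10")
job_json = json.dumps(job, indent=2, sort_keys=True)
```

Since JSON is valid YAML, `job_json` could be saved to a file and passed to a CWL runner alongside the workflow document.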

Steps

ID Runs Label Doc
convert_otu_counts_to_json
  ../tools/biom-convert.cwl (CommandLineTool)
remove_asterisks_and_reformat
  ../tools/esl-reformat.cwl (CommandLineTool)
  normalize to fasta
  normalizes input sequences to FASTA with a fixed number of sequence characters per line using esl-reformat from https://github.com/EddyRivasLab/easel
convert_otu_counts_to_hdf5
  ../tools/biom-convert.cwl (CommandLineTool)
get_LSU_coords
extract_SSUs
  ../tools/esl-sfetch-manyseqs.cwl (CommandLineTool)
  extract by names from an indexed sequence file
  https://github.com/EddyRivasLab/easel
ipr_stats
  ../tools/ipr_stats.cwl (CommandLineTool)
  gather stats from InterProScan
classify_SSUs
  ../tools/mapseq.cwl (CommandLineTool)
  MAPseq
  sequence read classification tools designed to assign taxonomy and OTU classifications to ribosomal RNA sequences. http://meringlab.org/software/mapseq/
categorisation
  ../tools/create_categorisations.cwl (CommandLineTool)
  categorise sequences
get_SSU_coords
functional_analysis
  functional analysis prediction with InterProScan
find_ribosomal_ncRNAs
visualize_otu_counts
  ../tools/krona.cwl (CommandLineTool)
  visualize using krona
sequence_stats
  ../tools/qc-stats.cwl (CommandLineTool)
  Post QC-ed input analysis of sequence file
extract_LSUs
  ../tools/esl-sfetch-manyseqs.cwl (CommandLineTool)
  extract by names from an indexed sequence file
  https://github.com/EddyRivasLab/easel
find_other_ncRNAs
convert_classifications_to_otu_counts
  ../tools/mapseq2biom.cwl (CommandLineTool)
orf_stats
  ../tools/orf_stats.cwl (CommandLineTool)
  gather stats from ORF caller
get_5S_coords
ORF_prediction
  orf_prediction.cwl (Workflow)
  Find reads with predicted coding sequences above 60 AA in length
index_reads
  ../tools/esl-sfetch-index.cwl (CommandLineTool)
  index a sequence file for use by esl-sfetch
  https://github.com/EddyRivasLab/easel
extract_5Ss
  ../tools/esl-sfetch-manyseqs.cwl (CommandLineTool)
  extract by names from an indexed sequence file
  https://github.com/EddyRivasLab/easel
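
The remove_asterisks_and_reformat step is described as normalizing sequences to FASTA with a fixed number of sequence characters per line. The sketch below illustrates that idea in plain Python. It is a stand-in for explanation only, not how esl-reformat is implemented; the function name `normalize_fasta` and the default width of 60 are assumptions, and the removal of `*` characters is inferred from the step ID.

```python
def normalize_fasta(text, width=60):
    """Strip '*' from sequences and rewrap every record at `width` characters."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        else:
            chunks.append(line.replace("*", ""))
    if header is not None:
        records.append((header, "".join(chunks)))

    out = []
    for header, seq in records:
        out.append(header)
        # Emit the sequence in fixed-width slices.
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


normalized = normalize_fasta(">r1\nACGT*AC\nGT\n>r2\nTT\n", width=4)
```

With a width of 4, the first record's two uneven input lines are joined, stripped of the asterisk, and rewrapped as two lines of four characters each.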

Outputs

ID Type Label Doc
go_summary File
match_count Integer
numberOrfs Integer
qc_stats_gc_pcbin File
stats_reads File
qc_stats_seq_len_bin File
LSU_sequences File
ssu_otu_visualization File
interproscan File
qc_stats_nuc_dist File
other_ncRNAs File
no_functions_seqs File
numberReadsWithOrf Integer
predicted_CDS File
CDS_with_match_count Integer
5S_sequences File
qc_stats_gc File
reads_with_match_count Integer
ssu_otu_counts_json File
qc_stats_seq_len File
pCDS_seqs File
SSU_sequences File
readsWithOrf File
qc_stats_gc_bin File
qc_stats_seq_len_pcbin File
ssu_classifications File
qc_stats_summary File
go_summary_slim File
functional_annotations File
ssu_otu_counts_hdf5 File
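
CWL runners typically report results as a JSON object keyed by output ID, with File outputs represented as objects carrying `"class": "File"`. Assuming that shape, the outputs above could be split into integer counts and file locations as sketched below. The output IDs come from the table; the helper name `summarize_outputs` and all example values are hypothetical.

```python
# Output IDs listed with type Integer in the Outputs table.
INTEGER_OUTPUTS = [
    "match_count",
    "numberOrfs",
    "numberReadsWithOrf",
    "CDS_with_match_count",
    "reads_with_match_count",
]


def summarize_outputs(output_object):
    """Split a workflow output object into integer counts and file locations."""
    counts = {
        key: output_object[key]
        for key in INTEGER_OUTPUTS
        if key in output_object
    }
    files = {
        key: value.get("location", value.get("path"))
        for key, value in output_object.items()
        if isinstance(value, dict) and value.get("class") == "File"
    }
    return counts, files


# Hypothetical output object with made-up values, for illustration only.
example_outputs = {
    "numberOrfs": 120,
    "match_count": 45,
    "qc_stats_summary": {"class": "File", "location": "file:///tmp/out/summary"},
    "go_summary": {"class": "File", "location": "file:///tmp/out/go_summary"},
}
counts, files = summarize_outputs(example_outputs)
```

Outputs missing from the object are simply skipped, so the same helper works on partial results.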
Permalink: https://w3id.org/cwl/view/git/4d2e0a49a7581fb2a73c60a6833e8cb97a282e55/workflows/emg-core-analysis-v4.cwl