Inputs
ID | Type | Title | Doc |
---|---|---|---|
mills | File | mills: File specifying common polymorphic indels from Mills et al. |
mills provides known polymorphic indels recommended by GATK for a variety of tools, including the BaseRecalibrator. This file is part of the GATK resource bundle (http://www.broadinstitute.org/gatk/guide/article?id=1213). Essentially, it is a list of known indels originally discovered by Mills et al. (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1557762/). The file should be in VCF format and tabix indexed. |
strand | |||
refFlat | File | ||
docm_vcf | File | ||
expn_val | Float (Optional) | ||
omni_vcf | File | ||
rna_bams | File[] | ||
tdna_cov | Integer (Optional) | ||
tdna_vaf | Float (Optional) | ||
trna_cov | Integer (Optional) | ||
trna_vaf | Float (Optional) | ||
vep_pick | |||
dbsnp_vcf | File | dbsnp_vcf: File specifying common polymorphic indels from dbSNP |
dbsnp_vcf provides known indels recommended by GATK for a variety of tools, including the BaseRecalibrator. This file is part of the GATK resource bundle (http://www.broadinstitute.org/gatk/guide/article?id=1213). Essentially, it is a list of known indels from dbSNP. The file should be in VCF format and tabix indexed. |
reference | File | reference: Reference fasta file for a desired assembly |
reference contains the nucleotide sequence of the entire genome for a given assembly (GRCh37, GRCh38, etc.) in FASTA format. This is what reads will be aligned to. Appropriate files can be found on Ensembl at https://ensembl.org/info/data/ftp/index.html When providing the reference, the secondary files corresponding to reference indices must be located in the same directory as the reference itself. These files can be created with samtools faidx, bwa index, and picard CreateSequenceDictionary. |
cosmic_vcf | File (Optional) | ||
fasta_size | Integer (Optional) | ||
normal_cov | Integer (Optional) | ||
normal_vaf | Float (Optional) | ||
tumor_name | String (Optional) | tumor_name: String specifying the name of the MT sample |
tumor_name provides the string used to refer to the MT sample in the various outputs, for example the VCF files. |
exclude_nas | Boolean (Optional) | ||
netmhc_stab | Boolean (Optional) | netmhc_stab: sets an option whether to run NetMHCStabPan or not |
netmhc_stab controls whether NetMHCStabPan is run after all filtering to add stability predictions to the predicted epitopes. |
normal_name | String (Optional) | normal_name: String specifying the name of the WT sample |
normal_name provides the string used to refer to the WT sample in the various outputs, for example the VCF files. |
sample_name | String | ||
known_indels | File | known_indels: File specifying common polymorphic indels from 1000G |
known_indels provides known indels recommended by GATK for a variety of tools, including the BaseRecalibrator. This file is part of the GATK resource bundle (http://www.broadinstitute.org/gatk/guide/article?id=1213). Essentially, it is a list of known indels from the 1000 Genomes Phase I indel calls. The file should be in VCF format and tabix indexed. |
somalier_vcf | File | ||
gvcf_gq_bands | String[] | ||
interval_list | File | ||
manta_non_wgs | Boolean (Optional) | ||
optitype_name | String (Optional) | ||
synonyms_file | File (Optional) | ||
vep_cache_dir | Directory | ||
bait_intervals | File | bait_intervals: interval_list file of baits used in the sequencing experiment |
bait_intervals is an interval_list corresponding to the baits used in the sequencing reagent. These are essentially coordinates for the regions you were able to design probes for in the reagent. Typically the reagent provider has this information available in BED format, and it can be converted to an interval_list with Picard's BedToIntervalList. AstraZeneca also maintains a repo of baits for common sequencing reagents, available at https://github.com/AstraZeneca-NGS/reference_data |
bqsr_intervals | String[] | bqsr_intervals: Array of strings specifying regions for base quality score recalibration |
bqsr_intervals provides an array of genomic intervals to which GATK base quality score recalibration is applied. Typically, intervals are given as entire chromosomes (e.g. chr1, chr2); these names should match the format used in the reference file. |
cle_vcf_filter | Boolean | ||
kallisto_index | File | ||
known_variants | File (Optional) |
Previously discovered variants to be flagged in this pipeline's output VCF |
|
reference_dict | File | ||
rna_readgroups | String[] | ||
tumor_sequence | https://w3id.org/cwl/view/git/844c10a4466ab39c02e5bfa7a210c195b8efa77a/definitions/types/sequence_data.yml#sequence_data[] | tumor_sequence: file specifying the location of MT sequencing data |
tumor_sequence is a data structure described in sequence_data.yml, used to pass information regarding sequencing data for a single sample (i.e. fastq files). If more than one fastq file exists for a sample, as is the case for multiple instrument data, the sequence tag is simply repeated with the additional data (see example input file). Note that in the @RG field, ID and SM are required. |
epitope_lengths | Integer[] (Optional) | ||
net_chop_method | net_chop_method: NetChop prediction method to use ('cterm' for C term 3.0, '20s' for 20S 3.0) |
net_chop_method is used to specify which NetChop prediction method to use ("cterm" for C term 3.0, "20s" for 20S 3.0). C-term 3.0 is trained with publicly available MHC class I ligands, and the authors believe that it performs best in predicting the boundaries of CTL epitopes. 20S is trained with in vitro degradation data. |
|
normal_sequence | https://w3id.org/cwl/view/git/844c10a4466ab39c02e5bfa7a210c195b8efa77a/definitions/types/sequence_data.yml#sequence_data[] | normal_sequence: file specifying the location of WT sequencing data |
normal_sequence is a data structure described in sequence_data.yml, used to pass information regarding sequencing data for a single sample (i.e. fastq files). If more than one fastq file exists for a sample, as is the case for multiple instrument data, the sequence tag is simply repeated with the additional data (see example input file). Note that in the @RG field, ID and SM are required. |
pvacseq_threads | Integer (Optional) | pvacseq_threads: Number of threads to use for parallelizing pvacseq prediction |
pvacseq_threads specifies the number of threads to use for parallelizing peptide-MHC binding prediction calls. |
reference_index | File | ||
varscan_p_value | Float (Optional) | ||
target_intervals | File | target_intervals: interval_list file of targets used in the sequencing experiment |
target_intervals is an interval_list corresponding to the targets for the sequencing reagent. These are essentially coordinates for the regions you wanted to design probes for in the reagent. BED files with this information can be converted to interval_lists with Picard's BedToIntervalList. In general, for a whole-exome (WES) reagent, bait_intervals and target_intervals are the same. |
top_score_metric | |||
binding_threshold | Integer (Optional) | ||
read_group_fields | 0f994929d80e75fc80fdc20527bb685f[] | ||
summary_intervals | https://w3id.org/cwl/view/git/844c10a4466ab39c02e5bfa7a210c195b8efa77a/definitions/types/labelled_file.yml#labelled_file[] | ||
trimming_adapters | File | ||
tumor_sample_name | String | tumor_sample_name: Name of the tumor sample |
tumor_sample_name is the name of the tumor sample being processed. When processing a multi-sample VCF the sample name must be a sample ID in the input VCF #CHROM header line. |
manta_call_regions | File (Optional) | ||
net_chop_threshold | Float (Optional) | net_chop_threshold: NetChop prediction threshold |
net_chop_threshold specifies the threshold to use for NetChop prediction; increasing the threshold results in better specificity, but worse sensitivity. |
normal_sample_name | String | normal_sample_name: Name of the normal sample |
normal_sample_name is the name of the normal sample to use for phasing of germline variants. |
per_base_intervals | https://w3id.org/cwl/view/git/844c10a4466ab39c02e5bfa7a210c195b8efa77a/definitions/types/labelled_file.yml#labelled_file[] | ||
pindel_insert_size | Integer | ||
minimum_fold_change | Float (Optional) | ||
ribosomal_intervals | File (Optional) | ||
vep_ensembl_species | String |
ensembl species - Must be present in the cache directory. Examples: homo_sapiens or mus_musculus |
|
vep_ensembl_version | String |
ensembl version - Must be present in the cache directory. Example: 95 |
|
vep_to_table_fields | String[] | ||
annotate_coding_only | Boolean (Optional) | ||
clinical_hla_alleles | String[] (Optional) | clinical_calls: Clinical HLA typing results; element format: HLA-X*01:02[/HLA-X...] |
clinical_calls is used to provide clinical HLA typing results in the format HLA-X*01:02[/HLA-X...] when available. |
filter_docm_variants | Boolean (Optional) | ||
manta_output_contigs | Boolean (Optional) | ||
mutect_scatter_count | Integer | ||
panel_of_normals_vcf | File (Optional) | ||
per_target_intervals | https://w3id.org/cwl/view/git/844c10a4466ab39c02e5bfa7a210c195b8efa77a/definitions/types/labelled_file.yml#labelled_file[] | ||
reference_annotation | File | ||
strelka_cpu_reserved | Integer (Optional) | ||
varscan_min_coverage | Integer (Optional) | ||
varscan_min_var_freq | Float (Optional) | ||
vep_ensembl_assembly | String |
genome assembly to use in vep. Examples: GRCh38 or GRCm38 |
|
prediction_algorithms | String[] | ||
trimming_max_uncalled | Integer | ||
varscan_strand_filter | Integer (Optional) | ||
vep_custom_annotations | https://w3id.org/cwl/view/git/844c10a4466ab39c02e5bfa7a210c195b8efa77a/definitions/types/vep_custom_annotation.yml#vep_custom_annotation[] |
custom type, check types directory for input format |
|
peptide_sequence_length | Integer (Optional) | ||
qc_minimum_base_quality | Integer (Optional) | ||
trimming_min_readlength | Integer | ||
varscan_max_normal_freq | Float (Optional) | ||
variants_to_table_fields | String[] | ||
additional_report_columns | |||
emit_reference_confidence | |||
trimming_adapter_trim_end | String | ||
downstream_sequence_length | String (Optional) | ||
qc_minimum_mapping_quality | Integer (Optional) | ||
gene_transcript_lookup_table | File | ||
phased_proximal_variants_vcf | File (Optional) | ||
trimming_adapter_min_overlap | Integer | ||
gatk_haplotypecaller_intervals | 48ab372cd7bbb4d162497c93b6f171f5[] | ||
mutect_artifact_detection_mode | Boolean | ||
readcount_minimum_base_quality | Integer (Optional) | ||
maximum_transcript_support_level | |||
picard_metric_accumulation_level | String | ||
readcount_minimum_mapping_quality | Integer (Optional) | ||
variants_to_table_genotype_fields | String[] | ||
allele_specific_binding_thresholds | Boolean (Optional) | ||
mutect_max_alt_alleles_in_normal_count | Integer (Optional) | ||
mutect_max_alt_allele_in_normal_fraction | Float (Optional) |
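
Several of the input descriptions above state format requirements: ID and SM must be present in the @RG field, clinical_hla_alleles elements follow HLA-X*01:02[/HLA-X...], and bqsr_intervals names should match the reference. The following is a minimal Python sketch of pre-flight checks for those three rules, not part of the workflow itself. All function names are hypothetical, and the regular expression is only an assumed reading of the HLA-X*01:02 pattern (an uppercase gene name and two colon-separated numeric fields).

```python
import re

# Assumed reading of the documented element format HLA-X*01:02[/HLA-X...].
# The gene and digit-group shapes are a guess, not taken from the workflow.
_HLA_ALLELE = r"HLA-[A-Z0-9]+\*\d+:\d+"
_HLA_ELEMENT = re.compile(rf"{_HLA_ALLELE}(/{_HLA_ALLELE})*")


def read_group_has_id_and_sm(readgroup: str) -> bool:
    """Return True if an @RG string carries both an ID and an SM tag."""
    # Accept real tab characters as well as the two-character "\t" escape.
    fields = readgroup.replace("\\t", "\t").split("\t")
    if fields[0] != "@RG":
        return False
    tags = {field.split(":", 1)[0] for field in fields[1:]}
    return {"ID", "SM"} <= tags


def is_clinical_hla_element(element: str) -> bool:
    """Return True if the element matches the assumed HLA allele pattern."""
    return _HLA_ELEMENT.fullmatch(element) is not None


def intervals_missing_from_reference(intervals, reference_contigs):
    """Return the intervals whose contig name is not among the reference contigs."""
    known = set(reference_contigs)
    # An interval may be a bare contig ("chr1") or a region ("chr1:1-1000").
    return [iv for iv in intervals if iv.split(":", 1)[0] not in known]
```

In practice, the contig names passed to intervals_missing_from_reference could be read from the first column of the reference's .fai index. A non-empty result would suggest a naming mismatch such as "1" versus "chr1".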
Steps
ID | Runs | Label | Doc |
---|---|---|---|
rnaseq |
rnaseq.cwl
(Workflow)
|
RNA-Seq alignment and transcript/gene abundance workflow | |
pvacseq |
../subworkflows/pvacseq.cwl
(Workflow)
|
Workflow to run pVACseq from detect_variants and rnaseq pipeline outputs | |
somatic |
somatic_exome.cwl
(Workflow)
|
somatic_exome: exome alignment and somatic variant detection |
somatic_exome is designed to process mutant/wildtype H. sapiens
exome sequencing data. It features BQSR-corrected alignments, four-caller variant
detection, and VEP-style annotations. Structural variants are detected via
manta and cnvkit. In addition, QC metrics are run, including
somalier concordance metrics. |
germline |
germline_exome_hla_typing.cwl
(Workflow)
|
exome alignment and germline variant detection, with optitype for HLA typing | |
phase_vcf |
../subworkflows/phase_vcf.cwl
(Workflow)
|
phase VCF | |
hla_consensus |
../tools/hla_consensus.cwl
(CommandLineTool)
|
Script to create consensus from optitype and clinical HLA typing | |
extract_alleles |
../tools/extract_hla_alleles.cwl
(CommandLineTool)
|
Outputs
ID | Type | Label | Doc |
---|---|---|---|
cram | File | ||
gvcf | File[] | ||
chart | File (Optional) | ||
metrics | File | ||
final_bam | File | ||
final_tsv | File | ||
flagstats | File | ||
cn_diagram | File (Optional) | ||
hs_metrics | File | ||
phased_vcf | File | ||
tumor_cram | File | ||
normal_cram | File | ||
optitype_tsv | File | ||
allele_string | String[] | ||
annotated_tsv | File | ||
annotated_vcf | File | ||
optitype_plot | File | ||
all_candidates | File | ||
gene_abundance | File | ||
hla_call_files | Directory | ||
cn_scatter_plot | File (Optional) | ||
tumor_flagstats | File | ||
diploid_variants | File (Optional) | ||
intervals_target | File (Optional) | ||
normal_flagstats | File | ||
small_candidates | File | ||
somatic_variants | File (Optional) | ||
tumor_hs_metrics | File | ||
consensus_alleles | String[] | ||
docm_filtered_vcf | File | ||
normal_hs_metrics | File | ||
somatic_final_vcf | File | ||
final_filtered_vcf | File | ||
germline_final_vcf | File | ||
mhc_i_all_epitopes | File (Optional) | ||
reference_coverage | File (Optional) | ||
summary_hs_metrics | File[] | ||
insert_size_metrics | File | ||
mhc_ii_all_epitopes | File (Optional) | ||
mutect_filtered_vcf | File | ||
per_base_hs_metrics | File[] | ||
pindel_filtered_vcf | File | ||
somatic_vep_summary | File | ||
tumor_only_variants | File (Optional) | ||
verify_bam_id_depth | File | ||
germline_vep_summary | File | ||
intervals_antitarget | File (Optional) | ||
strelka_filtered_vcf | File | ||
varscan_filtered_vcf | File | ||
combined_all_epitopes | File (Optional) | ||
germline_filtered_vcf | File | ||
insert_size_histogram | File | ||
mhc_i_ranked_epitopes | File (Optional) | ||
mutect_unfiltered_vcf | File | ||
per_target_hs_metrics | File[] | ||
pindel_unfiltered_vcf | File | ||
tumor_target_coverage | File | ||
verify_bam_id_metrics | File | ||
mhc_ii_ranked_epitopes | File (Optional) | ||
normal_target_coverage | File | ||
strelka_unfiltered_vcf | File | ||
tumor_bin_level_ratios | File | ||
tumor_segmented_ratios | File | ||
varscan_unfiltered_vcf | File | ||
mark_duplicates_metrics | File | ||
mhc_i_filtered_epitopes | File (Optional) | ||
transcript_abundance_h5 | File | ||
combined_ranked_epitopes | File (Optional) | ||
mhc_ii_filtered_epitopes | File (Optional) | ||
stringtie_transcript_gtf | File | ||
transcript_abundance_tsv | File | ||
tumor_summary_hs_metrics | File[] | ||
alignment_summary_metrics | File | ||
normal_summary_hs_metrics | File[] | ||
per_base_coverage_metrics | File[] | ||
tumor_antitarget_coverage | File | ||
tumor_insert_size_metrics | File | ||
tumor_per_base_hs_metrics | File[] | ||
tumor_verify_bam_id_depth | File | ||
combined_filtered_epitopes | File (Optional) | ||
normal_antitarget_coverage | File | ||
normal_insert_size_metrics | File | ||
normal_per_base_hs_metrics | File[] | ||
normal_verify_bam_id_depth | File | ||
per_target_coverage_metrics | File[] | ||
tumor_per_target_hs_metrics | File[] | ||
tumor_snv_bam_readcount_tsv | File | ||
tumor_verify_bam_id_metrics | File | ||
normal_per_target_hs_metrics | File[] | ||
normal_snv_bam_readcount_tsv | File | ||
normal_verify_bam_id_metrics | File | ||
somalier_concordance_metrics | File | ||
stringtie_gene_expression_tsv | File | ||
tumor_indel_bam_readcount_tsv | File | ||
tumor_mark_duplicates_metrics | File | ||
normal_indel_bam_readcount_tsv | File | ||
normal_mark_duplicates_metrics | File | ||
somalier_concordance_statistics | File | ||
tumor_alignment_summary_metrics | File | ||
tumor_per_base_coverage_metrics | File[] | ||
normal_alignment_summary_metrics | File | ||
normal_per_base_coverage_metrics | File[] | ||
tumor_per_target_coverage_metrics | File[] | ||
normal_per_target_coverage_metrics | File[] |
https://w3id.org/cwl/view/git/844c10a4466ab39c02e5bfa7a210c195b8efa77a/definitions/pipelines/immuno.cwl