Inputs
ID | Type | Title | Doc |
---|---|---|---|
mills | File | mills: File specifying common polymorphic indels from Mills et al. |
mills provides known polymorphic indels recommended by GATK for a variety of tools, including the BaseRecalibrator. This file is part of the GATK resource bundle (http://www.broadinstitute.org/gatk/guide/article?id=1213). Essentially, it is a list of known indels originally discovered by Mills et al. (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1557762/). The file should be in VCF format and tabix indexed. |
ploidy | Integer (Optional) | ||
strand | |||
refFlat | File | ||
docm_vcf | File | ||
expn_val | Float (Optional) | ||
omni_vcf | File | ||
rna_bams | File[] | ||
tdna_cov | Integer (Optional) | ||
tdna_vaf | Float (Optional) | ||
trna_cov | Integer (Optional) | ||
trna_vaf | Float (Optional) | ||
vep_pick | |||
dbsnp_vcf | File | dbsnp_vcf: File specifying common polymorphic indels from dbSNP |
dbsnp_vcf provides known indels recommended by GATK for a variety of tools, including the BaseRecalibrator. This file is part of the GATK resource bundle (http://www.broadinstitute.org/gatk/guide/article?id=1213). Essentially, it is a list of known indels from dbSNP. The file should be in VCF format and tabix indexed. |
reference | File | reference: Reference fasta file for a desired assembly |
reference contains the nucleotide sequence for a given assembly (GRCh37, GRCh38, etc.) in fasta format for the entire genome. This is what reads will be aligned to. Appropriate files can be found on Ensembl (https://ensembl.org/info/data/ftp/index.html). When providing the reference, secondary files corresponding to reference indices must be located in the same directory as the reference itself. These files can be created with samtools faidx, bwa index, and picard CreateSequenceDictionary. |
cosmic_vcf | File (Optional) | ||
fasta_size | Integer (Optional) | ||
normal_cov | Integer (Optional) | ||
normal_vaf | Float (Optional) | ||
tumor_name | String (Optional) | tumor_name: String specifying the name of the MT sample |
tumor_name provides the string by which the MT sample will be referred to in the various outputs, for example the VCF files. |
exclude_nas | Boolean (Optional) | ||
netmhc_stab | Boolean (Optional) | netmhc_stab: sets an option whether to run NetMHCStabPan or not |
netmhc_stab decides whether NetMHCStabPan is run after all filtering to add stability predictions to the predicted epitopes. |
normal_name | String (Optional) | normal_name: String specifying the name of the WT sample |
normal_name provides the string by which the WT sample will be referred to in the various outputs, for example the VCF files. |
sample_name | String | ||
known_indels | File | known_indels: File specifying common polymorphic indels from 1000G |
known_indels provides known indels recommended by GATK for a variety of tools, including the BaseRecalibrator. This file is part of the GATK resource bundle (http://www.broadinstitute.org/gatk/guide/article?id=1213). Essentially, it is a list of known indels from the 1000 Genomes Phase I indel calls. The file should be in VCF format and tabix indexed. |
somalier_vcf | File | ||
gvcf_gq_bands | String[] | ||
manta_non_wgs | Boolean (Optional) | ||
optitype_name | String (Optional) | ||
synonyms_file | File (Optional) | ||
vep_cache_dir | Directory | ||
bait_intervals | File | bait_intervals: interval_list file of baits used in the sequencing experiment |
bait_intervals is an interval_list corresponding to the baits used in the sequencing reagent. These are essentially the coordinates of the regions for which probes could be designed in the reagent. Typically the reagent provider has this information available in BED format, and it can be converted to an interval_list with Picard BedToIntervalList. AstraZeneca also maintains a repo of baits for common sequencing reagents at https://github.com/AstraZeneca-NGS/reference_data |
bqsr_intervals | String[] | bqsr_intervals: Array of strings specifying regions for base quality score recalibration |
bqsr_intervals provides an array of genomic intervals to which GATK base quality score recalibration is applied. Typically each interval is an entire chromosome (e.g. chr1, chr2, etc.); these names should match the format used in the reference file. |
cle_vcf_filter | Boolean | ||
kallisto_index | File | ||
known_variants | File (Optional) |
Previously discovered variants to be flagged in this pipeline's output VCF |
|
reference_dict | File | ||
rna_readgroups | String[] | ||
tumor_sequence | https://w3id.org/cwl/view/git/04d21c33a5f2950e86db285fa0a32a6659198d8a/definitions/types/sequence_data.yml#sequence_data[] | tumor_sequence: file specifying the location of MT sequencing data |
tumor_sequence is a data structure described in sequence_data.yml, used to pass information regarding sequencing data for a single sample (e.g. fastq files). If more than one fastq file exists for a sample, as in the case of multiple instrument data, the sequence tag is simply repeated with the additional data (see example input file). Note that ID and SM are required in the @RG field. |
epitope_lengths | Integer[] (Optional) | ||
net_chop_method | net_chop_method: NetChop prediction method to use ('cterm' for C term 3.0, '20s' for 20S 3.0) |
net_chop_method specifies which NetChop prediction method to use ("cterm" for C term 3.0, "20s" for 20S 3.0). C-term 3.0 is trained with publicly available MHC class I ligands, and the authors believe that it performs best in predicting the boundaries of CTL epitopes. 20S is trained with in vitro degradation data. |
|
normal_sequence | https://w3id.org/cwl/view/git/04d21c33a5f2950e86db285fa0a32a6659198d8a/definitions/types/sequence_data.yml#sequence_data[] | normal_sequence: file specifying the location of WT sequencing data |
normal_sequence is a data structure described in sequence_data.yml, used to pass information regarding sequencing data for a single sample (e.g. fastq files). If more than one fastq file exists for a sample, as in the case of multiple instrument data, the sequence tag is simply repeated with the additional data (see example input file). Note that ID and SM are required in the @RG field. |
pvacseq_threads | Integer (Optional) | pvacseq_threads: Number of threads to use for parallelizing pvacseq prediction |
pvacseq_threads specifies the number of threads to use for parallelizing peptide-MHC binding prediction calls. |
reference_index | File | ||
varscan_p_value | Float (Optional) | ||
target_intervals | File | target_intervals: interval_list file of targets used in the sequencing experiment |
target_intervals is an interval_list corresponding to the targets for the capture reagent. BED files with this information can be converted to interval_lists with Picard BedToIntervalList. In general, for a whole-exome (WES) reagent, bait_intervals and target_intervals are the same. |
top_score_metric | |||
binding_threshold | Integer (Optional) | ||
read_group_fields | 9e1f9ee45a365577d99e22dc6cd8acb8[] | ||
summary_intervals | https://w3id.org/cwl/view/git/04d21c33a5f2950e86db285fa0a32a6659198d8a/definitions/types/labelled_file.yml#labelled_file[] | ||
trimming_adapters | File | ||
tumor_sample_name | String | tumor_sample_name: Name of the tumor sample |
tumor_sample_name is the name of the tumor sample being processed. When processing a multi-sample VCF, the sample name must be a sample ID in the #CHROM header line of the input VCF. |
manta_call_regions | File (Optional) | ||
net_chop_threshold | Float (Optional) | net_chop_threshold: NetChop prediction threshold |
net_chop_threshold specifies the threshold to use for NetChop prediction; increasing the threshold results in better specificity, but worse sensitivity. |
normal_sample_name | String | normal_sample_name: Name of the normal sample |
normal_sample_name is the name of the normal sample to use for phasing of germline variants. |
per_base_intervals | https://w3id.org/cwl/view/git/04d21c33a5f2950e86db285fa0a32a6659198d8a/definitions/types/labelled_file.yml#labelled_file[] | ||
pindel_insert_size | Integer | ||
minimum_fold_change | Float (Optional) | ||
ribosomal_intervals | File (Optional) | ||
vep_ensembl_species | String |
Ensembl species. Must be present in the cache directory. Examples: homo_sapiens or mus_musculus |
|
vep_ensembl_version | String |
Ensembl version. Must be present in the cache directory. Example: 95 |
|
vep_to_table_fields | String[] | ||
annotate_coding_only | Boolean (Optional) | ||
filter_docm_variants | Boolean (Optional) | ||
manta_output_contigs | Boolean (Optional) | ||
mutect_scatter_count | Integer | ||
panel_of_normals_vcf | File (Optional) | ||
per_target_intervals | https://w3id.org/cwl/view/git/04d21c33a5f2950e86db285fa0a32a6659198d8a/definitions/types/labelled_file.yml#labelled_file[] | ||
reference_annotation | File | ||
strelka_cpu_reserved | Integer (Optional) | ||
varscan_min_coverage | Integer (Optional) | ||
varscan_min_var_freq | Float (Optional) | ||
vep_ensembl_assembly | String |
Genome assembly to use in VEP. Examples: GRCh38 or GRCm38 |
|
prediction_algorithms | String[] | ||
trimming_max_uncalled | Integer | ||
varscan_strand_filter | Integer (Optional) | ||
vep_custom_annotations | https://w3id.org/cwl/view/git/04d21c33a5f2950e86db285fa0a32a6659198d8a/definitions/types/vep_custom_annotation.yml#vep_custom_annotation[] |
custom type, check types directory for input format |
|
peptide_sequence_length | Integer (Optional) | ||
qc_minimum_base_quality | Integer (Optional) | ||
target_interval_padding | Integer | target_interval_padding: number of bp flanking each target region in which to allow variant calls |
The effective coverage of capture products generally extends out beyond the actual regions targeted. This parameter allows variants to be called in these wingspan regions, extending this many base pairs from each side of the target regions. |
trimming_min_readlength | Integer | ||
varscan_max_normal_freq | Float (Optional) | ||
variants_to_table_fields | String[] | ||
additional_report_columns | |||
emit_reference_confidence | |||
trimming_adapter_trim_end | String | ||
downstream_sequence_length | String (Optional) | ||
qc_minimum_mapping_quality | Integer (Optional) | ||
clinical_mhc_classI_alleles | String[] (Optional) | Clinical HLA typing results, limited to MHC Class I alleles; element format: HLA-X*01:02[/HLA-X...] |
Used to provide clinical HLA typing results in the format HLA-X*01:02[/HLA-X...] when available. |
clinical_mhc_classII_alleles | String[] (Optional) | Clinical HLA typing results, limited to MHC Class II alleles |
Used to provide clinical HLA typing results; kept separate from class I due to nomenclature inconsistencies. |
gene_transcript_lookup_table | File | ||
phased_proximal_variants_vcf | File (Optional) | ||
trimming_adapter_min_overlap | Integer | ||
gatk_haplotypecaller_intervals | f170caffb40a8ff38b5af51cc579cdbc[] | ||
mutect_artifact_detection_mode | Boolean | ||
readcount_minimum_base_quality | Integer (Optional) | ||
maximum_transcript_support_level | |||
picard_metric_accumulation_level | String | ||
readcount_minimum_mapping_quality | Integer (Optional) | ||
variants_to_table_genotype_fields | String[] | ||
allele_specific_binding_thresholds | Boolean (Optional) | ||
mutect_max_alt_alleles_in_normal_count | Integer (Optional) | ||
mutect_max_alt_allele_in_normal_fraction | Float (Optional) |
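Two of the inputs above have a documented shape that is easy to get wrong in a job file: bqsr_intervals is an array of chromosome names matching the reference, and each clinical_mhc_classI_alleles element follows HLA-X*01:02[/HLA-X...]. The Python sketch below is not part of the pipeline. It assumes a chr-prefixed human reference with autosomes 1-22 plus X and Y, reads the bracketed part of the allele format as optional slash-separated alternatives, and restricts the gene letter to A, B, or C. The helper names and the sample values are hypothetical.

```python
import re

# Assumed pattern for a single allele such as "HLA-A*01:02".
# Restricting the gene to A, B, or C is an assumption made for this sketch.
ALLELE = r"HLA-[ABC]\*\d{2,3}:\d{2,3}"
# One element is an allele optionally followed by "/"-separated alternatives.
ELEMENT = re.compile(rf"{ALLELE}(?:/{ALLELE})*")


def is_valid_class_i_element(element: str) -> bool:
    """Return True if the string looks like HLA-X*01:02[/HLA-X...]."""
    return ELEMENT.fullmatch(element) is not None


def chromosome_intervals(prefix: str = "chr") -> list[str]:
    """Build whole-chromosome interval names (1-22, X, Y) with a given prefix."""
    names = [str(n) for n in range(1, 23)] + ["X", "Y"]
    return [prefix + name for name in names]


# Hypothetical fragment of a job file, expressed as a Python dict.
job_fragment = {
    "bqsr_intervals": chromosome_intervals(),
    "clinical_mhc_classI_alleles": ["HLA-A*01:02", "HLA-B*07:02/HLA-B*07:61"],
}

# Collect any allele strings that do not match the assumed format.
bad_elements = [
    e for e in job_fragment["clinical_mhc_classI_alleles"]
    if not is_valid_class_i_element(e)
]
```

If the reference uses unprefixed names (1, 2, ...), calling chromosome_intervals("") yields the matching form, since the interval names must agree with the reference file.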
Steps
ID | Runs | Label | Doc |
---|---|---|---|
rnaseq |
rnaseq.cwl
(Workflow)
|
RNA-Seq alignment and transcript/gene abundance workflow | |
pvacseq |
../subworkflows/pvacseq.cwl
(Workflow)
|
Workflow to run pVACseq from detect_variants and rnaseq pipeline outputs | |
somatic |
somatic_exome.cwl
(Workflow)
|
somatic_exome: exome alignment and somatic variant detection |
somatic_exome is designed to process mutant/wildtype H. sapiens
exome sequencing data. It features BQSR-corrected alignments, four-caller variant
detection, and VEP-style annotations. Structural variants are detected via
manta and cnvkit. In addition, QC metrics are run, including
somalier concordance metrics. |
germline |
germline_exome_hla_typing.cwl
(Workflow)
|
exome alignment and germline variant detection, with optitype for HLA typing | |
phase_vcf |
../subworkflows/phase_vcf.cwl
(Workflow)
|
phase VCF | |
hla_consensus |
../tools/hla_consensus.cwl
(CommandLineTool)
|
Script to create consensus from optitype and clinical HLA typing | |
extract_alleles |
../tools/extract_hla_alleles.cwl
(CommandLineTool)
|
Outputs
ID | Type | Label | Doc |
---|---|---|---|
cram | File | ||
gvcf | File[] | ||
chart | File (Optional) | Plot for RNA-seq diagnosis/quality metrics |
PDF plot of RNA-seq coverage by normalized position along the transcript, created by the picard CollectRnaSeqMetrics tool as an RNA-seq diagnosis/quality metric |
metrics | File | RNA-seq Diagnosis/quality metrics from tumor RNA |
RNA-seq Diagnosis/quality metrics showing the distribution of the bases within the transcripts, created by picard CollectRnaSeqMetrics tool |
final_bam | File | Sorted BAM from tumor RNA |
Sorted BAM file of sequencing read alignments by HISAT2 with duplicate reads tagged |
final_tsv | File | ||
flagstats | File | ||
cn_diagram | File (Optional) | ||
hs_metrics | File | ||
phased_vcf | File | ||
tumor_cram | File | Sorted CRAM from tumor DNA |
Sorted CRAM file of sequencing read alignments by bwa-mem from a tumor DNA sample with duplicate reads tagged |
normal_cram | File | Sorted CRAM from normal DNA |
Sorted CRAM file of sequencing read alignments by bwa-mem from a normal DNA sample with duplicate reads tagged |
optitype_tsv | File | ||
allele_string | String[] | ||
annotated_tsv | File | ||
annotated_vcf | File | ||
optitype_plot | File | ||
all_candidates | File | ||
gene_abundance | File | Gene-level abundance output by tximport with kallisto output |
Tab-delimited file containing abundance estimates summarized at the gene level from kallisto output by the Bioconductor tximport tool |
hla_call_files | Directory | ||
cn_scatter_plot | File (Optional) | ||
tumor_flagstats | File | Sequencing count metrics based on SAM FLAG field from tumor sample |
Summary with the count numbers of alignments for each FLAG type from a tumor DNA sample, including 13 categories based on the bit flags in the FLAG field |
diploid_variants | File (Optional) | ||
germline_raw_vcf | File | ||
intervals_target | File (Optional) | ||
normal_flagstats | File | Sequencing count metrics based on SAM FLAG field from normal sample |
Summary with the count numbers of alignments for each FLAG type from a normal DNA sample, including 13 categories based on the bit flags in the FLAG field |
small_candidates | File | ||
somatic_variants | File (Optional) | ||
tumor_hs_metrics | File | Sequencing coverage summary of target intervals from tumor DNA |
Diagnosis/quality metrics specific for sequencing data generated through hybrid-selection (e.g. whole exome) from a tumor DNA sample, for example to assess target coverage of WES |
consensus_alleles | String[] | ||
docm_filtered_vcf | File | ||
normal_hs_metrics | File | Sequencing coverage summary of target intervals from normal DNA |
Diagnosis/quality metrics specific for sequencing data generated through hybrid-selection (e.g. whole exome) from a normal DNA sample, for example to assess target coverage |
somatic_final_vcf | File | ||
final_filtered_vcf | File | ||
germline_final_vcf | File | ||
reference_coverage | File (Optional) | ||
summary_hs_metrics | File[] | ||
insert_size_metrics | File | ||
mutect_filtered_vcf | File | ||
per_base_hs_metrics | File[] | ||
pindel_filtered_vcf | File | ||
pvacseq_predictions | Directory | ||
somatic_vep_summary | File | ||
tumor_only_variants | File (Optional) | ||
verify_bam_id_depth | File | ||
germline_vep_summary | File | ||
intervals_antitarget | File (Optional) | ||
strelka_filtered_vcf | File | ||
varscan_filtered_vcf | File | ||
germline_filtered_vcf | File | ||
insert_size_histogram | File | ||
mutect_unfiltered_vcf | File | ||
per_target_hs_metrics | File[] | ||
pindel_unfiltered_vcf | File | ||
tumor_target_coverage | File | ||
verify_bam_id_metrics | File | ||
normal_target_coverage | File | ||
strelka_unfiltered_vcf | File | ||
tumor_bin_level_ratios | File | ||
tumor_segmented_ratios | File | ||
varscan_unfiltered_vcf | File | ||
mark_duplicates_metrics | File | ||
transcript_abundance_h5 | File | Transcript-level abundance table in HDF5 format by kallisto |
HDF5 binary file containing transcript-level abundance estimates, bootstrap estimates, and so on, created by kallisto |
stringtie_transcript_gtf | File | Transcript GTF assembled from tumor RNA by StringTie |
GTF file containing the transcripts assembled from the tumor RNA sample, created by StringTie |
transcript_abundance_tsv | File | Transcript-level abundance table by kallisto |
Tab-delimited file containing transcript-level abundance estimates in TPM, created by kallisto |
tumor_summary_hs_metrics | File[] | ||
alignment_summary_metrics | File | ||
normal_summary_hs_metrics | File[] | ||
per_base_coverage_metrics | File[] | ||
tumor_antitarget_coverage | File | ||
tumor_insert_size_metrics | File | Paired-end sequencing diagnosis/quality metrics from tumor DNA |
Diagnosis/quality metrics including the insert size distribution and read orientation of the paired-end libraries from a tumor DNA sample |
tumor_per_base_hs_metrics | File[] | Sequencing coverage summary at target sites from tumor DNA |
Diagnosis/quality metrics for sequencing coverage at target sites (optional, known variant sites of clinical significance from ClinVar for example) from a tumor DNA sample |
tumor_verify_bam_id_depth | File | Sequencing quality assessment metric for tumor sample genotyping |
verifyBamID output files showing the sequencing depth distribution at the marker positions from Omni genotype data with a tumor DNA sample, across all readGroups and per readGroup separately |
normal_antitarget_coverage | File | ||
normal_insert_size_metrics | File | Paired-end sequencing diagnosis/quality metrics from normal DNA |
Diagnosis/quality metrics including the insert size distribution and read orientation of the paired-end libraries from a normal DNA sample |
normal_per_base_hs_metrics | File[] | Sequencing coverage summary at target sites from normal DNA |
Diagnosis/quality metrics for sequencing coverage at target sites (optional, known variant sites of clinical significance from ClinVar for example) from a normal DNA sample |
normal_verify_bam_id_depth | File | Sequencing quality assessment metric for normal sample genotyping |
verifyBamID output files showing the sequencing depth distribution at the marker positions from Omni genotype data with a normal DNA sample, across all readGroups and per readGroup separately |
per_target_coverage_metrics | File[] | ||
tumor_per_target_hs_metrics | File[] | Sequencing coverage summary of target intervals from tumor DNA |
Diagnosis/quality metrics for sequencing coverage for target intervals (optional, 59 genes recommended by ACMG for clinical exome and genome sequencing for example) from a tumor DNA sample |
tumor_snv_bam_readcount_tsv | File | ||
tumor_verify_bam_id_metrics | File | Sequencing quality assessment metric for tumor sample contamination |
verifyBamID output files containing the contamination estimate in a tumor DNA sample, across all readGroups and per readGroup separately |
normal_per_target_hs_metrics | File[] | Sequencing coverage summary of target intervals from normal DNA |
Diagnosis/quality metrics for sequencing coverage for target intervals (optional, 59 genes recommended by ACMG for clinical exome and genome sequencing for example) from a normal DNA sample |
normal_snv_bam_readcount_tsv | File | ||
normal_verify_bam_id_metrics | File | Sequencing quality assessment metric for normal sample contamination |
verifyBamID output files containing the contamination estimate in a normal DNA sample, across all readGroups and per readGroup separately |
somalier_concordance_metrics | File | ||
stringtie_gene_expression_tsv | File | Gene abundance table from tumor RNA by StringTie |
Tab-delimited file containing gene abundances in FPKM and TPM, created by StringTie |
tumor_indel_bam_readcount_tsv | File | ||
tumor_mark_duplicates_metrics | File | Sequencing duplicate metrics from tumor DNA |
Duplication metrics on duplicate sequencing reads from a tumor DNA sample, identified by picard MarkDuplicates tool |
normal_indel_bam_readcount_tsv | File | ||
normal_mark_duplicates_metrics | File | Sequencing duplicate metrics from normal DNA |
Duplication metrics on duplicate sequencing reads from a normal DNA sample, identified by picard MarkDuplicates tool |
somalier_concordance_statistics | File | ||
tumor_alignment_summary_metrics | File | Sequencing alignment summary from tumor DNA |
Diagnosis/quality metrics summarizing the quality of sequencing read alignments from a tumor DNA sample, reported by the picard CollectAlignmentSummaryMetrics tool |
tumor_per_base_coverage_metrics | File[] | Sequencing per-base coverage summary at target sites from tumor DNA |
Diagnosis/quality metrics showing detailed sequencing coverage per target site (optional, known variant sites of clinical significance from ClinVar for example) from a tumor DNA sample |
normal_alignment_summary_metrics | File | Sequencing alignment summary from normal DNA |
Diagnosis/quality metrics summarizing the quality of sequencing read alignments from a normal DNA sample, reported by the picard CollectAlignmentSummaryMetrics tool |
normal_per_base_coverage_metrics | File[] | Sequencing per-base coverage summary at target sites from normal DNA |
Diagnosis/quality metrics showing detailed sequencing coverage per target site (optional, known variant sites of clinical significance from ClinVar for example) from a normal DNA sample |
tumor_per_target_coverage_metrics | File[] | Sequencing per-target coverage summary of target intervals from tumor DNA |
Diagnosis/quality metrics showing detailed sequencing coverage per target interval (optional, 59 genes recommended by ACMG for clinical exome and genome sequencing for example) from a tumor DNA sample |
normal_per_target_coverage_metrics | File[] | Sequencing per-target coverage summary of target intervals from normal DNA |
Diagnosis/quality metrics showing detailed sequencing coverage per target interval (optional, 59 genes recommended by ACMG for clinical exome and genome sequencing for example) from a normal DNA sample |
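As an illustration of consuming one of these outputs, the Python sketch below reads a transcript-level abundance table such as transcript_abundance_tsv and keeps the transcripts at or above a TPM cutoff. It assumes the file uses the standard kallisto abundance.tsv columns (target_id, length, eff_length, est_counts, tpm). The function name, the cutoff, and the transcript IDs are hypothetical.

```python
import csv
import io


def read_tpm(handle, min_tpm=0.0):
    """Map target_id to TPM for rows whose TPM is at least min_tpm."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["target_id"]: float(row["tpm"])
        for row in reader
        if float(row["tpm"]) >= min_tpm
    }


# Small in-memory table with made-up values, standing in for a real file.
header = ["target_id", "length", "eff_length", "est_counts", "tpm"]
rows = [
    ["ENST_A", "1000", "850.5", "120", "15.2"],
    ["ENST_B", "500", "350.5", "0", "0"],
]
example = "\n".join("\t".join(fields) for fields in [header] + rows)

# Keep only transcripts with TPM >= 1.0 (an arbitrary illustrative cutoff).
expressed = read_tpm(io.StringIO(example), min_tpm=1.0)
```

For a real run, the same function can be given an open file handle, for example `with open(path, newline="") as fh: read_tpm(fh, 1.0)`.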
https://w3id.org/cwl/view/git/04d21c33a5f2950e86db285fa0a32a6659198d8a/definitions/pipelines/immuno.cwl