API
Import
using VariantVisualization
Index
VariantVisualization.add_pheno_matrix_to_dp_data_for_plotting
VariantVisualization.add_pheno_matrix_to_gt_data_for_plotting
VariantVisualization.avg_dp_samples
VariantVisualization.avg_dp_variant
VariantVisualization.avg_sample_dp_scatter
VariantVisualization.avg_variant_dp_line_chart
VariantVisualization.build_set_from_list
VariantVisualization.checkfor_outputdirectory
VariantVisualization.chromosome_label_generator
VariantVisualization.clean_column1!
VariantVisualization.clean_column1_siglist!
VariantVisualization.combined_all_genotype_array_functions
VariantVisualization.combined_all_read_depth_array_functions
VariantVisualization.combined_all_read_depth_array_functions_for_avg_dp
VariantVisualization.create_chr_dict
VariantVisualization.define_geno_dict
VariantVisualization.dp_heatmap2
VariantVisualization.dp_heatmap2_with_groups
VariantVisualization.find_group_label_indices
VariantVisualization.generate_chromosome_positions_for_hover_labels
VariantVisualization.generate_genotype_array
VariantVisualization.generate_hover_text_array
VariantVisualization.generate_hover_text_array_grouped
VariantVisualization.generate_legend_increments_grouped
VariantVisualization.generate_legend_increments_ungrouped
VariantVisualization.genomic_range_siglist_filter
VariantVisualization.genotype_heatmap2_new_legend
VariantVisualization.genotype_heatmap_with_groups
VariantVisualization.get_sample_names
VariantVisualization.index_vcf
VariantVisualization.io_genomic_range_vcf_filter
VariantVisualization.io_pass_filter
VariantVisualization.io_sig_list_vcf_filter
VariantVisualization.jupyter_main
VariantVisualization.list_sample_names_low_dp
VariantVisualization.list_variant_positions_low_dp
VariantVisualization.load_siglist
VariantVisualization.make_chromosome_labels
VariantVisualization.match_siglist_to_index
VariantVisualization.pass_genomic_range_filter
VariantVisualization.pass_genomic_range_siglist_filter
VariantVisualization.pass_siglist_filter
VariantVisualization.process_plot_inputs
VariantVisualization.process_plot_inputs_for_grouped_data
VariantVisualization.read_depth_threshhold
VariantVisualization.returnXY_column1!
VariantVisualization.returnXY_column1_siglist!
VariantVisualization.save_graphic
VariantVisualization.save_numerical_array
VariantVisualization.select_columns
VariantVisualization.sort_genotype_array
VariantVisualization.sortcols_by_phenotype_matrix
VariantVisualization.test_parse_main
VariantVisualization.translate_genotype_to_num_array
VariantVisualization.translate_readdepth_strings_to_num_array
VariantVisualization.translate_readdepth_strings_to_num_array_for_avg_dp
Functions
#
VariantVisualization.add_pheno_matrix_to_dp_data_for_plotting
— Method.
add_pheno_matrix_to_dp_data_for_plotting(pheno_matrix,dp_num_array,trait_labels,chrom_label_info,number_rows)
Adds the pheno matrix used to group samples to the data array for input into plotting functions. Resizes the pheno matrix to maintain correct dimensions for the heatmap visualization by finding value = 0.05 * number_rows * data to multiply each pheno row by before vcat.
#
VariantVisualization.add_pheno_matrix_to_gt_data_for_plotting
— Method.
add_pheno_matrix_to_gt_data_for_plotting(pheno_matrix,gt_num_array,trait_labels,chrom_label_info,number_rows)
Adds the pheno matrix used to group samples to the data array for input into plotting functions. Resizes the pheno matrix to maintain correct dimensions for the heatmap visualization by finding value = 0.05 * number_rows * data to multiply each pheno row by before vcat.
#
VariantVisualization.avg_dp_samples
— Method.
avg_dp_samples(dp_num_array::Array{Int64,2})
Creates the sample_avg_list vector, which lists the average read depth of each sample, for input into avg_sample_dp_scatter(sample_avg_list, ...). dp_num_array must contain dp values as Int64 and must not include chromosome position columns.
#
VariantVisualization.avg_dp_variant
— Method.
avg_dp_variant(dp_num_array::Array{Int64,2})
Creates the variant_avg_list vector, which lists the average read depth of each variant, for input into avg_variant_dp_line_chart(variant_avg_list, ...).
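The two averaging steps can be illustrated with a minimal sketch in plain Julia. It assumes rows of dp_num_array are variants and columns are samples; the toy matrix is made up, and this is not the package's own implementation.

```julia
using Statistics  # standard library; provides mean

# Assumed layout: rows are variants, columns are samples (toy values).
dp_num_array = [10 20 30;
                40 50 60]

# Average read depth per sample: mean down each column, flattened to a vector.
sample_avg_list = vec(mean(dp_num_array, dims=1))

# Average read depth per variant: mean across each row, flattened to a vector.
variant_avg_list = vec(mean(dp_num_array, dims=2))
```

Both results are Float64 vectors, which matches the Array{Float64,1} inputs declared by the two chart functions.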
#
VariantVisualization.avg_sample_dp_scatter
— Method.
avg_sample_dp_scatter(sample_avg_list::Array{Float64,1},sample_names,x_axis_label_option)
generate line chart of average read depths of each sample.
#
VariantVisualization.avg_variant_dp_line_chart
— Method.
avg_variant_dp_line_chart(variant_avg_list::Array{Float64,1},chr_pos_tuple_list,y_axis_label_option,chrom_label_info)
generate line chart of average read depths of each variant.
#
VariantVisualization.build_set_from_list
— Method.
build_set_from_list(sig_list)
Builds a set of tuples of chrom and pos of each record in the vcf for use in the siglist filters. Method 1: builds the list from input variant locations given as Int64, with String chromosomes in the variant list (chr1-XYM).
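A set of (chrom, pos) tuples makes membership checks cheap when filtering records. The sketch below shows the idea in plain Julia; the two-column layout of sig_list and the sample values are assumptions for illustration, not the package's code.

```julia
# Assumed layout: column 1 is the chromosome, column 2 is the position.
sig_list = Any[1 11000; "X" 52000; 1 11000]

# Build a set of (chrom, pos) tuples; duplicate rows collapse to one entry.
chrom_pos_set = Set((sig_list[i, 1], sig_list[i, 2]) for i in axes(sig_list, 1))

# Membership test for a record's location.
is_significant = (1, 11000) in chrom_pos_set
```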
#
VariantVisualization.checkfor_outputdirectory
— Method.
checkfor_outputdirectory(path::String)
Checks to see if output directory exists already. If it doesn't, it creates the new directory to write output files to.
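The check-then-create behaviour described above can be sketched with Julia's standard filesystem functions. The helper name ensure_output_directory is hypothetical and the use of mkpath is an assumption about how such a check might be written.

```julia
# Hypothetical helper: create the directory only if it does not exist yet.
function ensure_output_directory(path::String)
    isdir(path) || mkpath(path)  # mkpath also creates missing parent directories
    return path
end

# Usage with a throwaway location so the example leaves no files behind.
out = ensure_output_directory(joinpath(mktempdir(), "viva_output"))
```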
#
VariantVisualization.chromosome_label_generator
— Method.
chromosome_label_generator(chromosome_labels::Array{Any,1})
Returns a vector of chr labels and indices to mark chromosomes in the plotly heatmap. Specifically, saves indexes and chrom labels in vectors to pass into the heatmap function as tickvals and ticktext respectively. Input is either gt_chromosome_labels or dp_chromosome_labels from translate_gt/dp_to_num_array().
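One way to derive tick positions and tick text from a per-variant chromosome label vector is to record the first index at which each chromosome appears. The sketch below assumes the labels are already sorted by chromosome; the helper name is hypothetical and the real function may return additional information.

```julia
# Hypothetical helper: first row index and label of each run of equal chromosomes.
function first_index_per_chromosome(chromosome_labels)
    tick_text = String[]
    tick_vals = Int[]
    for (i, label) in enumerate(chromosome_labels)
        # A new chromosome starts at the first row or wherever the label changes.
        if i == 1 || label != chromosome_labels[i - 1]
            push!(tick_text, string(label))
            push!(tick_vals, i)
        end
    end
    return tick_text, tick_vals
end

tick_text, tick_vals = first_index_per_chromosome(Any[1, 1, 2, 2, 2, "X"])
```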
#
VariantVisualization.clean_column1!
— Method.
clean_column1!(matrix_with_chr_column)
Replaces the Strings "X", "Y", "M" in the chromosome column with 23, 24, 25 respectively so variants can be sorted by descending chr position for plotting.
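The replacement can be sketched as an in-place pass over the first column. The helper name and the toy matrix are hypothetical; only the X, Y, M to 23, 24, 25 mapping comes from the description above.

```julia
# Hypothetical helper: swap "X", "Y", "M" in column 1 for sortable integers.
function replace_sex_chromosomes!(matrix_with_chr_column::Matrix{Any})
    mapping = Dict("X" => 23, "Y" => 24, "M" => 25)
    for row in axes(matrix_with_chr_column, 1)
        chr = matrix_with_chr_column[row, 1]
        # Values that are not in the mapping are left unchanged.
        matrix_with_chr_column[row, 1] = get(mapping, chr, chr)
    end
    return matrix_with_chr_column
end

records = Any["X" 100; 2 200; "M" 300]
replace_sex_chromosomes!(records)
```

After the call the first column holds only integers, so the rows can be ordered numerically.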
#
VariantVisualization.clean_column1_siglist!
— Method.
clean_column1_siglist!(siglist)
Replaces the strings "X", "Y", "M" with 23, 24, 25 {Int} in the array generated in load_siglist(). Used in load_siglist() because X and Y need to be replaced with Int.
#
VariantVisualization.combined_all_genotype_array_functions
— Method.
combined_all_genotype_array_functions(sub)
Converts sub from variant filters to gt_num_array and gt_chromosome_labels for plot functions.
#
VariantVisualization.combined_all_read_depth_array_functions
— Method.
combined_all_read_depth_array_functions(sub)
Converts sub from variant filters to dp_num_array and dp_chromosome_labels for plot functions.
#
VariantVisualization.combined_all_read_depth_array_functions_for_avg_dp
— Method.
combined_all_read_depth_array_functions_for_avg_dp(sub)
Converts sub from variant filters to dp_num_array and dp_chromosome_labels for plot functions.
#
VariantVisualization.define_geno_dict
— Method.
define_geno_dict()
Returns dictionary of values for use in replace_genotype_with_vals().
#
VariantVisualization.dp_heatmap2
— Method.
dp_heatmap2(input::Array{Int64,2},title::String,chrom_label_info::Tuple{Array{String,1},Array{Int64,1},String}, sample_names,chr_pos_tuple_list_rev,y_axis_label_option,x_axis_label_option)
generate heatmap of read depth data.
#
VariantVisualization.dp_heatmap2_with_groups
— Method.
dp_heatmap2_with_groups(input::Array{Int64,2},title::String,chrom_label_info::Tuple{Array{String,1},Array{Int64,1},String},group_label_pack::Array{Any,1},id_list,chr_pos_tuple_list_rev,y_axis_label_option,trait_label_array,x_axis_label_option,number_rows)
generate heatmap of read depth data with grouped samples.
#
VariantVisualization.find_group_label_indices
— Method.
find_group_label_indices(pheno,trait_to_group_by,row_to_sort_by)
Finds indices and determines names for the group 1 and group 2 labels on plots. Finds the index of the center of each sample group to place the tick mark and label.
#
VariantVisualization.generate_chromosome_positions_for_hover_labels
— Method.
generate_chromosome_positions_for_hover_labels(chr_labels::Array{Any,2})
Creates a tuple of genomic locations to set as tick labels. This automatically stores chromosome positions in hover labels. However, tick labels are set to hidden with showticklabels=false so they will not crowd the y axis.
#
VariantVisualization.generate_genotype_array
— Method.
generate_genotype_array(record_sub::Array{Any,1},genotype_field::String)
Returns numerical array of genotype values (either genotype or read depth values), which are translated by another function into num_array. genotype_field is either GT or DP, to visualize genotype or read depth.
#
VariantVisualization.generate_hover_text_array
— Method.
generate_hover_text_array(chr_pos_tuple_list,sample_names,input,mode)
Generate array of data for hovertext to use as custom hover text for ungrouped heatmaps. Where mode is GT or DP.
#
VariantVisualization.generate_hover_text_array_grouped
— Method.
generate_hover_text_array_grouped(chr_pos_tuple_list,sample_names,input,mode)
Generate array of data for hovertext to use as custom hover text for grouped heatmaps. Where mode is GT or DP.
#
VariantVisualization.generate_legend_increments_grouped
— Method.
generate_legend_increments_grouped(input)
Dynamically generates positions for the shapes that build the categorical colorscale, including two color boxes for traits 1 and 2.
#
VariantVisualization.generate_legend_increments_ungrouped
— Method.
generate_legend_increments_ungrouped(input)
Dynamically generates positions for the shapes that build the categorical colorscale.
#
VariantVisualization.genomic_range_siglist_filter
— Method.
genomic_range_siglist_filter(vcf_filename,sig_list,chr_range::AbstractString)
Returns subarray of vcf records with io_pass_filter, io_sig_list_vcf_filter, and io_genomic_range_vcf_filter applied.
#
VariantVisualization.genotype_heatmap2_new_legend
— Method.
genotype_heatmap2_new_legend(input::Array{Any,2},title::AbstractString,chrom_label_info,sample_names,chr_pos_tuple_list_rev,y_axis_label_option,x_axis_label_option)
generate heatmap of genotype data.
#
VariantVisualization.genotype_heatmap_with_groups
— Method.
genotype_heatmap_with_groups(input::Array{Int64,2},title::String,chrom_label_info::Tuple{Array{String,1},Array{Int64,1},String},group_label_pack::Array{Any,1},id_list,chr_pos_tuple_list_rev,y_axis_label_option,trait_label_array,x_axis_label_option,number_rows)
generate heatmap of genotype data.
#
VariantVisualization.get_sample_names
— Method.
get_sample_names(reader)
returns sample ids of vcf file as a vector of symbols for naming columns of num_array dataframe object for column filter functions
#
VariantVisualization.io_genomic_range_vcf_filter
— Method.
io_genomic_range_vcf_filter(chr_range::String, vcf_filename::AbstractString)
Creates subarray of vcf variant records matching a user-specified chromosome range in the format chrom:start-end (e.g. chr1:0-30000000).
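The range string format can be made concrete with a small parser. This is a sketch under assumptions: the helper names are hypothetical, and treating both bounds as inclusive is a guess rather than documented behaviour.

```julia
# Hypothetical helper: split "chr1:0-30000000" into (chrom, start, stop).
function parse_chr_range(chr_range::AbstractString)
    chrom, span = split(chr_range, ":")
    start_str, stop_str = split(span, "-")
    return (String(chrom), parse(Int, start_str), parse(Int, stop_str))
end

# Assumed rule: a record matches when the chromosome is equal and the
# position lies within the bounds, both inclusive.
in_range(chrom, pos, range) = chrom == range[1] && range[2] <= pos <= range[3]

range = parse_chr_range("chr1:0-30000000")
```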
#
VariantVisualization.io_pass_filter
— Method.
io_pass_filter(vcf_filename)
returns subarray of vcf records including only records with FILTER status = PASS
#
VariantVisualization.io_sig_list_vcf_filter
— Method.
io_sig_list_vcf_filter(sig_list,vcf_filename)
returns subarray of variant records matching a list of variant positions returned from load_siglist()
#
VariantVisualization.jupyter_main
— Method.
jupyter_main(vcf_filename,saving_options,variant_filters,sample_selection,plotting_options)
filters, plots visualization, and saves as figure. utilizes all global variables set in first cell of jupyter notebook
#
VariantVisualization.list_sample_names_low_dp
— Method.
list_sample_names_low_dp(sample_avg_list::Array{Float64,2},sample_names)
returns list of sample ids that have an average read depth less than 15 across all variant positions. Developers can implement this in the VIVA script - search script for function name and read notes.
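The low-depth check amounts to pairing each sample id with its average and keeping the ids below the cutoff. The sketch below uses a hypothetical helper name and a cutoff keyword that the documented signature does not have; only the value 15 comes from the description above.

```julia
# Hypothetical helper: ids of samples whose average read depth is below the cutoff.
function low_dp_sample_names(sample_avg_list, sample_names; cutoff=15)
    return [name for (name, avg) in zip(sample_names, sample_avg_list) if avg < cutoff]
end

low_samples = low_dp_sample_names([12.5, 30.0, 14.9], ["s1", "s2", "s3"])
```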
#
VariantVisualization.list_variant_positions_low_dp
— Method.
list_variant_positions_low_dp(variant_avg_list::Array{Float64,2},chrom_labels)
finds variant positions that have an average read depth less than 15 across all patients. Developers can implement this in the VIVA script - search script for function name and read notes.
#
VariantVisualization.load_siglist
— Method.
load_siglist(filename::AbstractString)
filename is the name of the significant SNP variant location list in comma-delimited format (saved as .csv).
#
VariantVisualization.pass_genomic_range_filter
— Method.
pass_genomic_range_filter(reader::GeneticVariation.VCF.Reader,chr_range::AbstractString,vcf_filename)
Returns subarray of vcf records with io_pass_filter and io_genomic_range_vcf_filter applied.
#
VariantVisualization.pass_genomic_range_siglist_filter
— Method.
pass_genomic_range_siglist_filter(vcf_filename,sig_list,chr_range::AbstractString)
Returns subarray of vcf records with io_pass_filter, io_sig_list_vcf_filter, and io_genomic_range_vcf_filter applied.
#
VariantVisualization.pass_siglist_filter
— Method.
pass_siglist_filter(vcf_filename,sig_list,chr_range::AbstractString)
Returns subarray of vcf records with io_pass_filter, io_sig_list_vcf_filter, and io_genomic_range_vcf_filter applied.
#
VariantVisualization.process_plot_inputs
— Method.
process_plot_inputs(chrom_label_info,sample_names,chr_pos_tuple_list_rev)
Prepares input for the heatmap plot function for both genotype and read depth plots without the --group_samples flag.
#
VariantVisualization.process_plot_inputs_for_grouped_data
— Method.
process_plot_inputs_for_grouped_data(chrom_label_info::Tuple{Array{String,1},Array{Int64,1},String},group_label_pack::Array{Any,1},id_list,chr_pos_tuple_list_rev,trait_label_array)
Prepares input for the heatmap plot function for both genotype and read depth plots with the --group_samples flag.
#
VariantVisualization.read_depth_threshhold
— Method.
read_depth_threshhold(dp_array::Array{Int64,2})
Caps read depth outlier values at a user-defined threshold. The threshold defaults to dp = 100: all dp values over 100 are set to 100 so that read depth values in the range 0 < dp < 100 are visualized in better definition.
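Capping is an elementwise minimum against the threshold. The following is a minimal sketch; the helper name and the threshold keyword are hypothetical, since the documented signature takes only dp_array.

```julia
# Hypothetical helper: clamp every read depth value to at most `threshold`.
function cap_read_depth(dp_array::Array{Int64,2}; threshold::Int=100)
    return min.(dp_array, threshold)  # broadcasts min over every element
end

capped = cap_read_depth([5 150; 100 2000])
```

Values at or below the threshold pass through unchanged, so only outliers lose resolution.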
#
VariantVisualization.returnXY_column1!
— Method.
returnXY_column1!(chr_label_vector)
Replace String "23","24","25" with "X","Y","M" in chromosome label vector used for plot labels
#
VariantVisualization.returnXY_column1_siglist!
— Method.
returnXY_column1_siglist!(siglist_sorted)
Replace String "23","24","25" with "X","Y","M" in siglist for filtering
#
VariantVisualization.save_graphic
— Method.
save_graphic(graphic,output_directory,save_ext,title,remote_option)
Saves the plot in either html or static image formats including eps, png, svg, and pdf.
#
VariantVisualization.save_numerical_array
— Method.
save_numerical_array(num_array::Matrix{Any},sample_names,chr_labels,title,output_directory)
save numerical array with chr labels and sample ids to working directory
#
VariantVisualization.select_columns
— Method.
select_columns(filename_sample_list::AbstractString, num_array::Array{Int64,2}, sample_names)
Returns num_array with columns matching a user-generated list of sample ids to select for analysis. num_array now has sample ids in the first row.
#
VariantVisualization.sortcols_by_phenotype_matrix
— Method.
sortcols_by_phenotype_matrix(pheno_matrix_filename::String,trait_to_group_by::String,num_array::Array{Int64,2}, sample_names::Array{Symbol,2})
Groups samples by a common trait using a user-generated key matrix ("phenotype matrix"). The return values include num_array and group_label_pack.
#
VariantVisualization.test_parse_main
— Method.
test_parse_main(ARGS::Vector{String})
Defines argument parsing rules for viva script.
#
VariantVisualization.translate_genotype_to_num_array
— Method.
translate_genotype_to_num_array(genotype_array,geno_dict)
Returns a tuple of num_array for plotting and chromosome labels for plotting as a label bar. Translates the array of genotype values to a numerical array of categorical values: no call = 0, 0/0 = 1, heterozygous variant = 2, homozygous variant = 3.
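The categorical encoding can be sketched as a dictionary lookup applied to every cell. The four keys below are illustrative assumptions (the dictionary returned by define_geno_dict() is not shown in this page), and mapping unknown strings to 0 is also an assumption.

```julia
# Illustrative subset of a genotype-to-category dictionary (assumed keys).
geno_dict = Dict("./." => 0,  # no call
                 "0/0" => 1,  # homozygous reference
                 "0/1" => 2,  # heterozygous variant
                 "1/1" => 3)  # homozygous variant

genotype_array = ["0/0" "0/1"; "./." "1/1"]

# Look up each genotype string; unknown strings fall back to 0 in this sketch.
num_array = [get(geno_dict, g, 0) for g in genotype_array]
```

The comprehension keeps the shape of genotype_array, so the result can go straight into a heatmap.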
#
VariantVisualization.translate_readdepth_strings_to_num_array
— Method.
translate_readdepth_strings_to_num_array(read_depth_array::Array{Any,2})
Returns array of read depth as Int for plotting and average calculation. A ceiling of dp = 100 is set to prevent high dp values from hiding (or "blowing out") low dp values (see read_depth_threshhold()). read_depth_array is the output of generate_genotype_array() for the DP option. Returns a tuple of num_array of type Int, for average calculation and plotting, and chromosome labels for plotting as a label bar.
#
VariantVisualization.create_chr_dict
— Method.
create_chr_dict()
Creates dict for use in combined_all_genotype_array_functions() for removing 'chr' from chromosome labels to allow sorting variant records by chromosome position.
#
VariantVisualization.index_vcf
— Method.
index_vcf(vcf_filename)
Creates and saves index file with three column array of vcf chrom, position, and row number to be used by significant list filter functions.
#
VariantVisualization.make_chromosome_labels
— Method.
make_chromosome_labels(chrom_label_info)
Returns vector of values to use as tick vals to show the first chromosome label per chromosome, with blank spaces between each first chromosome position, for use with --y_axis_labels=chromosomes. duplicate_last_label tells whether the last chrom label is single or multiple, which affects the number_to_fill value.
#
VariantVisualization.match_siglist_to_index
— Method.
match_siglist_to_index(sig_list,vcf_index)
Returns vcf row indices of each variant position in sig_list for the reader function to allow fast filtering in the significant list filter functions.
#
VariantVisualization.sort_genotype_array
— Method.
sort_genotype_array(genotype_array)
sorts genotype data array (which can contain genotype or read depth values) for GT or DP by chromosomal location
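Sorting by chromosomal location can be sketched with sortslices from Base. The sketch assumes column 1 holds the chromosome as an integer (for example after X, Y, M have been replaced by 23, 24, 25) and column 2 holds the position; the toy rows are made up and this is not the package's implementation.

```julia
# Assumed layout: column 1 = chromosome (Int), column 2 = position, rest = values.
genotype_array = Any[2 500 "0/1";
                     1 900 "0/0";
                     1 100 "1/1"]

# Sort rows by chromosome first, then by position within each chromosome.
sorted = sortslices(genotype_array, dims=1, by=row -> (row[1], row[2]))
```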
#
VariantVisualization.translate_readdepth_strings_to_num_array_for_avg_dp
— Method.
translate_readdepth_strings_to_num_array_for_avg_dp(read_depth_array::Array{Any,2})
Returns array of read depth as Int for plotting and average calculation. read_depth_array is the output of generate_genotype_array() for the DP option. Returns a tuple of num_array of type Int, for average calculation and plotting, and chromosome labels for plotting as a label bar. No call is replaced with 0 for avg_calculation. A ceiling of dp = 100 is set to prevent high dp values from hiding (or "blowing out") low dp values.