Documentation
The gelviz.basic module
-
gelviz.basic.createGeneNameMap(gene_name_mapping_filename)
Function that creates a mapping from Ensembl gene IDs to HUGO gene names.
- Parameters
gene_name_mapping_filename (str) – Path to a tab-separated file in which the first column is an Ensembl gene ID and the second column is the HUGO gene name.
- Returns
Dictionary containing the gene id mapping.
- Return type
dictionary
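The file format described above can be parsed with the standard library alone. The following is a minimal sketch of that idea, not the gelviz implementation: the helper name `read_gene_name_map` is hypothetical, and it assumes the file has exactly two columns and no header line.

```python
import csv
import io


def read_gene_name_map(handle):
    """Build {ensembl_gene_id: hugo_gene_name} from a tab-separated handle.

    Hypothetical helper for illustration; assumes two columns, no header.
    """
    gene_name_map = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        gene_name_map[row[0]] = row[1]
    return gene_name_map


# In-memory text standing in for a mapping file on disk.
example = io.StringIO("ENSG00000141510\tTP53\nENSG00000012048\tBRCA1\n")
gene_name_map = read_gene_name_map(example)
print(gene_name_map["ENSG00000141510"])
```

A dictionary of this shape is what parameters such as gene_names_map in plotGeneExpression are documented to expect.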
-
gelviz.basic.determineYPosGene(genes_bed, region_size, distance_ratio)
Function that determines the maximal y position for gene plotting via the function plotGenes.
- Parameters
genes_bed (pybedtools.BedTool) – pybedtools.BedTool object containing genes to be plotted.
region_size (int) – Size of region to be plotted in base pairs.
distance_ratio (float) – Minimal distance between two genes, as a ratio of the ax width, at which two genes are still plotted side by side. If the distance falls below this ratio, the genes are stacked.
- Returns
Tuple of
max_y_pos: Defines the number of stacked genes.
y_pos_dict: Dictionary with keys = gene ids and values = y position of gene.
- Return type
tuple
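To make the stacking rule concrete, here is a small sketch of one way such y positions could be assigned. It is an assumption for illustration only and may differ from what gelviz does internally: the function name `assign_y_positions`, the greedy row strategy, and the plain `(gene_id, start, end)` tuples (instead of a pybedtools.BedTool) are all hypothetical.

```python
def assign_y_positions(genes, region_size, distance_ratio):
    """Greedy row assignment for genes given as (gene_id, start, end).

    Hypothetical sketch: a gene is placed in the lowest row whose last
    gene ends at least distance_ratio * region_size bp before it starts.
    """
    min_gap = distance_ratio * region_size
    row_ends = []  # end coordinate of the last gene placed in each row
    y_pos_dict = {}
    for gene_id, start, end in sorted(genes, key=lambda g: g[1]):
        for row, last_end in enumerate(row_ends):
            if start - last_end >= min_gap:
                row_ends[row] = end
                y_pos_dict[gene_id] = row
                break
        else:
            # No existing row has enough room, so open a new one.
            row_ends.append(end)
            y_pos_dict[gene_id] = len(row_ends) - 1
    max_y_pos = len(row_ends)  # number of rows used in this sketch
    return max_y_pos, y_pos_dict


genes = [("geneA", 1000, 5000), ("geneB", 5500, 9000), ("geneC", 60000, 70000)]
max_y_pos, y_pos_dict = assign_y_positions(genes, region_size=100000, distance_ratio=0.1)
print(max_y_pos, y_pos_dict)
```

With a 100 kb region and a ratio of 0.1, genes closer than 10 kb are stacked, so geneB moves to a second row while geneC shares the first row with geneA.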
-
gelviz.basic.distanceEqualizer(genomic_segments, start, end, direction='top_down', color='k', ax=None)
Function that plots arcs from unequal distances of genomic segments to equal distances.
- Parameters
genomic_segments (list) – List of segments for which distances shall be equalized (each segment is of the form [<chrom>, <start>, <end>])
start (int) – Start position of the genomic region.
end (int) – End position of the genomic region.
color (str, optional) – Color of lines equalizing distances, defaults to “k”.
direction (str, optional) – Direction of distance equalization (top_down | bottom_up), defaults to “top_down”.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis on which to plot, defaults to None.
- Returns
List of equalized region midpoints.
- Return type
list
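The idea of equalized midpoints can be illustrated without any plotting. The sketch below assumes that the region from start to end is split into as many equally wide slots as there are segments and that each segment is mapped to the center of its slot. This is an assumption about the geometry, not a description of the gelviz code, and `equalized_midpoints` is a hypothetical name.

```python
def equalized_midpoints(genomic_segments, start, end):
    """Return one evenly spaced midpoint per segment within [start, end].

    Hypothetical sketch: segments are [chrom, start, end] lists and are
    assumed to be sorted by position already.
    """
    n_segments = len(genomic_segments)
    if n_segments == 0:
        return []
    slot_width = (end - start) / n_segments
    return [start + (i + 0.5) * slot_width for i in range(n_segments)]


segments = [["chr1", 1000, 2000], ["chr1", 2500, 2600], ["chr1", 9000, 9500]]
original_midpoints = [(s + e) / 2 for _, s, e in segments]
midpoints = equalized_midpoints(segments, 0, 12000)
print(list(zip(original_midpoints, midpoints)))
```

Each pair of original and equalized midpoints corresponds to the two ends of one arc in a plot of this kind.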
-
gelviz.basic.plotCNVs(cnvs_bed, chromosome, start, end, ploidy=2, cnv_threshold=0.7, color_gain='g', color_loss='r', color_neutral='k', ax=None)
Function for plotting CNV segments.
- Parameters
cnvs_bed (pybedtools.BedTool) – pybedtools.BedTool object containing CNVs with the following entries:
Chromosome
Start Position
End Position
Deviation from ploidy
True Copy Number
chromosome (str) – Chromosome for which to plot CNVs.
start (int) – Start position on chromosome.
end (int) – End position on chromosome.
ploidy (int, optional) – Assumed ploidy of tumor, defaults to 2.
cnv_threshold (float, optional) – Minimal deviation from ploidy to be considered as a CNV, defaults to 0.7.
color_gain (str, optional) – Plot color of copy number gains, defaults to “g”.
color_loss (str, optional) – Plot color of copy number losses, defaults to “r”.
color_neutral (str, optional) – Plot color of copy number neutral regions, defaults to “k”.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis used for plotting.
- Returns
Nothing to be returned
- Return type
None
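The interplay of ploidy and cnv_threshold can be sketched as a small classification step. This is an illustrative assumption rather than the gelviz implementation: `classify_cnv` is a hypothetical helper, and it assumes the deviation is the signed difference between the true copy number and the ploidy.

```python
def classify_cnv(copy_number, ploidy=2, cnv_threshold=0.7):
    """Label a segment as gain, loss, or neutral.

    Hypothetical sketch: the deviation is assumed to be signed, and a
    segment counts as a CNV once its absolute deviation reaches the
    threshold.
    """
    deviation = copy_number - ploidy
    if deviation >= cnv_threshold:
        return "gain"
    if deviation <= -cnv_threshold:
        return "loss"
    return "neutral"


# Default colors taken from the plotCNVs signature.
colors = {"gain": "g", "loss": "r", "neutral": "k"}
labels = [classify_cnv(cn) for cn in (3.0, 1.0, 2.3)]
print([(label, colors[label]) for label in labels])
```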
-
gelviz.basic.plotCNVsHeat(cnvs_bed, chromosome, start, end, ploidy=2, cnv_threshold=0.7, cmap='bwr', max_dev=None, ax=None)
Function for plotting CNV segments as a heatmap.
- Parameters
cnvs_bed (pybedtools.BedTool) – pybedtools.BedTool object containing CNVs with the following entries:
Chromosome
Start Position
End Position
Deviation from ploidy
True Copy Number
chromosome (str) – Chromosome for which to plot CNVs.
start (int) – Start position on chromosome.
end (int) – End position on chromosome.
ploidy (int, optional) – Assumed ploidy of tumor, defaults to 2.
cnv_threshold (float, optional) – Minimal deviation from ploidy to be considered as a CNV, defaults to 0.7.
cmap (str, optional) – Colormap used for plotting CNVs, defaults to “bwr”.
max_dev (float, optional) – Maximal deviation from ploidy to plot, defaults to None.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis used for plotting, defaults to None.
- Returns
Nothing to be returned.
- Return type
None
-
gelviz.basic.plotChIPSignals(chip_signals, r_chrom, r_start, r_end, ax=None, color='b', offset=None, merge=None)
Function that plots bedGraph-like iterators.
- Parameters
chip_signals (iterator) –
Iterator for which each element is a list-like object containing:
Chromosome
Start position
End position
Value to be plotted as bar
r_chrom (str) – Chromosome of region to be plotted.
r_start (int) – Start position of region to be plotted.
r_end (int) – End position of region to be plotted.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis of plot, defaults to None.
color (str, optional) – Color of bars, defaults to “b”.
offset (int, optional) – Length of intervals, defaults to None.
merge (int, optional) – Number of elements to be merged. If this value is set and not equal to 0, then merge consecutive elements are averaged and plotted as one bar, defaults to None.
- Returns
Nothing to be returned.
- Return type
None
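The effect of the merge parameter, averaging a fixed number of consecutive signal values, can be sketched as follows. This is an assumption about the behavior described above, not the gelviz code: `merge_signals` is a hypothetical helper, and averaging a shorter trailing chunk over its own length is a choice made for this sketch.

```python
def merge_signals(values, merge=None):
    """Average every `merge` consecutive values into one value.

    Hypothetical sketch: with merge unset or 0 the values are returned
    unchanged; a shorter trailing chunk is averaged over its own length.
    """
    if not merge:
        return list(values)
    merged = []
    for i in range(0, len(values), merge):
        chunk = values[i:i + merge]
        merged.append(sum(chunk) / len(chunk))
    return merged


signal = [1.0, 3.0, 2.0, 4.0, 10.0]
merged = merge_signals(signal, merge=2)
print(merged)
```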
-
gelviz.basic.plotCoordinates(chrom, start, end, color='k', ax=None, upper=True, loc_coordinates='up', revert_coordinates=False, rotation=0)
Function that plots genomic coordinates in a linear fashion.
- Parameters
chrom (str) – Chromosome of the region to be plotted.
start (int) – Start position of the region to be plotted.
end (int) – End position of the region to be plotted.
color (str, optional) – Color of the genomic scales elements, defaults to “k”.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis of plot, defaults to None.
upper (bool, optional) – If True, draw fewer ticks; if False, draw more ticks, defaults to True.
loc_coordinates (str, optional) – Either of “up” | “down”. If “up”, plot ticks to upper direction, else if “down”, plot ticks to lower direction, defaults to “up”.
revert_coordinates (bool, optional) – If True, coordinates are reverted to decreasing order. Else, coordinates stay in increasing order, defaults to False.
rotation (int, optional) – Rotational angle of coordinate strings, defaults to 0.
- Returns
Nothing to be returned.
- Return type
None
-
gelviz.basic.plotGeneExpression(genes_bed, region_bed, expression_df_g1, expression_df_g2, gene_names_map, blacklist=None, ax=None, plot_legend=False, color_g1='#fb8072', color_g2='#80b1d3', g1_id='tumor', g2_id='normal', plot_gene_names=True)
Function for plotting paired gene expression (e.g. tumor and normal) on a gene region scale retaining the position of genes.
- Parameters
genes_bed (pybedtools.BedTool) – pybedtools.BedTool object containing TXstart and TXend of genes.
region_bed (pybedtools.BedTool) – pybedtools.BedTool object containing the region to be plotted.
expression_df_g1 (pandas.DataFrame) – pandas.DataFrame containing the expression values of g1 samples (columns: sample ids; index: gene ids).
expression_df_g2 (pandas.DataFrame) – pandas.DataFrame containing the expression values of g2 samples (columns: sample ids; index: gene ids).
gene_names_map (dict) – Dictionary with keys: ENSEMBL GENE IDs, and values: HUGO GENE SYMBOLs.
blacklist (set, optional) – Set containing gene ids not to be plotted, defaults to None.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis used for plotting, defaults to None.
plot_legend (bool, optional) – If True, the legend is plotted, False otherwise, defaults to False.
color_g1 (str, optional) – Color used for plotting g1 samples expression, defaults to “#fb8072”.
color_g2 (str, optional) – Color used for plotting g2 samples expression, defaults to “#80b1d3”.
g1_id (str, optional) – ID of g1 used for legend plotting, defaults to “tumor”.
g2_id (str, optional) – ID of g2 used for legend plotting, defaults to “normal”.
plot_gene_names (bool, optional) – If True, the HUGO GENE SYMBOLs are shown, otherwise they are hidden, defaults to True.
- Returns
Axis on which plot was placed.
- Return type
matplotlib.axes._subplots.AxesSubplot
-
gelviz.basic.plotGeneExpressionEqualDist(genes_bed, gene_mid_points, region, expression_df, groups, gene_names_map=None, blacklist=None, ax=None, plot_legend=False, colors=None, ids=None, plot_gene_names=True, position_gene_names='bottom', log_transformed=True, plot_points=False, alpha=0.5)
Function for plotting grouped gene expression (e.g. tumor and normal) on a gene region scale equalizing the position of genes.
- Parameters
genes_bed (pybedtools.BedTool) – pybedtools.BedTool object containing gene regions.
gene_mid_points (list) – List of integer values containing center positions of genes.
region (list) – List containing the region to be plotted ([<chrom>, <start>, <end>]).
groups (list) – List of lists containing the IDs of the different groups.
gene_names_map (dict, optional) – Dictionary with keys: ENSEMBL GENE IDs, and values: HUGO GENE SYMBOLs, defaults to None.
expression_df (pandas.DataFrame) – pandas.DataFrame object containing the expression values of all samples (columns: sample ids; index: gene ids).
blacklist (set, optional) – Set containing gene ids not to be plotted, defaults to None.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis used for plotting, defaults to None.
plot_legend (bool, optional) – If True plot legend, False otherwise, defaults to False.
colors (list, optional) – List of colors used for plotting samples expression. The number of colors must be the same as the number of groups, defaults to None.
ids (list, optional) – IDs used for legend plotting, defaults to None. The number of ids must be the same as the number of groups.
plot_gene_names (bool, optional) – True if gene names shall be plotted, False otherwise, defaults to True.
position_gene_names (str, optional) – Either of “top”, or “bottom”, defaults to “bottom”.
log_transformed (bool, optional) – If True use log transformed values for plotting, non-transformed values otherwise, defaults to True.
plot_points (bool, optional) – If True, a point per expression value is plotted in addition to the boxplot, no points are plotted otherwise, defaults to False.
alpha (float, optional) – Alpha value for the background color of the boxplots boxes, defaults to 0.5.
- Returns
Plots axis.
- Return type
matplotlib.axes._subplots.AxesSubplot
-
gelviz.basic.plotGenes(genes_bed, exons_bed, introns_bed, region_bed, blacklist=None, gene_map=None, plot_gene_ids=True, y_max=None, distance_ratio=0.1, ax=None, plot_legend=False, legend_loc='lower right', color_plus='#80b1d3', color_minus='#fb8072')
Function for plotting gene structures, i.e. introns and exons of genes.
- Parameters
genes_bed (pybedtools.BedTool) – pybedtools.BedTool object containing TX start and TX end of genes.
exons_bed (pybedtools.BedTool) – pybedtools.BedTool object containing exons of genes.
introns_bed (pybedtools.BedTool) – pybedtools.BedTool object containing introns.
region_bed (pybedtools.BedTool) – pybedtools.BedTool object containing the one region for which the gene plot is created.
blacklist (list, optional) – List of gene names for genes that should not be shown on the plot, default is None.
plot_gene_ids (bool, optional) – If True, all gene ids will be included in the plot, False otherwise, default is True.
y_max (float, optional) – Max y value in the gene plot. If not set, then y_max is the max number of stacked genes, default is None.
distance_ratio (float, optional) – Minimal distance between two genes, as a ratio of the ax width, at which two genes are still plotted side by side. If the distance falls below this ratio, the genes are stacked, default is 0.1.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axes instance on which the genes are plotted, default is None.
plot_legend (bool, optional) – If True, a legend describing plus or minus stranded genes is plotted, False otherwise. Default is False.
legend_loc (str, optional) – Location of the legend. Either of “lower left”, “lower right”, “upper left”, “upper right”, default is “lower right”.
color_plus (str, optional) – Color code for plus stranded genes, default is “#80b1d3”.
color_minus (str, optional) – Color code for minus stranded genes, default is “#fb8072”.
- Returns
Tuple of max_y_pos+1.5, patch_list, patch_description_list, where
max_y_pos+1.5 is the max_y_position + 1.5. max_y_pos defines the number of stacked genes.
patch_list is the list of patches drawn on the ax.
patch_description_list is the list of descriptions for the patches drawn on the ax.
- Return type
list
-
gelviz.basic.plotGenomicSegments(segments_list, chrom, start, end, ax=None)
Function for plotting genomic segments in different colors.
- Parameters
segments_tabix_filename – Path to tabixed bed file containing (chrom, start, end, name, score, strand, start, end, color). The color field is used to determine the color for plotting (R,G,B).
chrom (str) – Chromosome of the region to be plotted.
start (str) – Start position of the region to be plotted.
end (str) – End position of the region to be plotted.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis used for plotting, defaults to None.
- Returns
Dictionary with keys = names of segments, and values = patches.
- Return type
dict
-
gelviz.basic.plotHiCContactMap(contact_map, start, end, segment_size, cmap='Greys', vmin=None, vmax=None, location='top', ax=None)
Function that plots HiC contact maps as pyramid plots.
- Parameters
contact_map (pandas.DataFrame) – Matrix that contains the intensity values of HiC contacts.
start (int) – Chromosomal start position of region to be plotted.
end (int) – Chromosomal end position of region to be plotted.
segment_size (int) – Size of the segments for which contacts were called.
cmap (str, optional) – Name of the colormap to be used for plotting HiC intensities, defaults to “Greys”.
vmin (float, optional) – Minimal value of intensity range to be plotted, defaults to None.
vmax (float, optional) – Maximal value of intensity range to be plotted, defaults to None.
location (str, optional) – Either of “top” | “bottom”. If location == “top”, the pyramid points upwards, else if location == “bottom” the pyramid points downwards, defaults to “top”.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis on which to plot contact map, defaults to None.
- Returns
Nothing to be returned.
- Return type
None
-
gelviz.basic.plotMethylationProfile(meth_calls, chrom, start, end, color='k', ax=None)
Function that plots methylation values as dot plots.
- Parameters
meth_calls –
Iterator containing list-like elements with the following entries:
Chromosome
Start position
End position
Number methylated cytosines
Number unmethylated cytosines
chrom (str) – Chromosome of region to be plotted.
start (int) – Start position of region to be plotted.
end (int) – End position of region to be plotted.
color (str, optional) – Color of points representing methylation values, defaults to “k”.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis of plot, defaults to None.
- Returns
Nothing to be returned
- Return type
None
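Each element of meth_calls carries counts of methylated and unmethylated cytosines, from which a methylation value per position can be derived. The sketch below assumes the plotted value is the fraction of methylated cytosines. That is a common convention but an assumption here, and `methylation_ratio` is a hypothetical helper, not part of gelviz.

```python
def methylation_ratio(n_methylated, n_unmethylated):
    """Return the methylated fraction, or None if there is no coverage.

    Hypothetical sketch of a per-position methylation value.
    """
    coverage = n_methylated + n_unmethylated
    if coverage == 0:
        return None  # no reads cover this position
    return n_methylated / coverage


# Made-up calls in the documented order:
# chromosome, start, end, methylated Cs, unmethylated Cs.
meth_calls = [
    ("chr1", 100, 101, 8, 2),
    ("chr1", 250, 251, 0, 0),
    ("chr1", 400, 401, 3, 9),
]
ratios = [methylation_ratio(m, u) for _, _, _, m, u in meth_calls]
print(ratios)
```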
-
gelviz.basic.plotMethylationProfileHeat(methylation_bed, chrom, start, end, bin_size=1000, ax=None)
Function for plotting methylation values as a heatmap.
- Parameters
methylation_bed (pybedtools.BedTool) – Methylation calls. Following fields must be included: Chrom, Start, End, Methylated Cs, Unmethylated Cs.
chrom (str) – Chromosome of region to be plotted.
start (int) – Start position of region to be plotted.
end (int) – End position of region to be plotted.
bin_size (int, optional) – size of bin to average methylation values, defaults to 1000.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis to be used for plotting, defaults to None.
- Returns
Nothing to be returned
- Return type
None
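The role of bin_size can be illustrated by averaging per-position methylation fractions within fixed-width bins. This is a sketch under stated assumptions, not the gelviz implementation: `bin_methylation` is a hypothetical helper, positions are assigned to bins by their start coordinate, uncovered positions are skipped, and empty bins are reported as None.

```python
from collections import defaultdict


def bin_methylation(calls, start, end, bin_size=1000):
    """Average methylated fractions per bin of bin_size bp.

    Hypothetical sketch: calls are (chrom, start, end, n_meth, n_unmeth)
    tuples that are assumed to lie on the chromosome of interest.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _chrom, pos_start, _pos_end, n_meth, n_unmeth in calls:
        coverage = n_meth + n_unmeth
        if coverage == 0 or not (start <= pos_start < end):
            continue  # skip uncovered or out-of-region positions
        bin_index = (pos_start - start) // bin_size
        sums[bin_index] += n_meth / coverage
        counts[bin_index] += 1
    n_bins = -(-(end - start) // bin_size)  # ceiling division
    return [sums[i] / counts[i] if i in counts else None for i in range(n_bins)]


calls = [
    ("chr1", 100, 101, 8, 2),
    ("chr1", 600, 601, 2, 8),
    ("chr1", 1500, 1501, 5, 5),
]
binned = bin_methylation(calls, start=0, end=3000, bin_size=1000)
print(binned)
```

A list like this, one value per bin, is the kind of one-row matrix a heatmap of the region would be drawn from.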
-
gelviz.basic.plotMotifDirections(motifs_bed, start, end, head_width=0.2, head_length=1000, overhang=0, color_plus='#80b1d3', color_minus='#fb8072', ax=None)
Function that plots TF motifs as arrows, indicating their directionality.
- Parameters
motifs_bed (pybedtools.BedTool) – pybedtools.BedTool object containing regions of the TF sites to be plotted.
start (int) – Start position of the region to be plotted.
end (int) – End position of the region to be plotted.
head_width (float, optional) – Width of the arrow head as proportion of the arrow, defaults to 0.2.
head_length (int, optional) – Length of the arrow head in bp (depends on the region that is plotted), defaults to 1000.
overhang (float, optional) – Fraction that the arrow is swept back (0 overhang means triangular shape). Can be negative or greater than one. Defaults to 0.
color_plus (str, optional) – Color of plus stranded TF regions, defaults to “#80b1d3”.
color_minus (str, optional) – Color of minus stranded TF regions, defaults to “#fb8072”.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis on which to plot the motifs, defaults to None.
- Returns
Nothing to be returned.
- Return type
None
-
gelviz.basic.plotRegions(regions, start, end, color='#cbebc4', edgecolor=False, alpha=1, ax=None)
Function that plots genomic regions as simple rectangles.
- Parameters
regions (iterator) –
Iterator containing list-like elements with the following entries:
Chromosome
Start position
End position
start (int) – Start position of the region to be plotted.
end (int) – End position of the region to be plotted.
color (str, optional) – Color of the rectangles representing the regions to be plotted, defaults to “#cbebc4”.
edgecolor (str, optional) – Color of region edge. If False, no edge is plotted, defaults to False.
alpha (float, optional) – Alpha value of the rectangle, representing the region to be plotted, defaults to 1.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis of plot, defaults to None.
- Returns
Nothing to be returned
- Return type
None
-
gelviz.basic.plotTX(chrom_r, start_r, end_r, TX_pos, direction='right', color='k', ax=None)
Function that plots a translocation event as a bar, showing the part of the genome that is translocated.
- Parameters
chrom_r (str) – Chromosome of the region to be plotted.
start_r (int) – Start position of the region to be plotted.
end_r (int) – End position of the region to be plotted.
TX_pos (int) – Position of the translocation.
direction (str, optional) – Direction of the genomic part that is translocated. Either of “left” (upstream), or “right” (downstream), defaults to “right”.
color (str, optional) – Color of the bar representing the translocation, defaults to “k”.
ax (matplotlib.axes._subplots.AxesSubplot, optional) – Axis of plot, defaults to None.
- Returns
Nothing to be returned.
- Return type
None
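How direction relates to the plotted bar can be sketched as a choice of interval around TX_pos. This is an assumption for illustration and not taken from the gelviz source: `translocated_interval` is a hypothetical helper that returns the part of the plotted region lying upstream or downstream of the translocation position.

```python
def translocated_interval(start_r, end_r, tx_pos, direction="right"):
    """Return the (start, end) of the translocated part of the region.

    Hypothetical sketch: "left" selects everything upstream of tx_pos,
    "right" selects everything downstream of it.
    """
    if not start_r <= tx_pos <= end_r:
        raise ValueError("tx_pos must lie inside the plotted region")
    if direction == "left":
        return (start_r, tx_pos)
    if direction == "right":
        return (tx_pos, end_r)
    raise ValueError("direction must be 'left' or 'right'")


downstream = translocated_interval(1000, 9000, 4000)
upstream = translocated_interval(1000, 9000, 4000, direction="left")
print(downstream, upstream)
```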
-
gelviz.basic.readACESeqAsBed(input_filename)
Function that reads CNVs from ACESeq (“most_important”) files and converts them to a pybedtools.BedTool object.
- Parameters
input_filename (str) – Full path to ACESeq “most_important” file
- Returns
pybedtools.BedTool object containing CNVs from ACESeq.
- Return type
pybedtools.BedTool