Plots
Implementation of different plots.
Note
Most plots are now directly available as methods of the data.Data class, e.g. plot_clusters_scatter() is equivalent to
cp = ClusterPlot(data)
cp.scatter()
Warning
These implementations are still subject to change in the near future, so it is recommended to use the methods of the data.Data class as described above.
ClusterPlot
class clusterking.plots.ClusterPlot(data)
Bases: object
Plot clusters in parameter space.
After initialization, use the ‘scatter’ or ‘fill’ method for plotting.
You can modify the attributes of this class to tweak some properties of the plots.
- log = None – logging.Logger object
- data = None – Instance of pandas.DataFrame
- color_scheme = None – Color scheme
- markers = None – List of markers of the clusters (scatter plot only).
- max_subplots = None – Maximal number of subplots
- max_cols = None – Maximal number of columns of the subplot grid
- kv_formatter = None – Formatting of key-value pairs in title of plots
- fig_base_size = None – Figure size of each subplot
- aspect_ratio = None – Ratio of height/width. None: Automatically inferred
- cluster_column = None – The name of the column that holds the cluster index
- bpoint_column = None – The name of the column that holds the benchmark yes/no information
- default_marker_size = None – Default marker size
- bpoint_marker_size = None – Marker size of benchmark points
- draw_legend = None – If true, a legend is drawn
- fig – The figure.
- figsize – Figure size per subplot (width, height)
scatter(cols: List[str], clusters=None, **kwargs)
Create a scatter plot, specifying the columns to be on the axes of the plot. If 3 columns are specified, 3D scatter plots are presented, else 2D plots. If the dataframe contains more columns, such that each row is not only specified by the columns on the axes, a selection of subplots is created, showing ‘cuts’. Benchmark points are marked by enlarged plot markers.
Parameters:
- cols – The names of the columns to be shown on the x, y (and z) axis of the plots.
- clusters – The clusters to be plotted (default: all)
- **kwargs – Kwargs for ax.scatter
Returns: The figure (unless the ‘inline’ setting of matplotlib is detected).
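The ‘cuts’ mentioned above can be pictured as grouping the rows by all columns that are not on the axes and then arranging one subplot per group on a grid. The following is a minimal sketch of that idea in plain Python, not ClusterPlot's actual implementation. The helper names group_into_cuts and grid_shape are hypothetical, and simply keeping the first max_subplots groups is an assumption about how the selection might be made.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, float]
CutKey = Tuple[Tuple[str, float], ...]


def group_into_cuts(rows: List[Row], axis_cols: List[str]) -> Dict[CutKey, List[Row]]:
    """Group rows by the values of all columns that are NOT on the axes."""
    cuts = defaultdict(list)
    for row in rows:
        # Key: sorted (column, value) pairs of the remaining columns.
        key = tuple(sorted((c, v) for c, v in row.items() if c not in axis_cols))
        cuts[key].append(row)
    return dict(cuts)


def grid_shape(n_cuts: int, max_subplots: int, max_cols: int) -> Tuple[int, int]:
    """Return (n_rows, n_cols) for at most max_subplots subplots."""
    n_plots = min(n_cuts, max_subplots)
    if n_plots <= 0:
        return 0, 0
    n_cols = min(n_plots, max_cols)
    n_rows = math.ceil(n_plots / n_cols)
    return n_rows, n_cols


# Hypothetical sample points: x and y go on the axes, z defines the cuts.
rows = [
    {"x": 0.0, "y": 0.0, "z": 1.0},
    {"x": 1.0, "y": 0.0, "z": 1.0},
    {"x": 0.0, "y": 1.0, "z": 2.0},
    {"x": 1.0, "y": 1.0, "z": 3.0},
]
cuts = group_into_cuts(rows, axis_cols=["x", "y"])
shape = grid_shape(len(cuts), max_subplots=16, max_cols=2)
```

With three distinct values of z, the sketch yields three cuts, which fit on a 2 × 2 grid when at most two columns are allowed.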
fill(cols: List[str], kwargs_imshow=None)
Call this method with two column names, x and y. The results are similar to those of 2D scatter plots as created by the scatter method, except that the coloring is expanded to the whole xy plane. Note: This method only works with uniformly sampled NP!
Parameters:
- cols – List of the names of the columns to be plotted on the x-axis and on the y-axis
- kwargs_imshow – Additional keyword arguments to be passed to imshow
Returns: The figure (unless the ‘inline’ setting of matplotlib is detected).
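Expanding the coloring to the whole plane with imshow presupposes that the sample points sit on an evenly spaced grid. The sketch below shows one way such a precondition could be checked before calling fill. It assumes that "uniformly sampled" means a complete, regular grid in x and y, and is_uniform_grid is a hypothetical helper, not part of clusterking.

```python
import math
from typing import Iterable, Tuple


def is_uniformly_spaced(values: Iterable[float], rel_tol: float = 1e-9) -> bool:
    """True if the distinct values are equally spaced."""
    unique = sorted(set(values))
    if len(unique) < 3:
        return True
    steps = [b - a for a, b in zip(unique, unique[1:])]
    return all(math.isclose(s, steps[0], rel_tol=rel_tol) for s in steps)


def is_uniform_grid(points: Iterable[Tuple[float, float]]) -> bool:
    """True if the (x, y) points fill a complete, evenly spaced grid."""
    pts = set(points)
    xs = {x for x, _ in pts}
    ys = {y for _, y in pts}
    # A complete grid has exactly one point per (x, y) combination.
    complete = len(pts) == len(xs) * len(ys)
    return complete and is_uniformly_spaced(xs) and is_uniformly_spaced(ys)


grid = [(x, y) for x in (0.0, 0.5, 1.0) for y in (0.0, 1.0, 2.0)]
ragged = grid + [(0.7, 0.3)]  # one extra point off the grid
```

Here is_uniform_grid(grid) holds, while the extra off-grid point in ragged makes the check fail.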
BundlePlot
class clusterking.plots.BundlePlot(data)
Bases: object
Plotting class to plot distributions by cluster in order to analyse which distributions get assigned to which cluster.
- log = None – logging.Logger object
- data = None – pandas dataframe
- cluster_column = None – Name of the column holding the cluster number
- draw_legend = None – Draw legend?
- title = None – Override default titles with this title. If None, the default title is used.
- ax = None – Instance of matplotlib.axes.Axes
- fig – Instance of matplotlib.pyplot.figure
- xrange – Range of the x-axis
- xlabel
- ylabel
plot_bundles(clusters: Union[None, int, Iterable[int]] = None, nlines=None, ax=None, bpoints=True, hist_kwargs: Optional[Dict[str, Any]] = None, hist_kwargs_bp: Optional[Dict[str, Any]] = None) → None
Plot several examples of distributions for each specified cluster.
Parameters:
- clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.
- nlines – Number of example distributions of each cluster to be plotted. Defaults to 0 if we plot benchmark points and 3 otherwise.
- ax – Instance of matplotlib.axes.Axes to be plotted on. If None (default), a new axes object and figure is initialized and saved as self.ax and self.fig.
- bpoints – Draw benchmark curve
- hist_kwargs – Keyword arguments passed on to plot_histogram()
- hist_kwargs_bp – Like hist_kwargs but used for benchmark points. If None, hist_kwargs is used.
Returns: None
plot_minmax(clusters: Union[int, Iterable[int], None] = None, ax=None, bpoints=True, hist_kwargs: Optional[Dict[str, Any]] = None, fill_kwargs: Optional[Dict[str, Any]] = None) → None
Plot the minimum and maximum of each bin for the specified clusters.
Parameters:
- clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.
- ax – Instance of matplotlib.axes.Axes to plot on. If None, a new one is instantiated.
- bpoints – Plot benchmark points
- hist_kwargs – Keyword arguments to plot_histogram()
- fill_kwargs – Keyword arguments to matplotlib.pyplot.fill_between
Returns: None
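The per-bin minimum and maximum can be illustrated without any plotting. The sketch below assumes that each distribution of a cluster is given as a list of bin contents of equal length; bin_minmax is a hypothetical helper, not a clusterking function.

```python
from typing import List, Sequence, Tuple


def bin_minmax(histograms: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return the per-bin minima and maxima over a collection of histograms."""
    if not histograms:
        raise ValueError("Need at least one histogram.")
    n_bins = len(histograms[0])
    if any(len(h) != n_bins for h in histograms):
        raise ValueError("All histograms need the same number of bins.")
    # zip(*histograms) iterates over bins and yields, for each bin,
    # its contents in every histogram.
    columns = list(zip(*histograms))
    return [min(col) for col in columns], [max(col) for col in columns]


# Hypothetical bin contents of three distributions in one cluster.
cluster_histograms = [
    [1.0, 4.0, 2.0],
    [2.0, 3.0, 2.5],
    [0.5, 5.0, 1.0],
]
lower, upper = bin_minmax(cluster_histograms)
```

The two resulting lists are the kind of lower and upper envelope that a call to matplotlib.pyplot.fill_between would shade.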
err_plot(clusters: Union[None, int, Iterable[int]] = None, ax=None, bpoints=True, hist_kwargs: Optional[Dict[str, Any]] = None, hist_fill_kwargs: Optional[Dict[str, Any]] = None)
Plot distributions with errors.
Parameters:
- clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.
- ax – Instance of matplotlib.axes.Axes to plot on. If None, a new one is instantiated.
- bpoints – Plot benchmark points? If False or if benchmark points are not available, distributions corresponding to random sample points are chosen.
- hist_kwargs – Keyword arguments to plot_histogram()
- hist_fill_kwargs – Keyword arguments to plot_histogram_fill()
Returns: None
box_plot(clusters: Union[int, Iterable[int], None] = None, ax=None, whiskers=2.5, bpoints=True, boxplot_kwargs: Optional[Dict[str, Any]] = None, hist_kwargs: Optional[Dict[str, Any]] = None) → None
Box plot of the bin contents of the distributions corresponding to selected clusters.
Parameters:
- clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.
- ax – Instance of matplotlib.axes.Axes to plot on. If None, a new one is instantiated.
- whiskers – Length of the whiskers of the box plot in units of IQR (interquartile range, containing 50% of all values). Default 2.5.
- bpoints – Draw benchmarks?
- boxplot_kwargs – Arguments to matplotlib.pyplot.boxplot
- hist_kwargs – Keyword arguments to plot_histogram()
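To make "in units of IQR" concrete, the sketch below computes the limits within which whiskers of a given length may extend for the contents of one bin. It uses only the standard library. whisker_limits is a hypothetical helper, the quartile convention (method="inclusive") is an assumption, and how box_plot forwards whiskers to matplotlib is not shown here.

```python
import statistics
from typing import Sequence, Tuple


def whisker_limits(values: Sequence[float], whiskers: float = 2.5) -> Tuple[float, float]:
    """Return the (lower, upper) limits for whiskers of the given length in units of the IQR."""
    # Quartiles of the sample; the middle value (the median) is not needed.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whiskers * iqr, q3 + whiskers * iqr


# Hypothetical contents of one bin across nine distributions.
bin_contents = [1, 2, 3, 4, 5, 6, 7, 8, 9]
low, high = whisker_limits(bin_contents)
```

In a matplotlib box plot the whiskers end at the most extreme data points that still lie inside such limits, and points outside are drawn as outliers.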
plot_histogram
clusterking.plots.plot_histogram(ax, edges, contents, normalize=False, **kwargs)
Plot a histogram.
Parameters:
- ax – Instance of matplotlib.axes.Axes to plot on. If None, a new figure will be initialized.
- edges – Edges of the bins or None (to use bin numbers on the x axis)
- contents – Bin contents
- normalize (bool) – Normalize histogram. Default False.
- **kwargs – Passed on to matplotlib.pyplot.step
Returns: Instance of matplotlib.axes.Axes
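The data preparation behind a step-style histogram can be sketched independently of matplotlib. The helper below is hypothetical and rests on two assumptions that the description above does not settle: normalization divides by the sum of the bin contents, and the last bin content is repeated so that the x and y lists have equal length, as needed for a step plot drawn with where="post".

```python
from typing import List, Optional, Sequence, Tuple


def histogram_steps(
    edges: Optional[Sequence[float]],
    contents: Sequence[float],
    normalize: bool = False,
) -> Tuple[List[float], List[float]]:
    """Return x and y coordinates suitable for a 'post' step plot."""
    y = [float(c) for c in contents]
    if edges is None:
        # No edges given: use bin numbers 0, 1, ..., n on the x axis.
        edges = range(len(y) + 1)
    x = [float(e) for e in edges]
    if len(x) != len(y) + 1:
        raise ValueError("Need exactly one more edge than bins.")
    if normalize:
        total = sum(y)
        if total == 0:
            raise ValueError("Cannot normalize an empty histogram.")
        y = [c / total for c in y]
    # Repeat the last bin content so that the final bin is drawn as well.
    return x, y + y[-1:]


x, y = histogram_steps(None, [1, 3, 4], normalize=True)
```

The returned lists could then be handed to matplotlib.pyplot.step(x, y, where="post").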
Colors
class clusterking.plots.ColorScheme(clusters: Optional[List[int]] = None, colors: Optional[List[str]] = None)
Bases: object
Class holding color scheme. We want to assign a unique color to every cluster and keep it consistent across different plots. Subclass and overwrite color lists to implement different schemes.
__init__(clusters: Optional[List[int]] = None, colors: Optional[List[str]] = None)
Initialize ColorScheme object.
Parameters:
- clusters – List of cluster names
- colors – List of colors
cluster_colors
List of colors
get_cluster_color(cluster: int)
Returns base color for cluster.
Parameters:
- cluster – Name of cluster. Has to be in clusters
Returns: Color
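The idea of a fixed cluster-to-color assignment that stays the same across plots can be illustrated with a small stand-in class. This is a toy sketch, not the ColorScheme implementation: the class name SimpleColorScheme, the default palette, and cycling through the palette when there are more clusters than colors (which gives up uniqueness) are all assumptions.

```python
from typing import Dict, List, Optional


class SimpleColorScheme:
    """Toy stand-in: a fixed mapping from cluster name to color."""

    def __init__(
        self,
        clusters: Optional[List[int]] = None,
        colors: Optional[List[str]] = None,
    ):
        self.clusters = clusters if clusters is not None else list(range(10))
        palette = colors if colors is not None else ["red", "green", "blue", "orange"]
        # Assumption: cycle through the palette if clusters outnumber colors.
        self._color_by_cluster: Dict[int, str] = {
            cluster: palette[i % len(palette)]
            for i, cluster in enumerate(self.clusters)
        }

    def get_cluster_color(self, cluster: int) -> str:
        """Return the base color of a cluster known to the scheme."""
        if cluster not in self._color_by_cluster:
            raise ValueError(f"Unknown cluster: {cluster}")
        return self._color_by_cluster[cluster]


scheme = SimpleColorScheme(clusters=[0, 1, 2], colors=["red", "green", "blue"])
color_of_1 = scheme.get_cluster_color(1)
```

Because the mapping is built once at initialization, every plot that shares the same scheme object draws a given cluster in the same color.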
faded_colormap(cluster: int, nlines: int, name='MyFadedColorMap', **kwargs)
Returns colormap for one cluster, including the faded colors.
Parameters:
- cluster – Name of cluster
- nlines – Number of shades
- name – Name of colormap
- **kwargs – Arguments for get_cluster_colors_faded()
Returns: Colormap
demo_faded(cluster: Optional[int] = None, nlines=10, **kwargs)
Plot the color shades for different lines corresponding to the same cluster.
Parameters:
- cluster – Name of cluster
- nlines – Number of shades
- **kwargs – Arguments for get_cluster_colors_faded()
Returns: Figure
get_cluster_colors_faded(cluster: int, nlines: int, max_alpha=0.7, min_alpha=0.3)
Shades of the base color, for cases where we want to draw multiple lines for one cluster.
Parameters:
- cluster – Name of cluster
- nlines – Number of shades
- max_alpha – Maximum alpha value
- min_alpha – Minimum alpha value
Returns: List of colors
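One plausible way to produce such shades is to keep the base color and vary only its alpha value between max_alpha and min_alpha. The sketch below assumes evenly spaced alpha values in descending order and RGBA tuples as the color representation. Both are assumptions for illustration, and faded_colors is a hypothetical helper rather than the method itself.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]
RGBA = Tuple[float, float, float, float]


def faded_colors(
    base_rgb: RGB,
    nlines: int,
    max_alpha: float = 0.7,
    min_alpha: float = 0.3,
) -> List[RGBA]:
    """Return nlines RGBA shades of base_rgb, fading from max_alpha to min_alpha."""
    if nlines < 1:
        raise ValueError("nlines must be at least 1.")
    if nlines == 1:
        alphas = [max_alpha]
    else:
        # Evenly spaced alpha values, endpoints included.
        step = (max_alpha - min_alpha) / (nlines - 1)
        alphas = [max_alpha - i * step for i in range(nlines)]
    r, g, b = base_rgb
    return [(r, g, b, alpha) for alpha in alphas]


shades = faded_colors((1.0, 0.0, 0.0), nlines=3)
```

For three lines and the default limits, the alpha values come out as approximately 0.7, 0.5 and 0.3, so the first line is drawn most opaque.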