Plots

Implementation of different plots.

Note

Most plots are now directly available as methods of the data.Data class. For example, plot_clusters_scatter() is equivalent to

cp = ClusterPlot(data)
cp.scatter()

Warning

These implementations may still change in the near future, so using the methods of the data.Data class as described above is recommended.

ClusterPlot

class clusterking.plots.ClusterPlot(data)[source]

Bases: object

Plot clusters in parameter space.

After initialization, use the ‘scatter’ or ‘fill’ method for plotting.

You can modify the attributes of this class to tweak some properties of the plots.

__init__(data)[source]
Parameters: data – Data object
log = None

logging.Logger object

data = None

Instance of pandas.DataFrame

color_scheme = None

Color scheme

markers = None

List of markers of the clusters (scatter plot only).

max_subplots = None

Maximal number of subplots

max_cols = None

Maximal number of columns of the subplot grid

kv_formatter = None

Formatting of key-value pairs in title of plots

fig_base_size = None

figure size of each subplot

aspect_ratio = None

Ratio of height/width. If None, the ratio is inferred automatically.

cluster_column = None

The name of the column that holds the cluster index

bpoint_column = None

The name of the column that holds the benchmark yes/no information

default_marker_size = None

Default marker size

bpoint_marker_size = None

Marker size of benchmark points

draw_legend = None

If true, a legend is drawn

fig

The figure.

figsize

Figure size per subplot (width, height)

scatter(cols: List[str], clusters=None, **kwargs)[source]

Create a scatter plot, specifying the columns to be shown on the axes of the plot. If 3 columns are specified, 3D scatter plots are drawn, otherwise 2D plots. If the dataframe contains more columns, so that a row is not fully specified by the columns on the axes, a selection of subplots is created, showing ‘cuts’. Benchmark points are marked by enlarged plot markers.

Parameters:
  • cols – The names of the columns to be shown on the x, y (and z) axis of the plots.
  • clusters – The clusters to be plotted (default: all)
  • **kwargs – Kwargs for ax.scatter
Returns:

The figure (unless the ‘inline’ setting of matplotlib is detected).
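
The ‘cuts’ mentioned in the description of scatter can be pictured as grouping the rows by the values of all columns that are not shown on the axes, with one subplot per group. The following is a minimal sketch in plain Python, not ClusterKing's implementation: the helpers group_into_cuts and grid_shape, the toy rows, and the way max_subplots and max_cols limit the grid are assumptions made for illustration.

```python
import math
from collections import defaultdict


def group_into_cuts(rows, axis_cols, cluster_col="cluster"):
    """Group rows by the values of all columns that are not plot axes.

    Hypothetical helper: each resulting group corresponds to one subplot.
    """
    skip = set(axis_cols) | {cluster_col}
    cuts = defaultdict(list)
    for row in rows:
        # The remaining (column, value) pairs identify the cut.
        key = tuple(sorted((k, v) for k, v in row.items() if k not in skip))
        cuts[key].append(row)
    return dict(cuts)


def grid_shape(n_plots, max_subplots, max_cols):
    """Return (rows, cols) of a subplot grid, capped at max_subplots."""
    n = min(n_plots, max_subplots)
    if n == 0:
        return 0, 0
    cols = min(n, max_cols)
    return math.ceil(n / cols), cols


# Toy parameter points: x and y go on the axes, z defines the cuts.
rows = [
    {"x": x, "y": y, "z": z, "cluster": (x + y + z) % 2}
    for x in (0, 1)
    for y in (0, 1)
    for z in (0, 1, 2)
]
cuts = group_into_cuts(rows, axis_cols=["x", "y"])
shape = grid_shape(len(cuts), max_subplots=16, max_cols=4)
```

In this toy data set the three values of z give three cuts of four points each, which fit into a single row of subplots.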

fill(cols: List[str], kwargs_imshow=None)[source]

Call this method with two column names, x and y. The results are similar to those of 2D scatter plots as created by the scatter method, except that the coloring is expanded to the whole xy plane. Note: This method only works with uniformly sampled NP!

Parameters:
  • cols – List of the names of the two columns to be plotted on the x axis and on the y axis
  • kwargs_imshow – Additional keyword arguments to be passed to imshow
Returns:

The figure (unless the ‘inline’ setting of matplotlib is detected).
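
The restriction of fill to uniformly sampled points is easier to see with a small example: an image plot needs exactly one value per grid cell. The sketch below is an illustration in plain Python and not part of ClusterKing. The helper to_grid is hypothetical, and it only checks that the points cover a complete rectangular grid, not that the spacing is equal.

```python
def to_grid(points):
    """Arrange (x, y, cluster) triples into a 2D list indexed [iy][ix].

    Raises ValueError if the points do not cover a full rectangular grid.
    """
    xs = sorted({x for x, _, _ in points})
    ys = sorted({y for _, y, _ in points})
    lookup = {(x, y): c for x, y, c in points}
    if len(lookup) != len(xs) * len(ys):
        raise ValueError("points do not form a complete rectangular grid")
    return [[lookup[(x, y)] for x in xs] for y in ys]


# Toy sample: 3 x values times 2 y values, cluster 1 where x > y.
points = [(x, y, int(x > y)) for x in (0.0, 0.5, 1.0) for y in (0.0, 1.0)]
grid = to_grid(points)
```

A nested list like grid is the kind of array-like input that matplotlib's imshow accepts. If a single point is missing, no such array can be built without guessing a value.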

savefig(*args, **kwargs)[source]

Equivalent to ClusterPlot.fig.savefig(*args, **kwargs): Saves figure to file, e.g. ClusterPlot.savefig("test.pdf").

BundlePlot

class clusterking.plots.BundlePlot(data)[source]

Bases: object

Plotting class to plot distributions by cluster in order to analyse which distributions get assigned to which cluster.

__init__(data)[source]
Parameters: data – Data object
log = None

logging.Logger object

data = None

pandas dataframe

cluster_column = None

Name of the column holding the cluster number

draw_legend = None

Draw legend?

title = None

Override default titles with this title. If None, the default title is used.

ax = None

Instance of matplotlib.axes.Axes

fig

Instance of matplotlib.figure.Figure

xrange

Range of the x axis

xlabel
ylabel
plot_bundles(clusters: Union[None, int, Iterable[int]] = None, nlines=None, ax=None, bpoints=True, hist_kwargs: Optional[Dict[str, Any]] = None, hist_kwargs_bp: Optional[Dict[str, Any]] = None) → None[source]

Plot several examples of distributions for each cluster specified

Parameters:
  • clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.
  • nlines – Number of example distributions of each cluster to be plotted. Defaults to 0 if we plot benchmark points and 3 otherwise.
  • ax – Instance of matplotlib.axes.Axes to be plotted on. If None (default), a new axes object and figure is initialized and saved as self.ax and self.fig.
  • bpoints – Draw benchmark curve
  • hist_kwargs – Keyword arguments passed on to plot_histogram()
  • hist_kwargs_bp – Like hist_kwargs but used for benchmark points. If None, hist_kwargs is used.
Returns:

None
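
The clusters argument, which plot_bundles shares with plot_minmax, err_plot and box_plot, accepts None, a single integer or an iterable of integers. A minimal sketch of how such an argument and the documented nlines default could be resolved is shown here. Both helper functions are hypothetical and are not taken from ClusterKing's source.

```python
from typing import Iterable, List, Optional, Union


def normalize_clusters(
    clusters: Union[None, int, Iterable[int]],
    available: Iterable[int],
) -> List[int]:
    """Turn the 'clusters' argument into an explicit list of cluster ids."""
    if clusters is None:
        # None selects every cluster that occurs in the data.
        return sorted(set(available))
    if isinstance(clusters, int):
        return [clusters]
    return list(clusters)


def default_nlines(nlines: Optional[int], bpoints: bool) -> int:
    """Apply the documented default: 0 with benchmark points, 3 otherwise."""
    if nlines is not None:
        return nlines
    return 0 if bpoints else 3


all_clusters = normalize_clusters(None, available=[2, 0, 1, 1])
single = normalize_clusters(1, available=[2, 0, 1, 1])
```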

animate_bundle(cluster, n, benchmark=True)[source]
plot_minmax(clusters: Union[int, Iterable[int], None] = None, ax=None, bpoints=True, hist_kwargs: Optional[Dict[str, Any]] = None, fill_kwargs: Optional[Dict[str, Any]] = None) → None[source]

Plot the minimum and maximum of each bin for the specified clusters.

Parameters:
  • clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.
  • ax – Instance of matplotlib.axes.Axes to plot on. If None, a new one is instantiated.
  • bpoints – Plot benchmark points
  • hist_kwargs – Keyword arguments to plot_histogram()
  • fill_kwargs – Keyword arguments to matplotlib.pyplot.fill_between
Returns:

None
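
The quantity shown by plot_minmax, the minimum and maximum of each bin over all distributions of a cluster, can be computed as in the following sketch. It uses plain Python lists in place of the dataframe, and bin_minmax is a hypothetical helper, not ClusterKing's code.

```python
def bin_minmax(distributions):
    """Return (mins, maxs): per-bin minimum and maximum over distributions."""
    if not distributions:
        raise ValueError("need at least one distribution")
    columns = list(zip(*distributions))  # one tuple of values per bin
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    return mins, maxs


# Three toy distributions with three bins each, all in the same cluster.
cluster_distributions = [
    [1.0, 4.0, 2.0],
    [2.0, 3.0, 2.5],
    [0.5, 5.0, 1.0],
]
mins, maxs = bin_minmax(cluster_distributions)
```

The band between mins and maxs is what a call to fill_between would shade.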

err_plot(clusters: Union[None, int, Iterable[int]] = None, ax=None, bpoints=True, hist_kwargs: Optional[Dict[str, Any]] = None, hist_fill_kwargs: Optional[Dict[str, Any]] = None)[source]

Plot distributions with errors.

Parameters:
  • clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.
  • ax – Instance of matplotlib.axes.Axes to plot on. If None, a new one is instantiated.
  • bpoints – Plot benchmark points? If False, or if benchmark points are not available, distributions corresponding to random sample points are chosen.
  • hist_kwargs – Keyword arguments to plot_histogram()
  • hist_fill_kwargs – Keyword arguments to plot_histogram_fill()
Returns:

None

box_plot(clusters: Union[int, Iterable[int], None] = None, ax=None, whiskers=2.5, bpoints=True, boxplot_kwargs: Optional[Dict[str, Any]] = None, hist_kwargs: Optional[Dict[str, Any]] = None) → None[source]

Box plot of the bin contents of the distributions corresponding to selected clusters.

Parameters:
  • clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.
  • ax – Instance of matplotlib.axes.Axes to plot on. If None, a new one is instantiated.
  • whiskers – Length of the whiskers of the box plot in units of IQR (interquartile range, containing 50% of all values). Default 2.5.
  • bpoints – Draw benchmarks?
  • boxplot_kwargs – Arguments to matplotlib.pyplot.boxplot
  • hist_kwargs – Keyword arguments to plot_histogram()
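
To make the whiskers parameter concrete, the sketch below computes whisker ends for the values of one bin, following the convention of matplotlib's boxplot, where a whisker reaches to the furthest data point within whis times the IQR of the first or third quartile. It is assumed here, not confirmed by this page, that whiskers is interpreted in this way. The helper whisker_limits is hypothetical, and the quartile interpolation of the statistics module may differ slightly from the one used when plotting.

```python
import statistics


def whisker_limits(values, whiskers=2.5):
    """Return the (low, high) whisker ends for one bin.

    Points outside the returned range would be drawn as outliers.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - whiskers * iqr
    high_limit = q3 + whiskers * iqr
    inside = [v for v in values if low_limit <= v <= high_limit]
    # Fall back to the quartiles if no data point lies inside the limits.
    return min(inside, default=q1), max(inside, default=q3)


# Contents of one bin across ten toy distributions; 40 is far from the rest.
bin_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]
ends_default = whisker_limits(bin_values)
ends_wide = whisker_limits(bin_values, whiskers=10)
```

With the default of 2.5 the value 40 lies beyond the upper limit and is left outside the whiskers. With a much larger factor it is included.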

plot_histogram

clusterking.plots.plot_histogram(ax, edges, contents, normalize=False, **kwargs)[source]

Plot a histogram.

Parameters:
  • ax – Instance of matplotlib.axes.Axes to plot on. If None, a new figure will be initialized.
  • edges – Edges of the bins or None (to use bin numbers on the x axis)
  • contents – bin contents
  • normalize (bool) – Normalize histogram. Default False.
  • **kwargs – passed on to matplotlib.pyplot.step
Returns:

Instance of matplotlib.axes.Axes
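
The relation between edges, contents and the drawn step line can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: histogram_step_points is a hypothetical helper, normalization is taken to mean dividing by the sum of the bin contents, and the last value is repeated so that a step plot with where="post" also draws the final bin.

```python
def histogram_step_points(edges, contents, normalize=False):
    """Return (x, y) lists suitable for a step plot with where='post'."""
    contents = list(contents)
    if normalize:
        total = sum(contents)
        if total == 0:
            raise ValueError("cannot normalize a histogram with zero total")
        contents = [c / total for c in contents]
    if edges is None:
        # Without edges, use the bin numbers 0, 1, ..., n on the x axis.
        edges = list(range(len(contents) + 1))
    else:
        edges = list(edges)
    if len(edges) != len(contents) + 1:
        raise ValueError("need exactly one more edge than bins")
    # Repeat the last value so that the last bin has a right-hand end.
    return edges, contents + contents[-1:]


x, y = histogram_step_points([0.0, 1.0, 2.0, 4.0], [1, 3, 4], normalize=True)
x_default, _ = histogram_step_points(None, [1, 3, 4])
```

The resulting lists could then be drawn with ax.step(x, y, where="post").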

Colors

class clusterking.plots.ColorScheme(clusters: Optional[List[int]] = None, colors: Optional[List[str]] = None)[source]

Bases: object

Class holding a color scheme. We want to assign a unique color to every cluster and keep it consistent across different plots. Subclass and overwrite the color lists to implement different schemes.

__init__(clusters: Optional[List[int]] = None, colors: Optional[List[str]] = None)[source]

Initialize ColorScheme object.

Parameters:
  • clusters – List of cluster names
  • colors – List of colors
cluster_colors

List of colors

get_cluster_color(cluster: int)[source]

Returns base color for cluster.

Parameters: cluster – Name of cluster. Has to be in clusters
Returns: Color
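
The idea of a fixed cluster-to-color mapping that subclasses can restyle is sketched below with a toy stand-in class. SimpleColorScheme, its base_colors list and the Grayscale subclass are hypothetical and do not reproduce clusterking.plots.ColorScheme. In particular, reusing colors when there are more clusters than colors is an assumption of this sketch.

```python
from itertools import cycle


class SimpleColorScheme:
    """Toy stand-in for a color scheme (not the ClusterKing class)."""

    # Subclasses can overwrite this list to implement a different scheme.
    base_colors = ["red", "green", "blue", "black", "orange"]

    def __init__(self, clusters=None, colors=None):
        self.clusters = list(clusters) if clusters is not None else []
        palette = list(colors) if colors else list(self.base_colors)
        # Fix the mapping once so that every plot uses the same colors.
        self._color_of = dict(zip(self.clusters, cycle(palette)))

    def get_cluster_color(self, cluster):
        if cluster not in self._color_of:
            raise KeyError(f"unknown cluster: {cluster!r}")
        return self._color_of[cluster]


class Grayscale(SimpleColorScheme):
    base_colors = ["black", "gray", "silver"]


scheme = SimpleColorScheme(clusters=[0, 1, 2])
gray = Grayscale(clusters=[0, 1, 2, 3])
```

Because the mapping is built once in the constructor, two plots that share the same scheme object color a given cluster identically.
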
to_colormap(name='MyColorMap')[source]

Returns colormap with color for each cluster.

faded_colormap(cluster: int, nlines: int, name='MyFadedColorMap', **kwargs)[source]

Returns colormap for one cluster, including the faded colors.

Parameters:
  • cluster – Name of cluster
  • nlines – Number of shades
  • name – Name of colormap
  • **kwargs – Arguments for get_cluster_colors_faded()
Returns:

Colormap

demo()[source]

Plot the colors for all clusters.

Returns: figure
demo_faded(cluster: Optional[int] = None, nlines=10, **kwargs)[source]

Plot the color shades for different lines corresponding to the same cluster

Returns:

figure

get_cluster_colors_faded(cluster: int, nlines: int, max_alpha=0.7, min_alpha=0.3)[source]

Shades of the base color, for cases where we want to draw multiple lines for one cluster

Parameters:
  • cluster – Name of cluster
  • nlines – Number of shades
  • max_alpha – Maximum alpha value
  • min_alpha – Minimum alpha value
Returns:

List of colors
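
One way to picture the shades is as copies of the base color whose alpha value runs from max_alpha down to min_alpha. The sketch below assumes linear spacing and RGBA tuples. Both are assumptions for illustration, and faded_colors is a hypothetical helper rather than the method documented here.

```python
def faded_colors(base_rgb, nlines, max_alpha=0.7, min_alpha=0.3):
    """Return nlines RGBA tuples with alpha falling from max to min."""
    if nlines < 1:
        return []
    if nlines == 1:
        return [(*base_rgb, max_alpha)]
    step = (max_alpha - min_alpha) / (nlines - 1)
    return [(*base_rgb, max_alpha - i * step) for i in range(nlines)]


# Three shades of pure red.
shades = faded_colors((1.0, 0.0, 0.0), nlines=3)
alphas = [color[3] for color in shades]
```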

get_err_color(cluster: int)[source]

Get color for error shades.

Parameters: cluster – Cluster name
Returns: color