Plots

Implementation of different plots.

Note

Most plots are now directly available as methods of the data.Data class, e.g. plot_clusters_scatter() is equivalent to

cp = ClusterPlot(data)
cp.scatter()

Warning

These implementations are still subject to change in the near future, so it is recommended to use the methods of the data.Data class as advertised above.

ClusterPlot

class clusterking.plots.ClusterPlot(data)[source]

Bases: object

Plot clusters in parameter space.

After initialization, use the ‘scatter’ or ‘fill’ method for plotting.

You can modify the attributes of this class to tweak some properties of the plots.

__init__(data)[source]
Parameters

data – Data object

log

logging.Logger object

data

Instance of pandas.DataFrame

color_scheme

Color scheme

markers

List of markers of the clusters (scatter plot only).

max_subplots

Maximal number of subplots

max_cols

Maximal number of columns of the subplot grid

kv_formatter

Formatting of key-value pairs in title of plots

fig_base_size

figure size of each subplot

aspect_ratio

Ratio of height/width. None: Automatically inferred

cluster_column

The name of the column that holds the cluster index

bpoint_column

The name of the column that holds the benchmark yes/no information

default_marker_size

Default marker size

bpoint_marker_size

Marker size of benchmark points

draw_legend

If true, a legend is drawn
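
How max_subplots and max_cols could interact when laying out a subplot grid can be sketched as follows. This is only an illustration and not the ClusterPlot implementation: the function name grid_shape and the default values 16 and 4 are assumptions made for the example.

```python
import math


def grid_shape(n_subplots, max_subplots=16, max_cols=4):
    """Return (n_rows, n_cols) for a subplot grid (hypothetical helper)."""
    # Never draw more subplots than max_subplots.
    n = min(n_subplots, max_subplots)
    if n <= 0:
        return 0, 0
    # Fill rows from left to right, using at most max_cols columns.
    n_cols = min(n, max_cols)
    n_rows = math.ceil(n / n_cols)
    return n_rows, n_cols


print(grid_shape(10))  # 10 subplots with at most 4 columns per row
```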

property fig

The figure.

property figsize

Figure size per subplot (width, height)
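
A minimal sketch of how a per-subplot size could follow from fig_base_size and aspect_ratio (height/width), and how a total figure size could follow from the grid. The helper figure_size and the assumption that the base size is the subplot width are illustrative only and not taken from the library.

```python
def figure_size(n_rows, n_cols, fig_base_size=4.0, aspect_ratio=1.0):
    """Return ((width, height) per subplot, (width, height) in total).

    Assumption: fig_base_size is the subplot width and aspect_ratio is
    height/width, as described for the attributes above.
    """
    subplot = (fig_base_size, fig_base_size * aspect_ratio)
    total = (n_cols * subplot[0], n_rows * subplot[1])
    return subplot, total


subplot, total = figure_size(n_rows=3, n_cols=4, fig_base_size=4.0, aspect_ratio=0.5)
print(subplot, total)
```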

scatter(cols: List[str], clusters=None, **kwargs)[source]

Create a scatter plot, specifying the columns to be shown on the axes of the plot. If 3 columns are specified, 3D scatter plots are produced, otherwise 2D plots. If the dataframe contains more columns, so that a row is not fully specified by the columns on the axes, a selection of subplots is created, showing ‘cuts’. Benchmark points are marked by enlarged plot markers.

Parameters
  • cols – The names of the columns to be shown on the x, y (and z) axis of the plots.

  • clusters – The clusters to be plotted (default: all)

  • **kwargs – Kwargs for ax.scatter

Returns

The figure (unless the ‘inline’ setting of matplotlib is detected).
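
The idea of ‘cuts’ described for scatter can be illustrated with plain Python: rows are grouped by the values of all parameter columns that are not shown on the axes, and each group would correspond to one subplot. The helper split_into_cuts, the column names a, b, c and the toy cluster assignment are hypothetical and only serve the illustration.

```python
from collections import defaultdict
from itertools import product


def split_into_cuts(rows, axis_cols, ignore=("cluster",)):
    """Group rows by the values of the columns not shown on the axes."""
    cuts = defaultdict(list)
    for row in rows:
        # The key holds every remaining (column, value) pair of this row.
        key = tuple(
            (name, value)
            for name, value in sorted(row.items())
            if name not in axis_cols and name not in ignore
        )
        cuts[key].append(row)
    return dict(cuts)


# Toy data: three parameters on a 2x2x2 grid plus a made-up cluster index.
rows = [
    {"a": a, "b": b, "c": c, "cluster": (a + b + c) % 2}
    for a, b, c in product([0, 1], repeat=3)
]
cuts = split_into_cuts(rows, axis_cols=["a", "b"])
for key, members in cuts.items():
    print(key, len(members))  # one subplot per value of c
```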

fill(cols: List[str], kwargs_imshow=None)[source]

Call this method with two column names, x and y. The results are similar to those of 2D scatter plots as created by the scatter method, except that the coloring is expanded to the whole xy plane. Note: This method only works with uniformly sampled NP!

Parameters
  • cols – List of the names of the columns to be plotted on the x-axis and on the y-axis

  • kwargs_imshow – Additional keyword arguments to be passed to imshow

Returns

The figure (unless the ‘inline’ setting of matplotlib is detected).
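
Since fill only works with uniformly sampled parameter points, it can help to check the sampling of each axis column before plotting. The following sketch shows one plausible reading of "uniformly sampled" (equal spacing between the distinct values of a column). The helper is_uniformly_sampled is hypothetical and not part of clusterking.

```python
import math


def is_uniformly_sampled(values, rel_tol=1e-9):
    """Check whether the distinct values are equally spaced."""
    points = sorted(set(values))
    if len(points) < 2:
        return True
    step = points[1] - points[0]
    # Compare every gap between neighbouring points with the first gap.
    return all(
        math.isclose(b - a, step, rel_tol=rel_tol)
        for a, b in zip(points, points[1:])
    )


print(is_uniformly_sampled([0.0, 0.5, 1.0, 0.5]))
print(is_uniformly_sampled([0.0, 1.0, 3.0]))
```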

savefig(*args, **kwargs)[source]

Equivalent to ClusterPlot.fig.savefig(*args, **kwargs): Saves figure to file, e.g. ClusterPlot.savefig("test.pdf").

BundlePlot

class clusterking.plots.BundlePlot(data)[source]

Bases: object

Plotting class to plot distributions by cluster in order to analyse which distributions get assigned to which cluster.

__init__(data)[source]
Parameters

data – Data object

log

logging.Logger object

data

pandas dataframe

cluster_column

Name of the column holding the cluster number

draw_legend

Draw legend?

title

Override default titles with this title. If None, the default title is used.

ax

Instance of matplotlib.axes.Axes

property fig

Instance of matplotlib.pyplot.figure

property xrange

Range of the x-axis

property xlabel
property ylabel
plot_bundles(clusters: Union[None, int, Iterable[int]] = None, nlines=None, ax=None, bpoints=True, hist_kwargs: Optional[Dict[str, Any]] = None, hist_kwargs_bp: Optional[Dict[str, Any]] = None) → None[source]

Plot several examples of distributions for each cluster specified

Parameters
  • clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.

  • nlines – Number of example distributions of each cluster to be plotted. Defaults to 0 if we plot benchmark points and 3 otherwise.

  • ax – Instance of matplotlib.axes.Axes to be plotted on. If None (default), a new axes object and figure are initialized and saved as self.ax and self.fig.

  • bpoints – Draw benchmark curve

  • hist_kwargs – Keyword arguments passed on to plot_histogram()

  • hist_kwargs_bp – Like hist_kwargs but used for benchmark points. If None, hist_kwargs is used.

Returns

None
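
The conventions for the clusters and nlines arguments described above can be sketched in a few lines. The helpers resolve_clusters and default_nlines are hypothetical names for this illustration; the actual method may handle these arguments differently.

```python
from typing import Iterable, List, Optional, Union


def resolve_clusters(
    clusters: Union[None, int, Iterable[int]], available: Iterable[int]
) -> List[int]:
    """Turn None, a single cluster or several clusters into a list."""
    if clusters is None:
        # None selects all clusters that are present.
        return sorted(set(available))
    if isinstance(clusters, int):
        return [clusters]
    return list(clusters)


def default_nlines(nlines: Optional[int], bpoints: bool) -> int:
    """0 example lines if benchmark points are drawn, 3 otherwise."""
    if nlines is not None:
        return nlines
    return 0 if bpoints else 3


print(resolve_clusters(None, available=[2, 0, 1, 1]))
print(default_nlines(None, bpoints=False))
```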

animate_bundle(cluster, n, benchmark=True)[source]
plot_minmax(clusters: Union[None, int, Iterable[int]] = None, ax=None, bpoints=True, hist_kwargs: Optional[Dict[str, Any]] = None, fill_kwargs: Optional[Dict[str, Any]] = None) → None[source]

Plot the minimum and maximum of each bin for the specified clusters.

Parameters
  • clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.

  • ax – Instance of matplotlib.axes.Axes to plot on. If None, a new one is instantiated.

  • bpoints – Plot benchmark points

  • hist_kwargs – Keyword arguments to plot_histogram()

  • fill_kwargs – Keyword arguments to matplotlib.pyplot.fill_between

Returns

None
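
What "minimum and maximum of each bin" means for a set of distributions belonging to one cluster can be shown without any plotting. The function bin_minmax and the example numbers are made up for this sketch.

```python
def bin_minmax(distributions):
    """Return (minima, maxima) with one entry per bin.

    Each distribution is a sequence of bin contents, and all of them are
    assumed to share the same binning.
    """
    # zip(*...) transposes the data: one tuple per bin.
    columns = list(zip(*distributions))
    return [min(c) for c in columns], [max(c) for c in columns]


minima, maxima = bin_minmax([[1, 5, 3], [2, 4, 6], [0, 7, 3]])
print(minima, maxima)
```

The band between the two lists is the kind of region that a call to fill_between would shade.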

err_plot(clusters: Union[None, int, Iterable[int]] = None, ax=None, bpoints=True, hist_kwargs: Optional[Dict[str, Any]] = None, hist_fill_kwargs: Optional[Dict[str, Any]] = None)[source]

Plot distributions with errors.

Parameters
  • clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.

  • ax – Instance of matplotlib.axes.Axes to plot on. If None, a new one is instantiated.

  • bpoints – Plot benchmark points? If False or benchmark points are not available, distributions corresponding to random sample points are chosen.

  • hist_kwargs – Keyword arguments to plot_histogram()

  • hist_fill_kwargs – Keyword arguments to plot_histogram_fill()

Returns

None

box_plot(clusters: Union[None, int, Iterable[int]] = None, ax=None, whiskers=2.5, bpoints=True, boxplot_kwargs: Optional[Dict[str, Any]] = None, hist_kwargs: Optional[Dict[str, Any]] = None) → None[source]

Box plot of the bin contents of the distributions corresponding to selected clusters.

Parameters
  • clusters – List of clusters to select, or a single cluster. If None (default), all clusters are chosen.

  • ax – Instance of matplotlib.axes.Axes to plot on. If None, a new one is instantiated.

  • whiskers – Length of the whiskers of the box plot in units of IQR (interquartile range, containing 50% of all values). Default 2.5.

  • bpoints – Draw benchmarks?

  • boxplot_kwargs – Arguments to matplotlib.pyplot.boxplot

  • hist_kwargs – Keyword arguments to plot_histogram()
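
To make "length of the whiskers in units of IQR" concrete, the following sketch computes whisker end points for the contents of a single bin. It assumes that the whiskers value is interpreted like the whis parameter of matplotlib's boxplot (whiskers reach to the most extreme data point within whiskers × IQR of the box); whether box_plot passes the value on in exactly this way is not confirmed by the text above. The helper whisker_limits is hypothetical.

```python
import statistics


def whisker_limits(values, whiskers=2.5):
    """Return the (lower, upper) whisker end points for one bin."""
    # Quartiles with linear interpolation between data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_bound = q1 - whiskers * iqr
    high_bound = q3 + whiskers * iqr
    # Whiskers end at the most extreme data points inside the bounds.
    inside = [v for v in values if low_bound <= v <= high_bound]
    return min(inside), max(inside)


bin_contents = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
print(whisker_limits(bin_contents))              # 30 lies outside the whiskers
print(whisker_limits(bin_contents, whiskers=5))  # longer whiskers reach 30
```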

plot_histogram

clusterking.plots.plot_histogram(ax, edges, contents, normalize=False, **kwargs)[source]

Plot a histogram.

Parameters
  • ax – Instance of matplotlib.axes.Axes to plot on. If None, a new figure will be initialized.

  • edges – Edges of the bins or None (to use bin numbers on the x axis)

  • contents – bin contents

  • normalize (bool) – Normalize histogram. Default False.

  • **kwargs – passed on to matplotlib.pyplot.step

Returns

Instance of matplotlib.axes.Axes
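
The handling of edges and normalize can be sketched as a small preprocessing step. Two points here are assumptions, not facts about plot_histogram: that normalization divides by the sum of the bin contents, and that bin numbers start at 0 when edges is None. The helper prepare_histogram is hypothetical.

```python
def prepare_histogram(edges, contents, normalize=False):
    """Return (edges, contents) ready to be drawn as a step plot."""
    contents = [float(c) for c in contents]
    if edges is None:
        # Fall back to bin numbers on the x axis (assumed to start at 0).
        edges = list(range(len(contents) + 1))
    if len(edges) != len(contents) + 1:
        raise ValueError("Expected exactly one more edge than bins.")
    if normalize:
        total = sum(contents)
        if total != 0:
            # Assumption: normalize so that the bin contents sum to 1.
            contents = [c / total for c in contents]
    return list(edges), contents


print(prepare_histogram(None, [1, 3], normalize=True))
```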

Colors

class clusterking.plots.ColorScheme(clusters: Optional[List[int]] = None, colors: Optional[List[str]] = None)[source]

Bases: object

Class holding color scheme. We want to assign a unique color to every cluster and keep it consistent across different plots. Subclass and overwrite color lists to implement different schemes.

__init__(clusters: Optional[List[int]] = None, colors: Optional[List[str]] = None)[source]

Initialize ColorScheme object.

Parameters
  • clusters – List of cluster names

  • colors – List of colors

property cluster_colors

List of colors

get_cluster_color(cluster: int)[source]

Returns base color for cluster.

Parameters

cluster – Name of cluster. Has to be in clusters

Returns

Color
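
The goal of giving every cluster one fixed color can be illustrated with a stripped-down stand-in for the class. SimpleColorScheme, its default color list and the cycling through colors when there are more clusters than colors are assumptions made for this sketch and do not describe how ColorScheme is implemented.

```python
class SimpleColorScheme:
    """Toy color scheme: maps each cluster to one fixed color."""

    def __init__(self, clusters=None, colors=None):
        self.clusters = list(clusters) if clusters is not None else []
        # Hypothetical default colors; subclasses could override this list.
        self.colors = list(colors) if colors is not None else ["red", "green", "blue"]

    def get_cluster_color(self, cluster):
        # The cluster has to be in self.clusters; .index raises ValueError otherwise.
        index = self.clusters.index(cluster)
        # Assumption: cycle through the colors if there are more clusters.
        return self.colors[index % len(self.colors)]


scheme = SimpleColorScheme(clusters=[0, 1, 2, 3])
print([scheme.get_cluster_color(c) for c in scheme.clusters])
```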

to_colormap(name='MyColorMap')[source]

Returns colormap with color for each cluster.

faded_colormap(cluster: int, nlines: int, name='MyFadedColorMap', **kwargs)[source]

Returns colormap for one cluster, including the faded colors.

Parameters
  • cluster – Name of cluster

  • nlines – Number of shades

  • name – Name of colormap

  • **kwargs – Arguments for get_cluster_colors_faded()

Returns

Colormap

demo()[source]

Plot the colors for all clusters.

Returns

figure

demo_faded(cluster: Optional[int] = None, nlines=10, **kwargs)[source]

Plot the color shades for different lines corresponding to the same cluster

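Parameters
  • cluster – Name of cluster

  • nlines – Number of shades

  • **kwargs – Arguments for get_cluster_colors_faded()

Returns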

figure

get_cluster_colors_faded(cluster: int, nlines: int, max_alpha=0.7, min_alpha=0.3)[source]

Shades of the base color, for cases where we want to draw multiple lines for one cluster

Parameters
  • cluster – Name of cluster

  • nlines – Number of shades

  • max_alpha – Maximum alpha value

  • min_alpha – Minimum alpha value

Returns

List of colors
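
One way to obtain such shades is to keep the base color and vary only its alpha value between max_alpha and min_alpha. The sketch below assumes linearly spaced alpha values and RGBA tuples as the color format; both are assumptions, and the helpers faded_alphas and faded_colors are hypothetical.

```python
def faded_alphas(nlines, max_alpha=0.7, min_alpha=0.3):
    """Alpha values from max_alpha down to min_alpha (linear spacing assumed)."""
    if nlines <= 0:
        return []
    if nlines == 1:
        return [max_alpha]
    step = (max_alpha - min_alpha) / (nlines - 1)
    return [max_alpha - i * step for i in range(nlines)]


def faded_colors(base_rgb, nlines, max_alpha=0.7, min_alpha=0.3):
    """RGBA tuples sharing one base color but with decreasing alpha."""
    return [(*base_rgb, alpha) for alpha in faded_alphas(nlines, max_alpha, min_alpha)]


for color in faded_colors((1.0, 0.0, 0.0), nlines=3):
    print(color)
```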

get_err_color(cluster: int)[source]

Get color for error shades.

Parameters

cluster – Cluster name

Returns

color