Cluster

Cluster

class clusterking.cluster.cluster.Cluster(data: clusterking.data.data.Data)[source]

Bases: object

Abstract base class of the Cluster classes. It defines common functions and is subclassed to implement specific clustering algorithms.

__init__(data: clusterking.data.data.Data)[source]
md = None

Metadata

cluster(**kwargs)[source]

Performs the clustering. This method is a wrapper around the _cluster implementation in the subclasses. Refer to the _cluster method of the respective subclass for additional arguments.

write(cluster_column='cluster')[source]

Write the results back to the Data object.
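
The interplay of cluster, _cluster and write can be pictured with a minimal sketch. It is not ClusterKing's implementation: the class names SketchCluster and ParityCluster, the clusters attribute, and the plain dict standing in for the Data object are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class SketchCluster(ABC):
    """Hypothetical stand-in for an abstract clustering base class."""

    def __init__(self, data):
        # 'data' is a plain dict of columns here, NOT a clusterking Data object.
        self.data = data
        self.clusters = None

    def cluster(self, **kwargs):
        # Public wrapper: delegate to the subclass hook and keep the result.
        self.clusters = self._cluster(**kwargs)

    @abstractmethod
    def _cluster(self, **kwargs):
        """Return one cluster label per data point."""

    def write(self, cluster_column="cluster"):
        # Store the labels under the requested column name.
        if self.clusters is None:
            raise RuntimeError("Call cluster() before write().")
        self.data[cluster_column] = list(self.clusters)


class ParityCluster(SketchCluster):
    """Toy algorithm: label each value by its remainder modulo n."""

    def _cluster(self, n=2):
        return [value % n for value in self.data["value"]]


data = {"value": [1, 2, 3, 4]}
c = ParityCluster(data)
c.cluster(n=2)
c.write(cluster_column="parity")
print(data["parity"])  # [1, 0, 1, 0]
```

Keeping the algorithm in a private hook lets every subclass share the same public cluster and write calls.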

HierarchyCluster

class clusterking.cluster.HierarchyCluster(data)[source]

Bases: clusterking.cluster.cluster.Cluster

__init__(data)[source]
metric = None

Function that, when applied to a Data or DWE object, returns the metric as a condensed distance matrix.
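
For readers unfamiliar with the term, a condensed distance matrix stores only the upper triangle of the symmetric pairwise distance matrix as a flat sequence. The sketch below illustrates that layout in pure Python; the helper names condensed_euclidean and condensed_index are hypothetical and not part of ClusterKing or SciPy.

```python
import math
from itertools import combinations


def condensed_euclidean(points):
    """Pairwise Euclidean distances in condensed (flattened upper-triangle) order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def condensed_index(n, i, j):
    """Position of the pair (i, j), with i < j, in a condensed matrix of n points."""
    return n * i - i * (i + 1) // 2 + (j - i - 1)


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
dists = condensed_euclidean(points)

# Pairs appear in the order (0, 1), (0, 2), (1, 2).
print(dists)
print(dists[condensed_index(len(points), 1, 2)])
```

For n points the condensed form has n * (n - 1) / 2 entries, which is the input format expected by scipy.cluster.hierarchy.linkage.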

set_metric(*args, **kwargs) → None[source]

Select a metric in one of the following ways:

  1. If no positional arguments are given, we choose the Euclidean metric.
  2. If the first positional argument is a string, we pick the metric defined under that name in scipy.spatial.distance.pdist (all additional arguments are passed to this function).
  3. If the first positional argument is a function, we take this function (and pass all additional arguments to it).

Examples:

  • ...(): Euclidean metric
  • ...("euclidean"): Also Euclidean metric
  • ...(lambda data: scipy.spatial.distance.pdist(data.data(), 'euclidean')): Also Euclidean metric
  • ...("minkowski", p=2): Minkowski distance with p=2.

See https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html for more information.

Parameters:
  • *args
  • **kwargs
Returns:

Function that takes a Data object as its only parameter and returns a condensed distance matrix.
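
The three ways of selecting a metric can be sketched as a small dispatch function. This is an illustration only, under several assumptions: select_metric and NAMED_DISTANCES are hypothetical names, the registry holds a few hand-written distances instead of delegating to scipy.spatial.distance.pdist, and the input is a plain list of points rather than a Data object.

```python
import math
from itertools import combinations

# Hypothetical registry of named point-to-point distances (not SciPy's list).
NAMED_DISTANCES = {
    "euclidean": lambda a, b: math.dist(a, b),
    "cityblock": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
    "minkowski": lambda a, b, p=2: sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p),
}


def select_metric(*args, **kwargs):
    """Return a function mapping a list of points to a condensed distance list."""
    if not args:
        # Case 1: no positional arguments, fall back to the Euclidean metric.
        return select_metric("euclidean")
    first, rest = args[0], args[1:]
    if isinstance(first, str):
        # Case 2: look the metric up by name; extra arguments are passed to it.
        dist = NAMED_DISTANCES[first]
        return lambda points: [
            dist(a, b, *rest, **kwargs) for a, b in combinations(points, 2)
        ]
    if callable(first):
        # Case 3: user-supplied function; extra arguments are forwarded to it.
        return lambda points: first(points, *rest, **kwargs)
    raise TypeError("Expected no argument, a metric name, or a callable.")


points = [(0.0, 0.0), (3.0, 4.0)]
print(select_metric()(points))
print(select_metric("minkowski", p=1)(points))
print(select_metric(lambda pts: [0.0 for _ in combinations(pts, 2)])(points))
```

Returning a closure in every case means later steps can call the chosen metric without caring how it was selected.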

build_hierarchy(method='complete', optimal_ordering=False) → None[source]

Build the hierarchy object.

Parameters:
  • method – See reference on scipy.cluster.hierarchy.linkage
  • optimal_ordering – See reference on scipy.cluster.hierarchy.linkage
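
To give a feel for what the default method='complete' means, here is a naive pure-Python sketch of complete-linkage agglomerative clustering. It is an assumption-laden illustration, not what scipy.cluster.hierarchy.linkage or ClusterKing does internally: the function name complete_linkage is hypothetical, and it stops at a fixed number of clusters instead of recording the full hierarchy.

```python
import math


def complete_linkage(points, n_clusters):
    """Naive agglomerative clustering with complete linkage.

    Repeatedly merges the two clusters whose farthest pair of members is
    closest, until n_clusters remain. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


points = [(0.0,), (1.0,), (10.0,), (11.0,), (25.0,)]
print(complete_linkage(points, 3))  # [[0, 1], [2, 3], [4]]
```

Other linkage methods differ only in how cluster_distance is defined, for example the minimum or the average over member pairs instead of the maximum.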

dendrogram(output: Union[None, str, pathlib.Path] = None, ax=None, show=False, **kwargs) → Optional[matplotlib.pyplot.Axes][source]

Creates a dendrogram.

Parameters:
  • output – If supplied, we save the dendrogram there
  • ax – An axes object if you want to add the dendrogram to an existing axes rather than creating a new one
  • show – If true, the dendrogram is shown in a viewer.
  • **kwargs – Additional keyword options to scipy.cluster.hierarchy.dendrogram
Returns:

The matplotlib.pyplot.Axes object

KmeansCluster

class clusterking.cluster.KmeansCluster(data)[source]

Bases: clusterking.cluster.cluster.Cluster

__init__(data)[source]