Cluster analysis


This modifier decomposes a particle system into disconnected groups of particles (clusters) based on a local neighboring criterion. The neighboring criterion can either be distance-based (i.e. a cutoff) or bond-based (graph topology).

A cluster is defined as a set of connected particles, each of which is within the (indirect) reach of one or more other particles from the same cluster. Thus, any two particles from the same cluster are connected by a continuous path consisting of steps all fulfilling the selected neighboring criterion. Conversely, two particles will not belong to the same cluster if there is no such continuous path on the neighbor network leading from one particle to the other.

You can choose between the distance-based neighbor criterion, in which case two particles are considered neighbors if they are within a specified range of each other, and the bond-based criterion, in which case two particles are considered neighbors if they are connected by a bond. A particle without any neighbors forms a single-particle cluster of its own.
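The distance-based criterion can be illustrated with a small standalone sketch. This is not OVITO's implementation: the function name `cluster_by_cutoff`, the brute-force pair loop, the absence of periodic boundary conditions, and the choice to treat a distance exactly equal to the cutoff as "within range" are all assumptions made for illustration. It uses a union-find structure to merge particles that are directly or indirectly connected.

```python
import numpy as np

def cluster_by_cutoff(positions, cutoff):
    """Label particles by connected component under a distance cutoff.

    Hypothetical helper: brute-force O(N^2), no periodic boundaries.
    Returns an array of 1-based cluster IDs, one per particle.
    """
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    parent = list(range(n))  # union-find forest; each root represents one cluster

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the clusters of every pair of particles within the cutoff.
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(positions[i] - positions[j]) <= cutoff:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Assign consecutive IDs starting at 1, in order of first appearance.
    ids = {}
    labels = np.empty(n, dtype=int)
    for i in range(n):
        labels[i] = ids.setdefault(find(i), len(ids) + 1)
    return labels

# Particles 0 and 2 are 2.0 apart (beyond the cutoff) but both neighbor particle 1.
points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
labels = cluster_by_cutoff(points, cutoff=1.5)
print(labels)
```

In this example the first three particles end up in one cluster because a path of neighbor steps links them, while the isolated fourth particle forms a single-particle cluster of its own.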

The modifier assigns numeric IDs to the clusters it forms (ranging from 1 to N, the total number of clusters). Each particle is assigned to one of these clusters and this information is output by the modifier as a new particle property named Cluster. Note that the numbering of clusters is arbitrary by default and depends on the order in which input particles are stored. You can activate the Sort clusters by size option to request an ordering of cluster IDs by number of contained particles. This guarantees that the first cluster (ID 1) will be the one with the largest number of particles.
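The effect of ordering cluster IDs by size can be sketched with NumPy on a plain array of cluster IDs. The helper name `sort_clusters_by_size` is hypothetical, and the tie-breaking rule (equally sized clusters keep their original relative order) is an assumption; the modifier's own tie-breaking behavior is not specified here.

```python
import numpy as np

def sort_clusters_by_size(labels):
    """Renumber cluster IDs so that ID 1 is the largest cluster.

    Hypothetical helper; ID 0 (if present) is left unchanged.
    """
    labels = np.asarray(labels)
    sizes = np.bincount(labels)         # sizes[k] = number of particles with ID k
    old_ids = np.arange(1, len(sizes))  # the real cluster IDs 1..N
    # Stable sort by descending size, so ties keep their original order.
    order = old_ids[np.argsort(-sizes[1:], kind="stable")]
    # mapping[old_id] = new_id; entry 0 stays 0.
    mapping = np.zeros(len(sizes), dtype=int)
    mapping[order] = np.arange(1, len(sizes))
    return mapping[labels]

arbitrary_ids = [1, 2, 2, 3, 3, 3, 0]
sorted_ids = sort_clusters_by_size(arbitrary_ids)
print(sorted_ids)
```

After renumbering, the three-particle cluster carries ID 1, the two-particle cluster ID 2, and the single-particle cluster ID 3.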


Neighbor mode

Selects the criterion which is used to determine whether two particles are neighbors or not.

Cutoff distance

The range up to which two particles are considered neighbors.

Use only selected particles

This option restricts the clustering algorithm to the currently selected particles. Unselected particles are treated as if they did not exist and are assigned the special cluster ID 0.

Sort clusters by size

This option sorts the clusters by size (in descending order). Cluster ID 1 will be the largest cluster, cluster ID 2 the second largest, and so on.

Exporting the modifier's results

Total number of clusters

To export the total number of clusters generated by the modifier to a text file (possibly as a function of time), use OVITO's standard file export function. Choose "Table of values" as output format and make sure that the ClusterAnalysis.cluster_count global value, emitted by the modifier to the data pipeline, gets exported.

Cluster particles

To export the list of particles belonging to each individual cluster, also use OVITO's standard file export function. Choose the XYZ output file format and select the Cluster property for export. This will produce a text file with the cluster ID of each particle.

Cluster sizes

Determining the size (i.e. the number of particles) of all clusters generated by the Cluster Analysis modifier requires the Python scripting interface of OVITO. You can copy and paste the following Python script into a .py file and execute it using the Scripting → Run Script File menu item:

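import ovito
import numpy

# Adjust this path as needed.
output_filepath = "cluster_sizes.txt"
# Evaluate the pipeline currently selected in OVITO to obtain its output data.
data = ovito.dataset.selected_pipeline.compute()
# Count the particles per cluster ID; entry k of the result is the size of cluster k.
cluster_sizes = numpy.bincount(data.particles['Cluster'])
# Write one cluster size per line.
numpy.savetxt(output_filepath, cluster_sizes)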

You should adjust the output file path in the script as needed. The script uses the NumPy function bincount() to count the number of particles belonging to each cluster. Note that the array returned by this function starts with an entry for cluster ID 0, which is normally not assigned by the modifier and therefore typically has size zero. For more information on OVITO's scripting interface, see this section.
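The indexing of the bincount() result can be checked on a small hand-written array, without OVITO. The sample cluster IDs below are made up for illustration; the sketch also shows one possible way to drop the entry for ID 0 and pair each remaining size with its cluster ID.

```python
import numpy as np

# Made-up cluster IDs for six particles (no particle has ID 0).
cluster_ids = np.array([1, 1, 2, 3, 3, 3])

# Entry k of the result is the number of particles with cluster ID k.
cluster_sizes = np.bincount(cluster_ids)

# Skip the leading entry for ID 0 and build a two-column (ID, size) table.
ids = np.arange(1, len(cluster_sizes))
table = np.column_stack((ids, cluster_sizes[1:]))
print(cluster_sizes)
print(table)
```

A table of this shape could be passed to numpy.savetxt() in place of the raw bincount() array if an explicit cluster ID column is wanted in the output file.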

It is possible to perform the file export for every frame in a simulation sequence by adding a for-loop to the script:

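import ovito
import numpy

# Loop over animation frames 0 through last_frame.
for frame in range(ovito.dataset.anim.last_frame + 1):
    # Write a separate output file for each frame.
    output_filepath = "cluster_sizes.%i.txt" % frame
    # Evaluate the selected pipeline at the given frame.
    data = ovito.dataset.selected_pipeline.compute(frame)
    cluster_sizes = numpy.bincount(data.particles['Cluster'])
    numpy.savetxt(output_filepath, cluster_sizes)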

See also

ClusterAnalysisModifier (Python API)