Data model
OVITO organizes the information it processes into data objects, each representing a specific fragment of a dataset.
For example, a dataset may be composed of a SimulationCell object holding the box dimensions and boundary conditions, a Particles object storing information associated with the particles, and a Bonds sub-object storing the list of bonds between particles. For each type of data object you will find a corresponding Python class in the ovito.data module.
All of them derive from one common base class: DataObject.
Data objects can contain other data objects, forming a nested structure with parent-child relationships.
For example, the Particles object is a container, which manages a number of Property objects, each being an array of property values associated with the particles.
Furthermore, the Particles object can also contain a Bonds object, which in turn is a container for the Property objects storing the per-bond property values.
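The nesting described above can be pictured with a small toy model. This is only a sketch: ToyObject and outline are hypothetical helpers written for this illustration and are not classes from ovito.data, and the property name 'Topology' is assumed here as an example of a per-bond property.

```python
from dataclasses import dataclass, field

# Toy stand-in for illustration only; NOT a class from ovito.data.
@dataclass
class ToyObject:
    name: str
    children: list = field(default_factory=list)

def outline(obj, depth=0):
    """Return one indented line per object in the hierarchy, parents first."""
    lines = ["  " * depth + obj.name]
    for child in obj.children:
        lines.extend(outline(child, depth + 1))
    return lines

# Bonds is itself a container that lives inside the Particles container.
bonds = ToyObject("Bonds", [ToyObject("Property 'Topology'")])
particles = ToyObject("Particles", [
    ToyObject("Property 'Position'"),
    ToyObject("Property 'Particle Type'"),
    bonds,
])
collection = ToyObject("DataCollection", [ToyObject("SimulationCell"), particles])

print("\n".join(outline(collection)))
```

Each level of indentation in the printed outline corresponds to one parent-child relationship.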
The DataCollection class is always at the topmost level of this nested object hierarchy.
It is the fundamental unit representing a complete dataset that was loaded from one or more input simulation files, and which then gets processed by the modifier steps of a data pipeline. Modifiers may alter individual data objects within the DataCollection, add new data objects to the top-level collection, or create additional sub-objects in nested container objects.
When you call the Pipeline.compute() method, you receive back a DataCollection holding the computation results of the pipeline. The DataCollection class provides various property fields for accessing the different kinds of sub-objects it contains.
It is important to note that a DataCollection object represents just a single animation frame and not an entire simulation trajectory. In OVITO’s data model, a simulation trajectory is instead represented as a series of DataCollection instances. A data pipeline operates on and produces only a single DataCollection at a time, i.e., it works on a frame-by-frame basis.
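The frame-by-frame idea can be sketched without OVITO at all. In the toy code below, compute_frame is a hypothetical stand-in for a pipeline evaluation such as Pipeline.compute(), the plain dictionaries stand in for DataCollection instances, and the trajectory length and coordinates are made up.

```python
# Hypothetical stand-in for a pipeline: each call evaluates exactly one frame
# and returns a fresh, independent dataset for it.
def compute_frame(frame):
    positions = [(float(frame), 0.0, 0.0), (0.0, float(frame), 0.0)]
    return {"frame": frame, "positions": positions}

num_frames = 3  # assumed trajectory length

# A trajectory is simply the sequence of per-frame results.
trajectory = [compute_frame(frame) for frame in range(num_frames)]

for data in trajectory:
    print(data["frame"], len(data["positions"]))
```

No single object in this sketch holds the whole trajectory; code that needs several frames has to request them one after another.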
Particles
The Particles data object, accessible via the DataCollection.particles field, holds all particle or molecule-related data. OVITO uses a property-centered representation of particles, where information is stored as a set of uniform memory arrays of the same length. Each array represents one particle property, such as position, type, mass, color, etc., and holds the values for all N particles in the system. A property data array is an instance of the Property data object class, which OVITO uses for storing not only particle properties but also bond properties, voxel grid properties, and more.
Thus, a system of particles is nothing more than a loose collection of Property objects, which are held together by a container, the Particles object, a specialization of the generic PropertyContainer base class. Each particle property has a unique name that identifies the meaning of the property. OVITO defines a set of standard property names, which have a specific meaning to the program and a prescribed data format.
The Position standard property, for example, holds the XYZ coordinates of all particles and is mandatory. Other standard properties, such as Color or Mass, are optional and may or may not be present in a Particles container.
Furthermore, Property objects with non-standard names are supported, representing user-defined particle properties.
The Particles container object mimics the programming interface of a Python dictionary, which lets you look up properties by name.
To find out which properties are present, you can query the dictionary for its keys:
>>> data = pipeline.compute()
>>> list(data.particles.keys())
['Particle Identifier', 'Particle Type', 'Position', 'Color']
Individual particle properties can be looked up by their name:
>>> color_property = data.particles['Color']
Some standard properties can also be accessed through convenient getter fields, for example, the Particles.colors field:
>>> color_property = data.particles.colors
The Particles class is a sub-class of the generic PropertyContainer base class. OVITO defines several property container types, such as the Bonds, DataTable, and VoxelGrid types, which all work like the Particles type.
They all have in common that they represent an array of uniform data elements, which may be associated with an arbitrary set of properties.
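To make the container idea concrete, here is a minimal sketch of a dictionary-like container whose arrays must all have the same length. ToyContainer and its add method are hypothetical and written only for this illustration; they do not reproduce the PropertyContainer API, and the particle values are made up.

```python
from collections.abc import Mapping

class ToyContainer(Mapping):
    """Toy property container: named arrays that all share one length."""

    def __init__(self, count):
        self.count = count          # number of data elements, e.g. particles
        self._properties = {}

    def add(self, name, values):
        # Every property must provide exactly one value per data element.
        if len(values) != self.count:
            raise ValueError(f"{name!r} needs {self.count} values, got {len(values)}")
        self._properties[name] = list(values)

    def __getitem__(self, name):
        return self._properties[name]

    def __iter__(self):
        return iter(self._properties)

    def __len__(self):
        return len(self._properties)

particles = ToyContainer(count=3)
particles.add("Position", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
particles.add("Mass", [1.0, 1.0, 16.0])

print(list(particles.keys()))   # names of the stored properties
print(particles["Mass"])        # lookup by name
print("Color" in particles)     # optional properties may be absent
```

Deriving from collections.abc.Mapping supplies keys(), the in operator, and get() from the three methods defined above, which is one simple way to obtain a read-only dictionary interface.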
Property objects
A PropertyContainer manages a variable set of Property objects, each Property storing the values for one particular property of all data elements in an array. A Property object behaves pretty much like a standard NumPy array:
>>> coordinates = data.particles.positions
>>> print(coordinates[...])
[[ 73.24230194 -5.77583981 -0.87618297]
[-49.00170135 -35.47610092 -27.92519951]
[-50.36349869 -39.02569962 -25.61310005]
...,
[ 42.71210098 59.44919968 38.6432991 ]
[ 42.9917984 63.53770065 36.33330154]
[ 44.17670059 61.49860001 37.5401001 ]]
Property arrays can be one-dimensional (in the case of scalar properties) or two-dimensional (in the case of vector properties).
The size of the first array dimension is always equal to the number of data elements (e.g. particles) stored in the parent PropertyContainer.
The container reports the current number of elements via its count attribute:
>>> data.particles.count # Number of particles
28655
>>> data.particles['Mass'].shape # 1-dim. array
(28655,)
>>> data.particles['Color'].shape # 2-dim. array
(28655, 3)
>>> data.particles['Color'].dtype # Data type of property array
float64
OVITO currently supports three different numeric data types for property arrays: float64, int32, and int64. For built-in standard properties, the data type and the dimensionality are prescribed by OVITO. For user-defined properties, they can be chosen by the user when creating a new property.
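The shape and data type rules above can be summarized in a few lines of plain Python. The helper names expected_shape and check_user_property are hypothetical, and the set of allowed types simply mirrors the three types named in the text; none of this is OVITO code.

```python
ALLOWED_DTYPES = {"float64", "int32", "int64"}  # the three types named in the text

def expected_shape(count, components):
    """Shape of a property array holding `count` data elements.

    Scalar properties (one component) are 1-dim.; vector properties are 2-dim.
    """
    return (count,) if components == 1 else (count, components)

def check_user_property(dtype, components):
    """Validate the choices made for a user-defined property."""
    if dtype not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported data type: {dtype}")
    if components < 1:
        raise ValueError("a property needs at least one component")

check_user_property("float64", 3)
print(expected_shape(28655, 1))   # scalar property such as Mass
print(expected_shape(28655, 3))   # vector property such as Color
```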
Global attributes
Global attributes are simple tokens of information associated with a DataCollection as a whole, organized as key-value pairs in the Python dictionary DataCollection.attributes.
File readers automatically generate certain global attributes at the source of a data pipeline to associate the imported dataset with relevant information, such as the current simulation timestep number or the name of the input file.
In the graphical user interface of OVITO you can inspect the current set of global attributes by opening the Data Inspector panel.
Modifiers in a data pipeline may associate a DataCollection with additional attributes to report their computation results. For example, the ClusterAnalysisModifier outputs the attribute named ClusterAnalysis.cluster_count, which reflects the total number of particle clusters that have been found by the clustering algorithm at the current timestep. OVITO provides functions to export such attributes to an output text file, or to embed them in rendered images and animations as a dynamic TextLabelOverlay.
Please refer to the DataCollection.attributes documentation for more on global attributes.
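Because attributes are plain key-value pairs, collecting them over several frames and writing them out as text is straightforward. The sketch below uses ordinary dictionaries in place of DataCollection.attributes. The key ClusterAnalysis.cluster_count is the one named above; the key 'Timestep', the numbers, and the to_text helper are assumptions made for illustration and are not OVITO's export function.

```python
# Illustrative values only: one attribute dictionary per animation frame.
frames = [
    {"Timestep": 0,    "ClusterAnalysis.cluster_count": 12},
    {"Timestep": 1000, "ClusterAnalysis.cluster_count": 9},
]

columns = ["Timestep", "ClusterAnalysis.cluster_count"]

def to_text(frames, columns):
    """Format the selected attributes as a whitespace-separated text table."""
    lines = ["# " + " ".join(columns)]
    for attributes in frames:
        lines.append(" ".join(str(attributes[key]) for key in columns))
    return "\n".join(lines)

print(to_text(frames, columns))
```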
Data tables
Tabulated data is represented in OVITO by DataTable objects, which are a specialized type of PropertyContainer.
A DataTable consists of a variable number of rows and columns. Each column is an instance of the Property class.
Data tables are typically generated dynamically by modifiers performing computations, for example, the HistogramModifier or the CommonNeighborAnalysisModifier, to store their results. In the graphical user interface of OVITO, data tables are rendered as graphs or charts (line, scatter, histogram plots), found in the data inspector panel.
In Python, all data tables generated by the modifiers in the current pipeline can be accessed from the DataCollection.tables dictionary returned by the pipeline. Each table has a unique identifier string, which serves as lookup key in that dictionary.
Furthermore, OVITO provides export functions for writing data tables to an output text file. Please see the DataTable class for further details.
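As a rough picture of a table made of equal-length columns and stored under an identifier string, consider the following toy histogram. The function histogram_table, the column names, the input values, and the identifier 'mass-histogram' are all made up for this sketch and do not correspond to what any OVITO modifier produces.

```python
def histogram_table(values, bin_count, lower, upper):
    """Build a toy table with two equal-length columns: bin centers and counts."""
    width = (upper - lower) / bin_count
    counts = [0] * bin_count
    for value in values:
        if lower <= value < upper:
            # Clamp the index to guard against floating-point round-off.
            index = min(int((value - lower) / width), bin_count - 1)
            counts[index] += 1
    centers = [lower + (i + 0.5) * width for i in range(bin_count)]
    return {"Bin center": centers, "Count": counts}

# Tables are stored under identifier strings; 'mass-histogram' is a made-up key.
tables = {"mass-histogram": histogram_table([1.0, 1.0, 16.0, 12.0], 4, 0.0, 20.0)}

table = tables["mass-histogram"]
for center, count in zip(table["Bin center"], table["Count"]):
    print(center, count)
```

Each row of the table corresponds to one histogram bin, and each column plays the role that a Property plays in a DataTable.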
Surface meshes
OVITO can import or generate surface mesh data structures for visualization purposes and other applications. For instance, the ConstructSurfaceModifier can be inserted into a data pipeline to construct a triangulated surface mesh representing the spatial region filled with particles. The output of this modifier is a SurfaceMesh object, which holds the vertices and faces of the mesh. See also the corresponding section in the user manual.
In Python, surface meshes generated by the modifiers in the current pipeline can be accessed from the DataCollection.surfaces dictionary returned by the pipeline. Each surface mesh has a unique identifier string, which serves as lookup key in that dictionary.
Please see the SurfaceMesh class for further details.
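The vertex and face representation of a triangulated mesh can be illustrated with a tiny closed surface. The code below is a generic sketch, not the SurfaceMesh API: the tetrahedron, the dictionary layout, the identifier 'toy-surface', and the triangle_area helper are all assumptions chosen for the example.

```python
import math

# Toy mesh: a tetrahedron given by vertex coordinates and triangular faces,
# where each face lists three indices into the vertex list.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

def triangle_area(a, b, c):
    """Area of a triangle: half the length of the cross product of two edges."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))

# Meshes are looked up by identifier; 'toy-surface' is a made-up key.
surfaces = {"toy-surface": {"vertices": vertices, "faces": faces}}

mesh = surfaces["toy-surface"]
area = sum(triangle_area(*(mesh["vertices"][i] for i in face))
           for face in mesh["faces"])
print(round(area, 4))
```

Storing faces as index triples rather than as coordinates lets neighboring triangles share vertices, which is the usual motivation for this layout.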