Data pipelines
Modifiers are composable function objects arranged in a sequence to form a data processing pipeline. They dynamically modify, filter, analyze or extend the data that flows down the pipeline. Here, by data we mean any form of information that OVITO can process, e.g., particles and their properties, bonds, the simulation cell, triangle meshes, voxel data, etc. The main purpose of the pipeline concept is to enable non-destructive and repeatable workflows, i.e., once a modification pipeline has been set up, it can be re-used on multiple input datasets.
A processing pipeline is represented by an instance of the Pipeline class in OVITO.
Initially, a pipeline contains no modifiers. That means its output will be identical to its input. A pipeline’s input data is provided by a separate source object that is attached to the pipeline. Typically, this source object is an instance of the FileSource class, which reads the input data from an external data file.
You can insert a modifier into a Pipeline by creating a new instance of the corresponding modifier type (see the ovito.modifiers module for all available modifier types) and then adding it to the pipeline’s modifiers list:
from ovito.modifiers import AssignColorModifier
modifier = AssignColorModifier(color = (0.5, 1.0, 0.0))
pipeline.modifiers.append(modifier)
The modifiers in the Pipeline.modifiers list are executed in sequential order: Appending a modifier to the end of the list makes it the last one to process the data that flows down the pipeline. In other words, it will only see data that has already been processed and modified by the preceding modifiers in the list.
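The ordering rule can be illustrated without OVITO itself. The following is a minimal sketch in plain Python, not OVITO's implementation: `run_pipeline`, `assign_color` and `report_color` are hypothetical stand-ins, and a dictionary plays the role of the data flowing down the pipeline.

```python
def assign_color(data):
    # Stand-in modifier: returns a modified copy and leaves its input untouched.
    return {**data, "color": (0.5, 1.0, 0.0)}

def report_color(data):
    # Stand-in modifier whose result depends on what earlier modifiers produced.
    return {**data, "seen_color": data.get("color")}

def run_pipeline(source, modifiers):
    # Apply the modifiers one after another, in list order.
    data = dict(source)
    for modifier in modifiers:
        data = modifier(data)
    return data

source = {"positions": [(0.0, 0.0, 0.0)]}

# report_color runs last, so it sees the color assigned before it:
after = run_pipeline(source, [assign_color, report_color])

# report_color runs first, so no color has been assigned yet when it runs:
before = run_pipeline(source, [report_color, assign_color])
```

Swapping the two entries changes what `report_color` observes, while the source dictionary itself is never altered, which mirrors the non-destructive character of the pipeline concept.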
Note that inserting a new modifier into the pipeline does not immediately trigger a new computation of the pipeline results. This happens only when pipeline results are requested, either by you or the system. For example, evaluation of the pipeline may be triggered implicitly when rendering an image or movie, updating the interactive viewports in OVITO’s graphical user interface, or exporting data using the ovito.io.export_file() function.
You can explicitly request an evaluation of a pipeline by calling its compute() method. This method returns a new DataCollection object holding the data that has left the pipeline after all modifiers currently in the pipeline have processed the input data:
>>> data = pipeline.compute()
The Data model section will take a closer look at the data structure returned by this method.
Note that it is possible to change an existing pipeline and the parameters of its modifiers at any time. Such changes do not immediately trigger a recomputation of the pipeline results (unlike in the graphical user interface, where changing a modifier’s parameters lets OVITO immediately recompute the results and update the interactive viewports). In a Python script, you have to call the pipeline’s compute() method again after making a change in order to request a new evaluation of the modifiers in the pipeline:
from ovito.io import import_file
from ovito.modifiers import AssignColorModifier, CoordinationAnalysisModifier
# Set up a new pipeline containing one modifier:
pipeline = import_file("simulation.dump")
pipeline.modifiers.append(AssignColorModifier(color = (0.5, 1.0, 0.0)))
# Evaluate the current pipeline a first time:
data1 = pipeline.compute()
# Now alter the pipeline, e.g. by changing parameters or appending modifiers:
pipeline.modifiers[0].color = (0.8, 0.8, 1.0)
pipeline.modifiers.append(CoordinationAnalysisModifier(cutoff = 5.0))
# Evaluate the pipeline a second time, now yielding new results:
data2 = pipeline.compute()
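The deferred-evaluation behavior described above can be mimicked with a small toy class. This is only a sketch under simplifying assumptions: `ToyPipeline` is a hypothetical stand-in written for this illustration, not the OVITO Pipeline class, and its modifiers are plain functions operating on a list of numbers.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline; not part of the OVITO API."""

    def __init__(self, source):
        self.source = source
        self.modifiers = []
        self.evaluations = 0  # counts how often compute() has actually run

    def compute(self):
        # All work happens here, and only when results are requested.
        self.evaluations += 1
        data = list(self.source)
        for modifier in self.modifiers:
            data = modifier(data)
        return data

toy = ToyPipeline([1, 2, 3])
toy.modifiers.append(lambda values: [2 * v for v in values])
evaluations_after_edit = toy.evaluations  # appending alone computed nothing

first = toy.compute()   # first evaluation, with one modifier
toy.modifiers.append(lambda values: [v + 1 for v in values])
second = toy.compute()  # second evaluation, now with two modifiers
```

Editing the modifier list never runs any computation by itself, and the object returned by the first call keeps its old contents after the pipeline has been changed, just as `data1` and `data2` are separate results in the example above.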
Processing simulation trajectories
As mentioned in the File I/O section, it is possible to import a simulation trajectory consisting of a sequence of frames. A pipeline typically processes one frame of the sequence at a time. You can request the pipeline results for a specific simulation frame by passing the frame number to the pipeline’s compute() method, e.g.:
pipeline = import_file("trajectory_*.dump")
data_frame0 = pipeline.compute(0)
data_frame1 = pipeline.compute(1)
data_frame2 = pipeline.compute(2)
...
The numbering of animation frames starts at 0 in OVITO. Typically, a for-loop of the following form is used to iterate over all frames of a simulation sequence:
for frame in range(pipeline.num_frames):
    data = pipeline.compute(frame)
    ...
Alternatively, you can use the frames property of the pipeline to iterate over all frames of the simulation sequence. This iterator gives you direct access to the computed DataCollection:
for data in pipeline.frames:
    ...
The num_frames property of the pipeline tells you how many frames the input trajectory contains.
Note
When using a Pipeline in a loop to process a sequence of simulation frames, make sure you do not populate the pipeline with modifiers inside the loop. Repeatedly adding modifiers to the pipeline as part of a for-loop is usually wrong:
# WRONG!!!
for data in pipeline.frames:
    pipeline.modifiers.append(AtomicStrainModifier(cutoff = 3.2))
    ...
Since the loop body gets executed multiple times, this code keeps appending additional modifiers to the pipeline, making it longer and longer with every iteration. As a result, several AtomicStrainModifier instances end up in the pipeline, each performing the same computation over and over again when compute() is called.
Instead, you should completely set up the pipeline just once before entering the loop:
# CORRECT:
# Step I: Populate the pipeline with modifiers:
pipeline.modifiers.append(AtomicStrainModifier(cutoff = 3.2))
# Step II: Evaluate the pipeline in a loop over all frames:
for data in pipeline.frames:
    ...
Note that it is sometimes necessary (and valid) to update parameters of existing modifiers inside the for-loop.
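One way to picture such a valid in-loop update is the sketch below. `evaluate_with_schedule` is a hypothetical helper written for this illustration, not an OVITO function. It only assumes that the pipeline object offers the compute() method described above and that the modifier exposes the named parameter as a writable attribute, the way `color` was changed on an existing modifier earlier in this section.

```python
def evaluate_with_schedule(pipeline, modifier, parameter, values):
    """Evaluate frame i with `parameter` of an existing modifier set to values[i].

    The modifier must already be part of the pipeline. Nothing is appended
    here, so the pipeline keeps the same length throughout the loop.
    """
    results = []
    for frame, value in enumerate(values):
        # Update the existing modifier in place instead of appending a new one.
        setattr(modifier, parameter, value)
        results.append(pipeline.compute(frame))
    return results
```

With a pipeline set up as in the CORRECT example, a call might look like `evaluate_with_schedule(pipeline, pipeline.modifiers[0], "cutoff", [3.0, 3.2, 3.4])`, which evaluates the first three frames with a different cutoff each time while the pipeline still contains a single modifier.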