Data pipelines

Modifiers are composable function objects arranged in a sequence to form a data processing pipeline. They dynamically modify, filter, analyze, or extend the data that flows down the pipeline. Here, data means any form of information that OVITO can process, e.g., particles and their properties, bonds, the simulation cell, triangle meshes, voxel data, etc. The main purpose of the pipeline concept is to enable non-destructive and repeatable workflows, i.e., once a modification pipeline has been set up, it can be reused on multiple input datasets.

A processing pipeline is represented by an instance of the Pipeline class in OVITO. Initially, a pipeline contains no modifiers. That means its output will be identical to its input. A pipeline’s input data is provided by a separate source object that is attached to the pipeline. Typically, this source object is an instance of the FileSource class, which reads the input data from an external data file.

You can insert a modifier into a Pipeline by creating a new instance of the corresponding modifier type (see the ovito.modifiers module for all available modifier types) and then adding it to the pipeline’s modifiers list:

from ovito.modifiers import AssignColorModifier

# `pipeline` is assumed to be an existing Pipeline, e.g. one returned by import_file():
modifier = AssignColorModifier(color = (0.5, 1.0, 0.0))
pipeline.modifiers.append(modifier)

The modifiers in the Pipeline.modifiers list are executed in sequential order: Appending a modifier to the end of the list makes it the last one to process the data that flows down the pipeline. In other words, it will only see data that has already been processed and modified by the preceding modifiers in the list.
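Why the order of the list matters can be illustrated with a toy model in plain Python. This is only a sketch and does not use OVITO: `run_pipeline`, `add_one`, and `double` are hypothetical names standing in for a pipeline evaluation and two modifiers.

```python
def run_pipeline(source, modifiers):
    """Pass the source data through each modifier in list order."""
    data = list(source)          # copy so the input stays untouched
    for modify in modifiers:
        data = modify(data)      # each modifier sees the previous one's output
    return data


def add_one(values):
    return [v + 1 for v in values]


def double(values):
    return [v * 2 for v in values]


source = [1, 2, 3]
result_a = run_pipeline(source, [add_one, double])   # computes (v + 1) * 2
result_b = run_pipeline(source, [double, add_one])   # computes (v * 2) + 1
print(result_a)  # [4, 6, 8]
print(result_b)  # [3, 5, 7]
```

The same two modifiers yield different outputs depending on their position in the list, and an empty list returns the input unchanged.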

../_images/Pipeline.svg

Note that inserting a new modifier into the pipeline does not immediately trigger a new computation of the pipeline results. This happens only when pipeline results are requested, either by you or the system. For example, evaluation of the pipeline may be triggered implicitly when

  • rendering an image or movie,

  • updating the interactive viewports in OVITO’s graphical user interface,

  • or exporting data using the ovito.io.export_file() function.

You can explicitly request an evaluation of a pipeline by calling its compute() method. This method returns a new DataCollection object holding the data that has left the pipeline after all modifiers currently in the pipeline have processed the input data:

>>> data = pipeline.compute()

The Data model section will take a closer look at the data structure returned by this function.

Note that you can change an existing pipeline and the parameters of its modifiers at any time. Such changes do not immediately trigger a recomputation of the pipeline results (unlike in the graphical user interface, where changing a modifier’s parameters causes OVITO to recompute the results and update the interactive viewports right away). In a Python script, you have to call the pipeline’s compute() method again to request a new evaluation of the modifiers after making a change to the pipeline:

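from ovito.io import import_file
from ovito.modifiers import AssignColorModifier, CoordinationAnalysisModifier

# Set up a new pipeline containing one modifier: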
pipeline = import_file("simulation.dump")
pipeline.modifiers.append(AssignColorModifier(color = (0.5, 1.0, 0.0)))

# Evaluate the current pipeline a first time:
data1 = pipeline.compute()

# Now alter the pipeline, e.g. by changing parameters or appending modifiers:
pipeline.modifiers[0].color = (0.8, 0.8, 1.0)
pipeline.modifiers.append(CoordinationAnalysisModifier(cutoff = 5.0))

# Evaluate the pipeline a second time, now yielding new results:
data2 = pipeline.compute()
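The on-demand evaluation described above can be mimicked with a small toy model in plain Python. This is a sketch under stated assumptions, not OVITO's implementation: `ToyPipeline` and its `evaluations` counter are hypothetical and exist only to make visible when a computation actually happens.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline; not part of OVITO."""

    def __init__(self, source):
        self.source = source
        self.modifiers = []
        self.evaluations = 0     # counts how often compute() has run

    def compute(self):
        self.evaluations += 1
        data = list(self.source)             # work on a copy of the input
        for modify in self.modifiers:
            data = modify(data)
        return data


pipeline = ToyPipeline([1.0, 2.0, 3.0])
pipeline.modifiers.append(lambda values: [v * 10 for v in values])
evaluations_after_append = pipeline.evaluations   # still 0: nothing computed yet

data1 = pipeline.compute()
pipeline.modifiers.append(lambda values: [v + 1 for v in values])
# data1 is unaffected by the change; a new compute() call is needed.
data2 = pipeline.compute()

print(evaluations_after_append, pipeline.evaluations)  # 0 2
print(data1)  # [10.0, 20.0, 30.0]
print(data2)  # [11.0, 21.0, 31.0]
```

In this toy model, appending a modifier only changes the list; results are produced exclusively by explicit compute() calls, and previously returned results keep their old values.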

Processing simulation trajectories

As mentioned in the File I/O section, you can import a simulation trajectory consisting of a sequence of frames. A pipeline typically processes one frame of the sequence at a time. You can request the pipeline results for a specific simulation frame by passing the frame number to the pipeline’s compute() method, e.g.:

from ovito.io import import_file

pipeline = import_file("trajectory_*.dump")
data_frame0 = pipeline.compute(0)
data_frame1 = pipeline.compute(1)
data_frame2 = pipeline.compute(2)
...

The numbering of animation frames starts at 0 in OVITO. Typically, a for-loop of the following form is used to iterate over all frames of a simulation sequence:

for frame in range(pipeline.source.num_frames):
    data = pipeline.compute(frame)
    ...

The num_frames property of the pipeline source tells you how many frames the input trajectory contains.

Note

When using a Pipeline in a loop to process a sequence of simulation frames, make sure you do not populate the pipeline with modifiers inside the loop. Repeatedly adding modifiers to the pipeline as part of a for-loop is usually wrong:

# WRONG!!!
for frame in range(pipeline.source.num_frames):
    pipeline.modifiers.append(AtomicStrainModifier(cutoff = 3.2))
    data = pipeline.compute(frame)
    ...

Since the loop body gets executed multiple times, this code keeps appending additional modifiers to the pipeline, making it longer and longer with every iteration. As a result, several AtomicStrainModifier instances end up in the pipeline, each performing the same computation over and over again when compute() is called. Instead, you should completely set up the pipeline just once before entering the loop:

# CORRECT:
# Step I: Populate the pipeline with modifiers:
pipeline.modifiers.append(AtomicStrainModifier(cutoff = 3.2))

# Step II: Evaluate the pipeline in a loop over all frames:
for frame in range(pipeline.source.num_frames):
    data = pipeline.compute(frame)
    ...

Note that it is sometimes necessary (and valid) to update parameters of existing modifiers inside the for-loop.
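The cost of the wrong pattern, and the valid case of updating a parameter inside the loop, can be made concrete with a toy model in plain Python. This sketch does not use OVITO: `Scale`, `compute`, and the four-element `frames` list are hypothetical stand-ins for a modifier, a pipeline evaluation, and a trajectory.

```python
class Scale:
    """Hypothetical modifier with one adjustable parameter."""
    calls = 0                    # total number of modifier executions

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        Scale.calls += 1
        return [v * self.factor for v in values]


def compute(frames, modifiers, frame):
    """Evaluate all modifiers, in order, on one frame."""
    data = list(frames[frame])
    for modify in modifiers:
        data = modify(data)
    return data


frames = [[1.0], [2.0], [3.0], [4.0]]   # four toy "frames"

# Wrong pattern: the modifier list grows with every iteration.
modifiers = []
for frame in range(len(frames)):
    modifiers.append(Scale(2.0))
    compute(frames, modifiers, frame)
wrong_calls = Scale.calls        # 1 + 2 + 3 + 4 = 10 executions

# Correct pattern: set up once, then only update parameters in the loop.
Scale.calls = 0
scale = Scale(1.0)
modifiers = [scale]
results = []
for frame in range(len(frames)):
    scale.factor = frame + 1     # per-frame parameter update is fine
    results.append(compute(frames, modifiers, frame))
right_calls = Scale.calls        # one execution per frame = 4

print(wrong_calls, right_calls)  # 10 4
print(results)                   # [[1.0], [4.0], [9.0], [16.0]]
```

For N frames, the wrong pattern executes N(N+1)/2 modifier evaluations in this model, while the correct pattern executes N, even though the modifier's parameter changes from frame to frame.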