Advanced topics

This section covers several advanced topics related to OVITO’s scripting interface:

Saving and loading pipelines

Controlling how many processor cores OVITO uses

Using OVITO with Python’s multiprocessing module

Packaging and installation of user extensions for OVITO

Saving and loading pipelines

The ovito.io.export_file() function lets you save the computed output of a pipeline to disk. But how do you save the definition of the pipeline itself, including the modifiers and their parameters, to a file? OVITO can save the entire visualization scene to a .ovito state file using the Scene.save() method. Thus, in order to save a Pipeline you need to first make it part of the scene by calling its add_to_scene() method:

import ovito
from ovito.io import import_file
from ovito.modifiers import CoordinationAnalysisModifier

pipeline = import_file("input/simulation.dump")
pipeline.modifiers.append(CoordinationAnalysisModifier(cutoff = 3.4))
# ... 
pipeline.add_to_scene()
ovito.scene.save("output/mypipeline.ovito")

The saved state can be restored by calling the Scene.load() method. This function loads all saved data pipelines and makes them available in the Scene.pipelines list:

import ovito

ovito.scene.load("output/mypipeline.ovito")
pipeline = ovito.scene.pipelines[0]
data = pipeline.compute()

Of course, it is also possible to open the .ovito state file with the graphical OVITO application or, conversely, to use the graphical application to create a data pipeline by hand and then save that pipeline to a .ovito file for future use.

You can either load the state file from a script at runtime using the Scene.load() method or preload the state file when running the script using the ovitos integrated interpreter. This is done by specifying the -o command line option:

ovitos -o mypipeline.ovito script.py

The code in script.py will now be executed in an environment where the Scene was already populated with the state loaded from the .ovito scene file. Instead of setting up a new pipeline from scratch, the script can now work with the existing pipeline(s) that were restored from the state file:

import ovito

pipeline = ovito.scene.pipelines[0]

# Replace the data file that serves as pipeline input with a different one:
pipeline.source.load("input/second_file.dump") 
# Adjust modifier parameters of the existing pipeline:
pipeline.modifiers[0].cutoff = 3.1

data = pipeline.compute()

Controlling how many processor cores OVITO uses

Many computation functions in OVITO have been parallelized in order to make use of all available processor cores. This includes, for example, computationally heavy analysis routines such as PolyhedralTemplateMatchingModifier, VoronoiAnalysisModifier, ClusterAnalysisModifier, ComputePropertyModifier, and many more. The software-based rendering engines TachyonRenderer and OSPRayRenderer also harness multiple cores in parallel to speed up image generation.

By default, these algorithms will make use of all available cores of your CPU. This default number is determined by OVITO using the function QThread.idealThreadCount(), which can be queried as follows in a Python program:

>>> from ovito.qt_compat import QtCore
>>> print(QtCore.QThread.idealThreadCount())
4

Sometimes it is desirable to restrict OVITO to a single CPU core only, for example, when running multiple instances of OVITO in parallel on the same node. This can be accomplished in two ways. The graphical application ovito and the script interpreter ovitos both support the command line parameter --nthreads, which lets you override the number of CPU cores used by parallel algorithms:

ovitos --nthreads 1 yourscript.py

Your second option is to set the OVITO_THREAD_COUNT environment variable prior to invoking or importing OVITO. This approach always works, including for Python scripts running in an external Python interpreter and importing the ovito module:

export OVITO_THREAD_COUNT=1
python3 yourscript.py
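
When the script runs in an external Python interpreter, the variable can also be set from within the script itself, provided this happens before the ovito module is imported for the first time (the complete multiprocessing program later in this section does the same). A minimal sketch; the helper name limit_ovito_threads is hypothetical and not part of OVITO:

```python
import os

def limit_ovito_threads(count: int = 1) -> None:
    """Set OVITO_THREAD_COUNT so that OVITO uses at most `count` worker threads.

    Hypothetical helper for illustration only. It must be called BEFORE the
    first import of the ovito module, as required by the text above.
    """
    if count < 1:
        raise ValueError("thread count must be at least 1")
    os.environ["OVITO_THREAD_COUNT"] = str(count)

limit_ovito_threads(1)
# Only now import OVITO, for example:
# from ovito.io import import_file
```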

Using OVITO with Python’s `multiprocessing` module

In a nutshell:

Python’s multiprocessing module can be used to parallelize computations across multiple independent inputs.

Because the multiprocessing module spawns new interpreter processes, it cannot be used in the OVITO Pro desktop environment. But multiprocessing works in standalone Python scripts executed by an external Python interpreter.

Native objects from the ovito package do not support pickling and cannot be sent across process boundaries. So, when writing parallel functions, make sure they take and return only regular Python objects.

Since OVITO launches background worker threads, it is not compatible with the “fork” start method of the multiprocessing module. It is necessary to use the safer “spawn” method on all OS platforms.

OVITO’s data pipeline and rendering systems perform certain calculations using all available CPU cores (see previous section). This means that such functions will automatically run faster when more processor cores are available – even if you call them serially in a Python script. However, some types of calculations in OVITO always use only a single processor core because their algorithms are inherently difficult to parallelize or are based on third-party libraries that lack multi-threading support. Then it can make sense to exploit a different type of parallelism.

Python’s multiprocessing module allows parallelizing the execution of a Python function across multiple independent inputs. This approach works well if each input, for example the frames of a simulation trajectory, can be processed independently from each other. Take the following script for example, which calculates an output number for every frame of a simulation trajectory:

from ovito.io import import_file
from ovito.modifiers import ClusterAnalysisModifier

pipeline = import_file('input/simulation.*.dump')
pipeline.modifiers.append(ClusterAnalysisModifier(cutoff = 3.0))

results = []
for frame in range(pipeline.source.num_frames):
    data = pipeline.compute(frame)
    num_clusters = data.attributes['ClusterAnalysis.cluster_count']
    results.append(num_clusters)

Note that it does not matter here in which order we process the trajectory frames. This is an important requirement for problems to be parallelized with the multiprocessing approach.

To parallelize the problem using the multiprocessing module, we need to move the processing code into a callable function:

def process_frame(frame):
    data = pipeline.compute(frame)
    return data.attributes['ClusterAnalysis.cluster_count']

The multiprocessing Pool class provides a method to invoke the process_frame() function for a range of trajectory frames. Depending on the number of CPU cores in the system, the Pool will spawn multiple worker processes to process requests in parallel.

import multiprocessing

with multiprocessing.Pool(None) as pool:
    results = list(pool.imap(process_frame, range(pipeline.source.num_frames)))

The computed values for all trajectory frames are collected and returned as a list.

Two important points are worth discussing:

Since the multiprocessing module takes care of distributing the work across all available processor cores, we should now keep OVITO from doing the same, because this would lead to an oversubscription of the cores. As explained in Controlling how many processor cores OVITO uses, we can restrict each OVITO instance to a single worker thread by setting the environment variable OVITO_THREAD_COUNT=1. Whether this step improves or hurts overall performance is problem dependent, and you should measure the influence of this setting on the execution time.

The multiprocessing module needs to send the input parameters and the return value of our worker function across process boundaries. This happens via Python’s pickle mechanism, which is something most OVITO objects do not support yet. Things like a DataCollection and its contents, or a Pipeline and its modifiers, cannot be transmitted from the main program to the parallel worker function. We therefore need to restrict ourselves to simple Python values, objects, or NumPy arrays that can be pickled.
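
Whether a given value can cross the process boundary can be checked up front with the standard pickle module. The following sketch uses only the standard library; the helper is_picklable is hypothetical, and the threading.Lock object merely stands in for a non-picklable object (it is not an OVITO object):

```python
import pickle
import threading

def is_picklable(obj) -> bool:
    """Return True if pickle.dumps() accepts obj, i.e. it can be sent to or from a worker process."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True

# Plain Python values are fine as worker inputs and return values.
print(is_picklable({"frame": 0, "cluster_count": 12}))
# A lock object cannot be pickled; here it stands in for any non-picklable object.
print(is_picklable(threading.Lock()))
```

The first call prints True and the second prints False. Running such a check on the return value of a worker function during development can reveal pickling problems before the function is handed to a process pool.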

The following complete program demonstrates the multiprocessing approach. A configuration flag switches between a conventional for-loop and the parallel processing mode, which allows you to compare the execution times of the two approaches.

# This configuration flag enables the use of the Python multiprocessing module.
# Set it to False to switch back to conventional for-loop processing (included for comparison).
use_multiprocessing = True

if use_multiprocessing:
    # Disable internal parallelization of OVITO's pipeline system, which otherwise would perform certain
    # computations using all available processor cores. Setting the environment variable OVITO_THREAD_COUNT=1
    # must be done BEFORE importing the ovito module and will restrict OVITO to a single CPU core per process.
    import os
    os.environ["OVITO_THREAD_COUNT"] = "1"

from ovito.io import import_file
from ovito.modifiers import ClusterAnalysisModifier

# Set up the OVITO data pipeline for performing some heavy computations for each trajectory frame.
# Keep in mind that this initialization code gets executed by EVERY worker process in the multiprocessing pool.
# So we should only prepare but not carry out any expensive operations here.
pipeline = import_file('input/simulation.*.dump')
pipeline.modifiers.append(ClusterAnalysisModifier(cutoff = 3.0))

# Define the worker function which gets called for every trajectory frame. It evaluates the pipeline
# at the requested simulation time and returns the computed results for that frame back to the caller.
#
# IMPORTANT: The function may only return regular Python objects but not OVITO objects
# such as DataCollection or Property. That's because OVITO objects do not currently support pickling,
# which means they cannot be sent back to the main program across process boundaries.
#
# In this simple example, the return value is a simple scalar (a global attribute) computed per frame.
# Note that it is possible to return NumPy arrays, which includes NumPy views of OVITO property arrays.
# Thus, to return the cluster assignment of each atom computed by the ClusterAnalysisModifier
# we could write:
#
#    return data.particles['Cluster'][...]
#
def process_frame(frame):
    data = pipeline.compute(frame)
    return data.attributes['ClusterAnalysis.cluster_count']

# Main program entry point:
if __name__ == '__main__':

    # Measure time for benchmarking purposes.
    from time import time
    t_start = time()

    if use_multiprocessing:

        # Force "spawn" start method on all platforms, because OVITO is not compatible with "fork" method.
        import multiprocessing as mp
        mp.set_start_method('spawn')

        # Create a pool of processes which will process the trajectory frames in parallel.
        with mp.Pool(None) as pool:
            results = list(pool.imap(process_frame, range(pipeline.source.num_frames)))

    else:

        # Conventional for-loop iterating over all frames of the trajectory and processing one by one.
        results = [process_frame(frame) for frame in range(pipeline.source.num_frames)]

    t_end = time()

    print(f"Computation took {t_end - t_start} seconds")
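
A possible variation of the program above, not taken from the OVITO documentation: set_start_method() changes the start method globally and raises a RuntimeError when it is called a second time in the same process. The standard library function multiprocessing.get_context() instead returns a context object that is bound to one start method and offers the same Pool class. The sketch below assumes this is acceptable for your script; run_parallel and square are hypothetical names, and square stands in for process_frame():

```python
import multiprocessing as mp

# Context object bound to the "spawn" start method; the global default stays untouched.
spawn_ctx = mp.get_context("spawn")

def square(x):
    """Stand-in for process_frame(): takes and returns plain Python values only."""
    return x * x

def run_parallel(func, inputs, processes=None):
    """Map func over inputs using worker processes created with the "spawn" method."""
    with spawn_ctx.Pool(processes) as pool:
        # imap() yields the results in the same order as the inputs.
        return list(pool.imap(func, inputs))

# Usage, inside the `if __name__ == '__main__':` block of a script file:
#     results = run_parallel(square, range(8))
```

As in the complete program, the worker function must be defined at module level so that the spawned worker processes can import it.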

Packaging and installation of user extensions for OVITO

OVITO’s Python interface allows you to implement custom modifiers, file readers, and viewport overlays that seamlessly extend the functionality of OVITO Pro and the standalone OVITO Python package. To make such extensions available to the user in the graphical user interface or in Python programs, they need to be properly installed and registered for the system to automatically find them at runtime.

The automatic discovery of user-defined extension classes is based on Python’s entry point specification. This standard mechanism helps to keep the code of your extension class cleanly separated from the main OVITO software package. That means both may be updated or changed independently, and the standard Python packaging mechanism facilitates the easy deployment of your extension to other OVITO users in case you want to share your work with the community.

Tip

To aid you in the development of new extensions, we provide GitHub repository templates for each OVITO extension type, which include step-by-step instructions, an initial directory structure, and a pyproject.toml skeleton that simplifies packaging and distribution of your extension to other users:

Furthermore, the OVITO team hosts a public repository where you can publish your extensions to share them with the user community:

OVITO Extensions Directory: A collection of Python extensions submitted by the community, maintained by their respective authors, and curated by the OVITO developers.

In the following, we’ll consider a custom modifier class written for OVITO Pro with the name MyModifier, which will serve as an example for an extension to be registered in the OVITO system. We start with a simple code project layout having the following directory structure:

<root-dir>/
├── pyproject.toml
└── src/
    └── MyModifier/
        └── __init__.py

The MyModifier class is defined in src/MyModifier/__init__.py:

from ovito.data import DataCollection
from ovito.pipeline import ModifierInterface

class MyModifier(ModifierInterface):
    def modify(self, data: DataCollection, frame: int, **kwargs):
        ...

If you are developing a complex extension, you can also split your code into multiple .py files or nested packages if needed.

The pyproject.toml package metadata file is needed for the installation of MyModifier and any dependencies it requires (e.g. the SciPy package) using pip:

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "MyModifier"
version = "1.0"
description = "A custom Python modifier for OVITO."
keywords = ["ovito", "python-file-reader"]
authors = [{name = "Author Name", email = "author@organization.org"}]
maintainers = [{name = "Maintainer", email = "maintainer@organization.org"}]
license = {text = "MIT License"}
readme = "README.md"
requires-python = ">=3.7"
dependencies = [
   "ovito >= 3.9.1"
]

[project.urls]
repository = "https://github.com/author/MyModifier"

[tool.setuptools.packages.find]
where = ["src"]

Thanks to this pyproject.toml file, as we will see later, the code of MyModifier can be installed in any Python interpreter (including the OVITO Pro integrated interpreter) using the pip standard utility – just like regular Python packages. After the installation, we could call import MyModifier from a Python program to load and use our module.

However, we still have to tell OVITO Pro that the MyModifier package contains the definition of an OVITO ModifierInterface class (or some other kind of OVITO extension) that should be automatically loaded at runtime and made available in the graphical user interface of OVITO Pro. This requires the definition of an entry point in pyproject.toml (see the setuptools website for more details).

To register our custom modifier class as an entry point in the standard group OVITO.Modifier, we add an entry-points section to pyproject.toml:

[project.entry-points.'OVITO.Modifier']
"My Modifier" = "MyModifier:MyModifier"

In the second line, the part before the = is the human-readable name of our extension, which is displayed in the OVITO Pro GUI. The part after the = specifies the module name and the Python class that implements the OVITO extension, separated by a colon. In our case, both the module and the class are named MyModifier.
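
To illustrate how such an entry is interpreted, the sketch below builds the same entry point by hand with the standard importlib.metadata module and splits its value into module and class name. It assumes Python 3.9 or newer (for the module and attr attributes) and only demonstrates the generic entry point format, not OVITO's own loading code:

```python
from importlib.metadata import EntryPoint

# The same name, value, and group as in the pyproject.toml entry discussed above.
ep = EntryPoint(name="My Modifier", value="MyModifier:MyModifier", group="OVITO.Modifier")

print(ep.name)    # human-readable display name
print(ep.module)  # module that gets imported
print(ep.attr)    # class looked up inside that module
```

Calling ep.load() would import the module and return the class object, which requires the MyModifier package to be installed.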

OVITO Pro automatically loads all installed packages that have registered entry points in the OVITO.Modifier group and displays them to the user as part of the list of available modifiers. For other extension types, similar entry point groups exist:

Python extension class	Entry point group	Template
`ModifierInterface`	`[project.entry-points.'OVITO.Modifier']`	pyproject.toml
`PipelineSourceInterface`	`[project.entry-points.'OVITO.PipelineSource']`	pyproject.toml
`FileReaderInterface`	`[project.entry-points.'OVITO.FileReader']`	pyproject.toml
`ViewportOverlayInterface`	`[project.entry-points.'OVITO.ViewportOverlay']`	pyproject.toml
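
To check which extensions are registered in a given Python environment, the entry point groups from the table can be queried with the standard importlib.metadata module. This only illustrates the generic entry point mechanism; how OVITO performs its own discovery internally is not shown here. The function name list_registered_extensions is hypothetical:

```python
from importlib.metadata import entry_points

def list_registered_extensions(group: str = "OVITO.Modifier") -> dict:
    """Return a {display name: "module:class"} mapping for all entry points in the group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and newer: selectable entry point collection.
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9: plain dict mapping group names to entry point lists.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}

for group in ("OVITO.Modifier", "OVITO.PipelineSource",
              "OVITO.FileReader", "OVITO.ViewportOverlay"):
    print(group, list_registered_extensions(group))
```

In an environment without any installed extensions, each group yields an empty dictionary.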

After preparing the pyproject.toml file, you can now install your extension package locally into OVITO Pro by invoking the pip install command via ovitos:

ovitos -m pip install --user --editable <root-dir>

where <root-dir> is the directory containing the pyproject.toml file. The -m option tells ovitos to run the pip module, which then executes its install command. The --editable option ensures that you can still make code changes without the need to reinstall the package.
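
After the installation, one way to confirm that the package is visible to the interpreter is to look it up with importlib.util.find_spec(). A minimal sketch using only the standard library; is_installed is a hypothetical helper, and the script has to be run with the same interpreter that performed the installation (for example ovitos):

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if a top-level module of this name can be found by the interpreter."""
    return importlib.util.find_spec(module_name) is not None

print("MyModifier importable:", is_installed("MyModifier"))
```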

If you decide to upload the code of your OVITO extension to a public Git repository, other users can install it directly from your online repository as explained in Installing Python-based extensions for OVITO with pip.
