Advanced topics

This section covers several advanced topics related to OVITO’s scripting interface:

Saving and loading pipelines

The ovito.io.export_file() function lets you save the computed output of a pipeline to disk. But how do you save the definition of the pipeline itself, including the modifiers and their parameters, to a file? OVITO can save the entire visualization scene to a .ovito state file using the Scene.save() method. Thus, in order to save a Pipeline you need to first make it part of the scene by calling its add_to_scene() method:

import ovito
from ovito.io import import_file
from ovito.modifiers import CoordinationAnalysisModifier

pipeline = import_file("input/simulation.dump")
pipeline.modifiers.append(CoordinationAnalysisModifier(cutoff = 3.4))
# ... 
pipeline.add_to_scene()
ovito.scene.save("output/mypipeline.ovito")

The saved state can be restored by calling the Scene.load() method. This function loads all saved data pipelines and makes them available in the Scene.pipelines list:

import ovito

ovito.scene.load("output/mypipeline.ovito")
pipeline = ovito.scene.pipelines[0]
data = pipeline.compute()
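
A state file may contain more than one pipeline, so it can help to inspect what was restored before picking an index. The following is a minimal sketch: describe_pipelines is a hypothetical helper, not part of the ovito package, and it assumes only that each pipeline exposes a modifiers list, as in the examples above.

```python
def describe_pipelines(pipelines):
    """Return one summary line per pipeline, listing its modifier class names.

    Hypothetical helper (not part of the ovito package). It only assumes that
    each pipeline object exposes a `modifiers` sequence.
    """
    lines = []
    for index, pipeline in enumerate(pipelines):
        names = [type(mod).__name__ for mod in pipeline.modifiers]
        lines.append(f"pipeline {index}: " + (", ".join(names) or "(no modifiers)"))
    return lines
```

After calling ovito.scene.load(), you could pass ovito.scene.pipelines to this helper and print the returned lines.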

Of course, it is also possible to open the .ovito state file with the graphical OVITO Pro application or, conversely, to use the GUI to interactively create a data pipeline and then save it to a .ovito file for future use.

You can either load the state file from a script at runtime using the Scene.load() method or preload the state file when running the script using the ovitos integrated interpreter. This is done by specifying the -o command line option:

ovitos -o mypipeline.ovito script.py

The code in script.py will now be executed in an environment where the Scene was already populated with the state loaded from the .ovito scene file. Instead of setting up a new pipeline from scratch, the script can now work with the existing pipeline(s) that were restored from the state file:

import ovito

pipeline = ovito.scene.pipelines[0]

# Replace the data file that serves as pipeline input with a different one:
pipeline.source.load("input/second_file.dump") 
# Adjust modifier parameters of the existing pipeline:
pipeline.modifiers[0].cutoff = 3.1

data = pipeline.compute()
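
The same restored pipeline can be reused for several input files in a row. The sketch below is one possible way to do this; particle_counts is a hypothetical helper name, and the function uses only pipeline.source.load(), pipeline.compute(), and data.particles.count.

```python
def particle_counts(pipeline, paths):
    """Evaluate `pipeline` once per input file and return {path: particle count}.

    Hypothetical helper (not part of the ovito package).
    """
    counts = {}
    for path in paths:
        pipeline.source.load(path)   # replace the pipeline input
        data = pipeline.compute()    # re-evaluate the pipeline, including all modifiers
        counts[path] = data.particles.count
    return counts
```

In script.py, this could be called as particle_counts(ovito.scene.pipelines[0], [...]) with a list of input file paths of your choice.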

Controlling how many processor cores OVITO uses

Many computation functions in OVITO have been parallelized in order to make use of all available processor cores. This includes, for example, computationally heavy analysis routines such as PolyhedralTemplateMatchingModifier, VoronoiAnalysisModifier, ClusterAnalysisModifier, ComputePropertyModifier, and many more. The software-based rendering engines TachyonRenderer and OSPRayRenderer also harness multiple cores in parallel to speed up image generation.

By default, these algorithms will make use of all available cores of your CPU. This default number is determined by OVITO using the function QThread.idealThreadCount(), which can be queried as follows in a Python program:

>>> from ovito.qt_compat import QtCore
>>> print(QtCore.QThread.idealThreadCount())
4

Sometimes it is desirable to restrict OVITO to a single CPU core only, for example, when running multiple instances of OVITO in parallel on the same node. This can be accomplished in two ways. The graphical application ovito and the script interpreter ovitos both support the command line parameter --nthreads, which lets you override the number of CPU cores used by parallel algorithms:

ovitos --nthreads 1 yourscript.py

Your second option is to set the OVITO_THREAD_COUNT environment variable prior to invoking or importing OVITO. This approach always works, including for Python scripts running in an external Python interpreter and importing the ovito module:

export OVITO_THREAD_COUNT=1
python3 yourscript.py
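
If you start several script instances from a Python driver program rather than from the shell, the variable can be passed through the environment of each child process. This is a minimal sketch using only the standard library; single_core_env and launch are hypothetical helper names.

```python
import os
import subprocess
import sys

def single_core_env(thread_count=1):
    """Return a copy of the current environment with OVITO_THREAD_COUNT set."""
    env = dict(os.environ)
    env["OVITO_THREAD_COUNT"] = str(thread_count)
    return env

def launch(script, *args):
    """Start a child Python process running `script` with OVITO limited to one thread."""
    return subprocess.Popen([sys.executable, script, *args], env=single_core_env())
```

A driver could call launch("yourscript.py") once per job, keep the returned process objects, and then call wait() on each of them.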

Using OVITO with Python’s multiprocessing module

In a nutshell:

  • Python’s multiprocessing module can be used to parallelize computations across multiple independent inputs.

  • Because the multiprocessing module spawns new interpreter processes, it cannot be used in the OVITO Pro desktop environment. But multiprocessing works in standalone Python scripts executed by an external Python interpreter.

  • Native objects from the ovito package do not support pickling and cannot be sent across process boundaries. So, when writing parallel functions, make sure they take and return only regular Python objects.

  • Since OVITO launches background worker threads, it is not compatible with the “fork” start method of the multiprocessing module. It is necessary to use the safer “spawn” method on all OS platforms.

OVITO’s data pipeline and rendering systems perform certain calculations using all available CPU cores (see previous section). This means that such functions will automatically run faster when more processor cores are available – even if you call them serially in a Python script. However, some types of calculations in OVITO always use only a single processor core because their algorithms are inherently difficult to parallelize or are based on third-party libraries that lack multi-threading support. In such cases, it can make sense to exploit a different type of parallelism.

Python’s multiprocessing module allows parallelizing the execution of a Python function across multiple independent inputs. This approach works well if each input, for example the frames of a simulation trajectory, can be processed independently from each other. Take the following script for example, which calculates an output number for every frame of a simulation trajectory:

from ovito.io import import_file
from ovito.modifiers import ClusterAnalysisModifier

pipeline = import_file('input/simulation.*.dump')
pipeline.modifiers.append(ClusterAnalysisModifier(cutoff = 3.0))

results = []
for data in pipeline.frames:
    num_clusters = data.attributes['ClusterAnalysis.cluster_count']
    results.append(num_clusters)

Note that it does not matter here in which order we process the trajectory frames. This is an important requirement for problems to be parallelized with the multiprocessing approach.

To parallelize the problem using the multiprocessing module, we need to move the processing code into a callable function:

def process_frame(frame):
    data = pipeline.compute(frame)
    return data.attributes['ClusterAnalysis.cluster_count']

The multiprocessing Pool class provides a method to invoke the process_frame() function for a range of trajectory frames. Depending on the number of CPU cores in the system, the Pool will spawn multiple worker processes to process requests in parallel.

import multiprocessing

with multiprocessing.Pool(None) as pool:
    results = list(pool.imap(process_frame, range(pipeline.num_frames)))

The computed values for all trajectory frames are collected and returned as a list.

Two important points are worth discussing:

  1. Since the multiprocessing module takes care of distributing the work across all available processor cores, we should now keep OVITO from doing the same, because this would lead to an oversubscription of the cores. As explained in Controlling how many processor cores OVITO uses, we can restrict each OVITO instance to a single worker thread by setting the environment variable OVITO_THREAD_COUNT=1. Whether this step improves or hurts overall performance is problem-dependent, and you should measure the influence of this setting on the execution time.

  2. The multiprocessing module needs to send the input parameters and the return value of our worker function across process boundaries. This happens via Python’s pickle mechanism, which is something most OVITO objects do not support yet. Things like a DataCollection and its contents, or a Pipeline and its modifiers cannot be transmitted from the main program to the parallel worker function. We therefore need to restrict ourselves to simple Python values, objects, or NumPy arrays that can be pickled.
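
Whether a given return value can cross the process boundary can be checked up front with the standard pickle module. The following sketch uses a hypothetical helper is_picklable; the lock object merely stands in for any object that wraps native resources and is not an OVITO object.

```python
import pickle
import threading

def is_picklable(obj):
    """Return True if `obj` can be serialized with pickle.dumps()."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False

# Plain Python values can be pickled.
print(is_picklable({"frame": 0, "cluster_count": 17}))   # prints True
# Objects wrapping native resources cannot (a lock serves as a stand-in here).
print(is_picklable(threading.Lock()))                    # prints False
```

Running such a check on the value a worker function is about to return makes pickling problems visible in a serial test run, before the function is handed to a process pool.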

The following complete program demonstrates the use of the multiprocessing approach. It provides a configuration flag that switches between the conventional for-loop and the parallel processing mode, which allows comparing the execution times required by both approaches.

# This configuration flag enables the use of the Python multiprocessing module.
# Set it to False to switch back to conventional for-loop processing (included for comparison).
use_multiprocessing = True

if use_multiprocessing:
    # Disable internal parallelization of OVITO's pipeline system, which otherwise would perform certain
    # computations using all available processor cores. Setting the environment variable OVITO_THREAD_COUNT=1
    # must be done BEFORE importing the ovito module and will restrict OVITO to a single CPU core per process.
    import os
    os.environ["OVITO_THREAD_COUNT"] = "1"

from ovito.io import import_file
from ovito.modifiers import ClusterAnalysisModifier

# Set up the OVITO data pipeline for performing some heavy computations for each trajectory frame.
# Keep in mind that this initialization code gets executed by EVERY worker process in the multiprocessing pool.
# So we should only prepare but not carry out any expensive operations here.
pipeline = import_file('input/simulation.*.dump')
pipeline.modifiers.append(ClusterAnalysisModifier(cutoff = 3.0))

# Define the worker function which gets called for every trajectory frame. It evaluates the pipeline
# at the requested simulation time and returns the computed results for that frame back to the caller.
#
# IMPORTANT: The function may only return regular Python objects but not OVITO objects
# such as DataCollection or Property. That's because OVITO objects do not currently support pickling,
# which means they cannot be sent back to the main program across process boundaries.
#
# In this simple example, the return value is a simple scalar (a global attribute) computed per frame.
# Note that it is possible to return NumPy arrays, which includes NumPy views of OVITO property arrays.
# Thus, to return the cluster assignment of each atom computed by the ClusterAnalysisModifier
# we could write:
#
#    return data.particles['Cluster'][...]
#
def process_frame(frame):
    data = pipeline.compute(frame)
    return data.attributes['ClusterAnalysis.cluster_count']

# Main program entry point:
if __name__ == '__main__':

    # Measure time for benchmarking purposes.
    from time import time
    t_start = time()

    if use_multiprocessing:

        # Force "spawn" start method on all platforms, because OVITO is not compatible with "fork" method.
        import multiprocessing as mp
        mp.set_start_method('spawn')

        # Create a pool of processes which will process the trajectory frames in parallel.
        with mp.Pool(None) as pool:
            results = list(pool.imap(process_frame, range(pipeline.num_frames)))

    else:

        # Conventional for-loop iterating over all frames of the trajectory and processing one by one.
        results = [process_frame(frame) for frame in range(pipeline.num_frames)]

    t_end = time()

    print(f"Computation took {t_end - t_start} seconds")

Packaging and installation of user extensions for OVITO

OVITO’s Python interface allows you to implement custom modifiers, file format readers and writers, viewport overlays and utility applets that seamlessly extend the functionality of OVITO Pro and the standalone OVITO Python package. To make such extensions available to the user, they need to be properly installed and registered for the system to automatically find them at runtime.

The automatic discovery of user-defined extensions is based on Python’s entry point specification. This standard mechanism helps to keep the source code of your extension cleanly separated from the main program package. It also makes it easy to deploy your extension to other OVITO users, in case you want to share your work with the community.

Tip

To aid you in the development of new extensions, we provide GitHub repository templates for each OVITO extension type, which include step-by-step instructions, an initial directory structure, and a pyproject.toml skeleton that simplifies packaging and distribution of your extension.

Furthermore, the OVITO team hosts a public repository where you can publish your extensions to share them with the user community:

OVITO Extensions Directory: A collection of Python extensions submitted by the community, maintained by their respective authors, and curated by the OVITO developers.

To demonstrate the automatic registration and packaging of a custom OVITO extension, we will use a user-defined example modifier written for OVITO Pro. We start with a simple code project layout having the following directory structure:

<root-dir>/
├── pyproject.toml
└── src/
    └── MyModifier/
        └── __init__.py

The MyModifier class is defined in src/MyModifier/__init__.py:

from ovito.data import DataCollection
from ovito.pipeline import ModifierInterface

class MyModifier(ModifierInterface):
   def modify(self, data: DataCollection, frame: int, **kwargs):
      ...

If you develop a complex extension, you can split your code into multiple .py files or nested packages if needed.

The pyproject.toml package metadata file is needed for the easy installation of the MyModifier module and any dependencies it requires (e.g. SciPy or matplotlib) via pip:

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "MyModifier"
version = "1.0"
description = "A custom Python modifier for OVITO."
keywords = ["ovito", "python-file-reader"]
authors = [{name = "Author Name", email = "author@organization.org"}]
maintainers = [{name = "Maintainer", email = "maintainer@organization.org"}]
license = {text = "MIT License"}
readme = "README.md"
requires-python = ">=3.7"
dependencies = [
   "ovito >= 3.9.1"
]

[project.urls]
repository = "https://github.com/author/MyModifier"

[tool.setuptools.packages.find]
where = ["src"]

Thanks to this pyproject.toml file, as we will see later, the code of MyModifier can be installed in any Python interpreter (including the OVITO Pro integrated interpreter) using the standard pip utility – just like regular Python packages. After the installation, you could call import MyModifier from a Python program to load and use our module.

Tip

Since version 3.12.0, OVITO Pro provides a GUI function for installing Python packages in its embedded interpreter. It’s no longer necessary to run the pip utility from the command line.

We still have to tell OVITO Pro that the MyModifier package contains the definition of an OVITO ModifierInterface class (or a similar kind of OVITO extension) that should be automatically loaded at runtime and made available in the graphical user interface. This requires the definition of an entry point in pyproject.toml (see the setuptools website for more details).

To register our custom class as an entry point under the standard group OVITO.Modifier, we add an entry-points section to pyproject.toml:

[project.entry-points.'OVITO.Modifier']
"My Modifier" = "MyModifier:MyModifier"

In the second line, the part before the = specifies the human-readable name of our extension to be displayed in the GUI. The part after the = specifies the name of the module and the name of the Python class that implements the OVITO extension, separated by a colon. In our case, the module and the class are both named MyModifier.
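
To check which entry points are registered in a given Python environment, you can query the standard importlib.metadata module. This is only a sketch for inspection purposes and does not claim to show how OVITO performs the discovery internally. It assumes Python 3.8 or newer; find_entry_points and split_target are hypothetical helper names.

```python
from importlib.metadata import entry_points

def find_entry_points(group="OVITO.Modifier"):
    """Return (display name, "module:Class") pairs registered under `group`."""
    try:
        selected = entry_points(group=group)       # Python 3.10 and later
    except TypeError:
        selected = entry_points().get(group, [])   # Python 3.8 and 3.9
    return [(ep.name, ep.value) for ep in selected]

def split_target(value):
    """Split a value such as "MyModifier:MyModifier" into module and class name."""
    module_name, _, class_name = value.partition(":")
    return module_name.strip(), class_name.strip()
```

Once the package is installed, find_entry_points() should include a pair whose first element is "My Modifier". The list is empty if no package in the environment has registered an entry point under that group.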

Upon application start, OVITO Pro automatically loads all installed packages that have registered entry points under the OVITO.Modifier group and displays them in the list of available modifiers. For other extension types, similar entry point groups exist:

Python extension class    Entry point group                               Template

ModifierInterface         [project.entry-points.'OVITO.Modifier']         pyproject.toml
PipelineSourceInterface   [project.entry-points.'OVITO.PipelineSource']   pyproject.toml
FileReaderInterface       [project.entry-points.'OVITO.FileReader']       pyproject.toml
FileWriterInterface       [project.entry-points.'OVITO.FileWriter']       pyproject.toml
ViewportOverlayInterface  [project.entry-points.'OVITO.ViewportOverlay']  pyproject.toml
UtilityInterface          [project.entry-points.'OVITO.Utility']          pyproject.toml

After preparing the pyproject.toml file, you can now install your extension package locally into OVITO Pro by invoking the pip install command via ovitos from the command line:

ovitos -m pip install --user --editable <root-dir>

where <root-dir> is the directory containing the pyproject.toml file. The -m option tells ovitos to invoke the pip module and run its install command. The --editable option ensures that you can still make code changes without the need to reinstall the package.
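
A quick way to confirm that the installation succeeded is to ask the interpreter whether it can locate the module. The sketch below uses the standard importlib.util module; is_importable is a hypothetical helper name, and the check is meant for top-level module names such as MyModifier.

```python
import importlib.util

def is_importable(module_name):
    """Return True if the top-level module `module_name` can be located."""
    return importlib.util.find_spec(module_name) is not None
```

Executed with ovitos after the pip command above, is_importable("MyModifier") should return True.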

If you decide to upload the code of your OVITO extension to a public Git repository, other users can install it directly from your online repository as explained in Installing Python-based extensions for OVITO with pip. OVITO Pro users can use the GUI function for installing Python packages to obtain your extension. Or you can submit your extension to the OVITO Extensions Directory for the community to discover and use it, making it even easier for other users to install your extension.

If you need additional help with authoring and packaging OVITO extensions, please contact the OVITO development team at support@ovito.org.