This is the first version of a draft for an Open Digital Twin Standard (ODTWS) created by the Technical Committee for the Open Digital Twin Standard, which currently consists of the authors of the Open Digital Twin Platform (ODTP). The standard is meant to be inclusive and to overarch technical implementations of digital twins across all potential applications in business, governance, and research. As such, we encourage every body interested in the subject to contact us and join the Technical Committee for subsequent drafts and the eventual standard. We also welcome input from international organizations, governmental and non-governmental, wishing to take part in the work.
The procedures used to develop the first draft of this document and those intended for its further maintenance are described as follows:
Any trade name used in this document is information given for users' convenience and does not constitute an endorsement.
For an explanation of the voluntary nature of standards, we refer to the outline by the ISO: https://www.iso.org/foreword-supplementary-information.html. Standards do not include contractual, legal, or statutory requirements. Voluntary standards do not replace national laws, with which users are understood to comply and which take precedence.
We follow the ISO guidelines for normative ISO deliverables to understand how to implement the standard:
In order to improve the quality of international standards, we follow, as far as applicable, the six principles introduced by the World Trade Organization Technical Barriers to Trade (TBT) Committee that clarify and strengthen the concept of international standards:
For details of the principles, see Annex 4 to the Second Triennial Review of the TBT Agreement. This document was prepared by the Technical Committee for the Open Digital Twin Standard (TC-ODTWS) formed at the Swiss Data Science Center [[SDSC]] and the Center for Sustainable Future Mobility [[CSFM]] in the ETH Domain in Switzerland. Currently, the members of the TC-ODTWS are Jascha Grübel, Sabine Maennel, and Carlos Vivar Rios. Additional contributions were made by Robin M. Franken (on semantic validation) and Sabrina Ossey (on authentication). Milos Balac, Chenyu Zuo, and Stefan Ivanovic also reviewed the document.
Any feedback or questions on this document should be directed to the contact on the official repository [[ODTP-Organization]].
The ODTWS series defines a framework to support the creation of digital twins of observable processes in the real world (physical twins).
A digital twin monitors observable processes and assists in analyzing them. Digital twins increase visibility into complex processes and make their digital representation transparent.
The digital twins supported by the ODTWS framework depend on the implementation of the reference framework (e.g., ODTP) and the available components (and their technology and software requirements). Different domain applications may require different data standards. As a framework, this document does not prescribe specific data formats and communication protocols but informs on architectures that should be supported across different implementations to attain high-level interoperability.
Figure 1 shows the theoretical framework for digital twins with five environments, based on previous work on digital twins. The ODTWS framework aims to provide a standard for these five environments. The environments are split into two groups: the first group manages the orchestration of generic digital twins through the data and connection environments, and the second group provides individualized features for specific digital twins through preexisting, independent components of three kinds matching the remaining environments:
Most digital twins vary strongly in their features depending on the area of application. Real-time requirements differ with regard to data acquisition and data output: data acquisition may range from working only with historical data to real-time acquisition via the Internet of Things (IoT), and data output may range from simulation results that take days to compute to prediction models updated by the minute. Another issue digital twin development often faces is that twins are developed by a small group or a single developer/researcher with a narrow focus on just one topic (e.g., data visualization, data simulation, data analysis) and without the expertise to build an interoperable digital twin.
ODTWS facilitates combining these digital twins’ features so that a bigger goal can be achieved without forcing developers into an intense collaboration and without their having to maintain a complex product. Complex management is taken on by the orchestrator, and individual features are developed in the components with little overhead and full control over the development process. ODTWS also defines the properties of a marketplace of components, named the component zoo, to facilitate the exchange of digital twin components and attain interoperability.
The ODTWS series contains the following four parts:
Digital Twin | A digital twin is an abstract construct that represents a physical twin by capturing key characteristics of interest to describe and analyze observable processes to a degree sufficient for a dedicated task, and then to make a decision or potentially actuate on the physical twin based on the findings. The degree of realism required by a task may vary, and this definition tries to accommodate all variants of digital twins, including subsets such as Digital Shadows (raw data visualization) and Digital Models (no coupling to the physical twin). In some cases, a digital twin is coupled to a virtual twin when the physical twin does not yet exist, in order to explore its potential properties. A virtual twin provides data to a digital twin that behaves as if it came from a physical twin. ODTWS covers all these variations and henceforth refers to them plainly as digital twins. |
---|---|
Open Digital Twin Platform (ODTP) | The reference implementation of the ODTWS: a tool designed to generate specific digital twins by integrating into one platform how to design, manage, run, and share digital twins. It offers an interface (CLI and GUI) for running and managing digital twins. It wraps different open-source technologies according to ODTWS to provide a high-level Application Programming Interface (API) for the final user. Finally, it implements a zoo as a repository of searchable components to (re-)create digital twins. The current version of the reference implementation may not implement all features of the standard yet but aims to do so in the long run. The reference implementation can be found here: [[ODTP-Organization]] |
Components (ODTWS Term) | Components for a digital twin are used to instantiate tools for specific tasks by providing implementations of the three individualized environments (see Figure 1). Each component provides an extension to the available capabilities under ODTWS that implementations (such as ODTP) may use to perform specific tasks in the digital twin. These extensions are generated by the community for the community, and their specific capabilities are not part of ODTWS; ODTWS instead describes the features a component must have to be interoperable. This includes that a component's input/output is validated semantically and that it runs within a ([[Docker]]) container as an independent micro-service. Typically, components fall into one of the following categories:
|
Core/core-optional modules (ODTWS Term) | Modules for a digital twin are used to instantiate solutions for generic tasks of a digital twin by providing implementations for features in the two orchestrating environments (see Figure 1). Each module is tightly integrated into the reference implementation (e.g., ODTP), is developed together with the maintainer of the reference implementation, and is deployed as needed. The core modules shall include the programs needed to run the ODTWS implementation and wrap the services used into a user interface. Core-optional modules are not mandatory to run an ODTWS implementation with minimal features (e.g., manually running ODTWS components) but should be implemented to provide a more complete experience of the ODTWS standard. |
Services | ODTWS follows a micro-services architecture to generate a digital twin. Each service refers to one logical unit that performs one specific task in an independent manner to produce a digital twin. In ODTWS, both modules and components are instantiated as (micro-)services. These are chained together to produce an effective implementation for a specific digital twin. The digital twin combines generic features for digital twins (modules) with individual solutions for a specific task (components). |
Semantic input & output validation | In the ODTWS abstraction, digital twin components can be considered data converters. The input to such a converter must be a file structured in a certain way. The tool inside the component converts this input into a different format, such as data needed for visualization, a different structure, an obfuscation of the data, or another type of processing such as aggregation, simulation, interpolation, or extrapolation. As the functioning of a component depends on the data it receives, a structured mechanism for describing the input data requirements ensures the proper functioning of a component. A well-described input schema helps a user assess whether their dataset complies with the requirements of the component. The semantic description includes information about the filenames, the folder structure, (multi-lingual) labels and definitions of the columns, keys, or properties within such a file, and the datatypes expected for each parameter. Having the input requirements accurately and structurally described is important, but as components may require many files, with many parameters associated with each file, automated validation of a dataset becomes a requirement for ODTWS. The automated validation of each file in the input dataset may be performed in two steps. First, all files in the input dataset are discovered, together with the parameters present within each file. Second, an expectation is formulated about what data is needed. The instance data is an RDF-compliant representation of the input files: triples in an [[RDF-XML]] compliant file format describing a given input folder, with metadata about which files exist in this folder and which variables exist in each file. The schema data ([[SHACL]] shapes) is an [[RDF-XML]] compliant definition against which instance data is checked. In other words, this design allows a data owner to check whether their data (or the output of another component) is compatible with the input requirements of a component. |
Workflow | A workflow describes a chain of components that are supposed to be executed to produce the functionality of a digital twin. Workflows can be sequential but should also support Directed Acyclic Graphs (DAG). This allows workflows to execute complex tasks for digital twins by splitting the data flow and joining them where necessary. |
Execution and Trace | An Execution instantiates the components in a workflow into a set of micro-services that are run once and for which operational data and results are captured. Every step of an Execution is logged in a trace of the digital twin for reproducibility purposes. A digital twin can be configured so that executions are performed recurrently on a schedule, providing basic support for real-time setups. |
Step | A step describes the running of a single service (instantiated from a component) in the execution of a workflow. Steps are atomic from the orchestrator’s perspective and therefore, the logging is limited to parameters, exposed metrics, input data, and output data. |
ODTWS-1 Components are the centerpiece of ODTWS: An ODTWS-1 Component is a wrapper around an existing tool. This wrapper makes the tool usable in an ODTWS-3 Orchestrator. The instantiation of an ODTWS-1 Component is not independent of the orchestrator, as the transformation from tool to component depends on the ODTWS-3 Orchestrator’s own implementation. However, ODTWS-1 Components shall be usable by any compliant ODTWS-3 Orchestrator. ODTWS-1 Components can be arranged into ODTWS-2 Workflows and can be discovered by sharing them in an ODTWS-4 Zoo. An orchestrator can search ODTWS-4 Zoos to find ODTWS-1 Components to assemble a digital twin according to an ODTWS-2 Workflow.
ODTWS-1 components consist of the following elements further explained in the sections below:
A tool is a [[Git]] repository hosted on a Git hosting service such as GitHub or GitLab.
The ODTWS-1 Component is another [[Git]] repository derived from the tool to integrate the tool into a digital twin workflow:
An ODTWS-1 Component is a version-controlled repository (e.g., [[Git]]) and shall provide the following elements (listed here; each element is described in detail below):
- `parameters.yaml` or `parameters.json`
- `metrics.yaml` or `metrics.json`
- `semantic-input.yaml` or `semantic-input.json`
- `semantic-output.yaml` or `semantic-output.json`
- `odtws.yaml` or `odtws.json`
An ODTWS-1 Component shall contain an app that connects the methods of the tool to the methods of the orchestrator. It is a bash script that shall be called in the Dockerfile (defined below) by the ODTWS-3 Orchestrator and performs the following tasks:
An ODTWS-1 Component shall set up the environment for the tool with a Dockerfile and shall install the necessary dependencies. It shall prepare a folder structure inside the ([[Docker]]) container to match the folder structure that the ODTWS-3 Orchestrator expects for interoperation. It may create a folder structure that the tool expects. Finally, it shall call the app as a bash script to execute the tool and achieve the desired task of the component.
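For illustration, a minimal sketch of such a Dockerfile; the base image, paths, and the `app.sh` script are assumptions, not mandated by ODTWS:

```dockerfile
# Hypothetical Dockerfile for an ODTWS-1 Component wrapping a Python-based tool.
FROM python:3.11-slim

# Install the tool and its dependencies (the tool layout is an assumption).
COPY tool/ /home/tool/
RUN pip install -r /home/tool/requirements.txt

# Folder structure the ODTWS-3 Orchestrator expects for interoperation (assumed names).
RUN mkdir -p /odtws/input /odtws/output /odtws/logs

# The app script connects the orchestrator's conventions to the tool.
COPY app.sh /odtws/app.sh
ENTRYPOINT ["bash", "/odtws/app.sh"]
```

Under the same assumptions, the `app.sh` called above could look as follows:

```bash
#!/bin/bash
# Hypothetical app script; all paths and variable names are assumptions.
set -euo pipefail

# Parameters arrive as environment variables (see Ephemeral components below).
SCENARIO="${SCENARIO:-default}"

# Stage input data from the orchestrator's folder to where the tool expects it.
cp -r /odtws/input/. /home/tool/input/

# Run the tool with the given parameters.
python /home/tool/run.py --scenario "$SCENARIO"

# Place outputs where the orchestrator collects them.
cp -r /home/tool/output/. /odtws/output/
```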
An ODTWS-1 Component shall include metadata on the component with the following parts:
Parameters are defined as key-value pairs and may be structured in lists or arrays following the YAML and JSON standards. Parameters are passed into the tool through the user-defined app script. Only parameters that are properly passed into the tool take effect.
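For illustration, a hypothetical `parameters.yaml`; the keys shown are examples, not part of ODTWS:

```yaml
# Hypothetical parameters.yaml; the keys are illustrative only.
SCENARIO: "corsica"        # simple key-value pair, passed to the tool by the app script
SAMPLE_RATE: 0.01
REGIONS:                   # parameters may be structured as lists
  - "2A"
  - "2B"
```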
Metrics are special outputs of a tool to the ODTWS-3 Orchestrator’s data storage. Metrics shall be provided by the tool developer. They shall be used for high-level control and comparison of multiple executions and digital twins. The component shall implement a function, a path, or a stream/socket from which the metric data is stored in the step representation of the ODTWS-3 Orchestrator. Metrics may have semantic descriptions.
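For illustration, a hypothetical `metrics.yaml` declaring the metrics a component exposes; the schema shown is an assumption, not mandated by ODTWS:

```yaml
# Hypothetical metrics.yaml; field names are illustrative only.
metrics:
  - name: "travel_time_mean"
    datatype: "float"
    unit: "s"
    description: "Mean travel time across all simulated agents"  # may carry a semantic description
```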
The component author should semantically describe the data inputs and outputs required for the tool in an RDF-compliant ([[RDF-XML]]) format. All required files and parameters should be accurately described with labels, definitions, and restrictions to enable semantic reasoning about a component. The output of the component should also be described in its entirety to create a semantic black-box view of the component.
The versioning of the tool and of the component need to be independent of each other, since there may be various reasons that justify a new component version:
Since the tool version is important, it shall be added to the metadata of the component, see [Metadata File]. We recommend adding an automated check that ensures that the version of the tool stated in the odtws.yaml file and the version actually used in the component match.
The metadata schema for the Component is described at ODTWS-4 Zoo.
ODTWS implementations shall provide the following component types:
Ephemeral components shall be temporary and shall not persist data (e.g. transforming data from one format into another). They shall be used for short-lived analytical operations and discarded after use. They shall be built when preparing the digital twin execution and shall only be used in a single execution step. Parameters shall be provided as environment variables, and input data shall be placed in one specific directory.
Interactive components shall be designed to interact with the user. These components should be used for user interfaces or visualizations. They shall be built in a certain execution step but keep running as a ([[Docker]]) container until the user stops them. They are often the last step in an execution. In real-time settings, interactive components may expose parameters of other components forwarded by the orchestrator. Changing those parameters may trigger another execution of all components impacted by the change.
API components shall be built only once and can be reused in multiple executions. They can be thought of as continuously running micro-services. The component provides an API endpoint on one port that allows the tool to be run in parallel or across multiple executions. This kind of component is useful when the component building process (in [[Docker]]) takes a long time, or when a long-lasting task can be reused across executions. An example is the loading of a machine learning model into memory.
The API component receives the variables as JSON in the request’s payload. This allows a more complex configuration of parameters than for the Ephemeral type. Input data can be provided in the request or, if the file is large, as an item in the S3 storage ([[S3-Minio]]).
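For illustration, a minimal Python sketch of a call to an API component; the route, port, and payload fields are assumptions, not part of the standard:

```python
import requests

# Hypothetical payload; parameters travel as JSON in the request body.
payload = {
    "parameters": {"SCENARIO": "corsica", "SAMPLE_RATE": 0.01},
    # For large files, reference an item in the S3 storage instead of inlining it:
    "input": {"s3_key": "executions/42/steps/1/output/population.csv"},
}

# The endpoint and port are chosen by the component author (assumed here).
response = requests.post("http://localhost:8000/run", json=payload, timeout=600)
response.raise_for_status()
print(response.json())
```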
Workflows are blueprints on how to assemble ODTWS-1 Components to perform the tasks of a digital twin. A workflow shall describe the sequence of ODTWS-1 components required to perform a task. A workflow may take the form of a directed acyclic graph (DAG).
Workflows should be editable with a workflow editor such as [[BARFI]]. A workflow shall contain all the information necessary to run an execution, including components, parameter values, and initial inputs.
Examples of workflows are provided in Appendix B and Appendix C as they have been built in the context of [[ODTP]].
Workflows in which components are chosen but parameters are not yet set shall be called Workflow Templates. They can be reused for workflows that differ only in their inputs. These templates may be used to create and run executions programmatically.
An ODTWS-3 Orchestrator is a tool that shall combine ODTWS-1 components into ODTWS-2 Workflows of directed acyclic graphs and shall instantiate them as an Execution of micro-services running as ([[Docker]]) containers.
The orchestrator shall provide the following elements that are further described in this section below:
The user interface of the ODTWS-3 Orchestrator shall enable the following actions on ODTWS-1 Components and their management:
[[Docker]] image and container names can be derived in an automated way from the component versions. The ODTWS-3 Orchestrator should run on a server and should prepare a project folder to write the file outputs of each step.
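For illustration, a minimal Python sketch of such an automated derivation; the naming scheme itself is a hypothetical choice, not prescribed by ODTWS:

```python
def docker_image_name(component: str, version: str) -> str:
    """Derive a Docker image name from a component name and version.

    The scheme shown is hypothetical; ODTWS only notes that names can be
    derived automatically from the component versions.
    """
    return f"odtws-{component.lower()}:{version}"


def docker_container_name(component: str, version: str, step_id: str) -> str:
    """Derive a unique container name for one step of an execution."""
    # Container names may not contain ':'; dots are also replaced for uniformity.
    return f"odtws-{component.lower()}-{version.replace('.', '-')}-{step_id}"


# Example: docker_image_name("data-loader", "0.2.0") -> "odtws-data-loader:0.2.0"
```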
The ODTWS-3 Orchestrator shall implement a data storage that stores the outputs of component runs. When components are combined into workflows, the ODTWS-3 Orchestrator shall facilitate the exchange of data between consecutive components. In [[ODTP]], an S3 data storage ([[S3-Minio]]) is used for that purpose.
The stored data can be used for partial executions of workflows, e.g. starting with the second component in a sequence but reusing the data loader component. This enables the user of the ODTWS-3 Orchestrator to reuse the output of computationally heavy components.
The ODTWS-3 Orchestrator shall automatically extract semantic information about a directory (recursively analyzing the files in a folder and extracting metadata about the columns/properties of each file) and write it to an [[RDF-XML]] graph. This “instance data” (metadata about the input dataset) shall be validated against the “schema data” provided by an ODTWS-1 Component. This shall be implemented using [[SHACL]]. The ODTWS-3 Orchestrator shall run a ([[SHACL]]) validation engine on this combination to generate a report that provides information about the conformance of the dataset with the schema.
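A minimal sketch of this validation step in Python, assuming the rdflib and pyshacl libraries; the file names are hypothetical:

```python
import rdflib
from pyshacl import validate

# Instance data: RDF description of the input folder, extracted by the orchestrator.
data_graph = rdflib.Graph().parse("instance-data.rdf", format="xml")

# Schema data: the SHACL shapes provided by the ODTWS-1 Component.
shacl_graph = rdflib.Graph().parse("semantic-input.rdf", format="xml")

# Run the SHACL validation engine and obtain a conformance report.
conforms, report_graph, report_text = validate(data_graph, shacl_graph=shacl_graph)

if not conforms:
    # In "Safe execution" mode the orchestrator would refuse to run the workflow;
    # in "Lazy execution" mode it would only warn (see below).
    print(report_text)
```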
ODTWS-3 Orchestrators should warn users of non-conforming component connections. ODTWS-3 Orchestrators may deny the execution of an ODTWS-2 Workflow that does not conform. The mode that only warns is called “Lazy execution”: a workflow may run until an error occurs. The mode with strict conformance is called “Safe execution”: a non-conforming workflow is not run.
There are three kinds of operational data in ODTWS:
The following classes exist in ODTWS-3 to provide building material for Executions; this building material is usually registered in the ODTWS-4 Zoo and shall be imported from there into the ODTWS-3 Orchestrator:
The following classes provide governance for the Digital Twins:
The following classes allow running and monitoring executions:
A digital twin shall be produced in an ODTWS-3 Orchestrator by executing an ODTWS-2 Workflow. The ODTWS-3 Orchestrator shall create an execution by instantiating and running the ODTWS-1 Components according to the ODTWS-2 Workflow. The orchestrator should be able to copy executions and make them reusable so they can be rerun with minor adjustments.
Executions shall have the following mandatory properties:
A Step is the execution of a single component. The ODTWS-3 Orchestrator shall maintain these mandatory properties of Steps:
The orchestrator shall store the output of each Step. Intermediate outputs shall be stored in the ODTWS-3 Orchestrator’s data storage. The output of a Step shall be described with these mandatory properties:
The result shall provide a combined output of selected executions within a digital twin. The result shall allow the comparison of the outputs of several executions. This provides the ability to compare
Users shall create Digital Twins using ODTWS-2 Workflows or ODTWS-1 Components. The ODTWS-3 Orchestrator needs to support user-owned digital twins that contain the executions of a Digital Twin and the data those executions produce. The ODTWS-3 Orchestrator should offer user authentication and authorization and use secure methods to verify user access.
The ODTWS can also provide team ownership. This feature allows collaboration on Digital Twins by granting access and permissions to designated team members and fosters teamwork and knowledge sharing within a project. It ensures data from different projects remains separate and secure. It is crucial for maintaining confidentiality and preventing accidental data leakage.
The ODTWS-3 Orchestrator shall provide a recipe for how tools can be turned into ODTWS-1 Components in such a way that they are compatible with the ODTWS-3 Orchestrator. An orchestrator may provide a component template that can be copied and that contains further instructions on how to get from the provided template to a component for the tool; see ODTP for an example of how this can be done.
The ODTWS-3 Orchestrator and the ODTWS-1 Components shall communicate via a client library consisting of functions and methods provided by the orchestrator to the components. Any ODTWS-1 Component shall install this library when instantiating a micro-service to communicate with the ODTWS-3 Orchestrator. The client library is orchestrator-specific. The client library shall implement the following features:
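As the concrete feature set is defined by each orchestrator, the following Python sketch only illustrates the shape such a client library could take; all class and method names are hypothetical:

```python
class OrchestratorClient:
    """Hypothetical client library installed inside an ODTWS-1 Component."""

    def __init__(self, orchestrator_url: str, step_id: str):
        self.orchestrator_url = orchestrator_url
        self.step_id = step_id

    def log(self, message: str) -> None:
        """Append a line to the step's trace (see Execution and Trace)."""
        raise NotImplementedError  # provided by the concrete orchestrator

    def save_metric(self, name: str, value: float) -> None:
        """Store a metric in the step representation (see Metrics)."""
        raise NotImplementedError

    def save_output(self, local_path: str) -> None:
        """Upload an output file to the orchestrator's data storage."""
        raise NotImplementedError
```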
An ODTWS-3 Orchestrator should be able to discover ODTWS-1 Components proposed for an ODTWS-2 Workflow in a registry called an ODTWS-4 Zoo. An ODTWS-4 Zoo is coupled to one or more ODTWS-3 Orchestrators and shall ensure that the ODTWS-1 Components and ODTWS-2 Workflows it registers are interoperable in the context of these ODTWS-3 Orchestrators. The zoo consists of the following parts:
An ODTWS-4 Zoo shall consist of a repository that lists the ODTWS-3 Orchestrator(s) with which its components are interoperable. Compliance with these orchestrators shall be guaranteed for all components it registers. The index may be provided as an `index.json` or `index.md`.
The zoo may choose to register only ODTWS-1 Components or ODTWS-2 Workflows. The ODTWS-4 Zoo must provide the index as a list. It should provide a search by tags (see the metadata for components below).
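For illustration, a hypothetical `index.json`; beyond the list form, the tags, and the orchestrator compatibility required above, all field names are assumptions:

```json
{
  "orchestrators": ["ODTP >= 0.3.0"],
  "components": [
    {
      "name": "data-loader",
      "version": "0.2.0",
      "tags": ["mobility", "loader"],
      "repository": "https://example.org/odtws/data-loader"
    }
  ],
  "workflows": []
}
```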
The zoo shall offer a method to submit a component or workflow and add it to the index. The method to add a component or workflow shall be documented for submitters and shall guarantee compliance with the orchestrators by implementing a review process upon submission. Pull requests may be used as a submission method and to unregister components or workflows.
The ODTWS-4 Zoo shall reuse the metadata of the ODTWS-1 Component. The metadata file of the component in the zoo shall be provided as a YAML or JSON file with a standardized name, such as `odtws.yml` or `odtws.json`.
The ODTWS-4 Zoo shall expect these metadata for the ODTWS-1 Component:
Required:
Recommended:
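For illustration, a hypothetical `odtws.yaml`; apart from the tool version (see [Metadata File]) and the tags used by the zoo's search, the field names are assumptions:

```yaml
component:
  name: "data-loader"            # hypothetical component name
  version: "0.2.0"               # component version, independent of the tool version
  tool-version: "1.4.1"          # version of the wrapped tool (see [Metadata File])
  type: "ephemeral"              # ephemeral | interactive | API
  tags: ["mobility", "loader"]   # used by the zoo's tag search
  repository: "https://example.org/odtws/data-loader"
```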
For the registration of ODTWS-2 Workflows in the ODTWS-4 Zoo, workflow files shall be provided in a YAML format, specifying directed acyclic graphs of components and including the versions of these components.
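For illustration, a hypothetical workflow file; the field names are assumptions, while the component versions and the directed acyclic graph structure are required by ODTWS:

```yaml
name: "population-pipeline"
components:
  - id: "load"
    component: "data-loader"
    version: "0.2.0"
  - id: "simulate"
    component: "mobility-simulation"
    version: "1.1.0"
    depends-on: ["load"]          # edges of the directed acyclic graph
  - id: "visualize"
    component: "result-viewer"
    version: "0.3.0"
    depends-on: ["simulate"]
```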
The reference implementation of the ODTWS: a tool designed to generate specific digital twins by integrating into one platform how to design, manage, and run digital twins. It offers an interface (CLI and GUI) for running and managing digital twins. It wraps different open-source technologies according to ODTWS to provide a high-level Application Programming Interface (API) for the final user. The reference implementation may not implement all features of the standard yet but aims to do so in the long run. The reference implementation can be found here: [[ODTP]]. As such, ODTP is a proof of concept (POC) of the ODTWS with a focus on specific digital twins in the mobility context. It was built to establish the ODTWS standard via a POC in a collaboration between [[CSFM]] and [[SDSC]].
In the following section, we list the alignments and differences between ODTWS and ODTP; this section's structure therefore follows that of ODTWS:
ODTP Components served as role models for the concept of ODTWS-1 Components. In this area, the ODTP implementation comes very close to ODTWS. The versioning described in ODTWS has not yet been implemented, and API Components are not yet supported. Both topics are on the roadmap and will be available in releases that have already been planned.
See here for an example component in the context of ODTP: [[Odtp-Component-Example]]
The concepts of ODTWS-2 Workflows and Workflow Templates have not yet been implemented in ODTP, but the need for them has surfaced through user experience with the ODTP orchestrator:
Therefore, ODTP Workflows are on the roadmap of ODTP and will be implemented in an upcoming release.
The ODTP Orchestrator aligns with the ODTWS-3 Orchestrator and implements an orchestrator for the ODTP Components. It offers a command-line interface and a graphical user interface. It is still a POC, so even some mandatory features of ODTWS have not yet been implemented. ODTP was also used to evaluate the demands and needs of digital twin orchestration beyond what it currently offers. It thus directly enabled the formulation of the ODTWS.
The features that have been implemented from ODTWS-3 Orchestrator are the following:
Features on the roadmap are:
The orchestrator for ODTP is implemented in Python using the following technologies:
Not yet implemented but planned in one of the next releases:
A zoo has been implemented with examples from the mobility sector: [[Odtp-org-Zoo]]. For practical purposes, the zoo also has a web frontend: [[Odtp-Zoo-Frontend]].
The zoo provides in its README a recipe on how to add a component to the zoo. Components are added by pull requests (PRs), and they are also removed by PRs. At all times, the repository offers an `index.json` file of the registered components. A GitHub Action adds a component's metadata from its `odtp.yaml` file to the index; a static webpage hosted on GitHub Pages is then updated.
We provide a workflow for a digital twin that creates synthetic populations (Hörl & Balać, 2019) and runs MATSim-based mobility simulations of three scenarios: Ile de France, Corsica, and Switzerland:
The data loader component prepares statistical data on the population of the respective scenario and the geographic data from standardized data sources. The French scenarios are publicly available; the Swiss scenario requires a valid contract with the Swiss Federal Statistical Office (FSO).
Overview of the Components:
Reference: [[Eqasim-in-Mobility]]
We provide another workflow for a digital twin that implements the mobility causal intervention framework to evaluate the robustness of deep learning models to data distribution shifts, applied to individual next-location prediction (Hong et al., 2023):
The mobility simulation module is used to generate individual location sequences. It also incorporates the causal intervention mechanism to generate intervened synthetic data that represent different data distribution shifts. These synthetic data are fed into the next-location-prediction module to quantify a model’s robustness against interventions. Meanwhile, a mobility-metrics module is used to monitor changes in the characteristics of the mobility data.
Overview of the components:
Reference: [[Causal-interventions-in-Mobility]]