Monetizing Machine Data in International Data Spaces – A Business Model Perspective and Prototypical Demonstration

1. Premise

We live in a data-driven world, yet many companies hesitate to share their data for a variety of reasons. Machine-generated data, such as usage statistics, often contains sensitive business information. In this thesis, the term machine-generated data refers to operational data collected from production machines, such as sensor data or machine usage statistics. In the prototypical implementation, simulated data will be used to represent such machine data. Firms are concerned that sharing such information could reveal trade secrets or competitive advantages that competitors might exploit. Furthermore, companies fear that disclosing data may pose privacy and re-identification risks, as shared information often reflects how individual users operate a given tool.


However, withholding this data can mean forgoing synergies, resulting in limited innovation, missed business opportunities, and inefficient processes. Participating in data sharing ecosystems, when done securely and under clearly defined policies, can unlock new value streams, improve interoperability, and facilitate strategic partnerships.


To realize these opportunities, organizations such as the International Data Spaces Association (IDSA) and the Fraunhofer Society created the International Data Spaces (IDS) concept and reference architecture. Later initiatives such as Catena-X build upon these foundational concepts to create an ecosystem that enables decentralized, policy-controlled, sovereign, and secure data exchange between organizations. Despite these efforts, many companies still struggle to identify the potential value of providing data in such an environment.
This thesis focuses on the role of data providers and explores whether and how they can generate new value streams by offering machine-generated data within ecosystems like Catena-X.


2. Research Questions

  1. What defines a data space?
  2. What roles exist within a data space, and how do they interact with one another?
  3. How can Subject-Oriented Modelling (SOM) clarify imprecise and ambiguous terminology found in data space documentation?
  4. What aspects of data space technology can be described using SOM?
  5. How can aspects of data space technology be described using SOM?
  6. What types of machine data are suitable for monetization?
  7. What monetization strategies are appropriate for a fictional data provider?
  8. How can the business model of this fictional provider be structured and justified?
  9. How can the processes of the proposed Business Model Canvas (BMC) be modelled using SOM?
  10. What are the strengths and limitations of the developed prototype?


3. Research Plan


3.1 Methodology


This thesis explores how an industrial data provider can monetize machine-generated data within a data space. To investigate this, a fictional data provider is introduced. The thesis follows a design-oriented approach that combines literature research, subject-oriented modelling, business model development, and prototypical software implementation. The software aims to simulate and demonstrate the selected monetization strategies and to serve as a proof of concept for the proposed business model.


To answer the first two research questions, a structured literature review will be conducted. The documentation of key frameworks such as the International Data Spaces (IDS) reference architecture, Gaia-X, and Catena-X will be examined, paying special attention to imprecise or ambiguous terminology.


To answer research questions 3 to 5 and 9, selected processes within the data space context (data offering, contract negotiation, and access control between data provider and consumer) as well as the business processes implied by the proposed BMC will be modelled using the Parallel Activity Specification Schema (PASS). This subject-oriented modelling approach is applied to examine whether it can help make implicit logic and role interactions more explicit, thereby contributing to improved conceptual and process clarity.
Next, a second literature review will be conducted to answer the question “What types of machine data are suitable for monetization?”. Its results will be used to formulate a fictional provider scenario, which serves as the basis for the remainder of the thesis.


Building on the fictional provider scenario, the BMC will be used to analyse and compare potential monetization strategies (e.g. pay-per-use, licensing, or subscription). In response to the research question “What monetization strategies are appropriate for a fictional data provider?”, the selection of a specific strategy will be based on data characteristics, customer segmentation, and feasibility considerations.
To address the question “How can the business model of this fictional provider be structured and justified?”, a simplified software prototype will be developed. This prototype will simulate the flow of data monetization, including the creation, publication, and simulated purchase of data. The prototype serves as a practical demonstration and partial validation of the developed business model.
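
As a rough illustration of the flow described above (creation, publication, and simulated purchase of data), a minimal sketch could look as follows. All names used here (DataOffer, Provider, the offer ID "spindle-temp") and the pay-per-record price are hypothetical placeholders chosen for illustration. They are not part of the IDS or Catena-X specifications and do not anticipate the design of the actual prototype.

```python
from dataclasses import dataclass, field


@dataclass
class DataOffer:
    """A published data asset with a hypothetical pay-per-use price."""
    offer_id: str
    description: str
    price_per_record: float  # assumed pricing unit, for illustration only


@dataclass
class Provider:
    """Fictional data provider that creates, publishes and sells machine data."""
    catalog: dict = field(default_factory=dict)
    datasets: dict = field(default_factory=dict)
    revenue: float = 0.0

    def create_dataset(self, offer_id, n_records):
        # Deterministic stand-in for simulated sensor readings.
        self.datasets[offer_id] = [20.0 + 0.5 * i for i in range(n_records)]

    def publish(self, offer):
        # Only data that has actually been created can be offered.
        if offer.offer_id not in self.datasets:
            raise KeyError(f"no dataset for offer {offer.offer_id}")
        self.catalog[offer.offer_id] = offer

    def purchase(self, offer_id, n_records):
        """Simulated purchase: hand out the records and book the revenue."""
        offer = self.catalog[offer_id]
        records = self.datasets[offer_id][:n_records]
        self.revenue += len(records) * offer.price_per_record
        return records


provider = Provider()
provider.create_dataset("spindle-temp", n_records=100)
provider.publish(DataOffer("spindle-temp", "Simulated spindle temperature", 0.02))
bought = provider.purchase("spindle-temp", n_records=10)
print(len(bought), round(provider.revenue, 2))
```

The sketch deliberately leaves out contract negotiation and policy enforcement, which in a real data space would sit between publication and purchase.
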
Finally, to answer the last research question, “What are the strengths and limitations of the developed prototype?”, the software artifact will be critically evaluated and its strengths and limitations documented.