The Governance of Big Data and Artificial Intelligence in Network Industries

Abstract

An important precondition for the development of artificial intelligence (AI) in network industries is access to big data and the attendant necessities of data sharing and data portability. The goal of this paper is to analyze the changing needs of entrepreneurial decision-making to exhaust the innovation potential of AI-driven big data value chains, taking into account AI-specific ethical, security, and privacy regulations. The analytical concept of AI-powered big data virtual networks is investigated with a focus on the governance of 5G-based big data value chains required for Internet of Things (IoT) applications, in particular in smart networks. Although several actors may be involved (such as broadband providers, cloud service providers, geopositioning service providers, or sensor network service providers), the final responsibility for bundling these different service components lies in the hands of the AI-powered big data virtual network providers. In addition to the required data privacy and security regulations, the exploration of new liability rules for AI interacting with traditional technologies is becoming relevant, taking into account AI-specific ethical and transparency obligations. Firstly, the complementary roles of the EU data regulatory framework and the European AI regulatory framework are examined. Secondly, the network economic concept of AI-powered big data virtual networks is elaborated, taking into account the required regulations. Thirdly, the heterogeneity of AI systems required for a variety of IoT applications is considered, with a particular focus on the application of AI within the transportation sector.

Keywords

Artificial intelligence, Internet of Things, big data virtual networks, smart transportation

JEL classification: L51, L96, O31

Introduction

The Internet of Things (IoT) is characterized by a physical side, comprising a large and open set of physical applications (use cases), and a virtual side, comprising complementary information and communication technologies (ICT), such as camera-based sensors, real-time and location-aware communication, as well as cloud computing and other forms of data storage, processing, and transmission. In this context, the term big data has been coined, underlining the relevance of large data volumes (volume), the speed of data processing and data collection (velocity), and the heterogeneity (variety) of different data classes. The governance of big data does not occur in isolation but is evolving based on a combination of data collection innovations including the rapidly decreasing costs of (camera-based) sensor technologies, significant increases in computer processing and storage capacities, as well as tremendous developments in communication technologies (OECD/ITF, 2015, pp. 11–13). In the meantime, big data cloud computing is gaining increasing relevance for local, regional, and cross-country data traffic driven by the high-bandwidth capacities of 5G, with heterogeneous ICT requirements entailing heterogeneous Quality of Service (QoS) requirements of bandwidth capacities, heterogeneous sensor networks, heterogeneous (big) data processing capacities, and heterogeneous security requirements. Cloud computing is no longer only a cost-saving possibility enabling an on-demand pool of shared computing and storage solutions but is gaining a pivotal function in the entrepreneurial design of data value chains for real-time, adaptive, sensor-based applications in many use cases of the IoT (Knieps, 2021, pp. 48–50).

Classes of use cases requiring a tactile Internet (e.g., driverless networked vehicles, smart manufacturing, augmented and virtual reality) are concomitant with massive and high-velocity datasets challenging the traditional approaches (e.g., statistical analysis or optimization theories) to deriving relevant decisions based on the insights from this data. This is the very reason why Artificial Intelligence (AI), relying on algorithm-based pattern recognition, will become so relevant in the network industries of the future. Although AI has the character of an umbrella term with different meanings, its basic principle is the interaction of big data with operational logic (algorithms), and actuators in the search for automated decision making. For a well-defined set of objectives data-trained algorithms are applied which—depending on the sensor-based data inputs—are pursuing a set of actuator decisions (OECD, 2019b, pp. 22–24, 2022; ITF, 2021b, pp. 12f.).¹

An important precondition for the development of AI is access to big data and the attendant necessities of data sharing and data portability. Two topical complementary waves of reform initiatives are the EU data regulatory framework on the one hand (European Commission, 2020a; 2020b; 2020c) and the AI for EU regulatory framework on the other hand (European Commission, 2021b, 2021c, 2021d). Promotion of AI-driven innovation is closely linked to the Data Governance Act, the Open Data Directive, and other initiatives under the EU strategy concerning the reuse, sharing, and pooling of data essential for the development of data-driven AI models (European Commission, 2021d, p. 5). Of particular relevance for future AI systems regulations are the proposed Artificial Intelligence Act (European Commission, 2021d) and the concomitant Commission Staff Working Documents (European Commission, 2021b; 2021c).² The ever-expanding scope of AI systems regulations entails not only data privacy and cybersecurity provisions, but also AI-specific ethical, non-discriminatory, and transparency obligations as well as liability rulings (OECD, 2019a, pp. 7–10; European Commission, 2022d).

To analyze the potential of AI in network industries, the concept of AI-powered big data virtual networks is elaborated, emphasizing the relevance of analyzing the potentials and risks of AI systems not in isolation but in the context of idiosyncratic integration into the relevant data value chains. The entrepreneurial task of the big data virtual network provider is to make decisions regarding the specifics of the AI system in combination with the relevant data value chains according to the requirements of the physical IoT applications. The governance of AI-powered big data virtual networks is driven by big data-trained algorithms enabled to collect, aggregate, and analyze the relevant datasets, thus enabling relevant decision-making from the IoT application perspective. Although several actors may be involved—such as broadband traffic service providers, cloud service providers, geopositioning service providers, or sensor network service providers—the final responsibility for bundling these different service components lies in the hands of the AI-powered big data virtual network providers. Platform operators responsible for performance guarantees on the physical side of IoT applications may be horizontally integrated with such virtual network providers. In addition to the required data privacy and security regulations, the formulation of new liability rules for AI interacting with traditional technologies is becoming relevant in combination with AI-specific ethical and transparency obligations.

The paper is organized as follows: In The Challenges of AI-Driven Big Data Value Chains and Regulations, the complementary roles of the EU data regulatory framework and AI for EU regulatory framework are examined. In addition to the required data privacy and AI-specific ethical and transparency obligations, AI security regulations and new liability rules for AI interacting with traditional technologies are becoming relevant. In The Governance of AI-Powered Big Data Virtual Networks, the network economic concept of AI-powered big data virtual networks is elaborated. The heterogeneity of AI systems is explored, with the focus on a variety of 5G-based big data value chains and concomitant AI-powered big data virtual networks required for a variety of IoT applications. In AI-Powered Big Data Virtual Networks in Transportation Industries, attention is given to the challenges of AI-powered big data virtual networks in various heterogeneous application cases within the smart transportation sector, with a particular focus on AI in transport logistics, proactive road infrastructure maintenance, congestion management, and networked driverless vehicles. Conclusions summarizes the main findings.

The Challenges of AI-Driven Big Data Value Chains and Regulations

Whereas EU regulations underline the entrepreneurial and market-driven role of cross-border data sharing and data pooling to stimulate AI innovations, AI regulations are also considered particularly relevant for guaranteeing AI-specific ethical, non-discriminatory, and transparency obligations, liability rulings, and data privacy and security provisions.

The Complementary Roles of the EU Data Regulatory Framework and AI for EU Regulatory Framework

EU Data Regulatory Framework

European reform proposals on the future role of big data, algorithms for machine learning, and subsequent artificial intelligence (AI) are gaining increasing attention as part of a transition to a data economy and heterogeneous e-privacy protection and security measures (European Commission, 2020a; 2020b; 2020c). On the basis of already existing EU regulations (European Commission, 2020a, p. 4), in particular the regulation on the free flow of non-personal data,³ the goal is to enable a single European market for the free flow of data within the EU and across sectors. This has resulted in two important legislative initiatives. The Data Governance Act, initiated in November 2020 (European Commission, 2020b) and agreed to by the European Parliament and European Council in November 2021, focuses on the governance of data sharing by companies, individuals, and the public sector. The Data Act, proposed in February 2022 (European Commission, 2022a), is concerned with the question of who can access data and under what conditions. In order to facilitate data pooling and sharing, the concept of common European data spaces has been introduced, fostering an appropriate governance structure to ensure non-discriminatory access to the sharing and use of data (European Commission, 2022b). An important precondition for cloud computing is the QoS-guaranteed free movement of data across borders. In addition to global cloud computing, edge cloud computing, which enables data processing at a short distance from the end users, is gaining increasing relevance for classes of applications that require ultra-low latency guarantees (Chang et al., 2014; Knieps, Bauer, 2022, pp. 3f.). Cloud service providers and cloud users should be able to deploy any cloud service at any time and location required. Furthermore, cloud users should also have a free choice of cloud service providers and the possibility of data portability, enabling them to easily switch among cloud providers (European Commission, 2020a). In the meantime, based on the EU Data Economy legal framework, the recently established self-regulatory SWIPO (switching and porting) Data Portability Codes of Conduct are focused on cloud security certifications as well as on reducing the risk of vendor lock-in by cloud service providers. The GAIA-X project for the next generation of a data infrastructure for Europe has a particular focus on the role of codes of conduct between European and non-European cloud service providers (SWIPO AISBL, 2020).

AI for EU Regulatory Framework

The nature and design of AI systems is an essential driver for the creation of EU data pools enabling big data analytics and machine learning, taking into account data protection legislation and competition law, thus fostering the emergence of data-driven ecosystems (European Commission, 2020a, p. 5). One can observe an increasing shift from the processing of data in centralized computing towards computing at the edge. The availability of data is considered an essential input for training AI systems (European Commission, 2020a, p. 2). Since AI, the IoT, and robotics are transforming the characteristics of many products and services, liability rules concerning the interaction between artificial intelligence and traditional technologies are becoming relevant (European Commission, 2021a; 2022d). The establishment of EU common data spaces by the European Commission, as well as the facilitation of data sharing between business and government in the public interest, will be instrumental to providing trustworthy, accountable, and non-discriminatory access to high-quality data for the training, validation, and testing of AI systems.

Different Dimensions of AI Regulations

EU initiatives on ethical and security frameworks concerning AI are not internationally isolated but are to be considered against the background of OECD AI standardization initiatives (OECD, 2019a, p. 3; ITF, 2021b, pp. 13f.). Five specific AI characteristics are considered particularly relevant: complexity, transparency/opacity, continuous adaptation, autonomous behavior, and big data, taking into account the interaction between AI regulations and existing sectoral product safety legislation (European Commission, 2020c; 2021c, pp. 33–37). In addition, the General Data Protection Regulation (GDPR),⁴ as it applies to personal data, plays an important role in guaranteeing AI security from the perspective of damages caused by data processing that infringes the GDPR (European Union Agency for Cybersecurity ENISA, 2020, pp. 8–10). Security is a principle which must be guaranteed when processing personal data and is thereby considered a data-protection-by-design instrument (recital 83, Article 5, Article 32 GDPR). In contrast, the goal of the proposed Artificial Intelligence Liability Directive (European Commission, 2022d) is to improve the functioning of the internal market, focusing on aspects of compensation claims based on non-contractual civil liability for damages caused by the implementation of AI systems (European Parliament, 2023).

With regard to transparency/opacity requirements in the context of automated decision-making, the GDPR is of relevance. Of particular relevance is Art. 22 (1): “The data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her.” Exemptions are possible under very specific conditions laid down in Art. 22 (2), if the data subject’s rights, freedoms, and legitimate interests are safeguarded or the data subject provides explicit consent. According to Articles 13–15, data controllers must be able to explain how an AI system makes decisions that have a significant impact on individuals. The question arises as to what extent a full disclosure of the algorithm can be enforced (EPFL IRGC, 2018, p. 16; Chivot, Castro, 2019, p. 7).

The proposed EU regulation safeguarding fundamental rights, safety, and liability differentiates between prohibited AI practices, high-risk AI systems, and non-high-risk AI systems (European Commission, 2021d, Articles 5, 6, 7, and 2021c, pp. 47–49), particularly with regard to their impact on the fundamental rights to human dignity, privacy, data protection, equality, and non-discrimination. According to TITLE II, Prohibited Artificial Intelligence Practices, Article 5 (European Commission, 2021d, pp. 12f., 43f.), all those AI systems are prohibited which contravene EU values, particularly those that violate fundamental rights. The prohibitions cover the manipulation of persons through subliminal techniques beyond a person’s consciousness, the exploitation of specific vulnerable groups such as children or persons with disabilities, and AI-based social scoring for general purposes carried out by public authorities. The use of “real time” remote biometric identification systems in publicly accessible spaces for the purpose of law enforcement is also prohibited, with exemptions only in very specific situations of security violations.

According to TITLE III, Classification of AI Systems as High-Risk, Article 6 (European Commission, 2021d, pp. 13f., p. 45), AI systems likely to pose a high risk to fundamental rights and safety are differentiated from non-high-risk AI systems. Classification rules for high-risk AI systems refer to AI systems intended to be used as a safety component or which are themselves products that pose a risk of harm to health and safety or risk an adverse impact on fundamental rights.⁵ High-risk AI systems are classified in different areas of society such as biometrics, education and vocational training, employment and workers management, law enforcement, or access to essential private and public services. Of particular relevance from the perspective of network industries is the classification as high-risk of AI systems to be used as safety components in the management and operation of critical infrastructures such as road traffic and the supply of water, gas, heating, and electricity.

High-risk AI systems are subject to regulatory requirements concerning high-quality data, documentation and traceability, transparency, human oversight, accuracy, and robustness. These requirements are intended to mitigate the risks to fundamental rights and safety posed by AI that are not covered by other existing legal frameworks such as the GDPR or the (proposed) AI Liability Directive (European Commission, 2022d), placing the final responsibility on the person who designed and implemented the AI system. The EU has established a system for registering stand-alone, high-risk AI applications in a public, EU-wide database. Key participants across the AI value chain include providers and users of AI systems (European Commission, 2021d, p. 12). According to Recital (53) “It is appropriate that a specific natural or legal person, defined as the provider, takes the responsibility for the placing on the market or putting into service of a high-risk AI system, regardless of whether that natural or legal person is the person who designed or developed the system” (European Commission, 2021d, p. 31).⁶

According to Recital (48) “High-risk AI systems should be designed and developed in such a way that natural persons can oversee their functioning. For this purpose, appropriate human oversight measures should be identified by the provider of the system before its placing on the market or putting into service. In particular, where appropriate, such measures should guarantee that the system is subject to in-built operational constraints that cannot be overridden by the system itself and is responsive to the human operator, and that the natural persons to whom human oversight has been assigned have the necessary competence, training and authority to carry out that role” (European Commission, 2021d, p. 30).

The proposed new EU AI regulatory framework pursues a risk-based approach to ensure that regulatory interventions are proportionate, avoiding an oversized regulatory basis, inefficient duplication of regulatory interventions, and inadequate regulatory instruments. Regarding the proper regulatory basis, the goal is to draw effective borderlines between prohibited AI practices, high-risk AI systems, and non-high-risk AI systems. Due to the dynamic evolution of AI systems, a particular challenge is to draw the borderline between regulated high-risk and unregulated non-high-risk AI systems within a forward-looking regulatory decision-making process. Since AI algorithms are highly relevant within high-risk AI systems, the regulatory goals of safeguarding fundamental rights, safety, and liability should be pursued while avoiding unnecessary or inadequate interventions (European Commission, 2020d, pp. 16–18).

The Governance of AI-Powered Big Data Virtual Networks

In this section the changing needs of entrepreneurial decision-making are analyzed in order to exhaust the innovation potential of AI-driven big data value chains and concomitant big data virtual networks considering AI-specific ethical, security and privacy regulations.

AI, Big Data, and the Search for Innovative Algorithms

Characterization of AI

The history of AI dates back to the 1950s (OECD, 2019b, pp. 20–24), concomitant with an ongoing debate on the future role of human-machine interaction. In a seminal paper on computing machinery and intelligence, Turing (1950) raised the question of whether machines can think. He developed the so-called “Turing test” to determine whether a suspicious human could have a conversation with a (hidden) computer and be convinced that the latter was actually a person. In the same year, Shannon (1950) proposed the creation of a computer that could learn to play chess, pointing out its possible relevance for many other areas such as machines for translating languages or for routing telephone calls. A strong increase in computation power due to transistor innovations, networked computing based on broadband capacities, and the associated advances in computational and storage capacity during the 1980s and 1990s strongly increased the potential of AI (OECD, 2019b). These developments have become even more disruptive due to the 5G innovations of QoS-differentiated broadband capacities together with the big data-driven IoT.

Of particular relevance for AI systems is the interaction of big data and self-learning algorithms in the search for automated decision-making. The ever-increasing and unknowable potentials of future AI systems raise fundamental questions regarding the interaction of AI systems and human beings within the world-wide population of the future. The OECD’s Recommendation on Artificial Intelligence adopted in 2019 became the first intergovernmental standard for AI policies: “The Recommendation aims to foster innovation and trust in AI by promoting the responsible stewardship of trustworthy AI while ensuring respect for human rights and democratic values” (OECD, 2019a, p. 3). Basic ingredients for specifications and requirements of AI are sensor-based data collection; reasoning/information processing by algorithms and concomitant proposed action in keeping with human-centered values and fairness; transparency and self-explanatory qualities; robustness and security, safety and accountability. The EU was strongly involved in the international efforts to develop these OECD AI principles (European Commission, 2020d, pp. 8f.; OECD, 2019a, pp. 7–8; ITF, 2021b, p. 14).

AI and Access to Big Data

Data analytics and big data value chains are interrelated, consisting of collection, storage, processing, visualization, and querying of data. Multiple machine learning algorithms are used as data analytics techniques. The goal of big data analytics platforms is to interpret data and derive decision-relevant analysis for the specific use case under consideration (Syed et al., 2020). Examples of heterogeneous data classes are operational data, non-operational data, meter usage data, event message data, and metadata. Data may also be classified according to the static versus dynamic nature of data and its scale (OECD, 2022, pp. 36–37). Whereas static data do not change after being collected, dynamic data may change after they are recorded, depending on updated incoming data. Dynamic real-time data are provided immediately after collection with minimal forwarding delay, which is relevant for real-time use cases. The scale of a dataset can be classified by its order of magnitude (per second of real time).
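The classification dimensions described above can be made concrete with a minimal sketch. The class and field names below (`UpdateMode`, `DatasetProfile`, `records_per_second`) are hypothetical and not taken from the OECD framework, the example datasets and their rates are purely illustrative, and measuring scale as the base-10 order of magnitude of records per second is an assumption about how "order of magnitude (per second of real time)" might be operationalized.

```python
import math
from dataclasses import dataclass
from enum import Enum


class UpdateMode(Enum):
    STATIC = "static"                        # unchanged after collection
    DYNAMIC = "dynamic"                      # may be revised by incoming data
    DYNAMIC_REAL_TIME = "dynamic real-time"  # provided immediately after collection


@dataclass(frozen=True)
class DatasetProfile:
    name: str                  # hypothetical label for the data class
    update_mode: UpdateMode
    records_per_second: float  # assumed measure of scale

    def scale_order(self) -> int:
        """Return the base-10 order of magnitude of the record rate."""
        if self.records_per_second <= 0:
            raise ValueError("records_per_second must be positive")
        return math.floor(math.log10(self.records_per_second))


# Illustrative profiles; the rates are invented for the example.
profiles = [
    DatasetProfile("meter usage data", UpdateMode.DYNAMIC, 200.0),
    DatasetProfile("event message data", UpdateMode.DYNAMIC_REAL_TIME, 50_000.0),
]

for p in profiles:
    print(f"{p.name}: {p.update_mode.value}, scale 10^{p.scale_order()} records/s")
```

Such a profile could serve as one input when a virtual network provider matches a data class to storage and processing components, although the actual criteria would depend on the use case.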

The statistical analysis of the influence of data size on machine learning algorithms differentiates between types of data such as text data, image data, audio data, network data, and measurement sensor data. The concept of big data is relative: it depends on whether the available sample sizes for the relevant types of data are sufficiently large to conduct the particular analysis required for the practical application or use case under consideration. The influence of the sample size (e.g., the number of training samples) on the accuracy of classification and, by extension, on the classification error should be considered (Emmert-Streib et al., 2020, pp. 19f.).
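The dependence of classification error on the number of training samples can be illustrated with a small simulation. This is a minimal sketch under stated assumptions and not the analysis of the cited study: the data are synthetic (two one-dimensional Gaussian classes with means 0 and 2 and unit variance), the classifier is a simple nearest-centroid rule, and all function names are hypothetical.

```python
import random
import statistics


def sample(label, n, rng):
    """Draw n synthetic points for class `label` (mean 0.0 or 2.0, sd 1.0)."""
    mean = 0.0 if label == 0 else 2.0
    return [rng.gauss(mean, 1.0) for _ in range(n)]


def classification_error(n_train, rng, n_test=2000):
    """Fit class centroids on n_train points per class; return the test error."""
    c0 = statistics.fmean(sample(0, n_train, rng))
    c1 = statistics.fmean(sample(1, n_train, rng))
    errors = 0
    for label in (0, 1):
        for x in sample(label, n_test, rng):
            predicted = 0 if abs(x - c0) <= abs(x - c1) else 1
            errors += predicted != label
    return errors / (2 * n_test)


def learning_curve(sizes, repeats=20, seed=0):
    """Average the test error over several repetitions for each training size."""
    rng = random.Random(seed)
    return {
        n: statistics.fmean(classification_error(n, rng) for _ in range(repeats))
        for n in sizes
    }


curve = learning_curve([1, 5, 50, 500])
for n, err in curve.items():
    print(f"{n:>4} training samples per class -> mean test error {err:.3f}")
```

In this toy setting the error cannot fall below the overlap of the two class distributions, so additional samples yield diminishing returns. Whether a dataset counts as "big" in this sense depends on where the curve flattens for the data type and task at hand.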

To take into account the interdependence between the objectives and design of data value chains and AI, data are not considered in isolation (ITF, 2022, pp. 6–9). The relationship between data collection, processing, and public governance is relevant. The fallacy of raw data has to be avoided because data collection depends on framing for specific reasons (ITF, 2022, p. 16).

AI and Algorithms

High volumes of raw data are typically available in many areas of IoT applications. The real challenge arises when transforming high volumes of data under the requirements of real-time processing into decision-relevant information by means of algorithm-based big data analytics. Data analytics pose tremendous challenges due to massive storage and high-speed real time processing. Although these obstacles may be tackled by using parallel systems, there remain challenges regarding energy management, scalability, and real-time collaboration (Syed et al., 2020, p. 13; Fridman, Brown, Glazer et al., 2017).

Innovations in quantum computing also possess a tremendous potential for improving AI models by enabling them to recognize highly complex relationships in datasets that classical systems cannot. Since investment costs in quantum computers are very high and their applications are very specialized, cloud-based access to quantum computers is becoming the typical usage model. In the meantime, applications in healthcare, manufacturing, and transportation can be tackled by combining cloud-based quantum computers with AI technologies and quantum machine learning algorithms, provided that the available data are sufficient to use deep learning models (Omar, 2021, pp. 1–3; Emmert-Streib et al., 2020, p. 19; Biamonte et al., 2018, pp. 1–3).

Algorithms that learn and self-evolve warrant particular attention because these algorithms are no longer programmed by human beings, but increasingly learn and are adaptive (EPFL IRGC, 2018, p. 2). The meaning of self-learning mechanisms is twofold: AI self-learning may take place during the training, but there is also the possibility that AI machines continue learning after they are deployed (European Commission, 2020b, p. 7, FN 38). In contrast to well-established static algorithms in optimization theory, dynamic deep learning algorithms based on neural networks are gaining in relevance for complex big data real-time applications (Emmert-Streib et al., 2020). The question is whether a conflict may arise between the required entrepreneurial search for complex deep learning algorithms and the security regulations for high-risk AI systems. In other words, is human autonomy still guaranteed, so that a human being can take responsibility for the implementation of a high-risk AI system? In cases where learning continues in real-time applications, human beings may become unable to understand every iteration of the algorithmic decision-making process. Nevertheless, the behavior of the AI system must be constrained by the goal and relevant design choices under the autonomy of the human operator (European Commission, 2020c, pp. 6–11).

AI-Powered Big Data Virtual Networks

The IoT is driven by the complementary roles of physical networks and virtual networks. Virtual network providers develop big data value chains according to the necessities of the IoT applications, combining QoS-differentiated bandwidth capacities with the different complementary components such as sensor networks and geo-positioning services, and different forms of data processing activities (Knieps, 2017).

The transition from 4G to 5G represents a disruptive innovation based on fixed-mobile convergence, ultra-high bandwidth capacities, a large variety of heterogeneous QoS of bandwidth capacities, and the possibility of ultra-low latency guarantees of data packet transmission. The concomitant characteristics of 5G as an application-agnostic new General Purpose Technology (GPT) rest on the potential for pervasive use within a large and open set of downstream application sectors driven by complementary innovative activities widely dispersed throughout the economy (Knieps, Bauer, 2022; Parcu et al., 2022). The focus is on the requirements of 5G and concomitant 5G-based big data virtual networks, which also satisfy the strong QoS requirements of bandwidth capacities of the tactile Internet.

The heterogeneity of AI algorithms and machine learning depends on the big data value chains of big data virtual networks. The focus is on the heterogeneity of data value chains and the concomitant heterogeneity of AI algorithms. Data are not homogeneous: depending on the underlying IoT applications (use cases), the degree of aggregation and the time-sensitivity of data vary, such that time-insensitive data, the streaming processing of real-time data, as well as the interactive processing of massive-scale data may become relevant. This impacts the whole data value chain as well as the AI design required for a particular IoT application. The complementary requirements for AI differ depending on the goal, which in turn depends on the use case under consideration.

The important role of AI systems based on big data magnifies the importance of data sharing and open data for AI applications in many areas of the app economy such as transport, agriculture, financial services, marketing and advertising, science, health, criminal justice, security in the public sector, applications using augmented and virtual reality, etc. (OECD, 2019b, pp. 47–80).⁷ The Joint Technical Committee of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) (Subcommittee SC 42, Artificial intelligence) listed 24 application domains indicating a large variety of fields such as home/service robotics, media and entertainment, the public sector, education, health care, energy, manufacturing (including logistics), mobility, and transportation (ISO/IEC, 2021, pp. 7–9).

Examples of heterogeneous AI-powered big data virtual networks are based on heterogeneous data collection, processing, and actuator requirements driven by the heterogeneous characteristics of AI algorithms. An important class of examples is driven by deep learning algorithms, as illustrated by insights into various big data analytics in the context of smart grids (Syed et al., 2020).

AI-Powered Big Data Virtual Networks in Transportation Industries

Transportation is one of the largest economic sectors across the OECD, with an increasingly important role being played by big data-driven applications (OECD/ITF, 2015; 2016; ITF, 2022). Heterogeneous big data-driven classes of use cases with heterogeneous AI technologies are becoming increasingly relevant. Examples are data-driven transport infrastructure maintenance (ITF, 2021a), AI in proactive road infrastructure safety management, and accompanying traffic safety regulations (ITF, 2021b).

Big Data and AI-Driven Use Cases in Transportation Industries

In recent years, big data in transport has gained particular relevance (OECD/ITF, 2015; OECD/ITF, 2016), with a large potential for reporting mobility data (ITF, 2022). In the following, the focus is on AI and big data in heterogeneous application areas of the transport sector. Transportation industries are an important driver of the IoT. Big data is becoming relevant in all areas of transportation. Big data-driven, AI-powered intramodal and intermodal transportation value chains are enabling an open set of innovative use cases in all areas of transportation industries, supply chains, and logistics. Best use cases of AI and machine learning in the transportation, shipping, and logistics industries are manifold, resulting in decreasing transportation costs and improved transportation performance characteristics (punctuality, reliability, safety).⁸ AI and big data algorithms transform real-time vehicle tracking data into real-time analysis and provide real-time information to drivers or self-driving autonomous vehicles, generating the best possible routes by adaptive simulations (Gesing et al., 2018).

AI-powered big data value chains act as enablers of vehicle platooning on public roads, self-driving aircraft, and autonomous trains (ISO/IEC, 2021, pp. 87–91). AI in railway transport, such as intelligent train automation and operational intelligence for rail operators and infrastructure managers, enables the forecasting of infrastructure or rolling stock conditions, faster and less resource-intensive repairs, the reduction of maintenance costs, and the optimization of fleet reserves in case of train breakdowns. AI in maritime shipping and inland navigation, as well as in smart ports and logistics, improves the allocation of relevant resources in a manner similar to that of just-in-time operations and offers transparency to optimize the real-time allocation of available dock spaces (European Parliament, 2019).

The transition towards fully automated driving may be considered as an evolutionary process. “It may be decades before the majority of cars on the road are fully autonomous. During this time, the human is likely to remain the critical decision maker either as the driver or as the supervisor of the AI system doing the driving” (Fridman, Brown, Glazer et al., 2017, p. 3). To the extent that data supporting enforcement actions relate directly to unique vehicles, individuals, and their specific behaviors, the most privacy-sensitive data reporting occurs in this context (ITF, 2022, p. 7).

Heterogeneity of AI-Powered Big Data Virtual Networks in Transportation Industries

The analysis of the heterogeneity of AI-powered big data virtual networks in transport leads immediately to the question regarding the heterogeneity of big data value chains and AI algorithms. Heterogeneous AI-powered big data virtual networks for physical applications are based on heterogeneous big datasets and heterogeneous algorithms. A fundamental shift in the transportation sector (physical side) towards smart network industries is based on big data-driven innovations due to the combination of enhanced sensor technologies, the drop in data storage costs, and the availability of new data processing algorithms and innovations in communication technologies (OECD/ITF, 2015, p. 9). Data aggregation is not application-agnostic and depends on the particular use case under consideration. There are multiple ways to aggregate vehicle data (ITF, 2021b, p. 23). The degree of aggregation and the heterogeneity of requirements for the geographical footprint (local, regional, global), as well as the heterogeneous requirements for the time-sensitivity of datasets (static, dynamic real time), are dependent on the different use cases under consideration. Data transmission is time-sensitive according to different requirements (latency sensitivity varying between ultra-low latency of milliseconds, seconds, minutes, hours, etc.).

The appropriate aggregation of data depends on the underlying data value chain problem to be solved. Planning data, operational data, and data supporting enforcement actions, for example, vary in terms of time sensitivity. Planning data may be aggregated according to the requirements of the relevant time horizon and are non-time-sensitive; operational data respond to real-time events and may be aggregated according to the logistic transport requirements; congestion management data may be time-sensitive depending on the congestion levels and may be aggregated on a given infrastructure corridor or slot; autonomous vehicle data require ultra-low latency-dependent traffic management within a localized mobility platform. Heterogeneous footprints can have a local, regional, national, or cross-border focus, depending on the underlying space dimension of the big data virtual network. Heterogeneous data protection and safety requirements differentiate between data-related risks, privacy, trade secrets, and commercially sensitive data with concomitant impact on incentives for the collection and sharing of data (OECD/ITF, 2015).
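The dependence of aggregation and time sensitivity on the use case can be made concrete with a small sketch. The following Python fragment encodes the four examples named above as requirement profiles. The class names, field names, the ordering of the latency classes, and the latency class assigned to each use case are illustrative assumptions; they are not taken from the cited reports or from any standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class Latency(IntEnum):
    """Latency classes, ordered from most to least demanding (assumed ordering)."""
    ULTRA_LOW = 0  # milliseconds
    SECONDS = 1
    MINUTES = 2
    HOURS = 3      # effectively non-time-sensitive


@dataclass(frozen=True)
class UseCaseProfile:
    name: str
    max_latency: Latency  # loosest latency class the use case tolerates
    aggregation: str      # unit over which data are aggregated


# Hypothetical profiles paraphrasing the examples in the text;
# the latency class per use case is an assumption made for illustration.
PROFILES = [
    UseCaseProfile("planning", Latency.HOURS, "planning time horizon"),
    UseCaseProfile("operations", Latency.SECONDS, "logistic transport requirements"),
    UseCaseProfile("congestion management", Latency.MINUTES, "infrastructure corridor or slot"),
    UseCaseProfile("autonomous vehicles", Latency.ULTRA_LOW, "localized mobility platform"),
]


def servable(offered: Latency) -> list[str]:
    """Return the use cases whose latency tolerance is met by the offered class."""
    return [p.name for p in PROFILES if offered <= p.max_latency]
```

Under these assumptions, a virtual network that only guarantees latencies in the range of minutes could serve planning and congestion management, whereas autonomous vehicles would require the ultra-low latency class.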

AI in proactive road infrastructure safety management requires the sensing and sharing of safety-relevant data along the entire road network (ITF, 2021b). The focus is on accurate risk prediction and guidance for proactive road network safety management. Data often remain in separate entities instead of being shared; a barrier for AI implementation is the fear of litigation for disclosure of identifiable personal information, which highlights the necessity of decision-relevant aggregation of data. The automotive industry produces high volumes of moving vehicle data including indicators of traffic volume, speed, and information from active safety systems. In this context, it would be valuable to enable market platforms to utilize heterogeneous data from vehicles and smartphone apps. Risk prediction models based on data aggregated over different time periods, instead of real-time data, are not critical for private data protection and result in less complex AI algorithms compared to the big data processing of real-time datasets.
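A minimal sketch may illustrate what aggregation over time periods means for vehicle data. The function below averages speeds per road segment and time window and discards the vehicle identifier. The record layout, the function name, and the sample values are assumptions made for illustration only; they do not describe any particular platform.

```python
from collections import defaultdict
from statistics import mean


def aggregate_speeds(records, window_s=3600):
    """Average speed per (road segment, time window), dropping vehicle identifiers.

    records: iterable of (vehicle_id, segment_id, timestamp_s, speed_kmh) tuples
    (an assumed layout, not a standardized format).
    """
    buckets = defaultdict(list)
    for _vehicle_id, segment_id, timestamp_s, speed_kmh in records:
        # Start of the time window that contains this observation.
        window_start = int(timestamp_s // window_s) * window_s
        buckets[(segment_id, window_start)].append(speed_kmh)
    return {
        key: {"mean_speed_kmh": mean(speeds), "n_obs": len(speeds)}
        for key, speeds in buckets.items()
    }


# Made-up sample records for illustration only.
sample = [
    ("veh-1", "A1", 10, 80.0),
    ("veh-2", "A1", 1800, 100.0),
    ("veh-1", "A1", 3700, 60.0),
]
result = aggregate_speeds(sample)
```

The aggregated output contains only segment, time window, mean speed, and the number of observations. Whether such aggregates are sufficiently non-identifying in practice depends on further choices, for instance a minimum number of observations per cell, which this sketch does not enforce.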

Shipping volume prediction turns historical data from a transportation management system into a data-driven tool for forecasting sales numbers and shipping volumes. Forecasts draw on historical data as well as external factors such as seasonality or weather. Another AI-powered big data application (use case) is real-time vehicle tracking. Based on IoT sensors, this process involves real-time monitoring of the speed, location, and direction of a vehicle: A local gateway delivers the data to a network server while the truck is still on the road and sends information to a cloud-based application server.
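The role of seasonality in shipping volume prediction can be illustrated with a deliberately simple baseline. The sketch below forecasts each position of the next season as the mean of past volumes at the same seasonal position. This is a textbook seasonal-average baseline chosen here for illustration, not the method used in the cited report; the function name and the sample volumes are assumptions, and external factors such as weather are not modeled.

```python
from collections import defaultdict
from statistics import mean


def seasonal_baseline(history, season_length):
    """Forecast one full season ahead from the mean of past values at the same seasonal position.

    history: shipping volumes in chronological order; assumed to cover
    at least one full season, otherwise statistics.mean raises an error.
    """
    by_position = defaultdict(list)
    for index, volume in enumerate(history):
        by_position[index % season_length].append(volume)
    start = len(history)  # index of the first period to be forecast
    return [
        mean(by_position[(start + step) % season_length])
        for step in range(season_length)
    ]


# Made-up quarterly volumes for two years (illustration only).
volumes = [100, 120, 90, 150, 110, 130, 95, 165]
forecast = seasonal_baseline(volumes, season_length=4)
```

With the sample data, the forecast for each of the next four quarters is the average of the two earlier observations for that quarter. A production system would typically replace this baseline with a learned model that also takes external factors into account.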

AI-powered big data virtual networks for autonomous driving systems are to be considered as an evolutionary process: Many aspects of driver assistance and vehicle performance are increasingly being automated with data-driven, learning-based approaches. The beneficial role of data sharing can be seen in situations whereby competitors win by collaborating, sharing high-level insights and large-scale, real-world data. Parties involved are car companies, automotive parts suppliers, insurance companies, technology companies, government regulatory agencies, and researchers in applied AI and vehicle safety (OECD, 2019b, p. 25, pp. 48–53; Fridman, Brown, Glazer et al., 2017).

AI systems in driverless cars must be trained to deal with critical situations, which raises the problem of training in computer-simulated environments. Autonomous driving systems and their adaptation to ride-sharing services entail potential liability problems in the case of accidents. The importance of sharing real-time data among different development firms in the course of their trials to train AI algorithms has been pointed out (OECD, 2019b, pp. 48–52). Large-scale deep learning relies on a massive-scale dataset collected from the instrumented vehicle fleet (Fridman, Brown, Glazer et al., 2017, pp. 6f.).

One aim of this paper is to make more explicit the neglected role of broadband-driven networked AI: e.g., not only a particular model of a car is required for autonomous driving systems, but a more general model of networked autonomous vehicles focusing on interoperability based on a car-to-cloud data standard such that data from different manufacturers can be combined (OECD, 2019b, p. 25; ITF, 2021a).

Building accessible datasets in the context of automated driving for understanding driving behavior raises the issue of big data access, either with proprietary data, shared data across firms, or the role of government funding for open data collection. “AI is notoriously dependent on provision of large amounts of quality data. Will drivers, carmakers, commercial transport operators and telematics companies be willing to share the data they produce?” (ITF, 2021b, p. 13). AI creates the necessity for big data sharing. This also creates incentives for competitors to collaborate in an environment where competing firms win by sharing large-scale, real-world data in the search for a better understanding of the complexity of networked autonomous driver behavior (Fridman et al., 2017, p. 9).⁹

Conclusions

Big data value chains are not necessarily AI-powered, but there are many important IoT applications where the use of AI becomes necessary. Depending on the individual IoT application, a specific implementation of AI is chosen in combination with the relevant big data value chain, resulting in a heterogeneity of the design of AI systems ranging from “simple” traditional optimization algorithms to highly complex deep learning algorithms. The economic concept of AI-powered big data virtual networks has been introduced to analyze the changing entrepreneurial incentives for the choice of an adequate AI algorithm and its integration into the required big data value chain, taking into account the requirements for ethical, security, and privacy regulations.

Big data and AI are becoming increasingly relevant in transportation network infrastructures (e.g., tracks, airports, roads, ports), infrastructure management (e.g., air traffic control, railway traffic control), and network services (e.g., transport on roads, rail, waterways). In these networks, the requirements for AI systems vary strongly depending on the complexity of the associated big data value chains: Intelligent traffic information systems, such as those offering the real-time distribution of information, are real-time without latency guarantees and have no requirements for camera-based sensors. In contrast, networked fully automated (driverless) vehicles need guarantees for ultra-low latencies along with ultra-high requirements for both the sharing of sensor data and edge cloud local processing of data.

The growing importance of AI is fundamentally changing the nature of big data virtual networks. The significant innovation potential of AI algorithms can be exploited by entrepreneurial search processes of the AI-powered big data virtual network service providers focusing on all dimensions of big data virtual networks simultaneously, considering the requirements for ethical, security, and privacy regulations according to the OECD and EU principles.

Of particular relevance for the future development and implementation of AI systems within network industries is the application of risk-based AI regulation in combination with the relevant concomitant regulations, such as the GDPR and the AI Liability Directive, without hampering the entrepreneurially driven search for AI-powered big data virtual networks for the sake of future smart network industry solutions.

Footnotes

Acknowledgements

Helpful comments by Matthias Finger, Juan José Montero, and Niccolò Innocenti as well as anonymous referees are gratefully acknowledged.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Guenter Knieps

Notes

References

Biamonte, Wittek, Pancotti, Rebentrost, Wiebe, Lloyd (2018). Quantum machine learning. arXiv: 1611.09347v2 (quant-ph) 10 May. https://doi.org/10.48550/arXiv.1611.09347

Chang, Hari, Mukherjee, Lakshman, T. V. (2014). Bringing the cloud to the edge. IEEE INFOCOM Workshop on Mobile Cloud Computing, pp. 346–351.

Chivot, Castro (2019). The EU needs to reform the GDPR to remain competitive in the algorithmic economy, Center For Data Innovation.

Emmert-Streib, Yang, Feng, Tripathi, Dehmer (2020). An introductory review of deep learning for prediction models with big data, Frontiers in Artificial Intelligence, 3/4, 4–23. https://doi.org/10.3389/frai.2020.00004

EPFL IRGC. (2018). The governance of decision-making algorithms. EPFL International Risk Governance Center. irgc.org.

European Commission. (2020a). A European strategy for data, communication from the commission to the European parliament, the Council, The European economic and social committee and the committee of the regions, 19.2.2020 COM(2020) 66 final.

European Commission. (2020b). Proposal for a regulation of the European parliament and of the council on European data governance (data governance act), European Commission. 25.11.2020, COM(2020) 767 final.

European Commission. (2020c). Report on the safety and liability implications of artificial intelligence, the internet of things and robotics, report from the commission to the European parliament, the council and the European Economic and social committee, European Commission. 19.2.2020 COM(2020) 64 final.

European Commission. (2020d). White paper on artificial intelligence—a European approach to excellence and trust, European Commission, 19.2.2020 COM(2020) 65 final.

European Commission. (2021a). Proposal for a regulation of the European parliament and of the council on machinery products, European Commission, 21.4.2021, COM(2021) 202 final.

European Commission. (2021b). Commission staff working document impact assessment, accompanying the proposal for a regulation of the European parliament and of the council: laying down harmonized rules on artificial intelligence (artificial intelligence act) and amending certain union legislative acts, European Commission, 21.4.2021, SWD(2021) 84 final, Part 2/1.

European Commission. (2021c). Commission staff working document impact assessment, annexes, accompanying the proposal for a regulation of the European parliament and of the council: laying down harmonized rules on artificial intelligence (artificial intelligence act) and amending certain union legislative acts, European Commission, 21.4.2021, SWD(2021) 84 final, Part 2/2.

European Commission. (2021d). Proposal for a regulation of the European parliament and of the council: Laying down harmonized rules on artificial intelligence (artificial intelligence act) and amending certain union legislative acts, European Commission. 21.4.2021, COM(2021) 206 final.

European Commission. (2022b). Commission staff working document on common European data spaces, European Commission, 23.2.2022 SWD(2022) 45 final.

European Commission. (2022a). Proposal for a regulation of the European parliament and of the council on harmonized rules on fair access to and use of data (data act), European Commission. 23.2.2022, COM(2022) 68 final.

European Commission. (2022c). Commission Staff working document impact assessment report, accompanying the document: Proposal for a regulation of the European parliament and of the council on harmonized rules on fair access to and use of data (data act), European Commission, 23.2.2022 SWD(2022) 34 final.

European Commission. (2022d). Proposal for a directive of the European Parliament and of the council on adapting non-contractual civil liability rules to artificial intelligence (AI liability directive), European Commission, 28.9.2022, COM(2022) 496.

European Parliament. (2019). Artificial intelligence in transport, current and future developments, opportunities and challenges, EPRS/European parliamentary research service, European Parliament. Author: M. Niestadt with A. Debyser, D. Scordamaglia and M. Pape, March.

European Parliament. (2023). New product liability directive, EPRS/European Parliamentary Research Service, S. De Luca.

European Union Agency for Cybersecurity, ENISA. (2020). AI cybersecurity challenges, Threat landscape for artificial intelligence. European Union Agency for Cybersecurity, ENISA.

Fridman, Brown, D. E., Glazer (2017). MIT autonomous vehicle technology study: Large-scale deep learning based analysis of driver behavior and interaction with automation. arXiv: 1711.06976v1 (cs.CY) 19 Nov.

Gesing, Peterson, S. J., Michelsen (2018). Artificial intelligence in logistics—A collaborative report by DHL and IBM on implications and use cases for the logistics industry. https://www.globalhha.com/doclib/data/upload/doc_con/5e50c53c5bf67.pdf

ISO/IEC. (2021). Information technology—artificial intelligence (AI)—use cases. ISO. Technical Report ISO/IEC TR 24030, First edition 2021-05.

ITF. (2021a). Data-driven transport infrastructure maintenance, international transport forum policy papers. OECD Publishing, No. 95.

ITF. (2021b). Artificial intelligence in proactive road infrastructure safety management: Summary and conclusions. In: ITF roundtable reports. OECD Publishing, No. 187.

ITF. (2022). Reporting mobility data—good governance principles and practices. International transport forum policy papers. OECD Publishing. No. 101.

Knieps (2017). Internet of things, future networks and the economics of virtual networks. Competition and Regulation in Network Industries 18(3-4): 240–255. https://doi.org/10.1177/1783591718784398

Knieps (2021). Digitalization technologies: The evolution of smart networks, In: Montero, J. J., Finger (Eds.), A modern guide to the digitalization of infrastructure, Edward Elgar, Cheltenham, 2021, Chapter 2, 43–58.

Knieps, Bauer, J. M. (2022). Internet of things and the economics of 5G-based local industrial networks, Telecommunications Policy, 46(4), 102261. https://doi.org/10.1016/j.telpol.2021.102261

OECD. (2019a). Recommendation of the council on artificial intelligence, OECD/LEGAL/0049. OECD Legal Instruments. https://legalinstruments.oecd.org

OECD. (2019b). Artificial intelligence in society. OECD Publishing. https://doi.org/10.1787/eedfee77-en

OECD. (2022). OECD framework for the classification of AI systems. OECD Digital Economy Papers, No. 323.

OECD/ITF. (2015). Big data and transport—understanding and assessing options. https://www.itf-oecd.org/big-data-and-transport

OECD/ITF. (2016). Data-driven transport policy. International Transport Forum. https://www.itf-oecd.org/data-driven-transport-policy

OECD/ITF. (2019). New directions for data-driven transport safety. www.itf-oecd.org

Omar (2021). Why the United States needs to support near-term quantum computing applications, Center For Data Innovation.

Parcu, P. L., Innocenti, Carrozza (2022). Ubiquitous technologies and 5G development. Who is leading the race? Telecommunications Policy, 46(4), 102277. https://doi.org/10.1016/j.telpol.2021.102277

Shannon, C. E. (1950). XXII. Programming a computer for playing chess, Philosophical Magazine, 41(314), 256–275. https://doi.org/10.1080/14786445008521796

SWIPO AISBL. (2020). Code of conduct for data portability and cloud service switching for infrastructure as a service (IaaS) cloud services. Version: 2020, 08-07-2020. https://pr.euractiv.com/pr/swipo-aisbl-publishes-codes-conduct-205589

Syed, Zainab, Refaat, Abu-Rub, Bouhali (2020). Smart grid big data analytics: Survey of technologies, techniques, and applications, IEEE Access, 1–22. https://doi.org/10.1109/Access.2020.3041178

Turing, A. M. (1950). I.—computing machinery and intelligence, Mind: A Quarterly Review of Psychology and Philosophy, 59(236), 433–460.