Abstract
Background.
Aim and method. By
Results. The
Conclusions. The
Recommendations. The article identifies the need for a
Keywords
We want a game, and it should be as realistic as possible.
The above phrase encapsulates the standard request one gets as a game designer for organizations that predominantly deal with technical systems. The phrase has been heard literally in the context of designing innovations for these organizations. For the author, this phrase has been the start of a path to explore whether games done with traditionally low-tech materials are really different from the computer-supported high-tech ones with respect to their degree of realism. The current article explores this question in the context of games built for one organization, the Dutch railway administration ProRail.
The Dutch railway system is a highly complex and heavily utilized network (Ramaekers, de Wit, & Pouwels, 2009; Goverde, 2005). Improvements in the domain of capacity management and traffic control are increasingly difficult to implement, due to the high degree of interconnectedness among all of the components and processes involved. Facing the challenge of 50% growth by 2020, the Dutch rail infrastructure is in urgent need of new and smarter ways of managing capacity and traffic. The ProRail organization has started to use gaming as a key method for improving the innovation process.
Railway simulations have traditionally involved complex computer models with advanced discrete-event or multi-agent simulation. Both represent so-called closed models. The railway domain has a range of simulation models (like the well-known OpenTrack and Railsys) that are generically applicable to all railway systems, as well as a vast range of specific simulators for specific systems, like FRISO for the Dutch system (D’Ariano, 2008; Middelkoop, Meijer, Steneker, Sehic, & Mazzarello, 2012). None of these systems allows for interactive simulation, other than setting initial parameters and checking the final outcomes.
To overcome this limitation, an interactive layer has been developed for the FRISO simulator, called PRL GAME (first described by Kortmann & Sehic, 2011). As such, the organization has access to a high-tech interactive simulator as an open, hybrid simulation: the environment is simulated, but some decisions are taken by humans rather than by a rule system in the computer model. The PRL GAME has been used for multiple innovation projects.
Simultaneously, we also used low-tech games, using analogue materials (e.g., pen and paper, sponges and wooden sticks) to represent system components (e.g., trains, passengers, infrastructure, and timetables). High-tech simulators use computer simulations to represent in detail the system’s components and dynamics. Low-tech games do not necessarily abandon the use of computers. When they are used, however, it is usually for such tasks as registering data and displaying information, and not as a dynamic model for essential components of the reference system: the operational control of railways.
In a previous study, Meijer (2012) showed that railway gaming that involves the actual operators requires high fidelity of the rules of correspondence between the game and the reference system, specifically with respect to the infrastructure, timetables and processes. Given the technical possibility of achieving a high level of fidelity for most projects involving low-tech and high-tech games, it is not clear which approach is best suited to support innovation in the railway sector. Surprisingly enough, we found that both low-tech and high-tech games were requested to innovate in operational traffic control, and that both tend to address similar problems. We had expected that, because of their different instrumentality, the two types of games would be used in projects on different issues. Hence we have a unique situation with empirical observations from five cases to explore the consequences of using high-tech simulation and low-tech gaming methods. The central question in this article is therefore what relations we could observe between the type of gaming (high-tech, low-tech), its fidelity, and its impact on the use of gaming in innovation. Given that the key component of the low-tech cases was scour sponges as models for trains, the article proverbially explores the power of sponges.
The next section describes the railway innovation problem as experienced by ProRail, followed by the use of gaming for process innovation, the use of gaming in railways, the fundamental problems of high-tech and low-tech games, and the relevant dimensions of fidelity. Section 4 gives an overview of the data collected from five cases, and leads to a discussion on patterns and observations in Section 5. A discussion of the debriefing structure for all these games then precedes the conclusions.
Railway Innovation Problem
While innovation is sorely needed within the Dutch railway system, such innovation is quite difficult to achieve. The politically motivated separation in 1995 of railway infrastructure administration (ProRail) and train services (predominantly NS, as well as several smaller regional lines, including Syntus and Veolia) requires synchronization between distinct authorities, involving multiple offices and platform/line operations in order to control the daily flow of train traffic. The increasing importance of rail services for individual provinces in the Netherlands has led to multi-party tendering (Van de Velde, Veeneman, & Lutje Schipholt, 2008). Within this complex multi-actor and multi-level environment, it is often impossible to implement strategic safeguards for public values in the management of operations (Steenhuisen, Dicke, & De Bruijn, 2009). Such developments within the management of the railway system pose two challenges to innovation: ensuring quality and reliability in operations and identifying ways to increase capacity.
Quality in Operations: Robustness and Resilience
Over the past decade, the railway system in the Netherlands has received major criticism regarding the quality and reliability of its operations. From a policy perspective, this has led to performance contracts for both the primary train service operator (NS) and ProRail (Van de Velde, Jacobs, & Stefanski, 2009). During this period, the performance of these companies has improved on the critical performance indicators, but the system still fails to provide high-quality service due to many small delays, overcrowded trains, and poor (or absent) provision of information to passengers. The rail system suffers from frequent minor defects, which can generate major delays, as the problems spread like an oil spill throughout the various regions and lines. If robustness is defined as the extent to which a system is capable of withstanding problems within the limits of its design, the robustness of the railway system in the Netherlands is questionable.
The consequences of such diminished robustness would be less detrimental if the railways were more resilient. Hollnagel, Woods, and Leveson (2006) define resilience as the ability of a system or organization to react to and recover from disturbances at an early stage, with minimal impact on dynamic stability. Instability poses a risk to the safety of the system. Resilience engineering (i.e., the methods and principles that prevent safety risks) can be used to address these challenges. In recent years, the vulnerability of the system has been exacerbated by snow, storms, national festivities, and other infrequent but predictable events, offering further evidence that situations for which the system was not specifically designed are capable of causing total or, at best, partial collapse of the national system as soon as minor problems begin to occur. In their assessment of safety operations in the Dutch Railways, Hale and Heijer (2006) suggest that railway systems constitute an example of poor (or, at best, mixed) resilience, which can nevertheless achieve high levels of safety in at least some areas of their operations. Safety can thus be achieved by sacrificing some goals in the areas of traffic volume and punctuality. The system does not achieve all its goals simultaneously and flexibly. It is simply not resilient.
Capacity Increases
In the coming decade, the Dutch railway sector will face massive growth in the demand for transport. This growth is expected in both passenger and freight transport. The Dutch railway network is one of the most densely used networks in the world, and it is approaching its maximum capacity, given the current infrastructure and control mechanisms. The projected increase in demand will require systematic changes in both the physical and control aspects of the railways. ProRail has formulated an ambitious program, entitled “Space on the Tracks” (in Dutch, Ruimte op de Rails), which aims to increase the number of trains operating in the network by 50% before 2020. One of the major components of this program is a plan for high-frequency passenger trains on the major corridors. On average, the current system involves four intercity trains, two to four local trains, and one or two freight trains operating on the major corridors each hour. ProRail’s plan would increase this to six intercity trains, six local trains, and two freight trains by 2013. As of 2015, the plan holds and is foreseen for implementation in 2017. The new system would provide for “unscheduled travelling,” enabling passengers to simply go to a station without checking departure times: the next train will arrive soon. The official title of this schedule is High Frequency Train Transport.
The projected capacity increase cannot be achieved by building new infrastructure alone: the costs for the complete program would be around €9 billion, and the time required for procedures and construction would frustrate the transport demand for years. ProRail has accepted the challenge of achieving the goals with only half of this budget by combining strategic choices for new infrastructure with new solutions for control and management.
Gaming for Process Innovation
For the purposes of this article, gaming is defined as “simulating a system through gaming methods.” The term “gaming” exists within a loosely demarcated field of interactive participatory activities aimed at involving participants, who may be actual stakeholders in an activity. Related terms include simulation games, gaming simulations, policy exercises, and serious gaming. Although different authors have different preferences, the terms generally depend upon the intended use of the method. Given the number of gaming titles and scientific publications, the use of gaming methods for learning is by far the most popular; in most cases, the terms “serious gaming” and “simulation game” are used to refer to computer-supported games that place the player in a simulated world (De Freitas & Oliver, 2006; Kriz, 2003).
Gaming for process innovation builds upon two known approaches to using the method. First, for five decades, gaming has been used as an intervention for bringing together policymakers and other stakeholders in participatory events in order to help them take innovative steps. Games provide a way of making collective decisions regarding system boundaries and, subsequently, regarding the dynamics of the system that will be played. Policies can then be formulated within this simulated environment (Duke, 1974; Duke & Geurts, 2004; Mayer, 2009). A second and more recent approach to the use of gaming for process innovation involves using the method to test hypotheses concerning the behavior of human systems (Meijer, 2009). This application is less common, and it places considerable emphasis on the verification and validation of the game (Klabbers, 2003, 2006; Meijer, 2009; Noy, Raban, & Ravid, 2006).
Within the context of innovation at ProRail, the combination of testing hypothesized improvements to the system with the operational and strategic stakeholders involved is at the core of the reasoning behind choosing gaming as a new method for reducing uncertainty in complex system-level changes. The literature on assessment of training environments, like Dugdale et al. (2006), does not suffice to set standards on verification and validation of games used for process innovation. The related question here is twofold. First, how do we define the outcome of training in terms of learning goals? This is only possible if this target is known, which in process innovation by definition is not the case. Therefore, providing a reference scenario for future systems is not possible. Second, how could we gain full knowledge of the future system during the innovation process, if that knowledge is not yet available at the start? No single expert can draw out every function at the time of innovation. The simulated environment is therefore partially made up while gaming.
More recent work on learning and training in games that focuses on improving higher-order learning (metacognitive skills), such as heuristic learning and/or self-organizing learning (e.g., Warmelink, 2011), is conceptually closer to innovation and design. While the last decade has seen several case studies proving the suitability of gaming for higher-order learning, little work has been done on the actual innovation value that organizations gain from participating in such games.
Klabbers (2003, 2006) has elaborated on the use of gaming for design, and introduced design-in-the-small, addressing game design as such, and design-in-the-large, aiming at improving existing situations into preferred ones through the use of those games. Klabbers argued that artifact assessment is something completely different from theory testing. This is of course true, but does not completely explain all roles of gaming in the design process of an innovation. Within ProRail, a regular need exists to test a new artifact (part of the reference system) as a step in the design of the innovation. This is a step in the design science tradition, and requires assessment of validity prior to, during and after the gaming session due to the immediate relation with the system at large. In the simulation world, this type of validity is well known from the work of, among others, Balci (1998), which prescribes a process of verification, validation and accreditation of simulation models.
Van den Hoogen et al. (2014) argue that a gap exists in the current literature between design within a game, design with the use of games and interventions in change processes. A tension exists between design within a game and design by using games, where the first one aims to design new artifacts in the context of the game process, while the latter aims to use gaming as a step in the design process, for instance as an intermediate test. Testing follows the logic of artifact assessment, but not of the game artifact itself. It is here where this article explores the right theoretical framework to interpret the role of gaming in this process.
Gaming in Railways
The use of gaming in the railway sector is generally new, and its introduction in the Dutch railways is documented by Meijer (2012). For other national railways, no real gaming methods are documented in the literature. Related to gaming is the more common use of driving simulators and training sets for traffic controllers. In Germany and the UK, driving simulators are of a very high level of detail (Wilson et al., 2007). The same holds true for French simulators, most often built by Corys, which also delivered the two Dutch full-scope driving simulators in Amersfoort. It is of course possible to run gaming-like exercises with these simulators, by having multiple operators interact with each other via the communication devices in the train cockpit, from the session control room. However, the nature of these sessions is different from gaming, as in principle the complex multi-actor setting is often simulated by facilitators in the control room, and the simulators are designed for feedback on the performance of the driver and not on the effects of the joint decision making on the system.
The existing simulators are often referred to as high-fidelity environments. Fidelity, here defined as ‘the extent to which the virtual environment (game) emulates the real world’ (Alexander, Brunyé, Sidman, & Weil, 2005), can be described along three dimensions of this similarity, in terms of:
The physical characteristics, for example visual, spatial, kinesthetic, etc., and
The functional characteristics, for example the informational, stimulus, and response options of training situation (Hays & Singer, 1989, p. 50), and
The psychological characteristics, for example stress and arousal.
It may be clear that the gaming-like simulators in use in railways are actually more simulators than games, as they aim for a really high similarity. Following Sauve, Renaud, Kaufmann, and Marquis (2007), this indicates a non-game use, as gaming has notions of artificial characters and new rules. Klabbers and Van der Waals make a distinction between free-form and rigid-rule games, where rigid-rule games would be what is labeled ‘gaming-like simulators’ in this article. It is important to note that all the games reported in this article have a certain free-form character, albeit with a relatively detailed representation of the technical environment (the railway system), and with actual operators playing their own roles, or roles similar to those in their daily work, in a non-rigid way.
High-Tech or Low-Tech?
Most publications on the use of gaming methodology for purposes of innovation draw no clear distinctions between the various technologies that are used for creating games. In this article, the terms high-tech and low-tech are used as clusters in a continuum ranging from interactive simulation (e.g., first-person cockpit simulations) to games based entirely on the use of paper and pen. The term high-tech includes the use of computational simulation models and advanced human-computer interfaces, while the term low-tech is used to refer to games in which the dominant simulated world is represented using analogue methods (e.g., game boards), although they may also be supported by some calculation methods through the use of spreadsheets. Low-tech and high-tech gaming thus can (and often do) represent the same reference system to be simulated, although they use different representations for the same items. This is a different distinction from that between rigid-rule and free-form gaming as discussed above, since the representation of the reference system includes technical aspects and roles, but not the rules for the parts that are not the environment but the actual play.
In other contexts, scholars of gaming applications have published extensive discussions concerning several aspects related to differences between high-tech and low-tech gaming. Within the context of applying gaming for learning purposes, considerable attention has been paid to the fidelity of games for the transfer of specific skills and knowledge (e.g., Druckman, 1994; Feinstein & Cannon, 2002). In general, these scholars conclude that skills that are more specific require gaming environments of higher fidelity in order to ensure adequate transfer of knowledge. For example, Sutton (2005) reviews the use of a combination of the two extremes in training for firefighters. However, other scholars, such as Beaubien and Baker (2004), even argue that no direct relationship exists between the level of simulation fidelity and teamwork training effectiveness, and that fidelity is a multidimensional phenomenon that requires precise analysis of which aspects need to have good fidelity. Toups, Kerne, Hamilton, and Shahzad (2011) go so far as to prove that what they call ‘zero-fidelity games’ have a large value for education and reduce the costs of the simulation, in the case of their training games for firefighters.
Dormans (2011) provides an extensive discussion of the value of realism in games, particularly within the context of learning and policy games. Dormans concludes that many games follow the inherent logic of simulation through inducing rules that resemble the rules of the reference system. He calls this iconic simulation. He concludes that simpler forms of games like indexical simulations and symbolic simulations do not necessarily lead to less complex behavior, since
“the power of non-iconic simulation, such as games, lies not in its power to accurately model a source system or in the creation of a vast, realistic game world, but rather in its efficient use of expressive game mechanics.” (p. 628)
This implies that high-tech games are not necessarily better with their iconic approach.
Dormans’ conclusions are based on the language structure of the artifact, which Subrahmanian, Reich, Smulders, and Meijer (2011) link to mental models and the design theory of complex systems. With regard to railway systems, however, the complexity of the sociotechnical nature of the context requires a relatively iconic approach to the object of control: trains on a rail network. One could say here that fidelity on the physical and functional aspects needs to be good before psychological fidelity can set in. Lo, Van den Hoogen, and Meijer (2013) link this to validity, for gaming in the railway domain.
Gaming for innovation often involves having actual stakeholders play their own roles. Meijer (2012) analyzes requirements for ensuring that railway operators will become engaged in their roles, given different questions and abstractions. Enfield, Myers, Lara, and Frick (2012) analyze the innovation diffusion strategies followed by participants in a gaming setting in order to derive conclusions regarding the validity of their theory on innovation behavior. In the railway context, this implies that operational workers, who are accustomed to working with detailed information screens, should be able to play.
Empirical Gaming Cases
The data used to compare high-tech and low-tech games were collected from five cases taking place in the period 2009–2011 in collaboration between ProRail and Delft University of Technology. The format is a limited-number retrospective cross-case comparison (Yin, 2009), wherein the criteria are established for selecting cases from historical records for inclusion in the study. The cases have been selected on the basis of similarity to the reference system to be modeled: operational control in the railways. In all cases, the players were operational experts, playing their own or a closely related role about which they possess enormous knowledge. The experience of participants in the railway system ranged from 4 to 38 years. All had Dutch nationality, apart from two cleaners in the last case who also had Turkish nationality.
Tables 1 and 2 provide an overview of the cases and their characteristics. Table 1 presents the two high-tech cases, and Table 2 presents the three low-tech cases. The data collected refer to different items from various phases of the design and implementation trajectory of each game, building upon the inputs and outputs of a game session as depicted in Figure 1. Experience was not measured, as the games were not used for training. All data were collected via observation notes by at least two observers per case and, where appropriate, from the game design documentation.
Table 1. High-Tech Cases.
Table 2. Low-Tech Cases.
Figure 1. Inputs and outputs of a game session (Updated from Meijer, 2009).
In all of these cases, the game development method was rooted in the design pattern developed by Duke and Geurts (2004), including the use of causal diagrams and interactions with clients. Clients were ProRail internal project leaders. Data about the consequences were collected using a case study approach in which the researchers received the project reports and could do interviews with stakeholders involved, following the development of the game and the facilitation of the game sessions. Cases 3, 4, and 5 involved low-tech simulations, as they were designed as board games, using paper and pencil, visualization with scour sponges, printouts, labels, and colored materials. Cases 1 and 2 involved high-tech simulations, as they relied on computer simulators, laptops, and other media in order to represent the railway system.
The variables presented for each of the five cases are as follows:
Game management
Purpose: What was the management’s intended purpose for the gaming session?
Conclusions and Consequences: What were the consequences of the results of the gaming session for the decision making process?
Development: Time required from initiation until delivery of the results to the management.
# of sessions: How many sessions were able to generate data?
Game design
Roles: Fidelity aspects modeled in the game
Rules: Fidelity aspects modeled in the game
Objectives: Fidelity aspects modeled in the game
Constraints: Fidelity aspects modeled in the game, and specifically:
i. Time model: Was the game played in real time, step based, in condensed time, or turn based?
ii. Data representation: The manner in which input data from the reference system (e.g. timetables) were represented in the game
Load: Scenarios
Qualitative and quantitative data
Data generated: Description of output data generated
Immersion: How long did it take to reach a state in the game in which participants were engaged in the process? Defined as the moment that all participants would be working on their in-game tasks.
Analysis
To answer the central question ‘what is the relation between the type of gaming (high-tech, low-tech), fidelity and the use of gaming in innovation’, this section provides cross-case analyses on the dimensions discussed earlier in the paper. Cases 1 and 2 together form the high-tech cases, and cases 3, 4 and 5 together form the low-tech cases. The analysis is reported as a discussion of all dimensions reported in Tables 1 and 2, enriched with case information.
Game Management
Purpose
All high-tech and low-tech cases were used to test concepts with real operators. In that sense, no difference is observed in the use of the games, despite the earlier discussion of the need for precise, validated environments to do formal testing or artifact assessment within games. However, if we dive a little deeper into the nature of the tests, we can observe a few differences.
All of the low-tech cases were used to determine whether a pre-designed concept would actually work when applied in the operations. The project managers wished to test their concepts, and the games helped them to gain insight into many requirements for successful implementation.
In contrast, the high-tech cases used gaming in order to test fundamental ideas in a very early phase of development, but beyond the stage of formulating the fundamental principles only. Both cases had predefined all exact specifications of orders of trains, timings, infrastructure, etc., as was necessary for building the simulation scenarios. The innovation to test was stated in the instructions that were given to the operators, who could then perform their regular jobs within the simulated environment. The output data were analyzed in order to determine the correctness of the innovation. Especially in the case of the RAILWAY BRIDGE GAME, this was very successful.
If we now connect this observation to the actual innovation trajectories at hand, we cannot observe any differences in the stage of decision-making (all projects follow the same PRINCE2 method for project management). All projects were identical in being collaborations between the staff employed in the traffic control department and the capacity allocation department. In most cases, even the same set of decision makers had to be satisfied. The only difference observed was whether the actual question was part of a much larger coherent program of projects or an isolated question. All low-tech cases were part of larger programs, and were managed in a coherent manner. Both high-tech cases fit in the same strategic decade-long change philosophy, but were isolated projects in the sense of management.
This observation is not consistent with the usual flow in innovation trajectories, as dominant models of decision-making (e.g., Stage Gate) suggest starting with generic assessments of the costs and benefits and making ideas more and more specific over time. Within the context of railways, the prevailing logic apparently calls for testing generic ideas with a very high level of precision and detail. Steenhuisen and Van Eeten (2008) described the general top-down engineering logic of this sector that supports this observation. Apparently, larger coherent programs of projects have many tools and staff available, and use the gaming method only for the type of answers and precision that can be obtained with low-tech games, i.e., more qualitative ones, whereas projects in a more isolated context go for the inherent appeal of the high-tech tool. An interesting follow-up question here would be what the balance is between the use of computer simulations and the use of gaming in both contexts.
Conclusions and consequences
As shown in the tables, both the low-tech and high-tech gaming delivered quantitative and qualitative results from within the game. If we look at the way the results have been used, we observe some patterns. First of all, the high-tech cases placed slightly more emphasis on the quantitative part, while the low-tech cases placed more emphasis on the qualitative aspects. The more complex ETMET and NAU low-tech games generated quantitative data at performance indicator level. The high-tech games delivered, or aimed at delivering, more in-depth train management process data as well. For the assessment of innovation ideas, the standard early in the innovation cycle is to focus on information about costs, benefits and risks. No case focused on the costs within the gaming sessions, leaving benefits and risks as the information to collect.
The question now is what proofs of benefits and risks the project leader, as client of the game, requires. As mentioned before, the cases chose high-tech solutions when the requirements were heavily dependent upon the details of railway interlocking (the safety systems) and train management process. The exact formulation of such requirements, however, appeared to be steered more by management than by the question at hand, since all cases dealt with similar types of projects. The processes of all high-tech cases were driven by the inherent desire to have an insight into the process, but for the low-tech games it was accepted that only performance indicators could be measured. It is difficult to explain this difference.
From the row ‘Conclusions and their consequences’ in Tables 1 and 2, we can observe that the conclusions based upon the games have often been negative in the confirmation of the concept under test. Apart from PLATFORM OVERNIGHT PARKING, all cases returned a huge list of issues, relations with other processes and limitations to the concept. In all these four cases, the project leaders had not been aiming for such lists, but took a more positivist approach, asking mostly for a formal proof of the benefits. For the high-tech cases, this pressure was stronger than for the low-tech cases.
Development
Low-tech games have much shorter development cycles, as shown in Tables 1 and 2. This is logical, as high-tech games require the programming of modules that connect to rail-traffic simulators. This requirement is challenging. We need to ensure that the system’s response resembles the actual workstations of the operators as closely as possible. The long development times often cause problems.
ProRail uses a version of PRINCE2 project management, which generates an organization with many projects lasting 6-12 months. The time between the formulation of a question and the ultimate deadline for the answer is usually so short that, in many cases, it is impossible to develop any special software or other tools. In terms of innovation, this situation is consistent with the traditional gap between creating a market or demand for new tooling and realizing the ability to deliver it. Once the software has been developed, it can be applied quickly. In the meantime, however, low-tech tools could have been used to answer the questions at hand. Low-tech tools are thus better suited to the typical demand pattern.
# of sessions: How many sessions were able to generate data?
All cases had one full-day session for data generation, except for the high-tech RAILWAY BRIDGE GAME, where operators played for a full week to collect more quantitative data. It has been very difficult for the organization to schedule sufficient operators for playing roles in the games, and one day was often the maximum. Given this constraint, for most cases it was not reasonable to expect much quantitative data on which statistical tests could be applied, as typical railway scenarios are relatively long. As soon as this was clear to project leaders, low-tech games were accepted for practical reasons.
Game Design
Roles: Fidelity aspects modeled in the game
Many more roles have been involved in low-tech games than have been involved in high-tech games. This is largely because low-tech games are able to connect roles through stylized information and modeling, instead of via software modules. It is only at the time of writing of this article (2015) that the software package used for the RAILWAY BRIDGE GAME is capable of including the number of roles that the NAU GAME already could in 2010, despite enormous efforts by the same development team. For processes requiring information that is more specific and detailed, it is harder to use such stylized models to connect roles. For the BIJLMER JUNCTION case, the network controllers were not connected via a software module, but through face-to-face contact and ‘looking-over-the-shoulder’ interfaces, which led to different levels of discussion between the players, and was a partial explanation for the failure here to collect quantitative data.
The logical consequence of this situation is a need to develop more modules for more roles that are capable of connecting to high-tech simulations for train-traffic controllers and other roles via a backbone. These modules should interconnect with traffic-simulation models, as well as with the other modules and the infrastructure information databases. Only after such extensive development will high-tech games provide the same functionality as low-tech games for questions with many different roles.
This means that, with respect to the roles active in the context of the problem, functional fidelity is easier to obtain in low-tech games. This assertion implies that the fidelity of high-tech games is lower than that of low-tech games, given the same development time and resources.
Rules: Fidelity aspects modeled in the game
On this aspect, few differences between low-tech and high-tech cases have been observed in the official rules. In all cases, the rules of the reference system applied, with the exception of a few explicit rules around specific aspects of the scenario. What is clear in retrospect, however, but notoriously hard to measure, is how the rules were expressed. The low-tech games allowed for what Van den Hoogen and Meijer (2014) call ‘larger search distance’ in deviation from reality. In all cases, reality was taken as the default rule set and brought in via the experts, but the possible deviations were significantly larger for the low-tech cases. The language around the rules in the low-tech games avoided the exact technical implementation of rules in operational ‘handles’ (knobs, choices) in the game. In the high-tech cases, much attention went to the exact implementation of the handles for each rule, as documented for the BIJLMER JUNCTION case in Meijer, Mayer, Van Luipen, and Weitenberg (2011). This will influence both functional and psychological fidelity, though not in a linear way. For functional fidelity, it means that every available option needs to be precise in the high-tech cases, whereas it can be handled in a more playful way in the low-tech cases. In all low-tech cases, the game leader improvised solutions for process aspects that came up during the game and were not explicitly modeled. For the high-tech cases, the answer was always ‘this falls outside the scope of the game’. One could argue that more precise rule implementations make for higher fidelity, but also that offering more of the rules and options that are available in reality has the same effect. The influence on psychological fidelity is not clear, but should be researched.
Objectives: Fidelity aspects modeled in the game
The objectives of low-tech cases were formulated in more general terms than those of the high-tech ones (see item f above). This relates to the broader scope of these games mentioned earlier. Compared to high-tech games, the objectives of low-tech games are expressed more in symbolic than in mechanistic terms. All operators involved in the game sessions have been trained to deal adequately with the differences between the objectives of high-tech and low-tech games.
Constraints: Fidelity aspects modeled in the game
The high-tech cases were the only environments in which interlocking and safety systems were simulated. This meant that the fidelity of both physical and functional characteristics is higher for this important subsystem of the railways. In the process leading up to the game design, the project leaders deemed this aspect important enough to justify the additional preparation time needed.
Time model
Apart from PLATFORM OVERNIGHT PARKING, all cases used continuous time, with a smaller resolution for the high-tech cases. Notably, players remarked that the simulator was faster and had better resolution than the real workplace. In the low-tech cases, the resolution and effects of interlocking were always systematically assessed in the debriefing. Any issues were taken up in the final reporting.
Data representation
On this aspect, significant differences between the cases have been observed. All low-tech cases used familiar schematics to represent the infrastructure, as depicted in Figure 2, and simple game-like objects, such as scouring sponges, to represent trains. These were easily accepted in all cases, with a good laugh from the operators when the game elements were introduced. The high-tech cases were more precise in their representation of the infrastructure, but in the BIJLMER JUNCTION case the representation was in geospatial rather than schematic format. This was tested with the operators during the preparations and accepted, though it led to serious issues during the session (documented in Meijer, 2009). This was solved in the RAILWAY BRIDGE GAME, but here, too, a vivid discussion took place with every player on the rules of correspondence between the simulator and the real-world systems. Here we observe the uncanny valley problem, in which believability (and thus psychological fidelity) is reduced when interfaces get closer to reality but are not equivalent to the real thing. This observation is in line with the arguments, sketched earlier, that the low-tech cases place more emphasis on psychological fidelity and less on physical fidelity.

Figure 2. Game board with schematic representation of Utrecht Central Station infrastructure.
Load: Scenarios
The number of scenarios in the high-tech cases was a little higher, especially for the RAILWAY BRIDGE GAME case. This is logical, given the aim to collect more quantitative data. Scenarios consisted of traffic flow scenarios plus incidents, defined with a particular infrastructure, timetable, set of controllers, and rolling stock. Typically, scenarios include 1 to 1.5 hours of play.
Qualitative and Quantitative Data
Data generated
Both low-tech and high-tech cases generated qualitative and quantitative data. As mentioned earlier, the quantitative data generated by the RAILWAY BRIDGE GAME were more precise and of a quality on which quantitative analyses could be performed. None of the low-tech games were capable of providing such data. The PLATFORM OVERNIGHT PARKING case was capable of providing a precise answer on the range of train wagons that can be parked along the platform.
Immersion
No differences have been observed between high- and low-tech games regarding the more or less smooth immersion of the players into these virtual worlds. Apparently, high-tech simulations, mapping the physical and functional characteristics of the reference system with higher precision than the low-tech games, did not necessarily yield higher psychological fidelity. On the contrary, low-tech games, through their symbolism, scored higher on psychological fidelity (as noted above).
Debriefing
Given the nature of the games in this research project, the debriefing has had a place in the playing of the game different from what is common in games for learning or policy exercises. For all games, debriefings have been held after each round or scenario, and after the entire gaming session. These exercises served two purposes:
Collect qualitative data that had been missed during the actions in the game. Each debriefing started with purely procedural invitations from the facilitator to tell what had happened over the course of the game. All games with a multi-actor set of players had similar storytelling and construction of the language about the game scenarios played. The RAILWAY BRIDGE GAME was different due to its single-player character.
Validate the dynamics and tendencies observed in the game. Once the narrative and data collection had been completed, and enriched with observations from quantitative data and observers, the value of the observations for reality and the level of realism were assessed in every debriefing. In this, a small difference was found between the high-tech and low-tech games in the extent to which quantitative data could be produced immediately after the game. The low-tech games required video post-processing for train movements, whereas the high-tech games could provide near-instant feedback.
Van den Hoogen et al. (2014) have formalized the debriefing method developed along the ProRail project and provide the framework as listed in Table 3. This framework has been followed for all of the cases listed above, though it has been formalized better over time.
Table 3. Framework of the Phases, Addressed Topics, and Involved Participants in a Research Game (Adapted From Van den Hoogen et al., 2014).
Conclusions
By comparing low-tech and high-tech games used for innovation in comparable cases in the railway sector, this article identified different patterns emerging from a retrospective cross-case comparison.
The high-tech cases were used to generate more quantitative data, including aspects of safety systems and interlocking of the railways. Given the early stage in the innovation cycle at which all concepts were tested (none of these concepts were ready to be pushed to the operations yet, and all involved decisions on long-term changes to the railway system), in hindsight it is questionable whether the goals of more precise testing through the high-tech cases can be justified.
The low-tech cases were used to get similar conclusions and consequences as the high-tech cases, but were not used to get the same level of process data. Otherwise, little difference was found in the actual use of these games.
To ensure good participation of the experts who played their own roles in all cases, the kinds and appropriate level of fidelity are crucial for the success of the games. However, the discussion of the cases shows that, despite the higher precision, the experience of fidelity of high-tech simulators is not necessarily better than that of low-tech cases. The larger number of roles and greater flexibility in rules help low-tech cases enhance both functional and psychological fidelity with respect to the traffic control situation at hand. For the constraints and data representation, high-tech cases provide more precise physical and functional fidelity, but low-tech cases provide wider fidelity, by also including psychological fidelity and by covering more items related to physical and functional fidelity. Additionally, the uncanny valley problem can come into play when high-tech interfaces get closer to the real-world systems but still differ from their reference-system equivalents.
The role of the gaming cases in the innovation process was always to test and assess an innovative idea via the gaming session. When we look at the conclusions and consequences of the cases, none of the cases resulted in formal acceptance or rejection of a formal hypothesis, but rather in positive or negative results, explained by many qualitative results on benefits and drawbacks. Does this mean that the cases did not really test the innovations? Were these games actually policy games in the tradition of Duke and Geurts (2004)? The reported real-world decisions that have been based on the gaming sessions show that the games were more than an intervention in a process, but none of the cases can uphold the rigor of formal testing either. Here, the author believes that a new use of gaming needs to be further developed into a theoretical concept: gaming for design processes. This should extend the existing framework by Klabbers (2003) on design-in-the-small and design-in-the-large, as previously discussed.
In this study, low-tech games proved appropriate for questions involving multiple types of roles and rapid development. Through a process of trial-and-error, the team discovered how to develop low-tech solutions to connect to the mental models of the operators involved, thus producing games that now have real consequences in the decision-making on the future of the railways.
Connecting low-tech to high-tech games requires additional components such as software interfaces and backbones, which require long development times. Even though such modules are currently under construction, in the meantime low-tech games are capable of producing symbolic representations of systems that generate outcomes that fulfill their purpose for the innovation projects that ask for them. The potential of high-tech games, involving a greater variety of roles that mirror the related social organization of the transport system, will be elaborated in the coming years. Based on current experience, both types will be utilized within the Dutch railway organization, for varying purposes.
Acknowledgements
This article is a revised version of a paper presented at the 2012 CESUN Conference held in Delft (Netherlands). It was enhanced through the discussions and interactions at the conference regarding the structure of collecting data on gaming sessions, and from the feedback of a series of good reviewers. I would like to thank the session chairs for their efforts. I am particularly grateful to the team members at ProRail and TU Delft.
Author Contributions
The author wrote the manuscript single-handedly. He performed the synthesis based upon the work of the Railway Gaming Suite project team, for which he was the project leader.
Declaration of Conflicting Interests
The author declared no conflicts of interest with respect to the authorship and/or publication of this article.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research has received funding from ProRail and Next Generation Infrastructure Foundation (NGI).
