Abstract
In the quest to improve projects, project actors rely on sound project evaluation. However, project evaluation can be complex and challenging. This study aims to explore and define project evaluation and reveal how it can promote continuous improvements within and across projects and organizations. A review of extant literature finds four constitutive properties for project evaluation: criteria, times, evaluands, and evaluators. Based on the action design research of 75 projects in 21 organizations, the study finds three evaluation perspectives: process, outcome, and learning. Understanding the multidimensionality of project evaluation through the seven identified dimensions offers a meaningful conception of project evaluation.
Introduction
Project evaluation is necessary for giving project actors, the people who are involved in projects and their evaluation, relevant information to improve projects. The expectation of success typically drives evaluation, but there is much debate and disagreement on the overall idea of success (Ika, 2009; Jugdev & Müller, 2005; Pinto et al., 2021; Pinto & Slevin, 1988; Shenhar et al., 1997). Most project evaluation research deals with project success and assessments after project completion (Haass & Guzman, 2020), often using the classical iron triangle: assessing time, cost, and quality (Lenfle, 2012), and comparing the results to the plan. Achieving short-term goals is increasingly complemented with assessing long-term project effects (Atkinson, 1999; Shenhar et al., 2001). Research has moved toward a broader view of project evaluation (Zidane & Olsson, 2017), so this study concentrates on project evaluation holistically, throughout the project life cycle.
Assessing project success at project completion is beneficial for learning, but judgments made in hindsight cannot improve the project. Project evaluation can provide a qualified basis for improving projects by generating insights for enhancing performance (Earley et al., 1990), increasing success rates (Powell & Buede, 2006), and preventing failures (Chen, 2015). Project evaluation is therefore also necessary before the project starts, to justify project selection (Archer & Ghasemzadeh, 1999), align projects with strategy (Merikhi & Zwikael, 2019; Samset, 2003; Williams & Samset, 2010), and prioritize funding and resource allocations (Criscuolo et al., 2017; Harrison & Harrell, 1993; Lin et al., 2019; Martinsuo & Poskela, 2011; Müller & Turner, 2007). During ongoing projects, evaluation helps monitor and control projects (Crawford & Bryce, 2003; Kivilä et al., 2017; Lin et al., 2019; Liu et al., 2014; Merikhi & Zwikael, 2019), guide change decisions (Steffens et al., 2007), and even terminate the projects as needed (Unger et al., 2012). During project closure, evaluation is important to review performance and accumulate lessons learned (de Wit, 1988) to improve future projects. Different purposes require the use of different evaluation criteria (Hart et al., 2003), as do comparisons among projects (Archer & Ghasemzadeh, 1999; Barber, 2004). Benchmarking can improve projects by early qualification of the business case, such as benefits and risk estimates (Flyvbjerg, 2006).
For project evaluation to be beneficial, there must be acknowledgment of its holistic nature and different evaluators’ viewpoints (Baccarini, 1999; Chen, 2015; Davis, 2014; Korhonen et al., 2014; McLeod et al., 2012). Current project evaluation knowledge is fragmented (McLeod et al., 2012) and even conceptual definitions are weakly aligned. There is a lack of consensus regarding how to evaluate projects, including what indicators to use, how, and when (Haass & Guzman, 2020, p. 589): “…both project management theory and practice suffer from the lack of frameworks that consider the emerging and evolving temporality, dynamism, subjectivity and complexity of projects [and] project environments.” There is a need for the development of a coherent integrated project evaluation conception (Merikhi & Zwikael, 2019). Moreover, project evaluators need to recognize the complexity of project evaluation and take into account the evolving, comparative, subjective, and relative nature of evaluations (McLeod et al., 2012). The current study is an attempt to bridge the gaps in extant evaluation models (Zidane et al., 2016) and address recent calls for holistic conceptual approaches for project evaluation.
The purpose of this study is to develop a definition and multidimensional conception of project evaluation to improve projects. This article provides insight into project evaluation by structuring the complexity of project evaluation and answering the overall research question: How can project actors use a multidimensional conception of project evaluation to improve projects?
The next section reviews extant literature to develop a definition of project evaluation and extract four central project evaluation properties. The methods section introduces the empirical action design research approach as a process of developing and using different versions of a project evaluation framework. The results section presents three project evaluation perspectives, and the illustration section shows how the four properties and the three perspectives appear in project evaluation practice. As a key contribution, the properties and perspectives are integrated into a multidimensional conception of project evaluation to improve projects.
Literature Review
Defining Project Evaluation
Evaluation is a natural part of everyday life. Evaluation is the act of appraising or valuing something (Oxford English Dictionary). It is perhaps the single most important and sophisticated cognitive element in human reasoning and logic (Osgood et al., 1957, in Stufflebeam & Coryn, 2014). Although essential in project management (Anbari, 1985; Merikhi & Zwikael, 2019), project evaluation is poorly understood. The APM Body of Knowledge, 7th edition (APMBoK) (Association for Project Management [APM], 2019); A Guide to the Project Management Body of Knowledge (PMBOK® Guide) – Sixth Edition (Project Management Institute [PMI], 2017); and PRINCE2 (AXELOS, 2017) do not feature the term evaluation in their glossaries. The PMBOK® Guide and PRINCE2 describe project evaluation only in relation to project closure, yet project evaluation is relevant in many different circumstances (Merikhi & Zwikael, 2019).
In a recent review, Haass and Guzman (2020, p. 574), examining 72 papers on project evaluation, portrayed project evaluation as a multilayered affair, and saw it necessary to “…view project evaluation as a socially constructed endeavor, in which evaluators and those who are evaluated interact with each other in an ongoing basis to make sense [of] the evaluation process and its outcome.” Linzalone and Schiuma (2015, p. 92) examined 57 program/project evaluation models, developed a classification of 20 typologies, and emphasized: “With particular regard to projects and programs, evaluation is the assessment and the analysis of the effectiveness of an activity; it involves the formulation of judgments about the impact and progress. Evaluation is the comparison of the actual effects of a project, against the agreed planned ones.”
While researchers have developed project evaluation frameworks for different purposes and tested them in projects, the concept is treated vaguely. Table 1 shows a selection of previous frameworks and definitions (or lack thereof) based on a structured literature search. A systematic search was conducted using the Business Source Complete Database, with selected search words (evaluation, assessment, monitoring, controlling, and judgment) and focusing on scholarly peer-reviewed articles published in academic journals, especially on project management. Title and abstract readings enabled delimiting the focus to studies that dealt with project evaluation frameworks, and additional publications were discovered through snowballing. A detailed table was developed to summarize the literature, and Table 1 shows a condensed version. Many frameworks are skewed toward only one type of (effectiveness) success criteria, limited in their focus on only one (absolute) project, or restricted toward only one (ex-post) time perspective. The frameworks together suggest that project evaluation should take multiple properties into account.
Table 1. Project Evaluation Frameworks
The methodological underpinnings of the developed frameworks may explain why they do not fully address the multifaceted character of project evaluation. All except three publications (Cao & Hoffman, 2011; McLeod et al., 2012; Ngacho & Das, 2014) are conceptual and build on extant literature, without empirical evidence. Only one of the three empirical publications starts from the organization’s reality rather than theory (McLeod et al., 2012). Project evaluation may seem simple in desktop research, compared to the reality and actuality of projects.
To define project evaluation, we adjust Chen’s (2015) conception of program evaluation as the activity of systematically gathering or generating and analyzing data about projects to answer what, who, when, and how questions that can improve projects. The “what” questions relate to a project’s components and results: its intervention, deliverables, and value. The “who” questions relate to the people connected to a project: its managers, sponsors, and users involved in the project and its evaluation. The “when” questions relate to the timing of the project evaluation: ex-ante, interim, or ex-post. The “how” questions relate to project evaluation units: absolute evaluation focuses on only one project, whereas relative evaluation compares several projects. Table 1 indicates that the majority of project evaluation frameworks cover only parts of these four questions. There is a need for an integrative conception that encapsulates the four properties crucial to the definition of project evaluation and their interlinkages.
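As a minimal sketch of how the four questions in this definition could be recorded for a single evaluation, the following Python fragment models one evaluation episode. The names `EvaluationEpisode`, `Timing`, `Unit`, and `uncovered_questions` are hypothetical and introduced only for illustration; they are not part of the study's framework.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Timing(Enum):
    """The "when" question: timing of the evaluation."""
    EX_ANTE = "ex-ante"
    INTERIM = "interim"
    EX_POST = "ex-post"


class Unit(Enum):
    """The "how" question: evaluation unit."""
    ABSOLUTE = "absolute"  # focuses on only one project
    RELATIVE = "relative"  # compares several projects


@dataclass(frozen=True)
class EvaluationEpisode:
    """Hypothetical record of one project evaluation (illustrative only)."""
    what: Tuple[str, ...] = ()   # e.g. intervention, deliverables, value
    who: Tuple[str, ...] = ()    # e.g. managers, sponsors, users
    when: Optional[Timing] = None
    how: Optional[Unit] = None

    def uncovered_questions(self) -> List[str]:
        """Return the questions this episode leaves unanswered."""
        answers = {
            "what": self.what,
            "who": self.who,
            "when": self.when,
            "how": self.how,
        }
        return [name for name, value in answers.items() if not value]


# An ex-post review of deliverables that names no people and no unit.
episode = EvaluationEpisode(what=("deliverables",), when=Timing.EX_POST)
print(episode.uncovered_questions())
```

A record of this kind makes partial coverage visible: an episode that answers only some of the four questions reports the rest as uncovered.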
Identifying Multiple Project Evaluation Properties
Evaluation Standards
An evaluation standard refers to a criterion that a thing and its success are judged by, and project success covers a group of standards (Ika, 2009). It is well acknowledged that projects should be evaluated based on multiple standards or criteria (Anbari, 1985), potentially compared to general business objectives or specific project goals. Two overall kinds of success are typically distinguished: project success versus project management success (de Wit, 1988; Ika, 2009; Jugdev & Müller, 2005).
As the objective of completing a project within budget, schedule, and scope requirements is fundamental in project management, projects are often evaluated in terms of project management success (Atkinson, 1999; de Wit, 1988; Ika, 2009; Shenhar et al., 1997), recently also referred to as project management efficiency (Serrador & Turner, 2015; Zidane & Olsson, 2017). The focus here is on doing things right (Zidane & Olsson, 2017), which requires the manager’s ability to manage a project by converting inputs to outputs in a resource-efficient way in line with the triple constraints of cost, time, and scope/quality (Baccarini, 1999; Samset & Volden, 2016). While efficiency partly explains project success (Serrador & Turner, 2015), the triple constraint is considered insufficient and other criteria are also needed (Ika, 2009). Efficiency offers a partial and rather simplistic view of measuring success with a hard, tangible, internal, and tactical short-term focus, which perhaps suits project control rather than project success (Baccarini, 1999; de Wit, 1988; Pinto & Slevin, 1988; Samset & Volden, 2016; Shenhar et al., 1997).
Consequently, efficiency criteria must be complemented with other criteria encompassing project success or project effectiveness (Pinto & Slevin, 1988; Zidane & Olsson, 2017). Project success and effectiveness feature more external, intangible, soft, and long-term criteria concerning the results of the project: its output and impact (Baccarini, 1999; Pinto & Slevin, 1988; Shenhar et al., 1997; Turner & Zolin, 2012), and doing the right thing (Zidane & Olsson, 2017). Effectiveness criteria include the value generated by the project, the relevance and usefulness of the project’s results, meeting of project goals, strategies, and organizational objectives, and (direct) organizational benefits, (indirect) community benefits, side benefits, and future potential. Effectiveness can be measured in stakeholder satisfaction, sales, income, profit, and market share as well as sustainability, innovation and new ideas, skills, technologies, capabilities, and core competences (Atkinson, 1999; Baccarini, 1999; Barclay & Osei-Bryson, 2009; Haass & Guzman, 2020; Laursen & Svejvig, 2016; Martinsuo, 2019; Martinsuo & Killen, 2014; Nelson, 2005; Pinto & Slevin, 1988; Samset & Volden, 2016; Serrador & Turner, 2015; Shenhar et al., 2001; Svejvig et al., 2019; Williams & Samset, 2010; Zidane et al., 2016).
Project evaluation literature lacks consensus regarding which criteria to use (Haass & Guzman, 2020), and central concepts are considered ambiguous and overlapping (Zidane & Olsson, 2017). A simplistic criteria list is not sufficient for assessing success across all projects; rather, criteria need to be treated as subjective, context-specific, and even symbolic and rhetorical constructs (Haass & Guzman, 2020; Ika, 2009).
Evaluation Timings
The timing of the evaluation relates to the type of improvement aimed for. The literature covers three temporal orientations based on their approach to the project (Haass & Guzman, 2020; Merikhi & Zwikael, 2019).
Ex-ante project evaluation or appraisal at the front end of the project often takes the point of departure in a project’s business case to inform whether a project should exist or not (Merikhi & Zwikael, 2019; Samset & Volden, 2016; Williams & Samset, 2010). Ex-ante evaluation justifies the choice of the project among alternatives, based on its estimated impact, strategic fit, and alignment with the host organization’s goals and objectives (Samset, 2003; Samset & Volden, 2016; Williams & Samset, 2010). It also informs decisions necessary to integrate the project into a project portfolio and allocate project resources (Lopes & Flavel, 1998; Merikhi & Zwikael, 2019; White, 2011). Project evaluation research seldom concentrates on ex-ante evaluation (Haass & Guzman, 2020).
Interim project evaluation or monitoring concerns the status and progress of the project and assists in controlling and steering the project forward (Bauch & Chung, 2001; Colin & Vanhoucke, 2015; Crawford & Bryce, 2003; Kivilä et al., 2017; Lin et al., 2019; Merikhi & Zwikael, 2019; Wong et al., 2010). Project managers need to make periodic assessments throughout the project’s life cycle to monitor it (Pinto & Slevin, 1988). As project success emerges and evolves from project execution onward over time (Shenhar et al., 1997), its evaluation covers only interim and ex-post time perspectives.
Ex-post project evaluation, or judgment, deals with the project’s past performance identified through post-project reviews, retrospectives, or postmortems (Merikhi & Zwikael, 2019; Nelson, 2005; White, 2011). Ex-post project evaluation starts at project closure, can continue years after project completion, and can cover short and long time frames (Turner & Zolin, 2012). A long time may pass before success can be really evaluated (Shenhar et al., 1997), as the project results turn to benefits over time (Pinto & Slevin, 1988). Such ex-post evaluations are often black-and-white judgments of failure or success (Chen, 2015) and cannot do much for the evaluated project, but their potential lies in improving future projects (Nelson, 2005). Most project evaluation research concentrates on ex-post evaluation (Haass & Guzman, 2020).
The three time perspectives differ in terms of the availability and uncertainty of information. Many empirical studies of project evaluation are conclusive and summative, and judge a project late, when much information is available and the results are known (Chen, 2015; Mertens & Wilson, 2012; Stufflebeam & Coryn, 2014). Few evaluation studies are formative and constructive in terms of creating information that can improve the project throughout its life cycle. While evaluations are made in the different phases of a project’s life cycle (Hart et al., 2003), aspects of timing and repetition are weakly covered in the literature.
Evaluation Units
Project evaluation needs to clearly explicate the evaluand, the evaluation entity (Mertens & Wilson, 2012), and here it is a project. A project is often evaluated in a vacuum, and only compared to itself at an earlier or later point in time. Offering merely a nuanced picture of that project (McLeod et al., 2012) is not the best strategy in all situations. Relative evaluation, in other words, comparing projects to other projects, could be useful too. Project portfolio management tends to center on evaluating projects only when selecting (Dye & Pennypacker, 1999) or terminating projects (Unger et al., 2012).
Both singular (within-project) and plural (across-project) project evaluations are needed to improve projects. For instance, project managers or owners in charge of multiple projects benefit from efficiency and effectiveness comparisons (Marques et al., 2010). Benchmarking can reveal the potential and problems of a project or a portfolio for prioritization and strategizing (Barber, 2004; Flyvbjerg, 2006). Internal benchmarking for different project types, businesses, or regions can assist in recognizing high- and low-performing projects and sharing best practices (Xu & Yeh, 2014), and it supports a culture of continuous improvement and learning (Barber, 2004). External benchmarking can lead to a better understanding of the sector, industry or market, benefits and risks (Flyvbjerg, 2006), or discovery of new ideas, solutions to similar problems, and proven practices (Barber, 2004).
Evaluation Stakeholders
Project evaluation is something someone does, and that someone is essential to the evaluation. An objective epistemology dominates in project evaluation (Haass & Guzman, 2020; Ika, 2009) and treats successes and failures as absolute truths to be discovered (Jung Ho et al., 2019; Nelson, 2005; Robertson & Williams, 2006; Shao et al., 2012). Yet, success is socially constructed by project actors. Two groups of actors are central: evaluation stakeholders who conduct the evaluation and project stakeholders who provide data for the evaluation. Following Freeman’s stakeholder definition (Mitchell et al., 1997), we define (1) project stakeholder as any group or individual who can affect or is affected by the project, and (2) evaluation stakeholder as any group or individual who can affect or is affected by the evaluation. Consequently, many project and evaluation stakeholders can be identified, but research acknowledges only some of them.
Evaluation stakeholders are often internal, but external evaluators also exist. Examples are associations such as IPMA conducting the Global Outlook Survey in collaboration with KPMG and AIPM (Sexton et al., 2019), PMI publishing the Pulse of the Profession® report (PMI, 2020), or the Standish Group producing the CHAOS report based on a large-scale survey (Johnson, 2018). Additionally, researchers may conduct evaluation case studies or assess dozens of projects in large datasets (see, for example, Samset & Volden, 2016). In such external evaluation, stakeholders play a powerful role in selecting informants, defining criteria, designing measurement instruments, analyzing data, and interpreting and presenting the results. Internal evaluation stakeholders include committed and supportive senior directors and developers who can ensure that the evaluation is carried out professionally (Loo, 1985). Directors, developers, and project portfolio managers often devise assessment tools and frameworks specific to the business.
Project stakeholders, such as project sponsors, owners, project team members, and steering committee members, each have their own viewpoint on project status and success, due to different backgrounds, experiences, knowledge, information, interests, preferences, stakes, and values (Baccarini, 1999; McLeod et al., 2012; Samset & Volden, 2016; Zidane et al., 2016). Therefore, project evaluation should encompass the perceptions of multiple stakeholders as information input (Davis, 2014, 2018), which occurs rarely (Turner & Zolin, 2012). Oftentimes, emphasis is on the perceptions of project managers only (Davis, 2014). There is a need to account for the viewpoints of diverse stakeholders and include their experiences in project evaluation (Barclay & Osei-Bryson, 2009).
Method
This article draws on action design research (ADR), which is a research method used to generate prescriptive design knowledge through building and evaluating artifacts in an organizational setting (Sein et al., 2011). ADR combines elements of action (interventions) and design (artifacts) research and is considered relevant for studying project evaluation, as it uses a systematic specification of justificatory knowledge based on insights from practice, as well as kernel theories (Gregor & Hevner, 2013) from project and evaluation disciplines.
ADR implies close collaboration between researchers and practitioners to build and evaluate an artifact in an iterative process of designing, reflecting, and abstracting learning, which takes place in both an abstract domain and an instance domain (Gregor & Hevner, 2013). In this case, the development of a project evaluation framework is at the abstract level, whereas the application of it is at the instance level.
This ADR study is based on a national initiative to develop, implement, and evaluate a new project management methodology (PMM) designed to improve project management by increasing the speed and impact of projects. The study entails designing and applying a project evaluation framework, to evaluate a set of pilot projects using the new PMM, and comparing these to a set of similar reference projects not using the new PMM.
The national initiative lasted six years (2015–2021) and consisted of 75 projects in 21 organizations. Within each of the 21 organizations, at least one pilot project applied the new PMM and was compared with up to three similar reference projects not applying the new PMM. The projects are very diverse across organizations, but similar within organizations. The initiative was funded by a private foundation and involved project participants, managers, owners, and stakeholders from the 21 organizations, as well as consultants and researchers.
Data Collection
The empirical study features multiple embedded cases (Yin, 1989), with several units of analysis (projects) within each of the cases (organizations). Table 2 provides an overview of the data used for the ADR study.
Table 2. Data Overview
Table 2 shows that the organizations vary in size and represent different industries, and that the projects differ in complexity and size. The data rely on one to six projects evaluated in each of the 21 organizations and related data generation interactions between researchers and practitioners, including interviews, focus groups, review meetings, and project evaluation documents. All data are summarized in confidential reports and parts are published in official reports. The official publication (Jensby et al., 2021) contains the qualitative quotes from the illustration section.
Framework Development
Table 3 provides an overview of the actors, activities, and artifacts of the different ADR stages. Different versions of the project evaluation framework (PEF) are presented and discussed with researchers and practitioners throughout all stages.
Table 3. Action Design Research Stages
Problem Formulation: A Project Evaluation Template Was Developed
The private foundation expected to see benefits from using the new PMM, which required evaluation on different levels. This study focuses on project-level evaluation. A comprehensive literature search on evaluation theory and application (Stufflebeam & Coryn, 2014) identified a need for (intraorganizational and interorganizational) comparisons (Rihoux & Ragin, 2009), and a template was designed to map projects with contexts, mechanisms, and outcomes (Pawson & Tilley, 1997). The objective was to evaluate the pilot projects using the new PMM in order to improve projects and project management.
Building, Intervention, and Evaluation: The Early Versions of the PEF Were Developed
Project Evaluation Framework Version 1 (PEF1)
We entered the first organizations in an experimental phase, where both the new PMM and the evaluation approach were in flux. We developed an abstract solution illustrated in PEF1, based on the open systems theory (Chen, 2015) and evaluation theory (Pawson & Tilley, 1997; Stufflebeam & Coryn, 2014). PEF1 includes four tentative perspectives that prompted discussions on evaluation results among practitioners and researchers: (1) the classical iron triangle, (2) specific success criteria, (3) internal benchmarking, and (4) external benchmarking. The abstract solution was instantiated in select projects at different times, so we were able to learn and improve the evaluation framework.
Project Evaluation Framework Version 2 (PEF2)
While clear scripts and templates were standardized for the evaluation process, we discovered that learning in and between projects was a crucial element of the evaluation (Wong et al., 2010; Wong et al., 2012). The evaluation results facilitated not only cross-project learning, which is of vital importance for any organization seeking to improve its projects (Cao & Hoffman, 2011), but also cross-organizational learning. We realized that the learning facilitated by both internal and external benchmarking in and between projects and organizations was tied to a specific project perspective, being either particular (success criteria) or general (iron triangle). As all four perspectives of PEF1 informed learning, we added learning as a fifth and central element in PEF2.
Reflection and Learning: A Mature Version of the PEF Was Developed
The comprehensive study of real-life projects created a large amount of qualitative and quantitative data, which were structured and condensed for analytical purposes. Initial reflection and learning from the data processing happened when preparing project evaluation reports, making cross-organizational comparisons, and writing conference papers and journal articles. Continued reflection and learning required revisiting our reports and the literature for new inspiration and input. Intermediary results were presented and discussed between practitioners and researchers on several occasions.
Project Evaluation Framework Version 3 (PEF3)
Despite the usefulness of PEF2, the five evaluation categories were experienced as too specific to cover all project evaluations. It was necessary to move to a higher abstraction level, so that we could use the framework more generally. This abstraction process meant revisiting evaluation literature again to develop PEF3 as a replacement for PEF2. As it became evident that the organizational boundary was only one boundary of many (Barber, 2004), we reduced the internal and external benchmarking to one benchmarking perspective. As it became evident that the projects’ specific targets (success criteria) consisted of both classical efficiency criteria (iron triangle) and more value- or vision-driven effectiveness criteria (Atkinson, 1999; Laursen & Svejvig, 2016), these two perspectives were replaced by outcome and process perspectives (Chen, 2015).
Formalization of Learning: The Final Version of the PEF Was Designed
While we used the former PEF versions for data collection and analysis as well as categorization and presentation of project evaluation results, the final PEF reflects a need to conceptualize a broader understanding of project evaluation. This spurred another review of the literature and our reports on project evaluation and a subsequent definition of project evaluation and identification of the four constitutive project evaluation properties presented in the literature section.
Project Evaluation Framework Version 4 (PEF4)
As the four properties cast the four perspectives of PEF3 in a new light, PEF3 is reduced to three perspectives. As benchmarking is one relative evaluation variant out of several possibilities covered by the evaluand property, we excluded it from the final PEF4. In the evaluated projects, benchmarking is integral to the other three perspectives (outcome, process, learning), which are all different variants of the same criteria properties. The resulting generalized abstract solution involves the three perspectives constitutive of the mature and final PEF4. The results section introduces the three perspectives, and the following illustration section instantiates them through the four properties. In line with ADR, we will conclude by discussing some design principles that connect generalized outcomes to a class of solutions (Sein et al., 2011), namely how researchers and practitioners can use project evaluation to improve projects.
Data Analysis
The data analysis was first performed within each organization by reviewing all data and using a deductive coding approach, covering the three perspectives of the evaluation framework identified during the ADR process (outcome, process, learning) and evaluation episodes that manifest the properties proposed in the literature review (criteria, timing, units, stakeholders). All instances were cross-tabulated so that all project evaluations for each organization were covered, then supplemented by statistics from the quantitative data used for comparing projects internally in the organization. The project evaluation data were compared in both qualitative and quantitative terms for the cross-case analysis. The quantitative data were used for the technical reporting of the initiative, whereas the qualitative analyses included identification of informative quotes and vignettes from the data, to offer illustrative examples of the project evaluation perspectives and properties.
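The cross-tabulation step described above can be illustrated with a short sketch. It assumes that each coded instance is stored as a (perspective, property) pair; the function name `cross_tabulate` and the sample codes are hypothetical and do not reproduce the study's coding scheme or data.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Deductive codes named in the text: three perspectives, four properties.
PERSPECTIVES = ("outcome", "process", "learning")
PROPERTIES = ("criteria", "timing", "units", "stakeholders")


def cross_tabulate(
    coded_instances: Iterable[Tuple[str, str]],
) -> Dict[str, Dict[str, int]]:
    """Count coded instances per (perspective, property) cell."""
    counts: Counter = Counter()
    for perspective, prop in coded_instances:
        if perspective not in PERSPECTIVES or prop not in PROPERTIES:
            raise ValueError(f"unknown code: {(perspective, prop)!r}")
        counts[(perspective, prop)] += 1
    # Build the full table so that empty cells show up as zero.
    return {
        p: {q: counts[(p, q)] for q in PROPERTIES}
        for p in PERSPECTIVES
    }


# Illustrative codes for one organization (invented for this example).
sample = [
    ("outcome", "criteria"),
    ("outcome", "criteria"),
    ("learning", "timing"),
]
table = cross_tabulate(sample)
for perspective, row in table.items():
    print(perspective, row)
```

Filling every cell, including the empty ones, is a deliberate choice here: a zero shows where a perspective and a property were never observed together in an organization's evaluations.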
Results
The project evaluation framework was built by combining previous literature and empirical experience on practicing project evaluation, focusing on three perspectives: process, outcome, and learning (shown in Table 3).
The Outcome Perspective
The outcome perspective focuses on the outputs, impacts, and effects of projects: what the project creates (Chen, 2015). Outcomes create direct and indirect organizational and societal effects (Atkinson, 1999). An outcome perspective on evaluation monitors whether the project satisfies clients, customers, suppliers, the project team, and other stakeholders (Müller & Turner, 2007; Müller & Turner, 2010). Outcome evaluation also moves beyond traditional objective-based evaluation to consider the value (Laursen & Svejvig, 2016) or worth (Martinsuo, 2020) of a project’s realized and potential outcomes.
In the empirical study, the outcome perspective was used to generate data on project success within each of the 21 organizations. The outcome data show the absolute and relative success rates of the pilot projects applying the new PMM. Relative success was evaluated by benchmarking 15 pilot projects with their comparable reference projects to find seven pilot projects with a higher relative success rate, three projects with a medium success rate, and five projects with a lower success rate compared with the reference projects within the same organization. Such outcome evaluation is important to understand project results and improve projects, for instance, through a dialogue on what sets the most and least successful projects apart.
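The text does not state how a pilot project was assigned a higher, medium, or lower relative success rate. The following sketch shows one possible rule under explicit assumptions: success is expressed as a single numeric score per project, the benchmark is the mean score of the reference projects in the same organization, and a tolerance band around that mean counts as medium. The function name `relative_success` and the `tolerance` parameter are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_success(
    pilot_score: float,
    reference_scores: Sequence[float],
    tolerance: float = 0.05,  # assumed width of the "medium" band
) -> str:
    """Classify a pilot project against its organization's reference projects."""
    if not reference_scores:
        raise ValueError("at least one reference project is required")
    baseline = mean(reference_scores)
    if pilot_score > baseline + tolerance:
        return "higher"
    if pilot_score < baseline - tolerance:
        return "lower"
    return "medium"


# Invented scores for one organization with two reference projects.
print(relative_success(0.8, [0.5, 0.6]))
```

Any such rule would be applied within one organization at a time, because the text describes the projects as similar within organizations but very diverse across them.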
The Process Perspective
The process perspective focuses on the mechanisms of projects: how the project is conducted (Chen, 2015). The iron triangle—comparing expected and realized cost, time, and quality (Atkinson, 1999)—is one example of process evaluation. Process perspectives, however, need to go beyond objective-based evaluation and encompass project practices and management behavior. This may be referred to as white-box evaluation (Chen, 2015), which can explain what goes on inside the project, between input and output, in contrast to summative black-box evaluation.
In the empirical study, we used the process perspective to generate data on specific practices within each of the 75 projects. The process data show how all 75 projects are managed: the degree to which project managers use a set of nine practices represented by the new PMM, scored on a scale from one (little application) to four (much application). For instance, a project manager was asked: “To what extent did you focus on customer value in this project?” A statistical analysis of 22 pilot projects and 46 reference projects compared the average practice scores of the two groups and found that pilot and reference projects differ significantly on all nine PMM practices. The process evaluation shows that PMM practices are used significantly more in pilot projects than in reference projects, and confirms that the PMM is actually new and radically different from normal practice. Such process evaluations help project actors to understand project dynamics and improve projects, for instance, through dialogues on which good practices could be useful in the future.
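A comparison of average practice scores between two groups of projects can be sketched as follows. This is a minimal illustration only: the text does not state which statistical test was applied, so Welch's t-test is an assumption chosen here because the groups differ in size, and the scores below are invented values on the one-to-four scale for a single hypothetical practice.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two group means (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Invented scores (1 = little application, 4 = much application).
pilot_scores = [4, 3, 4, 3, 4, 4]
reference_scores = [2, 3, 2, 1, 2, 3]

t_stat, dof = welch_t(pilot_scores, reference_scores)
```

A positive t statistic indicates a higher mean score in the pilot group; whether the difference is significant depends on comparing the statistic with the t distribution at the computed degrees of freedom. If SciPy is available, `scipy.stats.ttest_ind(pilot_scores, reference_scores, equal_var=False)` computes the same statistic together with a p-value.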
The Learning Perspective
The learning perspective focuses on lessons learned in and between projects (Shaw et al., 2006), and evaluation should produce credible and useful information for learning (Organisation for Economic Co-operation and Development [OECD], 2010). Constructive evaluation reveals future potentials, whereas conclusive evaluation makes retrospective judgments (Chen, 2015). Learning promotes improvements (Cao & Hoffman, 2011; Christiansen & Mouritsen, 2020) and accumulates to dynamic capabilities (Teece et al., 1997) and core competences (Prahalad & Hamel, 1990) to equip the organization to meet the future. Organizations tend to look ahead without looking in the rearview mirror to learn from experience (Samset & Volden, 2016). Learning may challenge existing knowledge but can also direct attention to what is strategically important (Martinsuo & Killen, 2014).
In the empirical study, we used the learning perspective to generate lessons learned within and across the 21 organizations. For instance, four pilot projects judged as failures were scrutinized to understand the reasons for their lower success rate. While there can be many causes of failure, the failed projects showed that the timing of the application of the PMM mattered. Late implementation did not support the pilot projects sufficiently, suggesting a need to adopt the PMM early in the project’s life cycle. The examples also showed that failing projects were terminated based on early insight and, in that sense, considered successful attempts to reduce waste. Such learning evaluations are important to stimulate continuous improvement, for example, in terms of preventing similar mistakes in the future.
Illustration
Based on the data from 75 projects in 21 organizations, this section illustrates how the three perspectives of the project evaluation framework are integral to each of the four project evaluation properties.
Multiple Criteria
The outcome perspective was central to the use of evaluation criteria. All 75 projects with available data were evaluated based on impact measures, and at least one-third were evaluated based on one or more established success criteria. Such summative and conclusive outcome evaluations judged the projects as high or low performing and more or less successful. The projects’ success criteria varied in their level of explicitness. Some projects were evaluated based on a list of more than 10 success criteria, whereas others had very few. Typical evaluation criteria were based on some outcome of the project, ranging from detailed features of deliverables to one conclusive measure of overall customer satisfaction and loyalty. One consultant, reflecting on regular evaluations of project outcomes, illustrates a combination of conclusive and constructive approaches integrated into the flow of the project:
“Although the business case was discussed much too late with the customer, an early and very positive involvement was initiated with customers. On a biweekly basis, the customer (the retail chain) was shown the solution at its current progression. At the end of the discussion, the customer would rate their expectation across 3–4 KPIs [key performance indicators]. It created a very open atmosphere and a very high level of energy in the team and between the team and the customer.”
Criteria relating to process were also apparent. All 75 projects were evaluated based on the practices of project managers. Some projects used a minimum score for stakeholders’ perceived progression throughout the project, or for accelerated speed, that is, the time constraint in the iron triangle. Process-related criteria included objective and subjective indicators, qualitative statements, and quantitative measurements.
Of the studied projects, 36% were explicitly evaluated based on their learning, formalized into lessons learned. The majority of lessons dealt with the project process, but some also related to the produced outcome. Process-related learnings revealed important information for improving not only the project’s future but also future projects. As a particular project’s outcome is seldom replicated in future projects, learnings from the process are of a more general nature and represent good practices that could be diffused among projects.
Multiple Times
The dominant perspectives when evaluating projects multiple times concern process and outcomes, yet learning occurred from comparing distinct project evaluations performed at different points in time. Of the 75 projects, 27 used ex-ante evaluation in the creation of an impact case (resembling a business case) to define targets, and front-load projects to deliver early effects. This constructive impact case was revisited and used as a vehicle for repeated monitoring and control throughout the projects. In this way, an outcome perspective on impact was used both in ex-ante and interim evaluations. In the 21 organizations, many projects were evaluated multiple times, and all projects were evaluated ex post after their termination from an outcome and process perspective, to identify their success rates and managerial practices. For example, a pilot project and two reference projects were evaluated 3, 6, and 12 months after project closure, respectively. These identical evaluations gave different pictures of the projects’ relative success because they were made at different times.
One-third of the projects regularly evaluated stakeholder perceptions of progress to track trends over time. For instance, in one project key stakeholders evaluated progress biweekly. The stakeholder evaluations were plotted into a diagram, which revealed a U-curve of high stakeholder enthusiasm turning to moderate and returning to high at the end of the project. Such repeated stakeholder evaluations were helpful for continuously improving the project. One consultant’s reflections illustrate the importance of regular stakeholder satisfaction evaluation for continuous improvement and also for stakeholder satisfaction in itself: “After workshops, we usually conducted a mini [stakeholder satisfaction] pulse check with the participants. Getting immediate feedback gave us valuable insights and allowed us to take corrective action when needed. When we were about three months into the project, we hosted a workshop and were running out of time. To finish on time, we agreed on next steps and finished the workshop there. One of the operation managers asked in a somewhat confused and disappointed tone: ‘But what about the pulse check?’ Only then did we realize that they also enjoyed doing the pulse check.”
Multiple Evaluands
All 75 projects were evaluated both absolutely and relatively, with comparisons made both within and across the 21 organizations. These internal and external comparisons were based on outcome and process perspectives and enabled learning. For example, projects were often understood within their own project boundaries, but justified decision-making in the organization required prioritizing and, thereby, putting single projects into a broader perspective. As some projects competed for the same resources, comparisons were helpful for resource negotiations. Using the projects’ relative outcomes and process evaluations offered information for priority argumentation and decision-making.
For example, multiple evaluands needed attention in order to resolve delay risks and competing priorities among projects. A consultant reflected on a major risk in one project, resolved by comparing the project’s impact to another project: “A critical improvement initiative was dependent on a system change. To carry this change through, we needed help from the IT department. However, the IT organization was busy with another project and therefore declined the meeting invitations and did not respond to phone calls. Our countermeasure was to address this with the project owner and the specific consequences of this problem on impact. He took a stern view, as he saw how this obstacle would affect his KPIs. He took the issue up with the CEO who talked with the head of IT, and convinced him that the improvement we were working on had more impact than the IT project.”
Multiple Evaluators
The use of multiple evaluators emphasized the learning perspective of evaluation but used outcome and process data as content. Project evaluations were often performed in review and planning meetings involving multiple evaluators. These meetings involved sensemaking and negotiation processes, and sometimes they resulted in one integrated project evaluation perception. Typical internal evaluators are project team members with detailed project knowledge. They, however, do not necessarily value the same things. In one organization, there was an us-versus-them division between the information technology and business people, which challenged feedback mechanisms and delayed decision-making. In the pilot project, however, these two subgroups worked side by side in design workshops. The closer collaboration ensured a common understanding of the project, which simplified decision-making and accelerated the progress of the project.
Across projects and organizations, the project owner appeared as a powerful internal project evaluator. Project owners’ active engagement in project evaluation revealed their centrality and importance. They contributed through insightful knowledge, a catalyzing and legitimizing role, and accelerated decisions. A consultant explained how a project owner influenced the project by changing the evaluation and prioritization: “We are months into the project and the commercial core team members are once again gathered in the project room. [The current state of the project] is messy. […] For the first time, the project owner has joined the meeting in the ‘engine room’ and engages in the discussions […] and the value is undebatable. He challenges the team on their current prioritization and technical focus and intuitively directs the dialogue toward the business impact that the project was initially set out to realize. […] At the end of the meeting, prioritizations have been updated and there appears to be a new common mindset and agreement that commercial deliverables that might otherwise be postponed must be accelerated.”
It is necessary to acknowledge other voices and views in addition to the insider perspectives of the project team and the project owner. External project stakeholders, such as customers and end users, were sometimes invited and included in both project outcome and process evaluations. Such evaluators brought vital insight to improve projects, for instance, from prototype testing and simulating user experiences.
Interplay of Evaluation Properties
The illustration reveals that the four properties are highly intertwined and very difficult to treat separately. Treating a project in isolation would offer a different evaluation than if it were compared with other evaluands, even when criteria, timing, and evaluators are fixed. Different evaluators focusing on different aspects will generate different evaluations, even when evaluands, criteria, and timing are fixed. Evaluating a project at different times may result in different success rates, even when evaluands, criteria, and evaluators are fixed. A project can be a failed success or a successful failure, depending on the criteria used in the evaluation, even when evaluands, evaluators, and timing are fixed.
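The idea of varying one property while holding the other three fixed can be made concrete with a small sketch. It is an illustrative assumption rather than an instrument used in the study: the record structure, the field names, the project, and the verdicts are all invented.

```python
from dataclasses import dataclass

PROPERTIES = {"evaluand", "criterion", "time", "evaluator"}


@dataclass(frozen=True)
class Evaluation:
    evaluand: str   # what is evaluated
    criterion: str  # what it is judged on
    time: str       # when the judgment is made
    evaluator: str  # who makes the judgment
    verdict: str    # result of the judgment


def verdicts_varying(records, varied):
    """Map each value of the varied property to its verdict.

    Raises ValueError unless the other three properties are identical
    across all records, so the comparison isolates a single property.
    """
    fixed = sorted(PROPERTIES - {varied})
    fixed_values = {tuple(getattr(r, name) for name in fixed) for r in records}
    if len(fixed_values) != 1:
        raise ValueError("the other three properties must be held fixed")
    return {getattr(r, varied): r.verdict for r in records}


# Invented records: the same hypothetical project, time, and evaluator,
# judged on two different criteria.
evaluations = [
    Evaluation("Project X", "budget adherence", "at closure", "project owner", "failure"),
    Evaluation("Project X", "customer value", "at closure", "project owner", "success"),
]

by_criterion = verdicts_varying(evaluations, "criterion")
```

In this invented case, the same project is a failure on one criterion and a success on another, which is the pattern described above as a failed success or a successful failure.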
Discussion
Improving Projects Through Multidimensional Project Evaluation
This study offered a formal definition of project evaluation as systematically gathering or generating, and analyzing data about projects to answer what, who, when, and how questions that can improve projects (building on Chen, 2015). The definition adds to the debate about what project evaluation is and how it can be used to improve projects. Although reviews of project evaluation do exist (see, for example, Haass & Guzman, 2020; Linzalone & Schiuma, 2015), the concept of project evaluation has remained vague.
Our findings revealed the multidimensionality of the concept of project evaluation, combining evaluation properties (criteria, times, evaluands, evaluators) with perspectives on evaluation (outcome, process, learning). Specifically, our definition adds a when question with a timing answer to the other questions in Chen’s definition, and draws attention to the use of evaluation for improvement prior to and during the project. The large-scale empirical study showed that the four project evaluation properties are closely tied together and intertwined with the three perspectives. The findings offer an empirical illustration of the multidimensionality of project evaluation and portray its inherent complexity. The multidimensional view offers a more nuanced, complete, and inclusive way of thinking about project evaluation. This reveals the many different possibilities in project evaluation, and helps project actors to approach the evaluation task in a structured manner to improve projects.
The multidimensional view of project evaluation supplements earlier project evaluation frameworks in several ways. For instance, it suggests adding multiproject comparisons to logical framework and log frame approaches (Baccarini, 1999; Couillard et al., 2009; Crawford & Bryce, 2003), considerations of time to some frameworks (Crawford & Bryce, 2003; Zidane et al., 2016), and learning and outcome criteria to frameworks focusing exclusively on project efficiency (Xu & Yeh, 2014). Although some studies suggest extensive lists of general evaluation criteria (Ngacho & Das, 2014; Zidane et al., 2016; Zidane & Olsson, 2017) and shared perceptions of project success across different stakeholders (Davis, 2014), our findings emphasized the use of different criteria across different projects and firms, and the need to acknowledge multiple evaluators with different backgrounds and priorities. While project evaluation research displays much heterogeneity regarding informative criteria (Haass & Guzman, 2020), in reality, project managers and end users agree more about success criteria in successful projects compared to unsuccessful projects (Wateridge, 1998, in Jugdev & Müller, 2005). However, project stakeholders often disagree about which criteria are the most important (McLeod et al., 2012; Nelson, 2005). Our findings confirm the need to use multiple criteria for project evaluations, to cover both short-term and long-term success, and multiple stakeholder expectations, to improve projects. The three evaluation perspectives (process, outcome, and learning) provided a useful categorization of evaluation criteria across different projects and organizations.
This article answered the research question: How can project actors use a multidimensional conception of project evaluation to improve projects? The findings portray the use of project evaluation as a holistic, evolving, relative, and social act that requires repeated sensemaking within the evaluation context. Moreover, the study shows that the multidimensionality of project evaluation is subject to change both within and across single and multiple projects and organizations. Project actors need a systematic approach for navigating project evaluations so that they become useful for improving projects. This requires holistic understanding of project evaluation as a concept, along with evaluation methods and practices covering the four properties and three perspectives of project evaluation. The ADR process resulted in a multidimensional approach to project evaluation that project practitioners and researchers used as a basis for their improvement initiatives and recommendations.
Project evaluation appeared as a selective and subjective act that requires individual and collective sensemaking. Project actors used the integrated and multidimensional conception of project evaluation to understand projects and their status, to monitor and control them, and to generate learning for the future. Improvement of projects manifested in the organizations rearranging resources, initiating changes, and speeding up or terminating some projects. These findings concerning project improvement complement earlier research that tends to tie project evaluation to ex-ante project selection (Criscuolo et al., 2017; Dye & Pennypacker, 1999; Harrison & Harrell, 1993) or ex-post and conclusive judgments of project success or project management success (de Wit, 1988; Ika, 2009).
Selectiveness in multidimensional project evaluation enables the situation-specific negotiation of evaluation criteria, acknowledgment of different evaluator priorities, and the comparison of projects in their natural settings. Project actors typically operate with an overflow of information: what they need is not more information, but meaningful information that can help them navigate in the often turbulent and uncertain waters of project reality. The sensemaking that preconditions project improvement therefore requires qualified project evaluators.
Methodological Contributions
This study contributes methodologically by exemplifying the use of ADR (Sein et al., 2011) for generating solutions to a scientific and practical problem on project evaluation. Project scholars have only recently begun to see the possibilities of applying ADR, which is a recognized and rather mature methodology within the field of information management. This study adds to the few studies applying ADR in project management (Henriques & O’Neill, 2021; Mikkelsen et al., 2020). Applying ADR here yielded useful descriptive and prescriptive project knowledge across different research contribution levels (Gregor & Hevner, 2013). Specifically, this study generates contextual and specific knowledge through the instantiated illustration and more abstract, complete, and mature knowledge through the design principles presented in the practical implications.
The ADR study contributes with knowledge generated from 75 projects in 21 organizations, to reveal the practical aspects of project evaluation. Developing and validating a multidimensional project evaluation conception across different contexts, including both project successes and failures, are necessary to reveal the complexity of project evaluation. Traditional project evaluation frameworks are conceptual models without empirical grounding (see, for example, Anbari, 1985; Loo, 1985). Although case studies illustrate some frameworks (see, for example, Marques et al., 2010), practical validation in a broader sample of projects and organizations is necessary. This study thus makes a contribution as an embedded multiple-case study addressing different projects in distinct contexts.
Practical Implications
Three design principles (Sein et al., 2011) are suggested as practical implications to help project actors who wish to improve projects through multidimensional project evaluation.
First, it is important to consider the constitutive nature of project evaluation. Projects as well as evaluations and improvements are social constructions (Berger & Luckmann, 1966; Haass & Guzman, 2020), and subject to negotiation (McLeod et al., 2012). In the ideal case, all aspects are covered in a multidimensional and triangulated project evaluation but, in reality, project evaluations never become perfect. Project actors will need to choose between alternative approaches and decide on one project evaluation solution to a given project evaluation problem. Careful consideration of meaningful project evaluation dimensions in socially constructed contexts is necessary.
Second, it is important to consider the contextual conditions of project evaluation. When choosing among different alternatives, the circumstances of the project evaluation are decisive. Project evaluation conditions involve mandate and power (Loo, 1985), as well as political stances, ideologies, assumptions, and norms (Zidane et al., 2016), not to mention institutional forces, organizational commitments, sectional interests, professional affiliations, and individual agendas (Haass & Guzman, 2020). Such rules can be conscious and official, or unofficial and subconscious, but should be explicated (Linzalone & Schiuma, 2015) as much as possible, as they can be quite powerful in guiding the evaluation. Our study also drew attention to tangible conditions, such as resources, including time, money, and data availability. As those evaluation conditions are also socially constructed and negotiated, it is important to set realistic expectations concerning what project evaluation can and cannot do.
Third, it is important to consider change as a constitutive condition of project evaluation. As uncertainty is an inherent feature of projects and their contexts, project improvement suggestions and initiatives need to be flexible, and so does the project evaluation informing them. If uncertainty denotes the difference between the data required and the data already possessed (Galbraith, 1973), a reliance on plan-oriented methods (Sanderson, 2012) needs to be complemented with situation-sensitive action to deal with uncertainty (Huemann & Martinsuo, 2016). This study has shown that project evaluation cannot always be planned perfectly, but needs to be flexible, allowing for amendments. Project actors need to continually reflect on the appropriateness of the project evaluation design, considering the context and circumstances and the possible changes in both. Designing project evaluation to improve projects requires continuous, context-specific sensemaking and adjustments.
Conclusion
This study has offered a broad and integrated definition for project evaluation and brought together its multiple dimensions, in terms of properties (criteria, times, evaluands, evaluators) and perspectives (outcome, process, learning). Project actors can use project evaluation to acquire accurate information and generate comprehensive understandings of the actuality of projects in their contexts, allowing for sound improvement recommendations and initiatives. Instead of criterion lists, we have introduced project evaluation as a holistic, evolving, subjective, and social act that entails constant sensemaking and acute awareness of the evaluation context. As evaluation has important implications for project selection, control, and learning, the multidimensional understanding developed in this study is useful for practical application.
This comprehensive study has instantiated the project evaluation perspectives and properties. The validity of this study is limited by its application context, which consists primarily of private sector industry firms. While the illustration of the project evaluation conception confirms its internal validity and serves as a proof of concept, the authors encourage further testing of the conception in other contexts and with other types of projects. In general, future research is warranted to better understand the contextuality of project evaluation, for instance, by exploring what the contextual embeddedness of projects and evaluations means for the different dimensions of project evaluation presented here. Another validity limitation stems from the choice of projects included as pilot and reference projects, which form the main data for this study. The project choices are not objective, but offer sufficient variety for research purposes. Further studies could benefit from using project evaluation throughout entire project portfolios or programs, instead of selective settings. As project evaluation has such a central position in the continuous improvement of projects, the authors encourage more research on longitudinal aspects and practices preceding, during, and following project evaluation. A practice perspective on project evaluation could reveal more about the inherent challenges and complexities of performing project evaluation in practice. Finally, there is a potential avenue for future research in the use of ADR to further explore and explain project evaluation. Specifically, this approach holds potential to develop a complete design theory with a set of refined design principles.
Acknowledgments
This study is part of a larger project entitled Project Half Double—funded by the Danish Industry Foundation, a private philanthropic industry foundation. The authors would like to thank all participants who provided data for this study. Earlier versions of the work behind this article have been presented at conferences (Laursen et al., 2017; Rode & Svejvig, 2018; Rode & Svejvig, 2021) as part of the research process described in the methods section.
