Abstract
We reflect on AI's implications for management and organization research. Advanced language models like ChatGPT can analyze large datasets and generate human-like text, potentially disrupting academic research and knowledge production. While potentially helpful, the technology evokes concerns about, for example, inaccuracies and accountability, inadvertent plagiarism, and confidentiality breaches. We also reflect on potential systemic consequences such as the erosion of creativity, analytical thinking, and the human voice. Overall, our stance is cautious optimism. We require full disclosure regarding AI's role in research, and advocate for maintaining human oversight and critical engagement with AI to preserve the integrity and originality of academic work.
Introduction
Artificial intelligence (AI) is a trendy topic for project practitioners and academics alike. The latest version of PMI Infinity™ promises to introduce features designed to enhance project management practices significantly by providing a robust set of tools to improve daily project management, such as smart navigation to the most relevant and current best practices, aimed at increasing project success. PMI Infinity™ also includes a continually expanding library of over 14,000 pieces of content, vetted by Project Management Institute (PMI) members and project management experts (PMI, 2024b). On the academic side, according to Scopus, nearly 40,000 articles were published on the subject in the first six months of 2024.
Not surprisingly, the attention of management scholars, attracted by the trend, has turned to AI (von Krogh et al., 2023; Lindebaum & Fleming, 2024; Gatrell et al., 2024; Clegg & Sarkar, 2024). Particularly intriguing is the rise of Advanced Natural Language Processing (ANLP) models and their powerful subset, Large Language Models (LLMs), such as OpenAI’s Generative Pretrained Transformer (GPT), Google's BERT (Bidirectional Encoder Representations from Transformers), and T5 (Text-to-Text Transfer Transformer). These machine models can analyze vast amounts of unstructured data and generate professional text resembling human work, and those with generative capabilities (e.g., GenAI) claim to create new content (Dwivedi et al., 2023). AI technology heralds a future for a different kind of automatic writing, in which there is an engine creating the ideomotor effect (Shin et al., 2010), a machine in the ghost rather than a ghost in the machine.
Given that much scholarly work in project studies, as well as management and organizational research in general, involves analyzing and creating texts, these emerging technologies promise much. Beyond automatic writing in response to prompts, there is the possibility of assisting “with the entire value chain of knowledge production, from synthesis to creation to evaluation and translation” (Grimes et al., 2023, p. 1617), including analyzing new datasets (Grégoire et al., 2024) that would simply not have been possible to analyze without AI capabilities. For instance, while a scholar can reasonably analyze a few hundred pages of interview transcripts manually, it is unrealistic to analyze several thousand pages of government reports or newspaper articles in the same way. While systematic literature reviews written by humans over several months may provide overviews of research themes and topics, advanced software and coding devices can produce results much faster, and even at a greater level of detail.
Speed is not everything; interpretive understanding is also important, and it may well be lacking to a greater extent in machine-generated than in human-generated content. Indeed, it is difficult to say that it could be present in the former, whereas it is possible to discern its absence in the latter. Nonetheless, AI technologies fundamentally alter human engagement with, development of, and dissemination of knowledge. Consequently, these technologies may, in Schumpeterian terms, “creatively destroy” academic research.
As editors of and contributors to Project Management Journal® (PMJ) and project studies, we will use this editorial to reflect on the potential and implications of AI’s creative destruction, deconstruction, and reconstruction of management and organization research. Major publishers, such as Sage, Elsevier, and Wiley, have policies prescribing the use of AI in research and writing, most significantly prohibiting coauthorship with a bot. The Journal of Management Studies already cautioned against the loss of human voice and developed its policy for AI, emphasizing transparency and human oversight, and banned the use of AI in its peer-review process (Gatrell et al., 2024). The Journal of Management Inquiry curated a discussion on practical and other philosophical issues related to the use of AI in management research, with a division between contributors for whom AI can function as an under laborer and others for whom the ethicality of the nonhuman is deeply problematic (Kulkarni et al., 2024). The Academy of Management Journal (Grimes et al., 2023) appreciated AI’s potential as a tool for management knowledge production, dissemination, and accessibility.
More fundamentally, the discussion across management and organizational journals goes beyond governing scholarly work to address the very role of management and organizational scholars in society. As Grimes et al. (2023) conclude, “we suspect that a plausible generative AI-led shift from scarce academic knowledge production to abundant academic knowledge production will inevitably increase the urgency around answering a fundamental question: To what problems in society is management scholarship the (unique) solution?” (p. 1623). To ask this question suggests to project scholars a lack of vision, as surely the important problems are related to the futures we choose to make. As Clegg (2023) writes, “we do not have to live in the pasts that have been made in the futures still to be created” (p. 31). Future-making entails imagination and political will to materialize possibilities rather than render probabilities.
As an initial reflection designed to curate discussion and debate about the implications of AI, the editors of the PMJ invite you, the project scholarly community and the readers of PMJ, to a dialogue about AI’s affordances and our ingenuity to navigate them in our efforts to make a better future. AI might introduce a range of new possibilities for developing new crafts with which to produce research rather than diminishing our craft, upskilling rather than deskilling, and working with technology rather than against it. While AI has potential implications for all facets of scholarship, including discovery, integration, application, teaching, and learning (Clegg & Sarkar, 2024), our focus here is on the scholarship of discovery.
Moreover, Gatrell et al. (2024) propose the difference between research on AI and research with AI. Similarly, in the project field, research on AI can be on AI projects, in other words, projects to develop and apply AI in various contexts or on the uses of AI in and for projects (Niederman, 2021; Holzmann et al., 2022). Research on AI can, for example, investigate how AI can be used for creating risk registers and improving project scheduling or decision-making. These are relevant topics as stated in the recent PMI report on AI (PMI, 2024a): “This is important evidence supporting the need to experiment more to understand the full potential of this technology in project management.” (p. 8) Research on AI was addressed in a past PMJ Thoughtlet, in which Müller et al. (2024) discussed various opportunities and challenges associated with research on AI.
In this editorial, we move forward and explicitly focus on research with AI, reflecting on AI as a tool to conduct scholarly work. AI, as a research tool, joins a heritage built on technical tools, such as SPSS, NVivo, and others; however, there is a fundamental difference. While the themes and correlations of these earlier software products require human interpretation of the pattern-making that is produced, as Pachidi (in Kulkarni et al., 2024) argues, AI applications are performatively predictive pattern makers. In this editorial, we will discuss the implications of the main AI technologies currently in play. We will discuss their use in different phases of research work—interpreting data, reading, writing, and even developing research ideas—and their critical potential implications: what can be gained, what could be lost, and what can be considered off-limits from an editorial perspective. We will close with reflections on the roles of management and organizational scholarship in the age of AI.
What’s the Story? GPT-4: Please Help!
To start, we asked GPT-4 to explain the different technologies related to natural language processing with relevance to management research. The prompt was as follows:
“I would like to write a section in an editorial about the role of genAI and LLM in academic work, in particular in my field of project studies. As we have been exchanging here, there are many acronyms and different concepts, that are not necessarily known to the general public, like the difference between LLM and ANPL, and GenAI, and machine learning… could you please write me a short overview of the core technologies that have important implications to research in management and organization, explaining the differences between these technologies? I would like a short text, around three to four paragraphs. Please take your time, accuracy is the most important criterion”.
On this prompt, GPT-4 produced the following answer, which we read, adjusted, and verified: “In recent years, the landscape of academic research, especially in the field of project studies, has been significantly transformed by the advent of generative artificial intelligence (GenAI) and large language models (LLMs). Understanding these technologies and their differences is crucial for comprehending their implications for research in management and organization. Generative AI refers to a subset of artificial intelligence that can generate new content, such as text, images, or music, based on the data it has been trained on. This technology leverages advanced machine learning techniques to create outputs that are often indistinguishable from those produced by humans. Large language models (LLMs), a key component of GenAI, are specialized AI systems designed to understand and generate human language. LLMs, such as OpenAI's GPT-4, are trained on vast amounts of text data, enabling them to produce coherent and contextually relevant text across various topics. These models are particularly useful in academic research for tasks like drafting papers, summarizing literature, and even conducting preliminary data analysis. Another important concept is artificial narrow intelligence (ANI), which contrasts with the broader capabilities of GenAI. ANI, or narrow AI, is designed to perform specific tasks or solve particular problems. It excels in narrowly defined areas, such as recommendation systems or image recognition, but lacks the generalized problem-solving capabilities of LLMs. While ANIs are invaluable for automating routine tasks and analyzing structured data, LLMs offer more versatility, making them suitable for creative and complex tasks in academic research. Machine learning (ML), the broader field underpinning both GenAI and ANI, involves training algorithms on data to identify patterns and make predictions or decisions. 
ML encompasses various techniques, from simple linear regression models to complex neural networks. In project studies, ML can be used to analyze project performance data, predict project outcomes, and optimize resource allocation. The integration of ML with LLMs and GenAI allows researchers to tackle multifaceted problems by combining data-driven insights with advanced linguistic capabilities. The implications of these technologies for management and organization research are profound. GenAI and LLMs can enhance the efficiency and creativity of academic work, enabling researchers to generate high-quality content, synthesize vast amounts of information, and explore innovative solutions. As these technologies continue to evolve, they hold the potential to redefine research methodologies, offering new tools and perspectives for studying complex organizational phenomena and improving project outcomes.” (OpenAI, 2024, see attachment for full transcript)
Machining Interpretation
Speed and Rigor in Current Practices With Some Caveats
GenAI and LLMs can and probably will revolutionize practices of data collection and analysis. Most of the research within project studies is based on relatively unstructured data stored in written format (e.g., transcribed interviews, diaries, reports) and coded by hand, with the support of general-purpose software (e.g., word processors or spreadsheets) or dedicated qualitative data software packages like NVivo and ATLAS.ti. Some of these dedicated packages are already incorporating AI functionalities. For instance, ATLAS.ti partnered with OpenAI and promises to “Reduce your overall data analysis time by up to 90% with the power of OpenAI; Increase the accuracy of your results thanks to advanced AI models and algorithms and Save valuable time with various automated and semi-automated research tools.” (ATLAS.ti, 2024). As such, GenAI and LLMs can potentially help with existing coding practices. When we have clear ideas of what to look for, GenAI can aid in coding with more complex and well-defined prompts. If used properly, GenAI might also enhance the reliability and validity of research. It may assist in producing augmented data to test our interpretations across various scenarios and identify data inconsistencies. It can also help in the painstaking reanalysis of data, confirming whether one or another interpretation better fits the data. The use of such technology can make data analysis processes quicker and easier and, as some of the algorithms are freely available (whereas qualitative data software packages require a paid subscription), they are also more accessible, lowering barriers to the methodological competence required in research, potentially democratizing research practices.
Stimulating New Research Topics, Questions, and Data
Beyond doing what we already do, GenAI and LLMs may stimulate exploration of new research avenues (Grimes et al., 2023) that rely on the analysis of new datasets (Grégoire et al., 2024) impossible to access without AI capabilities. We are particularly fascinated by the opportunities to engage with the much larger unstructured datasets GenAI and LLMs can afford. First, GenAI can do rapid qualitative analysis of large amounts of unstructured data (Odacioglu et al., 2023); for example, thousands of pages of newspaper articles, social media posts, or government reports, opening opportunities to address different research questions and analyze new datasets that were until now simply too laborious to work with using conventional data analysis methods. Second, as Grimes et al. (2023) suggest, we can use GenAI for inductive exploration of large datasets through simple prompts, asking the algorithm to identify patterns that we might not have seen. Qualitative analysis of vast datasets could reveal underlying networks and structures of relations otherwise not apprehended. Such research might even receive increased legitimacy with positivists, a benefit that dedicated qualitative researchers might well regard as dubious.
While size is not everything, it does open new opportunities. We experimented with GPT-4 for qualitative data analysis of the Fehmarn Belt link, a major infrastructure linking Denmark and Germany, which solicited contrasting opinions among various stakeholders. While in Denmark the infrastructure had strong support, in Germany many protesters questioned its utility and environmental impact. To investigate the emergence of contrasting narratives, their evolution over time and their performativity, we collected qualitative data encompassing 5,178 newspaper articles from 178 sources from the two countries, spanning a period of 30 years (1993–2024), the length of infrastructure development. Nexis Uni, a comprehensive search engine, provided access to a wide range of news, legal, and business resources. In Nexis Uni we searched for a string using the different names associated with the infrastructure in both countries: “Fehmarnbelt” OR “Fehmarn fixed link” OR “Fehmarn project” OR “Femern Bælt” OR “Femern Bælt-forbindelsen” OR “Femern Bælt-tunnelen” OR “Femern projekt” OR “Femern A/S” OR “Fehmarnbelt-Tunnels.”
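The search string is a plain Boolean disjunction of quoted name variants. As a minimal sketch, it could be assembled from a documented list of variants so that the search remains reproducible. The helper name `build_or_query` is hypothetical and not part of Nexis Uni, and the assumption that the database accepts quoted phrases joined by OR is ours.

```python
# Name variants taken from the search string reported in the text.
NAME_VARIANTS = [
    "Fehmarnbelt",
    "Fehmarn fixed link",
    "Fehmarn project",
    "Femern Bælt",
    "Femern Bælt-forbindelsen",
    "Femern Bælt-tunnelen",
    "Femern projekt",
    "Femern A/S",
    "Fehmarnbelt-Tunnels",
]


def build_or_query(terms):
    """Join quoted terms with OR (hypothetical helper; query syntax is assumed)."""
    cleaned = [t.strip() for t in terms if t.strip()]
    return " OR ".join(f'"{t}"' for t in cleaned)


query = build_or_query(NAME_VARIANTS)
print(query)
```

Keeping the variants in a list makes it easy to report exactly which names were searched and to add further spellings later.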
Documents selected were differentiated into samples based on two key variables: language and publication date. Before feeding the documents to AI, we performed a trial to validate the data analysis method. The trial consisted of a comparison between manual and AI inductive coding. We performed an inductive first-order coding on 25 newspaper articles related to the Fehmarnbelt tunnel project in English. We then replicated this process with an AI model exclusively trained on the analyzed documents, scrutinizing its accuracy in comprehension against established qualitative themes and codes. To compare the accuracy of the AI analysis, Table 1 provides a comparison between the manually identified codes “environment and sustainability” and “local stakeholders” and the AI-identified codes “environmental sustainability,” “environmental reclamation,” and “community engagement.” Specifically, the table presents the direct quotations identified in both cases to support such codes.
Table 1. Comparison Between Manual and AI Coding
AI coding proves to be well-aligned with manual coding and sometimes achieves a greater level of granularity. For example, what was categorized under the theme “environment and sustainability” in manual coding, AI divided into “environmental sustainability” and “environmental reclamation.” The quotations provided by AI to support its codes are accurate. Only in a few cases does AI slightly modify the phrase to ensure the text segment used as support makes complete sense. For instance, the quotation “The infrastructure will […] while also fostering the development of new nature and recreational landscapes by working with nature concepts,” is rendered by AI as “The project fosters the development of new nature and recreational landscapes by working with nature concepts.” to avoid including a nonrelevant phrase segment for the corresponding code.
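Because AI occasionally rephrases the supporting quotation, it may help to check each quotation against the source article automatically and flag the non-verbatim ones for manual review. The following is only a sketch of one possible check, not the procedure used in our trial; the function names and the example sentences are invented for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def check_quote(quote, source_text):
    """Return ('verbatim', 1.0) if the quote occurs in the source text,
    otherwise ('modified', best similarity to any single source sentence)."""
    q = normalize(quote)
    src = normalize(source_text)
    if q in src:
        return "verbatim", 1.0
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", src) if s]
    best = max((SequenceMatcher(None, q, s).ratio() for s in sentences), default=0.0)
    return "modified", round(best, 2)


# Invented example text, used only to illustrate the check.
source = "The tunnel will shorten travel times. It will also create new recreational areas."
exact = check_quote("It will also create new recreational areas.", source)
rephrased = check_quote("The project creates new recreational areas.", source)
print(exact, rephrased)
```

Quotations flagged as modified are not necessarily wrong, as the example in the text shows, but they are the ones a human coder should inspect.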
Overall, we considered the results of the comparison between manual and AI coding sufficiently close to proceed further with the other documents. First, we divided the collected documents into two samples based on the language (i.e., German and Danish). Then we further split each sample into three time periods, obtaining six samples. For each of the six samples, we created a dedicated AI chat. Each AI chat embodied one of the several narratives we considered in the study. AI was provided with further information about the piece of research we were conducting, the aim and research questions, and the main definitions of constructs and concepts. Subsequently, we provided detailed instructions for the analysis to be performed (i.e., first-order coding of the Gioia method). Finally, we manually proceeded with the Gioia method, grouping first-order concepts in second-order themes and aggregate dimensions while investigating their connections both internally and with established theory.
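The partitioning step (two languages by three time periods, giving six samples) can be sketched as follows. The period boundaries, the `Article` structure, and the function names are assumptions made for illustration; the actual cut-off years are not reported here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Article:
    language: str  # e.g., "de" for German, "da" for Danish
    year: int
    text: str


# Hypothetical period boundaries; the text does not state the actual cut-offs.
PERIODS = [(1993, 2003), (2004, 2014), (2015, 2024)]


def period_label(year):
    """Map a publication year to its period label."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    raise ValueError(f"year {year} is outside the study window")


def split_into_samples(articles):
    """Group articles by (language, period); with 2 languages and 3 periods
    this yields at most six samples."""
    samples = defaultdict(list)
    for article in articles:
        samples[(article.language, period_label(article.year))].append(article)
    return dict(samples)


demo = [Article("de", 1995, "a"), Article("da", 2020, "b"), Article("de", 1999, "c")]
samples = split_into_samples(demo)
print({key: len(items) for key, items in samples.items()})
```

Each resulting sample would then be analyzed in its own chat, so that the context given to the model stays specific to one language and one period.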
Our example offers one possibility of the blended application of human expertise with GenAI that relates to project studies. In the future, we look forward to receiving manuscripts that use such methods or develop other insightful and innovative forms of technology advancements to enable us to ask interesting questions and accomplish research impossible before AI. We expect that the usage of AI should be acknowledged in the methodology section of submitted manuscripts.
Caveats When Machining Interpretations
Several challenges are associated with the use of LLMs and GenAI in data analysis and research in general. First, AI can produce erroneous analyses. Indeed, commentators write of “AI hallucinations” to refer to instances where an AI system generates false or misleading information that appears plausible (Walters & Wilder, 2023). Until a few months ago, it was possible to ask ChatGPT-3.5 for “evidence” that humans had never been to the moon. After a few “pushy” prompts, ChatGPT-3.5 would eventually provide some “proof” based on various conspiracy texts that moon exploration was a hoax. ChatGPT-4 is better and, even after a few pushy prompts, keeps insisting that there is overwhelming evidence of a moon landing. Yet not all the disputes are so easy, and some topics are intrinsically controversial (e.g., try to ask ChatGPT-4 about nuclear power or sticky geopolitical issues). Arguably, hallucination can be diminished over time through expert human actors’ applied intelligence, as discussed by Glaser and Gehman (in Kulkarni et al., 2024).
Another way around this is to state clearly in the prompt that accuracy is critical and that no data or a lack of clarity is a valid answer. The algorithmic search may be lengthier but also less susceptible to error. Glaser and Gehman (in Kulkarni et al., 2024) note that LLMs can be improved through instruction tuning and alignment tuning (Zhao et al., 2023, pp. 15–20). Practices of chain-of-thought prompting (Wei et al., 2023) and tree-of-thought prompting (Long, 2023), together with explaining the context of the prompt, may elicit smarter responses. As stated earlier, we are experimenting with such practices; even mentioning the classic Gioia method paper and adding it to the prompt improved accuracy significantly. These developments also mean that the use of GenAI and/or LLMs for data analysis requires skills in prompting. Sophisticated methodological approaches, as in classic modes of data analysis, require expertise, the same expertise required before using software like SPSS or Stata.
Second, as Cameron and Rahman (in Kulkarni et al., 2024) argue, in qualitative research, surprises (unexpected findings that defy initial expectations) are crucial for deep analysis. While LLMs detect surprises as deviance, as statistical anomalies, true qualitative surprises emerge from the nonlinear, immersive research process and live coding (Locke et al., 2015; Locke et al., 2022). These surprises often challenge prevailing theories. They typically require a good and solid grasp of theories and ongoing conversations (Silverman, 2024). Anomalous and surprising findings lead to new metaphors and insights shaped by the researcher’s continuous engagement with the data. While AI can statistically identify anomalies based on training data, it lacks the contextual understanding that expert human researchers bring to the analysis. Therefore, an in-depth understanding of the dataset, its context, and nuances, even if empowered by GenAI and/or LLM prompts, still requires human agency in knowing what to prompt and what is anomalously surprising. Questions and societal problems that were difficult to approach and address might become evident. A pertinent question is: How will GenAI develop unique creative abilities to complement and strengthen those of human expertise? That a recent AI-generated piece of art won an art competition in Colorado might serve as an example of the abilities of AI but also of the expert human difficulties in distinguishing between AI and a human product. A comparative study by Guzik et al. (2023) demonstrated that GPT-4 was equal and, in some cases, even superior to humans in creative abilities. This raises serious questions; for instance, the ownership of such new intellectual property is a key topic of discussion across academic and practitioner communities. Accordingly, AI entrepreneurship, innovation, and creativity may well play prominent roles in future research and in generating novel topics also within project studies.
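To illustrate how such instructions might be combined in practice, the sketch below assembles a coding prompt that supplies the research context and states explicitly that accuracy comes first and that "no data" is an acceptable answer. The wording, the function name `build_coding_prompt`, and the example values are hypothetical and are not the prompts used in our study; no particular chatbot API is assumed.

```python
def build_coding_prompt(research_aim, definitions, excerpt):
    """Assemble a first-order coding prompt (illustrative wording only)."""
    definition_lines = "\n".join(
        f"- {term}: {meaning}" for term, meaning in definitions.items()
    )
    return (
        "You are assisting with first-order coding of qualitative data.\n"
        f"Research aim: {research_aim}\n"
        "Definitions of key constructs:\n"
        f"{definition_lines}\n"
        "Accuracy is the most important criterion. Quote the text verbatim. "
        "If the excerpt contains no relevant data, or if it is unclear, "
        "answer 'no data' or 'unclear' instead of guessing.\n"
        f"Excerpt:\n{excerpt}"
    )


# Hypothetical example values.
prompt = build_coding_prompt(
    research_aim="Trace how narratives about an infrastructure project evolve.",
    definitions={"narrative": "a recurring storyline used by stakeholders"},
    excerpt="Local residents questioned the environmental impact of the tunnel.",
)
print(prompt)
```

Storing the prompt as a function of its inputs also makes it straightforward to report the exact wording in a methodology section.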
Final Thoughts on Manufacturing Interpretations
Just because we can use such advanced tools, however, does not mean that we should. First, we need a clear rationale for the use of any tool that we rely on in our academic conduct—similar to what we would do with any other type of data analysis technique (again, having insight into appropriate statistical analysis before using SPSS). Transparency is also paramount here; in submitting their research to PMJ, researchers must show how data have been analyzed, specify what prompts have been used, and detail their interactions with AI chatbots. Following the principles of Open Science, which advocate disclosure and publication of datasets alongside research papers, conversations with AI chatbots should become part of research documentation.
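One lightweight way to keep such documentation is to append every exchange to a log that can later be submitted as supplementary material. The sketch below writes one JSON record per line; the field names, the function name `log_interaction`, and the model label are assumptions for illustration, not a PMJ requirement.

```python
import io
import json
from datetime import datetime, timezone


def log_interaction(stream, model, prompt, response):
    """Append one prompt/response exchange as a JSON line and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "response": response,
    }
    stream.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


# An in-memory buffer stands in for a log file in this example.
buffer = io.StringIO()
log_interaction(buffer, "example-model", "Code this excerpt.", "First-order concept: local opposition")
print(buffer.getvalue())
```

In practice the stream would be a file opened in append mode, so that the full sequence of interactions is preserved in order.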
Second, we might prefer to maintain the sensitivity of expertise in sensemaking and engaging with data. As Petriglieri (in Kulkarni et al., 2024) implies, is there a paradox in humanizing management when insights about it are machine manufactured? Moreover, LLMs’ focus on textual discourse could set us back decades by ignoring the progress made in understanding practices (Schatzki et al., 2005), process ontology (e.g., Langley, 1999; Hernes, 2008) and the multimodality of data (Boxenbaum et al., 2018). We would be extremely concerned to see a movement away from a rich and embodied understanding of life in organizations just because textual analysis is easier and quicker. Ease of availability may well come to overrule theoretical acumen in the choice of topics, theories, and translations of diverse forms of sensemaking.
Machining Reading
Machining Summaries
As with the analysis of data, LLMs can be used, in theory, to paraphrase literature, write summaries, and compare and contrast different past publications. After all, if this technology can identify patterns in data, it can just as well, and with similar caveats, identify patterns in literature.
We used ChatGPT-4 to summarize and paraphrase articles on AI in management and organizational scholarship. We chose two articles we were familiar with as readers: editorials by Grimes et al. (2023) and Gatrell et al. (2024) on the use of LLMs in the Academy of Management Journal (AMJ) and the Journal of Management Studies (JMS), respectively. The summaries were accurate, so we used some LLM formulations as a point of departure in our paragraphs about each of the papers. However, when it came to Kulkarni et al. (2024), a curated discussion on AI in the Journal of Management Inquiry (JMI), we explicitly asked for a summary of each of its different authors and for the references and evidence used in their arguments. The algorithm performed poorly; it didn’t capture each author’s completely different opinions. As a learning algorithm, it might potentially evolve and mitigate such issues; but as it stands, it misses the nuanced perspectives and diversities in the text.
Machining Peer Reviews
Here comes a heretical thought: Could AI perform peer reviews? This is an important question that we, as editors, need to address and regulate (Garcia, 2024; Mollaki, 2024). On the one hand, it is feasible to input a manuscript into an LLM, provide the review criteria, and ask the algorithm to conduct the peer review. As editors, we often struggle to find appropriate reviewers, the review process can be lengthy, and sometimes the review quality is substandard even though we have managed to get reviews from experts in the field. Incorporating an algorithm as one of the reviewers could expedite the publication process, reduce the workload on editors and the academic community, and potentially provide valuable feedback on the coherence of the text and other aspects. However, as Garcia (2024) points out, the integrity of the review process could be compromised if LLMs’ authoritative-sounding output is incorrect, incomplete, or biased and this goes undetected.
In a recent editorial, the Journal of Management Studies (Gatrell et al., 2024) explicitly prohibited the use of LLMs for peer review, as did the International Journal of Project Management (IJPM). We agree. Our reviewers will be urged not to rely on LLMs for peer reviews and we must trust our reviewers. Like other scientific journals, PMJ has always asked for confidentiality from our reviewers. If a reviewer uploads a paper under review to GPT-4 for assistance, the risk is that GPT-4, now aware of the ideas and data, might provide the same ideas to someone else. This could lead to obvious negative consequences. Yet, effectively controlling the use of LLMs in peer reviews is clearly a moot issue.
Furthermore, would it be acceptable to use AI for assistance rather than as the sole reviewer? For instance, let’s assume a paper is about a certain infrastructure, for example, the aforementioned Fehmarnbelt tunnel. If the reviewer is unfamiliar with the specific infrastructure, they could have Googled the name and started reading about it. So, what if a reviewer asks GPT-4 about, for example, cost and benefit for the stakeholders of the Fehmarnbelt tunnel? Would this be acceptable? Where do we draw the boundaries? The technology, though, is evolving quickly, and some emerging chatbots promise to analyze a manuscript without incorporating that manuscript into their body of knowledge. If such an opportunity indeed does emerge, would it then be acceptable to use the assistance of a chatbot in review processes?
Machining Literature Reviews
Moving to the next phase, the analysis of prior research and already published scientific articles, can such AI technologies produce trustworthy overviews of large bodies of research or even produce rigorous and systematic literature reviews? At present, the free version of ChatGPT is not yet able to review published articles and summarize research topics and recommendations for future research. Yet, it will not be long before platforms like ScienceDirect and Web of Science, as well as large publishers, implement such services. Scopus is already offering AI options, although the results are, scientifically speaking, quite weak. Moreover, GPT-4 allows the upload of a large number of files (including scientific papers) that it can summarize. Algorithms can thus identify trends in the literature, akin to machines churning out systematic literature reviews with ease. Speed, oversight, and the ability to keep up with a much larger body of literature will be gained, but what will be missed? Can AI technologies perform an insightful reading of a scholarly text?
Final Thoughts on Machining Reading
While there is room to use LLMs potentially to paraphrase one’s own work, avoiding the theft of words (Geraldi, 2021), proper acknowledgment of the source of ideas will be required. Indeed, in a world where the rewriting of text becomes mechanized, the theft of ideas (Geraldi, 2021) becomes even more important yet even harder to capture.
We also believe, at least for now, that the nuanced understanding and reading of a text, particularly of more sophisticated social scientists—such as Foucault, Habermas, Luhmann, Latour, and Bauman—is an interpretative skill that exceeds what such texts might explicitly say. Although the algorithm might even mimic these important scholars, in the absence of common intellectual cultures, institutions, and more or less reliable sources, what emerges is likely to be a pastiche of understanding akin to a bad undergraduate essay. While technology can be an under laborer, nearly equivalent to a good research assistant (Kulkarni et al., 2024), humanity and its expertise remain, for now, essential in reading and interpreting complex intellectual texts. So, no, we don’t want to see literature reviews written by AI.
Machining Writing (and Thinking!)
Easing the Writing
LLMs are specifically designed to generate text, making them invaluable tools for writing tasks due to their versatility and extensive capabilities; they are both cameras and engines (MacKenzie, 2008). LLMs are cameras, because they can represent a totality by scanning digital materials; engines, because they are algorithmic machines that produce text and findings effortlessly. Some of us may have encountered writer's block or the daunting feeling of staring at a blank screen, unsure of where to start or how to articulate our thoughts. LLMs can be a lifeline in such moments, aiding the transformation of fragmented ideas or bullet points into coherent prose.
To aid our reflections, we experimented with different levels of GPT-4 assistance in the writing of this editorial, trying to keep at least some control over the text, asking for the rewriting of specific paragraphs and not entire sections. Yet, authorial authority won, so we chose not to reproduce the exercise. Indeed, we noted that while the AI-produced text may undergo significant revisions as the manuscript evolves, assistance provided by LLMs can resolve hours of frustration and procrastination in minutes.
Some members of the authorship team, like Stewart and Jonas, possess remarkable writing abilities that we often admire or even idolize; Giorgio grew up reading and loving their papers. Joana personally observed similar talent in her father, João Wanderley Geraldi, whose effortless ability to produce beautifully written essays in a short period has always seemed almost magical. For example, he wrote “Portos and Passagens,” a 400+ page classic text in Brazilian linguistics, in 40 days. As he said, “I spent 10 years thinking about it, and I wrote it in 40 days.” However, the advent of LLMs has somewhat demystified this skill. Just as Simon argues that intuition is “pattern recognition,” could writing similarly be about mastering these patterns, with some individuals excelling at it through lifetimes of practice? With LLMs, we have the opportunity, perhaps, to democratize this skill, making the process more accessible to a wider variety of people, including nonnative English speakers like some of the authoring team.
This potential to globalize academia beyond the confines of English-speaking countries is undeniably exciting, yet it also evokes a sense of unease. Any piece of prose could be fed to an algorithm with the instruction to rewrite it in a particular style, yet beware of pastiche—rather like an inexperienced doctoral student spouting complex theory they have not acculturated to their expression. Moreover, writing is a skill honed through practice, reading, receiving feedback (including paper rejections!), and continuous refinement. If authors rely too heavily on algorithms to do the writing, from where will the great stylists and intellectuals emerge? Could we become as reliant on automated assistance in writing as young people are on digital maps, potentially diminishing our proficiency in this fundamental academic skill? Moreover, could we also lose the beautiful variety of writing styles across disciplines (and authors!) to AI writing?
Nonetheless, the potential gain in quantity is mind-blowing; a quick search on Amazon brought us to an author who has published 3,424 books, several a day, since the advent of ChatGPT, on subjects varying from farming to parenting; as far as we could see, all listing ChatGPT as the coauthor. The potential losses in quality are another matter. Volume raises the question of whether we can welcome machine manufacturing of knowledge as a democratization of knowledge production or despise it as polluting the planet with rubbish.
Coauthoring With Machines
In academia, a handful of papers in the medical sciences listed ChatGPT as a coauthor in early 2023 (Stokel-Walker, 2023). Such developments led to a strong reaction across large publishers, including Cambridge University Press, Elsevier, Wiley, and Springer Nature. These publishers began establishing policies banning AI coauthorship and mandating a clear and comprehensive disclosure of the use of AI in the development of research and manuscripts (Spanjol & Noble, 2023). O’Connor and ChatGPT (2023) published a widely quoted editorial titled “Open Artificial Intelligence Platforms in Nursing Education: Tools for Academic Progress or Abuse?” with ChatGPT as one of the authors. The editorial was followed by a corrigendum (O’Connor, 2023), stating that “The first author became aware that the second listed author, ‘ChatGPT’, does not qualify for authorship according to the journal’s guide for authors and to Elsevier’s Publishing Ethics Policies. ‘ChatGPT’ is, therefore, removed from the author list and is acknowledged as making a substantial contribution to the writing of the paper” (p. 1).
Beyond issues of accuracy, machines, such as 2001’s HAL, have no such ethics. For instance, should AI be accountable for false (e.g., data fabrication) or offensive (e.g., racist) claims? This is a relevant issue, one taken up even by the European Commission, which “Proposed 3 inter-related legal initiatives that will contribute to building trustworthy AI: European legal framework for AI that upholds fundamental rights and addresses safety risks specific to the AI systems; a civil liability framework - adapting liability rules to the digital age and AI; a revision of sectoral safety legislation (e.g. Machinery Regulation, General Product Safety Directive)” (European Commission, 2024). The Journal of Product Innovation Management thematized this issue in a survey among members of their scholarly community to gain a sense of scholars’ stance toward AI and contrast this with overall reactions among the academic community. The editorial advocated careful and full disclosure of the use of AI in manuscripts and against coauthorship with bots, arguing that AI cannot take responsibility for the research conducted (Spanjol & Noble, 2023).
Coauthoring or Cothinking?
The legal ramifications of AI coauthorships are undeniably significant, and we agree with the ban on coauthorship with bots. However, challenges in cowriting with bots extend much further. For instance, Kulkarni et al. (2024) call our attention to the issue of originality. As LLMs build on the existing corpus of texts available on the internet, how can the text be truly original? Will it invariably be plagiarized? Kulkarni et al. (2024) remind us that the concept of the “author” is relatively recent, and “In an increasingly AI-driven world, the author as we know them today (an individual with agency and legal rights) may be seen as nothing but “a brief episode in the history of writing” (Woodmansee, 1992) whose “death” was imminent (Barthes, 1977, p. 208).” Moreover, collaboration with bots introduces a complex dynamic wherein the boundaries between human creativity and machine-generated content blur. Mantere and Vaara (in Kulkarni et al., 2024) exemplify the potential nature of such collaboration well: “Imagine that not far in the future, your word processor will start suggesting what to write next; this should not be too hard to conceive because your phone probably does this for text messages. But how about if, based on what you have written already, the AI will suggest the next paragraph? Or, maybe, upon seeing what you choose as a title, it will generate an entire draft text? Even better, maybe you can write an abstract and it will extend it into an entire paper to edit? Will there be a point when you will no longer edit but it will be enough to input parameters such as genre, style, length and so on, and AI will narrate what you need without little need of editing? Now, who wrote such a text?” (p. 209)
Machining Ideas
Until now, researchers have been in command of the knowledge production process, setting the direction with a potential to benefit theory and society. AI sneaks into our prose like a thief in the night, softly stealing and changing the placement of things, almost imperceptibly perhaps; or else, mining the literature in the targeted journals or professional publications and framing a research topic, it asserts a soft domination (Courpasson, 2000) over our agency. Technologies might paradoxically prompt ideas that we might not have thought of ourselves by comparing different fields, identifying concepts that could be relevant across them, and monitoring trends in generating ideas for novel research. Scholars need to be able to use the right prompts to generate interesting research questions that might, in turn, prompt them to new thoughts that would break new ground. Similarly, the technology could be a memory chalet (Judt, 2011), a storehouse of otherwise forgotten things.
Using AI to identify research trends dares us to rethink traditional approaches to academic inquiry in which AI acts as both a tool and a collaborator in the research process, and all that is written and digitally available potentially becomes its data. However, such an approach raises important questions about the limitations and biases inherent in various AI systems and the need for critical oversight to ensure that AI-generated insights are accurate and ethically sound.
Only scholars with an ability to identify what is sensible and nonsensible, what is relevant and what is less relevant, can determine these things. After all, while AI is good at pattern recognition or even pattern making, most qualitative research emerges from a unique and in-depth understanding of quite specific events and problems. Such patterns are unique and based on the personal experience of the researchers. For instance, Giorgio’s interest in corruption (Locatelli et al., 2017) and the dark side of projects (Locatelli et al., 2022) is the result of his own experience growing up in an Italian working-class family. A scholar growing up in a middle-class family in Finland, a rich family in China, or an indigent family in Senegal would have experienced something different, likely shaping some of their research. Similar considerations can be made for gender and sexual orientations, academic backgrounds (e.g., engineering, economics, geography, law), and so forth. And, crucially, it is not just a matter of which questions are asked or which research interests are pursued. If each of the hypothetical researchers were to ask the same question (e.g., “Which are the ethical implications of dealing with local communities when delivering megaprojects?”), we would probably get different answers.
The risk is that AI will give the same answer to everybody, losing the beautiful rainbow of perspectives we are struggling so much to improve (Locatelli et al., 2023). This is a real risk, since AI would have access to all our papers and even if the answers would be different, this would lead to some standard stereotypes typical of globalized knowledge. Internationally, we reduced our beautiful historical cuisines (e.g., Italian, Chinese, and Indian) to a few standardized and stereotyped dishes. Are we not risking the same with our ideas? Is AI becoming a vehicle for the globalization of ideas? Is AI mainstreaming our scholarly creative process?
So, do patterns emerge from the proclivity of what already is biased toward the status quo? As Mantere and Vaara in Kulkarni et al. (2024) argue: “In addition to veering clear off sensitive topics, machine narration also poses questions pertaining to novelty. How much of future knowledge production will be reproduction of existing information “validated” by AI? As AI works by generalizing across existing bases of human knowledge, will this lead into a myopic development where radically new ideas fail to emerge? This is not a trivial concern as the current versions of AI operate based on algorithms that use existing texts as a point of departure. Thus, although AI can also generate new knowledge, there is a genuine risk that the knowledge produced will be more a reiteration of accepted wisdom rather than genuinely new ideas or novel insights as findings become less and less disruptive over time (Park et al., 2023b). This is also linked with the language itself; if we reproduce the concepts and vocabularies in use, it becomes increasingly difficult to engage in critical thinking problematizing these very discourses.” (p. 210)
Machining the Craft of Research or Crafting the Machining Research? Questions, Questions, and More Questions
The integration of AI into knowledge production and consumption presents a complex landscape for scholars. Our past experiences can inspire visions for potential futures. For example, with the advance of Google Maps, new generations will probably lose the ability to navigate an old-fashioned map but are still capable of navigating to find the best train connections to get nearly anywhere independently, while their moves can be parentally supervised with the GPS tracker, balancing freedom and control in ways we didn’t experience in our childhood. Likewise, it is not hard to imagine how GenAI can empower us, increasing efficiency, opening the possibility of pursuing research avenues (Grimes et al., 2023) and analyzing new datasets (Grégoire et al., 2024) that were simply not possible without AI capabilities. In the recent past, both computers and the internet changed the way we worked but still did not make it easy for everyone to publish in project studies. In fact, experience suggests that today, publishing in high-quality journals, with so much more global competition, is becoming more difficult by the year.
Yet, no doubt, these tools will speed the research process significantly. Should we be considered low-performing academics if we opt not to use them? Will it be equivalent to writing drafts in longhand with a #2B pencil in the age of computers and CAD? Dear reader: you may be surprised that a number of works by one of the members of the authorial team were produced in this way. Could we lose our ability to draft ideas on blank sheets of paper? If so, what could be the consequences not only for the research process but also for its outcomes? Is it only a new method to craft research or is this fundamentally changing the scholarly craft? As AI advances and can take on large portions of our work, are we becoming empowered, as with Google Maps, or deskilled? Are we losing the craft of doing research? Should we stick to the old ways to safeguard our methods and professions? Or can we perhaps develop a craft of machining research with AI?
Having the ability to use AI does not necessarily mean that we will lose the absorptive capacity (Levinthal & Cohen, 1990) by outsourcing some of our work to algorithms. However, such loss is a relevant concern. Today, experts can relatively easily spot shortcomings in AI output but, as technologies improve, this will become increasingly difficult, and the skill to do so will become increasingly rare. Consider the use of Wikipedia, which was more or less banned in academia 15 years ago, but is now used frequently in lectures and, to some extent, as a source for academic papers; for instance, to provide definitions of specific concepts. Today, it is an integral part of academic work. However, Wikipedia might have a problematic bias regarding the origins and genders of the authors (en.wikipedia.org/wiki/Gender_bias_on_Wikipedia). It also has considerable weaknesses in accuracy within certain knowledge domains and geographical regions. Similar problems and biases will confront users of AI technologies, such as ChatGPT, where reports already indicate bias regarding race, gender, political orientation, and religion (Motoki et al., 2024). There are already several reports of inaccurate results, and recognizing such inaccuracies requires detailed knowledge of specific fields; problematic biases might go undetected unless the user is a highly trained expert.
We know from outsourcing studies that organizations lose absorptive capacity over time. As they outsource manufacturing, they have difficulties retaining enough knowledge in-house to govern their contractors (Brusoni et al., 2001). Analogously, do we risk losing the ability to oversee the algorithms as we outsource some of our core tasks (e.g., writing, data analysis, literature synthesis) to them? If knowledge production and consumption are mediated through such technology, will we lose our ability to detect hallucinations? What will happen to the next generation of researchers who will have outsourced great parts of what we consider today as integral parts of scholarly work? They will not have written as we write today but will instead have fed algorithms with bullet points; they will not have done painstaking, boring, yet potentially insightful data analysis, but will instead have used algorithms to “grasp” the data for them; they will not have carefully reread manuscripts many times to capture their meaning, nor written summaries and collated past work.
If writing and thinking are indeed connected, as linguists claim, will we still maintain critical thinking and creative skills? Might we be degenerating an essential cognitive skill? As AI advances, will we lose employment as academics because machine algorithms can do our work better, as Lindebaum and Fleming (2024) suggest? While we are still employed, consider this: Can AI actually generate interesting research ideas or questions to help us build better futures? We will continue to observe and play with the technology and look forward to hearing your views on this important topic.
Afterword: What Would GPT-4 Say About Our Editorial?
To conclude this editorial, we asked GPT-4 to critically analyze our editorial: “The editorial in the Project Management Journal® (sic) provides a thorough examination of the transformative role of AI, particularly focusing on generative AI (GenAI) and large language models (LLMs), in organization and management research. The editors highlight the potential benefits of these technologies in enhancing the efficiency and creativity of academic work, citing examples such as the ability to analyze large datasets and generate high-quality text rapidly. However, they also raise significant ethical and practical concerns, particularly regarding the loss of human voice, the risk of plagiarism, and the potential erosion of individual creativity and critical thinking skills. A critical strength of the editorial is its balanced perspective, acknowledging both the promising advancements and the potential drawbacks of AI integration in research. The discussion on how AI can facilitate various phases of research—from data interpretation to generating new research ideas—is insightful. It underscores the capability of AI to perform tasks that would be otherwise too time-consuming or complex for human researchers alone, thereby expanding the horizons of what is achievable in academic inquiry. However, the editorial rightly questions whether the speed and efficiency gained through AI might come at the expense of depth and interpretative nuance that human scholars bring to their work. Despite the comprehensive analysis, the editorial could delve deeper into the implications of AI on the diversity of scholarly perspectives. The concern that AI might homogenize research outputs, leading to a standardization of ideas and potentially stifling innovation, is mentioned but not extensively explored. This issue is crucial as it touches on the broader impact of AI on the richness of academic discourse.
The editorial could benefit from a more detailed discussion on how to balance the use of AI with the need to maintain diverse, critical, and creative scholarly practices. Moreover, while the editorial advocates for transparency in AI use, it could provide more concrete guidelines or best practices for integrating AI tools in research to ensure ethical and responsible use. Overall, the editorial presents a nuanced view of AI's role in project management and organizational research, calling for cautious optimism and critical engagement. It emphasizes the importance of maintaining human oversight and interpretative skills, even as AI tools become more integrated into scholarly work. This reflective approach is crucial for navigating the evolving landscape of AI in academia, ensuring that technological advancements enhance rather than undermine the core values and practices of scholarly research.” (OpenAI, 2024b; see attachment for full transcript)
