Abstract
Background
The emergence and accessibility of generative artificial intelligence (AI), such as ChatGPT, have created opportunities for vocational rehabilitation (VR) counselors and employment specialists to use AI for support in job development and job recommendations for people with disabilities.
Objective
This study examined potential biases of ChatGPT in its responses to prompts requesting specific job recommendations for case vignettes based on specific disability types.
Method
ChatGPT 4o mini was probed for job placement recommendations for seven disability types and a no disability control across vignette conditions that varied by gender and level of case detail.
Results
Several significant differences in average income level and categorization of jobs emerged across disability types and vignette conditions. Notably, individuals with intellectual disabilities and those with serious mental illness were suggested occupations with lower salaries. Vignettes with intellectual disability showed a higher proportion of restricted job types compared to every other disability type explored in this study. Further, salaries were lowest when individual interests were omitted from the profile, while high salaries were observed when little to no information was included in the vignette. In contrast, low salaries were observed for individuals with intellectual disabilities across all conditions.
Conclusion
Overall, these findings suggest the need for careful and thoughtful use of AI tools among VR counselors and employment support providers.
Introduction
Employment plays a vital role in overall well-being by contributing to access to healthcare, financial security, better quality of life, and many other aspects, irrespective of disability status (Galderisi et al., 2015; Vestling et al., 2003; World Health Organization [WHO], 2022). Yet, considerable disparities in the unemployment rates of persons with disabilities (PWD) compared to those without disabilities persist. The U.S. Department of Labor (DOL) reported an unemployment rate of 7.2% for PWD compared to 3.5% for persons without disabilities (BLS, 2024a). Further, PWD are more likely to be employed part-time across all educational levels and to be working in service-oriented occupations such as transportation, production, sales and office, and material moving occupations (Centers for Disease Control and Prevention [CDC], 2023).
At the same time, estimates suggest that 80% of individuals with disabilities who are not employed want to work, similar to the 78% of individuals without disabilities, dispelling the myth that PWD simply do not desire to work (Ali et al., 2011; Sundar et al., 2018). PWD continue to face significant barriers to employment that include limited access to training or education, transportation challenges, and barriers in the workplace, including access to accommodations, and stigma (BLS, 2022; Morwane et al., 2021). Research indicates that employers tend to express more apprehension about hiring people with mental and emotional disabilities compared to those with physical disabilities (Dalgin & Bellini, 2008; Gasper et al., 2020). Although employers have historically been less likely to hire PWD (Bonaccio et al., 2020), recent labor shortages may be shifting attitudes toward more inclusive hiring practices (Reardon et al., 2025). These persistent barriers highlight the critical role of employment specialists and vocational rehabilitation (VR) providers in supporting PWD to achieve meaningful employment outcomes. Responsibilities of VR professionals include providing personalized guidance, job development, and well-matched job placements to help close this gap.
Securing Good Job Matches
The colloquial term “F jobs” refers to categories of work beginning with “F” that are often overrepresented in disability employment (National Center on Advancing Person-Centered Practices and Systems [NCAPPS], 2021; Jajtner et al., 2020; Stanford Law School, 2018). These include occupations such as “Food”, “Filth”, “Flowers”, and “Factories”, which cover the areas of food preparation and service, cleaning and janitorial service, landscaping/gardening, and assembly work, respectively. Food preparation services have the highest disability prevalence (19.9%) among all occupations (BLS, 2025; CDC, 2023). Furthermore, PWD are more likely than people without disabilities to work in production, transportation, and material moving occupations (14.2% vs. 12.2%), roles often completed in factories or warehouses (BLS, 2025). PWD are often relegated to roles within these fields due to underestimation and negative perceptions by employers and hiring managers, and the perception of PWD as a cost burden to society (Bonaccio et al., 2020; Nagtegaal et al., 2023). A limited understanding of how disabilities vary from person to person, and of the diverse strengths and skills PWD possess, contributes to the persistence of these beliefs. This disadvantages both PWD and the labor force as a whole by overlooking a population of people who are willing and able to work in occupations that extend beyond “F” jobs (Colorafi et al., 2021). Previous studies have shown that this systemic underestimation could be due to lack of awareness of disability and accommodation issues, concern over costs, and fear of legal liability, all of which reinforce bias against disability (Kaye et al., 2011; Wunderlich et al., 2002). Although this overrepresentation exists, the value and meaning derived from these jobs should not be disregarded. Secure employment opportunities provide individuals with a sense of purpose, social engagement, skill development, and personal accomplishment.
However, this overcategorization of PWD into a narrow field of job opportunity is telling of employer beliefs and attitudes.
Employment specialists and VR professionals play a vital role in supporting PWD in finding meaningful employment that aligns with their strengths, interests, and goals. However, despite their best intentions, they face numerous barriers to their work due to systemic constraints, limited employer networks, or outdated assumptions about ability (e.g., seeing clients as “broken”) (Kinn et al., 2021). AI-driven platforms may be able to identify more personalized job matches by analyzing a person's skills and preferences alongside labor market trends, potentially supporting VR professionals in making more informed, inclusive, and future-oriented employment recommendations (Skerritt, 2023). However, due to the constraints of AI to access contemporary information, it is important that VR professionals also stay updated on recent developments in research through national resources like the National Clearinghouse of Rehabilitation Training Materials ([NCRTM], 2026), and present-focused literature through WHO meta-analyses and peer networks.
AI and ChatGPT in Vocational Rehabilitation and Employment Services
There is growing recognition of AI in VR settings (McDonald et al., 2025; Skerritt, 2023). To meet the needs of clients, VR professionals draw on a range of tools and strategies to support the employment goals of PWD. Increasingly, technology-driven approaches, such as AI, provide an opportunity to enhance job matching and streamline VR and pre-employment service delivery (e.g., McDonald et al., 2025). However, research has identified that more complex and subjective tasks, such as assessing job fit and understanding cross-cultural obstacles, require human input (Chen, 2023). Therefore, there is opportunity for AI and human cooperation to promote complementary job matches.
Bias in Artificial Intelligence (AI)
AI is defined as the ability of machines to learn through experience (National Institute of Biomedical Imaging and Bioengineering, 2025). ChatGPT, developed in 2021, is an application of AI: a machine learning model trained on information that is publicly available on the internet, obtained from third parties, or provided by human researchers, with the aim of producing perceptually better responses (OpenAI, 2023). ChatGPT is the most popular and widely used AI tool (Westfall, 2023). The use of AI among service providers and clinicians in helping fields is growing (Lee et al., 2021). The U.S. Department of Education (2024) encourages responsible utilization of AI to combat challenges and barriers to fair and equal employment of PWD. Further, AI has been used in some contexts in an attempt to alleviate discrimination and biases in hiring practices (Sánchez-Monedero et al., 2020). However, because AI aggregates an enormous amount of data, it also carries complications. AI that draws on a high volume of data, like ChatGPT, is at risk of reproducing material founded in discrimination and bias, creating disparities and disfavoring minority populations due to a historical misrepresentation of people with disabilities (Daneshjou et al., 2021; Park & Hu, 2023). Discrimination can be introduced into job advertisements and recruitment through the mechanical process of machine learning (Chen, 2023). Gender and race/ethnicity have also been subject to bias by AI (Chen, 2023; Kaplan et al., 2024). Limited research has directly addressed the impact of AI on disability populations, but the available work emphasizes the risk of ableist learning by AI, accentuated by a lack of research on disability populations (Newman-Griffis et al., 2023; Tilmes, 2022). Newman-Griffis et al. (2023) argue that these biases are not unavoidable; rather, mindful development, evaluation, and management of AI can mitigate such discrimination.
However, since AI systems build upon themselves, AI may not possess the appropriate means to create responses rooted in objectivity due to the limitations in AI training corpora, and the potential underrepresentation of robust available literature. For example, a scoping review conducted by El Morr et al. (2024) found a prevalence of ableist perspectives and narrow medical models of disability in research generated by AI. These biases can contribute to distorted and stigmatizing portrayals of PWD compared to nondisabled individuals, reflecting broader societal stigma (Packin, 2021). Existing studies using ChatGPT highlight how such embedded biases in AI systems can influence the information delivered to users, including VR providers who may rely on these tools for decision-making.
Study Purpose & Research Questions
Currently, virtually nothing is known about potential biases in ChatGPT's job recommendations for PWD and how these recommendations may vary based on disability type. Understanding these patterns is especially relevant for VR providers who may increasingly rely on AI-assisted tools to support employment planning and to increase awareness of how such tools might reinforce or challenge existing biases. To address this gap, this study used ChatGPT to examine whether AI-generated job recommendations differ by disability type using a series of vignettes that varied in level of background details presented. Generated job recommendations were analyzed based on type of occupation, associated salary, and nature of the work. Specific research questions were:
1. Does vignette condition have a significant effect on average salary levels of (a) all 10 recommended occupations and (b) the top three recommended occupations?
2. Does disability type have a significant effect on average salary levels of (a) all 10 recommended occupations and (b) the top three recommended occupations?
3. Are there significant differences in the number of recommended “F jobs” across disability types and vignette conditions?
Methods
Procedure & Experimental Setup
ChatGPT 4o mini was used to analyze the influence of bias on job placement recommendations for individuals with and without disabilities. Seven vignettes were presented to ChatGPT over multiple trials in the course of one study (see Table 1). The vignettes described a hypothetical job-seeking client, with client characteristics described to varying degrees. Two of the seven vignettes (Full-Vignette Female; Full-Vignette Male) described the client in brief detail, including information related to age, sex, education, employment history, location/type of employment, driving history, interests, and disability type. Geographic location was limited to Chicago, IL to favor an urban residence and the diverse employment experiences that such locations can provide. Two of the seven vignettes (Omit-Interests Female; Omit-Interests Male) omitted the client's interests. Two of the seven vignettes (Omit-All Female; Omit-All Male) omitted driving history, interests, and education level. Lastly, one of the seven vignettes (No Vignette condition) omitted all details apart from disability type. The vignette was based on a clinical example of a client seeking VR services.
Vignette Labels and Descriptions.
Note. Blank represents the disability types, which are as follows: spinal cord injury; serious mental illness; traumatic brain injury; intellectual disability; autism; visual impairment; multiple sclerosis. The name “Jana” was used for the female conditions, with her/hers pronouns.
Procedure
This study did not use human participants and therefore did not require IRB approval. Disability types were presented to ChatGPT through random assignment for each vignette condition. For all conditions, ChatGPT was prompted to provide ten job placements, ranked from best fit to worst fit for the client. ChatGPT provided a list of ten job placements using the 2024 Standard Occupational Classification (SOC) codes, the system used to classify 867 detailed occupations in the United States (BLS, 2024b). Specifically, each vignette condition was followed by the prompt “What are ten SOC codes that would be appropriate for someone (with a spinal cord injury; with a serious mental illness; with a traumatic brain injury; with an intellectual disability; with autism; with a visual impairment; with multiple sclerosis; with no disability)?”. SOC codes and associated occupational wage data were retrieved from the U.S. Bureau of Labor Statistics (BLS, 2024a, 2024b).
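As an illustration of how the follow-up question varies by disability condition, the sketch below builds the eight prompt strings from the wording quoted above. The function name `build_prompt` and the list `DISABILITY_PHRASES` are hypothetical names introduced only for this example; the vignette text that precedes each question is not reproduced here.

```python
# Disability phrases taken from the prompt wording quoted in the Procedure.
DISABILITY_PHRASES = [
    "with a spinal cord injury",
    "with a serious mental illness",
    "with a traumatic brain injury",
    "with an intellectual disability",
    "with autism",
    "with a visual impairment",
    "with multiple sclerosis",
    "with no disability",
]


def build_prompt(disability_phrase: str) -> str:
    """Return the follow-up question for one disability condition (hypothetical helper)."""
    return (
        "What are ten SOC codes that would be appropriate for someone "
        f"{disability_phrase}?"
    )


# One prompt per disability condition (seven disability types plus the control).
prompts = [build_prompt(phrase) for phrase in DISABILITY_PHRASES]
for prompt in prompts:
    print(prompt)
```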
To secure untaught ChatGPT responses, Incognito browsers and multiple computers in different locations were used to conduct the study. Researchers were instructed to “clear the chat” and computer history between each trial to confirm distinct responses. Vignette prompts and disability type were introduced to ChatGPT randomly, at different times, over the course of four months, by four researchers using four computers.
In total, 560 trials were conducted over a period of four months. Ten trials were conducted per disability type for each vignette condition, with each individual vignette being reintroduced 80 times. This resulted in 100 job placement recommendations for each disability type within each vignette condition, for a total of 700 job placement recommendations for each disability type across the seven vignettes, 800 for each vignette condition across the eight disability conditions, and 5,600 job placement recommendations across disability types and vignette conditions. Among the 5,600 jobs identified through these trials, ChatGPT pulled from 358 categories of occupations out of the 867 SOC codes available from the U.S. Bureau of Labor Statistics.
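The trial counts follow from fully crossing the seven vignette conditions with the eight disability conditions. The sketch below enumerates that design and tallies the totals. It is an illustration only: the shuffling step and its seed are assumptions standing in for the randomization described above, not the procedure the researchers actually followed.

```python
import itertools
import random

VIGNETTES = [
    "Full-Vignette Female", "Full-Vignette Male",
    "Omit-Interests Female", "Omit-Interests Male",
    "Omit-All Female", "Omit-All Male",
    "No Vignette",
]
DISABILITIES = [
    "spinal cord injury", "serious mental illness", "traumatic brain injury",
    "intellectual disability", "autism", "visual impairment",
    "multiple sclerosis", "no disability (control)",
]
TRIALS_PER_CELL = 10  # trials per disability type within each vignette condition
JOBS_PER_TRIAL = 10   # ranked job placements requested in each trial

# One entry per trial: (vignette, disability, trial number within the cell).
schedule = [
    (vignette, disability, trial)
    for vignette, disability in itertools.product(VIGNETTES, DISABILITIES)
    for trial in range(1, TRIALS_PER_CELL + 1)
]

# Hypothetical randomization of presentation order; the seed is arbitrary.
random.Random(0).shuffle(schedule)

total_trials = len(schedule)
total_jobs = total_trials * JOBS_PER_TRIAL
trials_per_vignette = sum(1 for v, _, _ in schedule if v == "No Vignette")
jobs_per_disability = JOBS_PER_TRIAL * sum(1 for _, d, _ in schedule if d == "autism")

print(total_trials, total_jobs, trials_per_vignette, jobs_per_disability)
# prints: 560 5600 80 700
```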
Food: Occupations where cooking/handling/serving food are involved (e.g., Cooks, Waiters and Waitresses, Fast Food Workers).
Filth: Occupations where cleaning is required, and is the main task of the occupation (e.g., Janitors and Cleaners, Cleaners of Vehicles and Equipment).
Flower: Occupations where landscaping or gardening is implied (e.g., Landscaping and Gardening, Flower Designers).
Factory: Occupations existing in factories (e.g., Team Assemblers, Production Workers, Fabric Assemblers).
Statistical Analyses
SPSS was used to compute statistical analyses. Two-way ANOVAs and Tukey's multiple comparison analyses were used to address the first two research questions and examine the main effects of vignette condition and disability type on average salary levels of the recommended occupations, as well as whether there was an interaction between these two independent variables. The third research question was analyzed by computing z-scores to test for significant differences in proportions between groups; due to the large number of these comparisons, the criterion of p < .001 was used in evaluating differences between vignettes.
Results
Differences by Vignette Condition
To address the first research question, we examined differences in average income among the vignette types. The results for the “All 10 Jobs” recommendations and the “Top 3 Jobs” recommendations were quite similar, and both were statistically significant. In each analysis, significant main effects for vignette types emerged (“Top 3 Jobs”: F(6, 504) = 83.25, p < .001; “All 10 Jobs”: F(6, 504) = 130.60, p < .001). In addition, significant main effects for disability types emerged (“Top 3 Jobs”: F(7, 504) = 74.89, p < .001; “All 10 Jobs”: F(7, 504) = 130.87, p < .001). Finally, in each analysis significant vignette × disability interaction effects emerged (“Top 3 Jobs”: F(42, 504) = 7.20, p < .001; “All 10 Jobs”: F(42, 504) = 8.34, p < .001).
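The degrees of freedom reported for these F tests are what a fully crossed 7 (vignette) × 8 (disability) design with 560 trials would produce. The short sketch below works through that arithmetic; the function name `two_way_anova_df` is a hypothetical helper, and the calculation assumes a standard two-way between-groups ANOVA with interaction.

```python
def two_way_anova_df(levels_a: int, levels_b: int, n_total: int) -> dict:
    """Degrees of freedom for a two-way between-groups ANOVA with interaction."""
    df_a = levels_a - 1                       # main effect of factor A
    df_b = levels_b - 1                       # main effect of factor B
    df_interaction = df_a * df_b              # A x B interaction
    df_error = n_total - levels_a * levels_b  # observations minus number of cells
    return {
        "vignette": df_a,
        "disability": df_b,
        "interaction": df_interaction,
        "error": df_error,
    }


# 7 vignette conditions, 8 disability conditions, 560 trials in total.
df = two_way_anova_df(levels_a=7, levels_b=8, n_total=560)
print(df)
# prints: {'vignette': 6, 'disability': 7, 'interaction': 42, 'error': 504}
```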
Post-hoc Tukey tests were used to compare the salary means for the seven vignette conditions. For the “All 10 Jobs” analysis, Table 2 presents the means for the seven vignettes in homogeneous groups. Means in the same column are not significantly different from each other (p > .001), while means in different columns are significantly different from each other (p < .001).
Homogeneous Subsets for “All 10 Jobs” Vignette Income Analysis (N = 560).
Note. This table displays the observed means for groups in homogeneous subsets by column. The p values below each column indicate insignificant differences within each subset, and significant differences between each subset.
Four distinct groups of vignettes were evident. Occupying the lower margin, the Omit-Interests Male (M = $48,522.96) and Omit-Interests Female (M = $49,229.25) vignettes generated the lowest paying jobs (these two vignettes were not statistically different from each other). A second group of vignettes generated middle-level salaries, including Full-Vignette Female (M = $54,516.11) and Full-Vignette Male (M = $53,971.20). Next, both Omit-All conditions generated further elevated salaries (M = $59,200.24 for Omit-All Male, and M = $61,567.34 for Omit-All Female). Finally, the No Vignette condition generated the highest salary (M = $74,398.27), and this mean was statistically higher than those of all the other vignettes. In other words, adding information about the candidate reduced salaries substantially. Including the candidate's interests (e.g., playing video games) appeared to decrease the salaries of the jobs generated.
A similar pattern of results was found for the “Top 3 Jobs” analysis (see Table 3). The notable difference between the “All 10 Jobs” and the “Top 3 Jobs” analyses was that only three distinct groups of vignettes were evident in the latter. In this analysis, the Omit-All conditions (Male: M = $57,283.41, Female: M = $56,172.20) did not differ significantly in salary from the Full-Vignette conditions (Male: M = $53,024.39, Female: M = $52,899.94). In both analyses, ChatGPT generated jobs with nearly identical salaries for male and female vignettes. No statistically significant differences in mean salaries between sexes were found for any vignette type.
Homogeneous Subsets for “Top 3 Jobs” Vignette Income Analysis (N = 560).
Note. This table displays the observed means for groups in homogeneous subsets by column. The p values below each column indicate insignificant differences within each subset, and significant differences between each subset.
Differences by Disability Type
To address the second research question, we examined differences in incomes between disability types. For the “All 10 Jobs” analysis, Table 4 presents the means for the eight disability conditions in homogeneous groups. Means in the same column are not significantly different from each other (p > .001), while means in different columns are significantly different from each other (p < .001).
Homogeneous Subsets for “All 10 Jobs” Income Analysis (N = 560).
Note. This table displays the observed means for groups in homogeneous subsets by column. The p values below each column indicate insignificant differences within each subset, and significant differences between each subset.
The eight disability conditions (seven disability types and a control condition) generated five distinct groups of salary levels in the analysis. Jobs with the lowest salaries were generated for candidates with “intellectual disability” (M = $43,406.06). Next, “serious mental illness” (M = $49,754.44) and “traumatic brain injury” (M = $50,463.07) generated the second lowest salaries. The control condition (no disability) and “autism” were recommended jobs associated with moderate salaries (M = $55,194.34 and $57,878.80, respectively). Succeeding this grouping was “multiple sclerosis” (M = $63,143.33). The “visual impairment” (M = $68,081.69) and “spinal cord injury” (M = $70,827.27) conditions received the highest salaries. Hence, some disabilities (“intellectual disability”, “serious mental illness”, and “traumatic brain injury”) were consistently suggested jobs that were lower paying than the control, while others (i.e., “multiple sclerosis”, “visual impairment”, and “spinal cord injury”) were consistently recommended jobs that were higher paying than the control. ChatGPT generated jobs with salaries as much as 75% higher for one disability versus another.
The “Top 3 Jobs” analysis demonstrated similar results (see Table 5). In this analysis, only four distinct groups of salary levels emerged. Candidates with “intellectual disability”, “serious mental illness”, and “traumatic brain injury” were recommended jobs with the lowest salaries (M = $40,199.08, M = $43,975.20, and M = $45,621.94, respectively). The control condition (M = $50,808.88) and “traumatic brain injury” condition (M = $45,621.94) received jobs of moderate salaries. Salaries for candidates with “traumatic brain injury” were not statistically different from “intellectual disability”, “serious mental illness”, or control condition salaries. Higher salaried jobs were generated for candidates with “multiple sclerosis” (M = $62,064.22) and “autism” (M = $60,132.56). Lastly, those with “visual impairment” (M = $69,051.37) and “spinal cord injury” (M = $70,222.88) generated the highest salaries. The same relative ordering of disabilities in terms of salaries existed as in the “All 10 Jobs” analysis, and again, the differences in salaries among disability groups are substantial.
Homogeneous Subsets for “Top 3 Jobs” Income Analysis (N = 560).
Note. This table displays the observed means for groups in homogeneous subsets by column. The p values below each column indicate insignificant differences within each subset, and significant differences between each subset.
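To make the size of the salary gap between disability conditions concrete, the sketch below computes how much higher the top mean salary is than the bottom one in each analysis, using the means reported in the text for “spinal cord injury” and “intellectual disability”. The helper `percent_higher` is a hypothetical name used only for this illustration.

```python
def percent_higher(high: float, low: float) -> float:
    """Return how much larger `high` is than `low`, as a percentage of `low`."""
    return (high - low) / low * 100


# Reported means: spinal cord injury (highest) vs. intellectual disability (lowest).
all_10_gap = percent_higher(70827.27, 43406.06)  # "All 10 Jobs" analysis
top_3_gap = percent_higher(70222.88, 40199.08)   # "Top 3 Jobs" analysis

# Roughly 63% in the "All 10 Jobs" analysis and roughly 75% in "Top 3 Jobs".
print(f"All 10 Jobs: {all_10_gap:.1f}% higher")
print(f"Top 3 Jobs:  {top_3_gap:.1f}% higher")
```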
Interaction Effects Between Vignette Condition and Disability Type
The significant Vignette by Disability interaction is depicted in Figure 1 for the “All 10 Jobs” analysis. In this graph, the male and female vignettes were combined for simplicity and for lack of statistically significant differences based on sex. Also, for simplicity, all disability conditions were combined and compared to the control (no-disability) condition.

Figure 1. Line graph depicting the “All 10 Jobs” analysis comparing all disabilities to control.
The No Vignette results are particularly noteworthy. In this vignette, no information about the candidate was given except for the type of disability. In this condition, ChatGPT generated jobs for the control condition with salaries approximately $20,000 higher than those generated for candidates with disabilities. That result suggests a strong bias in ChatGPT regarding disability status, which has the potential to produce quite significant differences in quality of life. However, when more information about the candidates was added in the other vignettes, this bias against candidates with disabilities largely disappeared. That finding may have implications for how users of ChatGPT can help counter this disability bias.
“F” Job Analyses
To address the third research question, we examined differences in job type between the vignette conditions and disability type (see Table 6). Overall, we analyzed 5,600 job recommendations across disability types and vignette conditions, coding each occupation into one of five categories: Food, Filth, Flower, Factory, and Other (non-“F” job), based on the criteria described in the Methods. ChatGPT awarded the highest prevalence of “F” jobs to individuals with “intellectual disability” (246, 30.85%), followed by “serious mental illness” (133, 16.6%), control condition (117, 14.6%), “autism” (115, 14.4%), “traumatic brain injury” (110, 13.8%), “multiple sclerosis” (39, 4.9%), “visual impairment” (28, 3.5%), and “spinal cord injury” (14, 1.8%). Regarding job classification, Food represented 33.4% (268) of all “F” jobs identified, Filth represented 11.3% (91), Flowers represented 0.9% (7), and Factory represented over half of all “F” jobs at 54.4% (436). Collectively, “F” jobs represented 14.3% (802) of all jobs generated by ChatGPT. Across vignettes, the Omit-Interests Male vignette generated the most “F” jobs at 22.9% (183). This was followed by the Omit-All Male vignette at 19.3% (154), Omit-Interests Female vignette at 18.5% (148), Omit-All Female vignette at 14.4% (115), No Vignette condition at 9.5% (76), Full-Vignette Male condition at 8.5% (68), and Full-Vignette Female condition at 7.3% (58).
Note. V1M: Full Vignette Male, V1F: Full Vignette Female, V2F: Omit-Interests Female, V2M: Omit-Interests Male, V3M: Omit-All Male, V3F: Omit-All Female, V4: No Vignette Condition.
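The category shares reported above can be checked directly from the counts. The sketch below recomputes each category's share of all “F” jobs and the overall share of “F” jobs among the 5,600 recommendations; the variable names are illustrative only.

```python
# Counts of "F" jobs by category, as reported in the text.
f_job_counts = {"Food": 268, "Filth": 91, "Flower": 7, "Factory": 436}
TOTAL_RECOMMENDATIONS = 5600

total_f_jobs = sum(f_job_counts.values())

# Share of each category among all "F" jobs, in percent (one decimal place).
category_shares = {
    category: round(100 * count / total_f_jobs, 1)
    for category, count in f_job_counts.items()
}

# Share of "F" jobs among all recommendations, in percent.
overall_share = round(100 * total_f_jobs / TOTAL_RECOMMENDATIONS, 1)

print(total_f_jobs)      # 802
print(category_shares)   # Food 33.4, Filth 11.3, Flower 0.9, Factory 54.4
print(overall_share)     # 14.3
```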
Z-scores were computed to compare the proportions of “F” jobs between groups. The proportions of “F” jobs for male and female vignettes differed significantly, with males recommended more “F” jobs than females, z = 3.07, p < .005. Comparing the No Vignette condition to all other vignettes revealed a significant difference (z = –3.014, p < .005), suggesting that vignettes containing demographic details (Full-Vignette Female, Full-Vignette Male, Omit-Interests Female, Omit-Interests Male, Omit-All Female, Omit-All Male) had significantly more “F” jobs generated than the No Vignette condition. The proportion of “F” jobs generated for the control (no-disability) condition did not significantly differ from that of the disability conditions when combined, z = 1.932, p > .1. However, closer examination revealed significant differences between the control condition and specific disability types. The proportion of “F” jobs was significantly higher for the control condition compared to “multiple sclerosis” (z = 6.625, p < .001), “visual impairment” (z = 7.806, p < .001), and “spinal cord injury” (z = 9.452, p < .001). In contrast, “intellectual disability” generated more “F” jobs than the control condition (z = –7.867, p < .001). There were no significant differences found between proportions of “F” jobs for control vs. “autism” (z = .144, p > .1), “serious mental illness” (z = –1.117, p > .1), or “traumatic brain injury” (z = .501, p > .1).
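The text does not state the exact formula behind these z-scores. One common choice for comparing two proportions is the pooled two-proportion z-test, sketched below as an assumption rather than as the authors' documented method. The sketch also assumes 700 recommendations per disability condition (5,600 divided evenly across eight conditions). Under those assumptions, the control versus “intellectual disability” comparison comes out at about –7.87 and the control versus “spinal cord injury” comparison at about 9.45, close to the values reported above.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under the null hypothesis
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Assumption: 700 recommendations per disability condition (5,600 / 8).
N_PER_CONDITION = 700
CONTROL_F_JOBS = 117

# "F" job counts reported in the text for selected disability types.
f_jobs_by_disability = {
    "intellectual disability": 246,
    "serious mental illness": 133,
    "multiple sclerosis": 39,
    "visual impairment": 28,
    "spinal cord injury": 14,
}

z_scores = {
    disability: two_proportion_z(
        CONTROL_F_JOBS, N_PER_CONDITION, count, N_PER_CONDITION
    )
    for disability, count in f_jobs_by_disability.items()
}

for disability, z in z_scores.items():
    print(f"control vs. {disability}: z = {z:.3f}")
```

A positive z here means the control condition had the higher proportion of “F” jobs; a negative z means the disability condition did.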
Overall Findings
Our study found statistically significant differences in income based on disability, specifically regarding the frequency of low-income “F” job occupations generated for people with intellectual disability, and the generation of high-income jobs for individuals with physical and sensory disabilities (e.g., spinal cord injury, visual impairment). Furthermore, the more detailed prompts generated lower salary occupations than prompts that were broad and contained only the disability type. These findings have important implications for the clinically responsible use of ChatGPT.
Discussion
Findings from this study revealed significant disparities in characteristics of job recommendations among disability conditions and nuanced findings regarding disability conditions compared to the control condition. Overall, ChatGPT's job recommendations aligned with prevailing stereotypes towards individuals with intellectual disability and serious mental illness, and were more favorable towards spinal cord injury and visual impairment. This was congruent with type of job as well, with “intellectual disability” generating the highest number of “F” jobs and “spinal cord injury” generating the lowest number (246 compared to 14). Interestingly, there were no significant differences between the “All 10 Jobs” and the “Top 3 Jobs” analyses, indicating that the “best fit jobs” generated by ChatGPT are not higher in salary than an aggregation of all 10 generated jobs. This is an indication of what ChatGPT considers to be the “best fit” in terms of job placement, with less emphasis on salary and more on interest. Unexpectedly, the no-disability control condition did not have the highest average salary across vignettes compared to the disability types. ChatGPT generated higher salaries for “spinal cord injury”, “visual impairment”, and “multiple sclerosis” across “All 10 Jobs”, with the addition of “autism” in this group when considering the “Top 3 Jobs” analysis. This is inconsistent with current research, which suggests that individuals without disabilities earn higher salaries than PWD (Jajtner et al., 2020), and with national data. For instance, the median annual earnings for non-institutionalized adults (21–64 years old) with any disability in the U.S. was $51,000 in 2023 (Erikson et al., 2025), notably lower than the Bureau of Labor Statistics’ report of $61,440 among all U.S. workers in that same year (Guzman & Kollar, 2024).
The disabilities with the lowest average income and the highest “F” job prevalence were intellectual disability and serious mental illness, and these significantly diverged from the findings for those with physical/acquired disabilities (spinal cord injury, multiple sclerosis, traumatic brain injury, visual impairment). Interestingly, the “autism” and control conditions had commensurate average incomes and quantities of “F” jobs generated. This may be because there has been increased funding, attention, and research focused on facets of autism (Sohn, 2020). The research and “Big Data” that ChatGPT pulls from may contain more positive research associated with autism than with disabilities like serious mental illness and intellectual disability. Further, some physical disabilities presented in this study (spinal cord injury, multiple sclerosis, visual impairment) surpassed the average income of the control and had fewer generated “F” jobs than intellectual disability and serious mental illness. Our findings echo documented income disparities among these disability groups and align with previous research reporting the greater stigmatization of intellectual disabilities and psychiatric disabilities relative to other disabilities (Bogart et al., 2018; Werner, 2015). For example, data from the American Community Survey indicate that the average annual earnings for adults with visual impairment are $45,315 (McDonnall et al., 2022), whereas median annual earnings for people with intellectual disabilities have been reported to be $11,400 (MyDisabilityJobs, 2024). It is unclear why ChatGPT assigned higher income job placements and fewer low-skill “F” jobs for physical/acquired disabilities like spinal cord injury, multiple sclerosis, and visual impairment compared to the control condition. This could stem from several factors tied to the model's training data and design.
For example, ChatGPT is trained on vast amounts of internet-based text, which may include advocacy materials, disability employment success stories, and rehabilitation research that highlight positive employment outcomes for people with these disabilities, potentially leading to an overcorrection or optimism bias in job match suggestions for these specific disabilities. Another possibility is that the control condition might be interpreted too generally or neutrally by ChatGPT, lacking protected characteristics that might prompt compensatory consideration and resulting in blander, low-stakes job suggestions. These differences likely reflect the near-contemporary literature that is available for ChatGPT to extract from and represent the wealth of biases that are accessible online.
Z-score analyses indicated that intellectual disability was the only disability type for which the proportion of “F” jobs significantly exceeded that of the control condition. These results further demonstrate bias and stigmatization toward individuals with intellectual disability. This finding is a critical consideration for counselors collaborating with ChatGPT on job placement recommendations, as it reveals that bias extends to job type as well as income. Although ChatGPT's suggestions do highlight important disparities, employer assessments of capabilities are not always erroneous and can align with the functional realities of some PWD. It is imperative that person-centered VR strategies, such as supported decision making (Shogren et al., 2021), are used to articulate the career goals of the individual.
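For readers who wish to run a similar comparison on their own outputs, the following is a minimal sketch of a pooled two-proportion z-test, one common way to compare the share of “F” jobs between a disability condition and the control condition. The text does not specify the exact procedure used in this study, so the pooled formulation, the function name two_proportion_z, and the counts in the example are assumptions for illustration only and are not the study's data.

```python
from math import sqrt, erfc


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test.

    x1, x2: number of "F" jobs in each condition (hypothetical inputs)
    n1, n2: total number of jobs generated in each condition
    Returns (z, two-sided p-value).
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("sample sizes must be positive")
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        raise ValueError("pooled proportion is 0 or 1; z is undefined")
    # Standard error under the null hypothesis of equal proportions
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Illustrative counts only (NOT taken from the study):
# 30 of 100 jobs coded "F" in one condition vs. 15 of 100 in the control.
z, p = two_proportion_z(30, 100, 15, 100)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

A positive z indicates a higher proportion of “F” jobs in the first condition. When many disability types are each compared with the control, a correction for multiple comparisons would also be worth considering.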
Inconsistent with previous research on gender bias in AI, our study offered a new perspective on sex differences. Although no statistically significant difference in average income was found between sexes, males were more likely than females to be recommended “F” jobs. This may be because some “F” jobs (e.g., cleaning, factory, and landscaping jobs) require more physicality, a characteristic more likely to be attributed to males than females. Among “F” jobs, our results showed a greater emphasis on Factory positions than on Food service positions, a pattern that varies across current employment settings for people with disabilities.
Among the vignette types, Full-Vignette Female generated the fewest “F” jobs (58), whereas Omit-Interests Male generated the most (183) and had the lowest average salary of all groups. The No Vignette condition exceeded all vignette incomes by over $20,000. These results suggest that as fewer details are included in the vignette, the average salary of jobs recommended by AI rises across all but one type of disability (intellectual disability). These results are valuable to VR counselors in the context of developing job placement recommendations for clients that may be financially sustainable. However, as the vignettes provided more details related to interest, average income decreased, suggesting that ChatGPT prioritizes the client's interests over other occupational details critical for appropriate job recommendations. In fact, z-score analyses showed that providing no vignette was associated with a lower proportion of “F” jobs compared to all other vignette types. While including information about interests and background may improve job matching, it can also narrow the range of options presented. To capture a fuller set of opportunities, VR providers should consider using both personalized searches and broader, general searches when exploring potential job matches.
Implications
ChatGPT can be a valuable tool for VR professionals when brainstorming and generating job matches, identifying local employers that align with a client's skills and interests, and exploring broader labor market trends. However, AI tools should be used thoughtfully with an awareness of their limitations, such as potential biases in training data. Our findings support the use of caution when collaborating with AI. ChatGPT generated lower-paying and less desirable job placements for individuals with intellectual disabilities and serious mental illness than for individuals with spinal cord injury or visual impairment, who were recommended higher-salary positions. These findings point to a bias that favors more visible, physical disabilities over mental disabilities. In addition to generating biased job placements for individuals with specific disabilities, ChatGPT also generated erroneous SOC codes, which may mislead VR professionals. Our results underscore the need for further discussion among VR professionals about the ethical use of AI in clinical and research practice, and about appropriate use concerning specific clients’ needs. Appropriate use consists of tailoring client vignettes to include their interests to prevent poor fit, adequately evaluating job placement recommendations for potential bias trends (e.g., jobs with low salaries, unrealistic/misaligned jobs), and clarifying job type (part-time vs. full-time employment) and other job attributes (e.g., schedule preference, work environment). We found that presenting ChatGPT with different vignettes generated different job placement recommendations. In practice, it may be important to give ChatGPT multiple vignettes with varying amounts of information and to use clinical and experiential judgement to determine the most appropriate employment goal and plan.
Finally, when using any form of AI in a clinical application, it is essential to ensure that no potentially sensitive or personally identifiable information is shared. VR providers must protect client privacy and health information, for example by refraining from including details that could make a client identifiable (e.g., birthdate, rare conditions, detailed employment history). Awareness of existing data and discrepancies in statistics, unfavorable and undesirable disability hiring practices, and recommendations in public literature about ideal hiring scenarios is important for critically evaluating ChatGPT outputs for use with a specific case and for promoting judicious and informed use of generative AI.
Limitations
There are several limitations of this study that can inform future research. It is important to clarify that the results presented in this study reflect the information available online and existing data patterns, not the personal biases of ChatGPT itself. Regarding the challenges of conducting research with ChatGPT, ongoing updates to AI systems may have affected our study results. Although we were able to maintain our work with ChatGPT 4o mini for the duration of the study without interference from public updates, it is important to acknowledge that our results might have differed had our work been conducted at a different time. Another limitation of the study is our focus on salary outcomes associated with the identified occupations. Although important, other relevant outcomes, such as labor market accessibility, were not examined and should be considered in future research. Further, the scope of our vignettes was potentially limited. Namely, our findings suggest that differences in the amount of information presented in the vignette contribute to salary differences among different disability types. It may be helpful to expand this study in future research by composing more vignettes with more detailed information to discover further potential biases related to additional disability categories and intersecting characteristics, such as race/ethnicity and age. Regarding the construction of vignettes, our findings may have been influenced by the focus on urban settings and associated salary levels. Further, excluding geographic information from the final vignette (No Vignette) may have affected salary outcomes, underscoring the influence of geographic context on salary estimates. This observation reveals the need for providers to decide carefully what information to include in or exclude from their clients' vocational profiles.
Finally, it is important to acknowledge the ethical considerations of repetitive usage of AI, and the impact on the environment (Karakaş & Özdemir, 2025).
Conclusions and Future Research
People with disabilities are willing, capable, and often determined to work (Colorafi et al., 2021). VR counselors play an essential role in assisting people with disabilities in finding appropriate and accessible employment. ChatGPT offers unique assistance to VR providers in generating job placement recommendations. We suggest a careful and considerate use of AI, similar to the recommendations of Skerritt (2023), who emphasizes the need for clear ethical guidelines to protect clients, advanced research into the impact of AI on VR job expectations, increased value alignment training for AI, and privacy protection. McDonald et al. (2025) praise the exploration of AI in pre-employment settings to create holistic and creative experiences for youth with disabilities, while also advocating for further refinement of AI. Likewise, we do not want to ignore the practice implications and client benefits of using AI in VR. Rather, we want to shed light on the need for more research on ChatGPT recommendations and the possible barriers and challenges for VR providers. Clinical impressions and opinions are necessary when reviewing the work of ChatGPT to guard against potential bias in ChatGPT job recommendations, particularly against individuals with intellectual disabilities.
Footnotes
Acknowledgements
We would like to thank the Department of Psychology at The Illinois Institute of Technology for their continued support, and the efforts of all vocational rehabilitation counselors to provide considerate care to their clients.
Ethical Statement
This study was exempt from Institutional Review Board approval.
Informed Consent
The research did not involve human subjects and informed consent was not obtained.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
