Artificial Intelligence to Support Oral Reading Fluency Assessment in English and Spanish for Students With Reading Disabilities

Abstract

Large language models (LLMs) can quickly produce passages of varying length, complexity, genre, and topic. This capacity could be useful to teachers of monolingual and multilingual students with reading-specific learning disabilities, particularly for monitoring oral reading fluency (ORF). This study addressed one primary research question with two sub-questions. Research Question 1: What is the quality of third-grade ORF passages generated by LLMs compared with validated third-grade ORF passages in English and Spanish? Research Question 1a: What is the quantitative readability of LLM-generated passages? Research Question 1b: What are the conceptual diversity and co-occurrence of themes in these passages? Analyses were conducted using natural language processing tools, including Coh-Metrix, MultiAzterTest, and Leximancer. Results indicate that readability metrics vary greatly across texts generated by LLMs (ChatGPT, Claude), even with consistent prompting; the use of LLMs to produce ORF passages for progress monitoring and high-stakes decision-making is not recommended at this time.

Keywords

technology reading assessment
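The abstract reports that readability metrics vary greatly across LLM-generated passages. As a rough illustration of what such variability can look like, the sketch below computes two simple surface measures (words per sentence and letters per word) for a set of passages and summarizes their spread with a coefficient of variation. These measures are stand-ins chosen for brevity; they are not the Coh-Metrix or MultiAzterTest indices used in the study, and the function names and sample passages are hypothetical.

```python
import re
from statistics import mean, pstdev


def surface_metrics(passage: str) -> dict:
    """Return two crude surface measures for one passage (illustrative only)."""
    # Split on sentence-final punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", passage) if s.strip()]
    # Runs of Unicode letters, so accented Spanish characters count as letters.
    words = re.findall(r"[^\W\d_]+", passage)
    if not sentences or not words:
        raise ValueError("passage must contain at least one word and one sentence")
    return {
        "words_per_sentence": len(words) / len(sentences),
        "letters_per_word": mean(len(w) for w in words),
    }


def coefficient_of_variation(values) -> float:
    """Population standard deviation divided by the mean (unitless spread)."""
    values = list(values)
    return pstdev(values) / mean(values)


if __name__ == "__main__":
    # Hypothetical passages standing in for repeated outputs of one prompt.
    passages = [
        "The dog ran. The cat sat. They play.",
        "Although the afternoon was warm, the children kept exploring the library.",
        "¿Dónde está el niño? Está aquí.",
    ]
    per_passage = [surface_metrics(p) for p in passages]
    for name in ("words_per_sentence", "letters_per_word"):
        spread = coefficient_of_variation(m[name] for m in per_passage)
        print(f"{name}: CV = {spread:.2f}")
```

A larger coefficient of variation on a given measure means the passages differ more from one another on that measure, which is the kind of inconsistency that would matter for progress-monitoring probes intended to be of equivalent difficulty.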

