Abstract
Fundamentals of environmental engineering (FEE) is a common core component of undergraduate curricula in civil engineering, environmental engineering, and environmental resources engineering, and it is increasingly becoming a core course in other disciplines. Conceptual understanding is an important pedagogical goal in FEE instruction. A strategy used to address this need in other fields is the development and implementation of a concept inventory (CI) test, which is an assessment tool that identifies common misconceptions about key concepts. Here, a Delphi process was used to identify FEE concepts that were deemed fundamental, important, and at the same time, prone to misperceptions by students, including (1) reactor theory, (2) the mass balance equation, (3) biochemical oxygen demand, (4) units of measure, and (5) chemical equilibrium and partitioning. A CI was developed that included incorrect answers, or “distractors,” based on student interviews that identified common FEE misconceptions. The CI was beta tested in FEE courses at six universities. Analysis of psychometric data from beta testing revealed which concepts were most difficult and identified concepts that needed further refinement. Being cognizant of student misconceptions is a prerequisite for faculty who strive to improve students' conceptual understanding of FEE concepts.
Introduction
Conceptual understanding is an important pedagogical goal in FEE instruction, or, indeed, in any engineering instruction. The celebrated book How People Learn emphasizes the importance of conceptual understanding (Bransford et al., 2000). The authors assert that students who organize facts and ideas within a conceptual framework are more likely to learn new information quickly and will also be able to apply what they have learned to new situations. By contrast, students new to a field of study tend to make superficial connections or shoehorn memorized formulae rather than organizing their knowledge around fundamental concepts and general principles. Studies of learners' abilities to apply their knowledge to novel problems, a process often referred to as transfer of learning, have clearly shown that for knowledge to be transferred it must be based upon general principles (Litzinger et al., 2011). However, presently there is no readily available resource that lists the general principles needed by a student to understand core concepts in FEE classes. This lack results in at least two important challenges for FEE instructors. First, there is a lack of knowledge of which fundamental, underlying concepts give students the most difficulty and prevent them from mastering course material at the desired level. Second, there is no quantitative way to assess whether different teaching approaches, such as laboratory projects, different classroom formats, or curricular ordering improve students' conceptual understanding of FEE material.
A strategy that has been used to address this problem is the development and implementation of a concept inventory (CI). The pioneering work in CI development was the Force CI, a brainchild of Hestenes et al. at Arizona State University (Hestenes and Wells, 1992; Hestenes et al., 1992). CIs have been shown to be a powerful and accessible tool to support iterative improvement in faculty teaching and to enhance the scientific literacy of students (Smith and Tanner, 2010). They have been used to catalyze curriculum reform (Hake, 1998; Smith et al., 2008), and to identify student weak spots (Garvin-Doxas et al., 2007). The CI model has been applied by educators in many science and engineering disciplines; at present CIs exist for chemistry (Krause et al., 2004), biology (Klymkowsky et al., 2010), statistics (Stone et al., 2003; Allen et al., 2004; Allen, 2006; Stone, 2006), electromagnetics (Notaros, 2002), electromagnetic waves (Roedel et al., 1998; Rhoads and Roedel, 1999), circuits (Herman et al., 2011), signals and systems (Wage and Buck, 2001), strength of materials (Richardson et al., 2003), thermodynamics (Midkiff et al., 2001), materials science (Krause et al., 2003), dynamics (Gray et al., 2003, 2005), fluid mechanics (Martin et al., 2003, 2004), and statics (Steif and Dantzler, 2005).
Although many CIs have been developed as noted above, the process of generating a CI has not been very well defined. The primary input is typically the experience of one or more faculty members who have taught the course for many years and thus have a strong sense of what concepts students are having difficulty with. In the case of FEE, this situation is more complex. Environmental engineering itself is interdisciplinary in nature; moreover, FEE is required of many majors and with a wide range of prerequisites. Because of this diversity of ways in which FEE courses are incorporated into curricula, there is likely to be a similar diversity of faculty opinions on concepts that belong to a fundamentals of environmental engineering concept inventory (FEECI). Therefore, the development of a CI for FEE courses represents a particular challenge.
The overall goal of this work was to produce an instrument for assessing conceptual understanding in a core curriculum course for civil and/or environmental engineering. Such an instrument has the potential to provide a needed technique for formative assessment of pedagogical frameworks and instructional methods in the FEE curriculum, and can play an important role in assessment for programmatic accreditation under the ABET standards. Results of the CI could also be used to indicate the effectiveness of different sequences of prerequisites or other programmatic decisions, which might be particularly beneficial for environmental engineering and science programs housed in nontraditional departments. The specific objectives were to (1) identify key concepts in FEE that are both important and difficult to understand, (2) develop a FEECI that can be used to quantify students' conceptual understanding of these key FEE concepts, (3) administer a “beta” version of the FEECI at U.S. universities with required undergraduate FEE courses, and (4) conduct a psychometric analysis of the results of its initial administration, thereby assessing the effectiveness of questions on the FEECI and identifying trends in conceptual understanding of key FEE concepts. Although the authors previously published a brief conference paper that outlined a general approach to developing a FEECI (Sengupta et al., 2013), the current article provides greater detail on the methods used, results from FEECI beta testing, psychometric analysis, and discussion of the implications of the results.
Methods
A FEECI Development Team was formed that consisted of professors from the following 12 universities: University of Colorado Denver, Florida A&M University–Florida State University Joint College of Engineering, Humboldt State University, Lehigh University, University of Massachusetts Amherst, University of Massachusetts Dartmouth, Missouri University of Science & Technology, Northeastern University, University of South Carolina, University of South Florida, University of Utah Salt Lake City, and Utah State University. The institutions were selected to represent a range of sizes, to include both public and private schools, and to include both teaching-focused and research-focused universities. Faculty were identified who had taught FEE courses for multiple years.
Topic identification
Development Team faculty initially submitted their syllabi to the authors to identify common topics among the courses. In CI development, a topic is a course unit, which may include multiple concepts. For example, the topic of chemical equilibrium includes multiple concepts, such as precipitation-dissolution equilibrium, acid-base chemistry, and gas-liquid partitioning. Once common topics had been identified, the FEECI was developed following the general procedure outlined by Adams and Wieman (2011), using the specific steps described below.
Delphi study
The rationale of the Delphi method is that it iteratively moves a team of experts toward a consensus (Dalkey and Helmer, 1963; Streveler et al., 2003). The process recognizes that expert judgment is necessary to draw conclusions in the absence of full scientific knowledge. The method avoids relying on the opinion of a single expert or merely averaging the opinions of multiple experts (Goldman et al., 2008). Instead, experts share ideas and beliefs based on their expertise in teaching (in this case, FEE), so that each can make a more informed decision, but in a structured way. This can prevent the situation in which, for example, in round-table discussions a few panelists have excessive influence (Pill, 1971).
In the present study, an online Delphi study was used to identify concepts in FEE courses that are critical but prone to difficulty among students. Each member of the FEECI Development Team completed an online survey as the first step of the Delphi study (stage 1). The survey asked members of the Development Team to identify, in an open-ended format, up to five key concepts within each of the identified topics. For each concept listed, the Development Team members also rated how important they perceived that concept to be, and how difficult it is for students, using a five-point Likert scale. The members were also asked to state their rationale for including that particular concept in the list. After the authors identified the most common concepts for each topic, members of the Development Team were resurveyed (stage 2). In this round, members were informed of the frequency with which each concept was reported in stage 1 and of the means and standard deviations of the team's Likert scale values in stage 1, and were then asked again to rate the importance and the difficulty of each concept.
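The stage 1 feedback described above (frequency, mean, and standard deviation per concept) can be illustrated with a minimal Python sketch. The function name, the data layout, and the ratings are hypothetical, and the use of the sample standard deviation is an assumption; the source does not state which form was reported to panelists.

```python
from statistics import mean, stdev

def summarize_ratings(ratings):
    """Summarize Likert ratings per concept.

    ratings: {concept name: [1-5 ratings from each panelist who listed it]}
    Returns {concept: {"frequency", "mean", "sd"}}.
    """
    summary = {}
    for concept, scores in ratings.items():
        summary[concept] = {
            "frequency": len(scores),      # how many panelists listed the concept
            "mean": mean(scores),
            # Sample SD is an assumption; it is undefined for a single rating.
            "sd": stdev(scores) if len(scores) > 1 else 0.0,
        }
    return summary

# Hypothetical difficulty ratings, not data from the study.
difficulty_ratings = {"mass balance": [4, 5, 3, 4], "BOD": [3, 4]}
summary = summarize_ratings(difficulty_ratings)
```

A table of this kind is what a stage 2 panelist would consult before re-rating each concept.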
Surveys were also administered to 10 students at each of three universities: University of Massachusetts Dartmouth, University of South Florida, and University of Utah Salt Lake City. The 30 students were volunteers from a pool of students who had recently completed an FEE course. Students were given the results of the stage 1 Delphi study described above and asked to rank the importance and the difficulty of each concept using a five-point Likert scale.
Generation of FEECI questions and distractors
Based on the Delphi Study, eight concepts (discussed in detail in Results of CI Development section) were initially identified as being both important and difficult for students in FEE courses. Questions were formulated following the heuristic propounded by Simoni et al. (2004) for each of the initial eight concepts for possible inclusion in the FEECI test. Simoni's heuristic consists of the following guidelines:
(1) Each question should cover only a single concept. (2) The wording of the problem statement and the choice of answers should be such that little computation is required for solving the problem. (3) The answers should be framed such that the correct answer identifies proper understanding of the concept, and the incorrect answers accurately represent the students' misconceptions. (4) Each problem should not be dependent upon a nonstandard term or definition.
Each CI question consists of a “stem” (i.e., the prompt or question to be answered) along with the correct answer and a set of “distractors” (incorrect answers that represent common student misconceptions). To develop the distractors, the stems were administered to student volunteers in an open-ended or “free response” fashion, that is, without any choices for selection. Thirty student volunteers were chosen, 10 each from University of Massachusetts Dartmouth, University of South Florida, and University of Utah. Necessary IRB approvals to interact with these students were obtained from each campus (Approval No. 10.069 and its extension No. 12.032 under the “Expedited Category No. 6 & 7” at UMass Dartmouth; approval No. Pro00003722 at USF; and approval No. 00048622 at University of Utah). These student volunteers, who were either registered in the FEE course at their campus at that time or had recently completed it, were asked to describe (both verbally and in writing) their rationale or thought process for each question as they worked through the answers. The study authors took notes during these sessions. If multiple student volunteers indicated a particular misconception or incorrect approach, that misconception was used to formulate a distractor for the multiple-choice version of the FEECI. Sengupta et al. (2013) provide additional details on the development of FEECI questions and distractors.
FEECI beta tests
A beta version of the FEECI was administered to faculty volunteers during the Association of Environmental Engineering and Science Professors (AEESP) bi-annual Education and Research Conference, which was held in Golden, CO, in 2013. Faculty were asked to take the test and provide their comments or questions. They were also solicited for suggestions for additional questions to be incorporated into the FEECI. The results of the faculty responses were used to revise the CI. Subsequently, a 21-question CI was administered to 320 students at the following six campuses during the Fall 2013 or Spring 2014 semesters: University of Massachusetts Amherst, University of Massachusetts Dartmouth, University of South Carolina, University of South Florida, University of Utah, and Northeastern University. The 1-h, multiple-choice test was typically administered near the end of the semester, that is, after the concepts tested had been covered in the course during the semester. Some researchers have experimented with administering this instrument twice, at the beginning of the semester (pretest) and again at the end of the semester (posttest) and computing “normalized gain,” which is defined as (Hake, 1998):
$$\langle g \rangle = \frac{\text{posttest score (\%)} - \text{pretest score (\%)}}{100\% - \text{pretest score (\%)}}$$
that is, the fraction of the available improvement in score that was obtained during the course. Thus, the higher the gain, the more effective the course was in instilling fundamental concepts. We did not follow this procedure because we felt students in general would be unfamiliar with environmental engineering terminology at the beginning of the semester even though they may understand the basic principles involved. In such a situation, the pretest would not be able to provide any meaningful information.
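Although a pretest was not used in this study, the normalized gain is simple to compute. The following Python sketch assumes class-average scores expressed as percentages; the function name and the numbers are illustrative only.

```python
def normalized_gain(pre_percent: float, post_percent: float) -> float:
    """Fraction of the available improvement achieved: (post - pre) / (100 - pre)."""
    if not (0 <= pre_percent < 100) or not (0 <= post_percent <= 100):
        raise ValueError("scores must be percentages, with the pretest below 100")
    return (post_percent - pre_percent) / (100.0 - pre_percent)

# Hypothetical class averages, not FEECI data:
# 30 points gained out of 60 points available.
gain = normalized_gain(pre_percent=40.0, post_percent=70.0)
```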
Data analysis
Item analysis was conducted based on classical test theory (CTT) to examine the validity and reliability of the FEECI using the data from the beta tests on the six campuses. Several statistical parameters were computed to indicate the quality of individual questions and the whole test. All item and distractor analyses were conducted using the IA_CTT SAS macro developed by Yi-Hsin and Li (2015).
Item difficulty index
The item difficulty index is the proportion of students who answer a question correctly, and ranges from 1 (i.e., all students answer a question correctly) to 0 (i.e., all students answer a question incorrectly). Items with difficulty >0.90 (too easy) or <0.10 (too hard) need to be revised or eliminated from the test because they provide little information in terms of students' performance on the test.
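As an illustration, the difficulty index and the 0.10/0.90 screening rule can be sketched in Python as follows. This is not the IA_CTT SAS macro used in the study; the function names and the small response matrix are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of students answering each item correctly.

    responses: one list per student of 0/1 item scores (1 = correct).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(student[j] for student in responses) / n_students
            for j in range(n_items)]

def flag_extreme_items(difficulties, too_hard=0.10, too_easy=0.90):
    """Indices of items that are candidates for revision or elimination."""
    return [j for j, p in enumerate(difficulties) if p < too_hard or p > too_easy]

# Hypothetical scores for four students on three items.
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
]
difficulty = item_difficulty(scores)      # item 0 is too easy, item 2 too hard
flagged = flag_extreme_items(difficulty)
```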
Item discrimination indices
Item discrimination indices are used to measure how well a question distinguishes students with different performance levels on the scale of interest. Three item discrimination indices were computed in this study: (1) discrimination from two extreme groups, (2) item-total correlation, and (3) item-concept correlation.
We classified the top 25% of students (Q1; ≥12 of 21 items correct) and the bottom 25% of students (Q4; <7 items correct) into the highest and lowest performing groups, respectively. Then, discrimination from the two extreme groups was computed as the fraction of Q1 answering the item correctly minus the fraction of Q4 answering correctly; this metric ranges from 1 (all students in the high performing group answer a question correctly and all students in the low performing group answer incorrectly) to −1 (the reverse). A large positive value indicates an item that discriminates well, while items with values <0.20 are considered poor.
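A minimal Python sketch of the extreme-group discrimination index follows. The score cutoffs are passed in as parameters (in the study they were ≥12 and <7 of 21 items); the function name and the toy data are hypothetical.

```python
def extreme_group_discrimination(responses, item, high_cutoff, low_cutoff):
    """Fraction correct in the high group minus fraction correct in the low group.

    High group: total score >= high_cutoff. Low group: total score < low_cutoff.
    """
    totals = [sum(student) for student in responses]
    high = [s[item] for s, t in zip(responses, totals) if t >= high_cutoff]
    low = [s[item] for s, t in zip(responses, totals) if t < low_cutoff]
    if not high or not low:
        raise ValueError("both extreme groups must be non-empty")
    return sum(high) / len(high) - sum(low) / len(low)

# Hypothetical 0/1 scores for six students on four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
d_item0 = extreme_group_discrimination(scores, item=0, high_cutoff=3, low_cutoff=2)
d_item3 = extreme_group_discrimination(scores, item=3, high_cutoff=3, low_cutoff=2)
```

In this toy data set, item 0 separates the two groups perfectly, whereas item 3 falls below the 0.20 threshold and would be considered poor.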
The item-total correlation is the correlation between the item response (1 for the correct answer and 0 for the incorrect answer) and the total score for all remaining items on the CI. A large positive item-total correlation indicates consistency between the response to an individual item and overall performance on the test. Questions with an item-total correlation <0.10 were deemed candidates for revision or elimination (Nunnally, 1967; Stone, 2006; Revelle, 2014).
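The item-total correlation described above can be sketched in Python as a Pearson correlation between the 0/1 item response and the total score on the remaining items. The helper names and data are hypothetical, and this is an illustration rather than the SAS macro used in the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / sqrt(sxx * syy)

def item_total_correlation(responses, item):
    """Correlate one item with the total score on all remaining items."""
    item_scores = [s[item] for s in responses]
    rest_scores = [sum(s) - s[item] for s in responses]  # exclude the item itself
    return pearson(item_scores, rest_scores)

# Hypothetical 0/1 scores for six students on four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
r_item0 = item_total_correlation(scores, item=0)
```

Excluding the item from its own total avoids inflating the correlation, which matters most on short tests.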
The item-concept correlation was used to examine whether a response to a question was consistent with a student's performance on the concept to which that question belongs. Thus, the item-concept correlation was obtained by computing the correlation between the response to a question and the total score for the remaining questions within the same concept. A large positive item-concept correlation value for a question indicates that an individual question measures the same construct that the test or concept is designed to measure.
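The item-concept correlation differs from the item-total correlation only in restricting the comparison score to the other questions in the same concept. A hedged Python sketch, with a hypothetical assignment of items to a concept and made-up data:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / sqrt(sxx * syy)

def item_concept_correlation(responses, item, concept_items):
    """Correlate one item with the score on the other items of its concept."""
    others = [j for j in concept_items if j != item]
    item_scores = [s[item] for s in responses]
    concept_rest = [sum(s[j] for j in others) for s in responses]
    return pearson(item_scores, concept_rest)

# Hypothetical data: items 0-2 are assumed to belong to one concept.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
r_concept = item_concept_correlation(scores, item=0, concept_items=[0, 1, 2])
```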
Cronbach's alpha
Cronbach's alpha (Cronbach, 1951) has become a standard reliability measure under CTT. The value of alpha represents the smallest fraction of total score variance that is due to true score variance (rather than errors in measurement). The minimum acceptable value for alpha is typically set at 0.70 (Litwin, 1995), such that less than 30% of the score variance is due to measurement errors (Bardar et al., 2007). For the beta test of the FEECI, the Cronbach's alpha was computed using the IA_CTT macro in SAS® (Yi-Hsin and Li, 2015).
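For reference, Cronbach's alpha for a test of k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The Python sketch below illustrates that formula on hypothetical data; it is not the IA_CTT macro, and the use of sample variances throughout is an assumption (the ratio is unchanged as long as one convention is applied consistently).

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha from a students-by-items matrix of item scores."""
    k = len(responses[0])
    item_variances = [variance([s[j] for s in responses]) for j in range(k)]
    total_variance = variance([sum(s) for s in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical 0/1 scores for six students on four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(scores)  # below the 0.70 threshold for this toy data
```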
Alpha-if-item-deleted
The alpha-if-item-deleted statistic is the value of Cronbach's alpha recalculated after an individual item is removed from the test. A decrease in alpha means that removing the question makes the whole test less reliable, indicating that the question is a good one, whereas a question whose removal increases alpha is a poor one.
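The statistic can be illustrated by recomputing alpha with each item left out in turn. The Python sketch below uses hypothetical function names and data and assumes sample variances.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha from a students-by-items matrix of item scores."""
    k = len(responses[0])
    item_variances = [variance([s[j] for s in responses]) for j in range(k)]
    total_variance = variance([sum(s) for s in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

def alpha_if_item_deleted(responses):
    """Alpha recomputed with each item removed, in item order."""
    k = len(responses[0])
    results = []
    for drop in range(k):
        reduced = [[v for j, v in enumerate(s) if j != drop] for s in responses]
        results.append(cronbach_alpha(reduced))
    return results

# Hypothetical 0/1 scores for six students on four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
full_alpha = cronbach_alpha(scores)
deleted = alpha_if_item_deleted(scores)
```

In this toy example, deleting item 3 raises alpha (a poor item), while deleting item 0 lowers it sharply (a good item).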
Distractor analysis
We examined the frequency with which each distractor was chosen to gain insight into which misconceptions were most common for each concept. We also drew trace lines for each correct item and distractor by classifying the students into five different performing groups based on their raw scores (Table 1) and computing correct item and distractor frequency percentages for these groups. A distractor trace line is expected to decrease with increasing performing levels (i.e., higher-performing students select any incorrect answer less frequently), while a trace line for the correct option, or item characteristic line (ICL), is expected to increase with increasing performing level (i.e., higher-performing students select the correct answer more frequently).
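Trace lines of this kind can be tabulated with a short Python sketch. The group score bounds below are hypothetical (the actual five groups are defined in Table 1), as are the function name and the responses; the sketch uses three groups only to keep the example small.

```python
from collections import Counter

def trace_lines(choices, totals, group_upper_bounds, options):
    """Fraction of each performance group selecting each answer option.

    choices: option selected by each student on one question.
    totals: each student's raw total score on the test.
    group_upper_bounds: ascending, inclusive upper score bound of each group;
        the last bound should equal the maximum possible score.
    Returns {option: [fraction in group 1, fraction in group 2, ...]}.
    """
    groups = [[] for _ in group_upper_bounds]
    for choice, total in zip(choices, totals):
        for g, bound in enumerate(group_upper_bounds):
            if total <= bound:
                groups[g].append(choice)
                break
    lines = {opt: [] for opt in options}
    for members in groups:
        counts = Counter(members)
        for opt in options:
            # An empty group has no defined fraction.
            lines[opt].append(counts[opt] / len(members) if members else float("nan"))
    return lines

# Hypothetical responses to one question whose correct answer is "A".
choices = ["B", "C", "B", "A", "A", "A"]
totals = [2, 3, 8, 9, 15, 18]
lines = trace_lines(choices, totals, group_upper_bounds=[5, 10, 21], options="ABC")
```

Here the line for the correct option "A" rises across groups while the distractor lines fall, which is the expected pattern for a well-functioning question.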
Results of CI Development
Topic identification
A summary of the topics covered at participating universities was developed based on the syllabi provided by the faculty and is shown in Table 2. Five topics were chosen (mass balances, environmental chemistry, measurements and units, risk assessment, and environmental biology) for use in the Delphi study because they were common to FEE courses for at least half of the participating universities. It may be noted that many of the concepts that challenge students in topics such as water treatment or wastewater treatment would also fall under the heading of one (or more) of the five topics identified here. For example, “Monod growth kinetics” or “biochemical oxygen demand (BOD)” are both concepts that are applicable to the topic of wastewater treatment, but here were listed under the topic of environmental biology.
Delphi study and selection of concepts for inclusion in the CI
In stage 1 of the Delphi study, members of the FEECI Development Team were asked to identify fundamental concepts within each topic area, indicate their rationale for concept inclusion, and rate the degree of importance and difficulty of each concept using a five-point Likert scale. The initial identification of concepts by the Development Team members was “open-ended”; in other words, members were not prompted to select a concept from a menu. Instead, members were able to phrase their responses in any manner desired. Because of this, some interpretation of members' responses was required to determine whether two members had identified the same concept. For example, if one member wrote “oxygen demand,” one wrote “BOD,” and one wrote “biochemical oxygen demand,” these were all considered to be the same concept.
In stage 2 of the Delphi study, the FEECI Development Team was given a summary of the data from stage 1 and resurveyed. The data from stage 2 are shown in Fig. 1, which provides detailed information on the concepts identified, their degree of importance, and degree of difficulty. Student volunteers were given a similar survey. Based on these surveys, eight concepts were identified as both important and difficult: (1) reactor theory (RT), (2) the mass balance (MB) equation, (3) BOD, (4) units of measure (UOM), (5) chemical equilibrium, (6) partitioning, (7) reaction kinetics, and (8) definition of risk. The primary criterion for this list of concepts was the frequency of responses of the Development Team and the student volunteers.

Fig. 1. Concepts, degree of importance, and degree of difficulty identified by the FEECI Development Team within the following topics:
Following the question-writing sessions, concepts (7) “reaction kinetics” and (8) “definition of risk” were dropped from the CI to reduce the number of concepts tested to the practical limit. The practical limit is the number of concepts that can be covered on a CI test, as the test is expected to consist of no more than 30 questions (preferably fewer), and each concept must be represented by multiple questions to ensure reliability. Although these concepts were identified as both important and difficult, they did not readily lend themselves to CI questions that followed the heuristic described above. In addition, it was found that the concepts of (5) “chemical equilibrium” and (6) “partitioning” could be understood as a single concept, chemical equilibrium and partitioning (CEP). The final list of concepts, along with a discussion of their importance and difficulty, is provided in Table 3. Because of the trade-off between the number of concepts that can be covered and the number of questions that can be included for each concept, we deemed that the maximum number of concepts included was five, with at least four questions included for each concept.
*Difficulties related to students not having the knowledge and skills from previous coursework or experiences that faculty expect them to have when enrolled in an FEE course.
BOD, biochemical oxygen demand; CEP, chemical equilibrium and partitioning; DO, dissolved oxygen; FEE, fundamentals of environmental engineering; MB, mass balance; RT, reactor theory; UOM, Units of measure.
The specific difficulties listed in Table 3 can be grouped into five fundamental categories: lack of (1) basic knowledge and skills; (2) understanding of constitutive concepts; (3) methods of measurement; (4) knowledge of basic scientific laws; and (5) ability to transfer knowledge to new situations. For example, difficulties shown with an asterisk (*) in Table 3 were mainly related to students not having the knowledge and skills from previous coursework or experiences that faculty expect them to have when enrolled in an FEE course. Difficulties 2 and 4, although each appears only once in Table 3, may be of importance in other concepts not measured in the FEECI. Another issue is that the concepts identified by the Delphi study are themselves constituted by other concepts, such as the ability to use and apply equations properly, being able to distinguish between what an instrument is measuring and what the measurement indicates, or understanding the equivalency of different measures of concentration.
Generation of FEECI questions and distractors
Distractors were developed for each question based on misconceptions that became apparent when students responded to stems in an open-ended format. Distractors were designed to distinguish between a correct conceptual understanding and the most commonly identified misconceptions. For example, Fig. 2 shows an example stem for the RT concept, along with the correct answer and three distractors developed from student responses. Additional details, including samples of student responses, are provided in Sengupta et al. (2013).

Fig. 2. Example of a FEECI test question with stem, correct answer, and distractors.
Results of CI Testing
The 21-question beta version of the FEECI was administered to 320 students on six campuses. Table 4 presents the results of the statistical analysis of the FEECI questions.
Discrim. = item discrimination from two extreme groups; Item-total correl = the correlation between the response to a question and the total score for the remaining questions; Item-concept correlation = the correlation between the response to a question and the total score for the remaining questions in the same concept; Alpha-item-deleted = the test reliability when a question is removed from the test; Alpha-deleted-rank = the order in which questions would be deleted from the test based on alpha-item-deleted.
Test alpha = 0.577.
Item difficulty index
The item difficulty for the FEECI ranged from 0.150 (question 11; the most difficult question) to 0.763 (question 2; the easiest question). Most questions had a moderate difficulty level (0.30 to 0.70). Five of the 21 questions were difficult (<0.30; questions 1, 8, 11, 18, and 20). Only question 2 was an easy question (>0.70). The average difficulty for the FEECI test was 0.426, indicating that on average 42.6% of students answered a question correctly. For the purpose of diagnosing misconceptions, the FEECI questions had appropriate item difficulty levels.
Item discrimination indices
The cutoff values for discrimination from two extreme groups and item-total correlation were 0.20 and 0.10, respectively. Questions with indices below the cutoff value were considered to be poorly discriminating questions. Questions 1, 11, 14, and 16 were identified as poorly discriminating questions based on multiple metrics. In contrast, questions 10, 17, and 19 had high discriminating power (discrimination >0.5 and item-total correlation >0.3). As for the concepts, the BOD and RT questions tended to have both higher discrimination and higher item-concept correlation, followed by the questions for MB and UOM. Except for CEP, average discrimination values for all concepts were well above the cutoff criteria.
Cronbach's alpha
Cronbach's alpha was calculated for the FEECI and was found to have a value of 0.58 with 95% confidence interval (0.51, 0.65). Because the alpha value was below 0.70, it suggests that the beta version of the FEECI does not have sufficient reliability to be usable to assess students' conceptual understanding of FEE. However, there are some questions about what alpha actually measures (Sijtsma, 2009; Tavakol and Dennick, 2011). Cronbach (1951) developed alpha to measure the internal consistency of a test where “internal consistency describes the extent to which all the items in a test measure the same concept or construct and hence it is connected to the inter-relatedness of the items within the test” (Tavakol and Dennick, 2011). Given that the FEECI by design measures multiple concepts across items, an alpha value of 0.58 does not mean that it should be discarded. Rather, it indicates that there may be some items that need to be revised or eliminated.
Alpha-if-item-deleted
Table 4 shows that deleting any of four questions (questions 1, 11, 14, and 16) would increase alpha, with question 14 producing the largest increase. In contrast, deleting questions 10, 17, or 19 would produce the largest decreases in alpha. The last column in Table 4 ranks the items by alpha-item-deleted to indicate the order in which items should be considered for revision or elimination.
Distractor analysis
ICLs and distractor characteristic lines (DCLs) were used to further explore the quality of items and distractors (Fig. 3). For questions 10 and 17 (best questions), the item lines for the correct answers increase from low to high-performing groups, showing that they have high discriminating power: lower-performing groups tend to select an incorrect distractor while higher-performing groups tend to select the correct answer. However, ICLs for questions 14 and 1 (least discriminating) showed fluctuations when moving from the low to high scoring groups, indicating low discriminating power: higher-performing groups are not necessarily more likely to select the correct answer. DCLs also show discrimination power for each distractor. For question 17, distractor lines for options D and E showed better discriminating power than those for options B and C. Figure 3 suggests that options D and E for question 17 were well-operating distractors but options B and C might be implausible distractors.

Fig. 3. Item trace lines and distractor trace lines for questions 10 and 17, which show good discrimination, and questions 14 and 1, which show poor discrimination. Note that group 1 was the lowest performing group and group 5 was the highest performing group. The correct options are shown with an asterisk.
Discussion: Students' Conceptual Understanding of Fundamental Concepts
Reactor theory
The difficulty level for the four RT questions (percentage of students answering correctly) ranged from 30% to 51%, with an average of 44%. Students generally performed well (48–51% correct) on two questions that posed hypothetical physical situations and asked students to choose which type of ideal reactor best described that situation. On questions that related more specifically to the aspects of completely mixed flow reactors and plug flow reactors, students did not perform as well (30–47% correct). An advantage of the FEECI is that it enables faculty to identify such weak areas and to plan corrective actions that can be administered either as part of the Fundamentals course, or as part of a subsequent course later in the curriculum. For instance, to help students develop better conceptual understanding regarding these different types of reactors, one possible solution would be to incorporate demonstrations or laboratory activities involving flow-through reactors, for example, tracer tests with non-reactive dyes. Such an exercise might be beyond the scope of a Fundamentals course, but administering the FEECI provides a diagnostic assessment of what concepts must be given additional attention at some point in the curriculum.
The MB equation
On the four questions related to the MB, 22–61% of students answered correctly, with an average of 38% correct. Compared to the other four concepts, the questions related to the MB appear to have moderate difficulty, moderate ability to discriminate between high- and low-scoring students, and moderate correlation with overall student performance. MB questions were able to reveal important student misconceptions. When asked to select a material balance equation that described a hypothetical situation, 40% of students selected distractors that were not actually MB equations, but rather recognizable formulae that they may have seen during their FEE course. This suggests that many students may not have a fundamental understanding of the balance equation, its meaning, or how the balance equation can be used to describe a particular physical problem; instead, these students may try to solve MB problems by “finding the right formula.” Another pattern of misconception was that students do not properly understand that the balance equation must be applied to the mass of a species or constituent, and that each term in the equation must apply to the same species or constituent, as stated in the “Difficulty” column in Table 3.
As was observed previously, identification of such difficulties via administration of the FEECI allows faculty to plan corrective actions. For instance, laboratory exercises could easily be designed that involve mass input to a system and mass output from the system; asking students to apply the balance equation to a simple physical system might help students develop the proper conceptual understanding of what the balance equation “means.” Another advantage of the FEECI is that it could be used to diagnose whether such intervention activities do, in fact, lead to enhanced conceptual understanding by the students.
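What the balance equation “means” for a simple physical system can be written out term by term. The sketch below is an illustration under stated assumptions rather than an FEECI item: a single well-mixed control volume, one species tracked in every term, equal inflow and outflow, and an optional first-order decay with rate constant `k`. All names and numerical values are hypothetical.

```python
def accumulation_rate(q, c_in, c, volume, k=0.0):
    """Rate of change of the mass of ONE species in a well-mixed control volume.

    accumulation = mass in - mass out - mass reacted
    q in m3/d, concentrations in g/m3, volume in m3, k in 1/d -> result in g/d.
    """
    mass_in = q * c_in             # species carried in by the inflow
    mass_out = q * c               # same species carried out by the outflow
    mass_reacted = k * c * volume  # same species lost to first-order decay
    return mass_in - mass_out - mass_reacted


def steady_state_concentration(q, c_in, volume, k=0.0):
    """Concentration at which the accumulation term is zero."""
    return q * c_in / (q + k * volume)


# Hypothetical system: 100 m3/d through a 500 m3 tank, 50 g/m3 in, k = 0.2 per day
c_ss = steady_state_concentration(q=100.0, c_in=50.0, volume=500.0, k=0.2)
print("steady-state concentration (g/m3):", c_ss)
print("accumulation at steady state (g/d):",
      accumulation_rate(100.0, 50.0, c_ss, 500.0, k=0.2))
```

Every term has units of mass of the same species per time, which is the point students most often missed. Setting `k=0.0` reduces the sketch to a non-reactive input and output exercise.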
Biochemical oxygen demand
In general, students did well on the BOD concept questions, with 43% answering correctly on average. On one of the five BOD questions, only 18% of students answered correctly; on the other four questions, the difficulty score ranged from 0.36 to 0.58, indicating good conceptual understanding by the students. Furthermore, BOD questions tended to score relatively high on the discrimination between high- and low-performing groups, on the correlation between individual items and the overall exam, and on the correlation between individual items and the rest of the concept group. This indicates that the BOD questions are probably reliable indicators of conceptual understanding. An analysis of the answers to the FEECI questions on BOD shows some of the common misconceptions related to this concept, which included the ideas that BOD is a measure of the dissolved oxygen (DO) concentration in a water sample, that the concentration of soluble organic substrate increases over time during a BOD test, and that wastewater containing BOD “contains bacteria that enter the water and consume oxygen.” Only 18% of students could determine the ultimate BOD concentration given a simple sketch of DO concentration versus time during a BOD test. The results point to a need to incorporate more laboratory activities and demonstrations of the BOD test into FEE classes. On the positive side, few students chose distractors that suggested that BOD provided a measure of the hazardous or toxic nature of wastewater effluents.
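The relationship between DO depletion and BOD that students struggled with can be illustrated with a short calculation. This is a simplified sketch, assuming an unseeded test, no nitrification, and first-order BOD exertion. The DO readings, the rate constant, and the function names are hypothetical and are not taken from the FEECI.

```python
import math


def bod_exerted(do_initial, do_t, sample_fraction=1.0):
    """BOD exerted (mg/L) from the drop in dissolved oxygen in the bottle.

    sample_fraction is the volume fraction of sample in the bottle
    (1.0 means undiluted). Seed correction is ignored in this sketch.
    """
    return (do_initial - do_t) / sample_fraction


def bod_first_order(ultimate_bod, k, t):
    """BOD exerted after t days, assuming first-order kinetics (k in 1/d)."""
    return ultimate_bod * (1.0 - math.exp(-k * t))


# Hypothetical undiluted test: DO starts at 9.0 mg/L and levels off at 3.0 mg/L
ultimate = bod_exerted(do_initial=9.0, do_t=3.0)
print("ultimate BOD (mg/L):", ultimate)
print("BOD exerted by day 5 (mg/L), assuming k = 0.23 per day:",
      round(bod_first_order(ultimate, k=0.23, t=5.0), 2))
```

The ultimate BOD is read from the total DO drop once the curve has leveled off, not from the DO concentration itself, which is the distinction targeted by the first misconception listed above.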
Units of measure
The average percentage of correct scores on the four UOM questions was 54%, the highest of any of the five concepts. However, the discrimination index and the correlation with overall results were relatively low. This probably indicates that, because most students were able to perform so well on this set of questions, the questions were not as effective at distinguishing between high- and low-scoring students. Despite the overall high scores on the UOM questions, some misconceptions are apparent from the incorrect student responses. When explicitly given concentrations of ammonium and nitrate and asked what calculation would be needed to express those concentrations “as nitrogen,” 76% of students selected the correct answer. However, when students were given a more complex stem in which ammonium is converted to nitrate, and were then asked for the conversion necessary to express the nitrate concentration “as nitrate-nitrogen (NO3−-N),” only 35% answered correctly. This contrast may be indicative of students memorizing a procedure (i.e., how to convert given concentrations of NH4+ and NO3− to concentrations “as nitrogen”) rather than developing a true conceptual understanding of this system of units. The results indicate that faculty should not assume that students have gained required skills for addressing UOMs in prerequisite courses, such as chemistry and physics, particularly when dealing with units particular to environmental engineering.
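The “as nitrogen” convention can be made concrete with a short calculation. The sketch below is illustrative only. It uses standard atomic masses, and the concentrations and function names are hypothetical rather than taken from the FEECI stems.

```python
# Standard atomic masses (g/mol)
N, H, O = 14.007, 1.008, 15.999

MOLAR_MASS = {
    "NH4+": N + 4 * H,  # ammonium
    "NO3-": N + 3 * O,  # nitrate
}


def as_nitrogen(conc_mg_per_l, species):
    """Convert mg/L of the whole ion to mg/L expressed as nitrogen."""
    return conc_mg_per_l * N / MOLAR_MASS[species]


def nitrate_from_ammonium(ammonium_mg_per_l):
    """mg/L of nitrate formed if all ammonium is converted to nitrate.

    Each ion contains one nitrogen atom, so the conversion is mole for mole.
    """
    return ammonium_mg_per_l * MOLAR_MASS["NO3-"] / MOLAR_MASS["NH4+"]


ammonium = 10.0  # hypothetical concentration, mg/L as NH4+
nitrate = nitrate_from_ammonium(ammonium)
print("ammonium as N (mg/L):", round(as_nitrogen(ammonium, "NH4+"), 3))
print("nitrate formed (mg/L as NO3-):", round(nitrate, 3))
print("nitrate as N (mg/L):", round(as_nitrogen(nitrate, "NO3-"), 3))
```

The mass concentration of the ion changes when ammonium is converted to nitrate, but the concentration expressed as nitrogen does not. This is why the “as N” convention is convenient for tracking nitrogen through such a conversion.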
Chemical equilibrium and partitioning
Students found three of the four CEP questions very difficult, with difficulty scores ranging from 0.15 to 0.32. These three questions all had discrimination indices below 0.10 and correlation values below 0.10. This suggests that, as a whole, the CEP questions were “too difficult” and therefore were not able to distinguish between high- and low-scoring students. Of these three “too difficult” questions, two will be revised for future versions of the FEECI, and one will be replaced altogether because further analysis suggested that the question conflated multiple concepts and therefore did not truly test students' conceptual understanding of chemical equilibrium. Nevertheless, an analysis of the incorrect answers selected by students still reveals some common misconceptions related to this topic. In particular, one of the most common misconceptions is that the solubility product (Ksp) for precipitation-dissolution equilibrium quantifies a relationship for the change in the concentration of the ions involved, rather than a relationship between the equilibrium concentrations.
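The item statistics quoted in this section (difficulty, discrimination index, and item-to-total correlation) can be computed from scored responses along the following lines. This is a sketch using common textbook definitions, with upper and lower groups of 27% assumed for the discrimination index. The exact procedures used in the beta-test analysis are not restated in this section, and the response data below are invented for illustration.

```python
from statistics import mean, pstdev


def item_difficulty(item_scores):
    """Fraction of students answering the item correctly (1 = correct, 0 = not)."""
    return mean(item_scores)


def discrimination_index(item_scores, total_scores, group_fraction=0.27):
    """Difference in item difficulty between the top and bottom scoring groups."""
    n = len(item_scores)
    group_size = max(1, round(n * group_fraction))
    ranked = sorted(range(n), key=lambda i: total_scores[i])
    low = [item_scores[i] for i in ranked[:group_size]]
    high = [item_scores[i] for i in ranked[-group_size:]]
    return mean(high) - mean(low)


def item_total_correlation(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and the total test score.

    Undefined (division by zero) if every student has the same item score
    or the same total score.
    """
    mx, my = mean(item_scores), mean(total_scores)
    products = [(x - mx) * (y - my) for x, y in zip(item_scores, total_scores)]
    return mean(products) / (pstdev(item_scores) * pstdev(total_scores))


# Invented data: ten students, one item, and their total test scores
item = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
total = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print("difficulty:", item_difficulty(item))
print("discrimination index:", discrimination_index(item, total))
print("item-total correlation:", round(item_total_correlation(item, total), 3))
```

An item that almost no one answers correctly has little room to separate the upper and lower groups, which is consistent with the low discrimination values reported for the three difficult CEP questions.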
As with the other concepts discussed above, identification of difficulties in CEP via administration of the FEECI allows faculty to plan corrective actions. For instance, Visual MINTEQ, a freeware chemical equilibrium modeling program, can be incorporated into an FEE course. Students can be presented with scenarios of aqueous chemical systems and asked to apply the governing equations in Visual MINTEQ and check the results generated by the software. This exercise might help students develop the proper conceptual understanding of the relationship between species under equilibrium conditions. Another advantage of the FEECI is that it could be used to diagnose whether such intervention activities do, in fact, lead to enhanced conceptual understanding by the students.
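A hand calculation of the kind students might check against equilibrium software is sketched below. It is an idealized illustration, not Visual MINTEQ input or output: it assumes a 1:1 salt, activity coefficients of one, no complexation, and an approximate textbook solubility product for silver chloride at 25 °C. The scenario and function name are hypothetical.

```python
import math

KSP_AGCL = 1.8e-10  # approximate textbook value at 25 °C (assumed here)


def dissolved_amount(ksp, cation0=0.0, anion0=0.0):
    """Amount s (mol/L) of a 1:1 salt that dissolves into water that
    initially contains cation0 and anion0 (mol/L) of its ions.

    Solves (cation0 + s) * (anion0 + s) = ksp for s. A negative s means
    that the salt precipitates instead of dissolving.
    """
    b = cation0 + anion0
    c = cation0 * anion0 - ksp
    return (-b + math.sqrt(b * b - 4.0 * c)) / 2.0


# Hypothetical scenario: solid AgCl added to water already holding 0.01 M chloride
cl_initial = 0.01
s = dissolved_amount(KSP_AGCL, anion0=cl_initial)
ag_eq, cl_eq = s, cl_initial + s

print("change in each ion concentration (mol/L):", s)
print("product of equilibrium concentrations:", ag_eq * cl_eq)
print("product of the changes in concentration:", s * s)
```

The product of the equilibrium concentrations recovers Ksp, whereas the product of the changes in concentration does not. This is the distinction behind the common misconception noted for this concept.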
Implications and Limitations of Analysis
Although FEE courses are taken by students in many engineering disciplines and even by nonengineering majors, and although prerequisites for these courses vary widely, the FEECI was able to distill this wide spectrum to some universal concepts that can be tested in a 1-h instrument. Beta testing provided the authors with a list of misconceptions commonly held by students. Nationwide online testing will provide more granular data, but beta testing has already set into motion a discussion about how to remedy this situation: can new technologies (e.g., computer animation, visualization software, chemical equilibrium software) help in this effort? Should the laboratory experiments that usually accompany an FEE lecture be modified? Would tighter integration of some concepts (e.g., BOD with chemical equilibrium) make the subject more holistic for the student?
Summary
FEE is a foundational course for the engineers and scientists who will be responsible for the future of our national and global environmental and public-health infrastructure. An analysis of syllabi from representative U.S. universities and colleges was carried out to identify common topics in FEE. A Delphi process with both faculty experts and students was used to identify concepts that are both important and difficult to understand for FEE students. A beta version of the CI was developed through detailed question-writing sessions and feedback from faculty at a major environmental engineering education conference. The beta test was given to FEE students at six universities, and a detailed psychometric analysis of the results was undertaken. Although the psychometric analysis revealed that several of the FEECI questions need to be reevaluated, the results revealed many misconceptions related to the concepts of RT, the MB equation, BOD, UOM, and CEP. By understanding these misconceptions, environmental science and engineering faculty can improve students' conceptual understanding of FEE concepts. The FEECI will be made available to interested FEE faculty members provided that the instrument is not compromised.
Acknowledgments
This material is based upon work supported by the National Science Foundation, under grant number DUE 1044063. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors would like to thank the following faculty members for their contributions to this research: Caitlyn Butler, Eileen Cashman, Amy Chan Hilton, Ryan Dupont, Elizabeth Eschenbach, Joseph Flora, Kristen Jellison, Laurie McNeill, Dan Oerther, Annalisa Onnis-Hayden, Chul Park, Jason Ren, Maya Trotz, and Yeomin Yoon. Andrea Stone and Dilek Özalp assisted with data analysis.
Author Disclosure Statement
No competing financial interests exist.
