CO-INVESTIGATORS: Joel Kuipers, Curtis Pyke, Michael Szesze, Bonnie Hansen-Grafton
PROJECT OVERVIEW: Background: Despite the best intentions to increase equity and to close achievement gaps, the science education reform movement has failed to adequately respond to the diversity of the U.S. student population. For instance, eighth grade science data from the 2000 National Assessment of Educational Progress (NAEP), disaggregated by a proxy for SES (the educational level of parents), showed that for 13-year-old students whose parents had low levels of education, NAEP science scores decreased over the previous nine years (National Center for Education Statistics, 2000). Possible reasons for this situation include the dearth of good curriculum materials that improve student outcomes and the concern that curriculum materials may not reach students currently underserved by science education. Moreover, reform-based curriculum materials designed to increase deep understanding of science concepts have failed to scale up; traditional textbooks still dominate instruction in U.S. science classrooms. In response, SCALE-uP will explore the effectiveness and scale-up of three different highly-rated middle school science curriculum units, using both quantitative and qualitative research methods.
Purpose: The purpose of SCALE-uP is to explore, using both quantitative and qualitative research methods, the effectiveness and scale-up of three different middle school science curriculum units "highly rated" according to the American Association for the Advancement of Science (AAAS) Project 2061 Curriculum Analysis (AAAS, 2003). Research goals include conducting a series of rigorously designed quasi-experiments on the effectiveness of the units in middle schools characterized by high levels of student diversity, and disaggregating data (by gender, ethnicity, and eligibility for the Free and Reduced-price Meals System [FARMS], English for Speakers of Other Languages [ESOL], or special education [SPED] services) to understand the impact of the materials on various subgroups of students. In addition, the ethnographic analyses of video data illuminate how the curriculum materials function for students in science classrooms. The study will also explore how the units move to scale over five years, analyzing fidelity of implementation, scale, and school science departments' increasing experience with the units.
Interventions: SCALE-uP is studying the effects of three "highly rated" middle school science curriculum units. Each unit focuses on a specific, challenging target idea and each has a different instructional profile; however, all tend to fulfill more requirements of the Curriculum Analysis than do traditional textbooks. The three units are: Chemistry That Applies (CTA; State of Michigan, 1993); The Real Reasons for Seasons: Sun Earth Connections (Seasons; Great Explorations in Math and Science, Lawrence Hall of Science, University of California at Berkeley, 2000); and ARIES: Exploring Motion and Forces: Speed, Acceleration, and Friction (M&F; Harvard-Smithsonian Center for Astrophysics, 2001).
Setting: SCALE-uP is being conducted from 2001 to 2007 in Montgomery County Public Schools (MCPS), Maryland, a large and socio-economically, ethnically, and linguistically diverse school district in the suburbs of Washington, DC, comprising almost 140,000 students who represent 163 countries and speak 123 languages. The study samples were constructed carefully to represent schools, classrooms, and students across this school district.
Research Design: Primary Methodologies: Quasi-experimental; video ethnography. Secondary Methodologies: Correlational, descriptive, interview, observation, statistical modeling, statistical survey.
In the quasi-experiments, the population under investigation is students from 38 middle schools in a large, diverse school system (N = 31,600 middle school students). The sampling frame consists of 35 MCPS middle schools divided into five different School Profile Categories (SPC) stratified by FARMS as a proxy for SES, with approximately seven schools in each category. The study samples reported for each grant year, Years 0 - 5, are summarized in Table 1. While the sampling unit is the school, the unit of analysis is usually the student, although some of the analyses have explored nested effects at the classroom level. In the study of characteristics of implementation, which compares how curriculum materials are implemented in the Treatment and Comparison conditions, the classroom was used as the unit of analysis (n = 100 classrooms over two years).
The samples in the video ethnographic studies include one classroom per year, with a focus on groups of four diverse students at lab tables, as the Treatment curriculum units are enacted.
Schools were randomly assigned to Treatment or Comparison Condition (see study sample, above) using stratified sampling. Consequently, students in the Comparison groups are demographically extremely well matched with the Treatment groups. Moreover, analyses of pretest scores and of other standardized test scores available have indicated that students in the Comparison condition are extremely similar in their specific knowledge of the target science concepts, and in reading and mathematics. The curriculum materials used in Comparison classrooms included a variety of materials to teach the seasons and motion and forces benchmarks in 2003-06. Most materials were selected from a menu of options available to teachers from the MCPS central office. Although the materials used in the Comparison condition were similar for teachers within schools, they varied somewhat from school to school. A comprehensive analysis of the Comparison condition is available in the "Characteristics of Implementation Internal Report" available on the SCALE-uP website.
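The stratified random assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual procedure: the school identifiers, the even split within each stratum, the seed, and the function name assign_conditions are all hypothetical.

```python
import random
from collections import defaultdict


def assign_conditions(schools, seed=0):
    """Randomly assign schools to conditions within each stratum.

    schools: list of (school_id, stratum) tuples, where the stratum stands
    in for a School Profile Category. Returns {school_id: condition}.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    by_stratum = defaultdict(list)
    for school_id, stratum in schools:
        by_stratum[stratum].append(school_id)

    assignment = {}
    for stratum in sorted(by_stratum):
        members = sorted(by_stratum[stratum])
        rng.shuffle(members)  # random order within the stratum
        half = len(members) // 2
        for i, school_id in enumerate(members):
            # First half of the shuffled stratum goes to Treatment.
            assignment[school_id] = "Treatment" if i < half else "Comparison"
    return assignment


# Hypothetical frame: five strata with two schools each (a matched pair).
schools = [(f"school_{spc}{k}", spc) for spc in range(1, 6) for k in "ab"]
assignment = assign_conditions(schools, seed=42)
```

Because the shuffle happens inside each stratum, every category contributes equally to both conditions, which is what keeps the Treatment and Comparison groups demographically balanced.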
All originally-collected data and SCALE-uP protocols are available in the Internal and Annual Reports posted on the SCALE-uP web site (www.gwu.edu/~scale-up) as follows:
Dependent variables for the quasi-experiments include student scores on assessments of the target idea, and scores on measures of goal orientation, engagement, and epistemological views of science. Content outcomes are measured using three assessments developed by national assessment experts: the Conservation of Matter Assessment (COMA), the Causes of Seasons Assessment (CSA), and the Motion and Forces Assessment (MFA). A Validation Interview Protocol (VIP) has also been developed for each content area. Motivation and engagement have been assessed using the Science Learning Orientation and Engagement Questionnaire (SLOESQ), an instrument composed of previously validated scales (Marks, 2000; Midgley et al., 2000). Outcomes have been measured each year of the grant (2001-2007) as outlined in Table 1. Methods employed to analyze the data include ANCOVA, ANOVA, and HLM.
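As an illustration of the ANCOVA named above, the following sketch computes a covariate-adjusted difference between conditions for a single covariate. It assumes the pretest score is the covariate and uses made-up scores; the source does not specify the covariates or models SCALE-uP actually fitted, and the function name adjusted_difference is hypothetical.

```python
from statistics import mean


def adjusted_difference(treatment, comparison):
    """One-covariate ANCOVA estimate of the Treatment minus Comparison effect.

    Each argument is a list of (pretest, posttest) pairs. The posttest
    difference is adjusted for the pretest difference using the pooled
    within-group regression slope.
    """
    def centered_sums(pairs):
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
        return x_bar, y_bar, sxy, sxx

    xt, yt, sxy_t, sxx_t = centered_sums(treatment)
    xc, yc, sxy_c, sxx_c = centered_sums(comparison)
    slope = (sxy_t + sxy_c) / (sxx_t + sxx_c)  # pooled within-group slope
    return (yt - yc) - slope * (xt - xc)


# Made-up (pretest, posttest) scores, for illustration only.
treatment = [(1.0, 5.5), (2.0, 6.0), (3.0, 6.5)]
comparison = [(2.0, 3.0), (3.0, 3.5), (4.0, 4.0)]
effect = adjusted_difference(treatment, comparison)
```

In this toy data the raw posttest gap is 2.5 points, but the Treatment group started one point lower on the pretest, so the adjusted estimate is larger than the raw gap.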
Ethnographic outcomes are measured using ethnographic codes developed by SCALE-uP and are summarized in the Atlas.ti Training Manual. Outcomes have been measured during Years 0 - 4 of the grant and include discourse analysis.
Characteristics of implementation in the Treatment and Comparison classrooms are measured using the following instruments developed by SCALE-uP: Instructional Strategies Classroom Observation Protocol (ISCOP); Lesson Flow Classroom Observation Protocol (LFCOP); Student Responsiveness Questionnaire (SRQ); and Comparison Teacher Interview Protocol (CIP). Outcomes have been measured in Years 3 and 4; methods employed include descriptive data, t-tests, bivariate analysis, and multiple regression.
Fidelity of implementation in the Treatment condition has been measured using two instruments developed by SCALE-uP: Adherence Classroom Observation Protocol (ACOP) and Adherence Interview Protocol (AIP), as well as the instruments described above for characteristics of implementation. Student outcomes have been measured in Year 4; methods employed include bivariate analysis and multiple regression.
Findings: Quasi-experimental results:
CTA: In Year 0 (2001-2002; see Table 1), SCALE-uP investigated CTA in five matched pairs of MCPS middle school classrooms and replicated the study in Year 1. The results of those studies show that all ethnically, linguistically, and socioeconomically diverse subgroups of students demonstrated greater understanding of the benchmark concept Conservation of Matter when CTA was used than when other curricula addressing the same target benchmark were used, with small to medium effect sizes.
M&F: Results from the quasi-experiment in Year 2 showed that, on average, M&F was only somewhat more effective overall than the menu of curriculum options available to MCPS teachers, and there were no significant differences in Year 3. However, disaggregated data in both years showed that while assessment scores increased for all subgroups from the pretest to the posttest, M&F seemed to have differential effects on some subgroups of students, with the Comparison condition superior for students eligible for FARMS or ESOL services. When Year 2 M&F data were analyzed with HLM, which utilizes both individual and classroom level variables simultaneously, differential effects of M&F for different demographic subgroups were not observed but the overall effect of M&F was maintained. The implications of these contrasting analyses are still being explored. M&F is therefore currently being replicated for a third time in a new set of five matched pairs of schools to ascertain whether these results hold, with close attention to the reasons for these patterns.
Seasons: The analyses of the Year 2 and 3 data for the Seasons unit showed that although students learned with the Seasons curriculum, the average level of understanding of the benchmark concepts was statistically significantly lower for students in the Treatment condition than for students in the Comparison condition, with small to medium effect sizes. Disaggregated results show that outcome scores were significantly greater in the Comparison condition for all subgroups except for students currently receiving special education services. When Year 2 Seasons data were analyzed with HLM, the overall effect of Seasons was maintained, but the HLM results produced a larger effect size favoring the Comparison condition.
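The "small to medium" effect sizes reported above are standardized differences between conditions. The source does not state which statistic was used; the sketch below assumes Cohen's d with a pooled standard deviation, and the scores and the function name cohens_d are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, comparison):
    """Standardized mean difference (Treatment minus Comparison).

    Uses the pooled sample standard deviation of the two groups.
    A negative value favors the Comparison condition.
    """
    n1, n2 = len(treatment), len(comparison)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(comparison)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(comparison)) / sqrt(pooled_var)


# Made-up posttest scores, for illustration only.
treatment_scores = [2.0, 4.0, 6.0]
comparison_scores = [1.0, 3.0, 5.0]
d = cohens_d(treatment_scores, comparison_scores)
```

By Cohen's widely used conventions, values near 0.2 are read as small and values near 0.5 as medium; the sign shows which condition scored higher.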
By using codes as analytical dimensions for interpreting the video and transcription data, SCALE-uP is able to sort, classify, and analyze video data in ways that provide insights into how each curriculum unit functions from the standpoint of the students themselves. Ethnographic findings show that students' own sense-making practices differed markedly between two of the three units that were investigated. We compared the units in terms of the ways in which students used scientific terms, engaged in object manipulation, and employed printed material. While students in CTA were provided relatively few nominalizations in the textbooks, they invented and used many of their own to describe their experiences: a hallmark of "talking science" (Lemke, 1990; Halliday & Martin, 1993). Indeed, students in CTA increased their use of scientific nominals over time. In M&F, while text materials provided a number of technical nominalizations, the students themselves almost never used them. We also found that students in M&F seldom read from the text in solving their problems, but often wrote into it, engaging in far more acts of writing than students in CTA, including several challenging graphing projects (see Roth, 2001). By contrast, students in CTA often read aloud from the text as a way of solving the problems. The students in M&F were required to make sense of the material haptically, engaging with the materials in a tactile manner, and they all did so. In CTA, students also had hands-on experiences, but the distribution of those experiences was more uneven (see Slavin, 1981). Finally, given the results from the Year 2 quasi-experiment, we videotaped two complete enactments of the seasons benchmark in Year 3, one in a Treatment classroom and one in a Comparison classroom.
Although these video data are not yet fully analyzed, preliminary analyses indicate that in the Seasons Treatment class the interaction is relatively teacher-centered, with little direct manipulation of relevant phenomena. Control over relevant information to complete the tasks in the curriculum unit rested largely in the hands of the teacher. In contrast, in the Comparison class, control over the information to complete the curriculum tasks lay more with the students, and the sources of relevant information were the textbooks, direct experience of relevant phenomena, and group deliberations.
PROJECT PUBLICATIONS: Lynch, S., Kuipers, J., Pyke, C., & Szesze, M. (2005). Examining the effects of a highly rated science curriculum unit on diverse populations: Results from a planning grant. Journal of Research in Science Teaching, 42(8), 912-946.
Lynch, S., Szesze, M., Pyke, C., & Kuipers, J. (in press). Scaling-up highly rated middle science curriculum units for diverse student populations: Features that affect collaborative research, and vice versa. Forthcoming in B. Schneider and S. McDonald (Eds.) Scale-Up in Education, Vol. 1: Practice. Lanham, MD: Rowman & Littlefield.
Lynch, S., Taymans, J., Watson, W., & Pyke, C. (in press). Scaling up highly rated curriculum units for students with disabilities in mainstream classrooms: Initial findings and implications for scale-up. Exceptional Children.
Lynch, S. (in preparation). ISO metaphor and theory for scale-up research: Eagles in the Anacostia and activity systems.
Ochsendorf, R., Lynch, S., & Pyke, C. (in preparation). Evaluating a science curriculum unit: Learning through the process.
Pyke, C., & Bangert-Drowns. (in preparation). Highly rated curriculum and conceptual change: Goals, engagement, and equity.
Viechnicki, G., & Kuipers, J. (in review). It's all human error! Verbal negotiation of scientific authority in a middle school science classroom.
Jones, L., & Kuipers, J. (2006, February). Atoms and...stuff like that: Scientific term use as participation in a middle school science classroom. Paper presented at the annual meeting of the Ethnography in Education Research Forum, Philadelphia, PA.
Kuipers, J. (2004, April). "Lesson purpose" in its social and cultural context: Video ethnographic evidence on the functioning of a highly rated curriculum unit in diverse middle school science classrooms. Paper presented at the annual meeting of the American Educational Research Association, San Diego, CA.
Kuipers, J., Viechnicki, G.B., Fugate-Brunino, E., Jones, L. & Wright, L. (2006, April). A video ethnography of talking and writing about motion and forces in a diverse science classroom. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
Kuipers, J., Viechnicki, G.B., Fugate-Brunino, E., Jones, L., & Wright, L. (2006, April). Learning to Talk Science: "Push" and "Pull" in a Diverse Classroom. Paper presented at the Annual Meeting of the National Association for Research in Science Teaching, San Francisco, CA.
Lynch, S. (2006, April). An overview of SCALE-uP and results for Chemistry That Applies. Are "highly-rated" middle school science curriculum materials effective and for whom?: Results from a set of implementation studies. Paper set presented at the Annual Meeting of the National Association for Research in Science Teaching, San Francisco, CA.
Lynch, S., & O'Donnell, C. (2005, April). Examining the fidelity of implementation of highly rated middle school science curriculum materials. Paper presented at the Annual Meeting of the American Educational Research Association, Montreal, Canada.
Lynch, S., & O'Donnell, C. (2005, April). The role of fidelity of implementation in experimental and quasi-experimental research designs: Applications in four studies of innovative science curriculum materials and diverse student populations. Symposium presented at the annual meeting of the American Educational Research Association, Montreal, Canada.
Lynch, S., O'Donnell, C., Merchlinsky, S., Hatchuel, E., & Rethinam, V. (2006, April). The role of the "comparison group" in quasi-experimental research designs: Unexpected results from an effectiveness study of middle school curriculum units. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
Ochsendorf, R., Pyke, C., Lynch, S., & Watson, W. (2006, April). The impact of a middle school motion and forces curriculum unit on student outcomes: Results from consecutive quasi-experimental studies. Paper presented at the Annual Meeting of the National Association for Research in Science Teaching, San Francisco, CA.
Ochsendorf, R., Watson, W., Pyke, C., & Lynch, S. (2006, April). The impact of a highly rated science curriculum unit on middle school students' epistemological beliefs: Implications for student learning. Paper presented at the Annual Meeting of the National Association for Research in Science Teaching, San Francisco, CA.
O'Donnell, C., Kuipers, J., & Lynch, S. (2004, April). When numbers get in the way of student understanding in science: An ethnographic case study. Paper presented at the annual meeting of the American Educational Research Association, San Diego, CA.
O'Donnell, C., Watson, W., Pyke, C., & Lynch, S. (2006, April). Using the Project 2061 Curriculum Analysis to understand the results of a quasi-experimental curriculum unit evaluation of Seasons. Paper presented at the Annual Meeting of the National Association for Research in Science Teaching, San Francisco, CA.
Pyke, C., & Ochsendorf, R. (2006, April). Concept assessment in curriculum unit evaluation. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
Rethinam, V., Pyke, C., & Lynch, S. (2006, April). Examining individual and classroom factors that influence students' gain score in two highly rated science curriculum units. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
Watson, W., Kuipers, J., & Lynch, S. (2006, February). Competing discourses in the science classroom: Argumentation and social status in middle school science lessons. Paper presented at the annual meeting of the Ethnography in Education Research Forum, Philadelphia, PA.
Watson, W., Lynch, S., Rethinam, V., & O'Donnell, C. (2006, April). Development of an instrument to measure student responsiveness to implementation of science curriculum materials. Paper presented at the Annual Meeting of the National Association for Research in Science Teaching, San Francisco, CA.
Wright, L., Viechnicki, G.B., & Kuipers, J. (2006, February). Hands-off science: The impact of social dynamics on object control and manipulation at a middle school science table. Paper presented at the annual meeting of the Ethnography in Education Research Forum, Philadelphia, PA.
ON THE WEB: Two web sites have been developed to support the SCALE-uP study. The GWU web site contains all publicly available descriptions of and reports on the SCALE-uP study, including academic publications, internal reports and instrumentation, conference papers, and press releases. URLs include: