Bransford, Darling-Hammond, and LePage (2005) indicate that quality instruction requires teachers to navigate learner characteristics, content knowledge, and pedagogy within social and political climates. In the past 30 years, every state has increased requirements for teachers to enter the profession (Zumwalt & Craig, 2005). Yet assessing effective teachers remains difficult; although teacher inputs play a role, so too do student prior achievement and family and community norms (Fallon, 2006; Hanushek & Rivkin, 2006, 2010). Traditionally, assessment of teacher quality relied on standardized testing, peer and administrator observations, student achievement, and related activities. Yet many aspects of effective teaching remain elusive with these assessments (Cochran-Smith & the Boston College Evidence Team, 2009; Nazier, 1997).
Standardized test scores aligned to state and national standards are common measures of teacher and school quality in the United States. They allow administrators and policy makers to track and compare student achievement over time. They also certify that teachers have sufficient background knowledge in specified subject areas. However, standardized tests measure content knowledge while minimally considering classroom practice, student background, or individual needs and circumstances. To complement standardized subject assessment exams, teacher education programs provide multiple field experiences for purposes of guidance, methods exploration, and student accountability. Shaping and measuring the success of these programs is difficult and costly. Placements are often made at schools located several miles from university settings. Cooperating teachers and mentors are sometimes selected on availability rather than quality. Periodic observations require large time commitments, may interrupt classroom dynamics, and are limited in number. In-service teachers face similar challenges. Although support personnel reside within local schools, they often have their own teaching or administrative responsibilities and may lack content knowledge to judge instructional quality (Kelley, 2004; West, Rich, Shepherd, Recesso, & Hannafin, 2009).
To better measure teacher effectiveness, Fallon (2006) and the National Research Council (2005) suggest that researchers turn to longitudinal, quantitative studies focused on student achievement. However, others question the type and quality of evidence these studies reveal. Cochran-Smith and the Boston College Evidence Team (2009) claim that value questions embedded within educational research may not be addressed with traditional quantitative methods and recommend using multiple approaches to explore teacher effectiveness. Cochran-Smith (2006) suggested that researchers must consider the purpose of evidence collection, who collected it, and under what contexts as they examine teacher effectiveness. Understanding these circumstances helps researchers judge the quality and usefulness of evidence collections.
In addition to these concerns, teachers often lack proper training to conduct rigorous, scientific research and lack access to adequate sample sizes (Fallon, 2006; Rippon & Martin, 2006). A teacher's primary responsibility is instruction. It is doubtful that teachers can conduct advanced quantitative research while simultaneously meeting students' learning needs. Researchers who possess these credentials are often distanced from classroom settings and may fail to capture relevant and adequate measures of classroom instruction. While not minimizing the importance of evidence identified through complex quantitative studies, these approaches are unrealistic for most teachers, who are left to their own devices for classroom evaluation and development. Alternative evidence collection, organization, and analysis methods are needed. The purpose of this article is to propose a framework for evidence selection and organization through portfolio development and to provide guidance for evidence selection that accommodates valid formative practices in classroom settings.
Portfolios to Capture and Evaluate Teaching
Portfolios are collections of purposefully selected materials, organized to depict and examine professional practice (Grossman, 2005; Hartmann, 2004). Unlike experimental and quasi-experimental studies that require numerous participants to draw generalized conclusions, portfolios may allow practicing teachers to focus on themselves and their classrooms for purposes of assessment and professional development. Portfolios are believed to unobtrusively document events through classroom artifacts gathered by instructors and support personnel (Habib & Wittek, 2007; Zepeda, 2002). Since the mid-1980s, many teacher education programs have turned to portfolios to document teacher practice and growth; promote inquiry, reflection, and skill development; and assess competency (Dhonau & McAlpine, 2005; Hallman, 2007; Rickards et al., 2008; Wetzel & Strudler, 2005).
Land and Zembal-Saul (2003) found that portfolios documented the thought processes of 20 preservice teachers who were exploring properties of light (e.g., reflection, refraction). Using factor analysis to compare four portfolio implementations across pre- and in-service teachers, Beck, Livne, and Bear (2005) found that professional development portfolios heightened perceived knowledge of teacher roles, reflective practices, and peer collaboration. Portfolios may also structure mentoring relationships and help novice teachers examine their own and others' practices (Kelley, 2004; Orland-Barak, 2005; Redish, Webb, & Jiang, 2006). Rolheiser and Schwartz (2001) reported that teachers who incorporated portfolios to assess student performance clarified their own teaching philosophies and developed structured arguments regarding student achievement to share with parents and administration. The act of identifying, collecting, organizing, and examining teaching artifacts to recreate events may help teachers reconstruct and improve practice.
Teachers completing National Board for Professional Teaching Standards (NBPTS) certification must create portfolios that include video recordings, student work samples, and other commentaries to depict classroom practice and aid summative assessment (NBPTS, 2010; Silver, Mesa, Morris, Star, & Benken, 2009). In these instances pre- and in-service teachers gather evidence of classroom practice, organize it into portfolio entries, and examine and evaluate their work through reflective entries, standards-based practices, and program expectations. Consistent with claims by Cochran-Smith (2009), these portfolios may allow collections of evidence to depict multi-faceted elements of practice rather than relying on single assessments. They may also foster longitudinal examinations through sustained collection and analysis (Anderson & Friesen, 2004; Heinrich, Bhattacharya, & Rayuda, 2007).
Limitations of Portfolio Evidence
Yet portfolios have limitations. Researchers describe the need for teacher training and support to complete these assessments (Fallon & Watts, 2001; Shepherd & Hannafin, 2008). Additionally, little research has examined the validity of portfolio findings or students' ability to collect and organize evidence that justifies conclusions.
Practicality. Effective evidence collection and examination require dedicated time and planning (Fallon & Watts, 2001; Hadley, 2007; Kjaer, Maagaard, & Wied, 2006). Prior to evidence collection, teachers must identify portfolio purposes as well as tangible artifacts that accurately represent those purposes. Teachers gather these artifacts while simultaneously attending to lesson objectives, student needs, and other teaching responsibilities. Once artifacts are gathered, they must be organized and presented. This often requires written descriptions identifying included artifacts and articulating purposes. Next, teachers examine evidence, draw conclusions, and develop action plans for improvement (Recesso et al., 2009). To facilitate evidence collection and analysis, many programs turn to direct coaching supplemented with portfolio question prompts (Strudler & Wetzel, 2005; Wray, 2007; Zepeda, 2002). Others develop detailed guidelines for artifact collection and examination (NBPTS, 2010; Silver et al., 2009).
Despite personal coaching and detailed guidelines, little research focuses on the practicality of sustained evidence collection among teachers. Indeed, Burroughs (2001) questioned the validity of National Board Certification because teachers were asked to reflect and write in ways that were not taught or expected of the profession. Although summative methods exist to capture and examine teaching practice, more research is needed on the feasibility of methods for sustained, formative inquiry among teachers.
Validity and reliability. Ultimately portfolios are containers that house and disseminate evidence collections. Thus, individual and programmatic purposes influence what artifact collection and examination methods are deemed relevant. Large variability in portfolio practice exists. Wade and Yarbrough (1996), Zeichner and Wray (2001), and Zepeda (2002) argue that researchers must explain portfolio purposes, artifact selection and organization procedures, degree of creative freedom, provided supports, and assessment criteria to clarify outcomes and compare approaches. Identifying procedures for artifact selection and examination helps researchers to compare intentions, rules, and standards applied to individual practice and their influence on portfolio outcomes (Darling, 2001). Yet few studies focus on evidence selection and examination.
The extent that artifacts justify portfolio outcomes is questionable. While examining six measures of validity among 128 preservice teacher portfolios, Yao et al. (2008) found that artifacts matched portfolio purposes and standards but were under-represented and based primarily on reflective entries...