Understanding postsecondary non-proficient writing using human-scored and automated measures

First Author: Dolores Perin -- Teachers College, Columbia University
Additional authors/chairs: 
Mark Lauterbach
Keywords: Writing performance, Adults With Low Literacy Skill
Abstract / Summary: 

Drawing on the theoretical frameworks of Hayes (1996) and Berninger and Winn (2006), this study investigated the persuasive writing of postsecondary students with low academic skills in comparison with higher-performing students. The study was descriptive and used correlations, analyses of variance, cluster analysis and discriminant analysis. Participants were N=65 postsecondary developmental education/remedial students, N=72 typically performing undergraduates and N=112 master's students. Participants responded to two prompts on controversial topics. Twelve human-scored and automated measures covering writing quality, vocabulary usage and linguistic aspects of writing were analyzed. Writing quality was measured using a human-scored 7-point holistic scale and the total score from the automated Project Essay Grade (PEG) tool. Vocabulary usage was assessed using the automated VocabProfile measure as well as two measures from the Tool for the Automatic Analysis of Cohesion (TAACO). Linguistic measures consisted of six TAACO variables assessing density and cohesion of writing. No statistically significant differences were found between native speakers of English and native speakers of Spanish. As expected, writing quality differed among the three groups, with the developmental education students producing the lowest-quality writing. Several cluster analyses were conducted, resulting in two reliable clusters based on writing quality. Cluster membership aligned with educational placement (developmental, undergraduate or master's) in only 68% of cases, suggesting the presence of non-proficient writers even at the higher educational levels. Discriminant analysis predicting cluster membership from the automated measures resulted in models with accuracy ranging from 59% to 73%.