Identifying salient academic words in content-area texts using semantic network centrality measures

First Author: Jeff Elmore -- MetaMetrics
Additional authors/chairs: Jill Fitzgerald
Keywords: Academic Language, science, Informational Text, Semantics, Measurement
Abstract / Summary: 

Purpose:
Many factors are considered when selecting words from texts for instructional focus. Academic words can be selected automatically from texts using online tools; however, automatically identifying which of those words are centrally important to a text is less well explored. We ask: Do semantic network centrality measures of academic words align with subjective judgments of salience for key concepts in a text set?

Methods:
Using lists produced in a prior study, we selected general and science-specific academic words from a text set on survival and adaptation developed at the READS Lab, led by Dr. James Kim. A semantic network of the academic words was produced computationally, and four network centrality measures were calculated for each word: degree, closeness, betweenness, and constraint.
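To make the four measures concrete, the following is a minimal sketch in plain Python that computes them on a small unweighted, undirected graph. The edge list is hypothetical and purely illustrative; the abstract does not say how the semantic network was built, whether its edges were weighted, or which software was used. Constraint is computed here as Burt's constraint with equal weight on every tie, which is an assumption.

```python
from collections import deque

# Hypothetical toy network: these edges are illustrative only, not from the study.
EDGES = [
    ("survive", "adapt"),
    ("survive", "environment"),
    ("adapt", "environment"),
    ("environment", "climate"),
    ("adapt", "camouflage"),
]


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def bfs(graph, source):
    """Return visit order, distances, shortest-path counts, and predecessors."""
    dist = {source: 0}
    sigma = {node: 0 for node in graph}
    sigma[source] = 1
    preds = {node: [] for node in graph}
    order = []
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    return order, dist, sigma, preds


def degree(graph):
    """Number of direct neighbors of each word."""
    return {v: len(nbrs) for v, nbrs in graph.items()}


def closeness(graph):
    """Reachable nodes divided by the summed distance to them."""
    result = {}
    for v in graph:
        _, dist, _, _ = bfs(graph, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Unnormalized betweenness via Brandes' accumulation (undirected)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order, _, sigma, preds = bfs(graph, s)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair is counted from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in score.items()}


def constraint(graph):
    """Burt's constraint, assuming every tie carries equal weight."""
    result = {}
    for i, nbrs in graph.items():
        total = 0.0
        for j in nbrs:
            direct = 1 / len(nbrs)
            indirect = sum(
                (1 / len(nbrs)) * (1 / len(graph[q]))
                for q in nbrs
                if q != j and j in graph[q]
            )
            total += (direct + indirect) ** 2
        result[i] = total
    return result


if __name__ == "__main__":
    g = build_graph(EDGES)
    for name, fn in [("degree", degree), ("closeness", closeness),
                     ("betweenness", betweenness), ("constraint", constraint)]:
        print(name, fn(g))
```

Unlike the other three measures, constraint runs in the opposite direction: a word whose neighbors are all tied to one another scores high, while a word that bridges otherwise unconnected neighbors scores low.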

A topic-salience scale for the academic words was established from crowd-worker judgments using a best-worst scaling approach. The judgments were analyzed with a Rasch model to create a continuous scale, which was then regressed on the four network centrality measures and on word frequency in the texts.
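As a rough illustration of what best-worst judgments look like as data, the sketch below computes a simple best-minus-worst count score per word. This is not the Rasch analysis used in the study; it is a simpler counting summary often used as a first look at best-worst data. The trial records, their field names, and the judgments themselves are all hypothetical.

```python
from collections import Counter

# Hypothetical judgments, invented for illustration only: each trial shows a
# small set of words and records which was picked as most and least salient.
TRIALS = [
    {"shown": ["survive", "tusk", "bulky", "climate"], "best": "survive", "worst": "bulky"},
    {"shown": ["adapt", "fern", "example", "survive"], "best": "survive", "worst": "example"},
    {"shown": ["climate", "tusk", "bulky", "fern"], "best": "climate", "worst": "bulky"},
]


def best_worst_scores(trials):
    """Score each word as (times best - times worst) / times shown."""
    best = Counter(t["best"] for t in trials)
    worst = Counter(t["worst"] for t in trials)
    shown = Counter(word for t in trials for word in t["shown"])
    return {word: (best[word] - worst[word]) / n for word, n in shown.items()}


if __name__ == "__main__":
    ranked = sorted(best_worst_scores(TRIALS).items(), key=lambda kv: -kv[1])
    for word, score in ranked:
        print(f"{word:10s} {score:+.2f}")
```

Count scores range from -1 to +1 and depend on which words happened to be shown together; a model-based scale such as the Rasch scale described above places words on a continuous interval scale instead.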

Results:
In total, 229 general and science-specific academic words were identified from the texts.

The separation index reliability of the salience scale was 0.74, indicating 2 to 3 reliably separable strata. The top 5 words were survive, climate, camouflage, adapt, and environment; the middle 5 words were tusk, trilobite, observation, orangutan, and fern; and the bottom 5 words were rhinoceros, example, wave, argument, and bulky.
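The link between a reliability of 0.74 and "2 to 3 strata" can be worked through with the conventional Rasch relationships between reliability R, the separation ratio G, and the number of strata H. That these are the exact formulas the authors applied is an assumption; the arithmetic is shown only to make the reported figure easier to interpret.

```latex
G = \sqrt{\frac{R}{1 - R}} = \sqrt{\frac{0.74}{0.26}} \approx 1.69,
\qquad
H = \frac{4G + 1}{3} \approx \frac{4(1.69) + 1}{3} \approx 2.58
```

A value of about 2.6 falls between 2 and 3, consistent with the reported range.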

The centrality measures explained an additional 16% of the variance in the salience scale beyond that explained by word frequency alone.

Conclusions:
Network centrality measures show promise for identifying salient academic words in texts.