The power of normalised word vectors for automatically grading essays

Latent Semantic Analysis, when used for automated essay grading, makes use of document word count vectors for scoring the essays against domain knowledge. Words in the domain knowledge documents and essays are counted, and Singular Value Decomposition is undertaken to reduce the dimensions of the se...

Bibliographic Details
Main Author: Williams, Robert
Format: Journal Article
Published: The Informing Science Institute 2006
Subjects:
Online Access: http://proceedings.informingscience.org/InSITE2006/IISITWill155.pdf
http://hdl.handle.net/20.500.11937/46415
_version_ 1848757550599634944
author Williams, Robert
building Curtin Institutional Repository
collection Online Access
description Latent Semantic Analysis, when used for automated essay grading, makes use of document word count vectors for scoring the essays against domain knowledge. Words in the domain knowledge documents and essays are counted, and Singular Value Decomposition is undertaken to reduce the dimensions of the semantic space. Near neighbour vector cosines and other variables are used to calculate an essay score. This paper discusses a technique for computing word count vectors where the words are first normalised using thesaurus concept index numbers. This approach leads to a vector space of 812 dimensions, does not require Singular Value Decomposition, and leads to a reduced computational load. The cosine between the vectors for the student essay and a model answer proves to be a very powerful independent variable when used in regression analysis to score essays. An example of its use in practice is discussed.
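The normalisation step in the description above can be illustrated with a minimal sketch. It is not the author's implementation: the thesaurus here is a hypothetical three-concept toy dictionary (the paper's space has 812 dimensions), and the names `TOY_THESAURUS`, `concept_vector` and `cosine` are assumptions made for illustration. Each word is replaced by its concept index number, the concept numbers are counted, and the cosine between the essay vector and the model-answer vector is computed directly, with no Singular Value Decomposition.

```python
import math
import re
from collections import Counter

# Hypothetical toy thesaurus: word -> concept index number.
# The paper's thesaurus yields 812 concepts; three are used here.
TOY_THESAURUS = {
    "car": 1, "automobile": 1, "vehicle": 1,
    "fast": 2, "quick": 2, "rapid": 2,
    "road": 3, "street": 3,
}

def concept_vector(text, thesaurus):
    """Count concept index numbers for the words of text.

    Words missing from the thesaurus are ignored. The Counter is a
    sparse form of a fixed-length count vector (absent keys are zero).
    """
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(thesaurus[w] for w in words if w in thesaurus)

def cosine(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction
    return dot / (norm_a * norm_b)

model_answer = concept_vector("The quick car on the road", TOY_THESAURUS)
student_essay = concept_vector("A rapid automobile on the street", TOY_THESAURUS)
similarity = cosine(model_answer, student_essay)
```

In this toy case the two sentences share no thesaurus words, yet they map to the same three concepts, so the cosine is approximately 1. In the approach described, a cosine of this kind serves as one independent variable in a regression that predicts the essay score.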
first_indexed 2025-11-14T09:29:53Z
format Journal Article
id curtin-20.500.11937-46415
institution Curtin University Malaysia
institution_category Local University
last_indexed 2025-11-14T09:29:53Z
publishDate 2006
publisher The Informing Science Institute
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-464152017-01-30T15:27:08Z The power of normalised word vectors for automatically grading essays Williams, Robert Normalised Word Vectors Multiple Regression Analysis Singular Value Decomposition Automated Essay Grading Latent Semantic Analysis AEG Electronic Thesaurus Latent Semantic Analysis, when used for automated essay grading, makes use of document word count vectors for scoring the essays against domain knowledge. Words in the domain knowledge documents and essays are counted, and Singular Value Decomposition is undertaken to reduce the dimensions of the semantic space. Near neighbour vector cosines and other variables are used to calculate an essay score. This paper discusses a technique for computing word count vectors where the words are first normalised using thesaurus concept index numbers. This approach leads to a vector space of 812 dimensions, does not require Singular Value Decomposition, and leads to a reduced computational load. The cosine between the vectors for the student essay and a model answer proves to be a very powerful independent variable when used in regression analysis to score essays. An example of its use in practice is discussed. 2006 Journal Article http://hdl.handle.net/20.500.11937/46415 http://proceedings.informingscience.org/InSITE2006/IISITWill155.pdf The Informing Science Institute restricted
title The power of normalised word vectors for automatically grading essays
topic Normalised Word Vectors
Multiple Regression Analysis
Singular Value Decomposition
Automated Essay Grading
Latent Semantic Analysis
AEG
Electronic Thesaurus
url http://proceedings.informingscience.org/InSITE2006/IISITWill155.pdf
http://hdl.handle.net/20.500.11937/46415