The development of a corpus-informed list of formulaic sequences for language pedagogy

Discussion around the importance and prevalence of multiword expressions in the lexicon and the teaching of vocabulary has existed for a number of years in applied linguistics (e.g. Irujo, 1986; Pawley and Syder, 1983; Sinclair, 1987; Wray, 2002). While there seems to be a general agreement among s...

Bibliographic Details
Main Author: Martinez, Ron
Format: Thesis (University of Nottingham only)
Language: English
Published: 2011
Online Access: https://eprints.nottingham.ac.uk/12963/
_version_ 1848791618679734272
author Martinez, Ron
building Nottingham Research Data Repository
collection Online Access
description Discussion of the importance and prevalence of multiword expressions in the lexicon and in the teaching of vocabulary has existed for a number of years in applied linguistics (e.g. Irujo, 1986; Pawley and Syder, 1983; Sinclair, 1987; Wray, 2002). While there seems to be general agreement among scholars that formulaic language should feature in language learning and, perhaps to a lesser extent, language testing, there appears to be rather less agreement on how to select and/or prioritize specific items for inclusion. One criterion often used to select single-word vocabulary items is frequency (i.e. how relatively common a word is), data for which can be consulted in various frequency lists that have long existed and are in the public domain, such as the General Service List (West, 1953). However, to date, no list of formulaic language that could be considered comparable to the General Service List in terms of intended use and relevance to language instruction has been attempted. The work presented in this thesis aims to address this lack. The thesis first presents the need for such a list, and then describes the methodology employed by the researcher to produce a frequency-informed and pedagogically relevant list of multiword expressions, entitled the PHRASal Expressions List, or PHRASE List, that can be used in conjunction with existing lists of single orthographic words to help inform such instruments of L2 pedagogy as language textbooks and language tests. Two projects are also presented in the thesis which exemplify ways in which the list may be usefully employed. The first is a research validation exercise carried out in collaboration with the English Profile project in order to compare the phraseological component of the English Profile Wordlist to the expressions in the PHRASE List. The second project presents the development and validation of a vocabulary test that samples from the PHRASE List, and which is intended to supplement knowledge assessed in existing tests of single orthographic words, such as the Vocabulary Size Test (Nation & Beglar, 2007).
first_indexed 2025-11-14T18:31:23Z
format Thesis (University of Nottingham only)
id nottingham-12963
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
last_indexed 2025-11-14T18:31:23Z
publishDate 2011
recordtype eprints
repository_type Digital Repository
spelling nottingham-12963 2025-02-28T11:22:22Z https://eprints.nottingham.ac.uk/12963/ 2011 Thesis (University of Nottingham only) NonPeerReviewed application/pdf en arr https://eprints.nottingham.ac.uk/12963/1/555398.pdf Martinez, Ron (2011) The development of a corpus-informed list of formulaic sequences for language pedagogy. PhD thesis, University of Nottingham.
title The development of a corpus-informed list of formulaic sequences for language pedagogy
url https://eprints.nottingham.ac.uk/12963/