What students need (and want): semantically-oriented queries in large online corpora
Original version: SYNAPS - A Journal of Professional Communication 24 (2010), pp. 27-39
The 400-million-word Corpus of Contemporary American English (COCA) [1990-2009] is the only large, balanced, up-to-date corpus of English that is publicly available. Many features of this corpus allow learners of English to perform semantically-oriented queries quickly and easily. These include the following: 1) one-step collocate searches (with filtering by part of speech, and sorting and filtering by Mutual Information score); 2) comparison of collocates across genres (e.g. collocates of “chain” in fiction and in academic texts); 3) comparison of the collocates of two words (e.g. sheer / utter); 4) an integrated thesaurus (entries for 60,000+ words), used to see the frequency of all synonyms of a word (including by genre) and to create more powerful queries (e.g. all forms of all synonyms of “clean” + a noun in a particular semantic domain); and 5) customized wordlists (including hundreds or thousands of words in a semantic domain).
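To make the idea of sorting collocates by Mutual Information score more concrete, the following is a minimal sketch in Python. It is not COCA's implementation: the function name `collocates_by_mi`, the toy token list, and the particular formula (log base 2 of observed co-occurrences over the count expected by chance within a window of `span` words on each side) are all assumptions chosen for illustration, following one common textbook formulation of MI for collocations.

```python
import math
from collections import Counter


def collocates_by_mi(tokens, node, span=4, min_freq=1):
    """Rank words found within +/- `span` tokens of `node` by MI score.

    Hypothetical helper for illustration only; the formula is an assumed,
    commonly used variant: MI = log2(observed / expected), where
    expected = freq(node) * freq(word) * (2 * span) / corpus_size.
    """
    n = len(tokens)
    word_freq = Counter(tokens)
    pair_freq = Counter()

    # Count every word that falls inside the window around each node hit.
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        for j in range(max(0, i - span), min(n, i + span + 1)):
            if j != i:
                pair_freq[tokens[j]] += 1

    scores = {}
    for word, observed in pair_freq.items():
        if observed < min_freq:
            continue  # drop collocates that are too rare to be reliable
        expected = word_freq[node] * word_freq[word] * (2 * span) / n
        scores[word] = math.log2(observed / expected)

    # Highest MI first: words most strongly associated with the node.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy data, invented for this sketch (not drawn from COCA).
    toy = ("the heavy chain broke and the gold chain shone "
           "while the dog slept near the door").split()
    for word, mi in collocates_by_mi(toy, "chain", span=2):
        print(f"{word}\t{mi:.2f}")
```

In this sketch a frequent function word such as "the" co-occurs with the node word often but scores low, because its high overall frequency makes that co-occurrence unsurprising; content words that appear mainly near the node score higher. This is the general reason an MI-based ranking tends to surface semantically informative collocates rather than merely frequent ones.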