

CANELC: constructing an e-language corpus

NU author(s): Dr Dawn Knight



This paper reports on the early stages of constructing CANELC: the Cambridge and Nottingham eLanguage Corpus. CANELC is a one-million-word corpus of digitally based communication in English, taken from online Discussion Boards, Blogs, Tweets, Emails and SMS messages. The paper outlines the approaches used when planning the corpus, obtaining consent, piloting the collection of data and, finally, compiling the corpus database. In addition, the paper outlines some of the ways in which corpora of this nature can be used, by providing some results of a detailed analysis carried out on a cross section of the corpus (500,000 words). The analysis questions to what extent forms of eLanguage are used, and how they function and operate in similar and/or different ways to spoken and written forms of communication. Results from this analysis start to question where eLanguage is placed on the 'continuum of communication': between spoken and written forms of discourse.

Publication metadata

Author(s): Knight D, Adolphs S, Carter R

Publication type: Article

Publication status: Published

Journal: Corpora

Year: 2014

Volume: 9

Issue: 1

Pages: 29-56

Print publication date: 01/05/2014

Date deposited: 12/02/2013

ISSN (print): 1749-5032

ISSN (electronic): 1755-1676

Publisher: Edinburgh University Press


DOI: 10.3366/cor.2014.0050

