CANELC: constructing an e-language corpus

Knight, D; Adolphs, S; Carter, R

doi:10.3366/cor.2014.0050

Lookup NU author(s): Dr Dawn Knight

Licence

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).

Abstract

This paper reports on the early stages of constructing CANELC: the Cambridge and Nottingham eLanguage Corpus. CANELC is a 1 million word corpus of digitally based communication in English, taken from online Discussion Boards, Blogs, Tweets, Emails and SMS messages. The paper outlines the approaches used when planning the corpus; obtaining consent; piloting the collection of data and, finally, compiling the corpus database. In addition, the paper outlines some of the ways in which corpora of this nature can be used, by providing some results of a detailed analysis carried out on a cross section of the corpus (500,000 words). The analysis questions to what extent forms of eLanguage are used and how they function and operate in similar and/or different ways to spoken and written forms of communication. Results from this analysis start to question where eLanguage is placed on the 'continuum of communication': between spoken and written forms of discourse.

Publication metadata

Author(s): Knight D, Adolphs S, Carter R

Publication type: Article

Publication status: Published

Journal: Corpora

Year: 2014

Volume: 9

Issue: 1

Pages: 29-56

Print publication date: 01/05/2014

Date deposited: 12/02/2013

ISSN (print): 1749-5032

ISSN (electronic): 1755-1676

Publisher: Edinburgh University Press

URL: http://dx.doi.org/10.3366/cor.2014.0050

DOI: 10.3366/cor.2014.0050
