Corpus Linguistics seeks to provide a comprehensive sampling of real-life usage in a given language, and to use these empirical data to test language hypotheses. Modern corpus linguistics began fifty years ago, but the subject has seen explosive growth since the early 1990s. These days corpora are being used to advance virtually every aspect of language study, from computer processing techniques such as machine translation, to literary stylistics, social aspects of language use, and improved language-teaching methods. Because corpus linguistics has grown fast from small beginnings, newcomers to the field often find it hard to get their bearings. Important papers can be difficult to track down. This volume reprints forty-two articles on corpus linguistics by an international selection of authors, which comprehensively illustrate the directions in which the subject is developing. It includes articles that are already recognized as classics, and others which deserve to become so, supplemented with editorial introductions relating the individual contributions to the field as a whole. This collection of readings will be useful to students of corpus linguistics at both undergraduate and postgraduate level, as well as academics researching this fascinating area of linguistics.
Geoffrey Sampson is Professor of Natural Language Computing at the School of Informatics, University of Sussex. Diana McCarthy is a Royal Society Dorothy Hodgkin Fellow, in the Department of Informatics at Sussex University.