Review of CLGM in Lingua

Zhen Dong and Fan Pan reviewed Corpus Linguistics: A Guide to the Methodology for Lingua. It is behind the Elsevier paywall here.

It is a very positive review, emphasizing three areas in which the reviewers see the book as particularly valuable: first, the extensive introduction to statistical thought and practice; second, the case studies drawing from a broad range of linguistic phenomena; and third, the focus on reproducibility (something I want to expand on in potential future editions).

The reviewers also raise two critical points that will be useful for me when working on future editions:

  1. They criticize the book for not including a chapter on how to construct specialized corpora in cases where available corpora do not meet the needs of a particular research project. They are right: although Section 2.1 talks about the design of “representative” or “balanced” corpora at length, it does not discuss the design of corpora for specific research projects, nor does it give any practical advice. Part of the reason for this is that I felt that a section (or even a chapter) on this topic would have to raise not only practical issues (where to find texts, how to store and process them, etc.), but also legal issues (how to deal with copyrighted texts). The latter issue, apart from being beyond my expertise, is very dependent on the researcher’s jurisdiction, which makes it difficult to discuss in general terms. However, I will certainly think about including such a section in potential future editions. In the meantime, I can only recommend Martin Wynne’s excellent open-access book Developing Linguistic Corpora: a Guide to Good Practice, on which my potential chapter would largely be based!
  2. They criticize the absence of any reference to specific software tools for concordancing and statistical analysis. I do see their point (as I saw Kevin Gerigk’s point concerning my focus on manual statistical analysis when there are tools that will do some of this analysis for you). I just feel that, given the quick pace of software development, a textbook that builds on specific software tools will become outdated too quickly. A discussion of software tools is one of the things that this blog is meant to provide, if only I had time…

