Kevin Gerigk has reviewed Corpus Linguistics: A Guide to the Methodology for the International Journal of Corpus Linguistics. The review is open access, so you can read it here.
The review is very useful, because it draws attention to ways in which a future edition of the book might be improved. I would like to respond very briefly to three issues raised in the review.
- Kevin rightly criticizes my emphasis on a Popper-inspired perspective on deductive hypothesis testing in Chapter 3, claiming that it excludes exploratory work (which is characteristic of corpus linguistics and of big data approaches in general) from corpus linguistics proper. I do emphasize the existence of an exploratory (inductive) perspective in Chapter 3 (on p. 62), and I come back to it in Chapter 7 (p. 224) and in several case studies (e.g. 126.96.36.199, 188.8.131.52, 184.108.40.206, 220.127.116.11). However, the book is missing a principled discussion of the inductive and deductive perspectives and of the relationship between them, both in epistemological terms and in the context of corpus linguistic research. This is definitely something that must be remedied in a future edition!
- Kevin notes that the case studies often presuppose a deep understanding of statistics and linguistics that is at odds with the introductory nature of the book. This is to some extent unavoidable, as my aim was to show a broad range of applications across many different areas of linguistics, and in fact, I would like to expand this range even further in a future edition. However, he is not the first colleague to note this, so I will definitely have to find a way to make the case studies more accessible – perhaps by providing a basic introduction to each area before presenting the case studies. Unfortunately, the book is already quite long as it is – any ideas on how to solve this problem are certainly welcome!
- Kevin criticizes my detailed, step-by-step, often manual approach to statistics and data analysis, which omits any mention of tools like CQPweb, LancsBox or AntConc that automate some types of quantitative analysis (we might add SketchEngine, WordSmith and other tools to this list). Here, I have a very principled position: as useful as such tools may be for quick and dirty pilot studies or for language teachers who just need a quick overview of some phenomenon, I do not believe that they should be used for scientific research – researchers have to perform each step of the analysis themselves and be able to choose freely how to proceed at every step (and to motivate their choices). We cannot rely on tools that make these choices for the researcher and that hide large parts of the analysis from scrutiny. In a sense, my book (like, I believe, many introductory textbooks in this area) is meant as a tool of empowerment, freeing researchers from dependence on such automated tools. Perhaps I need to stress this more clearly in a future edition of the book!