2. The Relative Frequency of the English Verb Forms in Natural Discourse
Even with modern computer technology and corpora (large collections of digitized texts and transcriptions of spoken discourse), it is more difficult to count the frequency of verb forms, or, harder still, the uses to which they are being put, than it is to calculate the prevalence of individual words. Such a task still largely has to be done by human hand and brain. However, even a superficial investigation reveals some clear overriding patterns.
The statistics I present here are culled from a 1969 Czech study (Krámský, 1969), which was simply the one I found most readily to hand on the Internet and most easily adaptable for use by non-linguists. Modern linguistics tends to use terminology and categories that differ significantly from those still employed by language teachers and in popular parlance. However, since my aim here is to investigate the prevalence of precisely those categories still regularly employed in language learning, a more old-fashioned study is more useful. Whether modern scientific linguistic categories are useful for language learning is a subject to which I will return in future posts, but it is not my main focus here.
Krámský’s study is based on a fairly crude, but sound, descriptive statistical analysis of three kinds of corpus (middle-brow novels, plays in colloquial modern language, and academic textbooks). I have checked his findings using three texts of my own choosing: a different, more modern, middle-brow novel; a TV screenplay; and an academic paper in the field of medicine. My results (not given here) largely concur with his, suggesting both that Krámský’s findings are broadly accurate for these three kinds of text and that there has been little change over the past 45 years.
I summarize Krámský’s findings in the following table:
Frequency of Verb Forms by Type of Text

| Novel/Prose Narrative: Verb Form | Frequency (%) | Drama/Conversational Discourse: Verb Form | Frequency (%) |
| --- | --- | --- | --- |
| Preterite Simple Active | 48.5 | Present Simple Active | 44 |
| Present Simple Active | 30.1 | Preterite Simple Active | 26.6 |
| Pluperfect Simple Active | 5.1 | Present Perfect Simple Active | 7 |
| Present Perfect Simple Active | 3.1 | Future Active | 5.1 |
| Conditional Present Active | 3.1 | Present Continuous Active | 5.1 |
| Present Continuous Active | 2.7 | Conditional Present Active | 3.6 |
| Preterite Continuous Active | 2.3 | Pluperfect Simple Active | 1.9 |
| Preterite Simple Passive | 2.1 | Present Perfect Continuous Active | 1.5 |
| Future Active | 1.3 | Preterite Continuous Active | 1.4 |
| | | Conditional Past Active | 1.3 |
| | | Present Simple Passive | 1 |
| TOTAL | 98.3 | TOTAL | 98.5 |
| All other verb forms | 1.7 | All other verb forms | 1.5 |

| Academic: Verb Form | Frequency (%) | Total: Verb Form | Frequency (%) |
| --- | --- | --- | --- |
| Present Simple Active | 67.5 | Present Simple Active | 44.1 |
| Preterite Simple Active | 10.2 | Preterite Simple Active | 27.6 |
| Conditional Present Active | 4.8 | Future Active | 4.5 |
| Present Perfect Simple Active | 3.8 | Present Perfect Simple Active | 4.1 |
| Present Simple Passive | 2.9 | Conditional Present Active | 3.8 |
| Future Active | 2.4 | Present Simple Passive | 3.8 |
| Present Continuous Active | 2.2 | Present Continuous Active | 3.3 |
| Pluperfect Simple Active | 1.2 | Pluperfect Simple Active | 2.4 |
| | | Preterite Simple Passive | 1.8 |
| | | Preterite Continuous Active | 1.5 |
| TOTAL | 95 | TOTAL | 96.9 |
| All other verb forms | 5 | All other verb forms | 3.1 |
These figures require a number of comments, explanations and criticisms.
Krámský uses the term Preterite where I use Past. Both terms are in current use, but I think that Past is now the more common, at least in the ELT community, and more easily understood, so I shall continue to use it here. Likewise, Krámský uses the term Pluperfect, where I follow more recent usage in referring to this form as the Past Perfect.
It is not clear whether Krámský’s ‘Future Active’ refers only to the ‘will + bare verb’ form or to all the various forms used to express a future action or prediction. I presume the former. Likewise, it is unclear whether the Present Continuous category covers both uses of the form (for a continuing action and for the near planned future) or only the continuing action. Here I presume that it covers both. If so, the ‘Future’ may be somewhat underestimated in the analysis of these corpora.
Krámský does not include infinitive constructions (bare or not) involving modal verbs, nor the -ing form of the verb except where it is construed as part of a ‘continuous’ construction.
These limitations, however (except perhaps for the one relating to the various ‘future’ forms), are mere quibbles; the overall pattern is clear.
The conclusions can be summarized as follows:
- In all forms of discourse the simple active past and present forms are overwhelmingly the most prevalent: 78.6% in the case of narrative prose; 70.6% for drama as a proxy for conversational discourse; 77.7% for academic writing; and 71.7% overall.
- In prose narrative, the past simple accounts for 48.5% of the total, compared to 30.1% for the present simple. In drama, this pattern is reversed, with 44.0% for the present simple and 26.6% for the past simple. In academic discourse a full 67.5% of verbs are in the present simple form, with only 10.2% in the past simple. The difference between prose narrative and spoken discourse concurs with common-sense intuition, but the overwhelming prevalence of the present simple active in academic discourse belies the commonly held belief or anxiety that this kind of discourse is especially complex, difficult or deliberately alienating. So far as verb forms are concerned, the most complex style of discourse according to these statistics is, in fact, spoken conversational discourse, and the reality of such discourse, as opposed to a skilled playwright’s imitation of it, may be more complex and varied still.
- Continuous forms of the verb are far rarer than simple ones in all forms of discourse. Among the listed forms they account for 5.0% in prose narrative, compared to 93.3% for simple forms; 8.0% in drama, compared to 90.5%; 2.2% in academic discourse, compared to 92.8%; and 4.8% overall, compared to 92.1%. In each case the unlisted remainder of ‘all other verb forms’ (1.7%, 1.5%, 5% and 3.1% respectively) could add at most that much to either figure. This is vastly disproportionate to the quantity of content and priority accorded to continuous forms in textbooks and classrooms, a matter upon which I shall expand in the next section.
- Among the listed forms, perfect forms account for 8.2% of all verb forms in prose narrative, 10.4% in dramatic discourse, 5.0% in academic writing, and 6.5% overall, in each case plus at most the unlisted remainder of 1.7%, 1.5%, 5% and 3.1% respectively. Perfect forms are thus considerably more frequent in all types of discourse than continuous ones.
- Passive forms are very rare, even in academic discourse, the approximate frequencies being 2.1%, 1%, and 2.9% in narrative, conversational and academic discourse respectively, and 5.6% overall.
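For readers who would like to check the arithmetic behind these conclusions, here is a minimal Python sketch. The figures are transcribed from the table above; everything else is my own illustrative scaffolding rather than anything found in Krámský. The names `TABLE`, `share` and `summarize` are hypothetical, and classifying forms by matching words in the row labels is an assumption that only works because of how this particular table is labelled. I also assume that the rows appearing without a partner belong to the drama and total columns, since that is the only assignment under which the columns add up to their stated totals.

```python
# Percentages of the listed verb forms, transcribed from the table above.
# Assumption: rows that appear without a partner in the table belong to the
# "drama" and "total" columns, as the TOTAL rows require.
TABLE = {
    "narrative": {
        "Preterite Simple Active": 48.5, "Present Simple Active": 30.1,
        "Pluperfect Simple Active": 5.1, "Present Perfect Simple Active": 3.1,
        "Conditional Present Active": 3.1, "Present Continuous Active": 2.7,
        "Preterite Continuous Active": 2.3, "Preterite Simple Passive": 2.1,
        "Future Active": 1.3,
    },
    "drama": {
        "Present Simple Active": 44, "Preterite Simple Active": 26.6,
        "Present Perfect Simple Active": 7, "Future Active": 5.1,
        "Present Continuous Active": 5.1, "Conditional Present Active": 3.6,
        "Pluperfect Simple Active": 1.9,
        "Present Perfect Continuous Active": 1.5,
        "Preterite Continuous Active": 1.4, "Conditional Past Active": 1.3,
        "Present Simple Passive": 1,
    },
    "academic": {
        "Present Simple Active": 67.5, "Preterite Simple Active": 10.2,
        "Conditional Present Active": 4.8, "Present Perfect Simple Active": 3.8,
        "Present Simple Passive": 2.9, "Future Active": 2.4,
        "Present Continuous Active": 2.2, "Pluperfect Simple Active": 1.2,
    },
    "total": {
        "Present Simple Active": 44.1, "Preterite Simple Active": 27.6,
        "Future Active": 4.5, "Present Perfect Simple Active": 4.1,
        "Conditional Present Active": 3.8, "Present Simple Passive": 3.8,
        "Present Continuous Active": 3.3, "Pluperfect Simple Active": 2.4,
        "Preterite Simple Passive": 1.8, "Preterite Continuous Active": 1.5,
    },
}


def share(corpus, predicate=lambda name: True):
    """Sum the listed percentages in one corpus whose label satisfies predicate."""
    return round(sum(p for name, p in TABLE[corpus].items() if predicate(name)), 1)


def is_core(name):
    # The two simple active forms: past (preterite) and present.
    return name in ("Preterite Simple Active", "Present Simple Active")


def is_continuous(name):
    return "continuous" in name.lower()


def is_perfect(name):
    # "pluperfect" contains "perfect", so past perfect forms are counted too.
    return "perfect" in name.lower()


def is_passive(name):
    return name.endswith("Passive")


def summarize(corpus):
    """Aggregate one column of the table into the categories discussed above."""
    listed = share(corpus)
    continuous = share(corpus, is_continuous)
    return {
        "listed": listed,
        "simple_past_and_present": share(corpus, is_core),
        "continuous": continuous,
        "non_continuous": round(listed - continuous, 1),
        "perfect": share(corpus, is_perfect),
        "passive": share(corpus, is_passive),
        "unlisted": round(100 - listed, 1),
    }


for corpus in TABLE:
    print(corpus, summarize(corpus))
```

The `unlisted` figure is the ceiling on what any category could gain from forms too rare to appear in the table. Note that a form such as the Present Perfect Continuous is counted under both `continuous` and `perfect`, which is how the figures in the list above are arrived at.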
These results back up my claim in the previous section that the English verb system is much simpler than it is held to be by learners, teachers and linguists alike. The more complex forms of the verb appear, when viewed from this perspective, as exotic, largely decorative, lace-like features, lining an otherwise fairly easily produced and reproduced, and much more extensive, fabric of simple past and present forms. These forms are ‘simple’ in both the technical grammatical and the vernacular sense of the word.
This simple finding, documented over forty years ago in an obscure journal from the eastern side of the Iron Curtain, but easily replicated today, raises numerous issues.
Why do materials used in the English language teaching community still today tend not to reflect this clear pattern of usefulness and frequency? If contemporary scientific linguists are right in claiming that more in-depth and extensive empirical studies of such language features invalidate traditional categorizations and terminology and learning methods, why have these supposed advances in language science not yet been taken up by language-teaching professionals? Why have apparently obviously needed reforms not been introduced into language teaching and why is any attempt to do so vigorously opposed or ignored by learners and teachers alike?
These are hard questions, more easily brushed aside by the various institutions and individuals that have competing stakes in them than confronted head-on. In the following sections of this series, I shall attempt to scratch away at the hard surface of these three crucial questions as best I can.
In Part III, I will endeavor to show how, and suggest why, language-teaching materials almost universally diverge, in terms of content, emphasis and order of priority, from actual natural usage. In Part IV, I explore, as best I can, how the principles of modern linguistics both undermine and replicate the prescriptivism of classical rhetoric and how they have thus failed to gain much traction in the actual business and practice of language teaching and learning. In Part V, I reflect on why the apparently eminently commonsensical ‘lexical approach’ to language teaching methodology has had so little impact on the industry, despite having been around for over twenty years. Finally, in Part VI, I shall attempt to pull all these threads together and present the issues in a broader historical, political, economic and ideological context that may prove more illuminating than any narrow focus on linguistic prescription and abstraction.
References
Krámský, Jiří (1969) “Verb-form Frequency in English” in Brno Studies in English Vol. 8. http://www.phil.muni.cz/plonedata/wkaa/BSE/BSE_1969-08_Scan/BSE_08_16.pdf