| 60. Back in 1940 a linguist by the name of Benjamin Lee Whorf wrote a paper introducing the work of phoneticians to the world of science and technology.1 He attempted to adapt the formal phonetics of his day to a more familiar mathematical apparatus. The chosen domain of his discussion was a single structural formula for any word of one syllable in the English language. This section quotes and paraphrases portions of Whorf's article. In the course of the presentation his formula is reshaped into the apparatus and terminology of the phonological descriptions introduced in the previous sections. Figure 1 represents my best reading of the formula as he originally presented it. |

|
61.
The overall structure of his formula translates to figure 2, which is only the first of several structural formulas describing the same phenomena as figure 1.
Even though this sort of diagram is more involved than the one given by Whorf in figure 1, it will be found to be more explicit and hopefully easier to understand in the end.
Subsequent figures give the syntactic order and arrangement of the phonetic structural sub-elements.
The right arrow signifies that the element to the left is composed of the elements on the right.
The optional elements are labeled in dashed-line boxes and the obligatory in solid-line boxes.
The use of boxes allows us to provide each structural element with a descriptive name.
The level of analysis is given by the coloring of the box.
The tan boxes contrast with the turquoise boxes.
These tan elements represent intermediate structures and are described by similar rules on the figures that follow.
The turquoise boxes represent elements that will not be analyzed further syntactically.
These are the leaf nodes, the classes of phonemes in English, the letters of Whorf's formula.
The circled numbers refer to each of the fifteen terms of Whorf's formula.
Our versions of Whorf's individual terms are given as segment structure rules.
In these rules the phonemes are set in light yellow boxes.
Each such segment corresponds to a bundle of phonetic features in bright yellow boxes. The simplest possible formula for a monosyllabic word is [+Consonantal] [−Consonantal] (using my apparatus this would require a rule generating two sorts of light orange boxes from the turquoise one), and some languages actually conform to this. Polynesian has the next simplest formula: { Ø, [+Consonantal] } [−Consonantal]. These stand in stark contrast to the intricacy of English word structure, as shown in the rules below. |
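The two simplest formulas can be sketched in a few lines of Python, reading them as "one consonant, then one vowel" and "an optional consonant, then one vowel". This is only an illustration: the inventory below and the function names `fits_cv` and `fits_optional_c_v` are hypothetical, and each character stands for one segment.

```python
# Hypothetical toy inventory (an assumption for illustration only):
# which symbols count as consonantal is not taken from any real language.
CONSONANTS = set("ptkmnswlh")
VOWELS = set("aeiou")


def fits_cv(segments: str) -> bool:
    """Simplest formula: exactly one consonant followed by one vowel."""
    return (
        len(segments) == 2
        and segments[0] in CONSONANTS
        and segments[1] in VOWELS
    )


def fits_optional_c_v(segments: str) -> bool:
    """Next simplest formula: an optional consonant, then one vowel."""
    if len(segments) == 1:
        return segments in VOWELS
    return fits_cv(segments)


for word in ["ka", "a", "kra", "k"]:
    print(word, fits_cv(word), fits_optional_c_v(word))
```

A cluster such as kr fails both checks, which is exactly the kind of sequence the English rules have to license term by term.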

| 62. The formula may seem rather complicated; yet for a linguistic pattern it is rather simple. In the English-speaking world, every child between the ages of two and five is engaged in learning the pattern expressed by this formula, among many other formulas. By the time the child is six, the formula has become ingrained and automatic; even the little nonsense words the child makes up conform to it, exploring its possibilities but venturing not a jot beyond them. At an early age the formula becomes for the child what it is for the adult; no sequence of sounds that deviates from it can even be articulated without the greatest difficulty. New words like blurb, nonsense words like Lewis Carroll's mome raths, combinations intended to suggest languages of savages or animal cries, like glub and squonk, all come out of the mold of this formula. When the youth begins to learn a foreign language, he unconsciously tries to construct the syllables according to this formula. Of course it won't work; the foreign words are built to a set formula of their own. Usually the student has a terrible time. Not even knowing that a formula is back of all the trouble, he thinks his difficulty is his own fault. The frustrations and inhibitions thus set up at the start constantly block his attempts to use foreign tongues. Or else he even HEARS by the formula, so that the English combinations that he makes sound to him like real French, for instance. Then he suffers less inhibition and may become what is called a fluent speaker of French: bad French! |
| 63. If, however, he is so fortunate as to have his elementary French taught by a theoretic linguist, he first has the patterns of the English formula explained in such a way that they become semiconscious, with the result that they lose the binding power over him which custom has given them, though they remain automatic as far as English is concerned. Then he acquires the French patterns without inner opposition, and the time for attaining command of the language is cut to a fraction. To be sure, probably no elementary French is ever taught in this way, at least not in public institutions. Years of time and millions of dollars' worth of wasted educational effort could be saved by the adoption of such methods, but men with the grounding in theoretic linguistics are as yet far too few and are chiefly in the higher institutions. |
| 64. Let us examine the formula for the English monosyllabic word (figure 1 and its extensions). They are intended to be an expression of pattern symbolics, an analytical method that grows out of linguistics and bears to linguistics a relation not unlike that of certain higher mathematics to physics. With such pattern formulas, various operations can be performed, just as mathematical expressions can be added, multiplied, and otherwise operated with; only the operations here are not addition, multiplication, and so on, but are meanings that apply to linguistic contexts. From these operations, conclusions can be drawn and experimental attacks directed intelligently at the really crucial points in the welter of data presented by the language under investigation. Usually the linguist does not need to manipulate the formulas on paper but simply performs the symbolic operations in his mind and then says: "The paradigm of class A verbs can't have been reported right by the previous investigator"; or "Well, well, this language must have alternating stresses, though I couldn't hear them at first"; or "Funny, but d and l must be variants of the same sound in this language," and so on. Then he investigates by experimenting on a native informant and finds that the conclusion is justified. Pattern-symbolic expressions are exact, as arithmetic is, but are not quantitative in the same sense. They do not refer ultimately to number and dimension, as most fields of mathematics do, but to pattern and structure. Nor are they to be confused with theory of groups or with symbolic logic, though they are in some ways akin. |
| 65. Here are the figures describing the phonological syntax of the first part of the syllable. I have divided up the rules of the various terms of Whorf's formula, which are indicated by circled numbers, into five kinds of On-set and three types of Rhyme. |



| 66. Returning to the formula, the simplest part of it is the eighth term, consisting of a short vowel. This segment is not attached to any optional boxes nor is any diphthong in alternate sub-classes, which all means that every English word contains a vowel (not true of all languages). As the vowel is by itself without any other segments, any one of the short vowels of English can occur in the monosyllabic word (not true of all syllables of the polysyllabic English word). The same will be seen to hold of diphthongs where they appear. Next we turn to the first term, which in the above formulation is reflected by the dotted box called on-set. The dotted box means that it is optional: the vowel may be preceded by nothing, so the word may begin with a vowel, a structure impossible in many languages. In BLW's formulation there are commas between the terms, meaning "or". In our formulation we simply place the term in a separate subclass, so that each subclass is an alternative for the structure of the constituent. Boxes 7 and 8 explain how we can refer to classes of phonemes defined negatively, which is important in some of the later terms (2, 11, 15). |



| 67. The second term is the simplest configuration for the syllable on-set: any consonant except for a long-tailed n. This means that a word can begin with any single English consonant except one, the one linguists designate by a long-tailed n, which is the sound we commonly write ng, as in hang. This ng sound is common at the ends of English words but never occurs at the beginnings. In many languages, such as Hopi, Eskimo, or Samoan, it is a common beginning for a word. Our patterns set up a terrific resistance to articulation of these foreign words beginning with ng, but as soon as the mechanism of producing ng has been explained and we learn that our inability has been due to a habitual pattern, we can place the ng wherever we will and can pronounce these words with the greatest of ease. The segments in the formula thus are not always equivalent to letters by which we express our words in ordinary spelling (orthographics) but are more nearly unequivocal symbols such as a linguist would assign to the sounds in a regular and scientific system of spelling (phonemes). |

| 68. According to the third term, which consists of two segments, the word can begin with any consonant specified by the features in the first segment followed by r, or with just a few of these followed by l. Notice how the braces allow us to specify just which features may appear on the segment with each other. Before an r the consonant is any stop or any unvoiced non-alveolar fricative; before an l the stop must be either bilabial or velar, whereas the unvoiced fricative must be labiodental. The s with a wedge over it means sh. Thus we have shred, but not shled. The formula represents the fact that shled is un-English, that it will suggest a Chinese pronunciation of shred or a German's pronunciation of sled (sl is permitted by term 7). The Greek theta in BLW's formula is replaced by my thorn and means th; so we have thread or a child lisping sled. BLW did not place tr, pr, and pl in this third term. The reason is that they can be preceded by s, and so he put them in the sixth term. In my formulation it was simpler to put them in both places. |
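The third term, as worded above, can be sketched as a pair of set lookups. The code is a rough illustration rather than a transcription of the figure: the ASCII symbols ("S" for sh, "T" for th), the function name `fits_term_3`, and above all the assumption that the unvoiced non-alveolar fricatives intended are f, th, and sh are mine.

```python
# ASCII stand-ins chosen for this sketch: "S" = sh, "T" = th.
STOPS = {"p", "b", "t", "d", "k", "g"}
BILABIAL_OR_VELAR_STOPS = {"p", "b", "k", "g"}
# Assumption: the unvoiced non-alveolar fricatives meant here are f, th, sh.
UNVOICED_NON_ALVEOLAR_FRICATIVES = {"f", "T", "S"}
LABIODENTAL_UNVOICED_FRICATIVES = {"f"}

BEFORE_R = STOPS | UNVOICED_NON_ALVEOLAR_FRICATIVES
BEFORE_L = BILABIAL_OR_VELAR_STOPS | LABIODENTAL_UNVOICED_FRICATIVES


def fits_term_3(onset: str) -> bool:
    """Check a two-segment on-set (one character per phoneme) against term 3."""
    if len(onset) != 2:
        return False
    first, second = onset
    if second == "r":
        return first in BEFORE_R
    if second == "l":
        return first in BEFORE_L
    return False


# shred, shled, thread, and the tr/pl clusters kept in this term
for onset in ["Sr", "Sl", "Tr", "tr", "pl", "dl"]:
    print(onset, fits_term_3(onset))
```

Under these assumptions shr and thr pass while shl and dl fail, matching the shred/shled contrast in the paragraph above.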

| 69. The fourth term similarly means that the word can begin with a consonant in the first segment when it is followed by w. Hw does not occur in all dialects of English; in ordinary spelling it is written backwards, wh. If the dialect does not have hw, it pronounces the spelling wh as w. Thw occurs in a few words, like thwack and thwart, and gw, oddly enough, only in proper names, like Gwen or Gwynn. Kw, ordinarily spelled qu, can have s before it and therefore belongs in term 6. |
| 70. The fifth term indicates that the word may begin with one of the consonants in the first segment followed by y, but only when the vowel of the word is u; thus we have words like hue (hyuw), cue, few, muse. Some dialects have also tyu, dyu, and nyu (e.g., in tune, due, new), but BLW set up his formula for the typical dialects of the northern United States, which have simple tu, du, nu in these words. The sixth term indicates pairs that can commence a word either alone or preceded by s, that is, k, t, or p followed by r, and also kw and pl (think of train, strain; crew, screw; quash, squash; play, splay). The seventh term, which means the word can begin with s followed by any one of the consonants of the second segment, completes the parts of the word that can precede its vowel. |
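The sixth term lends itself to a very small sketch, since the paragraph lists its clusters outright. The function name `fits_term_6` and the one-character-per-phoneme spelling of the on-sets are assumptions made for the illustration.

```python
# The pairs named for term 6: k, t, or p followed by r, plus kw and pl.
S_OPTIONAL_CLUSTERS = {"kr", "tr", "pr", "kw", "pl"}


def fits_term_6(onset: str) -> bool:
    """True if the on-set is one of the listed pairs, alone or preceded by s."""
    core = onset[1:] if onset.startswith("s") else onset
    return core in S_OPTIONAL_CLUSTERS


# train/strain, crew/screw, quash/squash, play/splay
for onset in ["tr", "str", "kr", "skr", "kw", "skw", "pl", "spl", "sr"]:
    print(onset, fits_term_6(onset))
```

Stripping an initial s and looking up the remainder mirrors the dashed (optional) box for s; a bare sr is rejected because r alone is not one of the listed pairs.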


|
71.
The terms beyond the eighth show what comes after the vowel.
This portion is rather more complex than the beginning of the word, and it would take too long to explain everything in detail.
The general principles of the symbolism will be clear from the preceding explanations.
The ninth term has a null [Ø in BLW's formulation] to denote that a vowel can end the word if the vowel is a . . . or the vowel can end the word if it is the aw sound, as in paw, thaw.
(My own dialect does not have to contend with this last [mid back rounded short] vowel which can close a syllable for BLW.) [My own inclination is to rewrite BLW's formulation so as to assume a silent allophone of the h phoneme, which can serve as an obligatory closing consonant.] The vowel of the article an is exposed at the end of the word when the n is dropped before a word beginning with a consonant. This then becomes the mid back unrounded long vowel that is also found in speech to mark hesitation, spelled er, or, with interrogative intonation, huh? Similarly the vowel of pa, ma and the exclamations ah! and bah! must be written with a final h. In some dialects (eastern New England, southern United States, South British) the vowel ending occurs in words which are SPELLED with ar, like car, star (kah, stah, in these dialects), but in most of the United States dialects and in those of Ireland and Scotland these words end in an actual r. In eastern New England and South British dialects, but not in southern United States, these words cause a linking r to appear before a vowel beginning a following word (cf. § A44). Thus for far off your Southerner says fah of; your Bostonian and your Britisher say fa rof, with a liquid initial r; but most of the United States says far of, with a rolled-back r. For some dialects, term 9 would be different, showing another possible final vowel, namely, the peculiar sound which the Middle Westerner may notice in the Bostonian's pronunciation of fur, cur ( ) and no doubt may find very queer.
This funny sound is common in Welsh, Gaelic, Turkish, Ute, and Hopi, but I am sure Boston did not get it from any of these sources.
[BLW is speaking phonetics rather than phonemics here, for which lapse we will have to forgive him, as his phonemics refused the possibility of a null allophone (for r here).]
|

| 72. Can one-syllable words end in e, i, o, or u? No, not in English. The words so spelled end in a consonant sound, y, or w. Thus, I, when expressed in formula pattern, is ay, we is wiy, you is yuw, how is hæw, and so on. A comparison of the Spanish no with the English No! shows that, whereas the Spanish word actually ends with its o sound trailing in the air, the English equivalent closes upon a w sound. The patterns to which we are habituated compel us to close upon a consonant after most vowels. Hence when we learn Spanish, instead of saying como no, we are apt to say kowmow now; instead of si, we say our own word see (siy). In French, instead of si beau, we are apt to say see bow. |
| 73. Here is figure 5 describing the phonological syntax of the rest of the syllable (to which other morphological elements have not yet been attached). |


| 74. Term 10 means that r, w, and y may be interpolated at this point except when the interpolation would result in joining w and y with each other. [I have not included w and y since they are included as part of the rhyme without a coda.] Term 11 means that the word may end in any single English consonant except h [note: this possibility is also included in the rhyme without a coda]; this exception is most unlike some languages, e.g., Sanskrit, Arabic, Navaho, and Maya, in which many words end in (phonetically sounded) h. |


| 75. The reader can figure out terms 12, 13, and 14 if he has stuck so far. Term 13 expresses the possibility of words like gulch, bulge, lunch, lounge. |


| 76. Term 14 represents the pattern of words like health, width, eighth (eytþ), sixth, xth (eksþ). [The first two examples are abstract nouns derived from adjectives in Old English; the last ones show the th-formative for expressing ordinal numbers.] Although we can say nth power or fth power, it takes effort to say the unpermitted sth power or hth power. Hth would be symbolized *eyčþ, the star meaning that the form does not occur. Term 14, however, allows both mþ and mpf [see previous figure], the latter in words like humph or the recent oomph (umpf). [I have taken the p in this word to be a phonetic parasite resulting from the movement of the vocal apparatus between the real intended sounds (phonemes). The pronunciation assumed results from sounding out the conventional spelling. Its original pronunciation is not phonemic, cf. § A18. The second word is also derived from non-phonemic sounds (made under strenuous exertion), but its pronunciation has clearly been regularized.] |

| 77. The elements of term 15 may be added after anything: the t and s forms after voiceless sounds, the d and z after voiced sounds. Thus, towns is tæwnz, with wnz obtained by term 10 plus 11 plus 15; whereas bounce is bæwns with wns by 10 plus 12. Some combinations resulting in this way are common; others are very rare but still are possible English forms. If Charlie McCarthy should pipe up in his coy way [making a verb from a noun], "Thou oomphst, dost thou not?" or a Shakespearean actor should thunder out, "Thou triumphst!" the reason would be that the formula yields that weird sputter mpfst by term 14 plus term 15. Neither Mr. Bergen nor Mr. Shakespeare has any power to vary the formula. |
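The voicing agreement in term 15 can be sketched as a small function. This is an illustration under stated assumptions: the set of voiceless symbols, the ASCII stand-ins ("T" for th, "S" for sh, "C" for ch), the labels "stop" and "sibilant" for the two suffix shapes, and the function names are all hypothetical. Only the vowelless forms t/d and s/z discussed here are covered.

```python
# Assumed set of voiceless final sounds, one character per phoneme.
VOICELESS = {"p", "t", "k", "f", "s", "T", "S", "C"}


def term_15_form(stem: str, kind: str) -> str:
    """Pick the term 15 element: t or s after voiceless sounds, d or z after voiced."""
    voiceless = stem[-1] in VOICELESS
    if kind == "stop":
        return "t" if voiceless else "d"
    if kind == "sibilant":
        return "s" if voiceless else "z"
    raise ValueError(f"unknown suffix kind: {kind!r}")


def attach(stem: str, kind: str) -> str:
    """Return the stem with the agreeing term 15 element added."""
    return stem + term_15_form(stem, kind)


print(attach("tæwn", "sibilant"))  # towns
print(attach("flip", "stop"))      # flipped
```

Everything not listed as voiceless is treated as voiced, so the vowel-final and nasal-final stems fall out without being enumerated.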
|
78.
The slash in the rules for adding suffixes indicates that the phoneme described before the blank-underscore is a condition on the final segment(s) of the morpheme to which it attaches.
Notice also that morpheme boundaries are indicated with the sign of concatenation on a gray background.
Here is figure 6 describing the morphological derivations (and inflections) mentioned by BLW. In English all derivations are added first, and then inflections as described in figure 2. Hence we have certain root adjectives, e.g., true, young from which we derive noun bases, truth, youth. Other derivational affixes may attach here in a certain order (none are without vowels): truthful, truthfulness; youthful, youthfulness. The allomorphs of suffixes treated in this monograph, of course, do not contain a vowel. |

| 79. The overriding factor applicable to the whole expression is a prohibition of doubling. Notwithstanding whatever the formula says, the same two consonants cannot be juxtaposed. While by term 15 we can add t to flip and get flipt (flipped), we can't add t to hit and get hitt. Instead, at the point in the patterns where hitt might be expected, we find simply hit (I hit it yesterday, I flipt it yesterday). [This example is actually prohibited by morphological restraints; irregular lexical insertion for hit.] Some languages, such as Arabic, have words like hitt, fadd, and so on, with both paired consonants distinct. The Creek Indian language permits three, e.g., nnn. |
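The prohibition of doubling amounts to a filter applied on top of whatever the terms generate. A minimal sketch, assuming one character per phoneme and a hypothetical set of vowel symbols (so that only consonants are checked), might look like this; the name `violates_doubling` is mine.

```python
# Assumed vowel symbols; every other character is treated as a consonant.
VOWEL_SYMBOLS = set("aeiouæ")


def violates_doubling(segments: str) -> bool:
    """True if the same consonant occurs twice in a row."""
    return any(
        a == b and a not in VOWEL_SYMBOLS
        for a, b in zip(segments, segments[1:])
    )


for form in ["flipt", "hit", "hitt", "fadd", "nnn"]:
    print(form, violates_doubling(form))
```

The check only flags the shape; it says nothing about why hit surfaces without a suffix, which the bracketed note above attributes to morphology.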
| 80. The way the patterns summarized in this formula control the forms of English words is really extraordinary. A new monosyllable turned out, say, by Walter Winchell or by a plugging adman concocting a name for a new breakfast mush, is struck from this mold as surely as if I pulled the lever and the stamp came down on his brain. Thus linguistics, like the physical sciences, confers the power of prediction. I can predict, within limits, what Winchell will or won't do. He may coin a word thrub, but he will not coin a word srub, for the formula cannot produce a sr. A different formula indicates that, if Winchell invents any word beginning with th, like thell or therg, the th will have the sound it has in thin, not the sound it has in this or there. Winchell will not invent a word beginning with this latter sound. [The formula actually allows such words (see term 2) so the prohibition is probably not at this level of analysis.] |
| 81. We can wheeze forth the harshest successions of consonants if they are only according to the patterns producing the formula. We easily say thirds and sixths, though sixths has the very rough sequence of four consonants, ksþs. But the simpler sisþs is against the patterns and so is harder to say. Glimpst (glimpsed) has gl by term 3, i by 8, mpst by 12 plus 15. But dlinpfk is eliminated on several counts: Term 3 allows for no dl, and by no possible combination of terms can one get npfk. Yet the linguist can say dlinpfk as easily as he can say glimpst. The formula allows for no final mb; so we do not say lamb as it is spelled, but as lam. Land, quite parallel but allowed by the formula, trips off our tongues as spelled. It is not hard to see why the explanation, still found in some serious textbooks, that a language does this or that for the sake of euphony is on a par with nature's reputed abhorrence of a vacuum. [A semantic anomaly, since euphony lacks a technical meaning; but cf. § A31.] |
| 82. The exactness of this formula, typical of hundreds of others, shows that, while linguistic formulations are not those of mathematics, they are nevertheless precise. We might bear in mind that this formula, compared with the formulation of some of the English (or other) grammatical patterns that deal with meaning, would appear like a simple sum in addition compared with a page of calculus. It is usually more convenient to treat very complex patterns by successive paragraphs of precise sentences and simpler formulas, so arranged that each additional paragraph presupposes the previous ones, than to try to embrace all in one very complex formula. |