| The GT grammar introduced in earlier chapters is the calculus that the linguist may use in the componential analysis of syntax. The excursus to chapter 4 demonstrated that a GT grammar can describe some of the various calculi of logic. We have also pointed out that the linguist may relate these calculi to each other by means of complex symbols and transformations. Chomsky first developed GT grammar because he needed a suitable tool to describe the sentences of a natural language. The analyst uses it as such a tool, always keeping in mind that the structure of English should constrain the form of the rules of the GT grammar of English. In this section we introduce enough of a GT grammar to describe compound sentences. |
Describing phrase structure. The GT linguist defines a phrase structure ( ) as an ordered sequence of constituents ( ).
Box 1 contains two formation rules, (1) and (2), that describe the calculus which in turn describes the linguist's grammar. |

| In accord with the second rule, suppose that (3) is the phrase structure of some fragment of English. It is often convenient to use tree graphs, i.e., tree diagrams (cf. ¶5-6-2), for delineating phrase structure. In such a diagram the concept of rewriting is represented by multiple vertically directed edges, as in (4). It is particularly the large structures generated by the recursiveness of rules like the second rule for a phrase structure that motivate the use of tree diagrams for clarity. Later in this book I use a variant of this form, the one in (5), to represent phrase structures. An even more useful form, which can replace both of these representations, is the one in (6). Computer scientists use this form to diagram constituencies; it is just one of many tools incorporated into the Unified Modeling Language (UML). |
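The idea of a phrase structure as an ordered sequence of constituents, each of which may itself have constituents, can be sketched in a few lines of Python. This is only an illustration: the class name `Node`, the helper functions, and the category labels A to E are hypothetical and stand in for whatever categories the grammar actually defines.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A constituent: a category label plus an ordered list of sub-constituents."""
    label: str
    children: List["Node"] = field(default_factory=list)


def bracketed(node: Node) -> str:
    """Render a tree in labelled-bracket notation."""
    if not node.children:
        return f"[{node.label}]"
    inner = " ".join(bracketed(child) for child in node.children)
    return f"[{node.label} {inner}]"


def leaves(node: Node) -> List[str]:
    """Return the labels of the childless nodes, read from left to right."""
    if not node.children:
        return [node.label]
    result: List[str] = []
    for child in node.children:
        result.extend(leaves(child))
    return result


# Hypothetical structure: A consists of B followed by C; C consists of D followed by E.
tree = Node("A", [Node("B"), Node("C", [Node("D"), Node("E")])])
print(bracketed(tree))
print(leaves(tree))
```

The bracketed string and the tree diagram carry the same information; the diagram simply becomes easier to read as the structure grows.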
Rules to describe syntactic structure. Now, the way to get from the formation rules to the phrase structure is by means of a set of phrase structure rules ( ).
These most generally have the form in (1) in box 2.
The structure characterized in (2) is generated by the PS-rule in (1). |

| GT grammar has a replacement rule for interpreting the phrase structure rule. This is a rule of inference, a transformation of the metalanguage, which is the calculus for describing GT grammars. The import of this rule is given in (3). Most linguists try to avoid the use of an environment structure, since such a condition may usually be replaced by a suitable transformation. Rules that are free of this condition are called context-free rules and take the form described in box 3. |
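The replacement rule can be pictured as an operation on a string of symbols. The following Python sketch assumes the usual reading, in which a rule rewrites one symbol as a sequence of symbols, and in which an environment condition restricts the rewriting to a symbol flanked by a given left and right neighbour. The function names and the symbols W, A, B, C, Z are hypothetical.

```python
from typing import List, Tuple

Rule = Tuple[str, List[str]]  # (left-hand symbol, right-hand sequence)


def rewrite(string: List[str], rule: Rule) -> List[str]:
    """Context-free replacement: rewrite the leftmost occurrence of the symbol."""
    lhs, rhs = rule
    if lhs not in string:
        return list(string)  # the rule does not apply
    i = string.index(lhs)
    return string[:i] + list(rhs) + string[i + 1:]


def rewrite_in_context(string: List[str], rule: Rule,
                       left: str, right: str) -> List[str]:
    """Context-sensitive replacement: rewrite only between `left` and `right`."""
    lhs, rhs = rule
    for i in range(1, len(string) - 1):
        if string[i] == lhs and string[i - 1] == left and string[i + 1] == right:
            return string[:i] + list(rhs) + string[i + 1:]
    return list(string)  # the environment is not met


rule = ("A", ["B", "C"])
print(rewrite(["W", "A", "Z"], rule))
print(rewrite_in_context(["W", "A", "Z"], rule, "W", "Z"))
print(rewrite_in_context(["W", "A", "Z"], rule, "Q", "Z"))
```

The context-free version needs no knowledge of the neighbouring symbols, which is one reason such rules are simpler to state and to apply.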

| Two ways to abbreviate rules use parentheses and curly brackets. The interpretation of such rules is given in box 4. A common variant of the Chomskyan form of the phrase structure rule is what is known as Backus-Naur form (BNF). In its extended versions this form adds a very useful device called the Kleene star, which allows the description of multiple occurrences of a constituent without the use of an (asyndetic) conjunction transformation. Box 5 tells how such a rule is to be interpreted. Notice that in this case the subscripts of the constituents do not simply indicate discreteness, but also their 1) derivational and 2) sequential order. |
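One way to see what the abbreviations amount to is to expand them mechanically into the plain rules they stand for. The sketch below is a minimal illustration under stated assumptions: parentheses mark an optional element, curly brackets mark mutually exclusive alternatives, and the Kleene star is read in its usual sense of zero or more occurrences. The tags `"opt"` and `"alt"`, the function names, and the symbols are hypothetical.

```python
from itertools import product
from typing import List


def expand(rhs: list) -> List[List[str]]:
    """Expand an abbreviated right-hand side into all plain right-hand sides.

    Items are plain symbols, ("opt", [symbols]) for parentheses,
    or ("alt", [symbols]) for curly brackets.
    """
    choices = []
    for item in rhs:
        if isinstance(item, str):
            choices.append([[item]])
        elif item[0] == "opt":      # (X): X is either absent or present
            choices.append([[], list(item[1])])
        elif item[0] == "alt":      # {X, Y}: exactly one alternative is chosen
            choices.append([[alternative] for alternative in item[1]])
        else:
            raise ValueError(f"unknown abbreviation: {item!r}")
    return [sum(combo, []) for combo in product(*choices)]


def star(symbol: str, limit: int) -> List[List[str]]:
    """List the expansions of symbol* from zero up to `limit` occurrences."""
    return [[symbol] * n for n in range(limit + 1)]


print(expand([("opt", ["B"]), "C"]))       # A -> (B) C
print(expand([("alt", ["B", "C"]), "D"]))  # A -> {B, C} D
print(star("B", 2))                        # A -> B*, cut off at two occurrences
```

The star differs from the other two devices in that its expansion has no upper bound, so the `limit` argument exists only to keep the listing finite.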


| A useful alternative to the phrase structure rule is found in the Unified Modeling Language (UML) of contemporary computer scientists. Using this convention, it is important to distinguish two ways in which structural constituents may be related. In the first case we use a rhombus on the arrow end. This is the "has a" or "is part of" relationship: each constituent is a distinct category that is a part of some greater whole. In the second case we use a triangle on the arrow end. This is the "is a (kind of)" or "realized by" relationship, and it indicates the alternative discrete realizations in a rule with curly brackets. Contextually these elements are thought of as different structures for the same kind of phrase. |

PS-rule for a sentence. Suppose the GT linguist wants to describe a sentence. The first thing we can easily state is that a pro-sentence constitutes a legitimate form for the sentence. |
[P1a] Sentence → Pro-sentence |
| There is, of course, the possibility that the sentence consists of an independent clause. |
[P1b] Sentence → Clause |
| Since these are mutually exclusive alternatives, we may write the two rules as a single one: |

| The language of object and class modelling may be more graphic with respect to classes and constituents by representing them as words in boxes. The student must be careful, however, to interpret such representations as the symbols of the grammar. The tan-colored boxes represent alternative structural classes, i.e., classes that contain elements. These stand in "is a" relationships and in turn have constituents. |
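The two UML relationships correspond closely to two familiar devices of object-oriented programming, and the rule for the sentence gives a convenient example. In the hedged sketch below, inheritance stands for the "is a" relationship (a pro-sentence is a sentence, a clause is a sentence), and an attribute stands for the "has a" relationship (a clause has constituents). The class names, the attribute names, and the sample word "Yes" are hypothetical illustrations, not part of the grammar itself.

```python
from dataclasses import dataclass
from typing import List


class Sentence:
    """The category Sentence; its subclasses are its alternative realizations."""


@dataclass
class ProSentence(Sentence):
    """First alternative: a sentence realized by a pro-sentence ("is a")."""
    word: str


@dataclass
class Clause(Sentence):
    """Second alternative: a sentence realized by a clause ("is a").

    The clause in turn "has" an ordered list of constituents.
    """
    constituents: List[str]


examples = [ProSentence("Yes"), Clause(["This", "is", "John"])]
for example in examples:
    print(type(example).__name__, isinstance(example, Sentence))
```

Because the two subclasses are distinct, an object is one or the other but never both, which matches the mutually exclusive reading of the curly brackets.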
PS-rules for a clause. By comparing a sentence with other sentences and their parts, the analyst must decide on its most appropriate phrase structure. The GT linguist uses the GT grammar to propose rules that will define the phrase structure. To see how one might proceed, consider describing the following three sentences: |
| (1) | This is John. | |
| (2) | And this is Mary. | |
| (3) | But this is Bill. |
| The structures of these three sentences are very similar. These sentences are independent clauses. With these three instances in mind the linguist may define a clause in terms of a particular structure, i.e., as another clause that in some instances also has a grammatical connective (traditionally called a conjunction), such as "and" or "but", to connect it to the preceding clause or to the structure of some other proposition in the context. In this way the grammar might contain these phrase structure rules: |
[P2a] Clause → X |
[P2b] Clause → Conjunction + Clause |
| The intent is that rule [P2a] be capable of describing the structure of (1) and that rule [P2b], the structure of either (2) or (3). (We discuss the proper constituents of a clause later, but put the variable X in [P2a] to indicate that this is where they would go.) |
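The division of labour between [P2a] and [P2b] can be tried out on the three sentences with a very small recognizer. The sketch below assumes that the only conjunctions are "and" and "but" and treats X as an unanalyzed remainder, just as the variable does in [P2a]. The function name `parse_clause` and the tuple representation of the structure are hypothetical.

```python
from typing import List

CONJUNCTIONS = {"and", "but"}  # assumed inventory, taken from the examples


def parse_clause(words: List[str]) -> tuple:
    """Assign a structure to a word sequence by [P2a] or [P2b]."""
    if not words:
        raise ValueError("a clause needs at least one word")
    first = words[0].lower()
    if first in CONJUNCTIONS:
        # [P2b] Clause -> Conjunction + Clause
        return ("Clause", ("Conjunction", first), parse_clause(words[1:]))
    # [P2a] Clause -> X, with X left unanalyzed
    return ("Clause", ("X", " ".join(words)))


for sentence in ["This is John", "And this is Mary", "But this is Bill"]:
    print(parse_clause(sentence.split()))
```

Sentence (1) receives its structure from [P2a] alone, while (2) and (3) each require one application of [P2b] followed by [P2a].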
Ellipsis as a tool of analysis. It may well be that the linguist will view (2) of the preceding paragraph as an ellipsis. An ellipsis results from an analysis in which it is assumed that the author has omitted something, sometimes a single word or sometimes even whole structures. Perhaps the sentence in (1) below is a more explicit version of (2) below; after all, we feel that the normal function of a connective (conjunction) is to connect things. |
| (1) | This is John and this is Mary. | |
| (2) | And this is Mary. |
| The argument runs something like this: having already expressed the content of the first clause of (1) in "This is John", the author of (2) naturally omitted that clause as redundant. With independent reasons for wanting to describe (1) as well as (2), and wanting to keep the description simple, the linguist next considers how to adjust [P2b] to describe both: |
[P2bi] Clause → Clause + Conjunction + Clause |
[P2bii] Clause → (Clause) Conjunction + Clause |
| The parentheses in a phrase structure rule mean that the enclosed elements may or may not be realized as a part of the structure. When we use boxes, the optional elements are outlined with a dashed line. The question is whether the rule [P2b] should describe the connective as belonging with the second clause or not. |

| The second rule allows the derivation of two structures for the main clause, yet the question is whether the second of these structures is really needed to describe sentences in English. |
A transformational approach. Transformationalists are inclined to avoid this question by adopting the policy of introducing connectives by means of transformations. The idea is to use a double-base transformation to conjoin the clauses, as we did in describing the predicate calculus in chapter 3. What this policy ignores, however, is the primacy of deep structure, i.e., the primitive sentences, and the philosophy that transformations, i.e., the rules of inference, should not introduce otherwise undefined symbols. The alternative is to generate the category of connective in the phrase structure rule, as we have in [P2b]. It is possible to leave out the parentheses; doing so would still make the structure of "And this is Mary" possible by the use of an ellipsis transformation.
Using recursion. Now what about "But this is Bill"? It would appear that, in the context of "This is John" and "And this is Mary", "But this is Bill" could be an elliptical version of sentence (1).
| (1) | This is John and this is Mary, but this is Bill. |
| The connectives keep adding clauses to the previous context. This situation seems to be further evidence that [P2] is fully adequate only by being recursive. |
![[P2] The Clause](gif/bnf/bnf02.gif)
| Notice that the tan box convention requires that the designer be explicit in putting elements on the right side of [P2] together as a structural unit. The first case (tan) is for the independent clause and any further clauses compounded as parts of it. The second case (cyan) is for the dependent clause, which has the same structure as the independent clause. The student will find the dependent clause in later rules, where it is realized as 1) a noun clause [P19], 2) an adjective clause [P29], or 3) an adverbial clause [P31]. Note that every alternative structural class needs a separate box. In this system the elements that alternate cannot also be contained as ordered constituents. This is often what motivates the creation of separate rules. |
| Through recursion the single rule [P2] has the advantage of describing infinitely many structures as required. This rule together with [P7] allows the derivation of any of the following three structures from the main clause (Clause1): |

| Either clause in the first structure may likewise have one of these structures. If it has the first, the possibilities continue until the clause has either the second or the third structure. Suggestions for further phrase structure rules to define the structure of the other elements appear in subsequent chapters. The rules analyze further each of these components until the result is a so-called terminal constituent. In the grammatical calculus of sentence syntax these are the various lexical categories. In the BNF and UML diagrams the labels are in lower case Arial and the corresponding boxes are light green. The conjunction and the subject case marker are such elements. In the latter case the constituent has no separate morphological realization, but, as will become clear later, is required to reflect the syntactic environment of the argument for the use of such rules. |
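How quickly a recursive rule multiplies structures can be seen by enumerating them up to a fixed depth. The sketch below does not reproduce the full rule [P2]; as a simplifying assumption it uses only the two rules stated earlier in the text, [P2a] Clause → X and [P2bi] Clause → Clause + Conjunction + Clause, and it writes each structure with brackets so that different groupings remain distinct. The function name `structures` is hypothetical.

```python
from typing import List


def structures(depth: int) -> List[str]:
    """List the bracketed structures derivable with recursion at most `depth` deep."""
    results = ["X"]  # [P2a] Clause -> X ends the recursion
    if depth > 0:
        smaller = structures(depth - 1)
        for left in smaller:
            for right in smaller:
                # [P2bi] Clause -> Clause + Conjunction + Clause
                results.append(f"[{left} Conjunction {right}]")
    return results


for depth in range(4):
    print(depth, len(structures(depth)))
```

Each additional level of depth squares the previous count and adds one, so no finite list of non-recursive rules could cover the same ground.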
Syntactic structure of a compound sentence. In terms of [P1] the apparent phrase structure of the single sentence "This is John and this is Mary, but this is Bill" would be the one in figure 3. The device of a triangle in tree diagrams indicates where there is structure that the linguist will have to define by creating additional rules. In box diagrams the device of colored lines serves this purpose. The dashed lines connecting to terminal constituents also indicate that lexical insertion rules are yet to be given. |

Ellipsis vs. the null constituent. [P2] appears to be successful in describing the phrase structure of the sentences in (1), (2), and (3), but (4) in the context of (1) and (5) in the context of (1) and (2) seem to require more. |
| (1) | This is John. | |
| (2) | This is John and this is Mary. | |
| (3) | This is John and this is Mary, but this is Bill. | |
| (4) | And this is Mary. | |
| (5) | But this is Bill. |
| There are at least three ways to go about describing ellipsis in a GT grammar. The first way (when possible) is to make the constituent optional. A second way is with a transformation deriving (4) and (5) from (2) and (3) respectively. A third way is useful when the linguist observes a sentence beginning with a connective in a situation where no context is reconstructable: one may then prefer to describe such an ellipsis by positing a null constituent. This constituent is a terminal symbol (Ø) that by convention may be the realization of any constituent. Presumably any constituent has conditions where ellipsis may occur, and so there need be no explicit PS rule to introduce it. With this method, the structures of (4) and (5) would take the forms shown in figure 4. |
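The second and third ways can be compared in a short Python sketch. It assumes, for illustration only, that a compound clause is a tuple of a first clause, a conjunction, and a second clause, and that the null constituent is spelled out as nothing at all. The tree shapes, the function names `surface` and `elide_first_clause`, and the flat treatment of the words inside each clause are hypothetical simplifications and are not a transcription of figure 4.

```python
from typing import List

NULL = "Ø"  # the null constituent


def surface(tree: tuple) -> List[str]:
    """Spell out a tree from left to right, skipping null constituents."""
    _label, *children = tree
    words: List[str] = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(surface(child))
        elif child != NULL:
            words.append(child)
    return words


def elide_first_clause(tree: tuple) -> tuple:
    """Ellipsis transformation: replace the content of the first clause with Ø."""
    label, first, *rest = tree
    return (label, (first[0], NULL), *rest)


# Sentence (2): "This is John and this is Mary."
full = ("Clause",
        ("Clause", "this", "is", "John"),
        ("Conjunction", "and"),
        ("Clause", "this", "is", "Mary"))

# Sentence (4): the same structure with a null first clause.
elliptical = ("Clause",
              ("Clause", NULL),
              ("Conjunction", "and"),
              ("Clause", "this", "is", "Mary"))

print(" ".join(surface(full)))
print(" ".join(surface(elliptical)))
print(elide_first_clause(full) == elliptical)
```

On this representation the transformation and the null constituent arrive at the same structure; they differ only in whether a fuller sentence is assumed to underlie the elliptical one.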

| It turns out that the first way, making the constituent element(s) optional, must sometimes still make use of the null constituent. Suppose we take the first clause of a compound as an optional constituent. In this case there would seem to be no straightforward way for the dependent clause, whose connective is context sensitive, to be included in this rule. The conjunction would then have to be null whenever the first clause of a compound was omitted. |
![[P2'] The Independent Clause](gif/gram074d.gif)