3.3.3.1 Sentential Mood
Figure 3.10 shows the separate percentages for the three mood output units (the data are from the training corpus and the first test corpus, for a tolerance of 0.2; the precise numbers can again be found in Appendix 5). It is readily apparent that the network has found a very good method for determining which sequences of input words belong to which type of sentence. Recall that the desired mood of each input sentence is based on its final punctuation mark ('.', '!' or '?'), but the network, of course, does not get to see this mark until the very end of the sentence, i.e. when the classification of the words belonging to the sentence has already taken place.
This result strongly suggests that CLASPnet has become sensitive to both word classes and word order: for example, when the net sees one of the modal verbs of the lexicon as the first word of a new sentence, it immediately 'knows' that it is an interrogative sentence; similarly, the sentence-initial presence of a WH-element also triggers the Interrogative output unit; and imperatives can easily be detected because a non-modal verb occurs in sentence-initial position. It is one of the peculiarities of English that it has a very strict word order (cf. Bates & MacWhinney 1989), so it is not surprising that the network has been able to detect these regularities in the input data.
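The sentence-initial cues just described can be written down as a hand-coded rule, which makes it easier to see how much of the mood decision is already available at the first word. The sketch below is only an illustration of those cues, not a description of what CLASPnet computes internally; the word lists and the function name `guess_mood` are hypothetical and do not reproduce the actual lexicon.

```python
# Hypothetical word lists for illustration only; the real lexicon is not reproduced here.
MODALS = {"can", "could", "will", "would", "must", "may"}
WH_WORDS = {"who", "what", "which", "where", "when", "why"}
NON_MODAL_VERBS = {"chase", "drink", "see", "like", "pass"}


def guess_mood(sentence):
    """Guess the sentential mood from the first word alone.

    Mirrors the three cues in the text: a sentence-initial modal or
    WH-element suggests an interrogative, a sentence-initial non-modal
    verb suggests an imperative, and anything else is treated as indicative.
    """
    words = sentence.lower().split()
    if not words:
        raise ValueError("empty sentence")
    first = words[0]
    if first in MODALS or first in WH_WORDS:
        return "interrogative"
    if first in NON_MODAL_VERBS:
        return "imperative"
    return "indicative"


if __name__ == "__main__":
    for s in ["could you pass me the salt ?",
              "drink !",
              "the men have not seen the tigers ."]:
        print(f"{s!r}: {guess_mood(s)}")
```

A rule of this kind fails exactly where the first word is ambiguous between moods, which is the sort of case in which a network relying on the same cues would also be expected to err.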
There are at least two lines along which one could make the task more challenging for the network. First, one could extend the grammar to include tag-questions (e.g. 'The men have not seen the tigers, have they?') and check whether the performance of the network on sentential mood drops. My prediction is that it would: because the word order of these questions is mostly identical to that of the indicative sentences, the network would have no chance of determining in advance how the sentence would end, unless one added prosodic or pragmatic information to the input. Extending the grammar with moods which have their own typical word order (e.g. the optative) would probably not make the task much more difficult. The second line of changes would be to look at the pragmatic function of each sentence before determining its mood: a sentence like 'Could you pass me the salt?' would be classified as interrogative in the present scheme, while it is arguably (also) close to being an imperative. The primary problem with such an extension of CLASPnet is that it is not self-evident which pragmatic classification scheme to use; and even if an uncontroversial scheme were available, one would still face the daunting task of writing a grammar which correctly implements the scheme.
Figure 3.10 above does not, however, show the full picture. Imagine that there were only 5 input patterns in the training corpus (out of the 47,338) for which the Interrogative output unit should be 1. These 5 patterns would likely be swamped during training by the other 47,333 patterns for which the desired output on the Interrogative unit is 0, so that the network ends up always giving this output unit a value of 0. From the perspective of Figure 3.10, the performance for the Interrogative unit would be more than 99%; after all, the unit is correctly off for nearly all the input patterns. From another point of view, the performance is 0%, as none of the patterns for which the output unit should have been on actually led to an activation value of 1. There is also a second way in which the performance as shown by Figure 3.10 can be deceptive. Imagine that the net has indeed become sensitive to the 5 input patterns, but overly so: rather than the Interrogative output unit being on for only those 5 patterns, it is on for 500 patterns. Although there are now 495 patterns which the network classifies incorrectly, the overall performance would still be close to 99%. Using NLP terminology, we can call mistakes of the first type (a unit that stays off when it should be on) 'missed errors' and those of the second type (a unit that is on when it should be off) 'spurious errors'. Figure 3.11 shows the percentages of missed and spurious errors for the training corpus and the first test corpus (Note 20).
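The two hypothetical scenarios can be checked with a few lines of arithmetic. The following is a minimal sketch, not the scoring procedure actually used for Figure 3.11: the function name `unit_scores` is hypothetical, outputs are assumed to be already thresholded to 0 or 1, and the choice of denominators (missed errors relative to the patterns for which the unit should be on, spurious errors relative to those for which it should be off) is an assumption.

```python
def unit_scores(targets, outputs):
    """Return (overall accuracy, missed rate, spurious rate) for one output unit.

    targets and outputs are equal-length sequences of 0/1 values.
    Assumed denominators: missed errors are counted against the patterns
    whose target is 1, spurious errors against the patterns whose target is 0.
    """
    pairs = list(zip(targets, outputs))
    should_be_on = sum(1 for t, _ in pairs if t == 1)
    should_be_off = len(pairs) - should_be_on
    missed = sum(1 for t, o in pairs if t == 1 and o == 0)
    spurious = sum(1 for t, o in pairs if t == 0 and o == 1)
    accuracy = (len(pairs) - missed - spurious) / len(pairs)
    missed_rate = missed / should_be_on if should_be_on else 0.0
    spurious_rate = spurious / should_be_off if should_be_off else 0.0
    return accuracy, missed_rate, spurious_rate


TOTAL = 47_338  # size of the training corpus in patterns
ON = 5          # hypothetical number of patterns with target 1

targets = [1] * ON + [0] * (TOTAL - ON)
always_off = [0] * TOTAL                     # scenario 1: unit never fires
over_eager = [1] * 500 + [0] * (TOTAL - 500)  # scenario 2: unit fires for 500 patterns

for name, outputs in [("always off", always_off), ("over-eager", over_eager)]:
    acc, missed, spurious = unit_scores(targets, outputs)
    print(f"{name}: accuracy={acc:.4%} missed={missed:.2%} spurious={spurious:.2%}")
```

In the first scenario the overall accuracy is 47,333/47,338 while every one of the 5 target patterns is missed; in the second, no pattern is missed but 495 activations are spurious. This is why the overall percentage on its own says little about a rarely active unit.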
The numbers in Figure 3.11 are more revealing than the ones given earlier: although the overall performance is still very good, it is obvious now that the network has far more problems with spotting imperatives than with the other moods. If we look at the properties of the training corpus, we can easily find the cause of this discrepancy: there are 24,238 input patterns for which the Indicative unit has to be on, 15,528 patterns for which the Interrogative unit should be on, and only 1,926 patterns for the Imperative unit. (There are also 5,646 patterns for punctuation marks for which no output unit has to be active.) Hence, the network has seen far fewer imperatives, and is consequently biased against raising the activation value of the Imperative unit (Note 21). Moreover, sentence-initial 'do' can start both imperative sentences (e.g. 'do not chase the falcons!') and Yes/No-Questions (e.g. 'do the sharks like the fish?'). Because Yes/No-Questions are more frequent in the corpus, all these occurrences of 'do' are initially classified as belonging to interrogative sentences. A third and final reason for the poorer performance on the imperatives is that many of them are very short (e.g. 'drink!'). If the network errs on the first word, then it has no chance of correcting itself.
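The imbalance in the training corpus can be made explicit by converting the pattern counts quoted above into shares of the corpus. The sketch below uses only those counts; the helper names `share` and `always_off_accuracy` are hypothetical, and the 'always off' baseline is an illustrative reference point rather than a result reported for CLASPnet.

```python
# Pattern counts per desired output, as quoted for the training corpus.
counts = {
    "Indicative": 24_238,
    "Interrogative": 15_528,
    "Imperative": 1_926,
    "none (punctuation)": 5_646,
}
total = sum(counts.values())  # 47,338 patterns in all


def share(label):
    """Fraction of all training patterns whose desired output is `label`."""
    return counts[label] / total


def always_off_accuracy(label):
    """Overall accuracy of a unit that never switches on (illustrative baseline)."""
    return 1 - share(label)


for label, n in counts.items():
    print(f"{label:20s} {n:6d} patterns  {share(label):6.1%} of corpus")

print(f"Imperative unit, always off: {always_off_accuracy('Imperative'):.1%} overall accuracy")
```

Imperative patterns make up only about 4% of the corpus, so a unit that never fired at all would still be scored as correct on roughly 96% of the patterns. This is the same effect as in the hypothetical 5-pattern example, in a milder form.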