3.3.4.6 Some Natural Language Samples
The last type of experimental input was intended to highlight the limitations of CLASPnet. Although the trained net can classify more than 90% of its input patterns correctly, all these patterns were also generated by the grammar developed for CLASPnet. Generalization to other patterns generated by the same grammar has also turned out to be possible. But what about real English? If the history of Artificial Intelligence teaches us anything at all, one of the more depressing insights is that very few computer models scale well from the simplified tasks in the lab to the real world outside. In the case of CLASPnet, checking its real generalization capacities has been a fairly straightforward task: I selected three short samples (approximately 250 clauses each) of different textual styles (a fairy tale, a scientific paper, and a technical manual), and manually determined the desired output pattern for each of the clauses in the samples (Note 27). The input-output patterns were then presented as test corpora to CLASPnet.
It will come as no surprise that the network was quite often stumped by the input it saw. Not only did the natural language samples contain a large number of constructions which the net had never seen, most of the words were also new -- the combination of both factors was nearly always fatal. The overall percentages (averaged over the 17 output units, and for a tolerance of 0.2) for the three samples are clearly below the 90% which CLASPnet scores on the generated corpora: fairy tale, 77% correct; scientific article, 76%; technical manual, 79%. These numbers are, however, difficult to interpret because some types of sentences and clauses simply did not occur at all in the samples, and an output unit that merely stays off then counts as correct. If we only take into account those output units for which there were at least some relevant patterns, and we calculate the overall percentages of missed patterns, we find that 64% of the patterns were missed in the fairy tale, versus 58% for the scientific article, and 'only' 56% for the sample from the technical manual. These numbers make it very clear that CLASPnet currently cannot analyze samples of normal written English with a high degree of accuracy: although the output units are mostly off when they should be, they are on in fewer than half of the cases in which they should be.
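The difference between the two kinds of percentages above can be illustrated with a minimal sketch. It is not the code used for CLASPnet; it rests on several assumptions: targets are 0 or 1, an output counts as correct when it lies within the tolerance of its target, and the miss percentage is pooled over all cases in which a unit should have been on. The function names and the toy data (three output units instead of 17) are hypothetical.

```python
TOLERANCE = 0.2  # tolerance quoted in the text


def within_tolerance(output, target, tol=TOLERANCE):
    """Assumed criterion: an output is correct if it is within tol of its target."""
    return abs(output - target) <= tol


def percent_correct(outputs, targets, tol=TOLERANCE):
    """Percentage correct per output unit, then averaged over the units."""
    n_units = len(targets[0])
    per_unit = []
    for u in range(n_units):
        hits = sum(within_tolerance(o[u], t[u], tol)
                   for o, t in zip(outputs, targets))
        per_unit.append(100.0 * hits / len(targets))
    return sum(per_unit) / n_units


def percent_missed(outputs, targets, tol=TOLERANCE):
    """Percentage of 'should be on' cases in which the unit did not come on.

    Units whose target is never 1 contribute no cases, so they are
    left out automatically (assumed pooling over the remaining units).
    """
    wanted = 0
    missed = 0
    for o, t in zip(outputs, targets):
        for out, tgt in zip(o, t):
            if tgt == 1:
                wanted += 1
                if not within_tolerance(out, tgt, tol):
                    missed += 1
    if wanted == 0:
        raise ValueError("no pattern requires any unit to be on")
    return 100.0 * missed / wanted


# Hypothetical toy corpus: 4 clauses, 3 output units; unit 3 is never wanted.
targets = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]]
outputs = [[0.9, 0.1, 0.0], [0.3, 0.1, 0.1], [0.1, 0.5, 0.0], [0.0, 0.1, 0.1]]

print(round(percent_correct(outputs, targets), 1))  # 83.3
print(round(percent_missed(outputs, targets), 1))   # 66.7
```

In this toy corpus the averaged percentage looks respectable because most targets are 0 and the units mostly stay off, while two of the three cases in which a unit should have come on are missed.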
Still, when the clauses in the samples happen to conform to the constructions present in the training corpus, the network tends to do a decent job: for example, 'this wife, who was a widow, had two daughters.', 'okrent argues that heidegger's work on the nature of the intentional provides important insight into the conditions of thought.', and 'the number of redundant links is customer dependent.' were mostly analyzed as desired. (The word-by-word analyses of these and other sentences can be found in Appendix 6.2.) This at least suggests that the performance of the network on natural language samples could be improved if it were also trained on such samples.