|
|
3.3.4 Generalization to New Patterns
In Figure 3.9 of section 3.3.2 above, we have seen that the trained network can generalize very well from the input patterns in the training corpus to those in the test corpora: for all the tolerance values from 0 to 0.5, the maximum difference between the two is at most 2%. The results from the previous section, however, indicate that the differences are much larger for some units than others. But all the sentences in the training and test corpora have been generated by the same grammar, and, as such, the question arises of how well the network can cope with (sequences of) input patterns which it has never seen before. This section addresses that question by presenting the results of a number of experiments with 'deviant' input patterns: sentences which are longer than anything the network has ever seen, which have more levels of embedded clauses, which miss words or contain new ones, etc. In section 3.3.4.6 below, another line for answering the same question will be pursued: it deals with how the trained network copes with fragments of real written English.