3.4.3.2 Training on Only Orthography or Semantics
The second way of comparing the information stored in the orthographical and semantic representations is to train networks on only one of the two and to compare the results. Starting from a single corpus file of 4,000 sentences (maximum length of 20 words), I again generated a semantics-only corpus file, alongside its orthography-only counterpart. Networks with architectures identical to that of CLASPnet were then trained on these corpora (Note 30). After training, both were tested on the same test file, and it is the testing results that are shown in Figure 3.26 (Note 31).
The leftmost bar in the figure indicates the performance of a network which was trained on both types of input, using the same corpus. From Figure 3.26 it is clear that a network which only has access to orthographic information can still be trained to nearly the same level of performance as the network which has access to both. Only in the group of missed errors for the Infinity, Voice, and Polarity units is there a difference of more than 5%. (In fact, for the Voice unit, the performance of the orthography-only network is 2% worse than that of the semantics-only net.) While this may be a comforting thought for syntacticians, Figure 3.26 also illustrates that a network which has only been trained on semantic information can still learn to 'do syntax': even in the worst case, that of the clausal types, nearly 60% of the input patterns are still correctly classified (Note 32). As in the previous case, we can only conclude that local semantic information appears to be of considerable use in detecting properties of clauses.
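The set-up used in this comparison can be illustrated with a small sketch. It is not the code used for CLASPnet; it assumes, purely for illustration, that each input pattern is the concatenation of an orthographic part and a semantic part, and that the unused part is set to zero so that the network architecture stays unchanged. The names `mask_pattern` and `make_corpus` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: a pattern is a pair (orthographic values, semantic values).
Pattern = Tuple[Sequence[float], Sequence[float]]


def mask_pattern(pattern: Pattern, keep: str) -> List[float]:
    """Return the full-width input vector with one part silenced.

    keep is "orthography", "semantics", or "both". The silenced part is
    replaced by zeros, so the input layer keeps the same width.
    """
    orth, sem = pattern
    if keep == "orthography":
        return list(orth) + [0.0] * len(sem)
    if keep == "semantics":
        return [0.0] * len(orth) + list(sem)
    if keep == "both":
        return list(orth) + list(sem)
    raise ValueError(f"unknown input type: {keep!r}")


def make_corpus(patterns: Sequence[Pattern], keep: str) -> List[List[float]]:
    """Derive a single-information-source corpus from the full corpus."""
    return [mask_pattern(p, keep) for p in patterns]


# Toy pattern: 2 orthographic values, 3 semantic values (made-up numbers).
toy = [([1.0, 0.0], [0.5, 0.5, 1.0])]
orth_only = make_corpus(toy, "orthography")
sem_only = make_corpus(toy, "semantics")
```

Because both derived corpora come from the same source patterns and keep the same input width, any difference in test performance can be attributed to the information source rather than to the sentences or the architecture.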
Note 30
It would have been possible to speed up learning by removing the part of the
network which was not receiving any inputs at all.
Note 31
It is worth pointing out that the semantics-only network generalizes a lot
better than the orthography-only network. With the former, the performance
measures differ by only one or two percent, while the latter produces
discrepancies of more than 10% (e.g. 17% for the Complement unit). So,
whatever it is in the semantic representation that the network can use for
the tasks, it is certainly a reliable indicator (see also 3.4.4 below).