3.4.5 Left Justification, Right Justification, or Both?
The orthographic representation used for CLASPnet is both left-justified and right-justified (see 3.1.1.1). The idea behind this scheme is to allow the network to become sensitive to prefixes and word stems on the one hand, and to suffixes on the other. To test the degree to which the network benefits from each kind of justification, I ran two new simulations: using training and test corpora of 3,000 sentences (maximum of 20 words per sentence), a network with left-justified orthographic representations was pitted against one with right-justified representations. The results are shown in Figure 3.29.
In this case, the three networks represented were trained and tested on different corpora, so small differences can be ignored. The only notable discrepancy that remains concerns the missed errors of the Infinity, Voice and Polarity units. If we look at those numbers in more detail, we find a difference of only 1% for the Infinity unit; however, the difference is 8% for the Voice unit and 6% for the Polarity unit. While it is difficult to imagine why spotting 'never' and 'not' should be harder for a network using left-justified orthographic representations, it is understandable that detecting the past participle forms of verbs as indicators of the passive voice is easier for a network that has access to right-justified orthographic representations.
The overall conclusion for the justification question seems to be that, while using double-justified representations does not hurt, it also does not result in noticeably better performance. Hence, the orthographic representation used in CLASPnet is unlikely to have skewed the results significantly, if at all.
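To make the difference between the two justifications concrete, the following minimal sketch places words into a fixed number of letter slots. It is an illustration only: the slot count of 8, the padding character, and the function names justify and double_justify are assumptions made for this example and are not taken from the CLASPnet encoding described in 3.1.1.1.

```python
def justify(word, width=8, side="left", pad="_"):
    """Place a word into `width` letter slots (hypothetical helper).

    Left justification anchors the first letter in slot 0, so prefixes
    and stems line up across words. Right justification anchors the
    last letter in the final slot, so word endings line up instead.
    Words longer than `width` are truncated at the far end.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    word = word.lower()
    if side == "left":
        return word[:width].ljust(width, pad)
    return word[-width:].rjust(width, pad)


def double_justify(word, width=8, pad="_"):
    """Concatenate the left- and right-justified slot strings."""
    return justify(word, width, "left", pad) + justify(word, width, "right", pad)


if __name__ == "__main__":
    for w in ("walked", "painted", "unpainted"):
        print(justify(w, side="left"), justify(w, side="right"))
```

In the right-justified form, the participle ending "ed" of "walked" and "painted" occupies the same two final slots, whereas in the left-justified form its position shifts with word length. This is consistent with the suggestion above that right-justified input makes past participles easier to detect, although the sketch says nothing about how large that effect is in practice.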