|
|
3.3.4.4 Parsing New Words
We've already seen that the network can generalize its knowledge to many types of input which it has not been confronted with during training. But I have been keeping one of the more interesting experiments for this subsection: how does CLASPnet react when it sees words it has never seen before? In a similar situation, humans can usually manage with remarkable skill -- one need only think of Jabberwocky-like nonsense poetry. In order to challenge the network with new words, I added 'lord', 'lady', 'dude', 'computer', 'hug', 'embrace' and 'gargantuan' to the vocabulary. These words were given the appropriate orthographic representation, but no semantic representation at all, as this would have largely defeated the purpose of the experiment.
First, the nouns. The experimental sentences 'the lady love the girl.' and 'the dude chase the sharks.' turn out to be no challenge at all for the network: they are processed as if known nouns are present. So, in cases like these, the network is able to jump from a position in which a noun is still expected to one in which the verb has already been processed. Transposing the known and unknown nouns does not make a difference: 'the woman like the lord.' and 'the woman like the very happy dude near the boat.' are also interpreted correctly. But, to my initial surprise, there was a substantial difference between the parsing of 'the lady love the girl.' and 'the lord love the girl.'. In the second sentence, the dominant clause type unit suddenly changed from Declarative after the initial 'the' to Connector after 'lord'! To make matters worse, no such inconsistency occurs in 'the very happy lord love the girl.' However, the network has a very good reason for its seemingly erratic behavior: recall the 'the more ..., the less ...' constructions referred to in 3.3.3.4 above -- the words 'more' and 'lord' share two of their four letters. During training, the network has learnt to associate the orthographic representation 'm,o,r,e,- ,- ,- ,- ,m,o,r,e' with the Connector unit; the only way it could do so was to make strong excitatory connections leading from this particular orthographic representation all the way to the output unit. Hence, when 'l,o,r,d,- ,- ,- ,- ,l,o,r,d' is seen, the network cannot but notice the similarity with the pattern it already knows, and acts accordingly. The effect only happens when 'lord' follows 'the' because it is only in this context that 'more' should cause the Connector unit to become active. Which explains why 'the very happy lord ...' is processed like 'the very happy lady ...' (Note 24). Finally, as far as 'computer' is concerned, a sentence like 'the eagles like the falcons on the computer.' does not present any difficulties to the network.
Second, the adjective. It turns out that 'gargantuan' functions like a normal adjective for CLASPnet: both 'the very gargantuan tiger chase the wolf.' and 'the men wash the gargantuan woman.' look perfectly normal to the network.
Third, the verbs. The results of using 'hug' and 'embrace' are mixed, but also show that the context in which the words appear is used to help interpret the unknown patterns. So, in the sentence 'the man say to the girl that he be going to hug the woman.' the unknown pattern can only be processed as if it were a 'verb' -- whatever that means to the network. If we turn back to the spatial metaphor, there is only one possible way to go from the network's region after it has seen 'be going to'. When there are more options, the network has to be more careful and this uncertainty becomes apparent in sentences like 'the man hug the woman.' and 'can the man hug the woman on the boat?'. In both cases, the activation value of the Declarative unit drops from a very high value to a low one when the net sees 'hug', and the activation value of the Relative unit increases noticeably. By doing this, it shows that is aware that the input sentence might actually also be 'the man hug like love the wolf.' in which 'hug' would be an unknown personal pronoun with nominative case.
Taking the results from the three categories together, we can conclude that the network does a very good job of parsing unknown words: it takes into account similar words which it already knows about, and also tries to hedge its bets when it is uncertain (Note 25).