3.3.4.2 Going Down, Down, Down
Although sentence or clause length in itself does not seem to play a significant role for CLASPnet, we have already encountered evidence that the structural complexity of embedded clauses does confuse the network (see 3.3.3.4). Hence, the next experiment tried to find out how many levels of embedding are supported. The results give cause for limited optimism. A simple sentence like 'the man, who love the woman, sleep.' is processed without any difficulty. In the slightly longer 'the very very happy man, who love the very very happy woman, sleep', however, the network forgets the relative clause type after seeing the second 'very' in the relative clause, although the final 'sleep' is still analyzed correctly as part of the matrix clause. Adding a few more words causes real problems, as in 'the happy man, who love the very very happy woman near the boat, sleep.' Again, the relative clause is lost, but when 'sleep' is seen, the units for Declarative (0.89), Order (0.59), and Connector (0.44) compete for the clausal type analysis, and the second ID unit remains twice as active as the first one. CLASPnet is clearly not well trained for sentences like these.
To my own surprise, however, I found that 'the happy man, whom the woman, who like the cat, love, sleep.' is processed as desired: the ID units move from 1 to 2 to 3, and then back to 2 and 1. The Status unit is highly active only when it has to be. The clausal type units are also correct, as are the Infinity, Voice, and Polarity units. Making the sentences ever more complex, I then tried 'the happy man, whom the woman, whom the cat like on a spaceship, love, sleep', 'the man, whom the women, whom the experts, whom the tigers tease, miss, like, paint.' and even 'the man, whom the women, whom the experts, whom the shark, whom the tigers tease near the boat, miss, love, hate, type.'. The extra prepositional phrase in the first of these sentences did again confuse the network: the Declarative unit became more active than the Relative unit, and only the final 'sleep' was again interpreted as desired. Dropping the prepositional phrase but adding another relative clause showed that the network did indeed lose track of which verb belonged to which clause: no clausal type unit became highly active for 'miss' and 'paint' -- though 'sleep' was correct again. The final sentence held a surprise: although the prepositional phrase did again result in clausal type confusion, the verbs 'miss' and 'hate' led to an active ID2 unit, while the verbs 'love' and 'type' saw an active ID1 unit accompanied by an active Declarative unit. Apparently, the network simply switched back and forth between the only two interpretations that it considered relevant.
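To make the desired ID sequence for such center-embedded sentences concrete, the following minimal sketch computes the clause depth of each word from the commas and relative pronouns alone. It is an illustration only, not part of CLASPnet: the function names (target_clause_ids, collapse) are hypothetical, and it assumes that the target ID simply equals the embedding depth, that every relative clause is opened by a comma followed by 'who' or 'whom', and that any other comma closes the innermost clause.

```python
# Illustrative sketch only; not CLASPnet code. All names are hypothetical.
# Assumption: the desired clause ID equals the current embedding depth.
RELATIVE_PRONOUNS = {"who", "whom"}


def target_clause_ids(sentence):
    """Return (word, depth) pairs for a comma-delimited test sentence."""
    # Separate commas from words and drop the final period.
    tokens = sentence.replace(",", " , ").replace(".", " ").split()
    depth = 1  # the matrix clause has ID 1
    ids = []
    for i, tok in enumerate(tokens):
        if tok == ",":
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if nxt in RELATIVE_PRONOUNS:
                depth += 1  # comma + relative pronoun opens a clause
            else:
                depth -= 1  # any other comma closes the innermost clause
            continue
        ids.append((tok, depth))
    return ids


def collapse(ids):
    """Reduce the per-word depths to the sequence of depth changes."""
    out = []
    for _, d in ids:
        if not out or out[-1] != d:
            out.append(d)
    return out


if __name__ == "__main__":
    s = "the happy man, whom the woman, who like the cat, love, sleep."
    print(collapse(target_clause_ids(s)))
```

Under these assumptions the sentence above yields the sequence 1, 2, 3, 2, 1, and the longest test sentence would require the ID to climb to 5 before returning to 1, one verb at a time.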
How do these ambivalent results compare with those of other models? The study of embedded constructions has been a long-time favorite among connectionists (e.g. Elman 1992; Christiansen & Chater 1994), and it has been shown a number of times that neural networks can become sensitive to embedded structures. In Elman's (1992) model, for example, up to three levels of embedding could be analyzed by the network. But the language input which that network saw was a lot less complicated than the one I have used for CLASPnet, and its training corpus contained a relatively greater proportion of embedded clauses. The findings presented above show that one should be very careful about claims of competence in this area: some embedded constructions may be mastered while others continue to present problems, and long embedded clauses are not as easy to master as short ones. I expect that it will be possible to extend CLASPnet to handle many more such sentences without problems -- perhaps by increasing their frequency in the training corpus, or by simply increasing the size of the network -- but this is mere hand-waving at the moment.