A model with as many variable dimensions as CLASPnet can of
course be extended in many ways.
The Corpus: As has been mentioned above, the most attractive way
of changing CLASPnet would be to train and test the model
exclusively on natural language corpora. One could even mix training
corpora from different languages in order to look at the effects of a
simulated multilingual environment. Another option would be to assemble
a corpus of 'caretaker speech', i.e. the language used to address very
young children. This would make it possible to try to model
the order in which infants acquire different clausal constructions and
become sensitive to properties of clauses.
The Grammar: If one sticks to using a grammar like the
context-free one developed for this project, then one can easily
investigate the ramifications which small changes to the grammar have on
the overall behavior and performance of the network. For example, what
happens when a new construction is added? Does it interfere with other
constructions, or is it seamlessly absorbed? What are the effects of
using more ambiguous words, or of asking the network to spot tenses as
well? All these questions are impossible to answer with corpora of
natural language, because such corpora never offer a high enough degree
of control over what they contain.
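As a minimal sketch of this kind of controlled experiment, the Python fragment below generates a small artificial corpus from a toy context-free grammar, once without and once with an added construction. The grammar, the negation rule and all function names are hypothetical illustrations; they are not the grammar or the code developed for this project.

```python
import random

# Hypothetical toy grammar (an assumption for illustration only).
# Symbols that do not appear as keys are treated as terminal words.
BASE_GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["Vi"], ["Vt", "NP"]],
    "N": [["dogs"], ["cats"]],
    "Vi": [["sleep"]],
    "Vt": [["see"]],
}


def with_construction(grammar, lhs, rhs):
    """Return a copy of the grammar with one extra rule lhs -> rhs."""
    extended = {sym: [list(rule) for rule in rules] for sym, rules in grammar.items()}
    extended.setdefault(lhs, []).append(list(rhs))
    return extended


def generate(grammar, symbol, rng):
    """Expand a symbol into a list of words, choosing rules uniformly at random."""
    if symbol not in grammar:
        return [symbol]  # terminal word
    words = []
    for part in rng.choice(grammar[symbol]):
        words.extend(generate(grammar, part, rng))
    return words


def make_corpus(grammar, n_sentences, seed=0):
    """Generate n_sentences sentences from the start symbol S, reproducibly."""
    rng = random.Random(seed)
    return [" ".join(generate(grammar, "S", rng)) for _ in range(n_sentences)]


# Same grammar plus one new (negated) construction.
NEGATION_GRAMMAR = with_construction(BASE_GRAMMAR, "VP", ["do", "not", "Vi"])
```

Training one network on `make_corpus(BASE_GRAMMAR, ...)` and another on `make_corpus(NEGATION_GRAMMAR, ...)` would then leave the added rule as the only difference between the two conditions, which is the degree of control that natural corpora lack.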
The Architecture: This is not one dimension but a large
collection of them. So, many options present themselves: e.g. increasing
the number of hidden layers, reducing the number of connections between
layers, and adding more modules. While such changes are obviously of
interest to neural network researchers, linguists would probably also
find interesting data if they could analyze how a highly modular network
splits up the different classification tasks among its modules. Even
totally different
architectures could be tried out: e.g. constructive algorithms, which
keep adding units and connections to a simple network till it learns the
task, or interactions with genetic algorithms which pit many similar
networks against one another to evolve the one that is most suited for
the task.
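The growth idea behind constructive algorithms can be sketched, under strong simplifying assumptions, as a loop that keeps enlarging the hidden layer until the task is learned. In the Python fragment below, `train_and_score` is a hypothetical callable standing in for whatever training procedure is used, and the target score is an arbitrary assumption. Unlike real constructive algorithms, which typically add units to the network that has already been trained, this simplified sketch trains a fresh network for every size.

```python
def grow_until_learned(train_and_score, start_units=1, max_units=20, target=0.95):
    """Increase the number of hidden units until the score reaches target.

    train_and_score(n_hidden) is a hypothetical callable: it trains a fresh
    network with n_hidden hidden units and returns a score between 0 and 1.
    Returns (n_hidden, score) for the first size that reaches the target,
    or for the best size tried if none does.
    """
    if start_units > max_units:
        raise ValueError("start_units must not exceed max_units")
    best = None
    for n_hidden in range(start_units, max_units + 1):
        score = train_and_score(n_hidden)
        if best is None or score > best[1]:
            best = (n_hidden, score)
        if score >= target:
            return n_hidden, score  # small enough network that learns the task
    return best  # target never reached: report the best attempt
```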
The Semantic Representation: From a linguistics perspective, this
is probably the most interesting dimension to manipulate. The perceptual units
of the current model could be replaced by more traditional semantic
features, and the effects would be observable in how the network
performs. Increasing the importance of certain senses by increasing the
number of units representing these senses is another option. Perhaps
adding pragmatic information could also turn out to be useful for the
network.
The Orthographic Representation: While English orthography is
stable, the current representation could be replaced by a phonological
or phonetic one. If one adds prosodic information, and maybe even
information about speaker switches, the network could be expected to at
least learn the tasks in a different way. (For Dutch, it would be
amusing to check whether a network like CLASPnet has more or
fewer problems with learning the different spelling systems which have
been proposed.)
The Output Layer: The meaning of the units on the output layer can be
changed as easily as the input, and doing so is also bound to lead to
interesting results. What, for example, would be the effect of dropping
the polarity detection task and asking for the recognition of past
versus present tense instead -- would it make the overall task easier or
not? By experimenting with the presence and absence of these tasks, one
can learn a lot about how they influence one another. When I added the
Infinity unit to CLASPnet, for instance, the scores of all the
other output units were hardly affected, if at all, showing that the
classification task needed for detecting non-finite clauses was already
largely done for other purposes.
Other areas in which the model could be extended are: Natural
Language Processing, where the robustness of connectionist models
could be a welcome addition to (or occasional replacement for) the
traditional models; neurolinguistics and the study of aphasia in
particular, as one could quite easily compare the effects of severing
the connections between layers in the model with what is known about
real patients; and psycholinguistics, where experimental data are
usually combined with models of those data. A final possibility which intrigues
me is to try to model the process of grammaticalization as
described for example by Bybee (1988) and Sweetser (1988): by
continuously training a model with slowly changing data, it should be
feasible to gain insight into how the network adapts its internal
representation as the functions and meanings of the words in the input
evolve.
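The lesioning experiment mentioned above for aphasia could be approximated as follows. This is only a sketch: it assumes that the connections between two layers are stored as a list of rows of weights, which need not match how CLASPnet represents them, and the function name `lesion` is hypothetical.

```python
import random


def lesion(weights, fraction, seed=0):
    """Return a copy of a weight matrix with a fraction of its connections cut.

    weights is assumed to be a list of rows of floats (one row per sending
    unit). A severed connection is modelled by setting its weight to 0.0.
    The original matrix is left untouched.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    rng = random.Random(seed)
    positions = [(i, j) for i, row in enumerate(weights) for j in range(len(row))]
    n_cut = round(fraction * len(positions))
    cut = set(rng.sample(positions, n_cut))  # connections chosen for severing
    return [
        [0.0 if (i, j) in cut else w for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]
```

Testing the network with increasingly lesioned copies of one weight matrix would then show how performance on each output unit degrades as more connections are severed.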
Copyright 1996. All rights reserved.
Ezra Van Everbroeck
Last change: 10 July 1996
http://snow.ucsd.edu/~ezra/msc/43.html