Stéphan Tulkens

NLP Person

Talk: Lexical representations in theories of word reading

I recently gave a talk about lexical representations at a CoNGA meeting, which is a monthly meeting of a group of philosophers, neuroscientists, and computational linguists.

This session was on the topic of representation, a construct which is at the heart of many theories in cognitive science. We meet every month on one of the campuses of the University of Antwerp, so if you want to come down and have a chat, please do!

You can find the slides here

In this post I want to give an outline of the talk, to give people who weren’t there an idea of what I talked about.

Dictionaries

Reading, regardless of context, is often conceptualized as being a strict conversion from visual information into content.

In this conversion process, a dictionary metaphor is often used: when we see a word, the orthography of that word provides a key which unlocks the content of that word in our minds. Content, in this case, means anything related to that word, e.g. syntax, semantics and phonology. From this relatively harmless metaphor we can glean two assumptions:

  1. There is a strict division between key and content
  2. The orthography is the key to the content

As an example to show the metaphor in action, consider the Wikipedia entry for the Mental Lexicon:

… is a mental dictionary that contains information regarding a word’s meaning, pronunciation, syntactic characteristics, and so on.

In the talk, I used evidence from studies on bilinguals to show that phonology and lexical semantics both play a role in lexical access, and hence should be part of the key of this dictionary. I then argued that doing this makes the whole idea of a dictionary, or of a division between key and content, completely superfluous.

Every part of the content of a word can be used to address this word, and no part of what a word is in our minds has privileged status. The reason we assume that words are addressed using only their orthographic form is that, in word reading, this is literally the only thing we have.
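To make the contrast concrete, here is a minimal sketch of the two views, not a model of actual lexical access. The `Word` fields, the toy entries, the rough transcriptions and the semantic features are all made up for illustration. The first lookup treats orthography as the only key; the second lets any part of a word serve as a cue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    orthography: str
    phonology: str        # rough, illustrative transcription
    semantics: frozenset  # hypothetical semantic features
    language: str


# Toy lexicon; every entry is an assumption made for this sketch.
LEXICON = [
    Word("room", "/rum/", frozenset({"space", "building"}), "english"),
    Word("room", "/rom/", frozenset({"dairy", "food"}), "dutch"),
    Word("cream", "/krim/", frozenset({"dairy", "food"}), "english"),
]

# Dictionary view: orthography is the key, everything else is content.
by_orthography = {}
for word in LEXICON:
    by_orthography.setdefault(word.orthography, []).append(word)


def dictionary_lookup(orthography):
    """Retrieve words by orthographic form only."""
    return by_orthography.get(orthography, [])


def content_addressed_lookup(**cues):
    """Retrieve every word that matches all cues; any field can be a cue."""
    def matches(word):
        for field, value in cues.items():
            stored = getattr(word, field)
            if isinstance(stored, frozenset):
                # For feature sets, a cue matches if the feature is present.
                if value not in stored:
                    return False
            elif stored != value:
                return False
        return True

    return [word for word in LEXICON if matches(word)]
```

In this sketch, `dictionary_lookup("room")` can only return both homographs together, whereas `content_addressed_lookup(semantics="dairy")` or `content_addressed_lookup(orthography="room", language="dutch")` reach words through their meaning or language. Once every field can act as a cue, no field is "the key" any more.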

A good example to illustrate how this assumption is incorrectly operationalized comes from the Interactive Activation model and its bilingual successors, the BIA and BIA+ models, which model the “resting activation” of a word as a direct translation of the frequency of the orthographic form of that word.

The BIA model goes further by claiming that interlingual homographs (e.g. Dutch - English ROOM) need separate orthographic representations for each language, because you can’t simply add the frequencies of language-ambiguous words together and still get a good approximation of reading behavior. The authors concede that this is counterintuitive, but they stick to it. A better move would perhaps be to drop frequency as a central organizational principle.
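As a rough illustration of what “resting activation as a translation of frequency” can look like, the sketch below maps a frequency count onto a resting level on a log scale. The log-linear mapping, the range from -1.0 to 0.0, and all frequency counts are assumptions chosen for the example; they are not the published parameters of the IA or BIA models.

```python
import math


def resting_activation(frequency, max_frequency, floor=-1.0, ceiling=0.0):
    """Map a frequency count to a resting level between floor and ceiling.

    Assumption: a log-linear mapping, where the most frequent word rests
    at `ceiling` and a word with frequency 1 rests at `floor`.
    """
    if frequency < 1 or max_frequency < 1:
        raise ValueError("frequencies must be at least 1")
    if max_frequency == 1:
        return ceiling
    scaled = min(math.log(frequency) / math.log(max_frequency), 1.0)
    return floor + (ceiling - floor) * scaled


# Hypothetical counts for the interlingual homograph ROOM.
MAX_FREQUENCY = 1000
ROOM_ENGLISH = 500
ROOM_DUTCH = 20

# Option 1: one representation per language, each with its own resting level.
separate = {
    "english": resting_activation(ROOM_ENGLISH, MAX_FREQUENCY),
    "dutch": resting_activation(ROOM_DUTCH, MAX_FREQUENCY),
}

# Option 2: one shared representation whose frequency is the sum of both.
summed = resting_activation(ROOM_ENGLISH + ROOM_DUTCH, MAX_FREQUENCY)
```

Under these assumptions the two options give different starting points: the shared representation rests higher than either language-specific one, so the low-frequency Dutch reading inherits an advantage from its English twin. Which option fits reading behavior better is an empirical question that the sketch does not answer.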

While frequency is of course a very good baseline predictor of reaction time, it is likely that frequency is not itself the organizing principle: what we’re really seeing is a property that happens to correlate very well with reading time.
