07 Dec Probability of Understanding Meaning
Some suggest that computers can achieve full language understanding using statistical models alone. Others argue that heuristics or programmatic interpretation, using special procedures tailored to linguistic phenomena, are required. The two camps are as far apart as ever.
Consider the comments around this recent article on Tor.com. On one side, Norvig demonstrates the validity of what Kevin Gold, the author, describes as “truth by statistics”. His results are impressive and difficult to dispute. Yet advocates for more structure (syntax) and meaning (semantics) in the process remain as adamant as ever that Chomsky’s theories are essential to robust automated natural language understanding and true artificial intelligence. Can they both be right? How about the behaviorists?
Understanding Context Cross-Reference
Click on these links to other posts and glossary/bibliography references.

| Prior Post | Next Post |
| --- | --- |
| Language Expressiveness | Modeling Neural Interconnections |

| Definitions | References |
| --- | --- |
| natural language meaning | Chomsky 1986 |
| artificial intelligence | Aho 1972 |
| body language, polysemy | Graesser 1990 |
Growing up in an environment rich in symbolic traffic, children begin to associate and categorize verbal and body language patterns. Those little neurons adapt to the associations between color, texture, shape, sound, emotion and observed outcomes, and begin to develop predispositions to understand meaning in context. The learning that takes place in the human neural network can be described, analytically, post-facto, using statistical formulae. The states of the human neural network before and after the learning can be characterized using statistical formulae. But is learning possible in an environment free of structure, content and meaning: an environment filled with and ruled by mathematical assumptions? I’d like to share my interpretation of the results.
Statistical approaches have proven very successful in language translation. Here is my overly simplified understanding of how they work: the statistical approaches examine large amounts of correctly translated text in the source and target languages, and build statistical models of the probability that each collection and sequence of words in the source language maps correctly to a corresponding collection and sequence of words in the target language. These models may be developed at the sentence, phrase and/or clause level, and may add constraints that improve performance given certain characteristics of the source text. Systems trained on enormous volumes of text in both languages have achieved high levels of translation accuracy.
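To make that concrete, here is a minimal, hedged sketch of the counting step: it builds a tiny word-level translation probability table from a handful of hypothetical aligned sentence pairs. The corpus, the function name `phrase_translation_probabilities`, and the probabilities it produces are illustrative assumptions, not how any particular translation system is implemented; real systems use far larger corpora and proper alignment models.

```python
from collections import defaultdict

# Tiny hypothetical parallel corpus: (source, target) sentence pairs.
# A real system would train on millions of aligned sentences.
corpus = [
    ("the house", "la maison"),
    ("the blue house", "la maison bleue"),
    ("a house", "une maison"),
    ("the car", "la voiture"),
]

def phrase_translation_probabilities(pairs):
    """Estimate P(target word | source word) by simple co-occurrence counting.

    Every source word is paired with every target word appearing in the
    same sentence pair; relative frequency then stands in for the
    alignment models a real translation system would use.
    """
    cooccur = defaultdict(lambda: defaultdict(int))
    for src, tgt in pairs:
        for s in src.split():
            for t in tgt.split():
                cooccur[s][t] += 1

    probs = {}
    for s, targets in cooccur.items():
        total = sum(targets.values())
        probs[s] = {t: count / total for t, count in targets.items()}
    return probs

table = phrase_translation_probabilities(corpus)
# "house" co-occurs with "maison" most often, so "maison" gets the highest probability.
print(table["house"])
```

Even in this toy, the common function word “la” competes with “maison”; given enough data, the statistics sort that out without any representation of what a house is.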
A few key questions arise: is understanding meaning needed for accurate translation? If not, why not? Furthermore, can statistical approaches be used to extract actionable knowledge from text? Arguably, the statistical models do not truly capture any sense of meaning in their wholly mathematical mappings. If that is so, then understanding meaning is not necessary for accurate translation, as long as the statistical models do a good enough job of mapping words and phrases across language boundaries. I believe this is possible because:
- the words themselves carry enough of the meaning to translate, and
- the aggregate groupings of words carry enough context to translate, and
- the statistical maps capture enough of the structure to convey the associations between the concepts and the contexts to accurately translate.
Thus, if the meaning and context remain in the words through the translation process, they never need to be extracted to deliver accurate translations. That is my belief about why statistical methods yield accurate translations. It also explains why I believe the same approaches will not be useful in text summarization, question answering and other natural language processing, learning or artificial intelligence tasks that do not involve multilingual translation.
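A second hedged sketch, continuing the toy above, shows what “the meaning stays in the words” looks like operationally: translation is produced purely by looking up the most probable target word for each source word. The phrase table and its probabilities are invented for illustration, and a real system would add reordering and a target-language model on top of this lookup step.

```python
# Hypothetical phrase table of the kind the counting sketch above would
# produce; the entries and probabilities are illustrative only.
phrase_table = {
    "the": {"la": 0.7, "le": 0.3},
    "blue": {"bleue": 0.8, "bleu": 0.2},
    "house": {"maison": 0.9, "batiment": 0.1},
}

def translate(sentence, table):
    """Translate word by word, picking the most probable target word.

    No semantic representation is ever built or consulted: the output
    comes entirely from the learned word-to-word mappings.
    """
    output = []
    for word in sentence.split():
        candidates = table.get(word)
        if candidates:
            output.append(max(candidates, key=candidates.get))
        else:
            output.append(word)  # pass unknown words through unchanged
    return " ".join(output)

print(translate("the blue house", phrase_table))  # -> "la bleue maison"
```

The output word order is wrong for French (“la maison bleue” would be correct), which is exactly the gap that phrase-level statistics and target-language models close; the point here is only that nothing resembling meaning is extracted along the way.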
One of the most vexing problems in language understanding is polysemy: the fact that one word or phrase may have multiple meanings. If the statistical approach to translation carries the meaning across language boundaries, then the problem of polysemy is either resolved by statistical probabilities or by the reader of the translated text. In coming posts I will attempt to show how ontological structure, meaning and statistical approaches must combine to deliver deep understanding, learning and artificial intelligence.
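As a final hedged sketch, here is one way phrase-level statistics can resolve polysemy without ever representing word senses explicitly. The table entries, probabilities and French glosses are hypothetical, and the longest-match lookup is my own simplification; the idea is simply that a longer matching phrase carries the disambiguating context.

```python
# Hypothetical phrase-level probabilities: "bank" alone is ambiguous between
# French "banque" (finance) and "rive" (riverside), but longer phrases are not.
phrase_table = {
    "bank": {"banque": 0.6, "rive": 0.4},
    "river bank": {"rive": 0.9, "banque": 0.1},
    "bank account": {"compte bancaire": 0.95, "compte de banque": 0.05},
}

def best_translation(phrase, table):
    """Prefer the longest phrase found in the table; fall back to the last word.

    The longer phrase carries the disambiguating context, so the statistics
    pick the appropriate sense without any explicit inventory of senses.
    """
    candidates = table.get(phrase) or table.get(phrase.split()[-1], {})
    if not candidates:
        return phrase
    return max(candidates, key=candidates.get)

print(best_translation("river bank", phrase_table))    # -> "rive"
print(best_translation("bank account", phrase_table))  # -> "compte bancaire"
print(best_translation("bank", phrase_table))          # -> "banque" (most frequent sense wins)
```

Where even phrase-level context is insufficient, the ambiguity simply passes through to the reader of the translated text, which is the second resolution path described above.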