25 Aug Determinacy in Neural Connections
For many years, researchers thought it was wrong to assume that a cell or set of cells in the brain stored the memory of Grandma's face. Though the comparison with computer memory was appealing, it was considered too simplistic to be correct. Now, more researchers across academic disciplines are assuming that the brain stores information explicitly. The exact mechanisms are unknown, as no part of the brain contains memory chips, but implicit models are generally being rejected in favor of explicit ones. Unfortunately, most artificial neural systems (ANS) are implicitly modeled and have no clean mechanisms for explicit knowledge representation. This explicitness is a form of determinacy not present in connectionist models. It appears to me that these implicit models are very good at replicating the processes that occur in the first few layers of the visual cortex, but complex tasks beyond facial recognition, such as understanding what Grandma is talking about, remain out of reach for these neural networks. I ended my last post considering the disadvantages of implicit modeling in neural processing systems. Today I would like to revisit some of the groundbreaking work in neural modeling and see what we can gather that will help us in our quest for brain-like systems for understanding human language.
Back to Rosenblatt’s Assumptions
By itself, Rosenblatt's second assumption is absolutely correct. Connection properties between neurons change, and the changes can be long-lasting. When taken in the context of other premises, however, such as the randomness assumption (#1) and the simple processing element notion, the resulting model tends to miss some important neurophysiological realities. Abandoning the elegant non-determinacy of randomness and simplicity has been slow to take hold, even though some of the major players in ANS research discarded these myths years ago. Hinton and his team put the problem in clear perspective: “Most of the key issues and questions that have been studied in the context of sequential models do not magically disappear in connectionist models. It is still necessary to perform searches for good solutions to problems or good interpretations of perceptual input, and to create complex internal representations. Ultimately it will be necessary to bridge the gap between hardware-oriented connectionist descriptions and the more abstract symbol manipulation models that have proved to be an extremely powerful and persuasive way of describing human information processing” (Hinton, et al., 1984, p. 2). Hinton and others have expressed the need for a new model. The ability of a model to bridge the gap between hardware and software models is the key ingredient in sentient computing. The weakness of connectionist models in handling multiple constraints and complex symbol systems can be overcome with an explicit, or gnostic, model.
The appeal of connectionist models lies in their simplicity, their ability to learn new data, and their graceful degradation in the face of partial internal failure and noise. If the elegant technology of neural networks can be applied to complex, symbolic, knowledge-based ANS, perhaps some of these same benefits can be accrued. One might consider how a connectionist net could be incorporated into a larger scheme to serve as a learning front end for a cognition simulator requiring multicoded input, such as images and text. Such a net would probably need to resemble Fukushima's network, the only model we have discussed in which activation is both cyclical and multidirectional. To provide a clearer illustration of the impact of current assumptions on the implementation of ANS, consider the activation illustrations earlier in this section that showed how E/I flows from row to row and layer to layer in a typical neural network. If a temporal element were added to account for residual activation at synaptic junctions, a common phenomenon described earlier, successive inputs would be treated differently than in a typical ANS. In the previous example, the second (alternate) input did not cause a neurode in the second layer to fire, whereas with residual activation the second layer does have a fired neurode. Of course, the temporal restriction common to ANS is necessary to enable the simple and elegant learning algorithms employed. This restriction limits the applicability of such networks to temporally predictable or complete domains. The resulting domain limitation is critical to deciding how to apply ANS in our model.
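The residual-activation idea can be sketched in a few lines. The class name, threshold, and decay constant below are my own illustrative assumptions, not part of any published ANS; the point is only that a sub-threshold input which fails to fire a neurode in isolation can succeed when residue from a prior input remains at the junction:

```python
# Sketch of residual activation at a synaptic junction.
# Threshold and decay values are illustrative assumptions.

class Neurode:
    def __init__(self, threshold=1.0, decay=0.5):
        self.threshold = threshold   # activation level required to fire
        self.decay = decay           # fraction of activation retained per step
        self.activation = 0.0        # residual carried between inputs

    def step(self, excitation):
        # A new input adds to whatever residual activation remains.
        self.activation = self.activation * self.decay + excitation
        fired = self.activation >= self.threshold
        if fired:
            self.activation = 0.0    # reset after firing
        return fired

n = Neurode(threshold=1.0, decay=0.5)
first = n.step(0.8)    # 0.8 alone is below threshold: no firing
second = n.step(0.8)   # residual 0.4 + 0.8 = 1.2: crosses threshold
```

With these assumed constants, the same input of 0.8 fails on the first cycle but fires on the second, mirroring the second-layer firing difference described above; a conventional ANS, which discards activation between inputs, would treat both cycles identically.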
A model incorporating both the implicit representation approach and an arbitrary prior structure is described by Grossberg in Biological Cybernetics (1976, pp. 121-134). The network he describes incorporates both short-term and long-term memory, as well as mechanisms for several specific functions related to image processing and feature detection. Explicit long-term memory models often incorporate complex gnostic cells, like the “grandmother cells” discussed in Section 2, to hold explicit representations. In image-processing systems, gnostic cells have proven to be an effective means of managing complex recognition tasks (Fukushima, 1988, p. 67). Appropriate roles for gnostic cells, and the determination of an appropriate prior structure or preprogrammed starting point for computer models of knowledge and cognition, were discussed in Sections 4 and 5. An even more explicit model of neural processing is the Boltzmann Machine (Hinton, et al., 1984). For symbolic processes such as machine reasoning, expert systems, natural language interpretation, and other applications requiring multi-valued and fuzzy logic, this type of hardware architecture has not yet proven useful or even applicable. It is possible that the reason for this incompatibility is the lockstep cyclic processing mode and the fineness of processing element (PE) granularity, such as single binary bit PE values. These shortcomings could be overcome by introducing more complexity into the algorithm or the PEs so they could store residual activation from one input processing cycle to the next.
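To make the granularity point concrete, here is a minimal sketch of the Boltzmann Machine's stochastic update rule, in which each PE holds a single binary bit and the units are swept in a lockstep cycle. The weights, temperature, and seed are illustrative assumptions, not values from Hinton et al.; the update probability is the standard sigmoid of the unit's net input:

```python
import math
import random

# Sketch of a Boltzmann Machine update sweep over single-bit PEs.
# Weights, temperature, and seed are illustrative assumptions.

def update_unit(states, weights, i, temperature, rng):
    # Net input to unit i from every other binary unit.
    net = sum(weights[i][j] * states[j]
              for j in range(len(states)) if j != i)
    # Probability that unit i turns on, regardless of its current state.
    p_on = 1.0 / (1.0 + math.exp(-net / temperature))
    return 1 if rng.random() < p_on else 0

rng = random.Random(0)
weights = [[0, 2, -1],
           [2, 0, 1],
           [-1, 1, 0]]          # symmetric connections, zero diagonal
states = [1, 0, 1]

# One lockstep cycle: every binary PE is updated in turn.
for i in range(len(states)):
    states[i] = update_unit(states, weights, i, temperature=1.0, rng=rng)
```

Each PE carries exactly one bit and forgets everything else between cycles, which is precisely the fineness of granularity and lack of residual state criticized above.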
More on Rosenblatt’s Assumptions
Earlier we responded to Rosenblatt's first two assumptions, showing why they do not enjoy universal acceptance. The next three assumptions (R-3 through R-5) are more useful in designing neuromorphic learning models.

3. The changes in the network described in assumption 2 result from exposure to a large input sample: similar cells develop pathways to some cells or cell groups, while dissimilar cells form links with others. This assumption appears to be wholly consistent with what is currently known.

4. Positive reinforcement can facilitate, and negative reinforcement hinder, the formation of links in progress. Interaction with the environment is thus the catalyst for network development. This assumption, too, is both physiologically and psychologically consistent.

5. SIMILARITY in this context is described as a “tendency of similar stimuli to activate the same set of cells.” This assumption also follows from the data.

The psychologically oriented assumptions (3 through 5) appear to have stood the test of further research, while the first two fall in the face of new physiological data. Assumption 1 dealt with brain structure, assumption 2 with function, and assumptions 3 through 5 with process. Although Rosenblatt's structural and functional ideas about the brain may have been weak, his ideas about process stand. Assumptions 1 and 2 should be revised. Assumptions 3 and 4 give us excellent clues about learning: repetition and reinforcement are key to the process. If we associate knowledge in a system with confidence values as numeric weights, one possible learning procedure is to create newly encountered knowledge with very low weights and elevate the confidence values every time we encounter corroborating facts. Conceptually, this is what connectionist networks do to build a large store of implicit knowledge. A distributed explicit knowledge representation model with weighted fragments of knowledge could learn using similar rules, formulas, or algorithms.
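The confidence-weighting procedure just described can be sketched in a few lines. The initial weight, increment, and cap below are illustrative assumptions, not values from the model; the shape of the rule is what matters: new knowledge enters with a very low weight, and each corroborating encounter elevates it toward certainty:

```python
# Sketch of confidence-weighted explicit knowledge.
# Initial weight, increment, and cap are illustrative assumptions.

class KnowledgeStore:
    def __init__(self, initial=0.1, step=0.2, cap=1.0):
        self.initial = initial   # weight for newly encountered knowledge
        self.step = step         # boost per corroborating encounter
        self.cap = cap           # confidence never exceeds certainty
        self.facts = {}          # fact -> confidence weight

    def encounter(self, fact):
        if fact not in self.facts:
            # Newly encountered knowledge enters with a very low weight.
            self.facts[fact] = self.initial
        else:
            # Repetition and reinforcement elevate confidence, bounded by cap.
            self.facts[fact] = min(self.cap, self.facts[fact] + self.step)
        return self.facts[fact]

kb = KnowledgeStore()
for _ in range(3):
    confidence = kb.encounter("robins are birds")
```

This is the explicit analogue of connectionist weight adjustment: instead of distributing the evidence across many link strengths, each weighted fragment of knowledge carries its own confidence value that repetition drives upward.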
I discussed the formulas for neural learning in my post Learning from Errors. In upcoming posts, I will attempt to show how a consistent model could be constructed that incorporates the strengths of these intuitive ideas while remaining consistent with neurophysiological realities.