28 Aug Weight Control for Knowledge
Stochastic Models
Data, information, and knowledge may be stored in many different ways in computers. Most artificial neural models rely heavily on stochastic or probabilistic techniques to establish the internal structure that represents the data. The generalized delta rule for adaptation is an example of this sort of technique. Developed by D.E. Rumelhart, G.E. Hinton, and R.J. Williams, the generalized delta rule extends straightforward delta-rule learning to networks with hidden layers and semi-linear activation functions. Yet implicit in the use of such procedures and formulae are two underlying assumptions: randomness in the network link structure, and rejection of explicit representation of data or knowledge within the network.
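To make this concrete, here is a minimal Python sketch of one generalized-delta-rule update for a tiny network with a single hidden layer and logistic (semi-linear) activations. The network shape, learning rate, and variable names are illustrative assumptions of mine, not anything prescribed by Rumelhart, Hinton, and Williams beyond the basic update rule; note that the starting weights are random, which is exactly the stochastic assumption discussed above.

```python
import numpy as np

# A minimal sketch of the generalized delta rule for one hidden layer,
# using a logistic (semi-linear) activation. The tiny network shape and
# constants are illustrative, not taken from the original paper.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
eta = 0.5                          # learning rate
W1 = rng.normal(size=(3, 2))       # input -> hidden weights (random start)
W2 = rng.normal(size=(2, 1))       # hidden -> output weights (random start)

x = np.array([0.2, 0.7, 0.1])      # one input pattern
t = np.array([1.0])                # target output

# Forward pass
h = sigmoid(x @ W1)                # hidden activations
y = sigmoid(h @ W2)                # output activation

# Backward pass: delta terms, then weight changes (delta_w = eta * delta * activation)
delta_out = (t - y) * y * (1 - y)              # output-layer error signal
delta_hid = (delta_out @ W2.T) * h * (1 - h)   # error propagated to hidden layer

W2 += eta * np.outer(h, delta_out)
W1 += eta * np.outer(x, delta_hid)
```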
The emphasis on randomness may be due to the prior lack of knowledge about the extent of determinacy in neural link formation in the human brain. As this determinacy is becoming more widely accepted, the use of stochastic techniques in neural modeling needs to be seriously re-evaluated. This is not to say that they have no place: attempts to interpret data with multiple possible meanings can be facilitated greatly by statistical ordering of the most likely interpretations in a given context. This statistical ordering is what I mean by “weight control” for knowledge.
| Understanding Context Cross-Reference | |
|---|---|
| Click on these Links to other posts and glossary/bibliography references | |
| Prior Post | Next Post |
| Gnostic Learning Model | |
| Definitions | References |
| | Huk 2012 |
| | Rumelhart 1986 |
| | Hinton 1984 |
I have concluded that implicit knowledge representation is not a satisfactory approach for problems with multiple complex constraints, such as language understanding. Yet I also believe that fuzzy logic, a strong point of neural networks, is extremely valuable in processes with inherent ambiguity or missing data. Polysemy, the fact that words carry multiple meanings, is a major source of ambiguity in human language, and it is often difficult or impossible for computers to extract subtext or exformation from a string of words. I see a world in which every fragment of knowledge has a confidence value assigned to it in the domain or context in which it lives.
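To make that picture concrete, here is a hypothetical sketch of such confidence values: each sense of a polysemous word carries a weight scoped to a context, and interpretations are ordered by that weight. The words, contexts, and numbers are invented purely for illustration.

```python
# A hypothetical sketch of "weight control" for polysemous words: each
# sense carries a confidence value scoped to a context, and candidate
# interpretations are statistically ordered by that weight.

senses = {
    ("bank", "finance"):   [("financial institution", 0.92), ("river edge", 0.03)],
    ("bank", "geography"): [("river edge", 0.85), ("financial institution", 0.10)],
}

def rank_interpretations(word: str, context: str):
    """Return the word's senses for a context, most confident first."""
    candidates = senses.get((word, context), [])
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)

print(rank_interpretations("bank", "geography"))
# [('river edge', 0.85), ('financial institution', 0.1)]
```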
Probabilities implemented as weights, combined with contextual mechanisms, can be extremely useful and elegant, and can increase the speed of neural algorithms. In a system with weighted knowledge concepts, it is also possible to define the weights to favor the more frequently accessed patterns. In other words, in a system that reads large amounts of digital content or listens to speech converted to text, the more often it encounters a certain pattern, the higher that pattern's weight can become. The knowledge concept weights will then reflect both the frequency of actual encounters and the number of different related concepts associated with each one in the processing environment.
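A rough sketch of what such frequency-driven weighting could look like follows; the update rule and the 0.5 blending constant are my own assumptions, not a formula from this post.

```python
# A rough sketch of frequency-driven weighting: every time a pattern is
# encountered in incoming text its count grows, and linking it to related
# concepts also raises its weight. The blend below is an assumption.

from collections import defaultdict

encounters = defaultdict(int)      # how often each concept has been seen
relations = defaultdict(set)       # related concepts per concept

def observe(concept: str, related=()):
    encounters[concept] += 1
    relations[concept].update(related)

def weight(concept: str) -> float:
    # Blend raw frequency with the breadth of associations.
    return encounters[concept] + 0.5 * len(relations[concept])

observe("coffee", related={"cup", "morning"})
observe("coffee", related={"caffeine"})
print(weight("coffee"))            # 2 encounters + 0.5 * 3 relations = 3.5
```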
Complexity and Granularity
Compared to the flow of electricity in silicon chips, storage devices, and motherboards, activation impulses in the brain move at glacial speeds. The statement that neurons are slow, however, may presume that firing involves processing a single instruction or basic procedure, and that the information in a neuron roughly corresponds to a single-bit register. The speed of the processing elements, however, is only one of many components in a complete cognition system. Considering the complexity of modern parallel computers, it seems obvious that much more needs to be considered in benchmarking the brain against the silicon flip-flop. Communications bottlenecks in all current parallel machines, for example, make them so slow, even with fast processing elements, that it may be decades before computer technology approaches the connectivity, speed, and capacity of the brain. And if neurons actually store complex data elements or perform complex functions, fine-grained SIMD architectures can be considered neuromorphic only in a remote sense.
The complexity assumption can be equated with “medium-grained parallelism,” as opposed to the traditional fine-grained parallelism of connectionist theory. In the medium-grained model, each neurode can contain multiple “bits” of information, whether sensory, logical, relative, functional, or any other type of data. This model presupposes a variation of the “grandmother cell” theory in which any data element stored in the brain is explicitly represented in one or more cells, conceptual representations are linked in “semantic fields” to related concepts or data, and data is simultaneously stored or replicated in many different areas or “contexts” of the brain.
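As a speculative sketch only, a medium-grained neurode could be modeled as a record holding several typed data elements, links into semantic fields of related concepts, and the set of contexts in which it is replicated. The class and field names below are illustrative assumptions, not a claim about actual neural storage.

```python
# A speculative sketch of a "medium-grained" neurode: one cell holds
# several typed data elements, links to related concepts in semantic
# fields, and may be replicated across several contexts of the network.

from dataclasses import dataclass, field

@dataclass
class Neurode:
    concept: str
    data: dict = field(default_factory=dict)     # sensory, logical, functional ... elements
    related: set = field(default_factory=set)    # semantic-field links to other concepts
    contexts: set = field(default_factory=set)   # areas of the network holding a copy

grandmother = Neurode(
    concept="grandmother",
    data={"visual": "face pattern", "relational": "mother of a parent"},
    related={"family", "mother", "elder"},
    contexts={"kinship", "faces", "autobiographical memory"},
)
print(grandmother.related)
```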
Let’s take a peek at what’s going on with MIPUS: If he were not so lovable, MIPUS might have been left in the bathroom when his batteries wore out. Though he has the ability to plug himself in, track his battery life, and communicate his needs through specific algorithms, he sometimes adopts a passive-aggressive attitude and ignores his own programming. This capability, or feature, has been omitted from other models. Fortunately, however, one of the kids took pity on him and brought in an extension cord from the bedroom. When he came to, MIPUS began weighing the probability of getting fired again.
MIPUS can act cute when he knows he has messed up. In this case, when he realized he was rescued after a 3-day nap, he struck up a conversation about bungee jumping.