30 May State of the Art in Knowledge Representation
KR Evolves Slowly
The state of the art in computer programming has evolved toward data-driven techniques. In early programs, the data was “hard-coded” into the program, with specific functions operating differently on each data item and type. Gradually, programmers began storing data and templates in separate files and writing orthogonal procedures, which made programs more modular and data files more flexible. The current trend is to simplify the functions even further, storing with the data the information that lets these simple programs perform the data manipulations and inferences needed to accomplish their tasks (data-driven programs). With the rise of object-oriented programming systems (OOPS), programs can actually be bound to or embedded in the data, with the information controlling the machine (see Section 9).
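The progression described above can be sketched in a few lines of Python. This is only an illustration under assumed names: the shipping-cost scenario and the identifiers `shipping_hard_coded`, `RATES`, `shipping_data_driven`, and `Parcel` are hypothetical and do not come from any particular system.

```python
from dataclasses import dataclass

# Hard-coded style: the data (the rates) is buried in the control flow,
# so every new region means editing the function itself.
def shipping_hard_coded(region: str, weight_kg: float) -> float:
    if region == "domestic":
        return 5.0 + 1.0 * weight_kg
    if region == "international":
        return 15.0 + 2.5 * weight_kg
    raise ValueError(f"unknown region: {region}")

# Data-driven style: one simple function, with the pertinent information
# stored alongside the data (a dict here; it could equally be a file).
RATES = {
    "domestic": {"base": 5.0, "per_kg": 1.0},
    "international": {"base": 15.0, "per_kg": 2.5},
}

def shipping_data_driven(region: str, weight_kg: float, rates=RATES) -> float:
    rate = rates[region]
    return rate["base"] + rate["per_kg"] * weight_kg

# Object-oriented style: the behavior is bound to the data it operates on.
@dataclass
class Parcel:
    region: str
    weight_kg: float

    def shipping_cost(self, rates=RATES) -> float:
        return shipping_data_driven(self.region, self.weight_kg, rates)
```

In the data-driven version, supporting a new region means adding an entry to `RATES` rather than writing another branch of code.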
|Understanding Context Cross-Reference|
|Framing Formal Logic|
|Minsky 1975 McCreary 2014|
|LinkedIn Architecture Discussion|
In most companies today, the automation ecosystem consists of networks, systems or “applications”, databases, and data file stores. Many people, when they envision an enterprise-grade technology solution, think of a set of processes built around a database such as Oracle or SQL Server, controlling the flow of information in and out of it. The database is almost always an RDBMS, or Relational Database Management System. ERP, CRM, PLM, ALM, MDM, and many other versions of alphabet soup are built on this architecture, and more money is spent in corporate America on this type of solution than all others put together. This picture shows how one CRM vendor has implemented their solution: you see the interfaces at the top, the coded processes in the middle, and the relational data serving as the foundation for the system.
I think this model is extremely common because it is easy to decompose and understand: processes need data and generate data, so you use the software system to tie different sources of data together and keep manual human activity to a minimum.
Unfortunately, this traditional model doesn’t always work well. Consider Amazon’s quest for high-performance sales transactions from anywhere in the world, 24 hours a day: “They had unlimited licenses for RDBMS software and a consulting budget to attract the best and brightest consultants for their projects. In spite of all that power and money, they eventually realized that a relational model wouldn’t meet their future business needs” (McCreary 2014 pp. 11-12). The description of how they chose to meet the business needs is very telling, and I’ll return to it in future posts.
The move away from code controlling the cybernetic universe, and toward knowledge controlling it, has at least two big advantages: 1) the more closely the code is tied to the data, the smarter the data will be; and 2) adding new code to an existing system becomes easier and more efficient.
|Maturity Stage||Programs||Data|
|Infancy||Large programs on punched cards||Embedded Data|
|Adolescence (Now)||Medium Programs / Services / Apps||Dependent Data|
|Adulthood||Dependent Heuristics / Services / Apps||Independent Data|
It is no accident that I compare our current state in systems design and enterprise architecture, the things I get paid to do day in and day out, to adolescence. In adolescent humans, hormonal changes often create a sense of awkwardness combined with a desire to do something interesting. It is growing maturity that (hopefully) brings greater discipline. As I follow several discussions on Enterprise Architecture on LinkedIn (Tame Problems Only), I find that this characterization is borne out by the variety of opinions on what is important.
Organizations with automation needs can also benefit from greater maturity in making decisions about automation. While some people are beguiled by the newest shiny technology that comes along, it is advisable to make automation decisions only after a solid business case has shown that the benefits justify the costs. This illustration shows a very high-level view of what I consider to be a “state-of-the-art” approach to automated system or solution design. The cylinders generally represent permanently stored data, and workflows and rules are often considered part of the metadata. I’m OK with treating them as either conjoined or separate.
In the next few sections, we investigate the strengths of current data modeling, knowledge representation (KR), and inference techniques, including their ability to degrade gracefully in the face of noise and at the periphery of a system’s knowledge. We discuss methods for distributing knowledge and incorporating fuzzy logic that can be used in intelligent processing techniques and expert systems, showing how such systems might incorporate distributed and neural KR schemata.
You’ll notice that in this post I do not mention RDF, OWL, or any of a number of other popular knowledge models. I’ll save those for other posts. The point of today’s discussion is to establish context for evaluating options.
Types of KR
Digital knowledge is like assorted chocolates: the way it’s wrapped is completely independent of what’s inside. But the more artful the exterior, the more desirable. Knowledge representation has two parts: knowledge and a formalism for representing it. KR goes a step beyond data modeling. Many types of KR schemes are used in AI programs such as expert systems. Expert systems use automatic reasoning techniques to simulate the expertise of a trained human in problem solving, monitoring, and diagnosing, and they may use one or more of the techniques identified in this section. Because these approaches support simulated reasoning, they can also be generalized to AI applications and heuristics other than expert systems.
In this post, and others planned for this section, the strengths of these KR schemes will be discussed and a method for choosing the most appropriate scheme for the application will be proposed. Because rule-based systems are the most common, they will be discussed in detail.
While we are on the subject of knowledge representation, let’s agree on a basic definition of knowledge.
- Data is a symbolic representation of a fact or object unrelated to any other fact or object;
- Information is more than one datum loosely joined together based on some association;
- Knowledge is meaningful information, combined in context to become actionable.
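One way to picture the three definitions is the small Python sketch below. The sensor scenario, the operating limit, and the names `CONTEXT` and `recommend_action` are assumptions made up for illustration.

```python
# Data: an isolated symbol, unrelated to any other fact or object.
datum = 38.9

# Information: several data loosely joined by an association.
information = {"sensor": "boiler-3", "temperature_c": 38.9}

# Knowledge: information combined in context so that it becomes actionable.
# The context here is a hypothetical safe operating limit per sensor.
CONTEXT = {"boiler-3": {"max_temperature_c": 35.0}}

def recommend_action(info: dict, context: dict = CONTEXT) -> str:
    """Turn a reading into an action by comparing it with its context."""
    limit = context[info["sensor"]]["max_temperature_c"]
    if info["temperature_c"] > limit:
        return "reduce load"
    return "no action"
```

The bare number says nothing on its own; only once it is tied to a sensor and compared against a limit does it suggest what to do.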
It is possible to separate dictionary or encyclopedic knowledge from process knowledge, but I prefer to bind them into a single discussion. Process knowledge is often embodied in rules. Rule-based KR schemata generally use a combination of facts and rules. The facts are the input data that is used to activate the rules and determine the output. In rule-based parsers, such as programming language compilers, the facts are lines of program source code. In rule-based expert systems, the facts are generally formatted fragments of data or propositions about the way facts interact in the problem domain.
The MYCIN expert system is an example of a rule-based AI application. Given a set of facts describing the status and symptoms of a patient, MYCIN’s rules compare the conditions of the patient’s situation with the conditions necessary to activate its diagnostic rules. MYCIN itself diagnosed bacterial infections, but the same approach applies elsewhere: if a patient comes in with a complaint of chest pains, a system of this kind will activate the rules for chest pains that best fit the patient. The rules for chest pains for senior citizens will obviously differ somewhat from the rules for high-school athletes. The more complete the set of facts, the more accurate or complete the diagnosis should be.
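The matching of facts against rule conditions can be sketched as follows. This is a toy illustration only: the rules, thresholds, and names (`Rule`, `RULES`, `fire_rules`) are hypothetical, are not medical guidance, and do not reproduce MYCIN’s actual rules or inference strategy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, object]  # e.g. {"complaint": "chest pain", "age": 17}

def senior_chest_pain(f: Facts) -> bool:
    age = f.get("age")
    return f.get("complaint") == "chest pain" and age is not None and age >= 65

def young_athlete_chest_pain(f: Facts) -> bool:
    age = f.get("age")
    return (f.get("complaint") == "chest pain"
            and age is not None and age < 20
            and f.get("athlete") is True)

@dataclass
class Rule:
    name: str
    condition: Callable[[Facts], bool]  # when the rule is activated
    conclusion: str                     # what the rule suggests

# Hypothetical rules, invented for this example.
RULES = [
    Rule("senior-chest-pain", senior_chest_pain,
         "evaluate cardiac causes first"),
    Rule("athlete-chest-pain", young_athlete_chest_pain,
         "evaluate muscle strain first"),
]

def fire_rules(facts: Facts, rules: List[Rule] = RULES) -> List[str]:
    """Return the conclusion of every rule whose condition the facts satisfy."""
    return [rule.conclusion for rule in rules if rule.condition(facts)]
```

With only `{"complaint": "chest pain"}` as input, neither rule is activated, because the age fact is missing. The same complaint with `"age": 70` activates the first rule, which is the sense in which a more complete set of facts supports a more specific result.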
A complete set of facts is necessary for complex rule-based systems such as MYCIN because the process of diagnosis, like many other complex processes, involves integrating a large number of constraints. The single constraint just mentioned, the age of the patient, may lead to widely different assumptions about the cause of the pains. The more valid constraints are applied to a problem, whether it is a hangnail or global climate disruption, the more likely the solution will be useful and lead to progress. These constraints are the stuff of knowledge, and thus of KR. I will show some options in upcoming posts.