26 Nov Planning and Scheming
Select a Knowledge Representation (KR) Scheme
In prior posts I have been describing the steps of building knowledge systems. A major part of Step 3: Task 1 is defining how to store knowledge – selecting a scheme. Giarratano and Riley (1989) suggest making the selection of a scheme, such as rules, frames or logic, dependent upon the application. Neural approaches or fuzzy reasoning techniques should be considered at this stage as well. When looking at your specific domain, you may find that the constraints are primarily two- or three-dimensional, such as in geographic information systems. My chosen domain for today is Machine Translation (MT). Multiple techniques can be integrated for constraint resolution because MT involves a minimum of two radically different types of constraints: lexical and syntactic. Because many more types of constraints can be applied (morphological, semantic, pragmatic, propositional…), the KR scheme must be carefully chosen to support the inference procedures planned for the interpretation and generation phases of MT.
Understanding Context Cross-Reference |
---|
Click on these Links to other posts and glossary/bibliography references |
Prior Post | Next Post |
Identifying and Acquiring Knowledge | Measuring Knowledge |
Definitions | References |
fuzzy reasoning KR | Thorndyke 1979 Giarratano 1989 |
inference interpretation | Cofer 1976 Bobrow 1975 |
rules frames constraints | Squire 1987 Schank 1986 |
Many problem/solution domains, including MT, often also require an intermediate representation scheme between the real world model and the available digital content. In the case of MT, this intermediate representation could stand for the content of the text divorced from the syntax or lexicon of any particular language. Intermediate schemes are called transfer schemes or interlingua, and they are created on the basis of source-language text analysis and used as the basis for target-language text generation. Here again, the two tasks of Step 3 overlap in that inference techniques, KR, and fact structure tightly interact.
- Rules (as Code or as Data)
- Frames (Defined Frames, Slots and Fillers)
- Logical form (Forward or Backward Chaining)
- Ontology (Taxonomy with Bound Processes)
- Fuzzy logic (Belief or Confidence values)
- Object-oriented (Objects with Inheritance)
- ANS (Trained Neural Network)
- Neuromorphic (Spreading Activation Model)
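To make one of these schemes concrete, here is a minimal sketch of frames with slots and fillers, combined with the inheritance idea from the object-oriented entry. The `Frame` class and the vehicle frames are hypothetical names invented for illustration, not part of any particular KR toolkit.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Frame:
    """A frame: a named bundle of slots, each holding a filler."""
    name: str
    parent: Optional["Frame"] = None
    slots: Dict[str, Any] = field(default_factory=dict)

    def get(self, slot: str) -> Any:
        """Return the filler for a slot, falling back to ancestor frames."""
        frame: Optional["Frame"] = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(f"{self.name} has no filler for slot {slot!r}")


# Hypothetical frames, for illustration only.
vehicle = Frame("vehicle", slots={"wheels": 4, "powered": True})
truck = Frame("truck", parent=vehicle, slots={"carries_cargo": True})
bicycle = Frame("bicycle", parent=vehicle, slots={"wheels": 2, "powered": False})
```

Here `truck.get("wheels")` falls back to the default stored on `vehicle`, while `bicycle` overrides it with its own filler. A production scheme would also need slot constraints and attached procedures, which this sketch leaves out.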
Getting the Facts
Remember, an inference engine deals with two things:
- Rules – Rules often take the form of propositions or patterns.
- Facts – These are contained in the data to be processed.
In a system for sending parcels around the country, the rules include shipping costs, capacities, restrictions, mileages between destinations, and the like. These rules are likely to remain fairly constant in the system (see a system I built). The facts, however, change every day. The facts are the descriptions of the parcels, their weights, where they must go, and by what date, and may include fuel prices, changing tariffs and other cost data. By applying rules to the facts, the system figures out how to get the parcels to their destinations on time in the best and most cost-effective way. Even with rules set in stone, as long as the facts are assessed at the appropriate time, the system can exhibit dynamic behavior by adapting to facts as they evolve. Fuzzy logic can make such systems even more dynamic.
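The split between stable rules and changing facts can be sketched in a few lines. Everything below is an assumption made for illustration: the rule names, the rates, the surcharge threshold and the `quote` function are invented and do not describe the parcel system mentioned above.

```python
from dataclasses import dataclass
from typing import Optional

# Rules: knowledge that stays fairly constant. All values are invented.
RULES = {
    "max_weight_kg": 30.0,
    "base_rate_per_kg": 1.50,
    "fuel_surcharge_threshold": 4.00,  # fuel price above which a surcharge applies
    "fuel_surcharge_factor": 1.10,
}


@dataclass(frozen=True)
class Parcel:
    """A fact: one parcel to be shipped today."""
    parcel_id: str
    weight_kg: float
    destination: str


def quote(parcel: Parcel, fuel_price: float, rules: dict = RULES) -> Optional[float]:
    """Apply the fixed rules to today's facts; return a cost, or None if rejected."""
    if parcel.weight_kg > rules["max_weight_kg"]:
        return None
    cost = parcel.weight_kg * rules["base_rate_per_kg"]
    if fuel_price > rules["fuel_surcharge_threshold"]:
        cost *= rules["fuel_surcharge_factor"]
    return round(cost, 2)
```

The rules never change between calls, yet the same parcel gets a different quote when the fuel price fact crosses the threshold. That is the dynamic behavior described above, in miniature.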
Where do facts come from? In corporate America, chances are good that the facts are stored in databases, but the semantics behind the facts are often harder to get, and may, in fact, differ for different constituencies within a single organization. Capturing and standardizing the definitions for terms, and reducing ambiguity and overloading where possible, is extremely important, and often difficult. When we design an expert system to interface with a database, one of the most important parts of the design phase is to carefully map the fact data from its source into the expert system fact format. This mapping can expose conflicts in the associations between structure and semantics, known as impedance mismatch. In resolving this mismatch, as well as possible differences between similar content in different formats, it is critical to keep track of source data types and formats as well as the access mechanisms used to retrieve data sets for use in functions.
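One way to keep track of source types while mapping database rows into a fact format is to carry provenance on every fact. The sketch below uses Python's built-in `sqlite3` module with an in-memory table. The `parcels` table, its columns and the `Fact` structure are hypothetical, chosen only to show the idea.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, List

LB_TO_KG = 0.45359237


@dataclass(frozen=True)
class Fact:
    subject: str
    attribute: str
    value: Any
    source: str       # where the value came from, e.g. "parcels.weight_lb"
    source_type: str  # the type declared in the source schema


def load_facts(conn: sqlite3.Connection) -> List[Fact]:
    """Map rows of the (hypothetical) parcels table into facts with provenance."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    declared = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(parcels)")}
    facts: List[Fact] = []
    for parcel_id, weight_lb, dest in conn.execute(
        "SELECT id, weight_lb, dest FROM parcels ORDER BY id"
    ):
        # The source stores pounds; the fact format here assumes kilograms.
        facts.append(Fact(parcel_id, "weight_kg", round(weight_lb * LB_TO_KG, 3),
                          "parcels.weight_lb", declared["weight_lb"]))
        facts.append(Fact(parcel_id, "destination", dest,
                          "parcels.dest", declared["dest"]))
    return facts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id TEXT PRIMARY KEY, weight_lb REAL, dest TEXT)")
conn.execute("INSERT INTO parcels VALUES ('p1', 10.0, 'Denver')")
facts = load_facts(conn)
```

Because each fact records its source column and declared type, a unit conversion like pounds to kilograms stays traceable when two constituencies disagree about what "weight" means.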
The fact structure for MT is fairly well-defined: a text in the source language. This can be more complex than meets the eye, in that some languages represent text in ASCII while others, such as East Asian languages, represent their massive character sets and alphabets using different coding schemes such as CIS in China and JIS in Japan.
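In practice, an MT front end usually normalizes every source text to a single internal representation before analysis. The sketch below decodes bytes to Unicode strings with Python's standard codecs. Choosing `shift_jis` and `gb2312` as the example encodings is my assumption, picked because both ship with Python; they stand in for the kinds of national coding schemes mentioned above rather than matching them exactly.

```python
def to_unicode(raw: bytes, encoding: str) -> str:
    """Decode source-text bytes into a Unicode string for downstream analysis."""
    return raw.decode(encoding)


# Sample texts encoded the way a legacy source file might store them.
japanese_bytes = "日本語".encode("shift_jis")
chinese_bytes = "中文".encode("gb2312")
english_bytes = "text".encode("ascii")

japanese = to_unicode(japanese_bytes, "shift_jis")
chinese = to_unicode(chinese_bytes, "gb2312")
english = to_unicode(english_bytes, "ascii")


def is_plain_ascii(raw: bytes) -> bool:
    """Report whether the bytes can be read as ASCII without loss."""
    try:
        raw.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False
```

The encoding has to be known or detected before decoding. `is_plain_ascii(japanese_bytes)` is false, so treating that file as ASCII would fail rather than silently yield the right text.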
Detailed Design Task
The second task in knowledge design (Step 3) is the devil. Up to now we have been defining things at a largely conceptual and logical level, selecting platforms and showing how the pieces will interact with one another. Now is the time to commit to the detailed workflow, data inputs and outputs, and the methods for integrating rules, machine learning and neuromorphic designs.
Step 3: Task 2 – Detailed design ==>
This work is the most important planning and scheming from the perspective of getting the end product to the customer. The sooner you can complete and freeze the deliverables named below, before beginning the build phase, the better your chance of success.
Typical MT interfaces allow the user to specify the source language text, the target language, the lexicon(s) to use, and/or the domain of the source text. An important part of the interface will be dictionary update procedures that permit users to add unknown words to the dictionary. It is also possible to allow the addition of new rules to the grammar.
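A dictionary update procedure can be sketched as a lexicon that reports the words it does not know, so the interface can prompt the user to add them. This is a deliberately naive word-for-word lookup, nowhere near real MT, and the `Lexicon` class and its method names are hypothetical.

```python
from typing import Dict, List, Tuple


class Lexicon:
    """A toy bilingual dictionary with a user-facing update procedure."""

    def __init__(self, source_lang: str, target_lang: str) -> None:
        self.source_lang = source_lang
        self.target_lang = target_lang
        self._entries: Dict[str, str] = {}

    def add(self, source_word: str, target_word: str) -> None:
        """Add or overwrite an entry; lookups are case-insensitive."""
        self._entries[source_word.lower()] = target_word

    def translate_words(self, text: str) -> Tuple[List[str], List[str]]:
        """Look up each word; return (output tokens, unknown words).

        Unknown words pass through unchanged and are collected so the
        interface can ask the user to supply entries for them.
        """
        output: List[str] = []
        unknown: List[str] = []
        for word in text.split():
            target = self._entries.get(word.lower())
            if target is None:
                output.append(word)
                unknown.append(word)
            else:
                output.append(target)
        return output, unknown


lexicon = Lexicon("en", "es")
lexicon.add("parcel", "paquete")
first_pass, missing = lexicon.translate_words("parcel arrives")
lexicon.add("arrives", "llega")  # the user fills the gap
second_pass, still_missing = lexicon.translate_words("parcel arrives")
```

After the update, the second pass has no unknown words left. The same report-then-add loop could be applied to grammar rules, though validating a new rule is much harder than validating a new word.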
These knowledge-update procedures will be an important part of the testing phase because it is implausible to assume that the first prototype will be anything like ready for market. The selected corpus/corpora will be extremely useful in validating the system; in addition, native speakers of each language, as well as translators, should be planned for and used extensively in the verification process.
Detailed Design Artifacts
One of the most important areas to exercise discipline in the development process is in detailed design. If you can match design to requirements, and you wait to start building the system until you have completed the design and have signatures from the sponsor and owner, you can better manage changes in scope and other areas.
Deliverables:
- Data flow design
- Design specification
- Detailed function design
- User interface design
- Report design
The User Interface
If the system we are designing is going to be used by humans, it ought to be user-friendly and tailored to the users’ circumstances and use patterns. Designing a CAD system to be used in a cubicle implies significantly different user behaviors than designing a photo sharing app for a mobile phone. The term “user-friendly” has become a buzzword, and, like many buzzwords, has come to lose most of its original meaning. The process of making a system user-friendly involves more than building it to run on a graphical user interface (GUI). To really achieve a high level of usability, part of the knowledge engineering and design process must involve human-factors engineering.
In the process of performing human-factors engineering and usability testing, we have found ways to make a system easier for people to use by uncovering constraints or characteristics of the problem domain that we had missed in the normal course of knowledge engineering. For example, in a parcel-distribution expert system, we found that by actually giving the user a graphical map of the distribution area (continental US), we were able to discover patterns in the ideal distribution plans that were generated. We could then tailor the tool to get to those solutions more quickly by reordering the KR and rule firing order.
“User-friendly” and “GUI” are not synonymous. In fact, an LEGUI (language enabled graphical user interface) may even serve users better by making computing easier. Before I first wrote this section several years ago, IBM announced the release of a new version of its PC operating system that included speech input for dictation and navigation. This, in combination with the now-familiar graphical tools associated with a windowed environment and point-and-click motif, is likely to make computing easier for users, especially those with visual disabilities. Now, with more people performing more knowledge tasks on mobile devices, the time is even more ripe for language enabled interfaces.
Using the steps I am outlining in my posts, I hope to show ways of not only enabling language-driven activation and processing on computers and devices, but making the apps themselves much smarter by using contextual semantics to truly understand users’ intent.