28 Jan Summarization and Translation Domains

Translation as a Sample Domain

For our sample domain, we need something that requires expertise, is not trivial, and about which the author knows something. This limits us significantly, so we are taking the easy way out and going with the domain of Machine Translation (MT) of human languages.

We considered the intricacies of communication and translation in Section 6. In the next few posts, I will take what I know about translation and build it into a system design intended to make expert decisions. The main expert task we are trying to solve is word-sense disambiguation: given a word with several possible meanings, we need to select the meaning that is correct in context. In this knowledge domain, multiple constraints such as syntax, semantics, and pragmatics will be used to make that selection. These constraints need to be built into the knowledge representation scheme.
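To make the idea of combining constraints concrete, here is a minimal sketch of word-sense disambiguation by scoring. Everything in it is an assumption for illustration: the toy lexicon, the `Sense` fields, and the weights are hypothetical and are not the knowledge representation scheme developed in this series.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    gloss: str               # short description of this meaning
    part_of_speech: str      # syntactic constraint
    topic_words: frozenset   # semantic constraint: words that tend to co-occur
    domains: frozenset       # pragmatic constraint: situations where the sense fits


# Hypothetical toy lexicon with one ambiguous word.
LEXICON = {
    "bank": [
        Sense("financial institution", "noun",
              frozenset({"money", "loan", "deposit"}), frozenset({"finance"})),
        Sense("edge of a river", "noun",
              frozenset({"river", "water", "fishing"}), frozenset({"outdoors"})),
        Sense("to tilt an aircraft", "verb",
              frozenset({"plane", "turn", "pilot"}), frozenset({"aviation"})),
    ],
}


def disambiguate(word, part_of_speech, context_words, domain):
    """Return the best-scoring sense of `word`, or None if the word is unknown."""
    best, best_score = None, -1
    for sense in LEXICON.get(word, []):
        score = 0
        if sense.part_of_speech == part_of_speech:            # syntax
            score += 2
        score += len(sense.topic_words & set(context_words))  # semantics
        if domain in sense.domains:                           # pragmatics
            score += 1
        if score > best_score:
            best, best_score = sense, score
    return best


sense = disambiguate("bank", "noun",
                     ["we", "went", "fishing", "by", "the", "river"], "outdoors")
print(sense.gloss)
```

The weights (2 for syntax, 1 per matching context word, 1 for domain) are arbitrary. A real system would need to learn or tune them, and would need far richer constraints than simple word overlap.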


At some point in the not too distant future, it will be possible to get instant, highly accurate translations of the text you write in your native language, and automatic summaries of documents and web pages. Much of this blog is about the technology needed to make this possible. But the ideas, and some of the core processes, have been around for decades. In the 1980s, when I lived in Japan, I purchased computers and word processors that had “henkan” (transform) keys that let you type Roman letters, hiragana or katakana on the keyboard and automatically convert the input into kanji. The functions were very accurate (for me anyway) and they made data entry very fast and easy. That functionality is rudimentary compared to translating meaning from one language to another, or even summarization, but similar statistical or heuristic processes can be used as part of much more complex processes.

Summarization is simpler than translation because all you need is a gist of the content in the source language, or in another language. The starting point of these capabilities is understanding the source text. A linear model of interpretation or comprehension is useful for specifying the sequence of processes in this domain.

In a sales transaction, the typical structure includes a buyer, an item to purchase, a price and a form of payment. In this model, any message, whether written, spoken or gesticulated, is constructed by representing the semantic content in syntactically correct structures. The linear model of comprehension, in this case, is to analyze the syntax, then analyze the semantics, to derive the meaning in the original message. This is, of course, a gross oversimplification because it leaves out important elements such as discourse pragmatics, morphology, emphasis in tone of voice, and, perhaps most importantly, context. For the purpose of this brief discussion, however, we will concentrate on the simplified model and add other important elements later.
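As a rough sketch of the simplified linear model, the snippet below analyzes syntax first and semantics second for the sales transaction example. The single sentence pattern, the constituent names, and the role mapping are all hypothetical. Real syntactic analysis needs a grammar, not one regular expression.

```python
import re

# Syntax step: one hypothetical sentence pattern that splits a sentence
# into constituents. A real grammar would cover far more structures.
PATTERN = re.compile(
    r"^(?P<subject>.+?) (?:buys|bought|purchased) (?P<object>.+?)"
    r" for (?P<amount>.+?) (?:with|by|in) (?P<instrument>.+?)\.?$"
)

# Semantics step: map each syntactic constituent to a role in the
# sales-transaction structure (buyer, item, price, form of payment).
ROLE_MAP = {
    "subject": "buyer",
    "object": "item",
    "amount": "price",
    "instrument": "payment",
}


def analyze_syntax(sentence):
    """Return a dict of constituents, or None if the pattern does not match."""
    match = PATTERN.match(sentence.strip())
    return match.groupdict() if match else None


def analyze_semantics(constituents):
    """Relabel constituents with their roles in the transaction."""
    return {ROLE_MAP[name]: text for name, text in constituents.items()}


def comprehend(sentence):
    """Linear model: syntax first, then semantics."""
    constituents = analyze_syntax(sentence)
    if constituents is None:
        return None
    return analyze_semantics(constituents)


print(comprehend("Maria bought a bicycle for 200 dollars with a credit card."))
```

Because each stage only consumes the output of the one before it, anything the syntax step misses is invisible to the semantics step. That is one practical reason the linear model is an oversimplification.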

As we move into the design and development processes of this linguistic expert system, I will try to describe the processes and components in a general way so that they can be applicable to multiple domains of expertise. The linear model of interpretation:

Meaningful Utterance – Syntax = Semantic Core + Expressive Elements = Message

implies that we need knowledge of the language elements that provide the expressiveness from which we can infer the meaning of the source text. These knowledge elements include:

words and phrases	a dictionary or lexicon
combination patterns	syntactic and semantic grammars
detailed directions	rules for the process domain
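The three knowledge elements in the table above can be pictured as a simple container. This is only a sketch under my own assumptions: the class names, the per-language split of lexicon and grammars, and the `missing_elements` helper are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageKnowledge:
    language: str
    lexicon: dict = field(default_factory=dict)    # words and phrases
    grammars: list = field(default_factory=list)   # syntactic and semantic patterns


@dataclass
class DomainKnowledge:
    name: str
    languages: dict = field(default_factory=dict)  # language code -> LanguageKnowledge
    rules: list = field(default_factory=list)      # rules for the process domain

    def missing_elements(self):
        """List the knowledge elements that are still empty."""
        gaps = []
        for code, lang in self.languages.items():
            if not lang.lexicon:
                gaps.append(f"{code}: lexicon")
            if not lang.grammars:
                gaps.append(f"{code}: grammars")
        if not self.rules:
            gaps.append("rules")
        return gaps


translation = DomainKnowledge(
    "translation",
    languages={
        "en": LanguageKnowledge("en", lexicon={"cat": "noun"}),
        "ja": LanguageKnowledge("ja"),
    },
)
print(translation.missing_elements())
```

A checklist like this is mostly useful early in a project, when it helps to see which knowledge still has to be acquired before any processing can be tested.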

The entire process from input to output can be governed by rules. Alternatively, heuristics or conventional programs can control the overall process and invoke rules at specific points. The illustration below shows the knowledge components for summarization and translation. Summarization implies only a single language, while translation requires two (or more). The two processes can be combined in “gist translation”, which simply adds the summarization rules and processes to the translation domain.
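One way to picture a conventional program that invokes rules at specific points is the sketch below. It assumes a toy word-for-word lexicon, a deliberately naive "keep the first sentence" summarization rule, and a hypothetical `run` function. None of these are the actual rules or processes of the system described in this series.

```python
# Hypothetical bilingual lexicon for a toy English-to-French example.
BILINGUAL_LEXICON = {"the": "le", "cat": "chat", "sleeps": "dort"}


def keep_first_sentence(sentences):
    """Toy summarization rule: the gist is the first sentence."""
    return sentences[:1]


def translate_words(sentences):
    """Toy translation rule: word-for-word lookup, unknown words pass through."""
    return [
        " ".join(BILINGUAL_LEXICON.get(word, word) for word in sentence.split())
        for sentence in sentences
    ]


def apply_rules(rules, sentences):
    """Invoke each rule in order on the current list of sentences."""
    for rule in rules:
        sentences = rule(sentences)
    return sentences


def run(text, summarization_rules=(), translation_rules=()):
    """Conventional control flow that calls rule sets at two fixed points."""
    sentences = [s.strip().lower() for s in text.split(".") if s.strip()]
    sentences = apply_rules(summarization_rules, sentences)  # point 1
    sentences = apply_rules(translation_rules, sentences)    # point 2
    return ". ".join(sentences) + "." if sentences else ""


text = "The cat sleeps. The dog barks."
print(run(text, summarization_rules=[keep_first_sentence]))   # summarization only
print(run(text, translation_rules=[translate_words]))         # translation only
print(run(text, [keep_first_sentence], [translate_words]))    # gist translation
```

In this sketch, gist translation is nothing more than running both rule sets in sequence, which mirrors the idea of adding the summarization rules and processes to the translation domain.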

With manufacturing, retail, financial services or other domains, the knowledge elements will be different, but we still begin by creating a canonical or reference model of what is needed to successfully deliver the required outcomes.

Putting it Together

Formal methodologies help systems people do things right the first time. Expert systems and other complex knowledge-based applications often require a cyclic approach, pushing us toward agile methods. The development cycle for knowledge-based and expert systems, as shown below, is similar to that of any other software system.

1	Document and prioritize requirements	Give Me Smart Requirements
2	Assemble the knowledge elements	Identifying and Acquiring Knowledge
3	Select a KR Scheme	Planning and Scheming
4	Build the base model	upcoming post: Models Improve the Runway
5	Define the process flows	Rings of Power: Workflow…
6	Establish the rules	Unlocking the Power of Unruly Systems
7	Test each functional block	Measuring Knowledge
8	Debug and Refine	Land of Code
9	Get user validation and feedback	Immediate Feedback
	Return to step 1

This development cycle may roll through all of these steps a few times before you are ready to give the final product to users. Once it is ready, you can move into the installation and support phases.

Defining Capabilities and Outcomes

Different types of requirements need to be considered:

1) Business requirements:

These are what the business needs in order to improve profits, lower costs, or meet some other compelling need. Business requirements are defined in terms of users and workflows.

2) Technical requirements:

These are technical issues such as company automation standards and interoperability that can be described in terms of platforms and data flows.

Please refer to my recent post on SMART Requirements. Summarization and Translation domains are the main point of my quest for the universal translator. Let’s make sure we understand the real problem before we start spinning code.


Posted by Joe Roushar in Communication, Computing, Enterprise Applications, Knowledge, Linguistics, Mobile Apps, Social Interaction, Software Design, Technology

Post Tagged with cognitive modeling, communication, context, cybernetics, fuzzy logic, information, intelligence, interlingua, interpretation, knowledge, language, ontology, semantics, syntax, translation, understanding
