Advantages of Facts over SQL
From Blue Mars Developer Guidebook
Fact Representation
A fact is a triple consisting of a subject, a verb, and an object. Two examples of facts, expressed in English, are "I love chocolate" and "a ball is smaller than a house". Where possible, our system stores facts so that the resulting relationship is many-subjects-to-one-object. An example of such a relationship is the set of currency names and the class :currency.
- [Franc instance :currency]
- [Dollar instance :currency]
- [Peso instance :currency]
There are many subjects tied to a single object using the same verb. Generally we try to pick the object as the broad class or most important concept and the subject as an illustration or modifier of it. This is a convention; it is not required. But it means you can guess how data is stored without actually seeing it in advance.
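The many-subjects-to-one-object convention can be illustrated with a small sketch in Python. This is not the CHAT-L implementation; the `FactStore` class and its method names are hypothetical and exist only to show how indexing on the (verb, object) pair makes "all members of a class" a single lookup.

```python
from collections import defaultdict

# Hypothetical in-memory fact store, for illustration only.
class FactStore:
    def __init__(self):
        self.facts = []                      # every (subject, verb, object) triple
        self.by_object = defaultdict(list)   # (verb, object) -> subjects
        self.by_subject = defaultdict(list)  # (subject, verb) -> objects

    def add(self, subject, verb, obj):
        """Store one triple and index it from both ends."""
        self.facts.append((subject, verb, obj))
        self.by_object[(verb, obj)].append(subject)
        self.by_subject[(subject, verb)].append(obj)

    def subjects(self, verb, obj):
        """All subjects tied to obj by verb, in insertion order."""
        return list(self.by_object.get((verb, obj), []))

    def objects(self, subject, verb):
        """All objects tied to subject by verb, in insertion order."""
        return list(self.by_subject.get((subject, verb), []))

store = FactStore()
for name in ("Franc", "Dollar", "Peso"):
    store.add(name, "instance", ":currency")

currencies = store.subjects("instance", ":currency")
print(currencies)
```

Because the broad class sits in the object position, a caller who knows the convention can ask for `("instance", ":currency")` without first inspecting how the data was entered.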
Advantages of Facts over Rules
Fact knowledge has several advantages over topic rules.
The first advantage, and a big one, is that fact knowledge can be written more compactly than rule knowledge. This makes it easier to create responses rapidly. The following fact declaration says that A.L. Kennedy wrote a book called What You Need.
- =exemplar :author “A.L. Kennedy” :book “What You Need”
Using a couple of generic rules that can access the knowledge base appropriately, this creates the equivalent of:
- (…) “What You Need” is a book.
- (…) An example of a book is “What You Need”.
- (…) A.L. Kennedy is an author.
- (…) An example of an author is A.L. Kennedy.
- (…) A.L. Kennedy wrote “What You Need”.
- (…) “What You Need” was written by A.L. Kennedy.
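To make the compactness argument concrete, here is a rough Python sketch of how one exemplar declaration could fan out into the six statements listed above. It is an illustration under assumptions, not how CHAT-L's generic rules are written: the `RELATION_PHRASES` table, `article`, and `expand_exemplar` are all hypothetical names.

```python
# Hypothetical phrasing table: (subject class, object class) -> (active, passive).
RELATION_PHRASES = {("author", "book"): ("wrote", "was written by")}

def article(noun):
    """Pick 'a' or 'an' with a simple first-letter vowel check."""
    return "an" if noun[0].lower() in "aeiou" else "a"

def expand_exemplar(subj_class, subj, obj_class, obj):
    """Generate the statements implied by one exemplar declaration."""
    active, passive = RELATION_PHRASES[(subj_class, obj_class)]
    sentences = []
    # Two class-membership statements for the object, then two for the subject.
    for cls, member in ((obj_class, obj), (subj_class, subj)):
        sentences.append(f"{member} is {article(cls)} {cls}.")
        sentences.append(f"An example of {article(cls)} {cls} is {member}.")
    # The relationship itself, in both directions.
    sentences.append(f"{subj} {active} {obj}.")
    sentences.append(f"{obj} {passive} {subj}.")
    return sentences

statements = expand_exemplar("author", "A.L. Kennedy", "book", '"What You Need"')
for line in statements:
    print(line)
```

One declaration plus a fixed set of generic templates yields six statements, and each further author and book pair costs only one more declaration.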
The second advantage is that you get classes of objects that can be used exactly like a single word by the CHAT-L pattern matcher. You write a pattern like Have I heard of :author and it reacts to every author the system knows about.
The third advantage is the actual reason for having a knowledge base: you can do inferencing to generate results you didn’t write an explicit rule for. Imagine someone asks Which is smaller, bread or a house? If you seriously want to answer this kind of question instead of stalling, you’d have to write a lot of patterns with outputs, which is impractical. Yet this sort of thing can be represented efficiently in the knowledge base on a gross scale. It only works on a gross scale because entering actual size data for every object in the world is not reasonable, particularly since sizes of objects vary. But the knowledge base can know that microorganisms are smaller than insects, insects are smaller than food, food is smaller than a car, a car is smaller than a building, and buildings are smaller than cities. This can be done quite efficiently and covers a lot of ground.
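The size example amounts to a reachability check over a chain of "smaller than" facts. The Python sketch below shows the idea with a breadth-first search. It is only a sketch: the class memberships for bread and house in `MEMBER_OF` are assumptions added for illustration, and the function names are hypothetical rather than part of CHAT-L.

```python
from collections import deque

# Coarse "smaller than" facts taken from the paragraph above.
SMALLER_THAN = {
    "microorganism": ["insect"],
    "insect": ["food"],
    "food": ["car"],
    "car": ["building"],
    "building": ["city"],
}

# Assumed class memberships, added for illustration.
MEMBER_OF = {"bread": "food", "house": "building"}

def is_smaller(a, b):
    """True if class b is reachable from class a via smaller-than facts."""
    seen = {a}
    queue = deque([a])
    while queue:
        current = queue.popleft()
        for bigger in SMALLER_THAN.get(current, []):
            if bigger == b:
                return True
            if bigger not in seen:
                seen.add(bigger)
                queue.append(bigger)
    return False

def which_is_smaller(x, y):
    """Return the smaller of x and y, or None if the facts cannot decide."""
    cx, cy = MEMBER_OF.get(x, x), MEMBER_OF.get(y, y)
    if is_smaller(cx, cy):
        return x
    if is_smaller(cy, cx):
        return y
    return None

print(which_is_smaller("bread", "house"))
```

Five size facts and two membership facts are enough to answer the question without any rule that mentions bread and houses together, which is the point of working at a gross scale.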
The fourth advantage is the ability to save lists of things for or about the user. User variables can save isolated words and text, but they are not easily manipulated to save related collections of information (like the contents of a shopping cart or all the pets you have ever owned).
Fact System vs Relational Databases
One might have tried to integrate an SQL relational database into the chat scripting language for handling collections of facts. A relational database stores data as records in tables. Each record has a number of named fields. You can require the records in a table be unique by declaring a primary key and you can cross-connect tables by using the primary key from one table as a field in another table. Thereafter it’s all a matter of using queries to retrieve fields from tables according to some criteria. The virtues are avoiding repeated data (except for the keys), speed of table searches, and preexisting debugged code.
But SQL is not efficient for facts. Running multiple queries is relatively expensive, and it is impractical to use SQL queries to traverse a graph structure, which is what some facts in the knowledge base form.
One way to think of the fact system is as a restricted relational database combined with graph traversal algorithms. Each record consists of two fields (subject and object), either of which can be a key. The only datatypes supported are strings and numbers. The verb acts as a field label relative to some main entry (which is the primary key if the table has one). Linking records from two tables involves using the same key. For example, a record about an item with the fields kind, price, and weight is represented like this:
- [item1 is watch]
- [50 costs item1]
- [20 weighs item1]
Another table may have records that specify inventory for specific stores.
Fact queries can then locate any or all of this data using one or more queries. Queries perform graph traversals under a variety of constraints, so many questions can be answered with a single query.
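As a rough illustration of treating facts as a record, the Python sketch below gathers every fact that mentions a key in either position and uses the verb as the field label. The `record_for` helper is hypothetical and is not a CHAT-L query primitive; the facts are the three item1 facts shown earlier.

```python
# The three item1 facts from above, as (subject, verb, object) triples.
FACTS = [
    ("item1", "is", "watch"),
    (50, "costs", "item1"),
    (20, "weighs", "item1"),
]

def record_for(key, facts):
    """Collect fields for key, which may appear as subject or object."""
    record = {}
    for subject, verb, obj in facts:
        if subject == key:
            record[verb] = obj       # key is the subject: value is the object
        elif obj == key:
            record[verb] = subject   # key is the object: value is the subject
    return record

item = record_for("item1", FACTS)
print(item)
```

Because either field can serve as the key, the same scan works whether the item was written as the subject or as the object of a fact.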
Thus the fact system can act as a built-in database for the chatbot. The query capability also goes beyond SQL because it has primitives that directly support chat. A key one is a query that dissects a field into its text subwords and then does a summing match of them against the user input, to find the facts with the most hits in the input. It allows you to write a keyword object for each item and find the fact subjects that most closely resemble the input. E.g., the fact
- [blue_watch_Timex keywords item1]
could be queried so that any reference to blue or watch or Timex could retrieve item1.
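The summing match can be sketched in Python as follows. This is an approximation under assumptions, not the actual query primitive: the underscore splitting, the scoring, and the `best_matches` function are hypothetical, and the second keyword fact (item2) is invented purely to give the ranking something to compare against.

```python
import re

KEYWORD_FACTS = [
    ("blue_watch_Timex", "keywords", "item1"),
    ("red_clock_Casio", "keywords", "item2"),  # hypothetical second item
]

def best_matches(user_input, facts):
    """Rank keyword-fact objects by how many subject subwords occur in the input."""
    words = set(re.findall(r"[a-z0-9]+", user_input.lower()))
    scores = {}
    for subject, verb, obj in facts:
        if verb != "keywords":
            continue
        subwords = subject.lower().split("_")
        hits = sum(1 for w in subwords if w in words)
        if hits:
            scores[obj] = scores.get(obj, 0) + hits
    # Most hits first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

ranked = best_matches("Do you have a blue Timex?", KEYWORD_FACTS)
print(ranked)
```

Any one of the subwords is enough to retrieve item1, and inputs that mention more of them push it higher in the ranking.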
