Extended AIML

From Jason Shankel's Wiki
AIML, the [http://www.alicebot.org/ Artificial Intelligence Markup Language], is an XML-based standard for implementing Eliza-class chatbots.  AIML scripts are composed of pattern/template pairs, each of which maps an input pattern to an output template.
  
 
==AIML Structure==
 
   <template>#template</template>
 </category>

#pattern is a pattern for a human input.  For example, "I LIKE *" would be a pattern corresponding to human sentences such as "I LIKE BEANS" and "I LIKE MUSIC."

#template is a set of instructions for producing a response.  The most basic template form is just literal text like "ME TOO" or "NOT ON YOUR LIFE."  There are also a number of subtags that can be used in <template> to generate text programmatically.  You can find a complete specification for the standard [http://www.alicebot.org/documentation/ here].

#that is a pattern corresponding to the last thing the bot said.  This allows the bot to create different categories for different contexts, and so to respond sensibly to input patterns like "YES" and "I DON'T THINK SO."  For example, the bot might ask you if you enjoy opera or if you want to play chess.  AIML can use the <that> clause to create two separate categories for "YES" to handle these two cases.

 <category>
   <pattern>YES</pattern>
   <that>DO YOU LIKE OPERA?</that>
   <template>I'M AN OPERA LOVER, TOO</template>
 </category>

 <category>
   <pattern>YES</pattern>
   <that>DO YOU PLAY CHESS?</that>
   <template>WE SHOULD PLAY SOMETIME</template>
 </category>

Categories can be wrapped with a <topic> tag to provide further context:

 <topic name="music">
   <category>
     <pattern>YES</pattern>
     <that>DO YOU LIKE OPERA?</that>
     <template>I'M AN OPERA LOVER, TOO</template>
   </category>
 </topic>

 <topic name="games">
   <category>
     <pattern>YES</pattern>
     <that>DO YOU PLAY CHESS?</that>
     <template>WE SHOULD PLAY SOMETIME</template>
   </category>
 </topic>
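
One way to picture how <that> and <topic> narrow a match is as extra lookup keys next to the input pattern.  The Python sketch below is only an illustration, not how any particular AIML interpreter is built: the CATEGORIES table, the normalize and respond helpers, and the catch-all "GLAD TO HEAR IT" entry are all assumptions made up for this example.

```python
from typing import Optional

# Hypothetical in-memory category table keyed by (topic, that, pattern).
# "*" stands for "any topic" or "any previous bot utterance".
CATEGORIES = {
    ("MUSIC", "DO YOU LIKE OPERA", "YES"): "I'M AN OPERA LOVER, TOO",
    ("GAMES", "DO YOU PLAY CHESS", "YES"): "WE SHOULD PLAY SOMETIME",
    ("*", "*", "YES"): "GLAD TO HEAR IT",  # invented fallback, not from the article
}


def normalize(text: str) -> str:
    """Upper-case the text, drop punctuation and collapse whitespace."""
    kept = "".join(ch for ch in text.upper() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def respond(user_input: str, last_bot_output: str, topic: str = "*") -> Optional[str]:
    """Try the most specific context first, then fall back to wildcards."""
    pattern = normalize(user_input)
    that = normalize(last_bot_output)
    name = topic.upper()
    for key in (
        (name, that, pattern),  # exact topic and exact previous utterance
        (name, "*", pattern),   # exact topic, any previous utterance
        ("*", that, pattern),   # any topic, exact previous utterance
        ("*", "*", pattern),    # no context at all
    ):
        if key in CATEGORIES:
            return CATEGORIES[key]
    return None
```

With this table, respond("yes", "Do you like opera?", "music") picks the opera reply, while the same "yes" after the chess question in the "games" topic picks the chess reply.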

==Extensions==

AIML provides a good basis for an Eliza-class chatbot, but the standard has a number of limitations that make it difficult to extend the apparent intelligence of the system beyond the rather superficial level of Eliza.

To address these limitations, we first identify the three major categories of intelligence that AIML addresses: pattern recognition, context identification and response generation.

===Pattern Recognition===

Pattern recognition in AIML is extremely simplistic.  Patterns consist of literal human language strings and wildcards designated by '*' and '_'.  The only difference between '*' and '_' is that '_' trumps literal matches and literal matches trump '*'.  The distinction is not important for this discussion.  Wildcard values can be referenced in the template section using the <star> tag.

For example, the pattern "I WANT *" corresponds to any sentence starting with "I WANT," including "I WANT A BICYCLE," "I WANT WORLD PEACE" and "I WANT RUNNING DOWN THE STREET JACKALS BAROQUE CHEESE WIFFLEBURY."
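
To make the wildcard behavior concrete, here is a minimal Python sketch that turns a pattern into a regular expression and returns what each wildcard captured, roughly the values <star> would expose.  It assumes a wildcard matches one or more whole words and it ignores the '_' versus '*' precedence; compile_pattern and match are hypothetical names, not part of any AIML interpreter.

```python
import re
from typing import List, Optional


def compile_pattern(pattern: str) -> re.Pattern:
    """Translate an AIML-style pattern into a regular expression.

    Each '*' or '_' token becomes a group that captures one or more words;
    every other token must match literally.
    """
    parts = []
    for token in pattern.split():
        if token in ("*", "_"):
            parts.append(r"(\S+(?: \S+)*)")
        else:
            parts.append(re.escape(token))
    return re.compile(" ".join(parts))


def match(pattern: str, sentence: str) -> Optional[List[str]]:
    """Return the wildcard captures, or None when the sentence does not match."""
    text = " ".join(sentence.upper().split())  # upper-case, single spaces
    found = compile_pattern(pattern).fullmatch(text)
    return list(found.groups()) if found else None
```

Here match("I WANT *", "I want world peace") returns ["WORLD PEACE"], and the nonsense sentence about jackals and baroque cheese matches just as readily, since nothing constrains what a wildcard may capture.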

So right away we see two problems.  First, AIML patterns must specify every idiosyncratic English form of the same idea.  "I WANT *" and "I WOULD LIKE *" and "I WISH I HAD A *" all have to be accounted for separately.  Second, wildcard matches are unconstrained.  Any legal input will match a wildcard, and we have to rely on special processing in the template section to figure out the meaning of the sentence.

For example, "I LIKE MARY" and "I LIKE RUNNING" and "I LIKE CHICKEN" and "I LIKE REMBRANDT" are all expressions of liking something but with different connotations.  Rembrandt was a person like Mary, but you probably don't mean you like him personally.  Chicken is a food and running is an activity.  It's left pretty much up to the template section to figure these things out.

What we'd like is an input matching system that matches sentence patterns (types of sentences) and draws inferences in a knowledge system from those structures.  We want to free AIML from the ambiguities of any particular language and to separate the inferences from dependence on any one input language.  The inferences we draw from statements in Spanish or French should be more or less the same as those derived from English or German.

====Wildcard Labels====

The first extension I made to AIML was to allow multiple <pattern> and <that> specifications to map to a single <template>.  This seemed like a no-brainer, allowing patterns like "I LIKE *," "I ENJOY *" and "I AM FOND OF *" to all map to the same production template.
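
As a rough sketch of the idea, assuming a category simply carries a list of alternative patterns and tries each in turn, a multi-pattern lookup might look like this in Python.  The Category class, the {star} placeholder and the sample template text are inventions for illustration; they are not taken from the actual extension.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Category:
    """One template reachable from several alternative patterns (hypothetical)."""
    patterns: List[str]
    template: str  # "{star}" marks where the first wildcard capture goes


def to_regex(pattern: str) -> re.Pattern:
    """Turn each '*' or '_' token into a group capturing one or more words."""
    parts = [
        r"(\S+(?: \S+)*)" if token in ("*", "_") else re.escape(token)
        for token in pattern.split()
    ]
    return re.compile(" ".join(parts))


def respond(category: Category, sentence: str) -> Optional[str]:
    """Fill the shared template from the first pattern that matches."""
    text = " ".join(sentence.upper().split())
    for pattern in category.patterns:
        found = to_regex(pattern).fullmatch(text)
        if found:
            star = found.group(1) if found.groups() else ""
            return category.template.replace("{star}", star)
    return None


# Three phrasings of the same idea share one made-up template.
LIKES = Category(
    patterns=["I LIKE *", "I ENJOY *", "I AM FOND OF *"],
    template="WHAT DO YOU LIKE ABOUT {star}?",
)
```

Calling respond(LIKES, "I enjoy opera") and respond(LIKES, "I am fond of opera") both produce "WHAT DO YOU LIKE ABOUT OPERA?", which is the whole point of letting several patterns share a template.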

The first problem I encountered was with the wildcards.  Every pattern in a category has to have the same wildcards in the same places, so once a pattern has more than one wildcard, you can run into ordering problems.

For example, "* IS VISITING *" and "* IS HOSTING *" might both map to the same template and support "MARY IS VISITING JOHN" and "JOHN IS HOSTING MARY."  But the indices of the wildcards are reversed: the visitor is wildcard 1 in the first pattern and wildcard 2 in the second.

It's simple enough to avoid this and just use the multi-pattern feature when you know wildcards will match up, but it occurred to me that for readability it might be nice to name wildcards.  So I added a ':' operator thusly:

"*:visitor IS VISITING *:host" (by convention, variable names are lower case and input literals upper)

So now, "MARY" can be matched in "MARY IS VISITING JOHN" with both <star index="1"/> and <star name="visitor"/>.
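
Here is a small Python sketch of how labeled wildcards could be resolved, assuming the label is whatever follows ':' on a wildcard token.  The compile_labeled and bind helpers are hypothetical, not the real implementation.  Each capture stays available by position, and labeled ones are also available by name.

```python
import re
from typing import Dict, List, Optional, Tuple


def compile_labeled(pattern: str) -> Tuple[re.Pattern, List[Optional[str]]]:
    """Translate a pattern whose wildcards may carry ':name' labels.

    Returns the compiled regex plus the label (or None) of each wildcard,
    in the order the wildcards appear in the pattern.
    """
    parts: List[str] = []
    labels: List[Optional[str]] = []
    for token in pattern.split():
        head, _, label = token.partition(":")
        if head in ("*", "_"):
            parts.append(r"(\S+(?: \S+)*)")  # one or more words
            labels.append(label or None)
        else:
            parts.append(re.escape(token))
    return re.compile(" ".join(parts)), labels


def bind(pattern: str, sentence: str) -> Optional[Tuple[List[str], Dict[str, str]]]:
    """Return (captures by position, captures by label), or None on no match."""
    regex, labels = compile_labeled(pattern)
    found = regex.fullmatch(" ".join(sentence.upper().split()))
    if found is None:
        return None
    by_index = list(found.groups())
    by_name = {label: value for label, value in zip(labels, by_index) if label}
    return by_index, by_name
```

Both bind("*:visitor IS VISITING *:host", "Mary is visiting John") and bind("*:host IS HOSTING *:visitor", "John is hosting Mary") yield the same name bindings (visitor MARY, host JOHN) even though the positional captures come out in opposite order.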

====Regular Expressions====

Revision as of 20:49, 4 April 2012
