“Sweet Talkin’ Woman” Exploring IPsoft’s #Amelia, an Artificial Agent
Posted on June 15th, 2015
06/15/2015 @WeWork, 69 Charlton Street, 8th floor, NY
Adam Pease @iPsoft presented his research on natural language processing (#NLP). He emphasized the importance of understanding the semantic structure of natural language and the limitations of the “bag of words” approach to automated language understanding.
IPsoft is building Amelia, an artificial agent that automates the work of customer service agents. It conducts a dialogue with users and is designed to follow step-by-step protocols for specific questions (e.g. monthly mortgage calculations), but it also has the flexibility to engage in small talk and to access databases and large corpora of background knowledge. It is also trained to express its level of certainty when presenting an answer.
If Amelia cannot answer a question, it escalates to a human and then listens to the human's answer, attempting to learn the solution so that it can handle similar inquiries in the future.
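The escalate-and-learn loop can be pictured with a toy sketch. This is only an illustration and not IPsoft's implementation: the `ToyAgent` class, the `human_expert` callback and the exact-match lookup are all assumptions made for the example, and a real system would need to recognize similar questions rather than identical ones.

```python
class ToyAgent:
    """Hypothetical agent that escalates unknown questions and remembers the answers."""

    def __init__(self):
        self.learned = {}  # normalized question -> answer heard from a human

    def ask(self, question, human_expert):
        # Normalize case and whitespace so trivially different phrasings match.
        key = " ".join(question.lower().split())
        if key in self.learned:
            return self.learned[key], "agent"
        # Unknown question: escalate to the human, then "listen" to the answer.
        answer = human_expert(question)
        self.learned[key] = answer
        return answer, "human"
```

The first time a question arrives the answer comes from the human; the second time the agent answers on its own without calling the human again.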
The next generation system will have an emotion model and a dialogue model. It could determine the emotion of the user, but a theory is still needed on how to handle the different emotions of users.
From this point, Adam turned to the theory behind his NLP research, with an emphasis on his work on ontology. He first reviewed some of the traditional information retrieval methods used by Amelia:
- Term frequency/inverse document frequency (TF-IDF) – greater importance given to terms that are infrequent across documents
- BM25 – two phases: retrieve documents, then retrieve the relevant sentences within them
- #Word2vec – multidimensional vectors for words, learned from documents. Can train on words and the conjunction of words from documents. Uses multifactor analysis to avoid overfitting the data.
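To make the first two items concrete, here is a minimal sketch in plain Python. It is not Amelia's code: the function names, the smoothed IDF formula, the BM25 parameters `k1=1.5` and `b=0.75`, and the naive sentence splitting on periods are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep only alphanumeric runs (drops punctuation).
    return re.findall(r"[a-z0-9]+", text.lower())

def idf(term, token_lists):
    # Smoothed inverse document frequency: rarer terms get a larger weight.
    df = sum(1 for toks in token_lists if term in toks)
    return math.log((len(token_lists) + 1) / (df + 1)) + 1

def bm25(query, doc, token_lists, k1=1.5, b=0.75):
    # One common BM25 variant; k1 and b are conventional default values.
    avgdl = sum(len(t) for t in token_lists) / len(token_lists)
    counts = Counter(doc)
    score = 0.0
    for term in query:
        f = counts[term]
        if f == 0:
            continue
        df = sum(1 for toks in token_lists if term in toks)
        term_idf = math.log((len(token_lists) - df + 0.5) / (df + 0.5) + 1)
        score += term_idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

def rank(query, texts):
    # Return the texts sorted by BM25 score, best first.
    token_lists = [tokenize(t) for t in texts]
    q = tokenize(query)
    scored = [(bm25(q, toks, token_lists), t) for toks, t in zip(token_lists, texts)]
    return [t for _, t in sorted(scored, key=lambda pair: pair[0], reverse=True)]

def two_phase(query, documents):
    # Phase 1: pick the best document. Phase 2: pick its best sentence.
    best_doc = rank(query, documents)[0]
    sentences = [s.strip() for s in best_doc.split(".") if s.strip()]
    return rank(query, sentences)[0]

documents = [
    "The mortgage rate is fixed. The monthly payment depends on the rate and the term",
    "The store opens at nine. John walks to the store",
    "The weather is sunny. The park is open",
]
best_sentence = two_phase("monthly mortgage payment", documents)
```

Here "the" appears in every document and so gets a lower IDF than "mortgage", which appears in only one. The two-phase search first selects the mortgage document and then the sentence about the monthly payment.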
He then talked about methods for mapping the semantic structure of sentences and why these approaches are important: they have the potential to create novel answers to problems.
The Stanford dependency parser graph extends the idea of sentence diagramming to create structures that can determine that “John walks to the store” is the statement that answers the question “Who walks to the store?”. This is done by matching the node “John” with the node “who”, and the match is robust to non-structural variations such as “ambles” replacing “walks” in the original sentence.
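The node-matching idea can be sketched with hand-written dependency edges. Note that this does not call the Stanford parser: the `(head, relation, dependent)` triples below are typed in by hand using Stanford-style labels, the `SYNONYMS` table is a toy stand-in for a lexical resource, and the `answer` function is a hypothetical helper that only handles a wh-word in the dependent position.

```python
WH_WORDS = {"who", "what", "where"}
SYNONYMS = {"amble": "walk", "stroll": "walk"}  # toy table, assumed for illustration

def normalize(word):
    # Map a lemma to a canonical form so "amble" and "walk" compare equal.
    w = word.lower()
    return SYNONYMS.get(w, w)

def answer(question_edges, statement_edges):
    """Return the word bound to the wh-node if every question edge matches, else None."""
    facts = [(normalize(h), rel, d) for h, rel, d in statement_edges]
    normalized_facts = {(h, rel, normalize(d)) for h, rel, d in facts}
    binding = None
    for head, rel, dep in question_edges:
        head = normalize(head)
        if dep.lower() in WH_WORDS:
            # The wh-word acts as a variable: bind it to whatever fills that slot.
            candidates = [d for h, r, d in facts if h == head and r == rel]
            if not candidates:
                return None
            binding = candidates[0]
        elif (head, rel, normalize(dep)) not in normalized_facts:
            return None
    return binding

# "Who walks to the store?" as lemmatized edges
question = [("walk", "nsubj", "who"), ("walk", "prep_to", "store")]
# "John walks to the store" and "John ambles to the store"
walks = [("walk", "nsubj", "John"), ("walk", "prep_to", "store")]
ambles = [("amble", "nsubj", "John"), ("amble", "prep_to", "store")]
```

Both `answer(question, walks)` and `answer(question, ambles)` bind "who" to "John", because the structure of the two graphs is the same and only the verb lemma differs.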
WordNet is one approach to creating the framework for these structures. George A. Miller at Princeton started its development in the mid-1980s. It is an electronic dictionary containing 100,000 hand-created word senses and semantic links.
Adam then talked about his research on ontology, which relates higher-order logical theory to the theory of language. His “Suggested Upper Merged Ontology” contains 20k items and 80k axioms and is linked with fact databases.
Related technologies include:
- Expert systems – worked, but not visionary. Not reusable in the next project
- Semantic networks – match graph structures
- Semantic web – restricted representation.