published: March 2004
Make Your Museum Talk: Natural Language Interfaces for Cultural Institutions

Stefania Boiano (freelance), Giuliano Gaia (freelance), Morgana Caldarini (Jargon), Italy

Abstract

A museum usually talks to its audience through a variety of channels: Web sites, help desks, human guides, brochures. Museums are making a considerable effort to integrate these different means, for example by creating a coherent graphic layout for digital and printed communication, or by making the human help desk reachable via either e-mail or chat. The Web site can be designed to be reachable, or even updateable, by visitors inside the museum via touch screens and wireless devices. Yet these efforts still seem far from achieving real, complete integration, because of the difficulty of creating a coherent and genuinely usable interface for different means and situations. We have all experienced how difficult it is to integrate information coming from different sources, in different formats, inside a common frame like the Web site, and how difficult it is to update it continuously. Moreover, the Web site is simply inaccessible to computer-illiterate people.

One way to achieve deeper integration comes from a new generation of natural language recognition systems and their user-friendly interfaces. These applications are able to understand text input and spoken language coming from any source (e-mail, chat, Web forms, phone). After receiving the input, the system tries to find the appropriate answer by applying complex interpretation rules and by searching different databases inside or outside the museum's infosphere. Once the answer is found, it can be transmitted to the user by a variety of means: Web, e-mail, cell phone messages, vocal messages. Such interfaces can integrate many useful applications: museum mascot, interactive guide, shop assistant, first-level help desk, e-learning tutor and customer care.
It is also easy to imagine a proactive role: since the system can dialogue in real time with users, gathering valuable data about their needs and desires along the way, it could personally invite people to museum events, always in a very interactive and natural way. Is this science fiction, particularly for low-budget museums? Perhaps not. It is now possible to develop powerful and easily maintainable solutions on low-cost platforms, integrating the museum's IT infrastructure with characters based on artificial intelligence. They could act as a front end for natural language recognition engines, speech recognition systems, and existing database and content management applications. The paper presents these new solutions, together with the first case histories of natural language interface usage. Finally, it explores some of the exciting possibilities opened up for museums and cultural institutions by the integration of these innovative means.

Keywords: bot, chatbot, dialog, interaction, natural language interfaces

Introduction

During 2002 we (Stefania Boiano and Giuliano Gaia) launched
ourselves into the project of rebuilding the whole Science Museum of Milan
Web site (www.museoscienza.org).
The aim was to build a more usable, complete and coherent Web site, and
to experiment with new online education tools. The first part to be realized
was a very large section about Leonardo da Vinci, with Flash interactive
educational games and a massive amount of information about Leonardo and
his works (see www.museoscienza.it/leonardo
for an Italian preview). A Shockwave 3D version of the Ideal City imagined
by Leonardo was also under development at the Politecnico di Milano (Gaia & Barbieri 2001).

Chatbots are not a new idea. Eliza, the famous program that simulated a psychiatric interview in the Sixties, was in some ways an ante litteram chatbot. Chatbots started to flourish in Internet Relay Chat (IRC), an enormous chat environment born in 1988 and still commonly used. IRC offered powerful programming options to its more experienced users, so many programmers started creating software to simulate human chatters (Herz 1995). Today there are many different kinds of chatbots able to chat in different languages (see www.agentland.com for examples). Most of them are based on the same principle: the text input of the human counterpart is compared to a knowledge base of sentences and keywords, in order to identify a suitable answer using matching rules set by the programmer. Chatbots are used mainly for entertainment, even if alternative applications are arising in the commercial field (for example, as virtual shop assistants) and in the cultural one (an interesting example of a virtual tour guide was presented at MW 2003; see Almeida & Yokoi 2003). In our project, we teamed up with two specialists in chatbot technology, Morgana Caldarini (co-author of the present paper) and Andrea Manzoni, both founders of Jargon, a Milanese Web agency. In April 2002 they launched an innovative Italian chatbot answering to the name of Alfa (currently online at www.jargon.it).
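The matching principle described above can be illustrated with a minimal sketch (the rules and wording here are invented for illustration; real engines such as Lingubot use far richer pattern languages and larger knowledge bases):

```python
import re

# A toy rule base: each rule maps a keyword pattern to a canned answer.
# Rules are checked in order; the first match wins, so more specific
# patterns should come before more general ones.
RULES = [
    (re.compile(r"\b(hello|hi|ciao)\b", re.I),
     "Hello! Ask me about Leonardo da Vinci."),
    (re.compile(r"\bflying\b.*\bmachine", re.I),
     "Leonardo sketched several flying machines, including the aerial screw."),
    (re.compile(r"\bmona lisa\b", re.I),
     "The Mona Lisa was painted by Leonardo in the early 1500s."),
]

FALLBACK = "I'm not sure I understand. Could you rephrase that?"

def answer(user_input: str) -> str:
    """Return the first matching canned answer, or a fallback."""
    for pattern, reply in RULES:
        if pattern.search(user_input):
            return reply
    return FALLBACK
```

The fallback answer matters as much as the rules: it keeps the conversation going when no pattern matches, which, as described later, is also the signal used to improve the knowledge base.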
Fig. 1 - Alfa, the Italian chatbot created and trained by Jargon (http://www.jargon.it)

The project was quite ambitious and articulated in different steps. Unfortunately, the whole project of rebuilding the Web site was stopped by the Science Museum in December 2002, and with it the chatbot project. We will therefore describe here all the project steps, even though only the first three have been completely or partially implemented.

Step 1: the Bot is Created

The first step is the creation of the bot. This means defining several different aspects.

Application field: as we said before, the first application field we planned for the chatbot was answering questions about Leonardo da Vinci. It seemed a good starting point.
While we started with the chatbot only as a sort of virtual expert on Leonardo, we were already planning to develop the software towards more innovative applications (e-learning, virtual assistant for online booking and shopping, first-level help desk, and so on). In fact, we saw the chatbot not only as a technological gadget to be added to the Web site, but as a "natural language interface" in a much wider sense. As an interface, the chatbot can act as a mediator between user inputs and a variety of data sources.

Technology: the idea of using the chatbot as an interface between data sources and the user oriented us towards a technology suitable for integration with multiple data formats and sources. After an exhaustive analysis, we decided that the Lingubot technology from Kiwilogic (http://www.kiwilogic.com/) was the most suitable for our project.
Target: one of the essential things in every communication project is to define the target carefully. One important feature of bots is that they are able to adapt to the user; i.e. they can change the language and the content of their sentences depending on the user characteristics emerging from the dialogue. Nevertheless, some main coordinates have to be given to the bot; in our case, we decided to focus the virtual guide on youngsters, privileging easy language and concepts.

Language: for a linguistic interface, the choice of language is very important. Kiwilogic offered Lingubots with different linguistic knowledge bases; we chose to start with the Italian one, developed by Jargon, because this would have allowed us to control idiomatic expressions better and to fine-tune the details of the user-bot interactions.

Physical aspect: the chatbot can be associated with an image, a Flash movie or other file types, for example using 3D plugins or streaming video. This means giving the bot not only a face, anthropomorphic or not, but also gestures, facial expressions, and so on, coordinated with its text outputs. Of course the appearance has to be carefully thought out, because it can deeply influence the user's perception of the bot, and therefore the interaction. We decided to avoid the obvious choice of making it look like Leonardo, because it would have been banal and very difficult to realize: how can you program software to talk like a genius? Instead, we decided to make it resemble one of Leonardo's machines (a talking aerial screw), which offered us some advantages.
Personality: strictly connected to the physical aspect is the character's personality. To make a character realistic you have to create not an "answering machine" but something that simulates the complexity of a human relationship, with attitudes that vary according to the conversation. For example, if a bot is insulted by a human being, to be realistic it has to become angry and stay so for a significant amount of time. Facial expressions and gestures can reinforce its personality. So it is important to define the personality well and to shape answers accordingly, even if a certain degree of incoherence is also necessary so that the bot's reactions do not become too predictable. We tailored for our bot a calm and solid behaviour with some irony.

Step 2: the Bot Learns

Once the bot is defined, the difficulty begins: in order to talk, the bot has to learn how to interpret and answer the user's input. As said previously, a Lingubot comes with a pre-set knowledge base of the specific language and a corpus of interpretation rules, but this is only the first floor of a much more complex building. The bot has to be carefully "trained" to give it the ability to accomplish the task it was created for. In our case, as a virtual guide, the bot had to learn a lot about Leonardo and his life. The most difficult task, moreover, is connecting questions to specific answers and understanding when the user is going outside the bot's field of expertise. We had the precious help of two assistants, Davide Radice and Felice Colasuonno, both with a strong humanistic background but without special programming skills or previous experience in chatbot programming. The bot was trained through four different phases, which overall lasted one month.
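The persistent mood described under Personality above, where an insult makes the bot angry for several turns, can be sketched as a small state machine (a simplified illustration, not the actual Lingubot mechanism; the insult words and decay length are invented):

```python
class Mood:
    """Tracks the bot's emotional state across conversation turns."""

    INSULTS = ("stupid", "dumb", "shut up")  # crude detection, illustration only

    def __init__(self):
        self.anger = 0  # number of future turns the bot stays angry

    def observe(self, user_input: str) -> None:
        """Update the mood from one user turn; anger decays over time."""
        if any(word in user_input.lower() for word in self.INSULTS):
            self.anger = 3  # stay angry for the next few turns
        elif self.anger > 0:
            self.anger -= 1

    def tone(self) -> str:
        """The tone the bot should use when phrasing its next answer."""
        return "angry" if self.anger > 0 else "calm"
```

Because the anger counter decays by one per turn rather than resetting immediately, the bot "holds a grudge" for a while, which is exactly the kind of behaviour that makes a character feel less like an answering machine.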
Step 3: the Bot Chats

After receiving the basic training, the bot is able to start written conversations with real users in the pre-defined language. To understand how this happens, let's see how the bot reacts to a user question.
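Schematically, the reaction to a single question can be sketched as a short pipeline: normalize the input, look for a knowledge-base entry whose keywords are all present, and otherwise record the failure and fall back (the knowledge-base entries and wording are invented for illustration):

```python
unanswered = []  # inputs the bot failed on, kept for the trainers

# Toy knowledge base: each entry maps a set of required keywords to an answer.
KNOWLEDGE_BASE = {
    ("leonardo", "born"): "Leonardo was born in Vinci in 1452.",
    ("last", "supper"): "The Last Supper is a mural painted by Leonardo in Milan.",
}

def react(user_input: str) -> str:
    """Normalize the input, look for matching keywords, else fall back."""
    words = set(user_input.lower().replace("?", " ").split())
    for keywords, reply in KNOWLEDGE_BASE.items():
        if set(keywords) <= words:  # all required keywords present
            return reply
    unanswered.append(user_input)   # record the failure for later training
    return "I'm sorry, I don't know. Could you ask me something about Leonardo?"
```

Note that every unrecognized input is logged: as discussed below, these logs are the raw material for widening the knowledge base after launch.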
Of course, the probability of getting the right answer increases with the size of the database; hence, continuous work on widening and fine-tuning the knowledge base is necessary, and a great help in this regard comes from the analysis of users' dialogues. The interpretation of the meaning of human sentences is the most problematic point. It is for this reason that Turing chose man-computer dialogue as his operative test for Artificial Intelligence; even a last-generation bot would not pass it, since chatbots can always be put on the wrong track by linguistic ambiguities in the sentences of the human user, or just by typos. Nevertheless, a well-trained bot can carry on even long conversations in a credible way and perform many tasks efficiently. Once properly trained, the bot is ready to start its life as a virtual expert, since it is able to "understand" most user inputs and try to find appropriate answers. The bot can also open pre-defined Web pages related to the question, and it can recognize returning users by means of cookies, which offers the possibility of a high degree of personalization. When a user is recognized, the bot can scan logs of previous dialogues and tell the user: "I remember you were interested in Leonardo's flying machines. Do you know we are opening a special exhibition on this subject?" Logs generated by dialogues are valuable in several ways.
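Two typical uses of these dialogue logs can be made concrete with a small sketch (the log format is invented for illustration): measuring the recognition rate, and collecting the inputs the bot failed on so that knowledge-base authors can write new rules for them.

```python
def recognition_rate(log_entries):
    """Fraction of user inputs that did not trigger the fallback answer.

    Each entry is a (user_input, bot_reply, used_fallback) tuple.
    """
    if not log_entries:
        return 0.0
    recognized = sum(1 for _, _, used_fallback in log_entries if not used_fallback)
    return recognized / len(log_entries)

def unrecognized_inputs(log_entries):
    """Collect the inputs the bot failed on, for the knowledge-base authors."""
    return [inp for inp, _, used_fallback in log_entries if used_fallback]
```

A figure such as a 97% recognition rate is simply this ratio computed over a large number of logged conversations.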
Like humans, bots never finish learning. By studying the conversations carried on by the bot during its activity, operators can fix conversational situations in which the bot does not give a proper answer, and can identify subjects human users are particularly interested in, enlarging the knowledge base accordingly. In fact, most of the knowledge base work is done after, not before, the bot is published online. At least six months of continuous refining work is usually needed to enable the Lingubot to carry on a good number of conversations on its specific subject. For instance, Alfa, the Jargon bot, performed more than 200,000 conversations in the period April-December 2002. All of these conversations were recorded and analysed, both manually and automatically, by Jargon authors. This led to the inclusion in the Italian Lingubot Basic Knowledge Base of more than 40,000 new terms, 4,500 recognition patterns, and thousands of keywords and interpretation rules. After 9 months of extensive training, Alfa's recognition success rate is now stable at around 97%.

Unfortunately, as mentioned before, in December 2002 the Leonardo project was stopped, so the bot could not undergo the extensive testing scheduled before the launch, which had been planned for February 2003. From here on, we will describe the further steps of the project, which were planned for a period of 9 months from the launch.

Step 4: the Bot Sends e-mail

Once the bot recognizes a user, it can also offer the possibility of receiving e-mails about specific subjects. These e-mails can be traditional text newsletters, but with an extra link to the bot. By clicking that link, the user can start a conversation with the bot directly about the newsletter's content. For example, the user can ask the bot why it signalled a specific conference to him, ask questions about things that are not specified on the conference Web page, or ask about other related events.
In fact, a well-trained bot added to a newsletter can significantly enhance both user satisfaction and the collection of feedback. During our work on the Leonardo section, we had created a virtual postcard page where the user could send friends unusual drawings by Leonardo, together with a message. We saw a high level of interest in and satisfaction with this page. We planned to offer users the possibility of adding to their postcards a link to the bot. This way, users could offer friends the chance to start an unusual conversation about the postcard image. Moreover, users could tell the bot secret messages to be revealed during the conversation with the friend. This was intended as a tool to increase Web site visits, but it would also have had educational side-effects: for example, the bot could focus the conversation on the image, stimulating the user with cultural questions; only after getting a right answer would the bot reveal the message from the user's friend.

Step 5: the Bot Talks

By downloading a free plugin, Web site visitors are able not only to read what the bot says but also to hear its words, thanks to a voice synthesizer. We decided to implement this feature because it made conversations more engaging, especially for pupils, and also more accessible to visually impaired people not equipped with vocal browsers. The main problem with vocal synthesizers is intonation: since synthesizers cannot understand the meaning of a sentence, they pronounce every word singly, with no connection to the other words. This is a major problem especially with languages like Italian that rely strongly on intonation (for example, the interrogative meaning of an Italian sentence is mainly conveyed by intonation). The choice to make the bot look like a machine was also intended to "justify" its unnatural way of talking.
During 2002, AT&T released a new generation of voice synthesizers called "Natural Voices" (http://www.naturalvoices.att.com/), which are able to read English with nearly natural intonation and to give "colour" to sentences. This could significantly improve a bot's talking capabilities.

Step 6: the Bot Listens

Talking is only half of the vocal interaction, and the easy one. What we really wanted was to make our bot able to listen to users, on the Web and outside it. In our opinion, this step is very important because bots could be very effective if used inside the museum, for example in a multimedia kiosk. During 2002, a renowned Italian group of digital artists, Studio Azzurro, tested in the Science Museum an innovative system called "Torkio", a digital character interacting with museum visitors and remotely controlled by a human operator. The success of the installation was high, but the need for skilled operators limited its operating times. Observing the high emotional impact of the digital character on museum visitors, we thought it would be very interesting to put the bot in the museum too, where it could be an effective virtual guide. A "talking kiosk" could invite visitors to play, ask questions, offer answers, show multimedia files related to visitors' questions, collect their e-mails for newsletters, and be an amusing technological exhibit in itself. In this case the limit is technology: to date, speech recognition systems work well only when the system has been trained on a specific speaker's voice, when users pronounce short and predictable sentences, and when there is little environmental noise.
None of these conditions held in our case, since the bot would have to interact with plenty of new users pronouncing long sentences in a natural way, sometimes "worsened" by the emotions (e.g. laughter or anger) created by the conversation, and with a lot of environmental noise. However, a new generation of low-cost and effective speech-recognition software has been announced by DARPA-funded researchers at Carnegie Mellon University (http://www.speech.cs.cmu.edu/). The English and French speaker-independent speech recognition modules are already available, and the Italian module is being developed by Jargon. So an effective listening ability could soon be added to chatbots.

Step 7: the Bot Goes Wireless

Going beyond the Web does not only mean going into the museum as a new-generation interactive kiosk. Kiwilogic gives the Lingubot the ability to interact with users via cell phone messages (SMS) and, more generally, via wireless systems. This opens up interesting new application fields.
Step 8: the Bot Is Integrated with the Museum's Data Sources

Lingubot technology can easily be integrated with external databases (via ODBC/JDBC) and with external applications and sources using HTTP or TCP/IP. Bots can also call applications directly. This means that the bot can get data from any digital source inside or outside the museum's infosphere. In our project, we had planned to integrate the chatbot with several of the museum's existing data sources.
In this way the bot would have been able to provide a range of different services to the user.
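A minimal sketch of this kind of integration, using an in-memory SQLite table (the table, column names and event names are invented; the real system would reach the museum's own databases via ODBC/JDBC or HTTP), shows how the bot could check availability before promoting an event:

```python
import sqlite3

# Hypothetical bookings table the bot consults before promoting anything.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, seats_left INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("Flying Machines Night", 12), ("Anatomy Workshop", 0)],
)

def promotable_events(conn):
    """Return only the events that still have free seats."""
    rows = conn.execute(
        "SELECT name FROM events WHERE seats_left > 0 ORDER BY name")
    return [name for (name,) in rows]

def suggest(conn):
    """A proactive message that respects the current booking situation."""
    events = promotable_events(conn)
    if events:
        return f"Did you know we are hosting '{events[0]}'? Seats are still available."
    return "Ask me anything about the museum!"
```

The point of the sketch is the filter in the SQL query: because the bot reads the live booking data, it never promotes an event that is already full, which is exactly the real-time adaptation described below.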
The key strength of the system was the integration of the different data sources. The bot could adapt its messages in real time to the state of the bookings, for example by not promoting services that were already full or books that were sold out. Of course, the bot is only a first-level help desk; if it is not able to fulfil its mission, for example because it does not understand what the user is asking, it redirects the user to the human help desk.

Step 9: the Bot Extends Itself by Interacting with Its Peers over the Web

Integration could go even further. Since a well-trained bot becomes a specialized database itself, and it is easy to make bots communicate with each other over the Internet, it is also easy to imagine a situation in which a bot at the Science Museum of Milan gets a question about English steam trains and connects to its "colleague" at the British National Railway Museum to find the answer. This would be even easier for more structured tasks such as online bookings or shopping, and could allow us to develop one of the first effective, practical working infrastructures of the "semantic Web".

Conclusions

Every time a new technology hits the market, it has to answer three main questions posed by potential users (including museums).
References

Almeida, P., & S. Yokoi (2003). Interactive Conversational Character as a Virtual Tour Guide to an Online Museum Exhibition. Paper presented at Museums and the Web 2003. http://www.archimuse.com/mw2003/papers/almeida/almeida.html

Cigliano, E., & S. Monaci (2003). Multimuseum: a multichannel communication project for the National Museum of Cinema of Turin. Paper presented at Museums and the Web 2003. http://www.archimuse.com/mw2003/abstracts/prg_200000703.html

Gaia, G. (2001). Towards a Virtual Community. Paper presented at Museums and the Web 2001. http://www.archimuse.com/mw2001/papers/gaia/gaia.html

Gaia, G., & T. Barbieri (2001). HOC - Politecnico and Museum of Science and Technology of Milan: A Collaborative Exploration of Leonardo's Virtual City. Paper presented at ichim 01. http://www.archimuse.com/ichim2001/

Herz, J.C. (1995). Surfing on the Internet. Little, Brown & Company.