e-PIM: A Conversational Multimodal Interface for a Thin Client

As Third Generation (3G) networks emerge, they provide not only higher data transmission rates but also the ability to transmit both voice and low-latency data within the same session. This paper describes the successful implementation of a multimodal application (voice and text) that uses natural language understanding combined with a WAP browser to access email messages on a telephone handset. We also report on a user trial that evaluated both the multimodal system and a unimodal system that is representative of current products on the market. Participants saw significantly greater value in the multimodal interaction, and rated their experience with the multimodal system significantly more positively than their experience with the unimodal system. They were also significantly faster with the multimodal system and more inclined to use and recommend it. While we expected to see mixed usage of modalities in the multimodal system, speech was the dominant modality, with users falling back to GUI selection only after encountering multiple consecutive speech recognition failures. To our knowledge, this represents the first implementation and evaluation of its kind using this combination of technologies.

By: Jennifer Lai, Stella Mitchell, David Wood, Christopher Pavlovski, Harry Stavropoulos

Published in: RC23182 in 2004


