Because of the massive expansion of mobile phones in developing regions, emergent users there are rapidly gaining access to Information and Communication Technologies (ICTs). However, they still need appropriate interfaces to use interactive products effectively. Many researchers argue that audio interfaces such as Interactive Voice Response systems (IVRs) are the best fit for emergent users, given their ease of deployment and the strong presence of vocal culture in developing regions. In addition, IVRs are known to improve system usability and task completion rates by preventing users from losing track of interface features, functions, and limitations. Proponents of this view recommend directed dialogue IVRs for novice users. It therefore seems plausible that directed dialogue IVR interfaces could carry over to emergent users the advantages they offer first-time users.
IVRs, however, pose serious usability challenges because of the inherent transience and temporality of audio. Tatchell finds IVR-based services difficult to learn, easy to forget, and confusing. Users must attend closely to the audio prompts that present menu choices and system control features, which places a significant strain on their working memory. Consequently, user interactions with directed dialogue IVRs suffer from "poor referability" and "absence of memory aid." Recent studies focusing on emergent users have reconfirmed these usability difficulties. A less-explored approach to addressing the usability barriers of IVRs is to present coordinated visuals alongside the audio prompts. The present work builds primarily on this approach.