Blog: On Conversational Software
Most software is limited. Most software relies solely on the graphical user interface. Most software is not conversational.
Most graphical user interfaces (GUIs) are confusing. GUIs promise to visually guide users toward favorable behaviours. While traditional desktop GUIs often follow well-established conventions, mobile interfaces use such a diversity of styles that people can struggle to distinguish tappable from non-tappable elements (Google, 2019). This leads to false affordances (e.g., a feature that could be mistaken for a button), user frustration, uncertainty, and errors. To clarify the tappability of items in their interfaces, designers often conduct user studies or visual affordance tests. Such studies, however, are time-consuming, and their findings are often limited to a specific app or interface design.
Most GUIs take time to learn. Every app is different, and while clicking buttons is an effective way of performing computing actions, there is a steep learning curve associated with understanding how the visual elements of an app relate to its function. The learning curve can be either logarithmic or exponential. With a logarithmic learning curve (Figure 1), the user learns the software quickly at first; the rate of learning then slows as the user gains experience. With increasing proficiency and experience, the user eventually reaches competence.
With an exponential learning curve (Figure 2), the user has a hard time reaching competency, at least at the beginning. Following an exponential learning curve, the user needs more time to understand how the graphical elements relate to their intent.
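The two curve shapes can be sketched with simple, illustrative formulas. These are stand-in models, not measurements: the functions, rate constant `k`, and time horizon `t_max` below are assumptions chosen only to show how early proficiency differs between the two curves.

```python
import math

def logarithmic_proficiency(t, k=0.5):
    """Saturating curve (Figure 1): fast gains early, then diminishing returns.
    Proficiency approaches 1.0 as experience t grows."""
    return 1 - math.exp(-k * t)

def exponential_proficiency(t, k=0.5, t_max=10):
    """Accelerating curve (Figure 2): slow gains early, rapid gains late.
    Normalized so proficiency reaches 1.0 at t = t_max."""
    return (math.exp(k * t) - 1) / (math.exp(k * t_max) - 1)

# Early on (t = 2), the logarithmic learner is already far ahead:
early_log = logarithmic_proficiency(2)   # roughly 0.63
early_exp = exponential_proficiency(2)   # roughly 0.01
```

The comparison at `t = 2` captures the point of the two figures: a logarithmic curve rewards users almost immediately, while an exponential curve leaves them struggling at the start.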
How can we solve these GUI problems? This is where conversational user interfaces (CUIs) come in. Conversational user interfaces are apps that mimic conversations with real humans. CUIs allow the user to communicate with the computer in natural language rather than in syntax-specific commands.
How can CUIs improve the user experience (UX) of apps? I believe that balanced software combines GUI and CUI. For the purpose of this blog post, I will call this hybrid software conversational software.
As the Founder and CEO at Produvia, I often ask myself: “How can I use AI to build better software for my clients?” For the remainder of this blog post, I will describe conversational software, which goes beyond both the conversational user interface and the graphical user interface.
Imagine you want to chat or speak with your favorite software. What would that look like? You would log into your Gmail, Google Calendar, Pipedrive, Trello, Drift, or Slack account. You would see a chatbot widget in the bottom-right corner. You would click on this icon and see a conversational bot, powered by artificial intelligence.
Second is an intelligent personal assistant that speaks to your software on your behalf.
Let’s say you want to “speak” to your email. You log into your Gmail account. You invoke Second and say:
“Hey Second. Create a draft.”
Second updates the user interface and helps you create this draft. Second also recommends “Send an email” as the next step.
“Second is an intelligent personal assistant that speaks to your software on your behalf.” -Slava Kurilyak
How does Second work?
Second understands your intention and presents you with a response. Second analyzes the intent of each conversation. Let’s revisit the Gmail example. You can give Second action or find commands. Here are a few examples of how to talk to your Gmail account:
- “Create a draft” — Create (but do not send) a new email message.
- “Create an email” — Create and send a new email message.
- “Create a label” — Create a new label.
- “Remove label from email” — Remove a label from an email message.
- “Add label to email” — Add a label to an email message.
- “Find email” — Find an email message.
Behind the scenes, entity recognition and intent classification are used to understand the user’s intent. For the first command (“Create a draft”), the intent is “action” and the entity is “draft”. For the last command (“Find email”), the intent is “find” and the entity is “email”. Once Second has understood your intent and entity, it will act on your behalf and change the GUI of Gmail to accomplish your goal. This means that Second performs actions on your behalf and does not require you to learn new graphical user interfaces.
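To make the intent/entity split concrete, here is a minimal sketch of that mapping. This is a toy rule-based parser, not Second’s actual implementation (which, as described below, uses Dialogflow); the verb and entity vocabularies are assumptions invented for illustration.

```python
# Toy stand-in for intent classification and entity recognition.
# Vocabulary below is hypothetical, not Second's real grammar.
ACTION_VERBS = {"create", "add", "remove", "send"}
FIND_VERBS = {"find", "search", "show"}
ENTITIES = {"draft", "email", "label"}

def parse_command(command: str):
    """Map a natural-language command to an (intent, entity) pair."""
    words = command.lower().replace(".", "").split()
    # Intent: classify by the leading verb.
    if words and words[0] in ACTION_VERBS:
        intent = "action"
    elif words and words[0] in FIND_VERBS:
        intent = "find"
    else:
        intent = None
    # Entity: the first known noun mentioned in the command.
    entity = next((w for w in words if w in ENTITIES), None)
    return intent, entity

parse_command("Create a draft")  # yields ("action", "draft")
parse_command("Find email")      # yields ("find", "email")
```

A production system would replace the hard-coded word lists with a trained classifier and named-entity recognizer, but the input/output contract is the same: command in, (intent, entity) out.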
Second is powered by an AI-driven chatbot that leverages natural language processing and deep learning. Second uses Google Assistant and Dialogflow to learn users’ intents and entities and to guide users through conversations. Google Assistant and Dialogflow are deployed on Google Cloud infrastructure.
Second also guides you towards a better user experience by recommending next actions. Using Second, users receive recommendations for next actions. This is accomplished using Click-Through Rate (CTR) prediction. CTR prediction is a crucial task for recommender systems: it estimates the probability that a user will click on a given item. In other words, CTR prediction observes a user’s behaviour and predicts their next action.
The model that predicts CTR is hosted on Google Cloud and built in TensorFlow (Google, 2017). This is accomplished by combining Google Cloud Dataflow-based preprocessing, Google Cloud ML Engine-based predictions, and Google Cloud Run-based serverless deployments. Where possible, CTR prediction leverages state-of-the-art methods such as xDeepFM, DeepFM, and Deep Interest Network (DIN), available through open-source libraries (DeepCTR, 2019).
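At its core, CTR prediction scores a (user, item) pair with a probability of a click. The sketch below shows the simplest version of that idea, logistic regression, in plain Python. This is a minimal illustration, not the production model described above (which uses TensorFlow and DeepCTR architectures like xDeepFM, DeepFM, and DIN); the feature names, weights, and values are made up.

```python
import math

def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

def predict_ctr(features, weights, bias=0.0):
    """Estimate the probability that the user clicks the recommended action."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return sigmoid(z)

# Hypothetical features for a user who just created a draft in Gmail:
features = {"drafts_created_today": 1.0, "sent_after_draft_rate": 0.8}
weights = {"drafts_created_today": 0.4, "sent_after_draft_rate": 2.0}

# Probability the user will click a "Send an email" suggestion.
p_click = predict_ctr(features, weights)
```

Models like DeepFM keep this same probability-output contract but replace the linear score with learned feature interactions, which is what makes them better at ranking candidate next actions.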
What does the future look like with Second? You talk to Second, and this AI talks to your favorite software to accomplish tasks on your behalf. If you’re interested in learning more about Second, feel free to text “#second” to +1–604–704–3984.