Blog: Digital therapists: are chatbots appropriate for psychotherapy?
In CognitionX’s marketplace intelligence work, we created a framework for describing conversational AI. Something simple but useful. In this framework there are three broad kinds of conversational solutions:
- Concierge: doing things for you. Almost all traditional software can be considered a “non-conversational concierge”.
- Coach: helping you become a better you.
- Companion: designed to make you feel happy; focused on long-term engagement, building trust and intimacy.
Those that can hold more human-like conversations I’ve dubbed “digital beings”. Here’s a pic, with a promotion for CogX, our annual festival of AI in London 10–12 June:
This post is about coaches. The way I see it, coaches span another dimension as well: how therapeutic they are.
- Digital Information Bots: you ask, you get. It may be conversational, but it’s not personalised.
- Digital Guide: Personalised trails through a knowledge space.
- Digital Coach: everything a guide does, with the addition of goals that drive the guidance. They can be physical well-being coaches that help you become physically healthier, or mental well-being coaches that help you become mentally healthier.
- Digital Therapist: engagement is driven by rules grounded in science, such as cognitive behavioural therapy (CBT).
This post is about Digital Therapists. It draws from my personal experience using a chatbot in this space, and academic research showing progress towards helping us become mentally healthier in an affordable, accessible, and safe way.
The paper is Deep learning for language understanding of mental health concepts derived from Cognitive Behavioural Therapy (3 Sep 2018), by Lina Rojas-Barahona, Bo-Hsiang Tseng, Yinpei Dai, Clare Mansfield, Osman Ramadan, Stefan Ultes, Michael Crawford, Milica Gašić.
But before I dive into the research, let me introduce you to Koko. Koko was a Facebook Chatbot that CognitionX categorises as a “broker bot”. You talk to the bot, it talks to another person, and it mediates the discussion. We’ve identified five main models:
Koko: anonymised online human-to-human support
Koko was a project whose site appears to have been recently taken down. You can still see it on the Internet Archive, where a January 2019 snapshot says:
“About Koko — Koko originated at the MIT Media Lab and has collaborations with MIT, Stanford, Harvard, NYU, Columbia, and Cambridge. The company has raised financing from top investors including USV and Omidyar Network. Koko is based in New York City and is actively hiring for engineering, design, product management, and sales.”
They ran a chatbot that you could access from itskoko.com (via Web Archive again):
“A safety net for social networks. Koko offers services that help social networks manage crisis, abuse, and bullying.”
I used Koko here and there while it was up in 2017. It was quite fulfilling to help people who felt sad. Here’s an example: as you can see, I don’t know who they were, but they were clearly real people with real problems. This is a mild example; there was also a “danger of self-harm” option, which I used at least twice.
It was a shame it was taken down. I don’t know why and haven’t been able to contact the authors. Maybe one day I’ll know.
Back to the CBT chatbot research
Now enter the paper I mentioned. Here’s the abstract, plain-English-ized:
In recent years, we have seen deep learning impact speech and text. We introduce a new task: understanding of mental health concepts derived from Cognitive Behavioural Therapy (CBT). We define a set of mental health categories based on CBT principles, get specialist humans to label a collection of mental health conversations, and apply deep learning to see how much can be automated. Our results show that deep learning significantly outperforms traditional approaches. We also believe this approach will be an essential component of future automated systems delivering therapy.
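The paper pits deep learning against traditional approaches for classifying posts into mental-health categories. As a hedged illustration only (this is not the paper’s model, and the example posts and labels below are invented), here is a minimal sketch of the kind of traditional baseline such work compares against: a bag-of-words Naive Bayes classifier, in pure Python:

```python
# Minimal multinomial Naive Bayes text classifier (stdlib only).
# The labels and example posts are invented for illustration; the
# paper's annotated Koko conversations are not public here.
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train(examples):
    """examples: list of (text, label) pairs -> (priors, word counts, vocab)."""
    label_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab

def predict(model, text):
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        # log prior + Laplace-smoothed log likelihood of each token
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

examples = [
    ("i am so worried about tomorrow", "anxiety"),
    ("everything feels worried and tense", "anxiety"),
    ("i feel empty and hopeless", "depression"),
    ("nothing matters and i feel hopeless", "depression"),
]
model = train(examples)
print(predict(model, "worried and tense about tomorrow"))  # → anxiety
```

The paper’s point is that neural models learn richer representations than word-count baselines like this one, which is why deep learning outperformed the traditional approaches on their annotated conversations.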
Koko was used as training data! 4,035 conversations were annotated with three things by experienced psychological therapists, taking about one minute per conversation. They used multiple annotators to cross-check each other. It turns out this last point, cross-checking the annotations, was key. The three annotation types were: thinking errors, emotions, and situations.
Thinking errors:
- Black and white thinking
- Disqualifying the positive
- Emotional reasoning
- Fortune telling
- Jumping to negative conclusions
- Low frustration tolerance
- Mental filtering
Emotions, and the proportion of posts that had them:
- Anxiety: 63%
- Depression: 21%
- Hurt: 20%
- Anger / frustration: 14%
- Loneliness: 7%
- Shame: 6%
- Grief / sadness: 5.7%
- Guilt: 3.3%
- Jealousy: 2%
While the main emphasis was on thinking errors and emotions, the authors also defined a small set of situations, with the proportion of posts for each:
- Bereavement: 2.5%
- Existential: 21%
- Health: 10%
- Relationships: 68%
- School / College: 8%
- Work: 6%
- Other: 5.5%
What was the conclusion?
- The deep-learning approach is state-of-the-art for this task.
- It’s still not as good as the human annotators who labelled the conversations.
- There are lots of ideas on how to improve in future.
My own observation is that this was a very rich (yet small) set of relevant data, and such data is quite rare. I expect progress in this space to depend largely on good example (training) data drawn from real experiences.
Actually, some of this is already happening, albeit only in a private dataset. That’s what my next post will be about: X2.ai and Sara. Stay tuned!
If you want to be notified of the next post and receive weekly plain-English news on AI for speech and text, sign up here and choose “Speaking Naturally”.