Why a Digital Therapeutic for Cannabis?

Cannabis use disorder (CUD) is one of the most undertreated conditions in psychiatry. The numbers are striking: an estimated 16 million Americans meet diagnostic criteria for CUD, and fewer than 10% receive any treatment. The gap between clinical need and available services is enormous, and it is widening as cannabis use increases alongside legalization.

The treatment gap is not primarily about a lack of evidence-based approaches. We have effective interventions for CUD: motivational enhancement therapy, cognitive behavioral therapy, contingency management. These work. The problem is delivery. These interventions require trained clinicians, scheduled appointments, and sustained engagement over time -- resources that are scarce, expensive, and geographically limited.

This is the context in which we decided to build PAWS. Not because AI is a trendy technology, but because we have a clinical problem that human clinicians alone cannot solve at scale.

What PAWS Is

PAWS is a digital therapeutic tool that uses large language models to deliver behavioral health interventions for cannabis use disorder. It is being developed in my lab at Columbia in collaboration with Xuhai "Orson" Xu at Columbia's Department of Biomedical Informatics (DBMI).

At its core, PAWS is designed to do several things: deliver evidence-based behavioral interventions conversationally, provide in-the-moment support when cravings and triggers arise, and extend the reach of clinical care between appointments.

PAWS is not designed to replace a clinician. It is designed to provide evidence-based support to people who would otherwise receive no care at all -- and to extend the reach of clinical interventions between appointments for those who are in treatment.

Why LLMs Are Promising for Behavioral Health

The application of large language models to behavioral health is not obvious. Most of the public discourse about AI in healthcare has focused on diagnostics -- reading radiology scans, analyzing pathology slides, predicting disease risk from genomic data. These are important applications, but they do not address the core challenge in behavioral health, which is therapeutic interaction.

Behavioral health interventions are fundamentally conversational. Motivational interviewing, cognitive behavioral therapy, dialectical behavior therapy -- these are talking treatments. They involve a clinician listening, reflecting, asking questions, providing feedback, and guiding the patient toward insight and behavior change.

This is exactly what large language models are good at. Not perfectly, not yet -- but the capabilities are striking. Modern LLMs can listen, reflect, ask questions, and respond to emotional content in fluent natural language -- the same conversational moves that talking treatments depend on.

The potential is significant. A patient experiencing a craving at 2 AM does not have access to their therapist. They do have access to their phone. An AI-powered tool that can provide evidence-based support in that moment -- helping them ride out the craving, reminding them of their reasons for change, suggesting a coping strategy -- could make the difference between a relapse and a successful day of abstinence.

The Collaboration with DBMI

Building something like PAWS requires expertise that no single department has. I know clinical psychiatry, substance use treatment, and the evidence base for behavioral interventions. What I do not know is how to build production-grade AI systems.

That is where Orson Xu comes in. Orson is a researcher at Columbia's Department of Biomedical Informatics whose work focuses on human-AI interaction and health sensing. His expertise in building AI systems that interact with people in real-world contexts is exactly what this project needs.

The collaboration has been productive precisely because it is genuinely interdisciplinary. I bring the clinical expertise -- what questions to ask, what responses are appropriate, what constitutes a safety concern, how motivational interviewing actually works in practice. Orson brings the technical architecture -- how to structure the LLM's context window, how to implement safety guardrails, how to evaluate conversational quality at scale.

We also work with our broader research team, including Tim Becker, who brings clinical experience in child and adolescent psychiatry, and collaborators across Columbia who contribute expertise in natural language processing, clinical trial design, and digital health regulation.

The Challenges

I want to be candid about the challenges, because building an AI-powered clinical tool is not straightforward. The technology is powerful, but deploying it in a clinical context raises serious issues that we are working through.

Clinical Safety

This is the most critical challenge. An AI system interacting with people who have substance use disorders will inevitably encounter users in crisis -- suicidal ideation, acute psychosis, overdose. The system must reliably detect these situations and respond appropriately, which means escalating to human intervention, not attempting to manage a crisis through a chatbot.

We are building layered safety systems: keyword detection, sentiment analysis, context-aware escalation protocols, and mandatory check-ins at defined intervals. But no safety system is perfect, and the consequences of failure in a clinical context are severe. This is an area where we are being deliberately conservative.

Hallucination and Accuracy

LLMs hallucinate. They generate plausible-sounding information that is sometimes incorrect. In a general-purpose chatbot, this is an inconvenience. In a clinical tool, it is potentially dangerous. If PAWS tells a user that cannabis withdrawal is not real, or that it is safe to combine cannabis with their medication, or that their symptoms do not warrant professional attention -- those are clinical errors with real consequences.

We address this through retrieval-augmented generation (RAG), which grounds the model's responses in a curated knowledge base of clinical evidence. We also implement response validation checks and constrain the model's behavior in safety-critical domains. The goal is not to eliminate hallucination entirely -- that is not currently possible -- but to reduce it to acceptable levels and ensure that errors are benign rather than harmful.
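As a toy illustration of the RAG pattern -- not our production retriever, which would use embedding search over a vetted clinical corpus -- the sketch below ranks a hand-curated snippet store by simple word overlap and builds a prompt constrained to the retrieved evidence:

```python
# Illustrative only: a two-entry snippet store standing in for a
# curated clinical knowledge base.
KNOWLEDGE_BASE = [
    "Cannabis withdrawal is a recognized syndrome: irritability, sleep "
    "difficulty, decreased appetite, and craving, typically peaking in "
    "the first week of abstinence.",
    "Urge surfing is a coping strategy: observe the craving without "
    "acting on it and let it peak and pass.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank knowledge-base entries by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def grounded_prompt(user_message: str) -> str:
    """Build an LLM prompt that constrains the answer to retrieved evidence."""
    evidence = "\n".join(retrieve(user_message))
    return (
        "Answer using ONLY the evidence below. If the evidence does not "
        "cover the question, say so and suggest professional follow-up.\n"
        f"Evidence:\n{evidence}\n\nUser: {user_message}"
    )
```

The instruction to refuse when the evidence is silent is doing as much safety work as the retrieval itself: it converts a would-be hallucination into a benign referral.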

Bias and Equity

LLMs are trained on data that reflects societal biases, including biases in how substance use is discussed, treated, and studied across different populations. Cannabis use carries different cultural weight in different communities. Enforcement patterns, clinical access, and treatment outcomes vary by race, income, and geography.

A clinical AI tool must not replicate or amplify these disparities. This requires careful attention to training data, evaluation across diverse populations, and ongoing monitoring of the system's behavior with different user demographics. It is an area where we have more questions than answers, and where we are proceeding carefully.

Validation

How do you prove that an AI therapeutic works? The gold standard in medicine is the randomized controlled trial. We are designing PAWS with clinical validation in mind from the beginning -- not as an afterthought. This means defining clear outcome measures (cannabis use frequency, craving intensity, functioning), establishing appropriate comparison conditions, and running trials with sufficient power to detect meaningful effects.
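For intuition about what "sufficient power" demands, the standard normal-approximation formula for a two-arm comparison of means can be sketched in a few lines. The effect sizes below are illustrative, not our trial's actual design parameters:

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(effect_size: float, alpha: float = 0.05,
              power: float = 0.80) -> int:
    """Approximate participants per arm for a two-arm comparison of
    means, via the normal approximation: n = 2 * ((z_a + z_b) / d)^2,
    where d is the standardized effect size (Cohen's d)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)
```

At a conventional two-sided alpha of 0.05 and 80% power, even a medium effect (d = 0.5) requires roughly 63 participants per arm before attrition -- one reason clinical validation cannot be rushed.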

This is slower than the "move fast and break things" approach favored by the tech industry. But in healthcare, moving fast and breaking things means breaking patients. We are not willing to do that.

What Is Next

PAWS is in active development. We are currently in the stage of refining the conversational architecture, building out the safety infrastructure, and preparing for initial feasibility testing. The path from where we are now to a validated clinical tool is long, but the need is urgent.

I believe that AI-powered digital therapeutics will become a standard part of behavioral health care within the next decade. Not as replacements for clinicians, but as extensions of clinical capacity -- tools that allow us to deliver evidence-based care to the millions of people who currently receive nothing.

Cannabis use disorder is a good test case for this approach because the treatment gap is so large, the evidence base for behavioral interventions is strong, and the consequences of untreated CUD -- psychosis, cognitive decline, impaired functioning -- are serious and well-documented.

We are building PAWS because the patients who need help should not have to wait for a system that cannot scale fast enough to reach them. AI is not a perfect solution. But it might be the tool that lets us finally close the gap between what we know works and who actually receives it.

Interested in Our AI Research?

Dr. Sultan's lab at Columbia University is developing AI-powered tools for substance use treatment. Learn more about the research program or get in touch.

AI in Psychiatry → | Sultan Lab →
