
Health

Training Therapeutic Judges and Multi-Agent Systems for Improved Mental Health Support

A new framework enhances mental health support using large language models, with promising results in therapeutic quality.

By Feed and Figures Editorial Team • Jul 1, 2026 • 1 min read • Source: arXiv NLP

Large language models are increasingly recognized for their potential in providing mental health support. A study published on June 29, 2026, introduces a framework for improving the therapeutic quality of these models' responses. The researchers, Mizanur Rahman and six co-authors, developed a two-stage system that uses human-aligned evaluations to improve therapeutic responses.

Introducing TheraJudge: A New Evaluator

In Stage I, the team presents TheraJudge, an open-source therapeutic evaluator trained with preference-based optimization on human-annotated data. It assesses responses across seven psychological dimensions, which the study reports yields more reliable judgments than traditional methods.

TheraJudge demonstrates strong agreement with clinician ratings, achieving intraclass correlation coefficients (ICC) between 0.87 and 0.95. This performance surpasses existing supervised baselines and closed-source judges, particularly excelling in critical areas such as Safety, Relevance, and Empathy.
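For readers unfamiliar with the metric, an intraclass correlation coefficient measures how closely several raters agree when scoring the same items. The article does not say which ICC form the study used, so the sketch below assumes ICC(2,1) (two-way random effects, absolute agreement, single rater) purely for illustration. The function name `icc2_1` and the toy ratings are hypothetical and are not taken from the paper.

```python
from typing import List


def icc2_1(ratings: List[List[float]]) -> float:
    """ICC(2,1) for a table with one row per item and one column per rater.

    Assumption: this ICC form is chosen for illustration only; the
    study's exact variant is not stated in the article.
    """
    n = len(ratings)      # number of rated items (responses)
    k = len(ratings[0])   # number of raters (e.g. judge and clinician)
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    # Two-way ANOVA sums of squares: items, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-item mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical scores: column 0 is an automated judge, column 1 a clinician.
    identical = [[1, 1], [2, 2], [4, 4], [5, 5]]
    offset = [[1, 2], [2, 3], [4, 5], [5, 6]]  # clinician always 1 point higher
    print(icc2_1(identical))  # perfect agreement
    print(icc2_1(offset))     # lower, because absolute agreement penalizes the offset
```

Under this form, a rater who is consistently one point higher than another lowers the coefficient even though the two rank every item identically, which is why absolute-agreement ICCs are a stricter test than plain correlation.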

Operationalizing Evaluations with TheraAgent

Stage II introduces TheraAgent, which operationalizes TheraJudge's evaluations through a coordinated refinement process. This system employs specialized roles, including Critic, Coach, and Therapist, to translate evaluative signals into targeted response revisions.
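The coordinated refinement process can be pictured as a loop in which a judge scores a reply, a Critic flags the weak dimensions, a Coach turns them into guidance, and a Therapist revises the reply. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: only the three role names and the dimensions Safety, Relevance, and Empathy come from the article. The function names, the score threshold, the round limit, and the keyword-based toy judge and therapist are all hypothetical stand-ins for model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Scores = Dict[str, float]


@dataclass
class RefinementResult:
    response: str
    scores: Scores
    rounds: int


def critic(scores: Scores, threshold: float) -> List[str]:
    """Return the dimensions scoring below the threshold, weakest first."""
    weak = [dim for dim, s in scores.items() if s < threshold]
    return sorted(weak, key=lambda dim: scores[dim])


def coach(weak_dims: List[str]) -> Dict[str, str]:
    """Turn each weak dimension into a revision instruction."""
    return {dim: f"Revise the reply to improve {dim}." for dim in weak_dims}


def refine(
    response: str,
    judge: Callable[[str], Scores],
    therapist: Callable[[str, Dict[str, str]], str],
    threshold: float = 4.0,   # hypothetical pass mark on a 5-point scale
    max_rounds: int = 3,      # hypothetical cap on revision rounds
) -> RefinementResult:
    """Revise until every dimension meets the threshold or rounds run out."""
    scores = judge(response)
    rounds = 0
    while rounds < max_rounds:
        weak = critic(scores, threshold)
        if not weak:
            break
        response = therapist(response, coach(weak))
        scores = judge(response)
        rounds += 1
    return RefinementResult(response, scores, rounds)


# Toy stand-ins (hypothetical): a real system would call models here.
ADDITIONS = {
    "Empathy": "That sounds really hard.",
    "Safety": "If you ever feel unsafe, please contact local emergency support.",
}


def toy_judge(text: str) -> Scores:
    t = text.lower()
    return {
        "Safety": 5.0 if "emergency support" in t else 2.0,
        "Relevance": 4.0,
        "Empathy": 5.0 if "sounds really hard" in t else 2.0,
    }


def toy_therapist(response: str, guidance: Dict[str, str]) -> str:
    extra = [ADDITIONS[dim] for dim in guidance if dim in ADDITIONS]
    return " ".join([response] + extra)


if __name__ == "__main__":
    result = refine("Try to sleep more.", toy_judge, toy_therapist)
    print(result.rounds, result.scores)
    print(result.response)
```

Keeping the judge and therapist as injected callables means the loop itself stays the same whether the scores come from a keyword rule, as here, or from a trained evaluator.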


Empirical results indicate that TheraAgent improves therapeutic quality by a reported 0.43 points on a 5-point scale in blind evaluations. Clinician inter-rater reliability is reported at 96%.

Targeted Correction of Responses

The system is most effective on low-quality responses: those rated 3 or below improve by an average of 2.45 points, with a 94% recovery rate. The result points to the value of acting on human-aligned evaluations rather than relying solely on stronger generative models.
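The article does not define "recovery rate". One plausible reading, assumed here only for illustration, is the share of responses initially rated at or below the cutoff that score above it after refinement. The sketch below computes that share and the mean gain for the same subset. The function name `low_score_recovery` and the sample scores are hypothetical and are not data from the study.

```python
from typing import List, Tuple


def low_score_recovery(
    before: List[float],
    after: List[float],
    cutoff: float = 3.0,
) -> Tuple[float, float]:
    """Return (mean gain, recovery rate) for responses rated <= cutoff.

    Assumption: "recovered" means the revised score is above the cutoff.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")

    # Keep only the responses that started at or below the cutoff.
    low = [(b, a) for b, a in zip(before, after) if b <= cutoff]
    if not low:
        raise ValueError("no responses at or below the cutoff")

    mean_gain = sum(a - b for b, a in low) / len(low)
    recovery_rate = sum(1 for _, a in low if a > cutoff) / len(low)
    return mean_gain, recovery_rate


if __name__ == "__main__":
    # Made-up scores on a 5-point scale, for illustration only.
    before = [2.0, 3.0, 1.0, 3.0, 4.0, 5.0]
    after = [4.0, 5.0, 3.0, 4.0, 4.5, 5.0]
    gain, rate = low_score_recovery(before, after)
    print(f"mean gain: {gain:.2f}, recovery rate: {rate:.0%}")
```

Reporting both numbers is useful because a large mean gain can hide a few responses that never clear the cutoff, and a high recovery rate can hide gains that only just clear it.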

Overall, the findings suggest that aligning mental health language models with human evaluations can substantially improve the therapeutic quality of their responses. The research team has made its code publicly available to support further work in this field.

🤖 This article was rewritten by Feed and Figures' editorial AI from a report originally published by arXiv NLP. Facts and quotes are preserved from the original; the rewrite focuses on clarity and structure. For the unedited original, see the source link below.

#Mizanur Rahman

#Abeer Badawi

#mental health

#AI

#therapeutic models

#psychology

#evaluation


