A recent study by Bartłomiej Cupiał and colleagues, published on June 29, 2026, investigates what drives interactive improvement from feedback in multi-turn language agents. The research finds that feedback can produce performance gains beyond those of simply retrying, but that the size of those gains depends on who provides the feedback and on how well the student model can act on it.
Understanding Interactive Improvement from Feedback
The study explores when natural-language feedback leads to improvements that surpass those achievable through repeated attempts alone. The authors implemented a controlled student-teacher protocol across multiple benchmarks, including Omni-MATH, Codeforces, BBEH Linguini, and ARC-AGI1. This setup allowed them to evaluate thirteen open-weight models in both student and teacher roles.
Key findings suggest that higher accuracy after feedback does not by itself show that the feedback was useful: the same increase can come from resampling (the model simply gets another try) or from format correction. The authors emphasize the importance of separating these effects to accurately assess the role feedback plays in the improvement.
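The distinction above can be made concrete with a little arithmetic. The following is a minimal sketch, not the authors' evaluation code: the function names and the toy outcome lists are hypothetical, and the idea is only that the gain attributable to feedback is what remains after subtracting the gain a plain retry would have produced.

```python
def accuracy(outcomes):
    """Fraction of problems solved; `outcomes` is a list of booleans."""
    return sum(outcomes) / len(outcomes)


def feedback_specific_gain(first_try, after_retry, after_feedback):
    """Split the total gain into a retry part and a feedback-specific part.

    All three arguments are per-problem outcomes on the same problem set.
    """
    base = accuracy(first_try)
    retry_gain = accuracy(after_retry) - base
    total_gain = accuracy(after_feedback) - base
    return {
        "retry_gain": retry_gain,
        "total_gain": total_gain,
        "feedback_specific_gain": total_gain - retry_gain,
    }


# Hypothetical toy outcomes for eight problems (made up for illustration).
first_try = [True, False, False, False, True, False, False, False]
after_retry = [True, True, False, False, True, False, False, False]
after_feedback = [True, True, True, False, True, True, False, False]

gains = feedback_specific_gain(first_try, after_retry, after_feedback)
print(gains)
```

In this toy case accuracy rises from 2/8 to 5/8 with feedback, but one of those three extra solves also appears under a plain retry, so only part of the increase would be credited to the feedback itself.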
The Role of Feedback Types in AI Learning
The authors compared different types of feedback: external feedback, self-feedback, and unguided self-refinement. Their analysis revealed that self-generated feedback offered minimal additional benefits compared to unguided self-refinement. In contrast, the strongest external teachers provided significant feedback-specific gains, indicating that effective feedback must go beyond generic guidance.
- Feedback from the strongest external teachers yields substantial feedback-specific gains.
- Self-generated feedback adds little beyond unguided self-refinement.
- Effective feedback must provide actionable guidance rather than generic advice.
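The three conditions compared above differ only in where the feedback for the second attempt comes from. The sketch below illustrates that structure under stated assumptions: the `Model` interface (a function from prompt text to response text), the condition names, and the prompt wording are all hypothetical and are not taken from the study.

```python
from typing import Callable, Optional

# Hypothetical interface: a model is any function from prompt text to text.
Model = Callable[[str], str]


def second_attempt(problem: str, student: Model, condition: str,
                   teacher: Optional[Model] = None) -> str:
    """Run one first attempt, optionally gather feedback, then retry."""
    first = student(f"Problem: {problem}")
    critique_prompt = f"Problem: {problem}\nAttempt: {first}\nGive feedback."

    if condition == "external_feedback":
        if teacher is None:
            raise ValueError("external_feedback requires a teacher model")
        feedback = teacher(critique_prompt)   # a different model critiques
    elif condition == "self_feedback":
        feedback = student(critique_prompt)   # the student critiques itself
    elif condition == "self_refinement":
        feedback = None                       # no critique, just a retry
    else:
        raise ValueError(f"unknown condition: {condition}")

    retry_prompt = f"Problem: {problem}\nPrevious attempt: {first}\n"
    if feedback is not None:
        retry_prompt += f"Feedback: {feedback}\n"
    retry_prompt += "Try again."
    return student(retry_prompt)
```

Keeping the retry prompt identical apart from the feedback line is a design choice in this sketch: it means any difference in outcomes between conditions comes from the feedback text rather than from differences in how the retry is requested.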
Insights on Feedback Utilization and Student Agency
The research further indicates that a student's ability to use feedback matters more for interactive gains than the identity of the teacher, although teacher choice still makes a difference when the student is held fixed. The study also suggests that feedback-based agents should be evaluated against repeated-attempt baselines, so that gains from simply trying again are not mistaken for gains from feedback.
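One simple way to compare the influence of students and teachers is to lay out feedback-specific gains in a student-by-teacher grid and see which axis the gains vary along more. This is a rough sketch only: it is not the authors' analysis, the model names are placeholders, and the numbers are invented purely to show the computation.

```python
def spread_of_means(groups):
    """Range (max minus min) of the per-group mean values."""
    means = [sum(values) / len(values) for values in groups]
    return max(means) - min(means)


# Hypothetical feedback-specific gains: gains[student][teacher].
# These values are made up for illustration and are not results from the study.
gains = {
    "student_a": {"teacher_x": 0.02, "teacher_y": 0.04},
    "student_b": {"teacher_x": 0.10, "teacher_y": 0.14},
}

teachers = sorted(next(iter(gains.values())))

# How much does the average gain move when the student changes?
student_spread = spread_of_means(
    [list(row.values()) for row in gains.values()]
)
# How much does it move when the teacher changes?
teacher_spread = spread_of_means(
    [[row[t] for row in gains.values()] for t in teachers]
)

print(f"spread across students: {student_spread:.2f}")
print(f"spread across teachers: {teacher_spread:.2f}")
```

In this invented grid the average gain shifts more when the student changes than when the teacher changes, which is the pattern the paragraph above describes; a real analysis would need many more models and a proper statistical treatment.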
In conclusion, the findings underscore that the capacity to act on feedback, rather than merely its availability, represents a central bottleneck in enhancing interactive improvement in artificial intelligence. The study's controlled evaluation framework is available for further exploration at the provided link.