A Filtered Mixture-of-Generators approach improves the training of survival models on synthetic data, according to a study by Niccolò Maria Rizzi and colleagues published on June 30, 2026. The approach targets survival analysis in clinical settings, where data scarcity and privacy concerns hinder research.
Advancements in Survival Analysis
Survival analysis relies on time-to-event data, which is often scarce because clinical training data is expensive. The researchers propose using generative models to augment datasets while preserving patient privacy. Their Filtered Mixture-of-Generators (FoGS) framework filters the generated samples so that the resulting synthetic data is more representative of real-world cases.
The method selects samples from a pool produced by four distinct tabular generators. Each sample is evaluated by an ensemble of seven survival models trained on real datasets, so that only the most plausible samples are kept. This two-level pipeline tunes the selection process to maximize the downstream model's performance.
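The pool-then-filter idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy generators, the seven toy "models", the plausibility score (negative mean absolute error between a sample's time and the ensemble's predicted time), and the `keep_fraction` parameter are all hypothetical stand-ins chosen only to show the overall shape of the pipeline.

```python
import random
from statistics import mean


def make_generator(noise, seed):
    """Hypothetical stand-in for a tabular generator: returns (x, time, event) rows."""
    rng = random.Random(seed)

    def generate(n):
        rows = []
        for _ in range(n):
            x = rng.uniform(0.0, 1.0)
            # Assumed toy relationship: survival time shrinks as x grows.
            time = max(0.01, 10.0 * (1.0 - x) + rng.gauss(0.0, noise))
            event = 1 if rng.random() < 0.7 else 0
            rows.append((x, time, event))
        return rows

    return generate


def make_scorer(scale):
    """Hypothetical stand-in for a survival model fitted on real data."""
    def predict_time(x):
        return scale * (1.0 - x)

    return predict_time


def plausibility(row, ensemble):
    """Assumed score: higher when the ensemble agrees with the sample's time."""
    x, time, _event = row
    return -mean(abs(time - predict(x)) for predict in ensemble)


def build_pool(generators, n_per_generator):
    """Level 1: pool samples drawn from every generator."""
    pool = []
    for generate in generators:
        pool.extend(generate(n_per_generator))
    return pool


def filter_pool(pool, ensemble, keep_fraction):
    """Level 2: keep only the samples the ensemble rates as most plausible."""
    ranked = sorted(pool, key=lambda row: plausibility(row, ensemble), reverse=True)
    return ranked[: int(len(ranked) * keep_fraction)]


# Four toy generators of differing quality and seven toy ensemble members.
generators = [make_generator(noise, seed) for seed, noise in enumerate([0.5, 1.0, 2.0, 4.0])]
ensemble = [make_scorer(9.4 + 0.2 * i) for i in range(7)]
pool = build_pool(generators, n_per_generator=50)
kept = filter_pool(pool, ensemble, keep_fraction=0.25)
print(len(pool), len(kept))
```

In a real setting the generators and ensemble members would be fitted models, and the retained fraction would presumably be tuned against downstream performance rather than fixed in advance.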
Performance Metrics and Results
In their experiments, the team tested FoGS on 16 public datasets, comparing training on synthetic data with training on real data. The results showed an average improvement of +2.17 in C-index and +0.67 in Integrated Brier Score (IBS), indicating that models trained on the synthetic data performed comparably to those trained on actual patient data.
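For readers unfamiliar with the C-index, the following is a minimal sketch of Harrell's concordance index in plain Python. It is a textbook formulation, not the evaluation code used in the study, and it returns a value between 0 and 1; the article does not state the scale of its reported figures. The function name and the toy data are illustrative assumptions.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs in which the higher-risk subject fails first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's recorded time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: event indicator 0 marks a censored subject.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.5, 0.1])
reversed_order = concordance_index(times, events, [0.1, 0.5, 0.7, 0.9])
```

A score of 0.5 corresponds to random ranking, so higher values mean the model orders patients by risk more accurately.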


