Digital twins and hybrid models in biologics production: A model comparison
Digital twin technology has the potential to play a transformational role in drug development and manufacturing.
In this second article, we put theory into practice by comparing a classical data-driven model with a hybrid model for predicting viable cell density (VCD) and product accumulation – key parameters of productivity – in a perfusion bioreactor.
We demonstrate how digital twins enable innovators to answer questions such as:
- How can we identify optimal conditions for maximizing cell growth and productivity?
- When should we initiate perfusion and adjust feed strategies to avoid substrate depletion and cell density collapse?
- Can we detect early signs of instability and predict hard-to-measure parameters such as substrate concentration in the bioreactor?
Case study: Perfusion bioreactor cell culture modeling
Using predictive models to relate attributes such as viable cell density (VCD) or product accumulation to process parameters such as substrate concentration, temperature, or agitation is a crucial step in drug process characterization [3].
Moreover, purely data-driven methods struggle with complex systems that involve numerous process parameters and limited experimental data, a common situation in drug development. Mechanistic or hybrid models resolve some of these issues by incorporating prior process knowledge in the form of biological and physical laws [4]. Hence, while classical data-driven models serve as a useful starting point, we believe that integrating mechanistic knowledge through hybrid approaches enhances predictive power and facilitates better process optimization in bioreactors.
Our hybrid model takes process parameters as inputs and predicts biological variables such as rates, yields, and plateau levels for biomass and product formation. These predicted parameters are then fed into a system of ordinary differential equations (ODEs), which calculate the dynamic responses of the bioreactor over time. Figure 2 illustrates the key steps of both the classical purely data-driven approach and the hybrid approach.
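The two-stage structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the model used in this study: the function names, the linear coefficients in `predict_kinetics`, the logistic growth law, and the two process parameters (temperature and feed rate) are all hypothetical and chosen only to show how predicted kinetic parameters are handed to an ODE system.

```python
# Stage 1 (data-driven part): map process parameters to kinetic parameters.
# ASSUMPTION: a real hybrid model would use a fitted regression or neural
# network here; these linear coefficients are invented for illustration.
def predict_kinetics(temperature_c, feed_rate):
    return {
        "mu": 0.03 + 0.002 * (temperature_c - 35.0),  # growth rate, 1/h
        "x_max": 40.0 + 20.0 * feed_rate,             # VCD plateau, 1e6 cells/mL
        "q_p": 0.002 + 0.001 * feed_rate,             # specific productivity
    }

# Stage 2 (mechanistic part): ODEs for VCD (x) and product (p).
# ASSUMPTION: logistic growth with growth-independent product formation.
def derivatives(state, k):
    x, p = state
    return (k["mu"] * x * (1.0 - x / k["x_max"]), k["q_p"] * x)

def rk4_step(state, k, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))
    k1 = derivatives(state, k)
    k2 = derivatives(shifted(state, k1, dt / 2), k)
    k3 = derivatives(shifted(state, k2, dt / 2), k)
    k4 = derivatives(shifted(state, k3, dt), k)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(temperature_c, feed_rate, x0=0.5, p0=0.0, hours=240.0, dt=0.5):
    """Return a list of (time_h, vcd, product) tuples for one simulated run."""
    k = predict_kinetics(temperature_c, feed_rate)
    state = (x0, p0)
    trajectory = [(0.0, x0, p0)]
    for i in range(1, int(round(hours / dt)) + 1):
        state = rk4_step(state, k, dt)
        trajectory.append((i * dt, state[0], state[1]))
    return trajectory

if __name__ == "__main__":
    t_end, vcd_end, product_end = simulate(37.0, 1.0)[-1]
    print(f"t={t_end:.0f} h  VCD={vcd_end:.2f}  product={product_end:.2f}")
```

Because the growth law is fixed, the data-driven part only has to learn a handful of interpretable parameters rather than the full time course, which is one reason hybrid models are expected to cope better with limited data.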
Figure 3: Experimental domain of the data used to train and test the models. The test 1 dataset lies within the experimental domain of the training dataset, while the test 2 dataset lies outside it.

Figure 4: Data (red) and predictions of the hybrid (green) and classical (blue) approaches for the dataset within the experimental domain.

Figure 5: Data (red) and predictions of the hybrid (green) and classical (blue) approaches for the dataset outside the experimental domain.

Figure 6: Predictive performance of the classical and hybrid modeling approaches, measured by root mean square error (RMSE). Lower RMSE values indicate better predictive accuracy.
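For readers who want to reproduce this kind of comparison on their own data, the RMSE metric can be computed as in the short sketch below. The function name and the example values are illustrative assumptions and are not taken from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

if __name__ == "__main__":
    # Made-up VCD values (1e6 cells/mL), for illustration only.
    measured = [1.0, 5.2, 18.4, 35.1]
    model_output = [1.1, 4.8, 19.0, 33.9]
    print(f"RMSE = {rmse(measured, model_output):.3f}")
```

RMSE is expressed in the same units as the predicted variable, so values for VCD and for product accumulation are not directly comparable with each other.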
Historically, creating and fitting hybrid models required specialized knowledge in mathematics and data science, which limited their accessibility across many companies. However, this landscape is changing with the advent of user-friendly applications like TwinLab, shown in Figure 7. We created this application to allow users without deep technical expertise to easily explore different process scenarios and predict outcomes. Such tools make advanced hybrid modeling practical and actionable, supporting scientists and engineers in bioprocess development by integrating mechanistic knowledge with data-driven insights through intuitive interfaces.
One approach is careful design of experiments. Covering the process parameter space remains essential, and optimal designs tailored to such mechanistic models help ensure that data are collected where they are most informative. Mechanistically informed Bayesian optimization could also serve as a first step to explore the experimental domain quickly, although care is needed to ensure that the domain is properly covered.
Another solution is the use of Intensified Design of Experiments (iDoE), which introduces deliberate shifts in process parameters within a single experiment (e.g. bioreactor run). This strategy effectively condenses multiple conventional DoE combinations into a smaller number of experimental runs, thereby maximizing the information gained per experiment.
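To make the idea concrete, the sketch below condenses a small full-factorial design into fewer runs by chaining several setpoint combinations within each run. It is a naive illustration only: the function name, factor names, levels, and stage length are hypothetical, and a real iDoE would choose the order and timing of the shifts using design criteria rather than simple chunking.

```python
from itertools import product

def intensified_schedule(levels, stages_per_run, stage_hours):
    """Group full-factorial setpoint combinations into runs with intra-run shifts.

    levels: dict mapping a factor name to its list of levels.
    Returns a list of runs; each run is a list of stages with a time window
    and the setpoints that apply during that window.
    """
    names = list(levels)
    combos = [dict(zip(names, values)) for values in product(*levels.values())]
    runs = []
    for start in range(0, len(combos), stages_per_run):
        stages = combos[start:start + stages_per_run]
        runs.append([
            {
                "start_h": i * stage_hours,
                "end_h": (i + 1) * stage_hours,
                "setpoints": setpoints,
            }
            for i, setpoints in enumerate(stages)
        ])
    return runs

if __name__ == "__main__":
    # Hypothetical factors: 3 temperatures x 2 pH levels = 6 combinations.
    design = intensified_schedule(
        {"temperature_c": [33.0, 35.0, 37.0], "ph": [6.9, 7.1]},
        stages_per_run=3,
        stage_hours=72,
    )
    for run_index, run in enumerate(design, start=1):
        print(f"Run {run_index}:")
        for stage in run:
            print(f"  {stage['start_h']}-{stage['end_h']} h  {stage['setpoints']}")
```

In this toy case, six conventional runs become two runs with three stages each. The dynamic response after each shift is what a hybrid model can exploit, since its ODE part describes how the culture evolves between setpoints.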
Finally, a more ambitious but highly promising strategy involves leveraging pre-trained digital twins. In this scenario, the hybrid model would first be pre-trained on thousands of experimental datasets aggregated from various sources, similar to how large language models like ChatGPT are developed. Users could then refine the model with their own limited training data, continually improving its performance for everyone involved – ultimately benefiting both users and patients. To protect proprietary data, the platform would deploy federated learning, a privacy-preserving approach that enables users to improve the digital twin collaboratively. This method ensures that individual experimental data remain confidential and are never directly shared between users, while still contributing to an ever-improving collective model.
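The principle behind federated learning can be illustrated with federated averaging on a toy problem. The sketch below is an assumption-laden example, not the envisioned platform: the linear model, the synthetic client data, the function names, and the hyperparameters are all hypothetical. Each client trains locally on data that never leave its site, and only the resulting model weights are sent to the server for averaging.

```python
def local_update(weights, data, lr=0.1, epochs=10):
    """Gradient descent on one client's private data for y = w * x + b."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * ((w * x + b) - y) * x for x, y in data) / n
        grad_b = sum(2 * ((w * x + b) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Average client weights, weighted by each client's number of samples."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def federated_training(client_datasets, rounds=100):
    """Run federated averaging; the server only ever sees model weights."""
    global_weights = (0.0, 0.0)
    sizes = [len(data) for data in client_datasets]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in client_datasets]
        global_weights = federated_average(updates, sizes)
    return global_weights

if __name__ == "__main__":
    # Synthetic, noiseless data from y = 2x + 1, split across two clients.
    client_a = [(x, 2 * x + 1) for x in (0.0, 0.5, 1.0, 1.5)]
    client_b = [(x, 2 * x + 1) for x in (1.0, 2.0)]
    w, b = federated_training([client_a, client_b])
    print(f"w={w:.4f}, b={b:.4f}")
```

Production systems typically add safeguards that this sketch omits, such as secure aggregation of the updates, because model weights alone can still leak information about the underlying data.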
We will continue to explore the potential of the digital twin and statistical methodologies to support better predictions, including the use of iDoE and Bayesian optimal designs.
Note: Results were generated with the assistance of BioWin ASBL and with financial support from the Region, in accordance with the provisions of the Grant Agreement (Convention 8881 ATMP Thérapie cellulaire).
Disclaimer:
The information provided in this article does not constitute legal advice. Cencora, Inc., strongly encourages readers to review available information related to the topics discussed and to rely on their own experience and expertise in making decisions related thereto.
Sources:
1. Digital Twins: From Personalised Medicine to Precision Public Health, J Pers Med., July 2021. https://pubmed.ncbi.nlm.nih.gov/34442389/
2. Perfusion Bioreactors Industry Research Report 2025, Research and Markets. https://www.globenewswire.com/news-release/2025/10/09/3164326/28124/en/Perfusion-Bioreactors-Industry-Research-Report-2025-Biopharmaceutical-Growth-Drives-Demand-Amid-Cell-Culture-Advancements-Global-Forecast-to-2032.html
3. Predictive models for upstream mammalian cell culture development - A review, Digital Chemical Engineering, March 2024. https://www.sciencedirect.com/science/article/pii/S2772508123000558
4. Hybrid semi-parametric modeling in process systems engineering: Past, present and future, Computers & Chemical Engineering, Jan 2014. https://www.sciencedirect.com/science/article/pii/S0098135413002639
