Design Space of pharmaceutical processes using data-driven based methods

AIChE Annual Meeting in Salt Lake City

November 08, 2010

F. Boukouvala, F.J. Muzzio, M.G. Ierapetritou

The identification and graphical representation of the process design space is critical for locating not only feasible but also optimum operating variable ranges and design configurations. The design space of a process is the region of the parametric space within which acceptable product can be produced [1]. In the pharmaceutical industry, which is working towards an ideal maximum level of quality assurance and performance, the Design Space is directly linked to a well-defined Control Strategy and Criticality Assessment [2]. This work [3] treats the construction of the Design Space as the problem of determining the boundaries of a process within which feasible, profitable and acceptable performance is guaranteed. This problem has been a major concern in many process industries for many years [4-7]. Specifically, the design space of pharmaceutical processes is mapped using the ideas of process operability and flexibility under uncertainty. For this purpose, three different data-based methodologies are proposed to produce the graphical representation of the region in which acceptable performance of black-box unit processes can be ensured.

The first step of the proposed methodologies is the use of three different surrogate-based methods to model the performance of all possible designs of pharmaceutical unit operations: High Dimensional Model Representation (HDMR), Response Surface Methodology (RSM) and Kriging. In HDMR, the output of a process is expressed as a finite hierarchical correlated function expansion in terms of the independent and cooperative contributions of the input variables [8]. Look-up tables are then constructed so that the component functions can be interpolated at any arbitrary value of the input variables.
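
The HDMR look-up-table idea can be illustrated with a minimal sketch. It assumes a first-order "cut" expansion around a reference point, f(x) ≈ f0 + Σ f_i(x_i), which drops the cooperative (higher-order) terms, and it uses linear interpolation of the tables. The function `toy_process`, the reference point and the grids are hypothetical stand-ins for measured data and are not taken from the work described here.

```python
import numpy as np

def build_cut_hdmr(f, reference, grids):
    """Build first-order cut-HDMR look-up tables around a reference point."""
    reference = np.asarray(reference, dtype=float)
    f0 = f(reference)                      # zeroth-order term: output at the reference point
    tables = []
    for i, grid in enumerate(grids):
        grid = np.asarray(grid, dtype=float)
        values = np.empty(len(grid))
        for k, xi in enumerate(grid):
            point = reference.copy()
            point[i] = xi                  # vary only input i, keep the others at the reference
            values[k] = f(point) - f0      # first-order component function f_i(x_i)
        tables.append((grid, values))
    return f0, tables

def predict_cut_hdmr(f0, tables, x):
    """Evaluate f0 + sum_i f_i(x_i), interpolating each look-up table linearly."""
    return f0 + sum(np.interp(xi, grid, vals) for xi, (grid, vals) in zip(x, tables))

# Hypothetical additive process output (assumption, for illustration only).
def toy_process(x):
    return 2.0 + x[0] ** 2 + 3.0 * x[1]

grids = [np.linspace(0.0, 1.0, 11), np.linspace(0.0, 1.0, 11)]
f0, tables = build_cut_hdmr(toy_process, reference=[0.5, 0.5], grids=grids)
estimate = predict_cut_hdmr(f0, tables, [0.2, 0.8])
```

Because `toy_process` is additive, the first-order expansion reproduces it up to interpolation error. A process with strong interactions between inputs would need the second-order component functions as well.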
In RSM, for a problem containing n continuous input variables, an n-dimensional quadratic polynomial is used as the local model, since quadratic behavior describes the mathematical geometry in the neighborhood of an optimum [9]. In Kriging, the prediction is expressed as a weighted sum of the observed function values at the sampling points that fall within a set interval around the point being predicted [10]. A variance is also calculated for each test point, so regions of high variance indicate where subsequent sampling is required.

Modeling and optimization of pharmaceutical processes often involve discrete decisions, which represent design variables such as the use, size or configuration of a specific part of a piece of equipment (e.g. the screw size of a feeder or the design of a nozzle). In published work, data-driven modeling of pharmaceutical processes often treats these variables as coded values on which the final response surface is fitted. In the present work, however, individual models are produced for the alternative process designs, complemented by the assignment of a decision variable to each design. These approaches are compared and their efficiency is illustrated through two pharmaceutical case studies involving a continuous powder mixer and a loss-in-weight feeder of a tablet production process.

The purpose of this work is to develop a general methodology that constructs a graphical representation of the region bounded by the limits within which acceptable product or process performance is achieved. The only available knowledge of the system is a multivariate experimental data set of a desired output measured at different operating conditions for different design configurations. This is very often the situation in processes for which a physics-based model does not exist.
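
As a rough illustration of the per-design modeling idea, the sketch below fits one quadratic response surface per design configuration by least squares, marks the grid points whose predicted output meets a limit, and picks the best design at each operating point. Everything specific here is an assumption for illustration: the two designs, the synthetic noise-free data, the limit of 0.15, and the convention that a lower output is better. It is not the authors' implementation.

```python
import numpy as np

def quadratic_features(X):
    """Full quadratic basis in two inputs: 1, x1, x2, x1^2, x2^2, x1*x2."""
    X = np.asarray(X, dtype=float)
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 ** 2, x2 ** 2, x1 * x2])

def fit_rsm(X, y):
    """Least-squares estimate of the quadratic response surface coefficients."""
    coef, *_ = np.linalg.lstsq(quadratic_features(X), y, rcond=None)
    return coef

def predict_rsm(coef, X):
    return quadratic_features(X) @ coef

def map_design_space(models, grid, limit):
    """Per design, a boolean mask of grid points with predicted output <= limit."""
    return {name: predict_rsm(coef, grid) <= limit for name, coef in models.items()}

def best_design(models, grid):
    """Design with the lowest predicted output at each grid point."""
    names = list(models)
    preds = np.column_stack([predict_rsm(models[n], grid) for n in names])
    return [names[j] for j in preds.argmin(axis=1)]

# Hypothetical experimental data: two operating variables scaled to [0, 1].
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(30, 2))
data = {
    "design_A": 0.10 + (X[:, 0] - 0.3) ** 2 + (X[:, 1] - 0.5) ** 2,
    "design_B": 0.05 + (X[:, 0] - 0.7) ** 2 + 0.5 * (X[:, 1] - 0.4) ** 2,
}
models = {name: fit_rsm(X, y) for name, y in data.items()}

axis = np.linspace(0.0, 1.0, 21)
grid = np.array([(a, b) for a in axis for b in axis])
masks = map_design_space(models, grid, limit=0.15)   # feasible region per design
winners = best_design(models, grid)                  # discrete decision per grid point
```

Plotting each mask over the grid gives the graphical representation of the feasible region for that design. With real, noisy measurements the fit would no longer be exact, and the boundary of the region would carry the uncertainty of the surrogate.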
The main advantages of the proposed methodology (independent of the specific data-driven approach used) are that (a) the range of input variables that results in acceptable performance can be calculated accurately, and (b) the design that results in optimum performance at different operating conditions can be identified. It should also be emphasized that the proposed methodology does not depend on the existence of first-principles models, and that the models developed are computationally inexpensive. The results of the proposed approaches are also validated experimentally. This work can be considered a fundamental building block of a general methodology for defining the design space of black-box processes that operate under uncertainty and for which the only available information is an experimental data set.

References:

1. Lepore J and Spavins J, PQLI Design Space. Journal of Pharmaceutical Innovation, 2008; 3(2): 79-87.
2. Garcia T, Cook G, and Nosal R, PQLI Key Topics - Criticality, Design Space, and Control Strategy. Journal of Pharmaceutical Innovation, 2008; 3(2): 60-68.
3. Boukouvala F, Muzzio F, and Ierapetritou M, Design Space of Pharmaceutical Processes using Data-Driven based Methods. Submitted to Journal of Pharmaceutical Innovation, April 2010.
4. Banerjee I and Ierapetritou MG, Design Optimization under Parameter Uncertainty for General Black-Box Models. Industrial & Engineering Chemistry Research, 2002; 41(26): 6687-6697.
5. Floudas CA and Gumus ZH, Global Optimization in Design under Uncertainty: Feasibility Test and Flexibility Index Problems. Industrial & Engineering Chemistry Research, 2001; 40(20): 4267-4282.
6. Halemane KP and Grossmann IE, Optimal Process Design Under Uncertainty. 1987 [cited; Available from:
7. Lima F, Jia Z, Ierapetritou M, and Georgakis C, Similarities and differences between the concepts of operability and flexibility: The steady-state case. p. 702-716.
8. Li G, Rosenthal C, and Rabitz H, High Dimensional Model Representations. The Journal of Physical Chemistry A, 2001; 105(33): 7765-7777.
9. Myers RH and Montgomery DC, Response Surface Methodology: Process and Product in Optimization Using Designed Experiments. John Wiley & Sons, Inc., 1995, 728.
10. Cressie N, Statistics for Spatial Data (Wiley Series in Probability and Statistics). Wiley-Interscience, 1993.