User-friendly Population Balance Model with Data Integration for Crystallization Development

Commercial off-the-shelf software for PBMs is available from multiple suppliers, and the use of this software is increasing within the pharmaceutical industry. However, there are several opportunities to improve the current commercial PBM software in order to increase adoption of this approach.  These opportunities include:

  • Streamlining the data importation and transformation steps required to initialize the models;
  • Automating the preprocessing steps required for model convergence;
  • Increasing the incorporation of offline and online data into these models; and
  • Improving the efficiency of model selection/discrimination and parameter regression.

Download the Request for Information and submit your response.

RFI issued July 11, 2016

Responses due August 8, 2016

Questions Received (Updated August 3, 2016)

  • Can an expert describe in more detail the workflow that is used today to obtain the data used to develop a population balance model?
    1.     Solubility data is collected. 
    2.     A local solubility model is regressed that encompasses the relevant processing conditions (solubility vs. temperature for thermal crystallizations, solubility vs. solvent/antisolvent composition or pH for additive crystallization – both when needed). 
    3.     Desupersaturation experiments are designed (they could be designed to probe nucleation and growth separately or jointly).  The operating mode of the desupersaturation experiments (batch, continuous, etc.; most frequently batch) does not have to agree with the intended operating mode of the crystallization process.  Ideally, instances of the crystallization process to be improved using the PBM would be executed as desupersaturation experiments, but these processes often do not introduce enough variability in nucleation or growth rates to regress crystallization parameters. 
    4.     Flow sheets are made within the PBM software that are in agreement with the operating mode of the desupersaturation experiments.
    5.     The desupersaturation experiments are executed.  Process data (temperature trends, impeller agitation rate as a measure of mixing, dosing rates) are collected, as are analytical data (IR, FBRM, PSD, HPLC, and pH in relevant cases).  Frequently there are ~5-10 measures of particle size per experiment and >100 measures of process data and concentrations.
    6.     The data are time-stamp aligned. 
    7.     The data are converted to the appropriate units for the PBM software.  For PSD data, quantiles or moments of the population are frequently used rather than the distributions.  If distributions are used, some software packages require that the population be converted from the customary “volume %” based distributions to “volume density” based distributions.  A separate tool is frequently used to accomplish this conversion.  For seeded desupersaturation experiments, the particle size distribution of the seed could be included but is frequently omitted (or approximated by quantiles or other measures) to enable better parameter regression in subsequent steps.  The need to approximate or fit the seed PSD may come from the stiffness of the governing differential equations and the solution methods underlying the system.  It is worth noting that “fit” and measured seed PSDs are often in qualitative agreement.  The FBRM data are frequently not used within the PBM software.  When they are used, they are often converted to PSD quantiles using separate tools.  HPLC measures of concentration can be used to calibrate the IR data (using tools separate from the PBM software) or, less frequently, are used directly in the PBM. 
    8.     The data are brought into the PBM software, and various nucleation, growth, agglomeration, breakage, and secondary nucleation models are “turned on” within the software.
    9.     The parameters within the specific crystallization models are regressed using the data from the desupersaturation experiments.
    10.   Parameter fitness characteristics are evaluated, and the set of models used is adjusted for improvement (e.g., primary nucleation “turned on” or “off”, growth fit to a power law rather than an exponential-based model).  Both statistical measures of fitness and qualitative assessments of how well the model predicts the characteristic trends of the desupersaturation experiments are used.  These assessments are made in terms of the solute concentration data from IR or HPLC and the particle size data.
    11.  The model is then used to optimize variables within the intended crystallization process (either through a constrained optimization problem or in silico scenario evaluation) and/or a crystallization experiment is conducted to cross-validate the parameters regressed in the PBM software. 
    While improvements in any or all of these steps are desired, this RFI highlights the need for improvements in steps 6-7 (which are manual and labor intensive) as well as steps 8-10 (which require modelling expertise beyond that of the typical intended user). 
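
As an illustration of the manual work in steps 6-7, the following minimal Python sketch resamples independently logged data streams onto a common time grid and converts a “volume %” histogram to a “volume density”. It is not taken from any PBM package. The function names (`interpolate_at`, `align`, `volume_percent_to_density`), the use of linear interpolation, the clamping at the ends of each series, and the linear (rather than logarithmic) size basis for the density are all assumptions made for the example.

```python
from bisect import bisect_left


def interpolate_at(times, values, t):
    """Linearly interpolate a (times, values) series at time t.

    `times` must be sorted in ascending order. Requests outside the
    logged range are clamped to the first or last value (an assumption).
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_left(times, t)  # times[i - 1] < t <= times[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def align(streams, grid):
    """Resample every named stream onto the same time grid."""
    return {
        name: [interpolate_at(ts, vs, t) for t in grid]
        for name, (ts, vs) in streams.items()
    }


def volume_percent_to_density(bin_edges, volume_percent):
    """Convert volume % per size bin to a volume density.

    Each bin's volume fraction is divided by the bin width, so the
    result integrates to 1 over the size range (linear size basis).
    """
    total = sum(volume_percent)
    return [
        (p / total) / (hi - lo)
        for p, lo, hi in zip(volume_percent, bin_edges[:-1], bin_edges[1:])
    ]


# Hypothetical data: times in minutes, temperature in degC,
# concentration in arbitrary units, sizes in micrometres.
streams = {
    "temperature": ([0, 10, 20], [50.0, 40.0, 30.0]),
    "concentration": ([0, 20], [1.0, 0.0]),
}
aligned = align(streams, grid=[0, 5, 15, 20])
density = volume_percent_to_density([0, 10, 30, 70], [20, 40, 40])
```

Some instruments report size on a logarithmic axis, in which case the bin width in the last function would be the difference of the logarithms of the bin edges. Which basis a given PBM package expects would need to be checked against its documentation.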

  • Can I get technical clarification on what this means - "Concentration data using FTIR should be able to be adjusted (offset or scaled) to match the measured or modeled solubility when the system is at equilibrium, with the ability to apply the same correction to the remainder of the concentration data.  This ability may help mitigate issues with parameter estimation and model stability"? Sometimes the concentration obtained from IR data appears to be offset from the predicted or measured solubility.  This can be an actual phenomenon (i.e., the system never fully desupersaturates) or an artifact of the instrument or IR calibration model.  PBMs may behave poorly when there is an offset between the measured and actual concentrations (e.g., fit parameters may not converge if the system appears to maintain supersaturation).  Continued investment in an IR calibration model after a “good enough” model has been obtained, with the sole purpose of eliminating this artifact, is often unwarranted or impractical.  The PBM software should be able to “subtract” the offset and adjust the concentrations throughout the experiment accordingly.
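
A minimal sketch of the kind of correction described in this answer is given below. It assumes the user has already identified which data points were recorded at equilibrium. The function name `equilibrium_correction`, its arguments, and the choice to average over the equilibrium points are hypothetical and only illustrate the offset and scaling options.

```python
def equilibrium_correction(measured, solubility, equilibrium_idx, mode="offset"):
    """Adjust an IR concentration trace to match solubility at equilibrium.

    measured        : concentrations from the IR calibration model
    solubility      : measured or modeled solubility at the same time points
    equilibrium_idx : indices of points judged to be at equilibrium
    mode            : "offset" subtracts a constant, "scale" multiplies by one

    The correction is computed from the equilibrium points only and is
    then applied to the whole trace.
    """
    if not equilibrium_idx:
        raise ValueError("at least one equilibrium point is required")
    n = len(equilibrium_idx)
    mean_measured = sum(measured[i] for i in equilibrium_idx) / n
    mean_solubility = sum(solubility[i] for i in equilibrium_idx) / n

    if mode == "offset":
        delta = mean_measured - mean_solubility
        return [c - delta for c in measured]
    if mode == "scale":
        factor = mean_solubility / mean_measured
        return [c * factor for c in measured]
    raise ValueError("mode must be 'offset' or 'scale'")


# Hypothetical trace (mg/mL): the last two points are at equilibrium but
# read 2 mg/mL above the solubility of 25 mg/mL.
measured = [52.0, 40.0, 31.0, 27.0, 27.0]
solubility = [25.0] * 5
shifted = equilibrium_correction(measured, solubility, [3, 4], mode="offset")
scaled = equilibrium_correction(measured, solubility, [3, 4], mode="scale")
```

Whether an offset or a scale factor is more appropriate depends on the source of the discrepancy. The sketch does not try to distinguish an instrument artifact from a system that genuinely never fully desupersaturates, so that judgment remains with the user.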

  • A requirement is to make the software compatible with Pi data historian by OSISoft and Delta V. I would like to understand this requirement in more detail. Is it so that larger scale crystallization processes in piloting or production can be modelled? Pi data historian and Delta V may be used for small scale experiments; the intention is flexibility.  It is worth noting that although direct compatibility of the software with these tools would be beneficial, it is not a requirement.  Direct compatibility with iControl is a requirement.  Compatibility with an acceptable intermediate file format (CSV) for these tools would be appreciated within the proposed solution. 

  • What information does "UV for assay/solution composition" provide? Offline concentration? Yes - offline concentration of solute or solvent ratios.  It may be beneficial when compared to HPLC for flow applications.  A possible application could be as a measure of inlet concentration within a continuous desupersaturation experiment.  

  • Comma-separated value data from arbitrary sources is requested. Can I have some examples of "arbitrary sources"? One example would be a CSV file containing the amount of anti-solvent fed, as recorded by a mass flow controller or a balance.  It may contain a header (one or several lines) and then at least two columns: one being a time stamp and the other a data stream. 
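
To make the file layout in this answer concrete, here is a small Python sketch that reads such a file using only the standard library. The assumptions are that time stamps are in an ISO 8601 style format (e.g. `2016-07-11 09:05:00`) and that any row whose first two fields do not parse as a time stamp and a number is part of the header. The function names and the sample file contents are invented for the example.

```python
import csv
import io
from datetime import datetime


def read_time_series(text, time_col=0, value_col=1):
    """Parse CSV text into (timestamps, values), skipping header rows.

    A row is treated as data only if its time column parses as an
    ISO 8601 time stamp and its value column parses as a float.
    """
    times, values = [], []
    for row in csv.reader(io.StringIO(text)):
        try:
            t = datetime.fromisoformat(row[time_col].strip())
            v = float(row[value_col])
        except (ValueError, IndexError):
            continue  # header, blank, or malformed row
        times.append(t)
        values.append(v)
    return times, values


def elapsed_minutes(times):
    """Convert absolute time stamps to minutes since the first one."""
    t0 = times[0]
    return [(t - t0).total_seconds() / 60.0 for t in times]


# Hypothetical balance export: two header lines, then time stamp and mass.
sample = (
    "Balance export\n"
    "Time,Antisolvent mass (g)\n"
    "2016-07-11 09:00:00,0.0\n"
    "2016-07-11 09:05:00,12.5\n"
    "2016-07-11 09:10:00,25.0\n"
)
times, masses = read_time_series(sample)
minutes = elapsed_minutes(times)
```

Silently skipping rows that do not parse keeps the sketch short, but it would also hide corrupted data rows. A production importer would more likely ask the user to confirm the number of header lines and the time stamp format.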

  • A key element for <VENDOR> to understand is the potential market size for this package. Could members of the ETC support a market study to help us here?  Currently we don't have one available and do not have any plans to create one.

  • What kind of supersaturation generation methods need to be covered?  Cooling, anti-solvent, reactive, pH shift, etc.? The desire is to have a versatile modeling platform that can handle supersaturation generation via cooling, anti-solvent, distillation (i.e., evaporation and/or solvent swap), reactive, and pH swing.

  • Which phenomena need to be modeled?  Primary nucleation, Secondary nucleation due to attrition, Growth, Agglomeration?  Are there particular mechanistic models that should be included? Again, the desire is to have a comprehensive, versatile model so the tool should handle primary and secondary nucleation, growth, agglomeration, and breakage.  Recognizing that there are different models available for each of these mechanisms, we would like the tool to allow selection of the most appropriate representations for each of these mechanisms.
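
One way to picture the per-mechanism model selection requested here is a configuration that maps each mechanism to a rate expression, with unused mechanisms “turned off”. The Python sketch below is purely illustrative: the empirical power-law form, the driving force (relative supersaturation), the mechanism names, and all parameter values are placeholder assumptions rather than recommended models, and real secondary nucleation expressions usually depend on further quantities such as suspension density.

```python
def relative_supersaturation(c, c_sat):
    """Relative supersaturation, (c - c_sat) / c_sat."""
    return (c - c_sat) / c_sat


def power_law(k, n):
    """Return a rate function k * sigma**n, zero when not supersaturated."""
    def rate(sigma):
        return k * sigma ** n if sigma > 0 else 0.0
    return rate


def off():
    """Return a rate function for a mechanism that is turned off."""
    return lambda sigma: 0.0


# Hypothetical selection: one expression per mechanism.
# Parameter values are placeholders, not regressed from data.
config = {
    "growth": power_law(k=1.0e-7, n=1.5),
    "primary_nucleation": off(),
    "secondary_nucleation": power_law(k=1.0e6, n=2.0),
}

sigma = relative_supersaturation(c=30.0, c_sat=25.0)
rates = {name: model(sigma) for name, model in config.items()}
```

Swapping the expression for a mechanism then amounts to changing one entry in `config`, which is the kind of low-effort model discrimination the workflow above calls for.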

  • Should the ability to model wet milling processes be included? This was not stated explicitly as a requirement, but the feature would certainly add value and increase the versatility and applicability of the modeling platform.  The ability to apply wet milling at different stages in the process would also be desired, e.g., wet milling while crystallizing, or wet milling after crystallization followed by a heat/cool annealing (fines dissolution) step.

  • Which operating modes need to be considered?  Batch as well as continuous?  What about flowsheeting?  E.g. for multi-stage crystallization processes or batch processes with a wet mill in a recycle loop? The models should be applicable for both batch and continuous operation.  The capability to do flowsheeting for more complex operations like multi-stage crystallization or wet milling in a recycle loop would provide the flexibility and broad utility that is ultimately desired.  The development of the modeling platform could begin with the establishment of a simpler foundation (a single batch or continuous crystallizer) and then expand to support these more complex applications as a “stage 2” deliverable.

  • What are the requirements with respect to physical properties other than solubility?  Is interfacing with commercial property packages such as Aspen Thermodynamics of importance? Yes – ties to available commercial property packages are important and desired.  Aspen is one example. DIPPR, available through a couple of interfaces, is another example.  The connection should provide for an archive of the property value used in the simulation and the source of the value, and should account for the dependence of the property on the temperature and composition of the system.  The impact of the API on the properties of the system could be disregarded (e.g., the API's impact on the thermal conductivity).

  • As regards CFD interfaces for compartmental models, which CFD tools are used by the ETC members? ANSYS Fluent would be the primary tool, but Comsol and OpenFoam/MixIt (Tridiagonal) may also be of interest.