3.9. Optimization Workflow¶
Despite the formal simplicity of performing optimization in pSeven Core, proper problem definition is an intricate procedure and requires care and effort. This page provides simplified instructions on how to get the best out of GTOpt and solve your problem efficiently.
- Problem Investigation
- Local and Global Methods
- Preliminary Study
- Problem Solving
This section describes general guidelines and rules worth following to obtain the most economical yet adequate representation of the models you are going to investigate. At this stage you have to formally define the problem type as one of the following:
- Single-objective non-linear constrained problem (NLP) or its unconstrained analog (UNLP). Only one performance measure, no uncertain parameters. This type is the simplest and most preferable to solve; it is always worth starting model investigation with it.
- Multi-objective non-linear constrained or unconstrained problem (MOP). Multiple performance measures, but no uncertain parameters. It is more complex than NLP, in particular because the analysis of the obtained solution is far from straightforward.
- Robust optimization problem (RO). One or several objective functions and constraints, where at least some responses depend on a number of uncertain parameters. This type is the most difficult to treat.
The importance of this stage can hardly be overestimated. Indeed, a badly posed problem cannot be solved in principle, no matter how perfect the optimizer is. Moreover, proper problem definition is arguably even more important than the optimization per se. Therefore let us consider some general rules worth following in order to set up the problem properly.
This section includes a few general rules to follow when defining optimization objectives — the performance measures of the model under investigation.
Consider only relevant model responses.
In fact, in any optimization problem you can artificially introduce an infinite number of additional "objectives" only slightly related to the model under study, which make the problem practically unsolvable. In particular, if you are uncertain about the relevance of some response, it is worth excluding it in the first approximation. After all, response relevance can be inspected a posteriori, so it is always better to start with a minimal set of performance measures.
Always estimate uncertainties of both the model responses and the model itself.
The importance of this requirement follows from the very simple observation that it makes no sense to improve strongly biased or uncertain model responses. In short, effects unaccounted for by the model itself (systematic uncertainties) must be much smaller than the expected improvement of the responses. This is discussed in more detail in section Preliminary Study.
Ensure proper scaling of selected performance measures.
Keep in mind that ultimately we are going to perform numerical optimization with all its inherent limitations and headaches. Badly scaled problems, in which various responses differ by many orders of magnitude, are difficult to handle numerically, hence it is always worth avoiding bad scaling. After all, it really makes no sense to measure interstellar distances in nanometers. To summarize: keep objectives of the same order of magnitude; it does no harm.
GTOpt automatically performs rudimentary problem scaling, doing its best to ensure numerical stability. However, as always happens with automated approaches, there are difficult cases in which the automatic procedure might fail. Please remember that there is no magic inside GTOpt; it cannot cure pathological cases.
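To illustrate, the sketch below (plain Python, independent of the GTOpt API; the response names and reference magnitudes are hypothetical) rescales two objectives of very different magnitude to comparable units before they are passed to an optimizer:

```python
def scale_objectives(raw_values, reference_values):
    """Divide each raw objective by its typical magnitude so that all
    scaled objectives are of order one."""
    return [raw / ref for raw, ref in zip(raw_values, reference_values)]

# Hypothetical responses: a mass of ~1e6 g and a deflection of ~1e-3 km
# differ by nine orders of magnitude, which is numerically unhealthy.
raw = [1.2e6, 3.4e-3]
reference = [1.0e6, 1.0e-3]   # typical magnitudes, estimated beforehand
scaled = scale_objectives(raw, reference)   # both are now of order one
```

The reference magnitudes here play the same role as the units you choose when posing the problem: a rough a priori estimate is enough, since the goal is only to avoid many orders of magnitude between objectives.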
Decide the number of performance measures you want to consider.
The importance of this step is self-evident: of course, one wants to investigate the most general case, but the multi-objective formulation is much more difficult to handle. Therefore, it is always better to start with the single most relevant objective function, augmenting the problem setup with additional performance measures if no difficulties are identified along the way. Do not worry: model evaluations performed at the simplified single-objective stage are not lost, since GTOpt can fruitfully reuse the archive of already performed evaluations. Note that it is very rare that one wants to consider more than a few objectives: usually the solution of a multi-objective problem provides such a large variety of designs that their analysis is close to impossible.
General recommendations for proper selection of design variables are similar to those given for objective functions.
Consider only relevant model parameters.
Note that pSeven Core allows you to perform preliminary studies to rank the relevance of various design parameters.
Ensure proper scaling of design variables.
Again, GTOpt automatically performs design space rescaling, but any automated procedure has a chance to fail in pathological cases. In particular, internal rescaling cannot cure numerical instabilities, which arise when design variables differ by many orders of magnitude.
Consider only bounded design variables.
The existence of finite bounds can be argued quite generically. In the numerical optimization context, finite bounds also stabilize the optimization process, which otherwise may spend considerable time exploring remote corners of the design space (this is especially true for globalized methods, which have to investigate global model behavior). At the very least, finite bounds on design variables should reflect the validity region of the underlying model.
Constraints usually represent a subset of model responses or are given externally as more or less simple semi-analytic functions (geometry constraints). As far as geometry constraints are concerned, the rule is straightforward: keep them as simple as possible; if a geometry constraint can be written as a linear combination of design variables, then it must be written this way (and the optimizer should be advised accordingly).
Recommendations for generic constraints are very similar to those for objective functions.
- Constraints should be relevant to underlying processes.
- Constraint functions should be scaled properly.
- Imposed constraints should have a sufficient degree of certainty.
An additional constraint-specific consideration concerns equality constraints. As a matter of fact, exact equalities are not welcome, especially in real-life applications, where you often want to monitor the feasibility of optimization iterates. Usually, exact equalities are an oversimplification, because the gap between the imposed limits cannot be smaller than the (finite) response uncertainty. Thus, it is always worth keeping this gap finite.
Although box bounds of the type \(x_L \,\le\, x\,\le\, x_U\) could be considered a subset of generic non-linear constraints \(c_L \,\le\, c(x)\,\le\, c_U\), it is quite convenient to treat them separately from the very beginning. Thus GTOpt treats box bounds and general constraints differently, and the user is advised to provide them separately. Although it is admissible to express everything in terms of generic constraints, this would definitely result in severe performance degradation. Moreover, there is another important difference between generic and box constraint treatment in GTOpt: box bounds are always respected, while the amount of generic constraint violation during the optimization process depends on the particular algorithm used; normally generic constraints are violated until GTOpt converges to a solution.
As far as the relevant bounds \(c_L, c_U\), \(x_L, x_U\) are concerned, they are subject to only two obvious requirements: a) \(c^j_L \le c^j_U\), \(x^k_L \le x^k_U\) and b) \(c^j_L\) and \(c^j_U\) cannot be at their respective limiting values \(-\infty\) and \(+\infty\) simultaneously (unbounded constraints are not supported); see below for the precise meaning of \(\pm\infty\) in GTOpt. Otherwise, the imposed limits are almost arbitrary; in particular, the user may set \(c^j_L = c^j_U\) to specify equality constraints and \(x^k_L = x^k_U\) to effectively freeze a particular design variable.
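These requirements are easy to check programmatically. The sketch below (plain Python, not part of the GTOpt API) validates a single bound pair according to the rules above; applied to variable bounds, the "equality" case corresponds to a frozen design variable:

```python
import math

def classify_bounds(lower, upper):
    """Validate one (lower, upper) bound pair per the rules above.

    Returns "equality" when lower == upper (for a design variable this
    effectively freezes it), otherwise "range". Raises ValueError in
    the two rejected cases: lower > upper, or both bounds infinite
    (unbounded constraints are not supported).
    """
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    if math.isinf(lower) and math.isinf(upper):
        raise ValueError("doubly infinite bounds are not supported")
    return "equality" if lower == upper else "range"

assert classify_bounds(0.0, 0.0) == "equality"       # c_L = c_U
assert classify_bounds(-math.inf, 1.0) == "range"    # one-sided bound is fine
```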
Specification of uncertain parameters is tricky because it is usually not straightforward to identify the uncertainties that are indeed relevant to the considered model. However, it is very important, because the presence of uncertainties radically changes the type and complexity of the optimization task: it becomes a robust optimization problem, which is orders of magnitude more difficult to handle.
Therefore, considering a proper selection of uncertain parameters, the only general recommendation possible is:
- Always try to avoid uncertain parameters.
It is not a fatal simplification to hold identified uncertainties fixed, because an uncertainty quantification (UQ) analysis of the obtained simplified solution can be performed a posteriori. It often happens that uncertainties at the established optimal design are negligible, and hence the simplified formulation is adequate. Otherwise, the simplified solution not only provides a good starting point for robust optimization (perhaps with localized methods), it also quantifies (via UQ) the relevance of the various uncertain parameters. In many cases such a preliminary study can easily halve the number of original uncertainties. In either case, the summary is: even if uncertain parameters are unavoidable, try to reduce their number as much as possible.
It is of crucial importance to decide beforehand the type of the study you are going to perform. In broad terms, optimization can be performed either locally or globally. Since the meaning of the terms “local” and “global” regarding optimization is rather vague, this section aims to provide a basic explanation of how they are understood in GTOpt.
Strictly speaking, local optimization methods are characterized by the following property: given an initial design, they reach a locally optimal solution located close to the starting point. Unfortunately, no one can rigorously define what "close" means: all classic local algorithms are able to bypass the nearest locally optimal solution and find some other instead. Nevertheless, the key feature of local methods is obvious: they essentially stop searching at the first locally optimal solution found and make no attempt to establish the global optimality of that design.
Alternatively, we could characterize local methods as those that explore the available design space from small to large distances: every iteration of a local method reduces to investigating the close vicinity of the current iterate, followed by some, perhaps comparatively large, step towards optimality. This is the sense in which a method's evolution from small to large distances is understood in GTOpt (for more details, see also section Preliminary Study below).
Contrary to local approaches, global optimization methods always try to ensure that the found solution is globally optimal. There are several types of globalized algorithms. The simplest globalized methods (known as multi-start strategies) just apply some local algorithm repeatedly, starting from different "well separated" initial designs, and compare the resulting set of locally optimal solutions afterwards. As a consequence, globalized methods of this family differ only slightly from local approaches; in particular, they follow the same investigation pattern and evolve from small to large distances. More advanced global methods, applicable in the computationally expensive context, are those that utilize approximation models (surrogate based optimization, SBO). These methods are quite different, as they break the above investigation paradigm and always proceed from large to small distances: initial global models of the problem functions are gradually refined in promising regions until either the lowest allowed design space resolution is reached or the computational budget is exhausted and optimization stops.
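To make the distinction concrete, here is a minimal multi-start sketch in plain Python. It illustrates the idea only and is not the algorithm GTOpt implements: a crude random local descent (a hypothetical stand-in for a proper local method) is restarted from several well separated points, and the best local solution is kept.

```python
import math
import random

def local_descent(f, x0, step=0.1, iters=300, seed=0):
    """Crude local search: accept a random nearby point whenever it
    improves f. A stand-in for a real local optimization method."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        fc = f(cand)
        if fc < fx:
            x, fx = cand, fc
    return x, fx

def multistart(f, starts):
    """Restart the local search from well separated points, keep the best."""
    return min((local_descent(f, x0) for x0 in starts), key=lambda r: r[1])

def f(x):
    # Multi-modal test function; its global minimum lies near x = -0.29.
    return x * x + math.sin(5.0 * x)

best_x, best_f = multistart(f, [-3.0, -1.5, 0.0, 1.5, 3.0])
# Starts far from the origin get trapped in shallow local minima,
# while the start near 0.0 descends into the global basin.
```

Note how this family inherits the local investigation pattern: each restart still evolves from small to large distances, exactly as described above.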
Evidently, global optimization is rather resource consuming, and we advise always starting problem treatment with local methods (unless your problem is computationally expensive and SBO remains the only possibility). To control the computational load and transition smoothly from local to global algorithms, a parameter is required that shifts the optimization method closer to its local or global limiting case. For multi-start methods such a parameter is easy to establish: for instance, it may control the number of different initial guesses to consider. For surrogate-based methods it is more complicated: it is by no means evident how an SBO-like methodology could be continuously transformed into a local search method. Fortunately, this transformation is possible and is rooted in the old dilemma of surrogate model exploration versus exploitation. For virtually any SBO method, the exploration/exploitation balance can be parameterized with a single value like the one mentioned above. The only peculiarity is that the limit of purely local treatment cannot actually be reached in SBO: any surrogate-based treatment includes an initial training stage and for this reason is always globalized to a certain extent.
GTOpt explicitly introduces this parameter as the GTOpt/GlobalPhaseIntensity option that takes values from 0.0 (applies purely local methods) to 1.0 (applies maximum possible globalization). Any non-zero value forces GTOpt to apply globalized methods, while its magnitude adjusts the degree of global search and the fraction of computational budget allocated to globalization.
In most cases the default GTOpt/GlobalPhaseIntensity is recommended; however, some tweaks might be necessary in specific situations. For non-expensive problems, the globalization parameter may vary significantly and should only approximately reflect the expected degree of problem multi-modality. For instance, if you expect only a few locally optimal solutions, a value of 0.1 or even 0.01 should be sufficient. Computationally expensive problems require more consideration. First of all, in expensive problems it is always recommended to set the total evaluation budget (GTOpt/MaximumIterations and GTOpt/MaximumExpensiveIterations). As for the choice of the GTOpt/GlobalPhaseIntensity value in computationally expensive problems, there are two general cases:
The problem is multi-modal and there is no proper guess for the location of the optimum.
This is the general scenario of global optimization, where entire design space should be searched for a solution. Sufficiently large GTOpt/GlobalPhaseIntensity values have to be used because they favor exploration in surrogate modeling thus providing good design space coverage.
Initial guess for the optimal design is available and multi-modality is out of consideration. The prime purpose is to improve the current best design in its vicinity.
In this case, GTOpt/GlobalPhaseIntensity should be set to a relatively small value. Supplemented with an explicitly set budget, this strategy allows quite accurate investigation of the best design's vicinity. A typical GTOpt/GlobalPhaseIntensity value in this case is below 0.1.
Once a problem is set up (meaning primarily that the model type is decided), it is worth carrying out a few small preliminary experiments with the formulated models in order to determine their basic properties. This not only helps to detect and diagnose possible technical issues, but also helps GTOpt to select the algorithm best suited to solve the problem.
At this stage the problem has been formally defined, but its properties remain completely obscure (at least to GTOpt). Surely, it is always possible to just plug the model into GTOpt, select the most general properties for all responses (generic non-linear, expensive to evaluate functions) and start optimization in the hope that GTOpt does the job anyway. However, under such generic assumptions the most robust "fool-proof" algorithm will be used to get the solution, and it might not be the most efficient. To get the best of GTOpt's capabilities, it is worth providing additional information so that the best suited optimization method can be chosen more intelligently.
Gathering additional information before running the actual optimization is helpful in the vast majority of cases. It allows early detection of technical (programming) errors in the problem setup, estimation of the underlying model's validity and robustness and, in particular, might indicate whether optimization makes sense at all.
Generally, the preliminary study of a model can be divided into two steps, which are worth doing in the following order:
- Investigation of the model’s behavior at large scales (over all design space).
- Study of small scale properties of the model.
Note that only the first stage is really mandatory. Indeed, its prime purpose is to ensure that optimization of the formulated model makes sense at all (it might well not) and to estimate, by order of magnitude, the optimization potential (the amount of improvement of the performance criteria). Investigation of small scale model properties may be skipped, but at the expense of potential performance degradation.
The preferable way to examine the large scale behavior of a model is to perform a design of experiments (DoE) study aimed at probing the model's responses over the whole design space. pSeven Core provides a large variety of DoE scenarios; for our purposes all of them are equally good. The DoE outcome is then analyzed (this is quite trivial and can be done with standard tools) with the goal of answering the questions discussed in this section.
How stable are model responses? What is the actual model validity region?
Here the recipe is to calculate the fraction of unsuccessful model evaluations. These are not only the cases when the model fails to produce an outcome (for whatever reason); physically suspicious and/or senseless responses are to be counted as well. The total fraction of unsuccessful runs quantifies model stability: if it is smaller than about 20%, the forthcoming optimization has good chances to succeed. If this fraction is large (more than about 50%), it is a sign of trouble: optimizing undefined or fake responses is clearly devoid of meaning. The remedy is to reduce the design space, but unfortunately the reduction details are strongly case-dependent. Let us consider a few proposals:
Diminish the box bounds.
Often the imposed box bounds turn out to be too optimistic, and the underlying model becomes invalid well inside them. Try reducing the box bounds until either the fraction of unsuccessful runs becomes acceptable or the boxes cannot be reduced further.
Introduce additional linear constraints.
The rationale for additional linear constraints comes from the simple observation that in high dimensions the volume of the design space cannot be reduced by box bounds alone (most of the design volume is concentrated near the corners of the imposed box). If the model validity region is located near the origin, box bounds cannot help to reduce the formally available design space.
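This geometric claim is easy to verify. The sketch below (plain Python) computes the fraction of the box \([-1, 1]^d\) occupied by the inscribed unit ball; it decays rapidly with dimension, so almost all of the box volume indeed sits near the corners, outside any central region:

```python
import math

def inscribed_ball_fraction(d):
    """Fraction of the box [-1, 1]^d covered by the inscribed unit ball:
    vol(B_d) / 2^d = pi**(d/2) / Gamma(d/2 + 1) / 2**d.
    """
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) / 2 ** d

# The central region shrinks dramatically with dimension:
#   d = 2  -> ~0.785 (pi/4)
#   d = 10 -> ~0.0025
#   d = 20 -> ~2.5e-8
for d in (2, 10, 20):
    print(d, inscribed_ball_fraction(d))
```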
A possible solution is to introduce additional linear constraints, which essentially forbid designs near the corners of the box bounds. Of course, this remedy is not unique, and you can try any other suitable approach. However, it fits well with the way linear constraints are handled in GTOpt: most optimization methods respect linear constraints strictly, and the remaining algorithms do not maintain them exactly, but the corresponding degree of violation is always negligibly small.
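Returning to the stability check that opened this question, counting unsuccessful evaluations is simple bookkeeping over the DoE outcome. A hedged sketch in plain Python (the failure markers None and NaN are hypothetical conventions; adapt them to how your model actually reports errors):

```python
def failure_fraction(responses):
    """Fraction of DoE points with no meaningful response.

    Here an evaluation counts as failed if it returned None (the model
    crashed) or NaN (a senseless output); extend this with any
    physically motivated sanity checks relevant to your model.
    """
    failed = sum(1 for r in responses if r is None or r != r)  # NaN != NaN
    return failed / len(responses)

# 2 failures out of 10 runs: a borderline but workable 20% failure rate.
doe = [1.2, None, 0.8, float("nan"), 1.1, 0.9, 1.0, 1.3, 0.7, 1.05]
assert abs(failure_fraction(doe) - 0.2) < 1e-12
```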
What is the variation range of each model response, and how does it compare to the model uncertainty?
This is the crucial question for justifying optimization of the model in question, and it is to be considered carefully. The relevant scales to compare are the variation magnitude of the model responses over the whole design space and the expected uncertainty of the corresponding responses. While the first quantity is easily derived from the DoE, estimating the second requires specific knowledge of the model in question, and no generic recommendations can be given a priori (see, however, the notes in section Small Scale Properties). Nevertheless, we emphasize that model optimization makes sense only when the first quantity is (much) larger than the second; otherwise it will be impossible to separate the effect of optimization from the generic "random" variability of responses due to the respective uncertainties.
The prime purpose of this study is to estimate the noisiness of the model responses. The general recommendation is to conduct a few runs in which the design parameters are changed by a tiny amount along an arbitrary line in the design space. In the vast majority of cases it can be argued that a small change of design parameters should cause only small changes in the responses. Moreover, to leading order the corresponding graphs should be linear with respect to the chosen line coordinate. In practice, there are two possibilities:
- either measured graphs are indeed close to linear, or
- measured responses exhibit random abrupt deviations from linear law.
The former case is fortunate indeed: it confirms sufficient smoothness of the responses, thus allowing gradient-based optimization methods. In the latter case, the model clearly exhibits noisy behavior, and one should compare the measured magnitude of the random deviations (the noise magnitude) to the large scale variability of responses examined previously. As a matter of fact, the forthcoming model optimization makes sense only when the noise magnitude is much smaller than the expected large-scale variations. If this is not the case, the model uncertainty has to be reduced first.
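The small scale study can be automated along these lines: sample the response on a short segment, fit a straight line, and take the largest residual as a crude noise magnitude estimate. A sketch in plain Python (the two response functions are hypothetical stand-ins for a real model):

```python
import random

def noise_along_line(response, ts):
    """Fit response(t) over ts with least squares and return the maximum
    absolute residual as a crude estimate of the noise magnitude."""
    ys = [response(t) for t in ts]
    n = len(ts)
    mt, my = sum(ts) / n, sum(ys) / n
    slope = (sum((t - mt) * (y - my) for t, y in zip(ts, ys))
             / sum((t - mt) ** 2 for t in ts))
    intercept = my - slope * mt
    return max(abs(y - (intercept + slope * t)) for t, y in zip(ts, ys))

ts = [i * 1e-3 for i in range(11)]          # tiny steps along one line

smooth = lambda t: 2.0 * t + 1.0            # well-behaved response
assert noise_along_line(smooth, ts) < 1e-9  # residuals are negligible

rng = random.Random(1)
noisy = lambda t: 2.0 * t + 1.0 + rng.uniform(-0.1, 0.1)
assert noise_along_line(noisy, ts) > 1e-3   # noise clearly dominates
```

The resulting noise magnitude is exactly the quantity to compare against the large scale response variation obtained from the DoE study above.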
The actual process of solving an optimization problem with GTOpt (that is, implementing a Python script) consists of a few rather simple steps. These steps utilize information obtained at previous stages; GTOpt relies on it to automatically select the most appropriate solving method. Due to this, your major concern is investigating the problem and understanding its properties (see sections Problem Investigation and Preliminary Study for a guide). On one hand, GTOpt can be used successfully with very little programming skill (knowing the very basics of general OOP is enough). On the other, no amount of coding and tuning advanced options can help if GTOpt is not provided with reliable information about the problem.
A brief summary of what should be done before you open a code editor:
- Identify performance criteria, objectives and constraints. Select relevant variables and responses.
- Decide the problem type: single- or multi-objective, constrained, unconstrained and so on.
- If you know that some response functions are linear or quadratic, note this.
- Note the computational complexity of model response functions. GTOpt can handle expensive functions differently, making the optimization process more efficient in terms of the number of function evaluations. See section Surrogate Based Optimization for more details.
- Note the expected noise magnitude of model responses. It is a qualitative property: GTOpt only has to know whether you consider objectives or constraints noisy or not.
- Note the expected degree of model multi-modality. It is worth indicating whether the GTOpt solver should perform a local or a global search for an optimal solution.
After you gather the problem information, refer to the following sections for implementation guides:
For a basic example, see GTOpt Quick Start.
To specify response function types (linear, quadratic), use function hints when setting up a problem. See the Hint Reference for the @GTOpt/LinearityType hint for objectives and constraints.
To select expensive functions, use the @GTOpt/EvaluationCostType hint. Note that a problem with expensive functions requires numeric bounds for all variables (see Surrogate Based Optimization); in other cases, GTOpt can handle unbounded variables, but it is also recommended to avoid this (see Selecting Design Variables).
For a problem with expensive functions it may also be worth setting the maximum allowed number of expensive evaluations. Use the GTOpt/MaximumExpensiveIterations option to add this limit. Options are set when you configure a GTOpt solver; see gtopt.Solver.options and section Options Interface for details.
If problem objectives or constraints are noisy, refer to the GTOpt/ObjectivesSmoothness and GTOpt/ConstraintsSmoothness options. These again are part of the solver configuration. Note that you do not have to specify the amount of noise, only a general assumption is needed.
To configure the global search, use the GTOpt/GlobalPhaseIntensity option. Note that an aggressive global search greatly increases the time to solve even the simplest problems; if you enable globalization, start from low option values around 0.1 (see the option description for details).
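Putting the configuration steps above together, a solver setup might look as follows. This is a sketch only: it assumes the da.p7core package layout and the options interface shown in GTOpt Quick Start and Options Interface, and the option values are illustrative rather than recommendations.

```python
# Sketch only: assumes the da.p7core package layout from GTOpt Quick Start.
# Option values are illustrative; check each option's admissible values
# in the option reference before use.
from da.p7core import gtopt

solver = gtopt.Solver()
# Cap the budget of expensive response evaluations:
solver.options.set("GTOpt/MaximumExpensiveIterations", 100)
# Enable a mild global search (local methods plus light globalization):
solver.options.set("GTOpt/GlobalPhaseIntensity", 0.1)
# Smoothness assumptions (GTOpt/ObjectivesSmoothness and
# GTOpt/ConstraintsSmoothness) are set in the same manner.
```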
Keep in mind that this page discusses only the essential steps of studying and solving an optimization problem. In all cases it is recommended to begin with a basic configuration of the problem and solver, following the brief guide in this section and with regard to the general recommendations in the sections above. This configuration can then serve as a base for a thorough problem study using advanced GTOpt features.