July 26, 2022

# Stepped Variables in Optimization and Adaptive Design

## Introduction

We use variables to describe, measure, manipulate and control properties of an object. The values of a variable can vary or be distributed across the set of all values that the variable can possibly have. Several types of variables exist, and it is important to choose the right type when designing studies, optimizing, and interpreting results. A strong understanding of variables can lead to more accurate analysis and results. Until recently, pSeven supported three types of variables: continuous, discrete and categorical.

Starting with pSeven v6.18, a new variable type is available: the stepped variable. In this tech tip, we describe stepped variables, explain the differences between continuous, discrete, categorical and stepped variables, discuss usage scenarios for each variable type, and walk through examples of using stepped variables in optimization and in an adaptive design of experiments study.

## Definition

Sometimes variables take a wide range of values in a continuum. If we consider a weight variable between 1 and 2 kg, the number of possible values is unlimited: 1.005, 1.7, 1.33333, and so on. Such a variable is called continuous.

On the other hand, a discrete variable has a finite number of values, representing discrete quantities. The discrete type is intended for design variables that have a relatively small set of allowed values. We can compare, multiply and subtract these values, but the function is known only for values from the list. **Figure 1** illustrates which values a function *F* can take depending on the type of the variable *X*.

*Fig. 1. Different types of variables*

Box bounds are the simplest type of constraint. They are implicitly assumed and always respected: it is guaranteed that no evaluation of the problem happens outside the imposed box bounds.

Categorical variables belong to a kind of measurement called nominal. That is, they can be measured only in terms of whether the individual items belong to certain distinct categories, but we cannot quantify or even rank order the categories. The variable gender, for example, has only two values (male and female). Or, the variable color can take green, red, and blue values, for instance. Variables that take only a handful of discrete non-quantitative values are categorical variables.

Stepped variables are intended for the cases where the dependency between a variable and responses is in fact continuous, but the generated designs are required to include only predefined variable values. This makes stepped variables different from discrete variables, which do not imply continuity. A typical example of a stepped variable is a geometry parameter of a part selected from a catalog. For example, the catalog may contain metal plates with thickness 1.0, 1.1, …, 4.9, 5.0 mm, so the thickness variable cannot have an arbitrary value between 1.0 and 5.0. However, plate properties such as mass and strength are continuous functions of the thickness variable: a plate with 2.625 mm thickness can exist, it is just not acceptable in the design. In this example, the plate thickness is naturally defined as a stepped variable in [1.0, 5.0] with step size 0.1. Based on this definition **Figure 1** can be replaced by the following plot (**Figure 2**).
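The plate catalog described above can be sketched in a few lines of Python. This is only an illustration of the grid idea: the helper names `stepped_grid` and `snap_to_grid` are hypothetical and are not part of pSeven, and snapping an arbitrary value to the nearest step is shown purely to make the grid visible, not as a description of how the block generates designs.

```python
def stepped_grid(lower, upper, step):
    """Return all values lower + k * step that lie within [lower, upper]."""
    n_steps = round((upper - lower) / step)
    # Rounding hides floating-point noise such as 1.2000000000000002.
    return [round(lower + k * step, 10) for k in range(n_steps + 1)]


def snap_to_grid(value, lower, upper, step):
    """Return the grid value nearest to `value`, clamped to the bounds."""
    n_steps = round((upper - lower) / step)
    k = round((value - lower) / step)
    k = min(max(k, 0), n_steps)
    return round(lower + k * step, 10)


# Catalog of plate thicknesses from the example: 1.0, 1.1, ..., 5.0 mm.
thickness = stepped_grid(1.0, 5.0, 0.1)
nearest = snap_to_grid(2.625, 1.0, 5.0, 0.1)
print(len(thickness), thickness[:3], thickness[-1], nearest)
```

The 2.625 mm plate mentioned in the text is physically meaningful but falls between two catalog entries, which is exactly the situation a stepped variable is meant to describe.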

*Fig. 2. Stepped variables as a new type in pSeven*

Stepped variables were introduced to speed up design space exploration when a variable has a strictly defined set of values. The next section describes how stepped variables are implemented in pSeven.

## Variables in pSeven

**Continuous variables** are supported by all techniques in pSeven. For a continuous variable, the block generates values from the interval specified by its bounds. Some techniques assign levels to continuous variables internally, for example Full factorial or Orthogonal array design. The bounds (Lower bound and Upper bound properties) are required for a continuous variable, unless it is constant. The range between bounds must be 10e-6 or greater. Variables are specified in the Design space exploration block on the Variable pane (**Figure 3**).

*Fig. 3. Configuration of different variable types*

The set of **discrete values** is specified by the Levels property, which is required for a discrete variable. Levels of a discrete variable may be arbitrary: they are not required to be placed at regular intervals. This also implies that the dependency between a discrete variable and responses may be discontinuous, although the values of a discrete variable can be compared numerically (as less, greater, or equal). This assumption is important in the optimization and adaptive design techniques, where using discrete variables is not recommended: when possible, prefer to define a variable as stepped (see below) with an appropriate step rather than as discrete with preset levels.

Discrete variables are unsupported or only partially supported by many techniques. Some of the techniques that support discrete variables also impose additional requirements on their properties. Partial support generally means that a technique allows discrete variables in its configuration but uses an algorithm that cannot work with discrete variables. In such cases, the block usually runs an independent study for each combination of levels of discrete (and categorical) variables, and then merges the results. Levels of discrete variables are also checked when evaluating an initial sample. Points of the initial sample that contain values of discrete variables not matching their levels are considered infeasible and are excluded from the feasible and optimal result matrices. See more: Initial Samples and Results.

**Categorical variables** are a further generalization of the discrete variable type. They have a limited set of allowed values specified by the Levels property. This property is required and must define at least 2 levels. Levels may be numbers or strings; all level values must be unique and must have the same type.

Even when levels are numeric, a categorical variable is never processed as a numeric one: its values can be compared for equality only (match or no match). Categorical variables can be used to define a set of keys recognized by a blackbox. A typical example is a variable which enumerates material grades, like ("S235", "S275", "S355"). Generated design points will contain one of these keys; the blackbox shall recognize the key received and use the characteristics of the specified material when evaluating the design.
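A blackbox that recognizes such keys might look roughly like the sketch below. Everything in it is an assumption made for illustration: the function name `blackbox`, the `applied_stress_mpa` argument, the returned safety factor, and the lookup table, which uses the nominal yield strength suggested by each grade name.

```python
# Illustrative lookup table (assumption): nominal yield strength in MPa per grade.
YIELD_STRENGTH_MPA = {"S235": 235.0, "S275": 275.0, "S355": 355.0}


def blackbox(grade, applied_stress_mpa):
    """Hypothetical blackbox: return a safety factor for the given material key."""
    if grade not in YIELD_STRENGTH_MPA:
        # The key is only ever compared for equality, never numerically.
        raise ValueError(f"unknown material grade: {grade!r}")
    return YIELD_STRENGTH_MPA[grade] / applied_stress_mpa


factor = blackbox("S355", 177.5)
print(factor)
```

The key itself carries no numeric meaning for the study; only the blackbox knows how to translate it into material characteristics.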

Most techniques work with categorical variables by running an independent study for each combination of levels of categorical (and discrete) variables and then merging the results. Some techniques which support categorical variables impose additional requirements on them.
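The "one independent study per combination of levels" idea can be sketched as follows. This is a minimal illustration, not the block's implementation: `run_per_combination`, `toy_study`, and the variable names are hypothetical, and the real merging logic in pSeven may differ.

```python
from itertools import product


def run_per_combination(levels, run_study):
    """Run `run_study` once per combination of levels and merge the results.

    `levels` maps a variable name to its list of allowed levels.
    `run_study` is a hypothetical callable that receives the fixed values
    and returns a list of design points (dicts) for the remaining variables.
    """
    names = list(levels)
    merged = []
    for combo in product(*(levels[name] for name in names)):
        fixed = dict(zip(names, combo))
        for point in run_study(fixed):
            merged.append({**fixed, **point})
    return merged


def toy_study(fixed):
    # Placeholder for a real study: two points of one continuous variable.
    return [{"x": 0.0}, {"x": 1.0}]


levels = {"material": ["S235", "S275", "S355"], "bolts": [4, 6]}
designs = run_per_combination(levels, toy_study)
print(len(designs))  # 3 materials * 2 bolt counts * 2 points per study
```

The number of independent studies grows as the product of the level counts, which is one reason to keep the number of categorical and discrete levels small.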

Levels of categorical variables are specified in the same way as for discrete variables, with the addition of strings as valid values. The levels are also checked when evaluating an initial sample. However, if you set a categorical variable to constant, its levels are ignored both in the initial sample and when generating new designs.

For a **stepped variable**, the block generates values adjusted to the regular grid defined by the variable’s bounds and step.

For each value $v_i$ it guarantees that

$$v_i = b_l + k \cdot s, \qquad b_l \le v_i \le b_u,$$

where $b_l$, $b_u$ are the lower and upper bound, $s$ is the step size, and $k$ is some integer. That is, the values of a stepped variable match steps and are placed within bounds.

Stepped variables are supported by the Adaptive design, Surrogate-based optimization, and Gradient-based optimization techniques. A stepped variable requires bounds and a step. The range between bounds must be 10e-6 or greater, the same as for a continuous variable. The step size $s$ should be relatively small compared to the range between bounds $r$: the greatest valid step size is $s_{\max} = 0.1 \cdot r$, but in practice it is recommended to use steps of size $s \le 0.01 \cdot r$. Also, the range between bounds must contain an integer number of steps. For more details, see the Step hint in Variable Hints.
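The rules above can be collected into a small pre-check that is run before configuring a variable. The sketch below is an assumption-laden illustration: the function name `check_step_settings` is hypothetical, and the floating-point tolerance is a choice made here, since the text does not say how pSeven compares these quantities.

```python
def check_step_settings(lower, upper, step, tol=1e-9):
    """Return (errors, is_recommended) for a stepped variable definition."""
    errors = []
    r = upper - lower
    if r < 10e-6:  # minimum range, written as in the text
        errors.append("range between bounds is smaller than 10e-6")
    if step <= 0:
        errors.append("step must be positive")
        return errors, False
    if step > 0.1 * r * (1 + tol):
        errors.append("step exceeds the greatest valid size 0.1 * r")
    n_steps = r / step
    if abs(n_steps - round(n_steps)) > tol * max(1.0, n_steps):
        errors.append("range does not contain an integer number of steps")
    # Recommended in practice: s <= 0.01 * r.
    is_recommended = not errors and step <= 0.01 * r * (1 + tol)
    return errors, is_recommended


plate = check_step_settings(1.0, 5.0, 0.1)    # plate catalog from the text
finer = check_step_settings(1.0, 5.0, 0.02)   # a finer, hypothetical step
broken = check_step_settings(1.0, 5.0, 0.3)   # 4.0 / 0.3 is not an integer
print(plate, finer, broken)
```

For the plate catalog the settings are valid but coarser than the recommended $s \le 0.01 \cdot r$, which is a recommendation rather than a hard limit.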

If we use stepped variables in optimization tasks, it is also recommended to add the Initial guess hint to a stepped variable. Remember that the initial guess value for a stepped variable must be valid with regard to its bounds and step – that is, the initial guess must match some step.

If you use an initial sample without evaluated responses, the block checks whether the values of a stepped variable contained in the sample are within bounds and whether they match steps. Points of the initial sample, which contain out-of-bound or non-matching values of stepped variables, are considered infeasible and are excluded from the feasible and optimal result matrices. Responses in such points are never evaluated, since they violate the stepped variable definition.
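The check applied to an initial sample can be sketched as a simple filter. This is an illustration only: `split_initial_sample` is a hypothetical helper, the tolerance is an assumption, and the sample values are made up for the example.

```python
def split_initial_sample(values, lower, upper, step, tol=1e-9):
    """Split values of one stepped variable into (valid, rejected) lists."""
    valid, rejected = [], []
    for x in values:
        k = round((x - lower) / step)
        on_grid = abs(lower + k * step - x) <= tol
        in_bounds = lower - tol <= x <= upper + tol
        (valid if on_grid and in_bounds else rejected).append(x)
    return valid, rejected


# Made-up sample for a variable in [0.004, 0.05] with step 0.002.
sample = [0.004, 0.01, 0.011, 0.05, 0.052]
valid, rejected = split_initial_sample(sample, 0.004, 0.05, 0.002)
print(valid, rejected)
```

Here 0.011 is rejected because it falls between steps, and 0.052 because it lies above the upper bound even though it matches the step pattern.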

However, if an initial sample contains response values, its points are used to build the initial approximation model in Surrogate-based optimization and Adaptive design studies, even where the values of stepped variables are out of bounds or do not match steps.

If we set a stepped variable to constant, the bounds, initial guess, and step settings are ignored, so effectively the variable is not considered to be stepped. All new designs generated by the block will contain the fixed value of such variable. Also, if a stepped variable is constant, its values in the initial sample are not required to be within bounds and to match steps.

More information is available in the pSeven documentation: Types of Variables.

## Stepped variables in optimization task

Let’s consider optimization of the geometry of a high-speed rotating disk. The geometry is described by 6 radii and 3 thicknesses. However, several geometry parameters are constant, so only *r2, r3, r4, t1, t3* are variables. The problem is to minimize the disk mass subject to constraints on stress and radial displacement. More information about the object of study can be found in example 3.1 in the pSeven examples list. Suppose the thicknesses can only take predefined table values from the range [0.004, 0.05] with a step of 0.002. Note that the range between the bounds of a stepped variable must be a multiple of its step. To set the step value, use the Step hint in the Design space exploration block. The other variables are continuous and are defined only by their bounds.

*Fig. 4. Design space exploration block configuration*

During the gradient-based optimization, the block generates values of thicknesses from the given list (**Figure 5**).

*Fig. 5. All designs of stepped variables in optimization problem*

For this problem statement, the following optimum and corresponding geometry parameters are found: *mass = 22.56, s_max = 346 MPa, u_max = 0.3 mm, r2 = 0.14 m, r3 = 0.168 m, r4 = 0.2, t1 = 0.05 m, t3 = 0.032 m*, where *t1* and *t3* are stepped variables with the values from the catalog.
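As a quick check, the thickness settings from this example satisfy the integer-steps requirement, and both reported thicknesses lie on the grid. Using the notation from the stepped variable definition ($b_l$, $b_u$, $s$, $k$):

$$\frac{b_u - b_l}{s} = \frac{0.05 - 0.004}{0.002} = 23, \qquad t \in \{0.004,\ 0.006,\ \dots,\ 0.05\} \ \text{(24 values)},$$

$$t_1 = 0.05 = 0.004 + 23 \cdot 0.002, \qquad t_3 = 0.032 = 0.004 + 14 \cdot 0.002 .$$

So $t_1$ sits at the upper bound ($k = 23$) and $t_3$ at $k = 14$.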

## Stepped variables in Adaptive design

Adaptive design of experiments uses a Surrogate-based optimization concept inside, so the support of stepped variables is an expected feature.

Let’s consider a simple analytical function that nevertheless has a specific shape: the Branin function (**Figure 6**).

*Fig. 6. Branin function*

To reproduce this shape, use the following dependency:
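For reference only, the widely used textbook form of the Branin function is sketched below in Python. Whether the study used exactly these coefficients, or a rescaled variant suited to the [-5, 5] range, is an assumption; the function name `branin` and the argument names are chosen here for illustration.

```python
import math


def branin(x1, x2):
    """Textbook Branin function (assumed form, standard coefficients)."""
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * math.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1.0 - t) * math.cos(x1) + s


# One of the known minima of the textbook form lies at (pi, 2.275).
value_at_minimum = branin(math.pi, 2.275)
print(round(value_at_minimum, 6))
```

In its textbook form the function is usually studied on $x_1 \in [-5, 10]$, $x_2 \in [0, 15]$, so the shape seen on a different range depends on how the inputs are scaled.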

As seen in **Figure 7**, the feasible domain is non-rectangular, so traditional design of experiments methods cannot be applied in this case.

*Fig. 7. Feasible and non-feasible design spaces*

Imagine that, for some reason, the variables can take only values from -5 to 5 with a step of 0.02. The response and constraints are also defined in the Design space exploration block and can be seen in **Figure 8**. The linear constraint is calculated in a separate PythonScript block and restored before the model evaluation. The non-linear constraint is computed in another PythonScript block in parallel with the Branin function.

The Adaptive design technique generates a uniform plan if constraints are given, and improves the prediction of the objective function if the “Adaptive” response type is enabled. We selected the second scenario to see how it works with stepped variables.
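The grid implied by these bounds and step can be checked against the step-size guidance given earlier. With $r$ the range between bounds and $s$ the step:

$$r = 5 - (-5) = 10, \qquad \frac{r}{s} = \frac{10}{0.02} = 500, \qquad \frac{s}{r} = 0.002 \le 0.01 .$$

The range contains an integer number of steps, and the step is within the recommended size. Each variable therefore has 501 allowed values, and the two variables together define $501^2 = 251\,001$ candidate grid points, of which only those satisfying the constraints are feasible.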

*Fig. 8. Design space exploration settings*

The exploration budget is the number of points that will be generated, although not all of them necessarily fall into the feasible domain. The simulation result is shown in **Figure 9**.

*Fig. 9. Adaptive plan with stepped variables for the scenario of function improvement*

In total, 100 points were generated, and just 6 of them violate the non-linear constraint (grey points). Due to the specific behavior of the original function, many points are located in the bottom-left corner and far fewer in the rest of the feasible space. All points are generated with the given step of 0.02.

## Conclusion

In this tech tip, we introduced the concept of stepped variables, available starting from pSeven v6.18, and compared the new type with the already implemented continuous, discrete, and categorical variables. This new type of variable is useful for problems where parameters can take only values from a range with a given step, for instance from a catalog table. Exploring the design space only at the given variable values saves computational time and yields the most appropriate solution. The examples above demonstrate how it works in an optimization problem and in an Adaptive design of experiments study.

*By Yulia Bogdanova, Application Engineer, DATADVANCE*