14.3. GTApprox Tests

Sections

Methodology
Test Cases

14.3.1. Methodology

GTApprox is tested by comparing the current version with previous versions in terms of training time and accuracy of the models trained in a number of predefined test cases.

This section describes the general test formulation and defines the accuracy and performance measures. Test case definitions are found in the Test Cases section.

14.3.1.1. Test Case Formulation

For each test case, a multi-dimensional training sample \(\{\mathbf x_n, y_n\}_{n=1}^N\subset {\mathbb R}^{d+1}\) is generated by sampling a scalar-valued response function \(y=f(\mathbf x)\) with \(\mathbf x\in{\mathbb R}^d\). This sample is used to construct the approximation \(\widehat{f}\) of the response function \(f\). Then, the predictive accuracy of the approximation is measured using an independent test sample \(\{\mathbf x'_k, y'_k\}_{k=1}^K\subset {\mathbb R}^{d+1}\) generated by sampling the same response function \(f\) (see Accuracy Measures).

Therefore, a test case is defined by:

- The response function \(f\). To make test cases easily portable and reproducible, most response functions used in testing are defined by analytical formulas.
- The design space \(\mathcal D\) on which the response function is considered. Typically this is the unit hypercube \([0,1]^d\).
- The design of experiment (DoE) used to generate a training sample in \(\mathcal D\), which may be one of the following:
  - “rnd”: random sample
  - “unif”: uniform grid
  - “cornrnd”: random sample with additional points in the corners of the design space
  - “lhc”: Latin hypercube
- The size \(N\) of a training sample.

Regardless of the size and type of a training sample, a large random subset of the design space is used as a test sample, in order to increase the statistical reliability of error estimates.
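The four sample types can be illustrated with a minimal Python sketch. It is not the generator used in the actual tests: the function name `make_doe`, the `seed` parameter, the choice of a full factorial grid for “unif”, and the decision to count the corner points towards \(N\) for “cornrnd” are all assumptions made for illustration.

```python
import itertools
import random


def make_doe(kind, n, d, seed=0):
    """Return n points in the unit hypercube [0, 1]^d as a list of lists."""
    rng = random.Random(seed)

    def random_points(count):
        return [[rng.random() for _ in range(d)] for _ in range(count)]

    if kind == "rnd":
        return random_points(n)
    if kind == "unif":
        # Assumption: full factorial grid with m levels per axis, n == m**d.
        m = round(n ** (1.0 / d))
        if m < 2 or m ** d != n:
            raise ValueError("a uniform grid needs n == m**d with m >= 2")
        levels = [i / (m - 1) for i in range(m)]
        return [list(p) for p in itertools.product(levels, repeat=d)]
    if kind == "cornrnd":
        # Assumption: all 2**d corners count towards n; the rest is random.
        corners = [[float(c) for c in p]
                   for p in itertools.product((0, 1), repeat=d)]
        if n < len(corners):
            raise ValueError("n must be at least 2**d")
        return corners + random_points(n - len(corners))
    if kind == "lhc":
        # One point per stratum [i/n, (i+1)/n) along every axis.
        columns = []
        for _ in range(d):
            column = [(i + rng.random()) / n for i in range(n)]
            rng.shuffle(column)
            columns.append(column)
        return [list(p) for p in zip(*columns)]
    raise ValueError("unknown DoE type: %r" % (kind,))
```

For example, `make_doe("lhc", 10, 3)` returns 10 points in \([0,1]^3\) such that each of the 10 equal-width intervals along every axis contains exactly one point.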

14.3.1.2. Test Naming Rules

Each test case has a name in the form of prefix-function-doe, where:

- prefix is the name of the function set (for internal use),
- function is the name of the response function, as listed in the Test Cases section, and
- doe is the technique used to generate a training sample (“rnd”, “unif”, “cornrnd” or “lhc”, see above).

Sample size is not included in the name, since each test is conducted for a variety of training sample sizes, which are listed in the spreadsheets found in the GTApprox Test Report.
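As an illustration of the naming rule, the sketch below splits a test name into its three parts. The helper `parse_test_name` and the sample name `demo-branin-lhc` (in particular the prefix `demo`) are hypothetical, and the sketch assumes that none of the three components contains a hyphen.

```python
DOE_TYPES = ("rnd", "unif", "cornrnd", "lhc")


def parse_test_name(name):
    """Split a 'prefix-function-doe' test name into its three components."""
    parts = name.split("-")
    if len(parts) != 3:
        raise ValueError("expected a name of the form prefix-function-doe")
    prefix, function, doe = parts
    if doe not in DOE_TYPES:
        raise ValueError("unknown DoE type: %r" % (doe,))
    return prefix, function, doe


# Hypothetical test name, used only to show the format.
print(parse_test_name("demo-branin-lhc"))
```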

14.3.1.3. Accuracy Measures

After an approximation \(\widehat{f}\) has been constructed for a particular test case, its accuracy is evaluated on a test sample \(\{\mathbf x'_k, y'_k=f(\mathbf x'_k)\}_{k=1}^K\). Let \(\Delta_k\) denote the difference between the value predicted by the approximation at the \(k\)-th point and the respective true value of the response function:

\[\Delta_k = \widehat{f}(\mathbf x'_k)-f(\mathbf x'_k).\]

Then, common accuracy measures are:

Mean absolute error (MAE):

\[\mathrm{MAE} = \frac{1}{K}\sum_{k=1}^K|\Delta_k|\]

Root-mean-square error (RMS):

\[\mathrm{RMS} = \sqrt{\frac{\sum_{k=1}^K\Delta_k^2}{K}}\]

Maximum error (MAX):

\[\mathrm{MAX} = \max_{k=1,\dots,K}|\Delta_k|\]

Relative root-mean-square error (RRMS):

\[\mathrm{RRMS} = \sqrt{\frac{\sum_{k=1}^K\Delta_k^2}{\sum_{k=1}^K\widetilde{\Delta}_k^2}} ,\]

where \(\widetilde{\Delta}_k\) is the difference between the response function and its mean value estimated on the test set:

\[\widetilde{\Delta}_k = f(\mathbf x_k')-\frac{1}{K}\sum_{s=1}^K f(\mathbf x'_s).\]
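The four measures can be computed from the predicted and true values on the test sample as in the following sketch. The function name `accuracy_measures` and the dictionary return type are illustrative choices, as is returning NaN for RRMS when the response is constant on the test sample (a case the definitions above do not cover).

```python
import math


def accuracy_measures(y_pred, y_true):
    """Return MAE, RMS, MAX and RRMS for predictions on a test sample."""
    k = len(y_true)
    if k == 0 or len(y_pred) != k:
        raise ValueError("samples must be non-empty and of equal length")
    # Delta_k: predicted value minus true value at the k-th test point.
    deltas = [p - t for p, t in zip(y_pred, y_true)]
    sq_err = sum(dl * dl for dl in deltas)
    # Squared deviations of the true response from its test-sample mean.
    mean_true = sum(y_true) / k
    sq_dev = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "MAE": sum(abs(dl) for dl in deltas) / k,
        "RMS": math.sqrt(sq_err / k),
        "MAX": max(abs(dl) for dl in deltas),
        # Assumption: RRMS is reported as NaN for a constant response.
        "RRMS": math.sqrt(sq_err / sq_dev) if sq_dev > 0 else float("nan"),
    }
```

Predicting the test-sample mean at every point gives \(\Delta_k = -\widetilde{\Delta}_k\) and therefore RRMS equal to 1, which is why RRMS below 1 indicates an improvement over the trivial constant prediction.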

14.3.1.4. Performance Profile

The performance profile is a function showing the share of test cases where the performance characteristic (for example, the relative error) is below a given value \(a\). Let \(\mathcal A\) be an approximation tool (algorithm). Then, after constructing approximations in several test cases using \(\mathcal A\) and evaluating their accuracy, one can plot the performance profile \(P_{\mathcal A}\):

\[P_{\mathcal A}(a) = \frac{\textrm{number of test cases TC with } {Perf}_{\mathcal A}({\rm TC}) < a}{\textrm{total number of test cases}} ,\]

where \({Perf}_{\mathcal A}({\rm TC})\) is the performance characteristic, which may be either of the following:

- The RRMS error of the approximation obtained with the algorithm \(\mathcal A\) in test case \({\rm TC}\). In this case, \(P_{\mathcal A}(1)\) is the share of test cases where the algorithm \(\mathcal A\) has an advantage over the trivial prediction of the response function by its mean value.
- The time taken to construct the approximation with the algorithm \(\mathcal A\) in test case \({\rm TC}\).

The performance profile function \(P_{\mathcal A}\) is non-decreasing in \(a\) and takes values from 0 to 1; better algorithms (with regard to the selected performance characteristic \({Perf}_{\mathcal A}\)) correspond to higher-lying performance profiles.
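A minimal sketch of the profile computation follows. The function name `performance_profile` is illustrative, and the RRMS values in the usage example are made up for demonstration; they are not test results.

```python
def performance_profile(perf_values, a):
    """Share of test cases whose performance characteristic is below a."""
    if not perf_values:
        raise ValueError("at least one test case is required")
    # Strict inequality, as in the definition of P_A(a).
    return sum(1 for v in perf_values if v < a) / len(perf_values)


# Made-up RRMS errors of one algorithm in four test cases.
rrms = [0.05, 0.2, 0.9, 1.3]
# Evaluate the profile on a few thresholds to obtain points for plotting.
curve = [(a, performance_profile(rrms, a)) for a in (0.1, 0.5, 1.0, 2.0)]
print(curve)
```

With these values, the profile at \(a=1\) equals 0.75: in three of the four hypothetical cases the approximation beats the prediction by the mean value.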

14.3.2. Test Cases

This section describes individual GTApprox test cases (actual test settings).

Each case is defined by a response function (listed in the sections below), a design space \(\mathcal D\), a design of experiment, and a training sample size.

For all tests, \(\mathcal D = [0,1]^d\) if not stated otherwise.

Each response function is tested with the four design of experiment (DoE) techniques described in Test Case Formulation.

Each test is conducted for a variety of training sample sizes, which are listed in the spreadsheets found in the GTApprox Test Report. The test sample is always a large random subset of the design space.

Note

The following sections use the notation \(\mathbf x=(x_1,\ldots,x_d)\).

14.3.2.1. 1-Dimensional

airfoil

\[f(x) = \sqrt{x}(1-x)(1.2-x)\]

classic

\[f(x) = (6x-2)^2\sin(12x-4)\]

kink

\[f(x) = e^{-4|x-0.8|}\]

pressure1

\[f(x) = x^{0.3}(1-x)-0.05\arctan(30(x-0.05))+0.15 e^{-50(x-0.45)^2}-0.2 e^{-70(x-0.9)^2}\]

sin10pix

\[f(x) = \sin(10\pi x)\]

heaviside

\[\begin{split}f(x) = \begin{cases} 1, & x>0.500001\\ 0, & x\le 0.500001 \end{cases}\end{split}\]

f7mod

\[f(x) = \max(\tanh(\cos(0.3(2x-1)+50(2x-1)^3)), -(2x-1)^3)\]
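For reproducibility, the one-dimensional functions above can be transcribed into Python as follows. This is a direct transcription of the formulas for \(x\in[0,1]\), not the code used in the actual tests; the Python function names simply mirror the test function names.

```python
import math


def airfoil(x):
    return math.sqrt(x) * (1 - x) * (1.2 - x)


def classic(x):
    return (6 * x - 2) ** 2 * math.sin(12 * x - 4)


def kink(x):
    return math.exp(-4 * abs(x - 0.8))


def pressure1(x):
    return (x ** 0.3 * (1 - x)
            - 0.05 * math.atan(30 * (x - 0.05))
            + 0.15 * math.exp(-50 * (x - 0.45) ** 2)
            - 0.2 * math.exp(-70 * (x - 0.9) ** 2))


def sin10pix(x):
    return math.sin(10 * math.pi * x)


def heaviside(x):
    # The step is placed at 0.500001, as in the definition above.
    return 1.0 if x > 0.500001 else 0.0


def f7mod(x):
    t = 2 * x - 1  # maps [0, 1] to [-1, 1]
    return max(math.tanh(math.cos(0.3 * t + 50 * t ** 3)), -t ** 3)
```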

14.3.2.2. 2-Dimensional

aero2

\[\begin{split}f(x_1,x_2) = &((\arctan\sqrt{x_1})(-\arctan(80(x_2+0.2)^3 (x_1-0.3-0.3x_2))+2x_1(1-x_1)+(1+x_2)x_2))\\ &\times(1+0.2\sin(5\pi(1-x_1)x_2)+0.2\sin(x_1))(1+0.5\arctan(4(x_2-0.5))-x_2)\end{split}\]

branin

\[f(x_1,x_2) = \Big(15x_2-\frac{5.1}{4\pi^2}(15x_1-5)^2+\frac{5}{\pi}(15x_1-5)-6\Big)^2+10\big(1-\frac{1}{8\pi}\big)\cos x_1+10\]

CurvedKink

\[f(x_1,x_2) = \min(x_1^2+x_2^2, 0.3)\]

himmelblau

\[f(x_1,x_2) = ((12x_1-6)^2 + (12x_2-6) -11)^2 + ((12x_1-6) + (12x_2-6)^2 - 7)^2\]

michalewicz

\[f(x_1,x_2) = \sin(\pi x_1)\sin(\pi x_1^2)+\sin(\pi x_2)\sin(2\pi x_2^2)\]

rosenbrock

\[f(x_1,x_2) = 100((4.096x_2-2.048)-(4.096x_1-2.048)^2)^2+(1-(4.096x_1-2.048))^2\]

rastrigin

\[\begin{split}f(x_1,x_2) = & 20+((10.24x_1-5.12)-10\cos(2\pi(10.24x_1-5.12)))\\ &+((10.24x_2-5.12)-10\cos(2\pi(10.24x_2-5.12)))\end{split}\]

randfunc2

\[\begin{split}y = &((((sin(sin((sin((10*x2)-(sin((10*x1)+(tansig((10*x2)- \\ &(10*x2)))))))-(10*x1))))*(abs(tansig(10*x2))))*(10*x1))+ \\ &(abs((10*x1)*(((10*x1)-((10*x1)-((abs(abs(tansig((tansig( \\ &(sin(tansig(10*x1)))*(10*x1)))*((abs(sin(10*x1)))* \\ &(abs(10*x1)))))))+(10*x1))))+(abs(abs(sin((10*x1)- \\ &((((abs((((10*x2)+(10*x1))-((10*x2)+((10*x1)-(10*x1))))* \\ &(sin(10*x2))))+(abs(10*x1)))-(((10*x1)+(((10*x1)+(10*x2))+ \\ &(abs(sin(abs(tansig((abs(10*x1))*(10*x1))))))))*(sin(10* \\ &x1))))*(10*x1))))))))))+(abs(10*x1))\end{split}\]

randfunc3

\[\begin{split}y = &(tansig(10*x1))-(((tansig(abs(tansig((10*x2)*(sin(sin(((((10*\\ &x1)+(10*x1))+(sin(10*x1)))+(((((sin(abs(10*x1)))*(sin(tansig(tansig(\\ &10*x1)))))-(10*x2))*(10*x1))*(sin(tansig((10*x2)-(10*x2))))))-\\ &(((10*x1)+(10*x2))*((10*x2)-(tansig(((((tansig(sin((tansig(10*x2))-\\ &(10*x2))))-(abs(abs((abs(((10*x2)*((abs(abs(10*x1)))*(abs(10*x1))))+\\ &(abs(abs(10*x2)))))*(tansig((10*x1)+(10*x1)))))))*(10*x2))*(((sin((10*\\ &x1)-(10*x2)))*(10*x1))+((abs(10*x2))*(10*x1))))-(sin(10*x2)))))))))))))-\\ &(10*x2))+(10*x1))\end{split}\]

randfunc5

\[\begin{split}y = &((10*x1)+((abs(sin(10*x2)))-(tansig(((10*x1)-(tansig((((abs(sin\\ &(((10*x2)*((sin((10*x2)+((10*x2)+(tansig(10*x1)))))+((10*x2)+(10*\\ &x2))))-(((sin(((tansig(tansig(sin(10*x1))))*(10*x1))*(abs(10*x1))))-\\ &(10*x2))-(((10*x1)*(sin(10*x2)))*((abs(10*x2))*((10*x1)-(10*\\ &x2))))))))*(sin(10*x1)))*(sin(sin((abs((((10*x2)+(tansig(sin(10*\\ &x2))))-(sin(sin(10*x1))))-(10*x2)))*(tansig(10*x2))))))*(10*x2))))+\\ &(tansig((sin(10*x1))*((10*x2)-(sin((tansig(10*x2))+(sin(sin(10*\\ &x2))))))))))))*((10*x1)*(10*x2))\end{split}\]

randfunc7

\[\begin{split}y = &((sin((abs(tansig(abs((abs(((10*x1)-(abs(abs(10*x1))))-\\ &(10*x2)))-(((((10*x1)-((10*x2)-(10*x2)))+((10*x1)-(sin((10*x2)-\\ &(((abs(10*x2))*((10*x1)*(((tansig(sin(sin((((10*x2)+(10*x1))-\\ &(abs(10*x2)))*(10*x2)))))+((10*x1)+(sin(10*x1))))+(abs((10*x2)-\\ &(10*x1))))))+(sin(10*x2)))))))-(sin(sin(10*x2))))+(sin(10*\\ &x1)))))))*((sin(10*x2))+(10*x1))))-((abs(sin(10*x2)))+(tansig(sin(\\ &(tansig(tansig(10*x2)))-((abs(abs(((10*x1)-((10*x2)*(10*x1)))+\\ &(abs(sin(10*x1))))))-(10*x1)))))))+(10*x2)\end{split}\]

randfunc11

\[\begin{split}y = &(sin(tansig((4)+(1))))-((abs((((10*x2)-((2)*(abs(2))))-(tansig(2)))+\\ &((10*x2)-(5))))-(((abs(sin((tansig((1)+((10)+(abs(10*x2)))))+((((abs(sin\\ &((12)+(sin(5)))))+((10*x1)+(10*x2)))*(1))+((abs(4))-(abs(sin(abs(sin(\\ &abs(((2)+((2)+(sin((tansig(abs(tansig((abs(5))-(13)))))*(tansig(sin((1)*\\ &(abs(((10*x2)*(4))+((sin(abs(10*x1)))*((4)-(sin(10*x1)))))))))))))+(5))))\\ &))))))))*(sin((10*x2)+(2))))*(10*x2)))\end{split}\]
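Some of the closed-form functions in this section can be transcribed into Python as sketched below (himmelblau, rosenbrock and CurvedKink are shown as examples). The randfunc expressions use a `tansig` function that is not defined in this section; the sketch assumes it is the hyperbolic tangent sigmoid \(2/(1+e^{-2n})-1\), which equals \(\tanh(n)\). That reading is an assumption, not something this section states.

```python
import math


def himmelblau(x1, x2):
    u, v = 12 * x1 - 6, 12 * x2 - 6  # rescale [0, 1] to [-6, 6]
    return (u * u + v - 11) ** 2 + (u + v * v - 7) ** 2


def rosenbrock(x1, x2):
    # Rescale [0, 1] to [-2.048, 2.048].
    u, v = 4.096 * x1 - 2.048, 4.096 * x2 - 2.048
    return 100 * (v - u * u) ** 2 + (1 - u) ** 2


def curved_kink(x1, x2):
    return min(x1 * x1 + x2 * x2, 0.3)


def tansig(n):
    # ASSUMPTION: tansig is the hyperbolic tangent sigmoid, i.e. tanh(n).
    return math.tanh(n)
```

As a sanity check on the rescaling, `himmelblau(0.75, 2/3)` and `rosenbrock(0.744140625, 0.744140625)` are zero up to rounding error: these points are the images of the known minima \((3, 2)\) and \((1, 1)\) of the unscaled functions.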

14.3.2.3. 3-Dimensional

The following functions are sampled on the design space \([-1,1]^d\) with various values of \(d\).

linear

\[f(x_1,\dots,x_d)=\sum_{k=1}^d kx_k\]

sinusoid

\[f(x_1,\dots,x_d)=\sum_{k=1}^d \sin(6.5x_k)\]

ellipsoidal

\[f(x_1,\dots,x_d)=\sum_{k=1}^{d} kx_k^2\]

ackley

\[f(x_1,\dots,x_d)=\sum_{k=1}^{d-1} \Big(e^{-0.2}\sqrt{x_{k}^2+x_{k+1}^2}+3(\cos(2x_k)+\sin(2x_{k+1}))\Big)\]

whitley

\[f(x_1,\dots,x_d)=\sum_{k=1}^{d}\sum_{n=1}^d \Big(\frac{1}{400}(100(x_k^2-x_n)^2+(1-x_n)^2)^2-\cos(100(x_k^2-x_n)^2+(1-x_n)^2)+1\Big)\]
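The functions of this section work for any dimension \(d\), so a transcription can take the point as a sequence of coordinates. The sketch below follows the formulas above term by term; it is an illustration, not the code used in the actual tests.

```python
import math


def linear(x):
    return sum(k * xk for k, xk in enumerate(x, start=1))


def sinusoid(x):
    return sum(math.sin(6.5 * xk) for xk in x)


def ellipsoidal(x):
    return sum(k * xk * xk for k, xk in enumerate(x, start=1))


def ackley(x):
    # Sum over consecutive coordinate pairs (x_k, x_{k+1}), k = 1..d-1.
    return sum(
        math.exp(-0.2) * math.hypot(a, b)
        + 3 * (math.cos(2 * a) + math.sin(2 * b))
        for a, b in zip(x, x[1:])
    )


def whitley(x):
    total = 0.0
    for xk in x:
        for xn in x:
            y = 100 * (xk * xk - xn) ** 2 + (1 - xn) ** 2
            total += y * y / 400 - math.cos(y) + 1
    return total
```

For example, `whitley([1.0, 1.0, 1.0])` is 0, since every inner term vanishes at \(x_k = x_n = 1\), and `ackley([0.0, 0.0, 0.0])` is 6, because each of the two consecutive pairs contributes \(3(\cos 0+\sin 0)=3\).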