Generic Tool for Data Fusion (GT DF)

GT DF (Generic Tool for Data Fusion) is a powerful tool of pSeven Core, available in pSeven, for handling data of variable fidelity. It constructs surrogate models from data of variable fidelity, in contrast to GT Approx, which handles data of a single fidelity. GT DF takes high fidelity data and low fidelity data as input. These data are assumed to be generated by a high fidelity function and a low fidelity function, respectively. The output of the tool is a surrogate model for the high fidelity function.

Working modes and techniques

Data Fusion technology works in two modes:

  • sample-based,
  • blackbox-based.

In the sample-based mode, the tool takes a high fidelity sample and a low fidelity sample as inputs. Each sample consists of points and the corresponding values of the function considered. In the blackbox-based mode, the tool takes a high fidelity sample and a low fidelity blackbox as inputs. In this mode, the low fidelity blackbox provides low fidelity function values at any feasible point of a specified design space. A surrogate model constructed in the blackbox-based mode can calculate high fidelity function estimates with or without using an internal low fidelity blackbox.
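The two modes differ mainly in how the low fidelity data reaches the tool. The following is a minimal Python sketch of the two kinds of input, not the GT DF interface: the function low_fidelity_blackbox, the helper sample_blackbox, the design space bounds, and all sample values are hypothetical and chosen only for illustration.

```python
# Illustrative only: hypothetical data and names, not the GT DF interface.

def low_fidelity_blackbox(x):
    """Hypothetical cheap model: returns a low fidelity value at any point x."""
    return 0.5 * x * x

# Sample-based mode: both fidelities arrive as fixed samples,
# that is, lists of points with the corresponding function values.
high_fidelity_sample = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.4)]
low_fidelity_sample = [(0.0, 0.0), (0.5, 0.125), (1.0, 0.5), (1.5, 1.125), (2.0, 2.0)]

# Blackbox-based mode: the low fidelity data is a callable that can be
# evaluated at any feasible point of the design space (here, [0, 2]).
design_space = (0.0, 2.0)

def sample_blackbox(blackbox, bounds, n_points):
    """Evaluate the blackbox on a uniform grid over the design space."""
    lower, upper = bounds
    step = (upper - lower) / (n_points - 1)
    points = [lower + i * step for i in range(n_points)]
    return [(x, blackbox(x)) for x in points]

# A blackbox can reproduce a fixed sample on demand, and can also be
# queried at points that no pre-computed sample contains.
generated_sample = sample_blackbox(low_fidelity_blackbox, design_space, 5)
```

In this sketch a fixed low fidelity sample limits the tool to the points already computed, while a blackbox lets it request low fidelity values wherever they are needed.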


GT DF allows users to meet their specific requirements for a surrogate model using a wide range of powerful techniques:

  • HFA (High Fidelity Approximation): uses only high fidelity data;
  • DA (Difference Approximation): approximates the difference between low and high fidelity data;
  • VFGP (Variable Fidelity Gaussian Processes): builds models based on Gaussian process regression;
  • SVFGP (Sparse Variable Fidelity Gaussian Processes): a Gaussian process regression-based technique designed to handle large samples.
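The idea behind difference approximation can be sketched in a few lines of Python. This is only an assumption-laden illustration of the general principle (low fidelity value plus a fitted correction), not the GT DF implementation: the correction here is a simple least-squares line in one dimension, and the functions fit_line, build_difference_model, and low_fidelity, as well as the data, are hypothetical.

```python
# Illustrative only: a one-dimensional difference approximation with a
# linear correction. Names and data are hypothetical, not the GT DF API.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def build_difference_model(hf_points, hf_values, low_fidelity):
    """Return a surrogate: low fidelity value plus a fitted correction."""
    # Differences between high and low fidelity values at the high fidelity points.
    diffs = [f - low_fidelity(x) for x, f in zip(hf_points, hf_values)]
    a, b = fit_line(hf_points, diffs)

    def surrogate(x):
        return low_fidelity(x) + a * x + b

    return surrogate

def low_fidelity(x):
    """Hypothetical cheap function."""
    return x * x

# Hypothetical high fidelity sample: low fidelity plus the trend 2x + 1.
hf_points = [0.0, 1.0, 2.0, 3.0]
hf_values = [low_fidelity(x) + 2.0 * x + 1.0 for x in hf_points]

surrogate = build_difference_model(hf_points, hf_values, low_fidelity)
prediction = surrogate(1.5)  # a point that is not in the training sample
```

Because the difference between the two fidelities is often smoother than the high fidelity function itself, a simple correction fitted on a few high fidelity points can be enough; in this toy case the difference is exactly linear, so the surrogate recovers the high fidelity function.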

The tool automatically selects a technique based on the data provided and on user requirements.


Moreover, GT DF:

  • Provides not only an approximation of the high fidelity function, but also its partial derivatives;
  • Allows managing the surrogate model construction time;
  • Handles samples of widely varying sizes, from tiny to very large data sets;
  • Can fit the high fidelity training data exactly;
  • Supports accuracy evaluation, so that the user can estimate the uncertainty of predictions obtained with a DF-based surrogate model;
  • Estimates the quality of the obtained models using Internal validation.

Application example

Let's take a quick look at GT DF usage in an engineering problem. The task is to construct a surrogate model for the lift and drag coefficients of an airfoil as functions of the angle of attack.

Both the high and low fidelity data are generated with an Euler equations solver; the low fidelity function is calculated on a coarse CFD mesh. To compare different techniques, the Mean Squared Error (MSE) is calculated on a high fidelity test sample.
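For reference, the MSE of a model on a test sample is the average of the squared differences between the model predictions and the test values. A minimal Python sketch follows; the function name mean_squared_error and the numbers are hypothetical and are not results from this example.

```python
def mean_squared_error(predicted, reference):
    """Average of squared differences between predictions and reference values."""
    if len(predicted) != len(reference):
        raise ValueError("samples must have the same length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)

# Hypothetical numbers for illustration only; not results from the example.
test_values = [1.0, 2.0, 3.0]        # high fidelity values at the test points
model_predictions = [1.0, 2.0, 5.0]  # surrogate model outputs at the same points
mse = mean_squared_error(model_predictions, test_values)
```

A lower MSE on the same high fidelity test sample indicates a more accurate surrogate model, which is what makes the value suitable for comparing techniques.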

Results are presented in the table above. They show that applying the Data Fusion tool significantly improves the quality of the surrogate model for both the lift and drag coefficients.