1.3. Correlation Analysis

1.3.1. Introduction

This example shows how to apply pSeven Core to estimate correlations in the data.

Consider a simple linear regression model with noise:

\[f(x) = -0.4 x_1 + x_2 + \varepsilon,\]

where \(\varepsilon\) is a random value uniformly distributed in \([0, 0.05]\).

The task is to estimate the influence of each input factor (variable) on the output.
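As a rough reference for the scores to expect, the correlations for this model can be worked out by hand. This derivation rests on two assumptions. First, that the score is the Pearson correlation coefficient; the technique GTSDA actually applies is not stated here. Second, that \(x_1\) and \(x_2\) are independent and uniformly distributed in \([0, 1]\), as in the sample generated in this example. A uniform variable on an interval of length \(a\) has variance \(a^2 / 12\), so

\[\mathrm{Var}(x_1) = \mathrm{Var}(x_2) = \frac{1}{12}, \qquad \mathrm{Var}(\varepsilon) = \frac{0.05^2}{12},\]

\[\mathrm{Var}(f) = \frac{0.4^2 + 1 + 0.05^2}{12} = \frac{1.1625}{12}, \qquad \mathrm{Cov}(x_1, f) = -\frac{0.4}{12}, \qquad \mathrm{Cov}(x_2, f) = \frac{1}{12},\]

\[\rho(x_1, f) = \frac{-0.4}{\sqrt{1.1625}} \approx -0.371, \qquad \rho(x_2, f) = \frac{1}{\sqrt{1.1625}} \approx 0.927.\]

With only 50 points, the sample estimates will scatter around these values, noticeably so for \(x_1\).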

1.3.2. Correlation Analysis

Start by importing the Generic Tool for Sensitivity and Dependency Analysis (GTSDA) module and the standard random module, which is used to generate the test data:

import random

from da.p7core import gtsda

Describe the problem:

def noisy_linear_function(x):
    # Evaluate -0.4 * x1 + x2 at each point and add uniform noise from [0, 0.05).
    return [-0.4 * point[0] + point[1] + 0.05 * random.random() for point in x]

Generate a sample:

number_points = 50
number_dimensions = 2
x = [[random.random() for _ in range(number_dimensions)] for _ in range(number_points)]
y = noisy_linear_function(x)
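Before running GTSDA, it can help to see what a correlation score measures on a sample like this one. The sketch below is not part of the pSeven Core API: `pearson`, `x_demo` and `y_demo` are hypothetical names introduced for illustration. It computes the sample Pearson correlation coefficient with the standard library only. Whether GTSDA reports this exact statistic depends on the technique it selects, so treat the numbers as a sanity check rather than a reference result.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / float(n)
    mean_b = sum(b) / float(n)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Illustrative sample built the same way as in this example;
# the seed is fixed here only to make the sketch repeatable.
random.seed(0)
x_demo = [[random.random() for _ in range(2)] for _ in range(50)]
y_demo = [-0.4 * p[0] + p[1] + 0.05 * random.random() for p in x_demo]

# Correlate each input column with the output.
for j in range(2):
    column = [p[j] for p in x_demo]
    print('x%d vs y: %.3f' % (j + 1, pearson(column, y_demo)))
```

The second input should show a strong positive correlation with the output, while the first shows a weaker negative one, matching the signs and magnitudes of the model coefficients.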

Create a gtsda.Analyzer instance:

analyzer = gtsda.Analyzer()

Set options and attach a logger (see Options Interface, Loggers):

from da.p7core import loggers

analyzer.options.set("GTSDA/Seed", 100)
analyzer.set_logger(loggers.StreamLogger())

Perform correlation analysis:

result = analyzer.check(x=x, y=y)

Print a result summary:

print('Results of correlation analysis with default options:')
print('scores: %s' % result.scores)
print('p_values: %s' % result.p_values)
print('decisions: %s' % result.decisions)
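A p-value is commonly read as the probability of seeing a correlation at least as strong as the observed one if the input and output were in fact unrelated. The sketch below illustrates that idea with a simple permutation test in plain Python. It is an assumption-laden illustration, not the method GTSDA uses to fill `p_values`; `pearson` and `permutation_p_value` are hypothetical helper names.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / float(n)
    mean_b = sum(b) / float(n)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def permutation_p_value(a, b, n_permutations=999, seed=0):
    """Estimate a two-sided p-value for the correlation of a and b.

    The values of b are shuffled repeatedly, which destroys any dependency
    on a; the p-value is the share of shuffles whose absolute correlation
    is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(pearson(a, b))
    shuffled = list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson(a, shuffled)) >= observed:
            hits += 1
    # Adding one to both counts keeps the estimate strictly positive.
    return (hits + 1.0) / (n_permutations + 1.0)


# A perfectly linear dependency yields a very small p-value.
a_demo = [float(v) for v in range(20)]
b_demo = [2.0 * v for v in a_demo]
print('p-value: %s' % permutation_p_value(a_demo, b_demo))
```

A small p-value suggests that the observed correlation is unlikely to be a chance effect; a value close to 1 gives no such evidence.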

1.3.3. Full Example Code

import random

from da.p7core import gtsda
from da.p7core import loggers

def noisy_linear_function(x):
    # Evaluate -0.4 * x1 + x2 at each point and add uniform noise from [0, 0.05).
    return [-0.4 * point[0] + point[1] + 0.05 * random.random() for point in x]

number_points = 50
number_dimensions = 2
x = [[random.random() for _ in range(number_dimensions)] for _ in range(number_points)]
y = noisy_linear_function(x)

analyzer = gtsda.Analyzer()

analyzer.options.set("GTSDA/Seed", 100)
analyzer.set_logger(loggers.StreamLogger())

result = analyzer.check(x=x, y=y)

print('Results of correlation analysis with default options:')
print('scores: %s' % result.scores)
print('p_values: %s' % result.p_values)
print('decisions: %s' % result.decisions)