SciPy Interpolation

What is Interpolation?

Interpolation is a technique for estimating values between known data points. For example, given the points 1 and 2, we can interpolate to find intermediate values such as 1.33 and 1.67. In machine learning, interpolation is often used to fill in missing values in a dataset, a process known as imputation. It is also used to smooth discrete data points.
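
To make the idea concrete, here is a minimal sketch of linear interpolation in plain Python. The helper name lerp and the sample readings are hypothetical and chosen only for illustration; the sketch assumes the missing value lies on a straight line between its two neighbours.

```python
def lerp(x0, y0, x1, y1, x):
    """Linearly interpolate y at x on the line through (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Two evenly spaced values between the points 1 and 2
between = [lerp(0, 1, 3, 2, i) for i in (1, 2)]
print([round(v, 2) for v in between])  # [1.33, 1.67]

# Imputation: fill a missing reading (None) from its two neighbours
readings = [10.0, None, 14.0]
filled = list(readings)
filled[1] = lerp(0, readings[0], 2, readings[2], 1)
print(filled)  # [10.0, 12.0, 14.0]
```

Linear interpolation is only one choice. It works well when neighbouring points are close together and the data changes smoothly between them.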

How to Implement it in SciPy?

SciPy offers a module called scipy.interpolate, which contains various functions for handling interpolation.

1D Interpolation

The interp1d() function is used to interpolate a function of one variable.

It takes x and y points as input and returns a callable function that, when given new x values, returns the corresponding y values.

Example

For the given xs and ys, interpolate values at x = 2.1, 2.2, …, 2.9.

from scipy.interpolate import interp1d
import numpy as np

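xs = np.arange(10)   # known x values: 0, 1, ..., 9
ys = 2*xs + 1        # known y values on the line y = 2x + 1

# Build a callable that linearly interpolates between the known points
interp_func = interp1d(xs, ys)

# Evaluate it at x = 2.1, 2.2, ..., 2.9
newarr = interp_func(np.arange(2.1, 3, 0.1))

print(newarr)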

Result:

[5.2 5.4 5.6 5.8 6. 6.2 6.4 6.6 6.8]

Note: The new x values must lie within the range of the original xs. Here xs runs from 0 to 9, so by default calling interp_func() with values greater than 9 or less than 0 raises a ValueError.
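
The sketch below illustrates this range restriction, along with two ways to relax it using the bounds_error and fill_value parameters of interp1d(). The variable names strict, lenient and extrap are illustrative only. Whether returning NaN or extrapolating is appropriate depends on the data, so treat this as one possible approach rather than a recommendation.

```python
import numpy as np
from scipy.interpolate import interp1d

xs = np.arange(10)   # 0, 1, ..., 9
ys = 2 * xs + 1

# Default behaviour: out-of-range input raises ValueError
strict = interp1d(xs, ys)
try:
    strict(9.5)      # 9.5 lies outside [0, 9]
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# Option 1: return NaN for out-of-range input instead of raising
lenient = interp1d(xs, ys, bounds_error=False, fill_value=np.nan)
print(np.isnan(lenient(9.5)))  # True

# Option 2: extend the first and last segments beyond the data
extrap = interp1d(xs, ys, fill_value="extrapolate")
print(float(extrap(9.5)))
```

Extrapolated values are only as trustworthy as the assumption that the trend continues past the data, so NaN is often the safer default.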