Interpolation is a technique for estimating values between given data points. For instance, between the points 1 and 2, we can interpolate to find values like 1.33 and 1.67. In machine learning, interpolation is commonly used to fill in missing data in a dataset, a process known as imputation. Beyond imputation, interpolation is also useful for smoothing discrete data points in a dataset.
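As a rough illustration of imputation, the sketch below fills missing entries by linear interpolation between their known neighbours. It uses NumPy's np.interp() rather than SciPy, and the sample values are made up for the example; this is one simple way to do it, not the only one.

```python
import numpy as np

# Hypothetical readings with two missing entries (NaN); the values are made up.
values = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
idx = np.arange(len(values))
known = ~np.isnan(values)

# np.interp(x, xp, fp) linearly interpolates at x from the known points (xp, fp).
filled = values.copy()
filled[~known] = np.interp(idx[~known], idx[known], values[known])

print(filled)
```

Each missing entry is replaced by the value on the straight line joining its nearest known neighbours.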
SciPy offers a module called scipy.interpolate, which contains various functions for handling interpolation.
The interp1d() function is used to interpolate a function of one variable. It takes x and y points as input and returns a callable function that, when given new x values, returns the corresponding y values.
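To make the idea of a returned callable concrete, here is a small sketch that compares interp1d() with linear interpolation worked out by hand. The two data points and the names x_known, y_known, and manual are illustrative assumptions, and the comparison relies on interp1d() using linear interpolation by default.

```python
import numpy as np
from scipy.interpolate import interp1d

# Illustrative data (an assumption for this sketch): two known points.
x_known = np.array([0.0, 10.0])
y_known = np.array([0.0, 100.0])

# interp1d returns a callable; the default kind is "linear".
f = interp1d(x_known, y_known)

# Linear interpolation by hand at x = 2.5.
x0, x1 = x_known
y0, y1 = y_known
x = 2.5
manual = y0 + (x - x0) * (y1 - y0) / (x1 - x0)

print(float(f(x)), manual)   # the two values should agree
print(f([2.5, 5.0]))         # the callable also accepts several x values at once
```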
For the given xs and ys, interpolate values for the range 2.1, 2.2, …, up to 2.9.
    from scipy.interpolate import interp1d
    import numpy as np

    xs = np.arange(10)
    ys = 2*xs + 1

    interp_func = interp1d(xs, ys)

    newarr = interp_func(np.arange(2.1, 3, 0.1))

    print(newarr)
    [5.2 5.4 5.6 5.8 6. 6.2 6.4 6.6 6.8]
Note: The new xs must lie within the range of the original xs. Here xs runs from 0 to 9, so by default we cannot call interp_func() with values greater than 9 or less than 0.
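The sketch below shows what happens with an out-of-range value, reusing the xs and ys from the example. By default interp1d() raises a ValueError; the bounds_error and fill_value arguments are optional interp1d() parameters not covered above, shown here only as possible ways to relax that restriction. The names strict, nan_fill, and extrap are chosen for illustration.

```python
import numpy as np
from scipy.interpolate import interp1d

xs = np.arange(10)
ys = 2 * xs + 1

# Default behaviour: values outside the original xs raise a ValueError.
strict = interp1d(xs, ys)
try:
    strict(10.5)
    raised = False
except ValueError:
    raised = True

# bounds_error=False returns NaN for out-of-range inputs instead of raising.
nan_fill = interp1d(xs, ys, bounds_error=False)

# fill_value="extrapolate" extends the interpolant beyond the data.
extrap = interp1d(xs, ys, fill_value="extrapolate")

print(raised)
print(nan_fill(10.5))
print(extrap(10.5))
```

Extrapolated values are only as trustworthy as the assumption that the trend continues outside the data, so the default error is often the safer choice.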