SciPy Sparse Data

What is Sparse Data

Sparse data refers to data with mostly unused elements (elements that do not carry meaningful information).

For example, it can be an array like this:

[1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0]

Sparse data: a data set in which most of the values are zero.

Dense array: the opposite of a sparse array; most of the values are non-zero.
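As a rough illustration of these definitions, the sketch below measures how sparse the example array above is. The variable names (nonzero_count, sparsity) are chosen here for illustration and do not come from SciPy.

```python
import numpy as np

# The example array from the text: 3 non-zero values out of 12.
arr = np.array([1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0])

nonzero_count = np.count_nonzero(arr)
sparsity = 1.0 - nonzero_count / arr.size  # fraction of entries that are zero

print(nonzero_count)  # 3
print(sparsity)       # 0.75
```

Three quarters of the entries are zero, so this array would count as sparse under the definition above.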

In scientific computing, sparse data often arises in linear algebra, for example when partial derivatives are represented as matrices in which most entries are zero.

How to Work With Sparse Data

SciPy provides the scipy.sparse module, which includes functions for working with sparse data.

Two of the most commonly used sparse matrix formats are:

  • CSC (Compressed Sparse Column): Optimized for efficient arithmetic operations and fast column slicing.
  • CSR (Compressed Sparse Row): Optimized for fast row slicing and quicker matrix-vector products.
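To make the difference between the two formats concrete, here is a minimal sketch that builds a small matrix in CSR form, converts it to CSC, and then performs a row slice, a column slice, and a matrix-vector product. The matrix values and variable names are arbitrary choices for this illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Small illustrative matrix (values chosen arbitrarily for this sketch).
dense = np.array([[0, 0, 1],
                  [2, 0, 0],
                  [0, 0, 3]])

csr = csr_matrix(dense)   # Compressed Sparse Row
csc = csr.tocsc()         # same data, converted to Compressed Sparse Column

row = csr[1, :]           # row slice, the access pattern CSR is suited to
col = csc[:, 2]           # column slice, the access pattern CSC is suited to

v = np.array([1, 1, 1])
product = csr @ v         # matrix-vector product; result is a dense 1-D array

print(row.toarray())      # [[2 0 0]]
print(col.toarray())      # the third column as a 3x1 array
print(product)            # [1 2 3]
```

Both formats support all of these operations; the difference lies in which ones are efficient, which only becomes noticeable on much larger matrices than this one.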

In this tutorial, we will use the CSR matrix.

CSR Matrix

A CSR matrix can be created by passing an array to the scipy.sparse.csr_matrix() function.

Example

Construct a CSR matrix from an array:

import numpy as np
from scipy.sparse import csr_matrix

# Nine values, three of them non-zero
arr = np.array([0, 0, 0, 0, 0, 1, 1, 0, 2])

print(csr_matrix(arr))

The above example outputs:

(0, 5) 1
(0, 6) 1
(0, 8) 2

From the result, we observe three non-zero elements:

  1. The first element is in row 0, column 5, with a value of 1.
  2. The second element is in row 0, column 6, with a value of 1.
  3. The third element is in row 0, column 8, with a value of 2.
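The same information can also be read directly from the matrix object. The sketch below assumes the nine-element array from the example (five zeros, then 1, 1, 0, 2) and inspects a few standard attributes of the resulting CSR matrix; the name mat is chosen here for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([0, 0, 0, 0, 0, 1, 1, 0, 2])
mat = csr_matrix(arr)  # a 1-D input becomes a matrix with a single row

print(mat.shape)      # (1, 9)
print(mat.nnz)        # 3, the number of stored values
print(mat.data)       # [1 1 2], the stored values
print(mat.indices)    # [5 6 8], the column index of each stored value
print(mat.toarray())  # [[0 0 0 0 0 1 1 0 2]], back to a dense array
```

The data and indices arrays line up with the value and column in each line of the printed output above.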