Zipf Distribution

Zipf distributions are used to sample data according to Zipf’s law.

Zipf’s law states that in a collection, the frequency of the nth most common term is roughly 1/n times the frequency of the most common term. For example, the 5th most common word in English appears roughly 1/5 as often as the most common word.
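To make the 1/n rule concrete, here is a minimal sketch that lists the relative frequency of the first five ranks. The name relative_freq is only illustrative and is not part of NumPy.

```python
# Relative frequency of the nth most common term under Zipf's law,
# expressed as a fraction of the most common term's frequency.
ranks = range(1, 6)
relative_freq = {n: 1 / n for n in ranks}

for n, freq in relative_freq.items():
    print(f"rank {n}: {freq:.2f} of the top term's frequency")
```

The 5th rank comes out at 0.20, matching the 1/5 figure in the example above.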

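The zipf() method has two parameters:

a (distribution parameter): Controls the skewness of the distribution. It must be greater than 1; larger values concentrate the samples more heavily around 1.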

size: Defines the shape of the returned array.
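The effect of a can be checked by comparing how often the value 1 is drawn for a small and a large distribution parameter. This is a rough sketch: it uses NumPy's Generator API (default_rng) instead of the legacy random.zipf call, and the helper share_of_ones, the seed, and the sample size are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for repeatability

def share_of_ones(a, n=100_000):
    """Fraction of samples equal to 1 for a given distribution parameter."""
    sample = rng.zipf(a, size=n)
    return np.mean(sample == 1)

low_a = share_of_ones(1.5)   # heavier tail, fewer samples equal to 1
high_a = share_of_ones(4.0)  # most of the mass sits at 1

print(low_a, high_a)
```

With the larger parameter, the share of samples equal to 1 should be clearly higher, which is what "more skewed toward small values" means in practice.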

Example

Generate a sample from the Zipf distribution with a distribution parameter of 2 and a size of 2×3.

from numpy import random

x = random.zipf(a=2, size=(2, 3))

print(x)

Visualization of Zipf Distribution

Sample 1,000 points, but plot only the values less than 10. The distribution has a long tail, so a few very large values would otherwise stretch the x-axis and hide the shape of the chart.

Example

from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns

x = random.zipf(a=2, size=1000)
sns.histplot(x[x < 10])

plt.show()

Result

[Figure: histogram of the sampled Zipf values less than 10]
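As a numeric cross-check that needs no plotting library, the sampled frequencies can be compared with the probabilities the distribution assigns to each value, p(k) = k^(-a) / zeta(a). The sketch below assumes a = 2, where zeta(2) = pi^2 / 6 is known in closed form, and uses a larger sample than the example above so the comparison is less noisy. The Generator API, the seed, and the variable names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for repeatability
a = 2
x = rng.zipf(a, size=100_000)

# For a = 2 the normalizing constant is zeta(2) = pi^2 / 6.
zeta_2 = np.pi ** 2 / 6

ks = np.arange(1, 10)
# Share of samples equal to each value k from 1 to 9.
empirical = np.array([np.mean(x == k) for k in ks])
# Cast to float first: NumPy rejects negative powers of integer arrays.
theoretical = ks.astype(float) ** -a / zeta_2

for k, e, t in zip(ks, empirical, theoretical):
    print(f"k={k}: empirical={e:.4f}, theoretical={t:.4f}")
```

The two columns should agree closely, with about 61% of the samples equal to 1.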