Zipf distributions are used to sample data according to Zipf’s law.
Zipf’s law states that in a collection, the frequency of the nth most common term is 1/n times the frequency of the most common term. For example, the 5th most common word in English appears roughly 1/5 as often as the most common word.
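To make the 1/n rule concrete, here is a minimal sketch that lists the relative frequencies the law predicts for the top few ranks. The helper name `zipf_relative_frequencies` is made up for this illustration and is not part of NumPy.

```python
def zipf_relative_frequencies(num_ranks):
    """Relative frequency of each rank compared with rank 1, per Zipf's law (1/n)."""
    return [1 / rank for rank in range(1, num_ranks + 1)]


# Print the predicted frequency of ranks 1 to 5, relative to the most common term.
for rank, freq in enumerate(zipf_relative_frequencies(5), start=1):
    print(f"rank {rank}: {freq:.3f} of the most common term")
```

The 5th entry is 0.2, which matches the "1/5 as often" example above.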
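The random.zipf() method takes two parameters:
a (distribution parameter): Controls the skewness of the distribution. It must be greater than 1; larger values concentrate the samples more heavily on small numbers.
size: Defines the shape of the returned array.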
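A quick way to see what the distribution parameter does is to measure how often the value 1 is drawn for different values of a. The sketch below is only an illustration: the helper `share_of_ones`, the sample size, and the seed are arbitrary choices, and it uses NumPy's newer `default_rng` generator so the results are reproducible.

```python
import numpy as np


def share_of_ones(a, n=100_000, seed=0):
    """Fraction of n Zipf samples equal to 1 for distribution parameter a."""
    rng = np.random.default_rng(seed)  # seeded generator for reproducibility
    samples = rng.zipf(a, size=n)
    return float(np.mean(samples == 1))


# A larger a should put a larger share of the samples on the value 1.
for a in (1.5, 2.0, 4.0):
    print(f"a={a}: share of ones = {share_of_ones(a):.3f}")
```

According to the NumPy documentation, the probability of drawing k is proportional to k raised to the power -a, so the share of ones should rise as a grows, and values of a close to 1 should give a much heavier tail.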
Draw samples from a Zipf distribution with a distribution parameter of 2, returned as an array of shape 2×3.
from numpy import random

x = random.zipf(a=2, size=(2, 3))
print(x)
Sample 1,000 points, but plot only those with values less than 10. The distribution has a long tail, so a few very large values would otherwise stretch the axis and make the chart hard to read.
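from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns

x = random.zipf(a=2, size=1000)

# Keep only values below 10 so the long tail does not dominate the chart.
# histplot replaces the deprecated distplot(..., kde=False).
sns.histplot(x[x < 10], discrete=True)
plt.show()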