
Pandas Intro

What is Pandas?

Pandas is a Python library designed for handling datasets, offering functions for analyzing, cleaning, exploring, and manipulating data. The name “Pandas” is derived from “Panel Data” and “Python Data Analysis,” and it was created by Wes McKinney in 2008.

Why Use Pandas?

Pandas lets you analyze large datasets and draw conclusions based on statistical principles. It can also clean messy datasets, making them readable and relevant. Relevant data is essential in data science.

Data science is a field of computer science concerned with how to store, use, and analyze data in order to extract meaningful insights from it.

What Can Pandas Do?

Pandas helps you derive insights from data, such as:

  • Is there a correlation between two or more columns?
  • What is the average value?
  • What is the maximum value?
  • What is the minimum value?
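The questions above can be sketched on a tiny table. This is only an illustration: the column names "Duration" and "Calories" and all values are made up for the example and do not come from the lesson.

```python
import pandas as pd

# Hypothetical workout data, invented purely for illustration.
df = pd.DataFrame({
    "Duration": [30, 45, 60, 45],
    "Calories": [200, 300, 400, 300],
})

# Is there a correlation between two columns? (Pearson by default)
correlation = df["Duration"].corr(df["Calories"])

# Average, maximum, and minimum of a single column.
average = df["Calories"].mean()
maximum = df["Calories"].max()
minimum = df["Calories"].min()

print(correlation, average, maximum, minimum)
```

To compare every numeric column against every other one at once, `df.corr()` returns the full correlation table.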

Pandas can also remove rows that are irrelevant or that contain wrong values, such as empty or NULL values. This process is known as data cleaning.
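As a minimal sketch of data cleaning, the example below drops rows that contain an empty value. The data is hypothetical, and `dropna()` is just one of several cleaning approaches; it is shown here because it matches the "remove rows with empty values" case.

```python
import pandas as pd

# Hypothetical data with one missing value (None is stored as NaN).
raw = pd.DataFrame({
    "Duration": [30, 45, None, 60],
    "Calories": [200, 300, 250, 400],
})

# dropna() returns a new DataFrame without the rows that contain
# missing values; the original DataFrame is left unchanged.
cleaned = raw.dropna()

print(len(raw), len(cleaned))
```

Because `dropna()` returns a copy by default, the original table stays available for comparison with the cleaned one.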