NumPy
NumPy is a powerful open-source library for Python that provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. It is widely used in scientific computing, data analysis, and machine learning due to its performance and ease of use.
NumPy Tutorial
NumPy is a powerful Python library for numerical computing that provides support for large, multi-dimensional arrays, along with a vast collection of mathematical functions to operate on these arrays efficiently.
NumPy Getting Started is an introductory guide to using NumPy, a powerful library in Python for numerical computing, providing support for arrays, matrices, and a variety of mathematical functions.
NumPy Creating Arrays refers to the process of generating arrays using various methods like array(), zeros(), ones(), and other functions to initialize and manipulate data in array form.
Dimensions in arrays refer to the number of axes or coordinates needed to specify an element within the array, indicating its shape and structure, such as one-dimensional (1D), two-dimensional (2D), or multi-dimensional arrays.
Check Number of Dimensions refers to reading an array's ndim attribute, which returns the number of dimensions (axes) of the array as an integer.
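As a minimal sketch of the creation functions and the dimension check described above (the values and variable names are arbitrary illustrations, not part of the original text):

```python
import numpy as np

a0 = np.array(42)                      # 0-D array (a single scalar)
a1 = np.array([1, 2, 3])               # 1-D array
a2 = np.array([[1, 2, 3], [4, 5, 6]])  # 2-D array
z = np.zeros((2, 3))                   # 2x3 array filled with 0.0
o = np.ones(4)                         # length-4 array filled with 1.0

# ndim reports the number of axes of each array
dims = [a0.ndim, a1.ndim, a2.ndim]
print(dims)  # [0, 1, 2]
```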
NumPy array indexing is the method of accessing and modifying elements or subsets of elements within a NumPy array using integer indices or slices.
Accessing 2-D arrays involves using row and column indices to retrieve or modify specific elements within the array's grid-like structure.
Accessing 3-D arrays involves using three indices to specify the depth, row, and column of the desired element within the array's multi-dimensional structure.
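The indexing rules for 1-D, 2-D, and 3-D arrays might look like this in practice; the arrays below are made-up examples:

```python
import numpy as np

arr1 = np.array([10, 20, 30, 40])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
arr3 = np.arange(24).reshape(2, 3, 4)  # 2 blocks, 3 rows, 4 columns

first = arr1[0]        # first element -> 10
elem2 = arr2[1, 2]     # row 1, column 2 -> 6
elem3 = arr3[1, 0, 2]  # block 1, row 0, column 2 -> 14
last = arr2[-1, -1]    # negative indices count from the end -> 6

arr1[0] = 99           # indexing can also modify an element in place
```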
NumPy array slicing is the technique of extracting a subset of elements from an array by specifying a range of indices.
Negative slicing is a technique that allows accessing elements of an array using negative indices, which count from the end of the array rather than the beginning.
Slicing 2-D arrays involves selecting a subset of rows and columns, allowing for the extraction of specific sections from the original array; the result is a view of the original data rather than a copy.
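A short sketch of slicing with positive indices, negative indices, a step, and two axes, using illustrative arrays:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

middle = arr[1:5]        # index 1 up to (not including) 5 -> [2 3 4 5]
tail = arr[-3:-1]        # third from the end up to the last -> [5 6]
every_other = arr[::2]   # step of 2 -> [1 3 5 7]

grid = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
block = grid[0:2, 1:4]   # both rows, columns 1..3 -> [[2 3 4] [7 8 9]]
```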
NumPy data types are the various formats that define the type of data stored in a NumPy array, such as integers, floats, booleans, and strings, each with specific memory requirements.
Creating arrays with a defined data type involves explicitly specifying the desired data type when initializing a NumPy array, ensuring that all elements conform to that type.
Converting the data type on existing arrays involves using NumPy's astype() method, which returns a new array (a copy) with the requested data type and leaves the original array unchanged.
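The following sketch shows a data type being set at creation time and converted afterwards; the sample values are assumptions chosen to make the truncation visible:

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)   # dtype fixed at creation
floats = np.array([1.1, 2.9, -3.7])          # float64 by default

as_int = floats.astype(np.int64)             # new array; fractions dropped -> [1 2 -3]
as_bool = np.array([1, 0, 3]).astype(bool)   # nonzero becomes True -> [True False True]

print(ints.dtype, floats.dtype, as_int.dtype)
```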
NumPy array copy creates a new array with its own data, while a view provides a reference to the original array's data, meaning changes to the view will affect the original array.
A view is a way to access or modify the data of an array without creating a copy, allowing for more memory-efficient operations while maintaining a reference to the original data.
Checking if an array owns its data uses the base attribute, which is None for an array that owns its data (such as a copy) and refers to the original array for a view.
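The difference between a copy and a view can be illustrated with a small example (the array contents are arbitrary):

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])
c = original.copy()   # independent data
v = original.view()   # shares data with original

original[0] = 99      # change the original after copying/viewing

copy_first = c[0]     # copy is unaffected -> 1
view_first = v[0]     # view sees the change -> 99

copy_owns_data = c.base is None      # True: the copy owns its data
view_refers = v.base is original     # True: the view refers to the original
```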
NumPy array shape refers to a tuple that specifies the dimensions of the array, indicating the size along each axis.
NumPy array reshaping is the process of changing the shape or dimensions of an existing array without altering its data.
Returns Copy or View? asks whether an operation such as reshape() produces a copy or a view; inspecting the result's base attribute answers it, and reshape() returns a view whenever the memory layout allows.
Flattening an array involves converting a multidimensional array into a one-dimensional array, for example with reshape(-1).
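A minimal sketch of shape, reshaping, and flattening, assuming a 12-element array so that the target shapes divide evenly:

```python
import numpy as np

flat = np.arange(1, 13)          # 12 elements, shape (12,)

grid = flat.reshape(4, 3)        # 4 rows, 3 columns
cube = flat.reshape(2, 3, 2)     # 2 blocks of 3x2
auto = flat.reshape(2, -1)       # -1 lets NumPy infer the size -> shape (2, 6)

is_view = np.shares_memory(grid, flat)  # True here: reshape returned a view

back = cube.reshape(-1)          # flatten back to 1-D
flattened = grid.flatten()       # flatten() always returns a copy
```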
NumPy array iterating refers to the process of looping through the elements of an array, allowing access to each element for operations or computations.
Iterating 2-D arrays involves accessing each element row by row or column by column using loops or other iteration techniques.
Iterating 3-D arrays involves looping through each element across all three dimensions, allowing access to individual scalars or sub-arrays.
Iterating with different step size involves traversing an array's elements at specified intervals rather than sequentially, allowing for flexible access to elements.
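One way to sketch these iteration patterns is with plain loops, np.nditer(), and np.ndenumerate(); the arrays are illustrative only:

```python
import numpy as np

arr2 = np.array([[1, 2, 3], [4, 5, 6]])
arr3 = np.arange(8).reshape(2, 2, 2)

# A plain for loop over a 2-D array yields one row at a time
rows = [row.tolist() for row in arr2]

# np.nditer visits every scalar regardless of the number of dimensions
scalars2 = [int(x) for x in np.nditer(arr2)]
scalars3 = [int(x) for x in np.nditer(arr3)]

# Iterating with a step: slice first, then iterate (every second column)
stepped = [int(x) for x in np.nditer(arr2[:, ::2])]

# np.ndenumerate also reports the index of each element
indexed = [(idx, int(x)) for idx, x in np.ndenumerate(arr2)]
```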
Joining NumPy arrays involves concatenating two or more arrays along a specified axis to create a unified array.
Joining arrays using stack functions involves combining multiple arrays of the same shape along a new axis with np.stack(), or stacking them horizontally or vertically with helpers such as np.hstack() and np.vstack().
Stacking along rows refers to combining multiple arrays vertically with np.vstack(), placing them one on top of the other; 1-D inputs become the rows of a new 2-D array, while 2-D inputs are joined along the first axis.
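The joining and stacking functions could be exercised as follows; the input arrays are made up for illustration:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

joined = np.concatenate((a, b))                 # [1 2 3 4 5 6]
joined_cols = np.concatenate((m1, m2), axis=1)  # [[1 2 5 6] [3 4 7 8]]

stacked = np.stack((a, b), axis=1)  # new axis -> [[1 4] [2 5] [3 6]]
h = np.hstack((a, b))               # horizontal -> [1 2 3 4 5 6]
v = np.vstack((a, b))               # vertical -> [[1 2 3] [4 5 6]]
```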
Splitting 2-D arrays involves dividing a two-dimensional array into multiple sub-arrays along either the horizontal or vertical axis using functions like np.hsplit() (horizontal split) and np.vsplit() (vertical split).
NumPy array splitting is the process of dividing an array into multiple sub-arrays using functions like np.split(), np.hsplit(), np.vsplit(), or np.array_split() based on specified indices or sections.
Splitting into arrays refers to the process of dividing a single array into multiple smaller arrays based on specified indices or conditions.
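A brief sketch of the splitting functions named above, using small arrays chosen so that the uneven case is visible:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

thirds = np.array_split(arr, 3)   # [1 2], [3 4], [5 6]
uneven = np.array_split(arr, 4)   # array_split tolerates uneven sizes:
                                  # [1 2], [3 4], [5], [6]

grid = np.arange(1, 13).reshape(2, 6)
left_to_right = np.hsplit(grid, 3)  # three blocks of shape (2, 2)
top_to_bottom = np.vsplit(grid, 2)  # two blocks of shape (1, 6)
```

np.split() behaves like np.array_split() but raises an error when the array cannot be divided into equal parts.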
Searching arrays in NumPy involves using functions like np.where(), np.searchsorted(), or boolean indexing to locate specific elements or conditions within an array.
Search sorted in NumPy refers to using the np.searchsorted() function to identify the indices at which specified values can be inserted into a sorted array while preserving its order.
Searching from the right side refers to passing side='right' to np.searchsorted(), which returns the rightmost index at which a value can be inserted while preserving order; the default, side='left', returns the leftmost such index.
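The search functions can be sketched like this, with sample arrays chosen for illustration:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

idx_of_4 = np.where(arr == 4)[0]      # indices where the value is 4 -> [3 5 6]
even_idx = np.where(arr % 2 == 0)[0]  # indices of even values -> [1 3 5 6]

sorted_arr = np.array([6, 7, 8, 9])
left = np.searchsorted(sorted_arr, 7)                 # leftmost slot -> 1
right = np.searchsorted(sorted_arr, 7, side='right')  # rightmost slot -> 2
many = np.searchsorted(sorted_arr, [5, 8, 10])        # several values -> [0 2 4]
```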
NumPy sorting arrays involves using functions like np.sort(), which returns a sorted copy in ascending order, and np.argsort(), which returns the indices that would sort the array; descending order is obtained by reversing the result.
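A small sketch of sorting, assuming integer arrays; reversing the ascending result is one common way to get descending order:

```python
import numpy as np

arr = np.array([3, 1, 2, 0])

asc = np.sort(arr)         # sorted copy -> [0 1 2 3]
desc = np.sort(arr)[::-1]  # reversed for descending order -> [3 2 1 0]
order = np.argsort(arr)    # indices that would sort arr -> [3 1 2 0]

grid = np.array([[3, 2, 4], [5, 0, 1]])
row_sorted = np.sort(grid)  # sorts along the last axis -> [[2 3 4] [0 1 5]]
```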
NumPy filtering arrays involves using boolean indexing or the np.where() function to create a new array that contains only the elements that meet specified conditions.
Creating a filter directly from an array involves applying a condition to the array to generate a boolean index list that selects elements meeting that condition.
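Filtering with a boolean mask might look like the following; the threshold and values are arbitrary:

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

mask = arr > 42             # boolean array -> [False False True True]
selected = arr[mask]        # elements where mask is True -> [43 44]
evens = arr[arr % 2 == 0]   # condition applied inline -> [42 44]
```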
NumPy Random
Random refers to the generation of values or selections in a way that is unpredictable and lacks a discernible pattern, often using algorithms or processes that mimic randomness.
Generating a random array involves creating an array filled with random values using functions from libraries like NumPy, such as np.random.rand() or np.random.randint().
Generating a random number from an array involves selecting a random element from an existing array using functions like np.random.choice().
Data distribution refers to the way in which values are spread or arranged within a dataset, often described using statistical measures such as mean, median, variance, and visualized through graphs like histograms or box plots.
Random distribution refers to the way in which random values are spread or allocated over a range, characterized by their occurrence frequencies and probabilities.
Random permutations refer to the rearrangement of the elements in an array into a random order, typically achieved using functions like np.random.permutation() in NumPy.
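The random functions mentioned above can be sketched as follows. The seed, sizes, and probability weights are assumptions added for illustration; the seed only makes repeated runs produce the same values.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed for repeatable output

ints = np.random.randint(100, size=(3, 5))  # integers in [0, 100), shape (3, 5)
floats = np.random.rand(3, 5)               # floats in [0, 1), shape (3, 5)

pick = np.random.choice([3, 5, 7, 9])       # one random element

# p gives the probability of each value; 9 has probability 0 and never appears
weighted = np.random.choice([3, 5, 7, 9], p=[0.1, 0.3, 0.6, 0.0], size=100)

perm = np.random.permutation(np.array([1, 2, 3, 4, 5]))  # shuffled copy
```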
Seaborn is a Python data visualization library based on Matplotlib that provides a high-level interface for drawing attractive statistical graphics and visualizing complex datasets.
Importing Seaborn involves including the library in your Python script or notebook using the statement import seaborn as sns.
Normal distribution is a continuous probability distribution represented by a symmetrical bell-shaped curve, where observations cluster around the mean and probabilities decline symmetrically as values diverge from the mean.
Binomial distribution is a discrete probability distribution that models the number of successes in a fixed number of independent Bernoulli trials, each with the same probability of success.
Poisson distribution is a discrete probability distribution that models the number of events occurring within a fixed interval of time or space, given a constant mean rate and independence of events.
The main difference between Normal and Poisson distributions is that Normal is continuous, modeling data around a mean, while Poisson is discrete, modeling the frequency of events in a fixed interval.
Uniform distribution is a probability distribution where all outcomes are equally likely within a given range.
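Logistic distribution is a continuous probability distribution with a symmetrical bell-shaped density and heavier tails than the normal distribution; its cumulative distribution function is the S-shaped logistic curve, and it is commonly used to model growth and in logistic regression.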
Multinomial Distribution is a generalization of the binomial distribution that models the outcomes of experiments where each trial results in one of multiple possible categories, defined by the number of trials and the probabilities of each category.
Exponential distribution is a continuous probability distribution that models the time between events in a Poisson process, characterized by a constant average rate of occurrence.
The Chi-Square distribution is the continuous probability distribution of the sum of the squares of k independent standard normal variables, where k is the number of degrees of freedom; it is commonly used in hypothesis testing and goodness-of-fit tests.
The Rayleigh distribution is a probability distribution used to model the magnitude of a vector that has orthogonal components following independent and identical normal distributions.
The Pareto distribution is a probability distribution that models the distribution of wealth or other quantities, characterized by the principle that a small percentage of the population controls a large portion of the total.
The Zipf distribution is a discrete probability distribution in which the frequency of an item is inversely proportional to a power of its rank, often used to describe word frequencies in natural language.
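Samples from several of the distributions described above can be drawn with functions in np.random. This is only a sketch: the parameter values, sample size, and seed are assumptions picked for illustration.

```python
import numpy as np

np.random.seed(1)  # assumption: fixed seed for repeatable output
n = 10_000         # assumption: sample size chosen for illustration

normal = np.random.normal(loc=1.0, scale=2.0, size=n)     # mean 1, std dev 2
binom = np.random.binomial(n=10, p=0.5, size=n)           # successes in 10 trials
poisson = np.random.poisson(lam=2.0, size=n)              # events at mean rate 2
uniform = np.random.uniform(low=0.0, high=1.0, size=n)    # equally likely in [0, 1)
expo = np.random.exponential(scale=2.0, size=n)           # mean waiting time 2
chi2 = np.random.chisquare(df=2, size=n)                  # 2 degrees of freedom

print(normal.mean(), binom.mean(), poisson.mean())
```

The normal and exponential samples are continuous, while the binomial and Poisson samples contain only non-negative integers.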
NumPy ufunc
Ufuncs (universal functions) are functions in NumPy that operate element-wise on arrays, enabling fast and efficient computations across entire datasets without the need for explicit loops.
Vectorization in computing refers to the process of applying an operation to an entire set of data (such as an array or vector) in a single step, rather than using loops to apply the operation element by element.
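To make the contrast concrete, here is a minimal sketch that adds two lists first with an explicit loop and then with the np.add() ufunc; the lists are arbitrary examples:

```python
import numpy as np

x = [1, 2, 3, 4]
y = [4, 5, 6, 7]

# Without vectorization: iterate over both lists and add element by element
looped = [a + b for a, b in zip(x, y)]

# With vectorization: one call operates on all elements
vectorized = np.add(x, y)   # [5 7 9 11]
```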
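Creating your own ufunc involves writing an ordinary Python function and converting it with np.frompyfunc(), which takes the function, its number of input arguments, and its number of outputs, enabling custom element-wise operations on arrays.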
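A custom ufunc could be built with np.frompyfunc() roughly as follows; my_add is a hypothetical function name used only for this sketch:

```python
import numpy as np

def my_add(x, y):
    """Hypothetical scalar function to be turned into a ufunc."""
    return x + y

# Arguments: the function, number of inputs (2), number of outputs (1)
my_add_ufunc = np.frompyfunc(my_add, 2, 1)

result = my_add_ufunc([1, 2, 3, 4], [5, 6, 7, 8])  # object-dtype array
is_ufunc = isinstance(my_add_ufunc, np.ufunc)      # True
```

Ufuncs created this way return arrays of dtype object, so a call to astype() may be needed before further numerical work.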
Ufunc Simple Arithmetic refers to universal functions in NumPy that perform basic arithmetic operations, such as addition, subtraction, multiplication, and division, element-wise on arrays.
Subtraction is a basic arithmetic operation that calculates the difference between two numbers or quantities; np.subtract() applies it element-wise to two arrays.
Multiplication is an arithmetic operation that combines two numbers to produce a product; np.multiply() applies it element-wise to two arrays.
Power raises each element of the first array to the power given by the corresponding element of the second array, as done by np.power().
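Quotient is the result of dividing one number by another, while Mod (modulus) is the remainder left after division; np.mod() and np.remainder() return the remainder, and np.divmod() returns both the quotient and the remainder.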
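The arithmetic ufuncs can be sketched on two small integer arrays (values chosen arbitrarily):

```python
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.array([3, 5, 6, 8])

total = np.add(a, b)          # [13 25 36 48]
diff = np.subtract(a, b)      # [ 7 15 24 32]
prod = np.multiply(a, b)      # [ 30 100 180 320]
quot = np.divide(a, b)        # true division, returns floats

powers = np.power(np.array([2, 3, 4]), np.array([3, 2, 1]))  # [8 9 4]

rem = np.mod(a, b)            # remainders -> [1 0 0 0]
q, r = np.divmod(a, b)        # quotients [3 4 5 5] and remainders [1 0 0 0]
```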
Ufunc Rounding Decimals refers to NumPy functions that round numerical values: trunc() and fix() drop the fractional part, around() rounds to a specified number of decimal places, and floor() and ceil() round down and up to the nearest integer.
Floor is an operation that rounds a number down to the nearest integer, returning the largest integer less than or equal to the given value.
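A quick sketch of the rounding functions on one negative and one positive value, chosen to show how the functions differ:

```python
import numpy as np

vals = np.array([-3.1666, 3.6667])

truncated = np.trunc(vals)      # drop the fraction -> [-3.  3.]
fixed = np.fix(vals)            # round toward zero -> [-3.  3.]
rounded = np.around(vals, 2)    # two decimal places -> [-3.17  3.67]
floored = np.floor(vals)        # round down -> [-4.  3.]
ceiled = np.ceil(vals)          # round up -> [-3.  4.]
```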
NumPy logs are functions that compute logarithms of elements in arrays, supporting various bases like natural log (log), base 2 (log2), and base 10 (log10).
Log at Base 10 (log₁₀) is the power to which 10 must be raised to equal a given number.
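The logarithm functions can be sketched as below. The change-of-base line is an added illustration for bases that have no dedicated function:

```python
import numpy as np

base10 = np.log10(np.array([1, 10, 100, 1000]))  # [0. 1. 2. 3.]
base2 = np.log2(np.array([1, 2, 4, 8]))          # [0. 1. 2. 3.]
natural = np.log(np.e)                           # 1.0

# Any other base via the change-of-base rule: log_b(x) = ln(x) / ln(b)
base3 = np.log(np.array([1, 9, 27])) / np.log(3)  # approximately [0, 2, 3]
```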
Summations refer to the process of adding a sequence of numbers or values to obtain their total.
Summation over an axis refers to adding elements along a specified axis or dimension in a multi-dimensional array, collapsing that axis into a single value.
Products refer to the result of multiplying a sequence of numbers or values together.
Product over an axis refers to multiplying elements along a specified axis of a multi-dimensional array, collapsing that axis into a single product for each remaining position.
Cumulative refers to taking partial results step by step: np.cumsum() returns the running sums of an array and np.cumprod() returns the running products.
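Differences refer to the discrete difference computed by np.diff(), which subtracts each element from the one that follows it; the optional n argument repeats the operation n times.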
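Summations, products, cumulative results, and differences might be computed like this on small example arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

elementwise = np.add(a, b)           # addition of two arguments -> [5 7 9]
total = np.sum([a, b])               # summation over everything -> 21
per_row = np.sum([a, b], axis=1)     # summation over axis 1 -> [ 6 15]

product = np.prod(a)                 # 1 * 2 * 3 -> 6
prod_rows = np.prod([a, b], axis=1)  # product over axis 1 -> [  6 120]

running_sum = np.cumsum(a)           # partial sums -> [1 3 6]
running_prod = np.cumprod(b)         # partial products -> [  4  20 120]

series = np.array([10, 15, 25, 5])
steps = np.diff(series)              # successive differences -> [  5  10 -20]
steps2 = np.diff(series, n=2)        # differences of differences -> [  5 -30]
```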
The least common multiple (LCM) is the smallest positive integer that is a multiple of two or more numbers.
Finding the LCM (Least Common Multiple) in arrays involves determining the smallest positive integer that is evenly divisible by every element in the array, for example by applying np.lcm.reduce() to the array.
A ufunc for finding the greatest common divisor (GCD) is a universal function in NumPy that computes the largest positive integer that divides two or more numbers without leaving a remainder.
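A short sketch of np.lcm() and np.gcd(), both on a pair of numbers and reduced over a whole array (the numbers are arbitrary):

```python
import numpy as np

lcm_pair = np.lcm(4, 6)                            # 12
lcm_all = np.lcm.reduce(np.array([3, 6, 9]))       # 18

gcd_pair = np.gcd(6, 9)                            # 3
gcd_all = np.gcd.reduce(np.array([20, 8, 32, 36, 16]))  # 4
```

The reduce() method applies the two-argument ufunc repeatedly across the array, which is what turns a pairwise LCM or GCD into one for all elements.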
Trigonometry is the branch of mathematics that studies the relationships between the angles and sides of triangles.
Finding angles involves recovering an angle from a sine, cosine, or tangent value using the inverse trigonometric functions arcsin(), arccos(), and arctan(), which return results in radians.
Hyperbolic refers to the mathematical functions and identities related to hyperbolas, similar to trigonometric functions but based on the geometry of hyperbolas, such as sinh, cosh, and tanh.
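The trigonometric, inverse, and hyperbolic functions can be sketched as follows; the input angles are illustrative:

```python
import numpy as np

sine = np.sin(np.pi / 2)                   # sine of 90 degrees -> 1.0

degrees = np.array([90, 180, 270, 360])
radians = np.deg2rad(degrees)              # degrees -> radians
back = np.rad2deg(radians)                 # radians -> degrees

angle = np.arcsin(1.0)                     # inverse sine -> pi/2 radians

sinh0 = np.sinh(0.0)                       # hyperbolic sine -> 0.0
cosh0 = np.cosh(0.0)                       # hyperbolic cosine -> 1.0
inv = np.arcsinh(1.0)                      # inverse hyperbolic sine
```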
Set operations are mathematical operations that manipulate sets, including union, intersection, difference, and symmetric difference, to combine or compare the elements of two or more sets.
Finding the union of sets involves combining all unique elements from multiple sets into a single set, excluding duplicates.
Finding the intersection of sets involves identifying the common elements shared by multiple sets.
Finding the difference between sets involves determining the elements that are present in one set but not in another.
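In NumPy, sets are usually represented as 1-D arrays of unique values. A minimal sketch of the set operations, with example arrays that are assumptions for illustration:

```python
import numpy as np

# np.unique turns any array into a sorted array of unique values
unique = np.unique(np.array([1, 1, 1, 2, 3, 4, 5, 5, 6, 7]))  # [1 2 3 4 5 6 7]

a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5, 6])

union = np.union1d(a, b)                              # [1 2 3 4 5 6]
common = np.intersect1d(a, b, assume_unique=True)     # [3 4]
only_a = np.setdiff1d(a, b, assume_unique=True)       # [1 2]
either = np.setxor1d(a, b, assume_unique=True)        # [1 2 5 6]
```

Passing assume_unique=True can speed up the computation and should only be used when the inputs really contain no duplicates.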