A regular expression is a sequence of characters that defines a search pattern. It allows you to specify what you are searching for when finding or manipulating text data.
Regular expressions can range from simple single characters to complex patterns, enabling various text search and replace operations.
In PHP, regular expressions are defined as strings containing delimiters, a pattern, and optional modifiers.
$exp = "/w3schools/i";
In the example above, / serves as the delimiter, w3schools is the pattern being searched for, and i is a modifier that enables case-insensitive searching.
A delimiter can be any character that isn’t a letter, number, backslash, or space. The forward slash (/) is the most common choice, but when the pattern itself contains forward slashes, a delimiter such as # or ~ is more convenient because the slashes in the pattern then need no escaping.
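As a rough illustration of why an alternative delimiter helps, the sketch below matches the same path fragment twice, once with / and once with # as the delimiter. The URL and the variable names are made up for this example.

```php
<?php
// Hypothetical sample text that contains forward slashes.
$url = "https://www.example.com/docs/php/regex";

// With "/" as the delimiter, every slash inside the pattern must be escaped.
$withSlash = '/\/docs\/php\//';

// With "#" as the delimiter, the same pattern needs no escaping.
$withHash = '#/docs/php/#';

$slashResult = preg_match($withSlash, $url);
$hashResult = preg_match($withHash, $url);

echo $slashResult, " ", $hashResult, PHP_EOL; // 1 1
```

Both patterns match the same text; the second is simply easier to read.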
PHP offers a variety of functions that facilitate the use of regular expressions.
Some of the most common functions include:
Function | Description
preg_match() | Returns 1 if the pattern is found in the string and 0 otherwise.
preg_match_all() | Returns the number of times the pattern is found in the string, which could be 0 if no matches are found.
preg_replace() | Returns a new string where matched patterns have been replaced with another specified string.
The preg_match() function determines whether a string contains a match for a specified pattern.
Use a regular expression to perform a case-insensitive search for “w3schools” within a string:
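A minimal sketch of such a search might look like the following. The sample string is an assumption chosen for illustration; the pattern is the one described earlier.

```php
<?php
// Sample text (assumed for this illustration).
$str = "Visit W3Schools";

// Pattern "w3schools" with the i modifier for case-insensitive matching.
$pattern = "/w3schools/i";

// preg_match() stops at the first match and returns 1, or 0 if there is none.
$found = preg_match($pattern, $str);

echo $found; // 1
```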
The preg_match_all() function reports how many matches were found for a pattern within a string.
Use a regular expression to perform a case-insensitive count of the occurrences of “ain” within a string:
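One way to sketch this count, assuming a sample sentence that contains "ain" in both lowercase and uppercase:

```php
<?php
// Sample sentence (assumed for this illustration).
$str = "The rain in SPAIN falls mainly on the plains.";

// "ain" with the i modifier also matches "AIN".
$pattern = "/ain/i";

// preg_match_all() returns the total number of matches.
$count = preg_match_all($pattern, $str);

echo $count; // 4 (rain, SPAIN, mainly, plains)
```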
The preg_replace() function replaces all matches of a pattern in a string with another specified string.
Use a case-insensitive regular expression to replace occurrences of “Microsoft” with “W3Schools” in a string:
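A minimal sketch of this replacement, again with an assumed sample string:

```php
<?php
// Sample text (assumed for this illustration).
$str = "Visit Microsoft!";

// Case-insensitive pattern, so "Microsoft", "microsoft" and "MICROSOFT" all match.
$pattern = "/microsoft/i";

// Arguments: pattern, replacement, subject. The original string is left unchanged.
$result = preg_replace($pattern, "W3Schools", $str);

echo $result; // Visit W3Schools!
```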
Modifiers alter how a search operation is executed.
Modifier | Description
i | Performs a case-insensitive search.
m | Performs a multiline search: ^ and $ match at the beginning and end of each line, not only at the beginning and end of the whole string.
u | Treats the pattern and the searched string as UTF-8, so multibyte characters are matched correctly.
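The effect of each modifier can be sketched with a few small checks. The sample strings and variable names below are assumptions made for illustration.

```php
<?php
// Two-line sample text (assumed for this illustration).
$text = "first line\nSecond line";

// i: letter case is ignored.
$caseInsensitive = preg_match("/second/i", $text);

// Without m, ^ matches only at the very start of the whole string.
$withoutM = preg_match("/^Second/", $text);

// With m, ^ also matches at the start of each line.
$withM = preg_match("/^Second/m", $text);

// u: "." matches one whole UTF-8 character instead of one byte.
// "\u{00E9}" is the letter e with an acute accent, stored as two bytes in UTF-8.
$accented = "\u{00E9}";
$byteMatches = preg_match_all("/./", $accented);
$charMatches = preg_match_all("/./u", $accented);

echo "$caseInsensitive $withoutM $withM $byteMatches $charMatches"; // 1 0 1 2 1
```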
Brackets are used to specify a set or range of characters to match within a regular expression:
Expression | Description
[abc] | Matches one character that is any of the characters inside the brackets.
[^abc] | Matches one character that is not any of the characters inside the brackets.
[a-z] | Matches one lowercase letter from a to z.
[A-z] | Matches one character in the ASCII range from uppercase A to lowercase z. Besides letters, this range also includes [, \, ], ^, _ and `; use [A-Za-z] to match letters only.
[A-Z] | Matches one uppercase letter from A to Z.
[123] | Matches one character that is any of the digits inside the brackets.
[0-5] | Matches one digit from 0 to 5.
[0-9] | Matches one digit.
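A bracket expression matches a single character at a time, which a short sketch can make visible by counting matches. The sample strings are assumptions made for illustration.

```php
<?php
// Each of a, c, a, b is matched separately, so there are four matches.
$inSet = preg_match_all("/[abc]/", "a cab");

// Only the space is outside the set.
$notInSet = preg_match_all("/[^abc]/", "a cab");

// Only 0 and 3 fall in the range 0 to 5.
$lowDigits = preg_match_all("/[0-5]/", "0369");

// The underscore sits between Z and a in ASCII, so [A-z] matches it.
$wideRange = preg_match("/[A-z]/", "_");

// [A-Za-z] matches letters only.
$lettersOnly = preg_match("/[A-Za-z]/", "_");

echo "$inSet $notInSet $lowDigits $wideRange $lettersOnly"; // 4 1 2 1 0
```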
Metacharacters are characters that hold special meanings in the context of regular expressions.
Metacharacter | Description
| (vertical bar) | Matches any one of the patterns it separates, for example: cat|dog|fish
. | Matches any single character except a newline.
^ | Matches at the beginning of the string, as in: ^Hello
$ | Matches at the end of the string, as in: World$
\d | Matches any digit.
\D | Matches any character that is not a digit.
\s | Matches any whitespace character (space, tab, newline).
\S | Matches any character that is not whitespace.
\w | Matches any word character: a letter (a to z or A to Z), a digit (0 to 9), or an underscore.
\W | Matches any character that is not a word character.
\b | Matches at a word boundary: at the beginning of a word, like this: \bWORD, or at the end of a word, like this: WORD\b
\x{xxxx} | Matches the Unicode character with the hexadecimal code point xxxx; requires the u modifier. The \uxxxx notation used in some other languages is not supported by PHP's PCRE engine.
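The sketch below exercises several metacharacters on small, made-up strings. Patterns are written in single quotes so that PHP passes the backslashes through to the regex engine unchanged. For the Unicode check it uses the \x{xxxx} form with the u modifier, which is how PCRE (the engine behind the preg functions) writes a code point.

```php
<?php
// Alternation: any one of the listed words.
$pet = preg_match('/cat|dog|fish/', "I have a dog");

// Anchors: start and end of the string.
$starts = preg_match('/^Hello/', "Hello World");
$ends = preg_match('/World$/', "Hello World");

// Character types, counted with preg_match_all().
$digits = preg_match_all('/\d/', "Room 101");
$spaces = preg_match_all('/\s/', "a b\tc");
$wordChars = preg_match_all('/\w/', "a_1!");

// Word boundaries: "cat" as a whole word only.
$inside = preg_match('/\bcat\b/', "concatenate");
$whole = preg_match('/\bcat\b/', "the cat sat");

// Unicode code point U+00E9, matched with the u modifier.
$accent = preg_match('/\x{00E9}/u', "caf\u{00E9}");

echo "$pet $starts $ends $digits $spaces $wordChars $inside $whole $accent";
// 1 1 1 3 2 3 0 1 1
```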
Quantifiers specify how many instances of a character or group are expected in a pattern.
Quantifier | Description
n+ | Matches one or more consecutive occurrences of n.
n* | Matches zero or more consecutive occurrences of n.
n? | Matches zero or one occurrence of n.
n{3} | Matches exactly three consecutive occurrences of n.
n{2,5} | Matches between 2 and 5 consecutive occurrences of n. Do not put a space after the comma, or the braces may be treated as literal text.
n{3,} | Matches at least three consecutive occurrences of n.
Note: To search for a character that has a special meaning in regular expressions, such as a question mark, escape it with a backslash (\). For example, to search for one or more question marks, you can use the following expression: $pattern = '/\?+/';
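The quantifiers and the escaping rule can be tried out with a short sketch. The test strings are assumptions made for illustration, and most patterns are anchored with ^ and $ so that the whole string has to fit the quantifier.

```php
<?php
// n+ : at least one "n" somewhere in the string.
$plus = preg_match('/n+/', "banana");

// n* : zero occurrences are allowed, so the empty string fits.
$star = preg_match('/^n*$/', "");

// n? : "a" followed by zero or one "n".
$optionalOne = preg_match('/^an?$/', "an");
$optionalTwo = preg_match('/^an?$/', "ann");

// n{3} : exactly three in a row are needed for a match.
$exactShort = preg_match('/n{3}/', "nn");
$exactOk = preg_match('/n{3}/', "nnn");

// n{2,5} : between two and five.
$rangeOk = preg_match('/^n{2,5}$/', "nnnn");
$rangeTooMany = preg_match('/^n{2,5}$/', "nnnnnn");

// n{3,} : three or more.
$atLeast = preg_match('/n{3,}/', "nnnn");

// Escaped question mark: "??" and "?" are two separate runs.
$questionRuns = preg_match_all('/\?+/', "What?? Really?");

echo "$plus $star $optionalOne $optionalTwo $exactShort $exactOk $rangeOk $rangeTooMany $atLeast $questionRuns";
// 1 1 1 0 0 1 1 0 1 2
```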
Parentheses ( ) in regular expressions are used to apply quantifiers to entire patterns and to capture parts of the pattern for use as matches.
Use grouping in a regular expression to search for the word “banana” by looking for “ba” followed by two instances of “na”:
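A minimal sketch of this grouping, assuming a sample sentence that contains the word. The optional third argument of preg_match() is used here to show what the group captured.

```php
<?php
// Sample text (assumed for this illustration).
$str = "Apples and bananas.";

// "ba" followed by the group "na" repeated exactly twice.
$pattern = "/ba(na){2}/i";

// $matches[0] holds the full match; $matches[1] holds the last text captured by the group.
$found = preg_match($pattern, $str, $matches);

echo $found, " ", $matches[0], " ", $matches[1]; // 1 banana na
```

Without the parentheses, the quantifier {2} would apply only to the final "a" of "bana", so the pattern would look for "banaa" instead.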