In Python, regular expressions (regex) are a powerful way to search for and manipulate strings. Python’s built-in re
module provides functionality for working with regular expressions.
Here’s a basic overview of how to use regular expressions in Python:
1. Importing the re Module
To use regular expressions in Python, you need to import the re module.
import re
2. Basic Syntax for Regular Expressions
. : Matches any character except a newline.
^ : Matches the start of the string.
$ : Matches the end of the string (or just before a trailing newline).
[] : Matches any single character listed within the brackets.
| : Acts like a logical OR, matching the pattern on its left or the pattern on its right.
() : Groups patterns together so that quantifiers apply to the whole group, and captures the matched text.
\ : Escapes a special character or introduces a special sequence (such as \d for digits or \w for word characters).
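As a rough illustration of the metacharacters listed above, the following sketch runs each one against a small made-up string. The sample strings and variable names are only assumptions chosen for the demonstration.

```python
import re

# "." matches any single character except a newline, so "c\nt" is skipped.
dot_matches = re.findall(r"c.t", "cat cot c\nt cut")

# "^" and "$" anchor the pattern to the start and end of the string.
starts_with_hello = bool(re.search(r"^Hello", "Hello, world!"))
ends_with_world = bool(re.search(r"world!$", "Hello, world!"))

# "[]" matches any one character from the set.
vowels = re.findall(r"[aeiou]", "regex")

# "|" matches the pattern on either side of it.
pets = re.findall(r"cat|dog", "cat, dog, bird")

# "()" groups a pattern so a quantifier applies to the whole group.
repeated = re.search(r"(ab)+", "ababab!").group()

# "\" escapes a metacharacter so it is matched literally.
literal_dots = re.findall(r"\.", "a.b.c")

print(dot_matches)        # ['cat', 'cot', 'cut']
print(starts_with_hello)  # True
print(ends_with_world)    # True
print(vowels)             # ['e', 'e']
print(pets)               # ['cat', 'dog']
print(repeated)           # ababab
print(literal_dots)       # ['.', '.']
```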
3. Commonly Used Functions in the re Module
re.match(pattern, string) : Tries to match the pattern at the start of the string.
re.search(pattern, string) : Searches for the first match of the pattern anywhere in the string.
re.findall(pattern, string) : Returns all non-overlapping matches of the pattern in the string as a list.
re.finditer(pattern, string) : Returns an iterator yielding match objects for all matches.
re.sub(pattern, repl, string) : Replaces the occurrences of the pattern with the replacement string.
re.split(pattern, string) : Splits the string by the occurrences of the pattern.
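The difference between re.match and re.search, and the behavior of re.finditer and re.split, may be easier to see side by side. This is a minimal sketch; the date string is an invented sample.

```python
import re

text = "2024-01-15, 2023-12-31"

# re.match only succeeds if the pattern matches at position 0.
year_at_start = re.match(r"\d{4}", text).group()
dash_at_start = re.match(r"-", text)  # None: the string does not start with "-"

# re.search scans forward and returns the first match anywhere.
first_dash_index = re.search(r"-", text).start()

# re.finditer yields match objects, which carry positions as well as text.
years = [(m.group(), m.start()) for m in re.finditer(r"\d{4}", text)]

# re.split cuts the string wherever the pattern matches.
dates = re.split(r",\s*", text)

print(year_at_start)     # 2024
print(dash_at_start)     # None
print(first_dash_index)  # 4
print(years)             # [('2024', 0), ('2023', 12)]
print(dates)             # ['2024-01-15', '2023-12-31']
```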
4. Special Sequences in Regex
\d : Matches any digit (0-9; for str patterns, other Unicode decimal digits as well).
\D : Matches any non-digit character.
\w : Matches any word character: letters, digits, and the underscore.
\W : Matches any character that is not a word character (not a letter, digit, or underscore).
\s : Matches any whitespace character (spaces, tabs, newlines).
\S : Matches any non-whitespace character.
\b : Matches a word boundary.
\B : Matches a position that is not a word boundary.
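A short sketch of the special sequences above in action, using made-up sample strings. Note how \b and \B treat the same letters differently depending on what surrounds them.

```python
import re

text = "Order 66 shipped on day_3!"

digits = re.findall(r"\d+", text)       # runs of digits
words = re.findall(r"\w+", text)        # runs of letters, digits, underscore
non_words = re.findall(r"\W", text)     # everything that is not a word character
fields = re.split(r"\s+", "a  b\tc\nd")  # split on any run of whitespace

# \b requires a word boundary, so "cat" inside "concat" or "category" is skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cat's category")

# \B requires the absence of a boundary, so only the "cat" inside "concat" matches.
inside_words = re.findall(r"\Bcat", "cat concat")

print(digits)        # ['66', '3']
print(words)         # ['Order', '66', 'shipped', 'on', 'day_3']
print(non_words)     # [' ', ' ', ' ', ' ', '!']
print(fields)        # ['a', 'b', 'c', 'd']
print(whole_words)   # ['cat', 'cat']
print(inside_words)  # ['cat']
```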
5. Example Usage
Example 1: Matching a Pattern at the Start of a String
import re
text = "Hello, world!"
pattern = r"^Hello"
match = re.match(pattern, text)
if match:
    print("Match found:", match.group())
else:
    print("No match")
Example 2: Searching for a Pattern Anywhere in the String
text = "The rain in Spain falls mainly in the plain."
pattern = r"rain"
result = re.search(pattern, text)
if result:
    print("Match found:", result.group())
else:
    print("No match")
Example 3: Finding All Matches
text = "cat, bat, rat, flat"
pattern = r"\b\w{3}\b"  # Matches whole words of exactly three word characters
matches = re.findall(pattern, text)
print(matches)  # ['cat', 'bat', 'rat']
Example 4: Replacing Text Using re.sub
text = "The sky is blue."
pattern = r"blue"
replacement = "clear"
new_text = re.sub(pattern, replacement, text)
print(new_text)
6. Using Groups in Regex
You can capture parts of the matched string using parentheses () to create groups.
text = "My number is 123-456-7890."
pattern = r"(\d{3})-(\d{3})-(\d{4})"
match = re.search(pattern, text)
if match:
    print("Area code:", match.group(1))
    print("First part of number:", match.group(2))
    print("Second part of number:", match.group(3))
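Beyond numbered access, match objects also expose all groups at once, and groups can be given names with the (?P<name>...) syntax. The sketch below reuses the phone-number example; the group names area, prefix, and line are illustrative choices, not part of the original pattern.

```python
import re

text = "My number is 123-456-7890."
# Same pattern as above, with each group given a (hypothetical) name.
pattern = r"(?P<area>\d{3})-(?P<prefix>\d{3})-(?P<line>\d{4})"

match = re.search(pattern, text)
if match:
    whole = match.group(0)       # group 0 is always the entire match
    parts = match.groups()       # tuple of all captured groups
    area = match.group("area")   # access a group by name
    named = match.groupdict()    # dict of all named groups

    print(whole)  # 123-456-7890
    print(parts)  # ('123', '456', '7890')
    print(area)   # 123
    print(named)  # {'area': '123', 'prefix': '456', 'line': '7890'}
```

Named groups can make longer patterns easier to read, since the calling code no longer depends on counting parentheses.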
7. Quantifiers
* : Matches 0 or more occurrences of the preceding pattern.
+ : Matches 1 or more occurrences of the preceding pattern.
? : Matches 0 or 1 occurrence of the preceding pattern.
{n} : Matches exactly n occurrences of the preceding pattern.
{n,} : Matches n or more occurrences of the preceding pattern.
{n,m} : Matches between n and m occurrences of the preceding pattern.
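To compare the quantifiers above, this sketch checks a handful of sample strings against each one. The strings and the results dictionary are assumptions made for the demonstration; re.fullmatch is used so that a string only counts if the whole of it matches.

```python
import re

samples = ["ct", "cat", "caat", "caaat"]
patterns = [r"ca*t", r"ca+t", r"ca?t", r"ca{2}t", r"ca{2,}t", r"ca{1,2}t"]

# For each pattern, collect the samples that it matches in full.
results = {p: [s for s in samples if re.fullmatch(p, s)] for p in patterns}

for pattern, matched in results.items():
    print(pattern, matched)
# ca*t ['ct', 'cat', 'caat', 'caaat']
# ca+t ['cat', 'caat', 'caaat']
# ca?t ['ct', 'cat']
# ca{2}t ['caat']
# ca{2,}t ['caat', 'caaat']
# ca{1,2}t ['cat', 'caat']
```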
8. Example: Validating an Email Address
import re
email = "user@example.com"  # placeholder address
pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
if re.match(pattern, email):
    print("Valid email")
else:
    print("Invalid email")
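One possible refinement, sketched here under the assumption that the check will be reused: wrap it in a function and use re.fullmatch instead of re.match with ^ and $. The function name is_valid_email and the test addresses are hypothetical. The pattern is the same simplified one as above, so it remains a rough format check rather than a complete validator.

```python
import re

# Same simplified pattern as above, minus the anchors (fullmatch makes them unnecessary).
EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+")

def is_valid_email(address):
    """Return True if the entire string looks like an email address."""
    return EMAIL_PATTERN.fullmatch(address) is not None

print(is_valid_email("user@example.com"))               # True
print(is_valid_email("user.name+tag@mail.example.org"))  # True
print(is_valid_email("not-an-email"))                   # False
print(is_valid_email("user@localhost"))                 # False

# "$" also matches just before a trailing newline, so the anchored
# re.match version accepts this input while fullmatch rejects it.
anchored = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
print(re.match(anchored, "user@example.com\n") is not None)  # True
print(is_valid_email("user@example.com\n"))                  # False
```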
This is just a brief overview of regular expressions in Python. Regular expressions are highly flexible and can be used for complex pattern matching and text manipulation tasks. If you have a specific regex problem or pattern you’re working with, feel free to ask for more tailored help!