Regular Expressions Cheat Sheet and Common Patterns
Regular expressions, or regex, are powerful pattern-matching tools used in virtually every programming language and text editor. While the syntax can appear cryptic at first, mastering regex unlocks the ability to search, validate, extract, and transform text with precision that would require dozens of lines of procedural code. This guide covers the essential syntax, common patterns, and practical techniques you need to know.
Basic Regex Syntax Cheat Sheet
Character Classes
| Pattern | Meaning | Example |
|---|---|---|
| `.` | Any character except newline | `a.c` matches abc, a1c, a c |
| `\d` | Any digit (0-9) | `\d\d` matches 42 |
| `\D` | Any non-digit | `\D` matches a, !, space |
| `\w` | Word character (a-z, A-Z, 0-9, _) | `\w+` matches hello_123 |
| `\W` | Non-word character | `\W` matches @, #, space |
| `\s` | Whitespace (space, tab, newline) | `a\sb` matches a b |
| `\S` | Non-whitespace | `\S+` matches hello |
| `[abc]` | Any character in the set | `[aeiou]` matches a vowel |
| `[^abc]` | Any character not in the set | `[^0-9]` matches non-digits |
| `[a-z]` | Character range | `[A-Za-z]` matches any letter |
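The character classes above can be tried out with a short sketch using Python's built-in `re` module. The sample strings and variable names are made up for illustration. Note that in Python, `\d` and `\w` also match non-ASCII digits and letters unless the `re.ASCII` flag is passed.

```python
import re

# Hypothetical sample text, chosen only to exercise the classes in the table.
sample = "Order 42: hello_123 @ 10 Downing St"

# \d+ collects each run of digits.
digit_runs = re.findall(r"\d+", sample)

# \w+ collects runs of letters, digits, and underscores.
word_runs = re.findall(r"\w+", sample)

# [aeiou] matches one vowel at a time.
vowels = re.findall(r"[aeiou]", "regex")

# [^0-9] is a negated set; deleting every match leaves only the digits.
digits_only = re.sub(r"[^0-9]", "", "tel: 555-1234")

print(digit_runs)   # ['42', '123', '10']
print(word_runs)    # ['Order', '42', 'hello_123', '10', 'Downing', 'St']
print(vowels)       # ['e', 'e']
print(digits_only)  # 5551234
```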
Quantifiers
| Pattern | Meaning | Example |
|---|---|---|
| `*` | Zero or more | `ab*c` matches ac, abc, abbc |
| `+` | One or more | `ab+c` matches abc, abbc |
| `?` | Zero or one | `colou?r` matches color, colour |
| `{n}` | Exactly n times | `\d{4}` matches 4 digits |
| `{n,}` | n or more times | `\d{2,}` matches 2+ digits |
| `{n,m}` | Between n and m times | `\d{3,5}` matches 3-5 digits |
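A minimal Python sketch of the quantifiers in this table. The helper `matches_whole` is a hypothetical convenience wrapper around `re.fullmatch`, not part of any library.

```python
import re

def matches_whole(pattern, text):
    """Return True when the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None

candidates = ("ac", "abc", "abbc")

# * allows zero b's, so "ac" is accepted.
star = [s for s in candidates if matches_whole(r"ab*c", s)]

# + requires at least one b, so "ac" is rejected.
plus = [s for s in candidates if matches_whole(r"ab+c", s)]

# ? makes the u optional, but only once.
optional = [s for s in ("color", "colour", "colouur") if matches_whole(r"colou?r", s)]

# {3,5} takes between three and five digits per match, preferring the longest.
bounded = re.findall(r"\d{3,5}", "12 1234 1234567")

print(star)      # ['ac', 'abc', 'abbc']
print(plus)      # ['abc', 'abbc']
print(optional)  # ['color', 'colour']
print(bounded)   # ['1234', '12345']
```

In the last line, the trailing "67" is left over because two digits are too few for another match.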
Anchors and Boundaries
| Pattern | Meaning | Example |
|---|---|---|
| `^` | Start of string (or of each line in multiline mode) | `^Hello` matches Hello at start |
| `$` | End of string (or of each line in multiline mode) | `world$` matches world at end |
| `\b` | Word boundary | `\bcat\b` matches cat alone |
| `\B` | Non-word boundary | `\Bcat` matches the cat inside scattered |
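The following Python sketch illustrates the anchors and boundaries above. The test strings are invented for the example, and `re.MULTILINE` is Python's name for multiline mode.

```python
import re

# \b...\b only matches "cat" when it stands alone as a word.
whole_words = re.findall(r"\bcat\b", "cat concat scattered cat.")

# \Bcat needs a word character right before "cat", so the standalone
# word at index 0 is skipped and only the one inside "scattered" is found.
inner_positions = [m.start() for m in re.finditer(r"\Bcat", "cat scattered")]

two_lines = "Hello\nHello"

# Without multiline mode, ^ matches only at the start of the string.
single_mode = re.findall(r"^Hello", two_lines)

# With re.MULTILINE, ^ also matches right after each newline.
multi_mode = re.findall(r"^Hello", two_lines, re.MULTILINE)

print(whole_words)      # ['cat', 'cat']
print(inner_positions)  # [5]
print(len(single_mode), len(multi_mode))  # 1 2
```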
Groups and Alternation
| Pattern | Meaning | Example |
|---|---|---|
| `(abc)` | Capturing group | `(ab)+` matches abab |
| `(?:abc)` | Non-capturing group | `(?:ab)+` matches abab |
| `a\|b` | Alternation (or) | `cat\|dog` matches cat or dog |
| `\1` | Backreference to group 1 | `(\w)\1` matches aa, bb |
| `(?<name>abc)` | Named capturing group | `(?<year>\d{4})` captures four digits as the group year |
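As a rough illustration of groups and alternation in Python: note that Python's `re` module spells named groups `(?P<name>...)` rather than `(?<name>...)`. The sample strings and variable names are assumptions made for this sketch.

```python
import re

# Named groups: Python uses the (?P<name>...) spelling.
release = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Released 2024-03")
year, month = release.group("year"), release.group("month")

# Backreference: \1 must repeat whatever group 1 captured.
doubled = re.search(r"(\w)\1", "balloon").group()

# With a capturing group, findall returns the group's last capture;
# with a non-capturing group, it returns the whole match.
capturing = re.findall(r"(ab)+", "abab ab")
non_capturing = re.findall(r"(?:ab)+", "abab ab")

# Alternation picks whichever branch matches at each position.
animals = re.findall(r"cat|dog", "cat, dog, bird")

print(year, month)    # 2024 03
print(doubled)        # ll
print(capturing)      # ['ab', 'ab']
print(non_capturing)  # ['abab', 'ab']
print(animals)        # ['cat', 'dog']
```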
Common Regex Patterns
Here are the most frequently needed regex patterns for web development and data validation:
Email Validation
A practical email regex that catches most invalid addresses without being overly strict: `^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`. This pattern checks the basic email structure: it allows common characters in the local part and requires a domain that ends in a TLD of at least two letters. It does not confirm that the domain or mailbox actually exists.
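One way to wrap the email pattern in Python is sketched below. The function name `looks_like_email` is hypothetical, and `fullmatch` is used so that a trailing newline is not silently accepted, which `$` alone would allow in Python.

```python
import re

# The email pattern from this section, compiled once for reuse.
EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def looks_like_email(text):
    """Return True when text has the basic shape of an email address."""
    return EMAIL_RE.fullmatch(text) is not None

print(looks_like_email("user.name+tag@example.co.uk"))  # True
print(looks_like_email("user@localhost"))               # False: no dot and TLD
print(looks_like_email("user@@example.com"))            # False: two @ signs
print(looks_like_email("a@b..com"))                     # True: a known gap
```

The last case shows a limit of the pattern: consecutive dots in the domain slip through, so a structural check like this is best treated as a first filter rather than proof of a valid address.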
URL Validation
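To match HTTP and HTTPS URLs: `^https?:\/\/(www\.)?[a-zA-Z0-9-]+\.[a-zA-Z]{2,}(\/[^\s]*)?$`. This pattern matches URLs with an optional www prefix, a single domain label followed by a TLD, and an optional path. It does not match other subdomains, multi-part suffixes such as .co.uk, or port numbers.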
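The sketch below applies the URL pattern in Python to show what it accepts and rejects. The function name `looks_like_url` and the example URLs are invented for illustration.

```python
import re

# The URL pattern from this section.
URL_RE = re.compile(r"^https?:\/\/(www\.)?[a-zA-Z0-9-]+\.[a-zA-Z]{2,}(\/[^\s]*)?$")

def looks_like_url(text):
    """Return True when text fits the simple HTTP/HTTPS URL shape."""
    return URL_RE.fullmatch(text) is not None

print(looks_like_url("https://www.example.com/path?q=1"))  # True
print(looks_like_url("http://example.org"))                # True
print(looks_like_url("ftp://example.com"))                 # False: wrong scheme
print(looks_like_url("https://api.example.com"))           # False: extra subdomain
```

When URLs need to be broken into parts rather than just screened, the standard library's `urllib.parse.urlsplit` may be a better fit than a hand-written pattern.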
Phone Number (US)
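For US phone numbers in various formats: `^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$`. This handles formats like (555) 123-4567, 555-123-4567, and 5551234567. Because each parenthesis is optional on its own, it also accepts unbalanced input such as (555-123-4567.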
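A possible use of the phone pattern is to normalize the accepted formats into one. The sketch below adds capturing groups around the three digit runs, which is a change to the pattern made for this example, and `normalize_us_phone` is a hypothetical helper name.

```python
import re

# Same shape as the pattern above, with capturing groups added around
# the area code, prefix, and line number (an addition for this sketch).
PHONE_RE = re.compile(r"^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$")

def normalize_us_phone(text):
    """Return the number as 555-123-4567, or None when it does not match."""
    match = PHONE_RE.fullmatch(text)
    if match is None:
        return None
    return "-".join(match.groups())

print(normalize_us_phone("(555) 123-4567"))  # 555-123-4567
print(normalize_us_phone("5551234567"))      # 555-123-4567
print(normalize_us_phone("555-1234"))        # None
```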
IP Address (IPv4)
To validate IPv4 addresses: `^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$`. This ensures each octet is between 0 and 255, although it also accepts leading zeros such as 01 or 007.
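The IPv4 pattern can be compared with Python's standard `ipaddress` module, as in this sketch. The two function names are hypothetical; `ipaddress.IPv4Address` is the standard library class and raises a `ValueError` subclass on invalid input.

```python
import ipaddress
import re

# The IPv4 pattern from this section, split across two lines for readability.
IPV4_RE = re.compile(
    r"^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}"
    r"(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$"
)

def regex_says_ipv4(text):
    """Check the address shape with the regex alone."""
    return IPV4_RE.fullmatch(text) is not None

def stdlib_says_ipv4(text):
    """Check the address with the standard library parser."""
    try:
        ipaddress.IPv4Address(text)
    except ValueError:
        return False
    return True

print(regex_says_ipv4("192.168.1.1"))   # True
print(regex_says_ipv4("256.1.1.1"))     # False: 256 is out of range
print(regex_says_ipv4("01.02.03.004"))  # True: leading zeros are allowed
print(stdlib_says_ipv4("256.1.1.1"))    # False
```

If leading zeros should be rejected, the standard library parser in recent Python versions is stricter than this regex, so the two checks may disagree on such input.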
Date (YYYY-MM-DD)
For ISO date format: `^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$`. This validates the format structure but does not check for valid days in each month, so 2024-02-30 still passes.
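To close the calendar gap, one option is to let the regex check the shape and hand the numbers to `datetime.date`, which rejects impossible days. This is a sketch under that assumption; `parse_iso_date` is a hypothetical name, and the year is wrapped in an extra capturing group that the pattern above does not have.

```python
import re
from datetime import date

# The date pattern from this section, with the year captured as well
# (an addition for this sketch).
DATE_RE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

def parse_iso_date(text):
    """Return a date for a real YYYY-MM-DD calendar day, otherwise None."""
    match = DATE_RE.fullmatch(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        # date() raises ValueError for days such as February 30.
        return date(year, month, day)
    except ValueError:
        return None

print(parse_iso_date("2024-02-29"))  # 2024-02-29 (leap year)
print(parse_iso_date("2024-02-30"))  # None: passes the regex, fails the calendar
print(parse_iso_date("2023-13-01"))  # None: month 13 fails the regex
```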
Hex Color Code
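To match CSS hex colors: `^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$`. This matches both 3-digit shorthand (#fff) and 6-digit (#ffffff) formats. The leading # is optional in this pattern, so fff also matches, even though CSS itself requires the #.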
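The capturing group in the hex color pattern makes it easy to normalize colors. The sketch below expands shorthand to six digits; `expand_hex_color` is a hypothetical helper written for this example.

```python
import re

# The hex color pattern from this section.
HEX_RE = re.compile(r"^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$")

def expand_hex_color(text):
    """Return the color as lowercase #rrggbb, or None when it does not match."""
    match = HEX_RE.fullmatch(text)
    if match is None:
        return None
    digits = match.group(1).lower()
    if len(digits) == 3:
        # Shorthand doubles each digit: abc becomes aabbcc.
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits

print(expand_hex_color("#FFF"))    # #ffffff
print(expand_hex_color("1a2b3c"))  # #1a2b3c
print(expand_hex_color("#ffff"))   # None: four digits is neither form
```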
Advanced Techniques
Lookahead and Lookbehind
Lookahead and lookbehind assertions let you match patterns only when they are (or are not) followed or preceded by another pattern, without including the surrounding text in the match.
- Positive lookahead `(?=pattern)`: Matches only if the pattern follows. Example: `\d(?=px)` matches a digit only when it is immediately followed by "px".
- Negative lookahead `(?!pattern)`: Matches only if the pattern does not follow. Example: `\d(?!px)` matches a digit that is not immediately followed by "px".
- Positive lookbehind `(?<=pattern)`: Matches only if the pattern precedes. Example: `(?<=\$)\d+` matches numbers after a dollar sign.
- Negative lookbehind `(?<!pattern)`: Matches only if the pattern does not precede. Example: `(?<!\$)\b\d+` matches numbers that are not preceded by a dollar sign.
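The four assertions can be compared side by side in Python, as in this sketch. The sample strings are invented, and the negative lookahead is extended with `\d` so that a multi-digit number cannot match partway through. Python's `re` module requires lookbehind patterns to have a fixed width.

```python
import re

sizes = "12px 3em 40px"
prices = "cost $25, qty 3"

# Positive lookahead: digits immediately followed by "px"; "px" is not captured.
before_px = re.findall(r"\d+(?=px)", sizes)

# Negative lookahead: digit runs not followed by "px". The extra \d stops
# the engine from backtracking to "1" inside "12px".
not_before_px = re.findall(r"\d+(?!\d|px)", sizes)

# Positive lookbehind: digits right after a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", prices)

# Negative lookbehind: whole numbers that do not follow a dollar sign.
not_after_dollar = re.findall(r"(?<!\$)\b\d+", prices)

print(before_px)         # ['12', '40']
print(not_before_px)     # ['3']
print(after_dollar)      # ['25']
print(not_after_dollar)  # ['3']
```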
Greedy vs Lazy Matching
By default, quantifiers are greedy, meaning they match as much text as possible. Adding a question mark after a quantifier makes it lazy, matching as little as possible. For example, `<.*>` applied to "<b>bold</b>" matches the entire string, while `<.*?>` matches only "<b>".
Regex Performance Tips
- Use non-capturing groups `(?:)` when you do not need to extract the matched group content
- Avoid excessive backtracking by using possessive quantifiers or atomic groups where supported
- Be specific with character classes instead of using the dot metacharacter broadly
- Anchor your patterns with `^` and `$` when you need to match the entire string
- Pre-compile regex patterns when using them repeatedly in loops
- Use word boundaries `\b` instead of `^` and `$` when searching within larger text
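A few of these tips can be sketched in Python as follows. The log lines, pattern names, and helper functions are all hypothetical. Python's `re` module also caches recently used patterns internally, so the benefit of pre-compiling is likely to be modest there and larger in other engines.

```python
import re

# Compiled once and reused for every line. The word boundaries keep
# "errors" or "terror" from counting as matches.
ERROR_RE = re.compile(r"\berror\b", re.IGNORECASE)

# Non-capturing group: the alternation is grouped without storing a capture.
# The ^ anchor ties the level to the start of the line.
LEVEL_RE = re.compile(r"^(?:WARN|ERROR|FATAL):")

def count_error_lines(lines):
    """Count lines that contain the standalone word "error"."""
    return sum(1 for line in lines if ERROR_RE.search(line))

def has_level_prefix(line):
    """Return True when the line starts with an upper-case level and a colon."""
    return LEVEL_RE.match(line) is not None

log = ["Error: disk full", "no errors here", "fatal ERROR", "WARN: low memory"]

print(count_error_lines(log))                # 2
print(has_level_prefix("WARN: low memory"))  # True
print(has_level_prefix("Error: disk full"))  # False: LEVEL_RE is case-sensitive
```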
Testing and Debugging Regex
Always test your regex patterns with real-world input before deploying them. Use an interactive regex tester that highlights matches in real time and explains each part of the pattern. This helps you catch edge cases and understand why a pattern does or does not match specific input.
Frequently Asked Questions
What is the difference between greedy and lazy regex matching?
Greedy quantifiers match as much text as possible, while lazy quantifiers match as little as possible. Adding a question mark after a quantifier makes it lazy. For example, .* matches the longest possible string, while .*? matches the shortest.
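The difference is easy to see on a short HTML fragment. This Python sketch uses an invented sample string and also shows a negated character class, which is a common alternative to a lazy quantifier.

```python
import re

html = "<b>bold</b>"

# Greedy: .* runs to the end, then backs up to the last ">".
greedy = re.findall(r"<.*>", html)

# Lazy: .*? stops at the first ">" it can reach.
lazy = re.findall(r"<.*?>", html)

# Alternative: a negated class cannot run past ">" in the first place.
negated = re.findall(r"<[^>]*>", html)

print(greedy)   # ['<b>bold</b>']
print(lazy)     # ['<b>', '</b>']
print(negated)  # ['<b>', '</b>']
```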
Are regular expressions the same across all programming languages?
No, regex implementations vary across languages. Most modern engines share a core Perl-style syntax, but advanced features like lookbehind, named groups, and Unicode support differ. JavaScript historically had limited regex support; ES2018 added lookbehind, named capture groups, the s (dotAll) flag, and Unicode property escapes.
Can regex parse HTML or XML?
No, regular expressions cannot reliably parse HTML or XML because these are nested, context-free languages. Regex has no concept of nesting depth or balanced tags. Use a proper HTML or XML parser instead. Regex can extract simple patterns from HTML, but it will fail on complex structures.
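As one example of the parser approach, Python's standard `html.parser` module can collect link targets regardless of nesting or attribute order. The class `LinkCollector` and function `extract_links` are hypothetical names for this sketch.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag the parser encounters."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

def extract_links(html):
    """Return all link targets found in the given HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links

page = '<div><a href="/one">1</a><p><a class="x" href="/two">2</a></p></div>'
print(extract_links(page))  # ['/one', '/two']
```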
What are lookahead and lookbehind in regex?
Lookahead and lookbehind are zero-width assertions that check for patterns without consuming characters. (?=pattern) is a positive lookahead that matches if the pattern follows. (?<=pattern) is a positive lookbehind that matches if the pattern precedes. Negative versions use ! instead of =.
How do I make my regex case-insensitive?
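In languages with regex literals, add the i flag after the closing delimiter to make the entire pattern case-insensitive; in JavaScript, use /pattern/i. In Python, pass re.IGNORECASE or re.I as a flag. Many engines also accept an inline (?i) at the start of the pattern, and some support a scoped form such as (?i:...) to make only part of the pattern case-insensitive.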
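In Python, the three approaches look roughly like this. The sample strings are invented, and the scoped `(?i:...)` form is Python's spelling; other engines may differ or lack it.

```python
import re

# Flag argument: the whole pattern ignores case.
by_flag = re.search(r"hello", "Say HELLO", re.IGNORECASE)

# Inline flag at the very start of the pattern: same effect.
by_inline = re.search(r"(?i)hello", "Say HELLO")

# Scoped inline flag: only "hello" ignores case; " World" stays case-sensitive.
scoped_hit = re.search(r"(?i:hello) World", "HELLO World")
scoped_miss = re.search(r"(?i:hello) World", "HELLO world")

print(by_flag is not None)     # True
print(by_inline is not None)   # True
print(scoped_hit is not None)  # True
print(scoped_miss is None)     # True
```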