Developer Tools March 26, 2026 8 min read

Regex to Match Email: Patterns That Actually Work (2026)

Email validation with regex is one of the most debated topics in programming. The internet has dozens of patterns, ranging from a 5-character approximation to a 6,000-character RFC-compliant monstrosity. This guide cuts through the noise and tells you exactly which pattern to use and why.

Why Email Validation Is Harder Than It Looks

An email address looks simple: something@something.something. But the formal specifications (RFC 5321 and RFC 5322) allow an astonishing range of valid formats:

user@example.com - the obvious case
user+tag@example.com - subaddressing (plus addressing), widely used
first.last@subdomain.company.co.uk - dots and subdomains
"user name"@example.com - quoted local part with spaces (valid!)
user@[192.168.1.1] - IP address as domain (valid per RFC)
user@xn--nxasmq6b.com - internationalized domain names (Punycode)
very.unusual."@".unusual.com@example.com - yes, this is technically valid

No practical regex handles all of these correctly. The goal is not to implement RFC 5322 in a regex - it is to catch obvious typos while accepting the vast majority of real addresses people use every day.

The Standard Practical Pattern

This pattern covers the overwhelming majority of real-world email addresses:

[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}

Breaking it down:

[a-zA-Z0-9._%+\-]+    Local part: letters, digits, dots, underscores,
                       percent, plus, hyphen. One or more characters.

@                      Literal @ symbol (required)

[a-zA-Z0-9.\-]+        Domain: letters, digits, dots, hyphens.
                       One or more characters.

\.                     Literal dot before TLD

[a-zA-Z]{2,}          TLD: two or more letters (covers .io, .com, .museum, etc.)
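
The same breakdown can be written into the pattern itself. The following is a minimal Python sketch, assuming the standard re.VERBOSE flag, which ignores whitespace and comments inside the pattern. The name PRACTICAL_EMAIL is illustrative and does not come from any library.

```python
import re

# Illustrative name; the pattern is the one broken down above.
PRACTICAL_EMAIL = re.compile(
    r"""
    [a-zA-Z0-9._%+\-]+   # local part: letters, digits, . _ % + -
    @                    # literal @
    [a-zA-Z0-9.\-]+      # domain: letters, digits, dots, hyphens
    \.                   # literal dot before the TLD
    [a-zA-Z]{2,}         # TLD: two or more letters
    """,
    re.VERBOSE,
)

# fullmatch() requires the whole string to match, so no ^...$ anchors are needed.
print(bool(PRACTICAL_EMAIL.fullmatch("first.last@company.co.uk")))  # True
print(bool(PRACTICAL_EMAIL.fullmatch("user@com")))                  # False
```

Keeping the comments next to each piece makes later changes to a character class easier to review.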

What this pattern accepts

user@example.com            ✓
first.last@company.co.uk    ✓
user+tag@gmail.com          ✓
name_123@subdomain.org      ✓
contact@startup.io          ✓
admin@company.museum        ✓

What this pattern rejects (correctly)

@example.com                ✗  (missing local part)
user@.com                   ✗  (domain starts with dot)
user@com                    ✗  (no dot in domain)
user@                       ✗  (no domain at all)
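plaintext                   ✗  (no @ at all)

What this pattern accepts (incorrectly)

user@-example.com           ✓  (domain starts with a hyphen; the domain class allows a hyphen anywhere)
user@example..com           ✓  (consecutive dots in the domain)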

What this pattern rejects (incorrectly)

"user name"@example.com     ✗  (quoted local parts are valid per RFC)
user@[192.168.1.1]          ✗  (IP domain is valid per RFC)
unicode@münchen.de           ✗  (Unicode in domain, valid in modern email)

These false negatives are acceptable for most applications because they represent a tiny fraction of real user submissions.
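
The pattern also lets a few malformed domains through. The domain class [a-zA-Z0-9.\-]+ allows a hyphen or dot in any position, so user@-example.com and user@example..com both match. If that matters for your application, one option is to validate the domain label by label. The following is a rough Python sketch under that assumption. The names LABEL and STRICT_EMAIL are hypothetical, and the variant is an illustration rather than a standard pattern.

```python
import re

PRACTICAL = re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}")

# One domain label: starts and ends with a letter or digit, hyphens only inside.
LABEL = r"[a-zA-Z0-9](?:[a-zA-Z0-9\-]*[a-zA-Z0-9])?"

# Hypothetical stricter variant: one or more "label." groups, then a letters-only TLD.
STRICT_EMAIL = re.compile(
    rf"[a-zA-Z0-9._%+\-]+@(?:{LABEL}\.)+[a-zA-Z]{{2,}}"
)

for candidate in ["user@example.com", "user@-example.com", "user@example..com"]:
    print(
        candidate,
        bool(PRACTICAL.fullmatch(candidate)),     # True, True, True
        bool(STRICT_EMAIL.fullmatch(candidate)),  # True, False, False
    )
```

The stricter form is harder to read, so it is only worth adopting if these inputs actually cause problems downstream.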

The HTML5 Email Input Pattern

HTML5 browsers validate <input type="email"> using the WHATWG specification's "valid email address" definition. The equivalent regex is:

/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~\-]+@[a-zA-Z0-9](?:[a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?)*$/

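Browsers validate against this definition, and the specification gives the regex above as its equivalent. If you want your server-side validation to match what the browser accepts, use this. The specification itself describes the definition as a "willful violation" of RFC 5322 - simpler and more practical - and it is what millions of websites already rely on. Note that it does not require a dot in the domain, so addresses such as user@localhost pass.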
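
For a server written in Python, the same pattern can be transcribed for the re module. The sketch below assumes the regex shown above is copied faithfully. The only change is that / no longer needs escaping outside a JavaScript regex literal. The names HTML5_EMAIL and is_valid_email_html5 are illustrative.

```python
import re

# The WHATWG-style pattern from above, split across lines for readability.
HTML5_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~\-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?)*"
)

def is_valid_email_html5(email: str) -> bool:
    # fullmatch() plays the role of the ^...$ anchors in the JavaScript version.
    return HTML5_EMAIL.fullmatch(email) is not None

print(is_valid_email_html5("user@example.com"))   # True
print(is_valid_email_html5("user@localhost"))     # True: no dot is required
print(is_valid_email_html5("user@-example.com"))  # False: a label cannot start with a hyphen
```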

Step-by-Step: Email Validation in Different Languages

JavaScript

// Simple validation
function isValidEmail(email) {
  return /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/.test(email);
}

// HTML5-equivalent (matches browser behavior)
function isValidEmailHTML5(email) {
  const re = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~\-]+@[a-zA-Z0-9](?:[a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?)*$/;
  return re.test(email);
}

// Usage
console.log(isValidEmail("user@example.com"));     // true
console.log(isValidEmail("user+tag@gmail.com"));   // true
console.log(isValidEmail("notanemail"));           // false
console.log(isValidEmail("@domain.com"));          // false

Python

import re

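# No ^...$ anchors needed: fullmatch() must consume the whole string.
EMAIL_RE = re.compile(
    r'[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}'
)

def is_valid_email(email):
    # fullmatch() avoids a pitfall of match() with '$', which also
    # accepts a trailing newline such as "user@example.com\n".
    return EMAIL_RE.fullmatch(email) is not None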

# Test
emails = ["user@example.com", "bad@", "test+tag@company.co.uk"]
for e in emails:
    print(f"{e}: {is_valid_email(e)}")

PHP

<?php
// PHP has a built-in filter - prefer this over regex
$email = "user@example.com";
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
    echo "Valid";
} else {
    echo "Invalid";
}

// If you need regex explicitly (the D modifier stops $ from matching before a trailing newline):
$pattern = '/^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/D';
$isValid = preg_match($pattern, $email) === 1;

Extracting Emails From Text

When scanning text for email addresses (rather than validating a single field), you need to find all matches within a larger string. Use the global flag (or your language's find-all function) and drop the ^ and $ anchors:

// JavaScript: extract all emails from a block of text
const text = "Contact alice@example.com or bob+work@company.co.uk for support.";
// match() returns null when nothing is found, so fall back to an empty array
const emails = text.match(/[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g) ?? [];
// ["alice@example.com", "bob+work@company.co.uk"]

# Python: extract all emails from text
import re
text = "Send to user@domain.com and admin@company.org"
emails = re.findall(r'[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}', text)
# ['user@domain.com', 'admin@company.org']
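
Real text often repeats the same address. The following is a small sketch that keeps only the first occurrence of each one. It assumes addresses that differ only by letter case are duplicates, which holds for most providers but is not guaranteed by the RFCs. The function name extract_unique_emails is illustrative.

```python
import re

EMAIL_IN_TEXT = re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}")

def extract_unique_emails(text):
    """Return addresses in order of first appearance, ignoring case for duplicates."""
    seen = set()
    result = []
    for match in EMAIL_IN_TEXT.finditer(text):
        address = match.group(0)
        key = address.lower()  # assumption: case-insensitive comparison
        if key not in seen:
            seen.add(key)
            result.append(address)
    return result

sample = "Mail Alice@example.com, then alice@example.com again, or bob@company.org."
print(extract_unique_emails(sample))
# ['Alice@example.com', 'bob@company.org']
```

The trailing period after bob@company.org is not captured, because the pattern must end in letters.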

Why Regex Alone Is Never Enough

Even if an email passes your regex, there is no guarantee it is a working inbox. A regex can only check format. It cannot check:

Whether the domain actually exists and has MX records
Whether the mailbox exists at that domain
Whether the user has access to the inbox (is it theirs?)
Whether the inbox is full or blocked

The only reliable email verification is sending a confirmation link. For registration flows, the pattern should always be: (1) basic regex check to catch obvious typos immediately, (2) store the address, (3) send a confirmation email, (4) require the user to click the link before activating the account.

Regex validates format. Sending a confirmation email validates reality. Never skip the confirmation steps (3 and 4) for any application where email is used for authentication or communication.
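
The four-step flow can be sketched in a few lines of Python. Everything here is a simplified assumption: the in-memory pending and confirmed stores stand in for a database, send_confirmation_email is a placeholder that only prints, and the example.com URL is made up. Only secrets.token_urlsafe and re are real standard-library calls.

```python
import re
import secrets

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}")

# Hypothetical in-memory stores; a real application would use a database.
pending = {}       # token -> email awaiting confirmation
confirmed = set()  # emails whose owners clicked the link

def send_confirmation_email(email, token):
    # Placeholder: a real implementation would hand this to a mail service.
    print(f"To {email}: https://example.com/confirm?token={token}")

def register(email):
    """Steps 1-3: format check, store as pending, send the link."""
    if EMAIL_RE.fullmatch(email) is None:
        return None  # obvious typo, reject immediately
    token = secrets.token_urlsafe(32)  # unguessable, URL-safe token
    pending[token] = email
    send_confirmation_email(email, token)
    return token

def confirm(token):
    """Step 4: activate only when a known token comes back."""
    email = pending.pop(token, None)
    if email is None:
        return False
    confirmed.add(email)
    return True
```

A production version would also expire tokens after a set time and limit how often confirmation emails can be sent.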

Frequently Asked Questions

Should I validate email with regex on the client side, server side, or both?

Both, but for different reasons. Client-side validation with regex gives users immediate feedback without a round trip. Server-side validation is the authoritative check, because client-side validation can always be bypassed by an attacker. Never trust client-side input alone: the server should re-validate before storing. Use the same pattern in both places for consistent behavior.

Why does my email regex reject "user.name+tag+sorting@example.com"?

Check your character class for the local part. The + character is valid in email local parts. The pattern [a-zA-Z0-9._%+\-]+ includes + explicitly. If your pattern uses [a-zA-Z0-9._%-]+ without the +, it will reject plus-addressed emails. Gmail uses + for subaddressing (user+newsletters@gmail.com) and it is widely used for email filtering.
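
A quick way to confirm the diagnosis is to run both character classes against the failing address. This is a minimal Python check, and the variable names are illustrative.

```python
import re

without_plus = re.compile(r"[a-zA-Z0-9._%-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}")
with_plus = re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}")

address = "user.name+tag+sorting@example.com"
print(bool(without_plus.fullmatch(address)))  # False: + is not in the class
print(bool(with_plus.fullmatch(address)))     # True
```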

Is "user@localhost" a valid email address?

Technically yes, per RFC 5321 - it refers to a mailbox on the local machine. In practice, virtually no production email system accepts it. The standard practical pattern above rejects it (no dot before a TLD), although the HTML5 pattern accepts it because it does not require a dot in the domain. If you are building a developer tool or internal system that genuinely uses local addresses, you would need to adjust the domain part of the practical pattern to allow single-label domains: change [a-zA-Z0-9.\-]+\.[a-zA-Z]{2,} to [a-zA-Z0-9.\-]+(?:\.[a-zA-Z]{2,})?.
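
Here is a short Python sketch of that adjusted pattern, with LOCAL_OK as an illustrative name. It also shows the trade-off: once the TLD suffix is optional, the domain class can absorb the whole domain, so the two-letter TLD check no longer constrains anything.

```python
import re

# Adjusted domain part: the ".tld" suffix is optional.
LOCAL_OK = re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+(?:\.[a-zA-Z]{2,})?")

print(bool(LOCAL_OK.fullmatch("user@localhost")))    # True
print(bool(LOCAL_OK.fullmatch("user@example.com")))  # True
print(bool(LOCAL_OK.fullmatch("user@")))             # False
print(bool(LOCAL_OK.fullmatch("user@example.c")))    # True: TLD length is no longer enforced
```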

Why do some email validators reject new TLDs like .photography or .academy?

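Old patterns used [a-zA-Z]{2,4} for the TLD portion, capping it at 4 characters. That cap dates from when nearly every TLD in common use was 2 to 4 characters long (.com, .org, .info, country codes). Hundreds of longer TLDs have been delegated since then; the longest currently in the root zone run to 24 characters, and the DNS itself allows labels of up to 63. The correct quantifier is {2,} (two or more, no upper limit), which this guide's patterns use. If you inherited an older pattern, change {2,4} to {2,}.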
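
The effect of the quantifier change is easy to see with a long TLD. This is a minimal Python comparison. The address is made up for illustration.

```python
import re

old_tld = re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,4}")
new_tld = re.compile(r"[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}")

address = "hello@studio.photography"  # made-up example address
print(bool(old_tld.fullmatch(address)))  # False: "photography" is longer than 4 letters
print(bool(new_tld.fullmatch(address)))  # True
```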

What is the best library for production email validation?

For most languages, using the platform's built-in validator plus a confirmation flow is sufficient. Beyond that: Python's email-validator package (checks DNS MX records), Node.js's email-validator or validator.js, PHP's filter_var(FILTER_VALIDATE_EMAIL) (built-in), and Java's javax.mail.internet.InternetAddress. For high-volume applications requiring MX verification, a dedicated email verification API service is worth the cost.

The Bottom Line

For most applications, the practical pattern [a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,} is the right choice. It is readable, handles all common email formats, and avoids both the false positives of a too-loose pattern and the false negatives of over-engineering. Use ^...$ anchors when validating a single field; omit them when searching within text.

For production applications, always supplement regex validation with a confirmation email. Regex catches typos; confirmation proves the address is real and belongs to the user.

Use our free Regex Tester to test email patterns against a list of addresses and see exactly which ones match.