Regex Pattern Prompt Templates

AI prompt templates for creating regex patterns. Build and understand complex regular expressions.

Overview

Regular expressions are incredibly powerful but notoriously hard to get right. These prompts help you create patterns that match exactly what you need: no false positives catching unintended strings, and no false negatives missing valid input. The key is being specific about what should and shouldn't match.

Best Practices

1. Provide examples of strings that SHOULD match AND strings that should NOT match
2. Specify the regex flavor (JavaScript, Python, PCRE, etc.); syntax varies between implementations
3. Describe edge cases you're concerned about (empty strings, unicode, very long input)
4. Mention if performance matters (some patterns cause catastrophic backtracking)
5. Ask for an explanation with the pattern; you'll need to maintain it later
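
The first practice can be turned into a quick check once a pattern comes back. The sketch below is a minimal illustration, not part of any library: `checkPattern` is a hypothetical helper, and the ZIP-code pattern and example strings are assumptions chosen only to show the idea.

```javascript
// Hypothetical helper: runs a candidate regex against both example lists
// and reports which examples it gets wrong.
function checkPattern(regex, shouldMatch, shouldNotMatch) {
  // Rebuild the regex without the g/y flags so test() carries no lastIndex state
  const flags = regex.flags.replace(/[gy]/g, '');
  const safe = new RegExp(regex.source, flags);
  return {
    falseNegatives: shouldMatch.filter((s) => !safe.test(s)),
    falsePositives: shouldNotMatch.filter((s) => safe.test(s)),
  };
}

// Illustrative example: a five-digit ZIP code pattern (assumed for this sketch)
const report = checkPattern(
  /^\d{5}$/,
  ['12345', '90210'],
  ['1234', '123456', '12a45', '']
);
console.log(report); // both arrays are empty when the pattern behaves as intended
```

Keeping the two lists next to the pattern also gives you ready-made examples to paste into the next prompt if the pattern needs another round.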

Prompt Templates

1. Regex Pattern Generator

Create a regex pattern for this requirement:

**What to match:**
[DESCRIBE WHAT YOU'RE TRYING TO MATCH]

**Examples that SHOULD match:**
- [VALID EXAMPLE 1]
- [VALID EXAMPLE 2]
- [VALID EXAMPLE 3]

**Examples that should NOT match:**
- [INVALID EXAMPLE 1]
- [INVALID EXAMPLE 2]
- [INVALID EXAMPLE 3]

**Regex flavor:** [JAVASCRIPT / PYTHON / PCRE / POSIX]

**Requirements:**
- [ANY SPECIFIC REQUIREMENTS - e.g. case insensitive]
- [PERFORMANCE CONCERNS]

**Provide:**
1. The regex pattern
2. Explanation of each component
3. Test cases covering edge cases
4. Any caveats or limitations
Match: US phone numbers
Should match: (555) 123-4567, 555-123-4567, 555.123.4567, 5551234567
Should NOT match: 123-4567 (too short), 55-123-4567 (wrong format), 555-1234-567 (wrong grouping)
Flavor: JavaScript
Requirements: Capture area code, exchange, and subscriber number as groups
## US Phone Number Regex

```javascript
const phoneRegex = /^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$/;
```

### Component Breakdown

| Component | Meaning |
|-----------|--------|
| `^` | Start of string |
| `\(?` | Optional opening parenthesis |
| `([0-9]{3})` | **Group 1:** Area code (exactly 3 digits) |
| `\)?` | Optional closing parenthesis |
| `[-. ]?` | Optional separator (hyphen, dot, or space) |
| `([0-9]{3})` | **Group 2:** Exchange (exactly 3 digits) |
| `[-. ]?` | Optional separator |
| `([0-9]{4})` | **Group 3:** Subscriber (exactly 4 digits) |
| `$` | End of string |

### Usage

```javascript
const phoneRegex = /^\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})$/;

// Test matching
phoneRegex.test('(555) 123-4567'); // true
phoneRegex.test('555-123-4567'); // true
phoneRegex.test('5551234567'); // true

// Extract groups
const match = '(555) 123-4567'.match(phoneRegex);
// match[1] = '555' (area code)
// match[2] = '123' (exchange)
// match[3] = '4567' (subscriber)
```

### Test Cases

```javascript
// Should match
✓ '(555) 123-4567' // Parentheses with space and hyphen
✓ '555-123-4567' // Hyphens only
✓ '555.123.4567' // Dots only
✓ '555 123 4567' // Spaces only
✓ '5551234567' // No separators
✓ '(555)123-4567' // Parentheses, no space

// Should NOT match
✗ '123-4567' // Missing area code (only 7 digits)
✗ '55-123-4567' // Area code too short
✗ '5555-123-4567' // Area code too long
✗ '555-1234-567' // Wrong grouping
✗ '555-123-45678' // Subscriber too long
✗ '+1 555-123-4567' // Has country code (not handled)
⚠ '555) 123-4567' // Mismatched parenthesis: should be rejected, but this pattern accepts it (see Caveats)
```

### Caveats

1. **Doesn't validate real numbers:** `000-000-0000` matches but isn't valid
2. **No country code:** Add `(?:\+1[-. ]?)?` at start if needed
3. **Mismatched parens:** `555) 123-4567` currently matches; tighten if needed (the area code then lands in group 1 or 2, and exchange and subscriber move to groups 3 and 4):
 ```javascript
 /^(?:\(([0-9]{3})\)|([0-9]{3}))[-. ]?([0-9]{3})[-. ]?([0-9]{4})$/
 ```
4. **Extensions not supported:** Add `(?:\s*(?:ext|x)\.?\s*([0-9]+))?$` for extensions
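
One way the caveats above might be combined is sketched below. It is an illustration rather than a definitive pattern: the names `strictPhoneRegex` and `parsePhone` and the named groups are assumptions of this sketch, and named groups require an ES2018+ engine.

```javascript
// Sketch: optional +1 country code, balanced parentheses only, optional extension.
const strictPhoneRegex =
  /^(?:\+1[-. ]?)?(?:\((?<areaParen>[0-9]{3})\)|(?<areaPlain>[0-9]{3}))[-. ]?(?<exchange>[0-9]{3})[-. ]?(?<subscriber>[0-9]{4})(?:\s*(?:ext|x)\.?\s*(?<ext>[0-9]+))?$/;

// Hypothetical helper: returns the parts of the number, or null when the input does not match.
function parsePhone(input) {
  const match = strictPhoneRegex.exec(input);
  if (!match) return null;
  const { areaParen, areaPlain, exchange, subscriber, ext } = match.groups;
  // Only one of the two area-code groups is set, depending on which branch matched
  return { area: areaParen || areaPlain, exchange, subscriber, ext: ext || null };
}

console.log(parsePhone('(555) 123-4567'));      // area '555', no extension
console.log(parsePhone('+1 555-123-4567 x89')); // extension '89'
console.log(parsePhone('555) 123-4567'));       // null: unbalanced parenthesis
```

Named groups avoid the renumbering problem that appears when the area code can land in either of two capture groups.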

When you need to match or validate specific text patterns like phone numbers, emails, or custom formats

  • Always test with edge cases: empty strings, very long strings, unicode characters
  • Use regex101.com to test and visualize your pattern interactively

2. Regex Debugger

Help me fix this regex that isn't working correctly:

**Current pattern:**
```
[YOUR REGEX]
```

**Regex flavor:** [JAVASCRIPT / PYTHON / etc.]

**Expected behavior:**
[WHAT YOU WANT IT TO DO]

**Actual behavior:**
- This matches but shouldn't: [EXAMPLE]
- This doesn't match but should: [EXAMPLE]

**Explain:** What's wrong and provide the fixed pattern
Pattern: /^[a-z]+@[a-z]+\.[a-z]{2,3}$/
Flavor: JavaScript
Expected: Match email addresses
Actual:
- Doesn't match: John@Example.COM (valid but has uppercase)
- Doesn't match: user.name@company.co.uk (has dots and long TLD)
- Matches: a@b.cc (too simple but technically matches)
## Regex Debug Analysis

### Problems Identified

1. **Case sensitivity:** Pattern only allows lowercase `[a-z]`
2. **No dots in local part:** `.` before `@` not allowed
3. **Single-part domain:** No subdomains (`.co.uk`) allowed
4. **TLD length limited:** `{2,3}` excludes `.info`, `.museum`, etc.
5. **Too permissive:** No length minimums, `a@b.cc` matches

### Fixed Pattern

```javascript
const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/i;
```

### What Changed

| Original | Fixed | Why |
|----------|-------|-----|
| `[a-z]+` | `[a-zA-Z0-9._%+-]+` | Allow common email characters |
| `@[a-z]+` | `@[a-zA-Z0-9.-]+` | Allow subdomains (mail.company) |
| `\.[a-z]{2,3}` | `\.[a-zA-Z]{2,}` | Allow any TLD length |
| No flag | `i` flag | Case insensitive (redundant with the explicit `a-zA-Z` ranges, but harmless) |

### Even Better: A More Robust Pattern

```javascript
// Handles more edge cases
const robustEmailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/;
```

This is simpler and catches most valid emails:
- `[^\s@]+` = anything except whitespace and @
- Allows unicode characters in local part
- Less prone to false negatives
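
The trade-off between the two patterns can be seen by running both over the same inputs. This is a small sketch; the sample addresses and the variable names `strict` and `permissive` are chosen for illustration only.

```javascript
const strict = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/i;
const permissive = /^[^\s@]+@[^\s@]+\.[^\s@]{2,}$/;

// Illustrative inputs: a normal address, a unicode local part,
// a junk TLD, and a doubled @
const samples = ['user.name@company.co.uk', 'üser@example.com', 'a@b.c!', 'two@@example.com'];

const results = samples.map((input) => ({
  input,
  strict: strict.test(input),
  permissive: permissive.test(input),
}));
console.table(results);
```

The permissive pattern accepts the unicode address that the strict one rejects, but it also accepts `a@b.c!`, so fewer false negatives come at the cost of more false positives.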

### Test Results

```javascript
const fixed = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/i;

// Now matches:
✓ 'John@Example.COM' // Uppercase handled
✓ 'user.name@company.co.uk' // Dots and subdomains
✓ 'test@example.info' // Long TLD

// Still correctly rejects:
✗ 'missing@tld' // No TLD
✗ '@nodomain.com' // No local part
✗ 'spaces not@allowed.com' // Has spaces
```

### Caveat

No regex perfectly validates email addresses (the RFC is complex). For production:
1. Use a library (validator.js, email-validator)
2. Or just check for `@` and send a confirmation email
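
The second option might look like the sketch below. `looksLikeEmail` is a hypothetical name, and requiring at least one character on each side of the `@` is an assumption of this sketch; the real validation is still the confirmation email.

```javascript
// Deliberately loose pre-check: only confirms there is an '@'
// with something before it and something after it.
function looksLikeEmail(input) {
  const at = input.lastIndexOf('@');
  return at > 0 && at < input.length - 1;
}

console.log(looksLikeEmail('John@Example.COM')); // true
console.log(looksLikeEmail('@nodomain.com'));    // false: nothing before the @
console.log(looksLikeEmail('no-at-sign'));       // false: no @ at all
```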

When your regex has unexpected matches or misses, or when you need to understand why a pattern isn't working

  • Regex issues are usually at boundaries; check anchors (^$) and quantifiers (+*?)
  • Test with deliberately adversarial input to find edge cases

3. Regex Performance Optimizer

Help me optimize this regex for performance:

**Current pattern:**
```
[YOUR REGEX]
```

**Context:**
- Input size: [TYPICAL STRING LENGTH]
- Frequency: [HOW OFTEN IT RUNS]
- Flavor: [JAVASCRIPT / PYTHON / etc.]

**Problem:**
[DESCRIBE PERFORMANCE ISSUE - slow, hangs, times out]

**Analyze:**
1. Why is it slow?
2. Provide improved version
3. Explain the improvement
Pattern: /^(a+)+$/
Input: Long strings of 'a's followed by a non-matching character
Problem: Regex takes exponentially longer as string grows, browser hangs on 30+ characters
## Catastrophic Backtracking Analysis

### The Problem

Your pattern `/^(a+)+$/` has **catastrophic backtracking**, a common regex performance trap.

### Why It's Slow

```
Pattern: /^(a+)+$/
Input: 'aaaaaaaaaaaaaaab' (15 a's + b)
```

**What happens:**
1. `(a+)` matches all 15 a's
2. `$` fails (there's a 'b')
3. Regex backtracks, tries `(a+)` with 14 a's
4. The outer `+` repeats the group to match the remaining 'a'
5. `$` fails again, so the engine backtracks again
6. Every way of splitting the a's into groups gets tried, roughly **2^n combinations**

For 30 characters: 2^30 = 1 billion+ attempts

### Visualization

```
(a+)+$ matching 'aaab':

Attempt 1: (aaa) → fail at b
Attempt 2: (aa)(a) → fail at b 
Attempt 3: (aa) → fail ($ sees the remaining 'a')
Attempt 4: (a)(aa) → fail at b
Attempt 5: (a)(a)(a) → fail at b
Attempt 6: (a)(a) → fail
Attempt 7: (a) → fail
... exponential explosion
```
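
The seven attempts traced above for three a's follow a pattern that can be counted directly. The sketch below is a simplified model, not a trace of any engine's internals: `countAttempts` is a hypothetical function that counts every way of splitting each non-empty prefix of the a's into `(a+)` groups.

```javascript
// Simplified model of the search space for /^(a+)+$/ on n a's followed by 'b'
function countAttempts(n) {
  // ways[k] = number of ways to split exactly k a's into consecutive non-empty groups
  const ways = [1]; // zero a's: one empty split, used only as the base case
  for (let k = 1; k <= n; k++) {
    let total = 0;
    // Choose the size of the first group, then split the rest
    for (let first = 1; first <= k; first++) total += ways[k - first];
    ways.push(total);
  }
  // The engine tries every split of every non-empty prefix before giving up
  return ways.slice(1).reduce((sum, w) => sum + w, 0);
}

console.log(countAttempts(3));  // 7, the seven attempts in the 'aaab' trace
console.log(countAttempts(30)); // 1073741823, i.e. 2^30 - 1
```

Under this model each extra 'a' doubles the work, which is why adding a handful of characters turns a fast match into a hang.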

### The Fix

**Option 1: Possessive quantifier (if supported)**
```javascript
// Not in JavaScript, but in Java/PCRE:
/^(a++)+$/ // ++ means "don't backtrack"
```

**Option 2: Atomic group (if supported)**
```javascript
// Not in JavaScript:
/^(?>a+)+$/ // Atomic: no backtracking into group
```
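
JavaScript has no atomic groups, but a commonly used workaround is to emulate one with a lookahead plus a backreference, since the engine does not backtrack into a lookahead once it has succeeded. The sketch below applies that trick to the same pattern; the name `atomicLike` is an assumption of this example.

```javascript
// Emulated atomic group: the lookahead captures a greedy run of a's,
// and \1 then consumes exactly that run, with no way to give characters back.
const atomicLike = /^(?:(?=(a+))\1)+$/;

console.log(atomicLike.test('aaaa'));               // true
console.log(atomicLike.test('a'.repeat(40) + 'b')); // false, and it returns quickly
```

For this particular pattern the simplification in Option 3 is still the better fix; the emulation is mainly useful when the grouped expression cannot be flattened.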

**Option 3: Simplify (JavaScript solution)**
```javascript
// If you just want "one or more a's":
/^a+$/

// If you need the groups for some reason:
/^(a)+$/ // Group 1 holds only the last 'a' matched
```

**Option 4: Use non-regex**
```javascript
// For simple patterns, plain string checks avoid backtracking entirely:
function isAllAs(str) {
  // Non-empty, and every character is 'a'
  return str.length > 0 && [...str].every(c => c === 'a');
}
```

### Avoiding Catastrophic Backtracking

**Red flags in patterns:**
- Nested quantifiers: `(a+)+`, `(a*)*`, `(a+)*`
- Adjacent overlapping: `(a+)(a+)` or `.*.*`
- Alternation with overlap: `(a|aa)+`

**Safe alternatives:**
```javascript
// Instead of: (a+)+
// Use: a+

// Instead of: .*.*
// Use: .*

// Instead of: (a|aa)+
// Use: a+
```
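
The first red flag can be screened for mechanically. The sketch below is a rough heuristic and an assumption of this example, not a complete analysis: `hasNestedQuantifier` only looks for a group that contains `+` or `*` and is itself followed by `+` or `*`. It can flag safe patterns, and it misses the other red flags such as overlapping alternation.

```javascript
// Heuristic only: "( ... + or * ... )" immediately followed by + or *
const NESTED_QUANTIFIER = /\([^()+*]*[+*][^()]*\)[+*]/;

function hasNestedQuantifier(regex) {
  return NESTED_QUANTIFIER.test(regex.source);
}

console.log(hasNestedQuantifier(/^(a+)+$/)); // true
console.log(hasNestedQuantifier(/^(a*)*$/)); // true
console.log(hasNestedQuantifier(/^a+$/));    // false
console.log(hasNestedQuantifier(/^(a)+$/));  // false
```

A hit is a prompt to review the pattern by hand, not proof that it is dangerous.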

### Performance Comparison

```
Input: 'a'.repeat(25) + 'b'

/^(a+)+$/: grows exponentially, roughly doubling with each added 'a' (exact timings depend on the engine)
/^a+$/: <1ms
```

### Testing for Catastrophic Backtracking

```javascript
function testRegexPerformance(regex, input) {
 const start = performance.now();
 regex.test(input);
 const time = performance.now() - start;
 console.log(`${time.toFixed(2)}ms`);
}

// Test with increasing lengths
for (let i = 15; i <= 30; i += 5) {
 const input = 'a'.repeat(i) + 'b';
 console.log(`Length ${i}:`);
 testRegexPerformance(/^(a+)+$/, input); // Dangerous
 testRegexPerformance(/^a+$/, input); // Safe
}
```

When regex causes performance issues, browser hangs, or server timeouts

  • Nested quantifiers ((x+)+) are almost always a bug; flatten them
  • Test regex with 'evil input' (long strings designed to cause backtracking) before production

4. Regex Search and Replace

Help me create a search-and-replace regex:

**Find:** [WHAT TO FIND]
**Replace with:** [WHAT TO REPLACE IT WITH]

**Examples:**
| Input | Desired Output |
|-------|----------------|
| [EXAMPLE 1 INPUT] | [EXAMPLE 1 OUTPUT] |
| [EXAMPLE 2 INPUT] | [EXAMPLE 2 OUTPUT] |

**Context:**
- Tool/Language: [VSCODE / JAVASCRIPT / SED / etc.]
- Scope: [SINGLE FILE / CODEBASE-WIDE]

**Provide:**
1. Find pattern
2. Replace pattern
3. Explanation of capture groups
4. Any edge cases to watch for
Find: console.log statements with variable arguments
Replace: Remove them entirely (for production build)
Examples:
- console.log('test') → (removed)
- console.log(user, 'data:', data) → (removed)
- console.log() → (removed)
- // console.log('commented') → (keep, it's commented)
Tool: VS Code find/replace
## Console.log Removal Regex

### VS Code Find/Replace

**Find:**
```
^(?!\s*//).*console\.log\([^)]*\);?\s*$
```

**Replace:**
```
(empty - just delete)
```

**Settings:** Enable "Use Regular Expression" (Alt+R on Windows/Linux, Option+Cmd+R on macOS)

### Pattern Breakdown

| Component | Meaning |
|-----------|--------|
| `^` | Start of line |
| `(?!\s*//)` | Negative lookahead: NOT followed by optional whitespace and `//` |
| `.*` | Any characters before console.log |
| `console\.log\(` | Literal `console.log(` |
| `[^)]*` | Any characters except `)` (the arguments) |
| `\)` | Closing parenthesis |
| `;?` | Optional semicolon |
| `\s*$` | Optional trailing whitespace to end of line |
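
Before running the replace, the same pattern can be tried against a few lines in JavaScript to preview what would be removed. This is a sketch under assumptions: `previewRemovals` and the sample lines are hypothetical, and the editor's regex engine may differ in small ways from the one used here.

```javascript
// Same find pattern, written as a JavaScript literal and tested one line at a time
const consoleLogLine = /^(?!\s*\/\/).*console\.log\([^)]*\);?\s*$/;

// Hypothetical helper: returns the lines the find pattern would match
function previewRemovals(source) {
  return source.split('\n').filter((line) => consoleLogLine.test(line));
}

const sample = [
  "console.log('test');",
  "// console.log('commented')",
  'const total = add(1, 2);',
  "  console.log(user, 'data:', data)",
  'console.log(fn(a, b));',
].join('\n');

console.log(previewRemovals(sample));
// Matches the first and fourth lines only; the commented line is kept,
// and the nested call is not matched by this simple pattern.
```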

### Handling Nested Parentheses

The simple pattern above fails on:
```javascript
console.log(fn(a, b)); // Has nested parens
```

**More robust pattern (handles up to two levels of nested parentheses):**
```
console\.log\((?:[^)(]|\((?:[^)(]|\([^)(]*\))*\))*\);?
```

Or use a multi-pass approach:

### Safer: Multiple Passes

**Pass 1 - Simple logs (no parentheses inside the arguments):**
```
Find: console\.log\([^()]*\);?\n?
Replace: (empty)
```

**Pass 2 - Single nested:**
```
Find: console\.log\([^()]*\([^()]*\)[^()]*\);?\n?
Replace: (empty)
```

**Pass 3 - Manual review:**
Search for remaining `console.log` and handle manually.

### JavaScript Alternative

For a build tool, use an AST-based approach instead:

```javascript
// babel-plugin-transform-remove-console
// In babel.config.js:
module.exports = {
  plugins: [
    ['transform-remove-console', { exclude: ['error', 'warn'] }]
  ]
};
```

### Edge Cases

| Input | Handled? |
|-------|----------|
| `console.log('test');` | ✓ Removed |
| `console.log(a, b, c)` | ✓ Removed |
| `// console.log()` | ✓ Kept (commented) |
| `console.log(fn())` | ⚠️ Simple pattern fails |
| `console.log(\`template\`)` | ⚠️ May fail on backticks |
| Multi-line console.log | ✗ Not handled |
| `foo(); console.log(x);` | ⚠️ Whole line removed, including `foo();` |

### Recommendation

For production code removal:
1. Use the Babel plugin (safest)
2. Or enable ESLint's `no-console` rule to flag remaining calls (the rule reports them; `--fix` does not remove them automatically)
3. Regex is good for quick cleanup but review the diff

When you need to find and transform patterns across code: refactoring, cleanup, format conversion

  • Always preview changes before applying codebase-wide; regex replaces can have unintended effects
  • For code transformations, AST-based tools (Babel, jscodeshift) are safer than regex

Common Mistakes to Avoid

Not specifying the regex flavor: JavaScript regex differs from Python, which differs from PCRE

Only providing examples that should match; negative examples are equally important for precision

Creating patterns with nested quantifiers that can trigger catastrophic backtracking
