Configuration
| Setting | Description |
|---|---|
| Mode | Operation mode: Classify, Extract, Detect, or Multi-Label |
| Input Field | The field to apply regex patterns to |
| Case Insensitive | Ignore case when matching (default: enabled) |
| Virtual Object Name | Namespace prefix for output fields (default: regex_pattern) |
Mode-specific settings
Classify: the first matching pattern assigns a label.

| Setting | Description |
|---|---|
| Rules | Ordered list of {label, pattern, exclude_pattern} |
| Output Field Name | Name for the label column |
| Default Value | Label when no pattern matches |

Extract: capture groups are mapped to output columns.

| Setting | Description |
|---|---|
| Extract Pattern | Regex with capture groups |
| Extract Groups | Map each group to {group_index, output_field, cast_type} |

Detect: a single pattern sets one boolean column.

| Setting | Description |
|---|---|
| Rules | Single pattern rule |
| Output Field Name | Name for the boolean column |

Multi-Label: each rule creates its own boolean column.

| Setting | Description |
|---|---|
| Rules | List of {label, pattern} — each creates a separate boolean column |
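
The Multi-Label settings above can be pictured as one boolean per rule. The following is a minimal Python sketch, not the node's implementation: the helper name `multi_label`, the sample rules, and the choice of `re.search` (match anywhere in the text) are assumptions made for illustration.

```python
import re

def multi_label(text, rules, case_insensitive=True):
    """Return {label: bool} with one entry per rule (hypothetical helper)."""
    flags = re.IGNORECASE if case_insensitive else 0
    return {
        rule["label"]: re.search(rule["pattern"], text, flags) is not None
        for rule in rules
    }

# Illustrative rules, not taken from the node's documentation.
rules = [
    {"label": "mentions_refund", "pattern": r"refund|money back"},
    {"label": "mentions_delay", "pattern": r"late|delay(ed)?"},
]

result = multi_label("Order arrived LATE, I want a refund", rules)
print(result)
```

Unlike Classify, every rule is evaluated, so a single record can set several columns to true at once.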
How It Works
Choose a mode
Select the operation that fits your use case — classification, extraction, detection, or multi-labeling.
Output
Depends on the mode:
- Classify: a single label column with the first matching category
- Extract: one column per capture group, optionally cast to specific types
- Detect: a single boolean column
- Multi-Label: one boolean column per rule
Examples
Classify products by name
- Mode: Classify
- Input: Product Name
- Rules:
  - Label “Electronics” → pattern `phone|laptop|tablet`
  - Label “Clothing” → pattern `shirt|pants|jacket`
- Default: “Other”
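
The first-match-wins behavior of this example can be sketched in Python as follows. This is an illustration under stated assumptions rather than the node's actual code: the function name `classify` is hypothetical, and treating `exclude_pattern` as "skip this rule if the exclude pattern also matches" is an assumed reading of the setting.

```python
import re

def classify(text, rules, default="Other", case_insensitive=True):
    """Return the label of the first rule that matches (hypothetical helper)."""
    flags = re.IGNORECASE if case_insensitive else 0
    for rule in rules:
        if not re.search(rule["pattern"], text, flags):
            continue
        # Assumption: a rule is skipped when its exclude_pattern also matches.
        exclude = rule.get("exclude_pattern")
        if exclude and re.search(exclude, text, flags):
            continue
        return rule["label"]
    return default

rules = [
    {"label": "Electronics", "pattern": r"phone|laptop|tablet"},
    {"label": "Clothing", "pattern": r"shirt|pants|jacket"},
]

print(classify("Gaming Laptop 15in", rules))  # Electronics
print(classify("Denim Jacket", rules))        # Clothing
print(classify("Garden Hose", rules))         # Other
```

Because evaluation stops at the first match, a name that fits both rules receives the label of whichever rule is listed first.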
Extract price and currency
- Mode: Extract
- Input: Price Text (e.g., “USD 149.99”)
- Pattern: `([A-Z]{3})\s+(\d+\.\d+)`
- Groups: group 1 → `currency` (str), group 2 → `amount` (float)
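
As a rough Python equivalent of this example, the sketch below applies the pattern and casts each capture group. The helper name `extract`, the `CASTS` lookup table, and returning `None` for every field when nothing matches are assumptions, not documented behavior of the node.

```python
import re

# Assumed mapping from cast_type names to Python converters.
CASTS = {"str": str, "int": int, "float": float}

def extract(text, pattern, groups, case_insensitive=True):
    """Map capture groups to typed output fields (hypothetical helper)."""
    flags = re.IGNORECASE if case_insensitive else 0
    match = re.search(pattern, text, flags)
    row = {}
    for g in groups:
        raw = match.group(g["group_index"]) if match else None
        # Assumption: a missing match or unmatched group yields None.
        row[g["output_field"]] = CASTS[g["cast_type"]](raw) if raw is not None else None
    return row

groups = [
    {"group_index": 1, "output_field": "currency", "cast_type": "str"},
    {"group_index": 2, "output_field": "amount", "cast_type": "float"},
]

row = extract("USD 149.99", r"([A-Z]{3})\s+(\d+\.\d+)", groups)
print(row)  # {'currency': 'USD', 'amount': 149.99}
```

In Python's `re` module, case-insensitive matching makes `[A-Z]` accept lowercase letters as well, so with the default setting enabled a value such as "usd 149.99" would also match in this sketch.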
Detect email addresses
- Mode: Detect
- Input: Notes field
- Pattern: `[\w.-]+@[\w.-]+\.\w+`
- Output: `has_email` (boolean)
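
The Detect example reduces to a single presence check. A minimal Python sketch, assuming the pattern may match anywhere in the field and that an empty or missing value counts as "no match" (both are assumptions, as is the function name `has_email`):

```python
import re

# Pattern from the example above, compiled once and reused for every record.
EMAIL = re.compile(r"[\w.-]+@[\w.-]+\.\w+", re.IGNORECASE)

def has_email(text):
    """Return True if the text contains something shaped like an email address."""
    # Assumption: None is treated as an empty string, so it never matches.
    return EMAIL.search(text or "") is not None

notes = ["Contact: jane.doe@example.com", "Call after 5pm", None]
print([has_email(n) for n in notes])  # [True, False, False]
```

The pattern is deliberately loose: it flags text that looks like an address and does not validate that the address is real or well-formed.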
Best Practices
- Use Classify for first-match-wins categorization (order rules from most specific to most general)
- Use Extract when you need to pull structured data out of text
- Use Detect for simple yes/no pattern presence checks
- Use Multi-Label when a record can belong to multiple categories simultaneously
- Test patterns on sample data before running on the full dataset
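
One way to test a pattern before a full run is to pair a handful of sample values with the result you expect and list the disagreements. The sketch below is only an illustration in plain Python; `check_pattern` and the sample data are hypothetical and not part of the node.

```python
import re

def check_pattern(pattern, samples, case_insensitive=True):
    """Return the sample texts whose match result differs from the expectation."""
    flags = re.IGNORECASE if case_insensitive else 0
    compiled = re.compile(pattern, flags)  # raises re.error on invalid syntax
    return [
        text
        for text, expected in samples
        if (compiled.search(text) is not None) != expected
    ]

# (text, should the pattern match?) pairs chosen for illustration.
samples = [
    ("iPhone 15", True),
    ("Tablet stand", True),
    ("Cotton shirt", False),
    ("Laptop sleeve", False),  # an accessory we would rather not label Electronics
]

failures = check_pattern(r"phone|laptop|tablet", samples)
print(failures)  # ['Laptop sleeve']
```

A non-empty result points at samples where the pattern is too broad or too narrow, which is often a cue to tighten the pattern or, in Classify mode, to add an exclude pattern.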
Related Nodes
- Transform — rule-based field computation without regex
- Data Normalization — LLM-powered text cleaning when regex is too rigid
- AI Enrichment — LLM-based classification when pattern matching isn’t sufficient