Applies regex patterns to field values for classification, extraction, detection, or multi-labeling. Use it to categorize records, pull out structured data from text, or flag records matching specific patterns.
Configuration
| Setting | Description |
|---|---|
| Mode | Operation mode: Classify, Extract, Detect, or Multi-Label |
| Input Field | The field to apply regex patterns to |
| Case Insensitive | Ignore case when matching (default: enabled) |
| Virtual Object Name | Namespace prefix for output fields (default: regex_pattern) |
Mode-specific settings
Classify - first matching pattern assigns a label:
| Setting | Description |
|---|---|
| Rules | Ordered list of {label, pattern, exclude_pattern} |
| Output Field Name | Name for the label column |
| Default Value | Label when no pattern matches |
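The first-match-wins behavior can be sketched in Python as follows. This is an illustration only, not the node's implementation. The `classify` function and the sample rules are hypothetical. The sketch assumes that patterns may match anywhere in the value (`re.search`) and that a rule is skipped when its `exclude_pattern` also matches.

```python
import re

def classify(value, rules, default="Other", case_insensitive=True):
    """Return the label of the first rule whose pattern matches value."""
    flags = re.IGNORECASE if case_insensitive else 0
    for rule in rules:
        if not re.search(rule["pattern"], value, flags):
            continue
        exclude = rule.get("exclude_pattern")
        # Assumption: a matching exclude_pattern vetoes this rule only,
        # so evaluation continues with the next rule in the list.
        if exclude and re.search(exclude, value, flags):
            continue
        return rule["label"]
    return default

# Hypothetical rules, ordered from most specific to most general.
rules = [
    {"label": "Refund", "pattern": r"refund|chargeback", "exclude_pattern": r"no refund"},
    {"label": "Billing", "pattern": r"invoice|refund|payment"},
]

print(classify("Customer requests a REFUND", rules))         # Refund
print(classify("No refund needed, invoice is fine", rules))  # Billing
print(classify("Password reset", rules))                     # Other
```

Because evaluation stops at the first matching rule, swapping the two rules would label every refund request as "Billing".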
Extract - capture groups pull out structured data:
| Setting | Description |
|---|---|
| Extract Pattern | Regex with capture groups |
| Extract Groups | Map each group to {group_index, output_field, cast_type} |
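A rough Python sketch of how a group mapping could turn capture groups into typed fields. The `extract` function, the `CASTS` table, and the invoice pattern are hypothetical. The `int` cast type and the `None` result for non-matching values are assumptions, not documented behavior.

```python
import re

# Assumed cast_type names; the node may support a different set.
CASTS = {"str": str, "int": int, "float": float}

def extract(value, pattern, groups, case_insensitive=True):
    """Map capture groups to named, typed output fields."""
    flags = re.IGNORECASE if case_insensitive else 0
    match = re.search(pattern, value, flags)
    out = {}
    for g in groups:
        raw = match.group(g["group_index"]) if match else None
        # Assumption: no match (or an unmatched optional group) yields None.
        out[g["output_field"]] = CASTS[g["cast_type"]](raw) if raw is not None else None
    return out

groups = [
    {"group_index": 1, "output_field": "year", "cast_type": "int"},
    {"group_index": 2, "output_field": "sequence", "cast_type": "int"},
]

print(extract("Invoice INV-2024-0042", r"INV-(\d{4})-(\d+)", groups))
# {'year': 2024, 'sequence': 42}
print(extract("no invoice number", r"INV-(\d{4})-(\d+)", groups))
# {'year': None, 'sequence': None}
```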
Detect - single pattern produces a boolean:
| Setting | Description |
|---|---|
| Rules | Single pattern rule |
| Output Field Name | Name for the boolean column |
Multi-Label - each pattern produces an independent boolean:
| Setting | Description |
|---|---|
| Rules | List of {label, pattern} - each creates a separate boolean column |
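Unlike Classify, every Multi-Label rule is evaluated independently, so one record can set several flags at once. A minimal Python sketch, using a hypothetical `multi_label` function and made-up rules, and assuming patterns may match anywhere in the value:

```python
import re

def multi_label(value, rules, case_insensitive=True):
    """Return one boolean per rule, keyed by the rule's label."""
    flags = re.IGNORECASE if case_insensitive else 0
    return {
        rule["label"]: bool(re.search(rule["pattern"], value, flags))
        for rule in rules
    }

# Hypothetical rules; each label would become its own boolean column.
rules = [
    {"label": "mentions_price", "pattern": r"\$\d+"},
    {"label": "mentions_shipping", "pattern": r"shipping|delivery"},
    {"label": "is_urgent", "pattern": r"urgent|asap"},
]

print(multi_label("URGENT: delivery fee was $12", rules))
# {'mentions_price': True, 'mentions_shipping': True, 'is_urgent': True}
print(multi_label("Thanks for the update", rules))
# {'mentions_price': False, 'mentions_shipping': False, 'is_urgent': False}
```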
How It Works
Choose a mode
Select the operation that fits your use case - classification, extraction, detection, or multi-labeling.
Select the input field
Choose which field to apply patterns to.
Define patterns
Write regex patterns. For Classify and Multi-Label, add multiple rules with labels.
Output
The output depends on the mode:
- Classify: a single label column with the first matching category
- Extract: one column per capture group, optionally cast to specific types
- Detect: a single boolean column
- Multi-Label: one boolean column per rule
Examples
Classify products by name
- Mode: Classify
- Input: Product Name
- Rules:
  - Label “Electronics” → pattern `phone|laptop|tablet`
  - Label “Clothing” → pattern `shirt|pants|jacket`
- Default: “Other”
Extract price and currency
- Mode: Extract
- Input: Price Text (e.g., “USD 149.99”)
- Pattern: `([A-Z]{3})\s+(\d+\.\d+)`
- Groups: group 1 → `currency` (str), group 2 → `amount` (float)
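To see what this pattern accepts and rejects, it can be tried in plain Python. The `extract_price` helper is hypothetical and only approximates the node. It assumes the default case-insensitive matching, which also lets `[A-Z]` match lowercase letters.

```python
import re

# Same pattern as the example, compiled case-insensitively (the stated default).
PATTERN = re.compile(r"([A-Z]{3})\s+(\d+\.\d+)", re.IGNORECASE)

def extract_price(text):
    """Return currency and amount, or None when the pattern does not match."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return {"currency": m.group(1), "amount": float(m.group(2))}

print(extract_price("USD 149.99"))  # {'currency': 'USD', 'amount': 149.99}
print(extract_price("eur 20.50"))   # {'currency': 'eur', 'amount': 20.5}
print(extract_price("USD 149"))     # None
```

The last call returns `None` because the pattern requires a decimal part. Changing the second group to `(\d+(?:\.\d+)?)` would also accept whole-number prices.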
Detect email addresses
- Mode: Detect
- Input: Notes field
- Pattern: `[\w.-]+@[\w.-]+\.\w+`
- Output: `has_email` (boolean)
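The same check can be reproduced in a few lines of Python. The `has_email` function is a hypothetical stand-in for the node, and treating an empty or missing value as `False` is an assumption.

```python
import re

EMAIL = re.compile(r"[\w.-]+@[\w.-]+\.\w+", re.IGNORECASE)

def has_email(notes):
    """True when the text contains something shaped like an email address."""
    # Assumption: a missing value counts as "no match".
    return bool(EMAIL.search(notes or ""))

print(has_email("Contact: jane.doe@example.com"))  # True
print(has_email("Follow up on Twitter @jane"))     # False
print(has_email(None))                             # False
```

The pattern is deliberately loose: it flags text that looks like an address but does not validate that the address is well-formed or deliverable.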
Best Practices
- Use Classify for first-match-wins categorization (order rules from most specific to most general)
- Use Extract when you need to pull structured data out of text
- Use Detect for simple yes/no pattern presence checks
- Use Multi-Label when a record can belong to multiple categories simultaneously
- Test patterns on sample data before running on the full dataset
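One lightweight way to test a pattern on sample data is to pair each sample value with the result you expect and list the disagreements. The `check_pattern` helper and the sample values below are hypothetical. The sketch assumes patterns may match anywhere in the value.

```python
import re

def check_pattern(pattern, cases, case_insensitive=True):
    """Return the samples whose match result differs from the expectation."""
    flags = re.IGNORECASE if case_insensitive else 0
    compiled = re.compile(pattern, flags)  # raises re.error on an invalid pattern
    return [
        (text, expected)
        for text, expected in cases
        if bool(compiled.search(text)) != expected
    ]

cases = [
    ("Gaming laptop 15-inch", True),
    ("Tablet stand", True),
    ("Cotton shirt", False),
    ("Headphone adapter", False),  # contains "phone" as a substring
]

print(check_pattern(r"phone|laptop|tablet", cases))
# [('Headphone adapter', False)]

# Word boundaries stop the pattern from matching inside longer words.
print(check_pattern(r"\b(phone|laptop|tablet)s?\b", cases))
# []
```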
Related
- Transform - rule-based field computation without regex
- Data Normalization - LLM-powered text cleaning when regex is too rigid
- AI Enrichment - LLM-based classification when pattern matching isn’t sufficient