Configuration tab

| Setting | Description |
|---|---|
| Preset Configuration | Choose a starting point — Conservative (95%), Balanced (92%), Aggressive (88%), or Custom |
| Auto-merge threshold | Records scoring above this are merged automatically |
| Review threshold | Records scoring between the review threshold and the auto-merge threshold are queued for manual review |
| Fields to Compare | Select fields and set comparison type (Fuzzy or Exact) and weight |
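
The way field weights and the two thresholds interact can be sketched as follows. This is a minimal illustration, not the node's actual scoring code: the field list, the weights, the use of `difflib` for fuzzy comparison, the review threshold of 0.80, and the mapping of the Balanced figure (92%) to the auto-merge threshold are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical field configuration: (field name, comparison type, weight).
FIELDS = [
    ("name", "fuzzy", 0.5),
    ("email", "exact", 0.3),
    ("city", "fuzzy", 0.2),
]

AUTO_MERGE_THRESHOLD = 0.92  # assumed to mirror the Balanced preset figure
REVIEW_THRESHOLD = 0.80      # illustrative value only


def field_similarity(a: str, b: str, kind: str) -> float:
    """Return a similarity in [0, 1] for one field."""
    if kind == "exact":
        return 1.0 if a == b else 0.0
    # Fuzzy comparison via difflib; the real node may use another metric.
    return SequenceMatcher(None, a, b).ratio()


def pair_score(rec_a: dict, rec_b: dict, fields=FIELDS) -> float:
    """Weighted average of per-field similarities."""
    total_weight = sum(weight for _, _, weight in fields)
    weighted = sum(
        weight * field_similarity(rec_a[name], rec_b[name], kind)
        for name, kind, weight in fields
    )
    return weighted / total_weight


def classify(score: float,
             auto_merge: float = AUTO_MERGE_THRESHOLD,
             review: float = REVIEW_THRESHOLD) -> str:
    """Map a score to an outcome; boundary handling is an assumption."""
    if score > auto_merge:
        return "auto-merge"
    if score >= review:
        return "review"
    return "distinct"


a = {"name": "Jon Smith", "email": "jon@example.com", "city": "Boston"}
b = {"name": "John Smith", "email": "jon@example.com", "city": "Boston"}
print(classify(pair_score(a, b)))
```

Raising the weight of an Exact field such as `email` makes a mismatch on that field far more costly than a small spelling difference in a Fuzzy field.
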

Rules & Performance tab

| Setting | Description |
|---|---|
| Blocking Keys | Only compare records that share a blocking key value — dramatically improves performance on large datasets |
| Must-Match Rules | Records must match on these fields to be considered duplicates |
| Must-Not-Match Rules | Records matching on these fields are never considered duplicates |
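
To show why blocking keys reduce the work, here is a rough sketch that only generates candidate pairs inside each block and then applies a must-match check. The function names, the `zip` and `country` fields, and the choice to skip records that have no key value are hypothetical; the node's own behavior may differ.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, blocking_key):
    """Yield index pairs of records that share a blocking key value."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        key = blocking_key(rec)
        if key:  # assumption: records with no key value are not compared
            blocks[key].append(idx)
    for members in blocks.values():
        # Pairs are formed only within a block, never across blocks.
        yield from combinations(members, 2)


def passes_must_match(rec_a, rec_b, fields):
    """True when both records agree on every must-match field."""
    return all(rec_a.get(f) == rec_b.get(f) for f in fields)


records = [
    {"name": "Acme", "zip": "10001", "country": "US"},
    {"name": "Acme Inc", "zip": "10001", "country": "US"},
    {"name": "Globex", "zip": "94105", "country": "US"},
    {"name": "Globex", "zip": "94105", "country": "CA"},
]

pairs = list(candidate_pairs(records, lambda r: r["zip"]))
kept = [
    (i, j) for i, j in pairs
    if passes_must_match(records[i], records[j], ["country"])
]
print(pairs)  # 2 candidate pairs instead of all 6 possible pairs
print(kept)
```

Without blocking, n records produce n(n-1)/2 comparisons; with blocking, the cost depends on the size of the largest blocks, so a key that is too coarse gives back most of the saving.
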

Advanced tab

- Trim & lowercase
- Remove punctuation
- Unicode normalize
- Ignore company suffixes (e.g., “Inc”, “LLC”)
- Phone normalize (E.164)
- Address normalize
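
As a rough illustration of what the first four options do to a value before comparison, the sketch below chains them for a company name. The function name, the suffix list, and the choice of NFKC as the Unicode form are assumptions; phone and address normalization are left out because they depend on locale data.

```python
import re
import unicodedata

# Illustrative suffix list; the node's actual list is not documented here.
COMPANY_SUFFIXES = {"inc", "llc", "ltd", "corp"}


def normalize_company(value: str, ignore_suffixes: bool = True) -> str:
    """Apply Unicode, case, punctuation, and suffix normalization."""
    text = unicodedata.normalize("NFKC", value)  # Unicode normalize
    text = text.strip().lower()                  # Trim & lowercase
    text = re.sub(r"[^\w\s]", "", text)          # Remove punctuation
    tokens = text.split()
    if ignore_suffixes:
        # Drop trailing suffix tokens but always keep at least one token.
        while len(tokens) > 1 and tokens[-1] in COMPANY_SUFFIXES:
            tokens.pop()
    return " ".join(tokens)


print(normalize_company("  Acme, Inc. "))
print(normalize_company("ACME LLC"))
```

Both calls reduce to the same string, so an Exact comparison on the normalized value would treat the two spellings as a match.
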

| Setting | Description |
|---|---|
| Dry run | Simulate without actually merging |
| Max cluster size | Prevents one bad key from merging hundreds of records |
| Max total merges | Stops after N merges for gradual rollout |
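
The three safeguards above can be pictured as a planning step that runs before any merge is committed. The sketch below is hypothetical: `plan_merges`, its parameters, and the rule that merging n records counts as n-1 merges are assumptions made for illustration.

```python
def plan_merges(clusters, max_cluster_size, max_total_merges,
                dry_run=True, apply_merge=None):
    """Select clusters to merge while respecting both limits."""
    planned, skipped = [], []
    merges_used = 0
    for cluster in clusters:
        if len(cluster) > max_cluster_size:
            # Likely caused by one bad key; leave it for manual inspection.
            skipped.append(cluster)
            continue
        cost = len(cluster) - 1  # assumption: n records cost n-1 merges
        if merges_used + cost > max_total_merges:
            break  # stop once the overall merge budget would be exceeded
        planned.append(cluster)
        merges_used += cost
        if not dry_run and apply_merge is not None:
            apply_merge(cluster)  # only a real run changes any data
    return {"planned": planned, "skipped": skipped, "merges": merges_used}


clusters = [[1, 2], [3, 4, 5, 6, 7], [8, 9, 10], [11, 12]]
result = plan_merges(clusters, max_cluster_size=3, max_total_merges=3)
print(result)
```

With `dry_run=True` the function only reports what it would do, which corresponds to previewing the outcome before running it again with merges enabled.
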

Output

Two output paths:

- All Records: every record with duplicate scores attached
- Deduplicated: clean dataset with duplicates removed

Best Practices

- Start with the Balanced preset and adjust thresholds based on results
- Use Blocking Keys for large datasets — comparing every record pair is expensive
- Enable Dry run first to preview results before committing merges
- Set Max total merges for gradual rollout on critical data

Related Nodes

- Bond Node — matches records across different entities, not within the same dataset
- Data Normalization — clean field values before duplicate detection for better accuracy