Enrichments
Extract custom data from documents using AI—no coding required
What are Enrichments?
Enrichments are AI-powered data extraction fields that you configure in plain language. Instead of manually reading thousands of documents, you define what you're looking for once, and George AI extracts it automatically from all documents.
Enrichment fields are added to Lists (custom views of your Library files). Each field defines a piece of information to extract, and George AI processes all documents in the List to populate that field.
How It Works
- Define Field: Name + AI prompt
- AI Processes: Extracts from all docs
- Review Data: Table view + export
Real-World Example: Pharmaceutical Packaging
A pharmaceutical company needed to extract specifications from 30,000+ packaging PDFs from design agencies.
| Enrichment Field | What It Extracts | Example Result |
|---|---|---|
| SAP Product ID | 10-digit product code | 4012345678 |
| Printing Colors | Color specifications | Pantone 1234, CMYK |
| Package Dimensions | Width × height × depth in mm | 150×200×30 mm |
| Market Languages | Target market codes | EN, DE, FR |
Creating an Enrichment Field
- Open the List where you want to add an enrichment field
- Click List Settings → Fields
- Click the "Add Field" button
| Setting | Required | Description |
|---|---|---|
| Field Name | Required (2-100 characters) | What you want to extract (e.g., "Product Code", "Invoice Amount"). |
| Data Type | Required | One of string, text, markdown, number, date, datetime, boolean. Choose the type that matches your data. Use text for long content, markdown for formatted text, string for short values. |
| AI Model | Required | Select which AI model to use for extraction. Available models depend on your AI Services configuration. |
| AI Prompt | Required (10-2000 characters) | Describe in plain language what to extract. Be specific about format, location, and variations. |
| Failure Terms | Optional | Comma-separated terms that indicate extraction failure or missing data. If the AI returns any of these terms, the enrichment is marked as failed. |
| Vector Store Search | Optional | Enable semantic search to find relevant document chunks before extraction. When enabled, provide a Content Query (2-100 characters) describing what content to search for. This helps the AI focus on relevant sections of large documents, improving accuracy and speed. |
| Context Fields | Optional | Select other enrichment fields to provide context to the AI. Context fields are shown to the AI along with the document, helping it make better extraction decisions. |
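The length and type limits in the table above can be checked before a field is saved. The following is only a sketch of such a check: the `FieldConfig` class, its attribute names, and the `validate` function are hypothetical and are not part of George AI; only the limits themselves come from the table.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Data types listed in the settings table.
DATA_TYPES = {"string", "text", "markdown", "number", "date", "datetime", "boolean"}


@dataclass
class FieldConfig:
    # Hypothetical container; attribute names are illustrative only.
    name: str
    data_type: str
    ai_model: str
    prompt: str
    failure_terms: List[str] = field(default_factory=list)
    content_query: Optional[str] = None  # set only when Vector Store Search is enabled


def validate(cfg: FieldConfig) -> List[str]:
    """Return a list of problems; an empty list means the config passes."""
    problems = []
    if not 2 <= len(cfg.name) <= 100:
        problems.append("Field Name must be 2-100 characters")
    if cfg.data_type not in DATA_TYPES:
        problems.append(f"Unknown Data Type: {cfg.data_type}")
    if not cfg.ai_model:
        problems.append("AI Model is required")
    if not 10 <= len(cfg.prompt) <= 2000:
        problems.append("AI Prompt must be 10-2000 characters")
    if cfg.content_query is not None and not 2 <= len(cfg.content_query) <= 100:
        problems.append("Content Query must be 2-100 characters")
    return problems
```

Returning every problem at once, rather than stopping at the first, mirrors how a form would typically highlight all invalid inputs together.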
After creating the field, test it on a few documents before processing the entire List:
Testing Workflow
- Filter your List to show 5-10 representative documents
- Click the field header → Start Enrichment
- Review extracted values in the table
- If results are inaccurate, edit the field and adjust the AI prompt
- Once satisfied, remove filters and run enrichment on the entire List
Enrichment Queue
Track processing progress in Admin Panel → Enrichment Queue. Monitor success rates and troubleshoot failures.
Dynamic Context Sources
Context sources give your enrichment fields dynamic data. Instead of static prompts, you can reference other fields with {{FieldName}} syntax, search libraries, or fetch data from web APIs—all without writing code.
Field References
Most Common
Use values from other enrichment fields in your prompts
Query Template: "Find technical specs for {{ProductName}}"
Example:
- Field 1: Extract "Product Name" → "iPhone 15 Pro"
- Field 2: Use {{ProductName}} to search specs
- Result: Finds "iPhone 15 Pro specifications" in docs
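To make the {{FieldName}} substitution concrete, here is a minimal sketch of how such a template could be filled in. It is not George AI's implementation: `render_template` is a hypothetical helper, and leaving unknown placeholders untouched is an assumption about the desired behavior.

```python
import re

# Matches {{Name}}, tolerating optional whitespace inside the braces (an assumption).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, values: dict) -> str:
    """Replace each {{Name}} with values["Name"]; leave unknown names untouched."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return values.get(name, match.group(0))
    return PLACEHOLDER.sub(substitute, template)


query = render_template(
    "Find technical specs for {{ProductName}}",
    {"ProductName": "iPhone 15 Pro"},
)
print(query)  # Find technical specs for iPhone 15 Pro
```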
Vector Search Context
Dynamically search your library for relevant content
Configuration:
- Query template with {{Variables}}
- Max tokens (default: 2000)
- Library selection
Example Query: "{{ProductName}} safety warnings"
Web Fetch Context
Fetch data from web APIs with dynamic URLs
URL Template: https://api.example.com/products/{{SKU}}
Supports:
- JSON APIs (auto-converted to markdown)
- HTML pages (cleaned and formatted)
- Authentication headers
- Custom max tokens
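As an illustration of what a dynamic URL and a JSON-to-markdown conversion might look like, the sketch below fills a URL template and renders a parsed JSON response as a nested bullet list. Both helpers (`build_url`, `json_to_markdown`) are hypothetical, the markdown layout George AI actually produces may differ, and no network request is made here.

```python
import json
from urllib.parse import quote


def build_url(template: str, values: dict) -> str:
    """Fill {{Name}} placeholders, percent-encoding each value for use in a URL path."""
    url = template
    for name, value in values.items():
        url = url.replace("{{" + name + "}}", quote(str(value), safe=""))
    return url


def json_to_markdown(data, indent: int = 0) -> str:
    """Render parsed JSON as a nested markdown bullet list."""
    pad = "  " * indent
    lines = []
    if isinstance(data, dict):
        for key, value in data.items():
            if isinstance(value, (dict, list)):
                lines.append(f"{pad}- **{key}**:")
                lines.append(json_to_markdown(value, indent + 1))
            else:
                lines.append(f"{pad}- **{key}**: {value}")
    elif isinstance(data, list):
        for item in data:
            if isinstance(item, (dict, list)):
                lines.append(f"{pad}-")
                lines.append(json_to_markdown(item, indent + 1))
            else:
                lines.append(f"{pad}- {item}")
    else:
        lines.append(f"{pad}{data}")
    return "\n".join(lines)


print(build_url("https://api.example.com/products/{{SKU}}", {"SKU": "AB 12"}))
print(json_to_markdown(json.loads('{"name": "Widget", "tags": ["a", "b"]}')))
```

Percent-encoding the substituted value keeps spaces or slashes in a field value from changing the structure of the URL.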
Static Markdown
Provide the same context for all list items
Use cases:
- Company glossary or terminology
- Extraction guidelines
- Classification categories
- Reference tables
Real-World Example: Product Data Sheet Generation
Generate comprehensive product data sheets by combining multiple context sources:
| Field | Context Type | Configuration | Result |
|---|---|---|---|
| Product Name | None | Direct extraction from document | "Premium Widget X200" |
| Specifications | Vector Search | Query: "{{ProductName}} technical specifications" | Finds spec sheet in library → formatted markdown |
| Market Price | Web Fetch | URL: "https://api.prices.example/{{SKU}}" | Fetches current price from API → "$299.99" |
| Category | Static Markdown | Product taxonomy (20 categories) | AI classifies based on taxonomy → "Electronics > Widgets" |
Automatic Markdown Conversion
Tips for Writing Effective Prompts
✓ Do This
- Be specific about format ("10-digit number starting with 40")
- Mention typical location ("top right of first page")
- Provide examples ("like 4012345678 or 5087654321")
- Describe variations ("may have dashes or spaces")
- Specify units if applicable ("in millimeters", "in EUR")
✗ Avoid This
- Vague descriptions ("find the code")
- Conflicting requirements ("must be text and number")
- Too many things at once (split into multiple fields)
- Ambiguous language ("the main ID", but which one?)
- Assuming document structure (not all docs are the same)
Managing Enrichment Fields
Start Enrichment
Process all documents in the List (or filtered subset) to populate the field. Only missing values are enriched by default.
Stop Enrichment
Cancel all pending and processing tasks for this field. Already completed enrichments remain.
Clean Enrichments
Clear all cached enrichment values for this field. Use this to re-extract data after updating the prompt.
Editing Fields
You can edit enrichment fields at any time. After editing, use Clean Enrichments to clear old values, then Start Enrichment to re-process with the new prompt.
Field Types
| Type | Description | Use Cases |
|---|---|---|
| string | Short text values (IDs, codes, names) | Product codes, customer IDs, status values |
| text | Long text content (descriptions, notes) | Product descriptions, comments, summaries |
| markdown | Formatted text with markdown syntax | Rich text content, formatted specifications, documentation |
| number | Numeric values (integers or decimals) | Prices, quantities, dimensions, percentages |
| date | Date without time (YYYY-MM-DD) | Expiration dates, manufacturing dates |
| datetime | Date with time (ISO 8601 format) | Order timestamps, delivery times |
| boolean | True/false values | Compliance flags, approval status, availability |
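If you post-process exported values yourself, the types above map naturally onto native values. The sketch below shows one possible mapping in Python; `coerce` is a hypothetical helper, and the accepted boolean spellings ("true"/"yes", "false"/"no") are an assumption rather than documented behavior.

```python
from datetime import date, datetime


def coerce(raw: str, data_type: str):
    """Convert a raw extracted string into a Python value for the given field type."""
    text = raw.strip()
    if data_type in ("string", "text", "markdown"):
        return text
    if data_type == "number":
        try:
            return int(text)
        except ValueError:
            return float(text)  # raises ValueError if not numeric at all
    if data_type == "date":
        return date.fromisoformat(text)  # expects YYYY-MM-DD
    if data_type == "datetime":
        # ISO 8601; note that a trailing "Z" is only accepted from Python 3.11 on.
        return datetime.fromisoformat(text)
    if data_type == "boolean":
        lowered = text.lower()
        if lowered in ("true", "yes"):
            return True
        if lowered in ("false", "no"):
            return False
        raise ValueError(f"not a boolean: {raw!r}")
    raise ValueError(f"unknown data type: {data_type}")
```

Raising `ValueError` on anything unparseable makes bad values easy to collect and review instead of silently passing through.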
Enrichment Details & Troubleshooting
Click any cell in a List to view detailed enrichment information, inspect how values were extracted, debug errors, and retry failed extractions.
The enrichment side panel provides complete transparency into how each value was extracted:
| Section | What It Shows |
|---|---|
| Value | The extracted value, error message, or failure term. For markdown fields, toggle between rendered and source views. |
| Metadata | Status, AI model used, processing duration, requested/completed timestamps |
| Issues | Warnings or validation issues encountered during extraction |
| Context | Field references, vector search results, web fetch data, or full content used for extraction |
| Prompt | The exact prompt sent to the LLM (after template variable substitution) |
| Complete Messages | All messages (system, user, assistant) sent to the LLM for debugging complex prompts |
How to Access
Click any cell in the List table to open the enrichment details side panel. Close with ESC key or close button.
If an enrichment fails or returns an incorrect value, you can retry it:
Single Retry
- Click the cell with the failed enrichment to open the side panel
- Review the error message or issues section to understand what went wrong
- Click the Retry button at the bottom
- The enrichment will be added back to the queue and processed again
Batch Retry
- Click the field header to open the field menu
- Select Start Enrichment
- By default, only missing and failed enrichments are retried
- Check "Overwrite existing enrichments" to re-process all items
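The selection rule described in the batch retry steps can be sketched as a simple filter. The `Cell` class, its status values, and `select_for_enrichment` are hypothetical names used only to illustrate the rule; they do not reflect George AI's internal data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    # Hypothetical model of one List cell for a given enrichment field.
    item_id: str
    status: Optional[str]  # None = never enriched; otherwise "completed" or "failed"


def select_for_enrichment(cells: List[Cell], overwrite_existing: bool = False) -> List[str]:
    """Return the item IDs that a batch run would process."""
    if overwrite_existing:
        # "Overwrite existing enrichments" checked: re-process every item.
        return [c.item_id for c in cells]
    # Default: only missing and failed enrichments are retried.
    return [c.item_id for c in cells if c.status in (None, "failed")]


cells = [Cell("a", "completed"), Cell("b", "failed"), Cell("c", None)]
print(select_for_enrichment(cells))        # ['b', 'c']
print(select_for_enrichment(cells, True))  # ['a', 'b', 'c']
```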
Edit Prompt Before Retry
If the prompt is incorrect, edit the field first, then use Clean Enrichments and Start Enrichment to re-process with the new prompt.
Error: "Failure term detected"
Cause: The AI returned one of your configured failure terms (e.g., "not found", "N/A")
Solution:
- Check if the data actually exists in the document (view extraction)
- Improve the prompt to be more specific about where to find the data
- Add context sources (vector search, field references) to provide more information
- If the term is actually valid data, remove it from the failure terms list
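To reason about when a failure term fires, it can help to picture the check as a comparison between the AI's answer and the configured list. The sketch below assumes a case-insensitive match against the whole trimmed answer; the actual matching rule used by George AI is not stated here, and both function names are hypothetical.

```python
from typing import List


def parse_failure_terms(raw: str) -> List[str]:
    """Split a comma-separated setting into normalized terms, dropping empty entries."""
    return [term.strip().lower() for term in raw.split(",") if term.strip()]


def is_failure(answer: str, terms: List[str]) -> bool:
    """Assumed rule: the whole answer, trimmed and lowercased, equals a failure term."""
    return answer.strip().lower() in terms


terms = parse_failure_terms("not found, N/A")
print(is_failure(" N/A ", terms))       # True
print(is_failure("4012345678", terms))  # False
```

Under this assumed rule, a value that merely contains a term (for example "N/A-1234") would not be flagged; a substring-based rule would behave differently, which is one reason to remove terms that can appear in valid data.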
Error: "Timeout" or "Processing error"
Cause: LLM took too long or encountered an error
Solution:
- Reduce context size (lower max tokens for vector search/web fetch)
- Simplify the prompt to require less reasoning
- Try a faster AI model
- Retry the enrichment; the failure may be caused by a temporary infrastructure issue
Issue: Incorrect values extracted
Cause: Prompt ambiguity or insufficient context
Solution:
- Review the "Prompt" section of the side panel to see exactly what was sent to the LLM
- Check the "Context" section to verify the right data was provided
- Make prompt more specific (add examples, describe format, mention location)
- Add field references to provide clarifying context
- Verify vector search queries are returning relevant chunks
Issue: Slow processing
Cause: Large files, complex prompts, or high queue load
Solution:
- Check Admin Panel → Enrichment Queue for queue status
- Reduce max tokens for context sources
- Process in smaller batches (filter list before starting enrichment)
- Use faster AI models for simple extractions
Track enrichment progress in real time:
Field Header Status
Each field header shows:
- Number of processing items
- Number of pending items
- Click header for field controls (start, stop, clean)
Enrichment Queue (Admin Panel)
Global queue view showing:
- All enrichment tasks across all Lists
- Queue status and worker health
- Failed tasks with error details