No-Code Enrichments

Extract custom data from documents using AI—no coding required

What are Enrichments?

Enrichments are AI-powered data extraction fields that you configure in plain language. Instead of manually reading thousands of documents, you define what you're looking for once, and George AI extracts it automatically from all documents.

Enrichment fields are added to Lists (custom views of your Library files). Each field defines a piece of information to extract, and George AI processes all documents in the List to populate that field.

How It Works

  1. Define Field: name + AI prompt
  2. AI Processes: extracts from all docs
  3. Review Data: table view + export

Real-World Example: Pharmaceutical Packaging

A pharmaceutical company needed to extract specifications from 30,000+ packaging PDFs from design agencies.

| Enrichment Field | What It Extracts | Example Result |
| --- | --- | --- |
| SAP Product ID | 10-digit product code | 4012345678 |
| Printing Colors | Color specifications | Pantone 1234, CMYK |
| Package Dimensions | Width × height × depth in mm | 150×200×30 mm |
| Market Languages | Target market codes | EN, DE, FR |

Result: All 30,000 documents processed automatically. Data exported to SAP for product management.

Creating an Enrichment Field

1. Navigate to List Settings

Open the List where you want to add an enrichment field

Click List Settings → Fields

Click the "Add Field" button

2. Configure Required Settings
Field Name

Required • 2-100 characters

What you want to extract (e.g., "Product Code", "Invoice Amount")

Data Type

Required

Options: string, text, markdown, number, date, datetime, boolean

Choose the type that matches your data. Use text for long content, markdown for formatted text, string for short values.

AI Model

Required

Select which AI model to use for extraction. Available models depend on your AI Services configuration.

AI Prompt

Required • 10-2000 characters

Describe in plain language what to extract. Be specific about format, location, and variations.

Example: "Extract the SAP product identification
number. It's usually a 10-digit code starting
with '40' or '50', found in the top-right corner
of the first page."

3. Configure Optional Settings
Failure Terms

Comma-separated terms that indicate extraction failure or missing data

Example: "not found, N/A, missing, unknown"

If the AI returns any of these terms, the enrichment will be marked as failed.
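The failure-term check can be illustrated with a short sketch. This is a hypothetical Python snippet for explanation only; the function name and the case-insensitive matching rule are assumptions, not George AI's actual logic:

```python
# Hypothetical sketch of failure-term matching; George AI's actual
# implementation may differ (e.g., in how values are normalized).

def is_failed(value, failure_terms="not found, N/A, missing, unknown"):
    """Return True if the AI's answer equals one of the configured failure terms."""
    terms = {t.strip().lower() for t in failure_terms.split(",")}
    return value.strip().lower() in terms

print(is_failed("N/A"))         # True -> enrichment marked as failed
print(is_failed("4012345678"))  # False -> value accepted
```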

Vector Store Search

Enable semantic search to find relevant document chunks before extraction

When enabled, provide a Content Query (2-100 characters) describing what content to search for.

This helps the AI focus on relevant sections of large documents, improving accuracy and speed.
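Conceptually, vector store search embeds the Content Query and ranks document chunks by similarity, passing only the best matches to the AI. The toy example below illustrates the ranking step with made-up vectors; real systems use learned embeddings:

```python
import math

# Toy illustration of semantic chunk ranking; the vectors are invented.
# In practice, an embedding model maps each text to a high-dimensional vector.

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

chunks = {
    "Safety warnings: keep away from heat.": [0.9, 0.1, 0.0],
    "Ordering and shipping information.":    [0.1, 0.8, 0.2],
}
query_vec = [0.85, 0.15, 0.05]  # pretend embedding of the Content Query

best = max(chunks, key=lambda text: cosine(chunks[text], query_vec))
print(best)  # the safety-warnings chunk ranks highest
```

Because only the top-ranked chunks reach the model, a well-phrased Content Query directly improves both accuracy and speed.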

Context Fields

Select other enrichment fields to provide context to the AI

Example: When extracting "Unit Price", include
"Currency" as context so AI knows the price format

Context fields are shown to the AI along with the document, helping it make better extraction decisions.

4. Test and Apply

After creating the field, test it on a few documents before processing the entire List:

Testing Workflow

  1. Filter your List to show 5-10 representative documents
  2. Click the field header → Start Enrichment
  3. Review extracted values in the table
  4. If results are inaccurate, edit the field and adjust the AI prompt
  5. Once satisfied, remove filters and run enrichment on the entire List

Enrichment Queue

Track processing progress in Admin Panel → Enrichment Queue. Monitor success rates and troubleshoot failures.

Dynamic Context Sources

Context sources give your enrichment fields dynamic data. Instead of static prompts, you can reference other fields with {{FieldName}} syntax, search libraries, or fetch data from web APIs—all without writing code.

Field References
Most Common

Use values from other enrichment fields in your prompts

Query Template:
"Find technical specs for {{ProductName}}"

Example:

• Field 1: Extract "Product Name" → "iPhone 15 Pro"

• Field 2: Use {{ProductName}} to search specs

• Result: Finds "iPhone 15 Pro specifications" in docs
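A {{FieldName}} reference is essentially a template substitution. The sketch below shows the idea; the regex and the leave-unknown-placeholders-untouched behavior are assumptions for illustration, not the product's actual templating:

```python
import re

# Illustrative {{FieldName}} substitution; not George AI's actual templating.

def render_template(template, fields):
    """Replace each {{FieldName}} placeholder with its extracted value.
    Unknown placeholders are left untouched."""
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: str(fields.get(m.group(1), m.group(0))),
        template,
    )

fields = {"ProductName": "iPhone 15 Pro"}
print(render_template("Find technical specs for {{ProductName}}", fields))
# Find technical specs for iPhone 15 Pro
```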

Vector Search Context

Dynamically search your library for relevant content

Configuration:

  • Query template with {{Variables}}
  • Max tokens (default: 2000)
  • Library selection
Example Query:
"{{ProductName}} safety warnings"

Web Fetch Context

Fetch data from web APIs with dynamic URLs

URL Template:
https://api.example.com/products/{{SKU}}

Supports:

• JSON APIs (auto-converted to markdown)

• HTML pages (cleaned and formatted)

• Authentication headers

• Custom max tokens
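Resolving the URL template works like field-reference substitution. The sketch below shows only that step, using the placeholder endpoint from above and an invented SKU value; the actual fetch and authentication would then happen over HTTP:

```python
# Illustrative URL-template resolution for a Web Fetch context source.
# The endpoint and SKU value are placeholders, not real data.

def resolve_url(url_template, fields):
    """Substitute {{Variable}} placeholders into the URL template."""
    for name, value in fields.items():
        url_template = url_template.replace("{{" + name + "}}", str(value))
    return url_template

url = resolve_url("https://api.example.com/products/{{SKU}}", {"SKU": "X200"})
print(url)  # https://api.example.com/products/X200
```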

Static Markdown

Provide the same context for all list items

Use cases:

  • Company glossary or terminology
  • Extraction guidelines
  • Classification categories
  • Reference tables

Real-World Example: Product Data Sheet Generation

Generate comprehensive product data sheets by combining multiple context sources:

| Field | Context Type | Configuration | Result |
| --- | --- | --- | --- |
| Product Name | None | Direct extraction from document | "Premium Widget X200" |
| Specifications | Vector Search | Query: "{{ProductName}} technical specifications" | Finds spec sheet in library → formatted markdown |
| Market Price | Web Fetch | URL: "https://api.prices.example/{{SKU}}" | Fetches current price from API → "$299.99" |
| Category | Static Markdown | Product taxonomy (20 categories) | AI classifies based on taxonomy → "Electronics > Widgets" |

Result: Four enrichment fields working together create complete, accurate product data sheets automatically.

Automatic Markdown Conversion

JSON API responses and HTML pages are automatically converted to well-formatted markdown. This improves LLM extraction accuracy.
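As a rough idea of what such a conversion does, the sketch below renders a flat JSON object as a markdown list. It is a minimal illustration under assumed inputs; the product's converter presumably also handles nesting, arrays, and HTML:

```python
import json

# Minimal sketch of JSON-to-markdown conversion, assuming a flat object.

def json_to_markdown(payload):
    """Render a flat JSON object as a markdown bullet list."""
    data = json.loads(payload)
    return "\n".join(f"- **{key}**: {value}" for key, value in data.items())

print(json_to_markdown('{"sku": "X200", "price": 299.99}'))
# - **sku**: X200
# - **price**: 299.99
```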

Tips for Writing Effective Prompts

✓ Do This

  • Be specific about format ("10-digit number starting with 40")
  • Mention typical location ("top right of first page")
  • Provide examples ("like 4012345678 or 5087654321")
  • Describe variations ("may have dashes or spaces")
  • Specify units if applicable ("in millimeters", "in EUR")

✗ Avoid This

  • Vague descriptions ("find the code")
  • Conflicting requirements ("must be text and number")
  • Too many things at once (split into multiple fields)
  • Ambiguous language ("the main ID"—which one?)
  • Assuming document structure (not all docs are the same)

Managing Enrichment Fields

Start Enrichment

Process all documents in the List (or filtered subset) to populate the field. Only missing values are enriched by default.

Stop Enrichment

Cancel all pending and processing tasks for this field. Already completed enrichments remain.

Clean Enrichments

Clear all cached enrichment values for this field. Use this to re-extract data after updating the prompt.

Editing Fields

You can edit enrichment fields at any time. After editing, use Clean Enrichments to clear old values, then Start Enrichment to re-process with the new prompt.

Field Types

Type Description Use Cases
string Short text values (IDs, codes, names) Product codes, customer IDs, status values
text Long text content (descriptions, notes) Product descriptions, comments, summaries
markdown Formatted text with markdown syntax Rich text content, formatted specifications, documentation
number Numeric values (integers or decimals) Prices, quantities, dimensions, percentages
date Date without time (YYYY-MM-DD) Expiration dates, manufacturing dates
datetime Date with time (ISO 8601 format) Order timestamps, delivery times
boolean True/false values Compliance flags, approval status, availability
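To make the type distinctions concrete, the sketch below checks a raw extracted string against a declared field type. It is an assumption-laden illustration, not George AI's actual validator:

```python
from datetime import date, datetime

# Hypothetical type check for extracted values; illustration only.

def validate(value, field_type):
    """Return True if the raw string is acceptable for the declared type."""
    try:
        if field_type == "number":
            float(value)                      # integers or decimals
        elif field_type == "date":
            date.fromisoformat(value)         # YYYY-MM-DD
        elif field_type == "datetime":
            datetime.fromisoformat(value)     # ISO 8601
        elif field_type == "boolean":
            return value.strip().lower() in ("true", "false")
        return True                           # string/text/markdown: any text
    except ValueError:
        return False

print(validate("2024-05-01", "date"))  # True
print(validate("soon", "date"))        # False
```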

Enrichment Details & Troubleshooting

Click any cell in a List to view detailed enrichment information, inspect how values were extracted, debug errors, and retry failed extractions.

Viewing Enrichment Details

The enrichment side panel provides complete transparency into how each value was extracted:

| Section | What It Shows |
| --- | --- |
| Value | The extracted value, error message, or failure term. For markdown fields, toggle between rendered and source views. |
| Metadata | Status, AI model used, processing duration, requested/completed timestamps |
| Issues | Warnings or validation issues encountered during extraction |
| Context | Field references, vector search results, web fetch data, or full content used for extraction |
| Prompt | The exact prompt sent to the LLM (after template variable substitution) |
| Complete Messages | All messages (system, user, assistant) sent to the LLM for debugging complex prompts |

How to Access

Click any cell in the List table to open the enrichment details side panel. Close with ESC key or close button.

Retry Failed Enrichments

If an enrichment fails or returns an incorrect value, you can retry it:

Single Retry

  1. Click the cell with the failed enrichment to open the side panel
  2. Review the error message or issues section to understand what went wrong
  3. Click the Retry button at the bottom
  4. The enrichment will be added back to the queue and processed again

Batch Retry

  1. Click the field header to open the field menu
  2. Select Start Enrichment
  3. By default, only missing and failed enrichments are retried
  4. Check "Overwrite existing enrichments" to re-process all items

Edit Prompt Before Retry

If the prompt is incorrect, edit the field first, then use Clean Enrichments and Start Enrichment to re-process with the new prompt.

Common Issues & Solutions

Error: "Failure term detected"

Cause: The AI returned one of your configured failure terms (e.g., "not found", "N/A")

Solution:

  • Check if the data actually exists in the document (view extraction)
  • Improve the prompt to be more specific about where to find the data
  • Add context sources (vector search, field references) to provide more information
  • If the term is actually valid data, remove it from the failure terms list

Error: "Timeout" or "Processing error"

Cause: LLM took too long or encountered an error

Solution:

  • Reduce context size (lower max tokens for vector search/web fetch)
  • Simplify the prompt to require less reasoning
  • Try a faster AI model
  • Retry the enrichment (temporary infrastructure issues)

Issue: Incorrect values extracted

Cause: Prompt ambiguity or insufficient context

Solution:

  • Review the "Prompt Sent to LLM" section to see what the AI actually received
  • Check the "Context" section to verify the right data was provided
  • Make prompt more specific (add examples, describe format, mention location)
  • Add field references to provide clarifying context
  • Verify vector search queries are returning relevant chunks

Issue: Slow processing

Cause: Large files, complex prompts, or high queue load

Solution:

  • Check Admin Panel → Enrichment Queue for queue status
  • Reduce max tokens for context sources
  • Process in smaller batches (filter list before starting enrichment)
  • Use faster AI models for simple extractions

Monitoring Processing Progress

Track enrichment progress in real-time:

Field Header Status

Each field header shows:

  • Number of processing items
  • Number of pending items
  • Click header for field controls (start, stop, clean)

Enrichment Queue (Admin Panel)

Global queue view showing:

  • All enrichment tasks across all Lists
  • Queue status and worker health
  • Failed tasks with error details

Related Topics

Explore related features and concepts:

George-Cloud