Schema Discovery

Automatic understanding of your data relationships—no manual configuration required.

What is Schema Discovery?

Schema Discovery is how Shadowfax automatically figures out your data structure. When you import data, the AI analyzes column names, data types, and patterns to understand relationships between tables. It identifies foreign keys, suggests joins, and builds a schema map—all without you defining anything manually.

[Figure: Schema Map. Automatically discovered relationships between datasets.]

Why Schema Discovery Matters

Zero configuration: Import data and start analyzing—no setup needed.

Smart joins: The AI knows how tables relate, so it suggests correct join keys.

Error prevention: Reduces mistakes from incorrect table relationships.

Time savings: Skip hours of manual schema documentation.

Onboarding speed: New team members understand data structure visually.

How It Works

Automatic Analysis

When you add Sources to your Workbook:

Column inspection: Examines column names for patterns (id, user_id, customer_id, etc.)
Data type analysis: Checks whether a column's values are suitable for use as a key
Relationship inference: Identifies likely foreign key relationships
Cardinality detection: Determines one-to-one, one-to-many, or many-to-many relationships
Naming pattern matching: Recognizes common naming conventions
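
To make the name-based part of this analysis concrete, here is a minimal Python sketch. It is not Shadowfax's implementation. The `Relationship` class, the `infer_foreign_keys` function, and the rule of matching a `<name>_id` column to a table called `<name>` or `<name>s` that has an `id` column are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A guessed foreign key: child_table.child_column -> parent_table.parent_column."""
    child_table: str
    child_column: str
    parent_table: str
    parent_column: str


def infer_foreign_keys(schemas: dict[str, list[str]]) -> list[Relationship]:
    """Guess foreign keys from '<name>_id' column names (a naming heuristic only)."""
    found = []
    for table, columns in schemas.items():
        for column in columns:
            if not column.endswith("_id"):
                continue
            stem = column[: -len("_id")]  # 'customer_id' -> 'customer'
            # Try the singular and a naive plural form of the stem as table names.
            for candidate in (stem, stem + "s"):
                if (
                    candidate in schemas
                    and candidate != table
                    and "id" in schemas[candidate]
                ):
                    found.append(Relationship(table, column, candidate, "id"))
                    break
    return found


# Hypothetical Sources, described only by their column names.
schemas = {
    "orders": ["id", "customer_id", "product_id", "total"],
    "customers": ["id", "name", "segment"],
    "products": ["id", "category"],
}
relationships = infer_foreign_keys(schemas)
for rel in relationships:
    print(f"{rel.child_table}.{rel.child_column} -> {rel.parent_table}.{rel.parent_column}")
```

With these example tables, the sketch reports `orders.customer_id -> customers.id` and `orders.product_id -> products.id`. A real system would presumably also weigh data types and values, as the list above describes.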

Visual Schema Map

Access the schema visualization to see:

All your Sources as entities
Relationship lines connecting related tables
Join keys labeled on each connection
Relationship types (one-to-many, etc.)

[Figure: ER Diagram. Entity-relationship diagram showing discovered schema.]

What Gets Discovered

Foreign Key Relationships

Pattern recognition:

customer_id in orders table → links to id in customers table
product_id → links to products table
user_id → links to users table

Naming variations handled:

customerId (camelCase)
customer_id (snake_case)
CustomerID (PascalCase)
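
One plausible way to treat these variations as the same column is to normalize every name to snake_case before comparing. The sketch below is an assumption about how that could be done, not a description of Shadowfax internals; `normalize_column_name` is a hypothetical helper.

```python
import re


def normalize_column_name(name: str) -> str:
    """Convert camelCase, PascalCase, or snake_case names to lower snake_case."""
    # Break between a lowercase letter or digit and an uppercase letter:
    # 'customerId' -> 'customer_Id', 'CustomerID' -> 'Customer_ID'.
    with_breaks = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Break between an acronym and a following capitalized word:
    # 'HTTPCode' -> 'HTTP_Code'.
    with_breaks = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", with_breaks)
    return with_breaks.lower()


for variant in ("customerId", "customer_id", "CustomerID"):
    print(variant, "->", normalize_column_name(variant))
```

All three variants normalize to `customer_id`, so a name-based matcher only has to handle one convention.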

Join Recommendations

When you ask to combine datasets:

@[orders] and @[customers] Show customer names with their orders

The AI already knows to join on customer_id = id because of schema discovery.
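
For readers who want to see what such a join looks like, here is a small self-contained sketch using Python's built-in `sqlite3` module. The table contents and the `name` and `total` columns are made up for illustration, and the SQL is only an example of a join on `customer_id = id`, not necessarily the query Shadowfax generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
    """
)

# Join each order to its customer through the discovered key pair.
rows = conn.execute(
    """
    SELECT customers.name, orders.id, orders.total
    FROM orders
    JOIN customers ON orders.customer_id = customers.id
    ORDER BY orders.id
    """
).fetchall()

for name, order_id, total in rows:
    print(name, order_id, total)
```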

Data Type Inference

Shadowfax detects:

Primary keys: Unique identifiers
Foreign keys: References to other tables
Dates and times: For temporal analysis
Categorical fields: For grouping
Numeric measures: For aggregation
Text fields: For filtering and display
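
The categories above can be approximated with a few simple checks on a column's name and values. The following is a rough sketch under stated assumptions: the `classify_column` function, the rule order, and the "distinct values are at most half the rows" threshold for categorical fields are all hypothetical, and Shadowfax's actual detection logic is not documented here.

```python
from datetime import datetime


def _is_iso_datetime(value) -> bool:
    """True if value is a string that parses as an ISO 8601 date or datetime."""
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def classify_column(name: str, values: list) -> str:
    """Assign a column to one of the roles listed above using simple heuristics."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"
    distinct = set(present)
    if name == "id" and len(distinct) == len(present):
        return "primary key"
    if name.endswith("_id"):
        return "foreign key"
    if all(_is_iso_datetime(v) for v in present):
        return "date/time"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "numeric measure"
    # Few distinct values relative to row count suggests a grouping field.
    if len(distinct) * 2 <= len(present):
        return "categorical"
    return "text"


print(classify_column("id", [1, 2, 3]))
print(classify_column("created_at", ["2024-01-05", "2024-02-10"]))
print(classify_column("segment", ["retail", "retail", "wholesale", "retail"]))
```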

Common Scenarios

Multi-Table Analysis

Your request:

@[orders], @[customers], and @[products] Show revenue by customer segment
and product category

What happens:

AI sees three tables mentioned
Checks discovered relationships
Knows: orders.customer_id → customers.id
Knows: orders.product_id → products.id
Constructs correct joins automatically
Groups by customer segment and product category

You do: Nothing—just mention the tables.
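
As an illustration of the kind of query this request implies, the sketch below builds three tiny tables with Python's built-in `sqlite3` module and joins them on the two relationships listed above. The sample rows and the `revenue`, `segment`, and `category` column names are assumptions, and the SQL is an example rather than Shadowfax's actual output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, segment TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        product_id INTEGER,
        revenue REAL
    );
    INSERT INTO customers VALUES (1, 'Enterprise'), (2, 'SMB');
    INSERT INTO products VALUES (1, 'Hardware'), (2, 'Software');
    INSERT INTO orders VALUES
        (1, 1, 1, 100.0), (2, 1, 2, 50.0), (3, 2, 2, 30.0), (4, 1, 1, 20.0);
    """
)

# Two joins, one per discovered relationship, then group by both dimensions.
revenue_rows = conn.execute(
    """
    SELECT c.segment, p.category, SUM(o.revenue) AS revenue
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.id
    JOIN products AS p ON o.product_id = p.id
    GROUP BY c.segment, p.category
    ORDER BY c.segment, p.category
    """
).fetchall()

for segment, category, revenue in revenue_rows:
    print(segment, category, revenue)
```

This is also the shape of SQL worth checking when you verify a complex multi-table join: each table should be joined on exactly one of the keys shown in the schema map.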

[Figure: Multi-Table Join. AI uses discovered schema to join multiple tables correctly.]

Ambiguous Column Names

Scenario: Both "orders" and "returns" have a date column

What happens:

Schema discovery notes both columns
When you mention "date", AI asks which one you mean
Or uses context to infer (e.g., if you mentioned @[orders], uses orders.date)

Your role: Add clarifying context in column annotations to avoid ambiguity.
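
The behavior described in this scenario can be sketched in a few lines of Python. Both functions below are hypothetical and only illustrate the idea: one finds column names shared by several tables, the other uses the tables you mentioned to pick a single owner, returning `None` when the reference is still ambiguous and a clarifying question would be needed.

```python
from collections import defaultdict


def find_ambiguous_columns(schemas: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each column name that appears in more than one table to those tables."""
    owners = defaultdict(list)
    for table, columns in schemas.items():
        for column in columns:
            owners[column].append(table)
    return {column: tables for column, tables in owners.items() if len(tables) > 1}


def resolve_column(column: str, schemas: dict[str, list[str]], mentioned: list[str]):
    """Return 'table.column' if exactly one mentioned table has the column, else None."""
    candidates = [t for t in mentioned if column in schemas.get(t, [])]
    if len(candidates) == 1:
        return f"{candidates[0]}.{column}"
    return None  # still ambiguous (or not found): ask the user


schemas = {
    "orders": ["order_id", "date", "total"],
    "returns": ["return_id", "date", "reason"],
}
ambiguous = find_ambiguous_columns(schemas)
print(ambiguous)
print(resolve_column("date", schemas, ["orders"]))
print(resolve_column("date", schemas, ["orders", "returns"]))
```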

Missing Relationships

Scenario: Two tables should relate but discovery missed it

What happens:

You explicitly tell the AI the join key:

Join @[table1] with @[table2] on table1.custom_field = table2.id

AI learns and remembers this relationship for future queries
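
How the AI stores a relationship you teach it is not described here, but the idea of "explicit instructions are remembered and take precedence" can be modeled with a small lookup structure. The `RelationshipRegistry` class below is purely a hypothetical sketch of that idea.

```python
class RelationshipRegistry:
    """Stores one join condition per pair of tables; explicit entries win over inferred ones."""

    def __init__(self):
        self._inferred = {}
        self._explicit = {}

    @staticmethod
    def _pair(table_a, table_b):
        # Order-independent key, so (a, b) and (b, a) refer to the same relationship.
        return frozenset((table_a, table_b))

    def add_inferred(self, table_a, table_b, condition):
        self._inferred[self._pair(table_a, table_b)] = condition

    def add_explicit(self, table_a, table_b, condition):
        self._explicit[self._pair(table_a, table_b)] = condition

    def join_condition(self, table_a, table_b):
        pair = self._pair(table_a, table_b)
        if pair in self._explicit:
            return self._explicit[pair]
        return self._inferred.get(pair)  # None if nothing is known


registry = RelationshipRegistry()
registry.add_inferred("orders", "customers", "orders.customer_id = customers.id")
# The user-supplied join from the example above.
registry.add_explicit("table1", "table2", "table1.custom_field = table2.id")

print(registry.join_condition("table2", "table1"))
print(registry.join_condition("orders", "customers"))
```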

Viewing Your Schema

Accessing Schema Map

Click the schema visualization icon (usually in the toolbar)
See all Sources as connected entities
Click relationships to see join keys
Zoom and pan to explore complex schemas

[Figure: Schema Visualization UI. Interactive schema map interface.]

Understanding the Visualization

Boxes: Represent Sources (tables)
Lines: Represent relationships
Labels on lines: Show the join keys (e.g., "customer_id = id")
Line style: Indicates relationship type

Solid: One-to-many
Dashed: Many-to-many
Arrow direction: Shows the "many" side
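
The relationship types shown on the map can be understood in terms of whether key values repeat on each side of a join. The sketch below is one simple way to classify a relationship from sample key values. `detect_cardinality` is a hypothetical function that looks only at uniqueness, which is an assumption and may differ from how Shadowfax detects cardinality.

```python
def detect_cardinality(left_keys: list, right_keys: list) -> str:
    """Classify a relationship by whether key values repeat on each side."""
    left = [k for k in left_keys if k is not None]
    right = [k for k in right_keys if k is not None]
    left_unique = len(set(left)) == len(left)
    right_unique = len(set(right)) == len(right)
    if left_unique and right_unique:
        return "one-to-one"
    if left_unique:
        return "one-to-many"  # each left key may match many right rows
    if right_unique:
        return "many-to-one"
    return "many-to-many"


customer_ids = [1, 2, 3]            # customers.id
order_customer_ids = [1, 1, 2, 3]   # orders.customer_id
print(detect_cardinality(customer_ids, order_customer_ids))
```

Here `customers.id` never repeats while `orders.customer_id` does, so the relationship is classified as one-to-many, with `orders` on the "many" side.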

Enhancing Schema Discovery

Add Context to Columns

Help the AI understand your data better:

After importing, add column context in the Source settings
Explain unusual naming conventions
Clarify what IDs represent
Note any data quality issues

Example:

Column: cust_ref
Context: "This is the customer ID, links to customers.id"

Specify Join Conditions

If automatic discovery misses something:

@[orders] Join with @[special_discounts] where orders.promo_code =
special_discounts.code

The AI will note this relationship for future use.

Tips & Best Practices

Use consistent naming: Stick to one convention (snake_case or camelCase) across datasets.

Name foreign keys clearly: Use patterns like customer_id, product_id rather than generic ref or fk.

Review the schema map: Check that discovered relationships match your expectations.

Add context for unusual patterns: If your schema doesn't follow conventions, explain it in column annotations.

Leverage auto-discovery: Let the AI figure out relationships instead of specifying every join manually.

Verify complex joins: For multi-table joins, check the generated SQL to ensure correctness.

Teach the AI: When you correct a relationship, the AI learns for that Workbook.

Benefits Over Manual Schema Definition

Speed: Instant understanding vs. hours of documentation

Accuracy: Detects patterns you might miss

Maintenance-free: Automatically understands new tables

No configuration files: No need to write schema YAML or config

Smart defaults: Works out-of-the-box for standard patterns

Limitations

Non-standard schemas: Unusual naming patterns may require clarification

Implicit relationships: Relationships without naming hints might be missed

Multiple valid joins: When two tables can join on different keys, you may need to specify

In these cases, tell the AI the join key explicitly, and it will remember the relationship for that Workbook.

Related Features

Sources - Where schema discovery happens
Views - Benefit from discovered relationships
AI Chat Interface - Where you leverage schema knowledge
@Mentions - Reference datasets the AI understands
