Data Refresh

Keep your analysis current with automatic and on-demand data updates.

What is Data Refresh?

Data refresh is how Shadowfax updates Sources with the latest data. Refresh currently works from file uploads; database refresh capabilities are coming soon. When a Source refreshes, all dependent Views recalculate automatically through the reactive system, so your entire analysis stays current.

Figure: Data Refresh Flow. Fresh data flows through your entire pipeline automatically.

Why Data Refresh Matters

Always current: Dashboards reflect the latest data without manual updates.

Automatic propagation: Refresh one Source, all downstream Views update.

Scheduled updates: Set it once, forget about it—data refreshes on schedule.

Time savings: No manual CSV exports and re-uploads.

Trust: Confidence that your analysis is based on current data.

How Data Refresh Works

The Refresh Process

  1. Trigger: Manual click, scheduled time, or file re-upload
  2. Fetch: Shadowfax queries the database or processes the file
  3. Update Source: New data replaces old in the Source
  4. Propagate: Reactive system detects change
  5. Recalculate: All dependent Views update automatically
  6. Refresh visualizations: Charts and dashboards show new data

A refresh typically completes in seconds to minutes, depending on data volume.

Figure: Refresh Process. Step-by-step refresh and propagation.

Refresh Options

Manual Refresh

What it is: You trigger refresh whenever you want fresh data.

How to do it:

  1. Right-click the Source node
  2. Select "Refresh data"
  3. Wait for refresh to complete
  4. Views update automatically

Best for:

  • Ad-hoc analysis
  • Infrequently changing data
  • Cost-sensitive database queries
  • Testing before scheduling

Indicators: A loading spinner shows while the refresh runs, and the timestamp updates when it completes.

Figure: Manual Refresh. Right-click menu with Refresh option.

Scheduled Refresh

What it is: Automatic refresh at regular intervals.

Options:

  • Hourly: Every hour at the top of the hour
  • Daily: Once per day (usually early morning)
  • Custom: Specific times or intervals (if supported)

How to configure:

  1. Go to Source settings
  2. Select "Refresh schedule"
  3. Choose frequency (hourly, daily)
  4. Set time (for daily)
  5. Save
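
To make the schedule options concrete, here is a minimal Python sketch of how the next hourly or daily run time could be computed. It is illustrative only and not Shadowfax's scheduler: the function names `next_hourly` and `next_daily` are hypothetical, and the code assumes naive datetimes in a single time zone.

```python
from datetime import datetime, time, timedelta


def next_hourly(now: datetime) -> datetime:
    """Return the next top-of-the-hour strictly after `now`."""
    top_of_this_hour = now.replace(minute=0, second=0, microsecond=0)
    return top_of_this_hour + timedelta(hours=1)


def next_daily(now: datetime, at: time) -> datetime:
    """Return the next occurrence of the daily time `at` strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 8, 30)
    print(next_hourly(now))             # next hourly run
    print(next_daily(now, time(6, 0)))  # next daily run at 06:00
```

A helper like this can be handy when documenting for stakeholders when a dashboard will next update.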

Best for:

  • Operational dashboards
  • Regular reporting
  • Monitoring live systems

Figure: Schedule Configuration. Refresh schedule settings.

File Re-Upload Refresh

What it is: Upload a new version of a file to replace the old one.

How to do it:

  1. Click the Source you want to update
  2. Select "Replace file"
  3. Upload new file (same schema expected)
  4. Shadowfax validates and updates
  5. Views recalculate

Best for:

  • CSV or Excel Sources
  • Weekly or monthly data drops
  • Replacing snapshots

Important: The new file should have the same columns as the old one, because Views depend on schema consistency.
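
Before replacing a file, you can check locally that the new CSV has the same columns as the old one. The sketch below is one possible pre-upload check, not the validation Shadowfax itself performs. The helper names are hypothetical, and it assumes column order does not matter, which may not hold in your setup.

```python
import csv
import io


def read_header(csv_text: str) -> list[str]:
    """Return the column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return [name.strip() for name in header]


def same_columns(old_csv: str, new_csv: str) -> bool:
    """True if both files expose the same set of column names.

    Assumption: column order is ignored; only the names are compared.
    """
    return set(read_header(old_csv)) == set(read_header(new_csv))


if __name__ == "__main__":
    old = "id,amount\n1,10\n"
    new = "id,total\n1,10\n"
    print(same_columns(old, new))  # column "amount" was renamed to "total"
```

For files on disk, read the text with `open(path, newline="")` and pass it to the same functions.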

Refresh Frequency Considerations

Hourly Refresh

Pros:

  • Near-real-time dashboards
  • Catch operational issues quickly
  • Up-to-date for fast-moving data

Cons:

  • More database load
  • Higher query costs (for billed databases)
  • May not be necessary for slow-changing data

Good for: Operations dashboards, live monitoring, SaaS metrics

Daily Refresh

Pros:

  • Fresh data for daily reports
  • Lower database load
  • Cost-effective
  • Sufficient for most business analysis

Cons:

  • Not real-time
  • Delayed insights for fast-moving data

Good for: Daily reports, trend analysis, executive dashboards

Manual (On-Demand) Refresh

Pros:

  • Full control over timing
  • No unnecessary queries
  • Cost-effective for exploration

Cons:

  • Requires remembering to refresh
  • Risk of analyzing stale data
  • Not suitable for shared dashboards

Good for: Ad-hoc analysis, one-time investigations, development/testing

Figure: Frequency Comparison. Refresh frequency trade-offs.

Understanding Refresh Status

Status Indicators

  • Last refreshed: Timestamp showing when data was last updated
  • Refreshing: Loading spinner during active refresh
  • Failed: Error icon if refresh encountered issues
  • Scheduled next: Shows when next auto-refresh will occur

Viewing Refresh History

Where refresh history is available, it shows:

  • Timestamps of past refreshes
  • Success/failure status
  • Error messages (if failed)

Use this to troubleshoot issues or verify refresh patterns.

Figure: Refresh History. Refresh history log showing past updates.

Automatic View Recalculation

The Reactive Chain

When a Source refreshes:

  1. Source updates
  2. First-level Views recalculate (directly use Source)
  3. Second-level Views recalculate (use first-level Views)
  4. ...continue downstream...
  5. Visualizations refresh

All automatic, all consistent.

Performance Impact

  • Small data: Views recalculate in seconds
  • Large data: May take minutes for complex pipelines
  • Progress indicators: See which Views are updating

The system processes updates in dependency order for consistency.
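
Processing updates in dependency order amounts to a topological sort of the pipeline graph. The sketch below illustrates the idea with Python's standard `graphlib` module. It is not Shadowfax's implementation, and the node names and the `recalculation_order` helper are made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each node maps to the set of nodes it reads from.
dependencies = {
    "orders_source": set(),
    "clean_orders": {"orders_source"},
    "daily_revenue": {"clean_orders"},
    "top_customers": {"clean_orders"},
    "revenue_chart": {"daily_revenue"},
}


def recalculation_order(deps: dict[str, set[str]]) -> list[str]:
    """Return nodes so that every node appears after all of its inputs."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for node in recalculation_order(dependencies):
        print(node)
```

Because every View is recalculated only after its inputs, no View ever reads a half-updated upstream result.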

Handling Schema Changes

Same Schema (Safe)

If new data has the same columns:

  • Refresh works smoothly
  • Views continue working
  • No issues

Schema Changes (Risky)

If columns are added, removed, or renamed:

  • Views referencing missing columns may break
  • Error messages appear
  • Manual fixes needed

Best practice: Keep schema stable. If schema must change, update Views to match.
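
To see which Views a schema change would affect, you can compare the old and new column lists against the columns each View references. The following is a rough sketch under the assumption that you keep such a mapping yourself. `diff_schema`, `broken_views`, and the View names are hypothetical and not part of Shadowfax.

```python
def diff_schema(old_cols: list[str], new_cols: list[str]) -> dict[str, list[str]]:
    """List columns that were added or removed between two schemas.

    A rename shows up as one removed column plus one added column.
    """
    old, new = set(old_cols), set(new_cols)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


def broken_views(view_columns: dict[str, set[str]], new_cols: list[str]) -> dict[str, list[str]]:
    """Map each View to the referenced columns missing from the new schema."""
    available = set(new_cols)
    return {
        view: sorted(cols - available)
        for view, cols in view_columns.items()
        if cols - available
    }


if __name__ == "__main__":
    views = {
        "daily_revenue": {"order_date", "amount"},
        "top_customers": {"customer_id", "amount"},
    }
    new_schema = ["order_date", "total", "customer_id"]
    print(diff_schema(["order_date", "amount", "customer_id"], new_schema))
    print(broken_views(views, new_schema))
```

Added columns alone leave existing Views intact in this model; only removed or renamed columns produce entries in the result.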

Notification of Issues

If a refresh causes downstream errors:

  • Red error indicators on broken Views
  • Error messages explaining what broke
  • You can fix Views and re-run

Figure: Schema Change Warning. Warning when schema change breaks Views.

Refreshing Multiple Sources

Independent Sources

If you have multiple database connections:

  • Each refreshes on its own schedule
  • Timing may differ
  • Views update as each Source refreshes

Coordinating Refreshes

Challenge: If Source A refreshes at 8am and Source B at 9am, Views using both may be inconsistent for an hour.

Solution:

  • Align refresh schedules (both at 8am)
  • Or use manual refresh to sync timing
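
One way to spot misaligned Sources is to compare their "last refreshed" timestamps. The sketch below assumes you have copied those timestamps out by hand. The function names and the tolerance value are illustrative assumptions, not Shadowfax features.

```python
from datetime import datetime, timedelta


def refresh_gap(last_refreshed: dict[str, datetime]) -> timedelta:
    """Time between the oldest and newest refresh among the given Sources."""
    if not last_refreshed:
        return timedelta(0)
    times = list(last_refreshed.values())
    return max(times) - min(times)


def out_of_sync(last_refreshed: dict[str, datetime], tolerance: timedelta) -> bool:
    """True if the Sources were refreshed further apart than `tolerance`."""
    return refresh_gap(last_refreshed) > tolerance


if __name__ == "__main__":
    stamps = {
        "source_a": datetime(2024, 1, 2, 8, 0),  # refreshed today at 8am
        "source_b": datetime(2024, 1, 1, 9, 0),  # still on yesterday's 9am data
    }
    print(refresh_gap(stamps))
    print(out_of_sync(stamps, tolerance=timedelta(minutes=5)))
```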

Dependency Awareness

Views joining multiple Sources:

  • Recalculate whenever any input Source refreshes
  • Always show data consistent with current Source states

Cost and Performance Considerations

Network and Storage

Large datasets:

  • Longer transfer times
  • More storage in Shadowfax
  • Consider filtering data before upload when possible
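
Filtering before upload can be as simple as dropping rows you do not need while keeping the header intact, so the schema stays the same. Here is a minimal sketch using Python's standard `csv` module. The `filter_rows` helper and the sample columns are assumptions for illustration.

```python
import csv
import io
from typing import Callable


def filter_rows(csv_text: str, column: str, keep: Callable[[str], bool]) -> str:
    """Return CSV text containing the header plus rows where keep(row[column]) is true."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames or [], lineterminator="\n")
    writer.writeheader()  # keep the original header so the schema is unchanged
    for row in reader:
        if keep(row[column]):
            writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    raw = "region,amount\nEU,10\nUS,20\nEU,5\n"
    print(filter_rows(raw, "region", lambda value: value == "EU"))
```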

Figure: Cost Optimization. Managing data efficiently.

Troubleshooting Refresh Issues

Refresh Failed

Common causes:

  • Database connection lost
  • Database credentials changed
  • Query timeout (data too large)
  • Schema changed
  • Network issues

Solutions:

  • Check connection settings
  • Verify that credentials are still valid
  • Test database connectivity
  • Review error messages
  • Contact database admin if needed

Views Not Updating

Cause: Refresh succeeded but Views didn't recalculate

Solutions:

  • Check View dependencies (recalculation should be automatic)
  • Refresh page
  • Manually trigger View recalculation (if option available)

Slow Refresh

Causes:

  • Large dataset
  • Slow database query
  • Complex downstream Views

Solutions:

  • Optimize database queries (indexes, views)
  • Reduce data volume (filter, sample)
  • Simplify View logic

Best Practices

Match refresh to data change frequency: If data updates daily, daily refresh is sufficient.

Align schedules: For joined Sources, sync refresh times.

Monitor costs: Track database query costs and adjust frequency if needed.

Test before scheduling: Use manual refresh to verify everything works.

Notify stakeholders: Let dashboard users know when data refreshes (e.g., "Daily at 6am EST").

Handle failures gracefully: Set up alerts if refresh fails (if supported).

Document refresh schedules: Keep a record of what refreshes when.

Use off-peak times: Schedule database queries during low-traffic periods.

Archive old snapshots: If analyzing historical states, export old data before refresh overwrites it.
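
For file-based Sources, archiving can be done locally by copying the current file to a timestamped name before you upload its replacement. The sketch below shows one way to do this. The naming scheme, directory layout, and helper names are assumptions, not a Shadowfax feature.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archive_name(path: Path, now: datetime) -> str:
    """Build a timestamped file name, e.g. orders.csv -> orders_20240101T060000.csv."""
    stamp = now.strftime("%Y%m%dT%H%M%S")
    return f"{path.stem}_{stamp}{path.suffix}"


def archive_snapshot(path: Path, archive_dir: Path, now: datetime) -> Path:
    """Copy `path` into `archive_dir` under a timestamped name and return the new path."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / archive_name(path, now)
    shutil.copy2(path, target)  # copy2 also preserves file modification times
    return target


if __name__ == "__main__":
    print(archive_name(Path("exports/orders.csv"), datetime(2024, 1, 1, 6, 0, 0)))
```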

Refresh and Collaboration

When working with a team:

  • All users see refreshed data: Shared Sources update for everyone
  • No version conflicts: Everyone is always on the latest data
  • Coordinated schedules: Team agrees on refresh timing