Understanding these foundational concepts will help you make the most of Lume’s capabilities. This guide introduces the key components and how they work together.

Project Basics

Source Data

Source data is any user-provided data that you want to interpret or transform. Lume currently supports CSV files, and you can upload multiple CSV files to work with. All data must be structured, meaning:

  • The first row must contain column headers/names
  • Each subsequent row must follow the same structure
  • Data should be organized in a tabular format
  • Each column should contain consistent data types
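
Before uploading, you can check the first two requirements locally. The following is a minimal sketch using only Python's standard library. The function name `check_csv_structure` is hypothetical and not part of Lume. It checks only header names and row width, not data type consistency.

```python
import csv
import io


def check_csv_structure(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0]

    # The first row must contain column names, so none may be blank.
    if any(not name.strip() for name in header):
        problems.append("header row contains an empty column name")

    # Every later row must have the same number of fields as the header.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} fields, found {len(row)}"
            )
    return problems
```

To check a file on disk, read its contents first, for example `check_csv_structure(open("customers.csv", newline="").read())`, where `customers.csv` is a placeholder file name.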

There are two types of source files you can work with:

  1. Source Data: Your primary customer or business data that needs to be transformed
  2. Seed Data: External or internal enhancement data that can be used to enrich your source data. This includes:
    • Reference data (e.g., country codes, state abbreviations)
    • Lookup tables
    • Master data
    • Any non-customer data that helps enhance your primary dataset
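
To make the distinction concrete, the sketch below shows what enriching source data with a seed lookup table means in principle. It is an illustration only and does not describe how Lume performs the join. The function `enrich` and the column names in the usage note are hypothetical.

```python
import csv
import io


def enrich(source_text: str, seed_text: str, source_key: str, seed_key: str) -> list[dict]:
    """Add the seed table's columns to each source row with a matching key."""
    # Index the seed (lookup) rows by their key column.
    lookup = {row[seed_key]: row for row in csv.DictReader(io.StringIO(seed_text))}

    enriched = []
    for row in csv.DictReader(io.StringIO(source_text)):
        match = lookup.get(row[source_key], {})
        # Copy every seed column except the key itself; rows with no
        # match are kept unchanged.
        extra = {name: value for name, value in match.items() if name != seed_key}
        enriched.append({**row, **extra})
    return enriched
```

For example, with a source file containing `customer_id,country_code` and a seed file containing `code,country_name`, calling `enrich(source, seed, "country_code", "code")` adds a `country_name` value to each customer row whose code appears in the seed file.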

While Lume only requires a single record to generate mapping logic, providing larger data samples improves mapping accuracy through better pattern recognition.

Support for JSON and XML formats is coming soon! In the meantime, we recommend converting these files to CSV or reaching out to our support team for assistance.
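
If your JSON file is an array of flat objects, the conversion can be done with Python's standard library. The sketch below assumes that shape. Nested objects or arrays would need to be flattened first, and the function name `json_records_to_csv` is hypothetical.

```python
import csv
import io
import json


def json_records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text with a header row."""
    records = json.loads(json_text)
    if not isinstance(records, list) or not all(isinstance(r, dict) for r in records):
        raise ValueError("expected a JSON array of objects")

    # Collect column names in first-seen order so the header row is stable.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    out = io.StringIO()
    # restval fills in an empty cell when a record lacks a column.
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()
```

Write the returned text to a `.csv` file and upload it as you would any other source file.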

Need support for additional data formats? Contact the Lume team for assistance.

Target Schema

A target schema defines the desired output format for your transformed data. It uses YAML format to specify:

  • Target Model Name
  • Column Names
  • Test Rules
  • Business logic and descriptions

For more details on building target schemas, see our Building Target Schemas guide.

Key Components

Projects

A Project is your complete data transformation pipeline. It can:

  • Accept multiple file inputs (sources and seeds)
  • Include multiple transformation steps
  • Join and combine data
  • Produce final mapped output

Projects help you organize related transformations into logical sequences. Complex transformations can be broken down into manageable steps, making them easier to maintain and modify.

Project Versions

Project Versions is where you manage the different versions of your project and the runs associated with each version. Lume automatically snapshots a version of the project whenever an edit changes the generated code. Such edits include changes to:

  • Code
  • Source Schema
  • Target Schema

File Manager

The file manager is where you manage and access your uploaded data. You can use it to:

  • View metadata per model on row count, column count, and file size.
  • Insert, upsert, and remove source tables and source seed files.
  • Add additional context to the source table description to guide the AI generation.
  • Provide column level metadata around data type, nullability, and additional notes.
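
If you want to cross-check the figures shown in the file manager, or prepare nullability notes before filling in column metadata, a local profile of the CSV can help. This is a rough sketch, not Lume's own calculation. The function `profile_csv` is hypothetical, and it assumes that an empty cell counts as null and that the file is UTF-8 encoded. It does not infer data types.

```python
import csv
import io


def profile_csv(text: str) -> dict:
    """Summarize row count, column count, size, and per-column nullability."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    nullable = {name: False for name in columns}

    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            # Assumption: an empty or missing cell is treated as null.
            if not (row.get(name) or "").strip():
                nullable[name] = True

    return {
        "row_count": row_count,  # data rows, excluding the header
        "column_count": len(columns),
        "size_bytes": len(text.encode("utf-8")),
        "nullable": nullable,
    }
```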

Workbook

Lume generates a spreadsheet-style artifact called a Workbook. You don’t need to be a programmer to use it effectively. The Workbook provides:

  • Data lineage showing how fields map between source and target
  • Sample data previews for cursory visual inspections
  • Natural language explanations of the transformation logic
  • Interactive edit interface for adjusting or providing additional mapping context
  • AI Chat to explore data and gain a deeper understanding

Code

The Code section gives you insight into the SQL models Lume produced and the validation tests run against them. It includes:

  • Compiled Code
  • Data Preview
  • Lineage
  • Validation

Lineage

The Lineage view is a visual representation of table-level and column-level lineage. It helps you understand the relationships between the transformations that Lume’s AI engine created.

Data

Lume lets you review the full set of target data. You can scan the produced data to confirm that it passes a visual inspection.