Understanding these foundational concepts will help you make the most of Lume’s capabilities. This guide introduces the key components and how they work together.

Data Pipeline Basics

Source Data

Source data is any user-provided data that you want to interpret or transform. Lume supports various structured and semi-structured formats:

  • JSON
  • CSV
  • Excel
  • And more

While Lume only requires a single record to generate mapping logic, providing larger data samples improves mapping accuracy through better pattern recognition.
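To see why a larger sample helps, consider that a single record cannot reveal which fields are optional or which formats vary. A quick sketch of that idea in plain Python (record shapes and field names here are invented for illustration):

```python
import json

# Two sample source records. A single record is enough to generate
# mapping logic, but a larger sample exposes optional fields and
# format variations that one record alone cannot show.
sample = json.loads("""
[
  {"first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.com"},
  {"first_name": "Alan", "last_name": "Turing"}
]
""")

# Fields present in every record vs. fields present in only some.
all_keys = [set(record) for record in sample]
required = set.intersection(*all_keys)
optional = set.union(*all_keys) - required
```

With only the first record, `email` would look mandatory; the second record reveals it is optional, which is exactly the pattern recognition a larger sample enables.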

Need support for additional data formats? Contact the Lume team for assistance!

Target Schema

A target schema defines the desired output format for your transformed data. It uses JSON Schema format to specify:

  • Expected data types
  • Field requirements
  • Data validation rules
  • Format specifications

Remember: Property names in Lume’s API cannot contain periods (.).

Don’t know JSON Schema? Lume can automatically generate a target schema from a sample CSV file containing your desired output format. This makes it easy for non-technical users to define their data requirements.
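For readers new to JSON Schema, here is a minimal target schema expressed as a Python dictionary. The field names are hypothetical; note that none of the property names contain a period, per the constraint above:

```python
# A minimal target schema in JSON Schema form (field names invented).
# Property names must not contain periods (.) in Lume's API.
target_schema = {
    "type": "object",
    "properties": {
        "full_name": {"type": "string"},              # expected data type
        "email": {"type": "string", "format": "email"},  # format specification
        "age": {"type": "integer", "minimum": 0},     # validation rule
    },
    "required": ["full_name", "email"],               # field requirements
}
```

Each bullet above maps to a JSON Schema keyword: `type` for data types, `required` for field requirements, keywords like `minimum` for validation rules, and `format` for format specifications.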

Pipeline Components

Flows

A flow is your complete data transformation pipeline. It can:

  • Accept multiple data inputs
  • Include multiple transformation steps
  • Join and combine data
  • Produce final mapped output

Flows help you organize related transformations into logical sequences. Complex transformations can be broken down into manageable steps, making them easier to maintain and modify.
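The step-by-step structure of a flow can be sketched as a sequence of small transformation functions applied in order. This is a conceptual illustration only, not Lume's actual API; the step names and record fields are invented:

```python
# Each step is a small, independently maintainable transformation.
def split_name(record):
    first, last = record["name"].split(" ", 1)
    return {**record, "first_name": first, "last_name": last}

def normalize_email(record):
    return {**record, "email": record["email"].lower()}

# A "flow" applies its steps in sequence to produce the final output.
def run_flow(record, steps):
    for step in steps:
        record = step(record)
    return record

result = run_flow(
    {"name": "Ada Lovelace", "email": "Ada@Example.com"},
    [split_name, normalize_email],
)
```

Breaking the pipeline into named steps like this is what makes a complex transformation easier to maintain: each step can be inspected, tested, or replaced on its own.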

Mapping

Lume generates Python code to transform your data, but you don’t need to be a programmer to use it effectively. The platform provides:

For Excel Users:

  • Visual data lineage showing how fields map between source and target
  • Sample data previews at each transformation step
  • Natural language explanations of the transformation logic
  • Interactive workshopping interface for adjusting mappings

For Developers:

  • Python-based transformation code (with SQL, dbt, and JSONata support coming soon)
  • Version control for all mapping logic
  • Direct code editing capabilities
  • Test-driven development support

Whether you’re Excel-proficient or a seasoned developer, Lume provides the tools you need to understand and control your data transformations.

Access the Schema Transform node within your flow to workshop and improve your mapper logic. Test changes with sample data before deployment.
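Testing changes with sample data before deployment boils down to running the mapper on a known input and comparing against the expected output. A sketch of that workflow, with a hypothetical mapping function shaped like generated transformation code:

```python
# Illustrative shape of a mapping function: source record in, record
# conforming to the target schema out. Field names are hypothetical.
def map_record(source):
    return {
        "full_name": f"{source['first_name']} {source['last_name']}".strip(),
        "email": source.get("email", "").lower(),
    }

# Workshop the logic against a sample record before deploying.
sample_input = {"first_name": "Ada", "last_name": "Lovelace",
                "email": "Ada@Example.com"}
mapped = map_record(sample_input)
```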

Runs

A run represents the actual execution of your flow. Each run tracks:

  • Input source data
  • Output mapped data
  • Mapper version used
  • Real-time execution status
  • Validation results
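The fields a run tracks can be pictured as a simple record type. This is a conceptual model of the list above, not Lume's actual data model:

```python
from dataclasses import dataclass, field

# Conceptual model of what a single run tracks (illustrative only).
@dataclass
class Run:
    source_data: list                 # input source data
    mapped_data: list                 # output mapped data
    mapper_version: str               # mapper version used
    status: str = "queued"            # real-time execution status
    validation_results: dict = field(default_factory=dict)

run = Run(source_data=[{"id": 1}], mapped_data=[], mapper_version="v3")
run.status = "running"
```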

Quality Control

Validation

Lume provides comprehensive validation capabilities:

  • Schema compliance checking
  • Data format verification
  • Required field validation
  • Custom validation rules
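The first three checks in that list can be sketched in a few lines of plain Python. This is a simplified illustration of what schema-driven validation does, with hypothetical field names; real validation covers far more (formats, ranges, custom rules):

```python
# Minimal required-field and type validation against a schema fragment.
def validate(record, schema):
    errors = []
    for name in schema.get("required", []):          # required fields
        if name not in record:
            errors.append(f"missing required field: {name}")
    types = {"string": str, "integer": int}
    for name, spec in schema.get("properties", {}).items():  # types
        if name in record and not isinstance(record[name], types[spec["type"]]):
            errors.append(f"wrong type for field: {name}")
    return errors

schema = {
    "properties": {"email": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["email"],
}
errors = validate({"age": "42"}, schema)  # missing email, age is a string
```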

Monitoring

Monitor your data transformations through:

  • Real-time run status
  • Field-level validation results
  • Macro-level (aggregate) statistics across runs
  • Performance metrics
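To make the relationship between field-level results and macro statistics concrete, here is a small sketch that rolls per-field validation outcomes up into pass rates. The outcome labels and shape of the data are invented for illustration:

```python
from collections import Counter

# Field-level validation outcomes for three records (hypothetical).
results = [
    {"email": "ok", "age": "ok"},
    {"email": "missing", "age": "ok"},
    {"email": "ok", "age": "wrong_type"},
]

# Roll field-level results up into macro statistics per field.
stats = {f: Counter(r[f] for r in results) for f in results[0]}
pass_rate = {f: stats[f]["ok"] / len(results) for f in stats}
```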

Iteration

Improve your pipelines through:

  • Direct code inspection
  • Interactive workshopping
  • Test-driven development
  • Version control
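Test-driven iteration on a mapper amounts to pinning expected outputs for known sample records, then refining the mapping logic until the tests pass. A minimal sketch with a hypothetical mapper and a pytest-style test:

```python
# Hypothetical mapper under iteration.
def map_record(source):
    return {"full_name": f"{source['first_name']} {source['last_name']}"}

# Pin the expected output for a sample record; rerun after each change.
def test_map_record():
    out = map_record({"first_name": "Grace", "last_name": "Hopper"})
    assert out["full_name"] == "Grace Hopper"

test_map_record()
```

Because each mapper version is under version control, a failing test can be diagnosed by comparing the current logic against the last version that passed.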