Core Concepts
Understanding these foundational concepts will help you make the most of Lume’s capabilities. This guide introduces the key components and how they work together.
Data Pipeline Basics
Source Data
The input data you want to transform
Target Schema
The desired structure for your output data
Source Data
Source data is any user-provided data that you want to interpret or transform. Lume supports various structured and semi-structured formats:
- JSON
- CSV
- Excel
- And more
While Lume requires only a single record to generate mapping logic, larger data samples improve mapping accuracy through better pattern recognition.
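To make this concrete, here is a hypothetical source record as it might arrive from a CSV or JSON feed. The field names and values are invented for this guide (they are not part of Lume's API), and the record is written as a Python dict so the later sketches can reuse it:

```python
# A hypothetical source record, e.g. one row of a CSV or one JSON object.
# All field names and values here are invented for illustration.
source_record = {
    "first name": "Ada",
    "last name": "Lovelace",
    "email_address": "ada@example.com",
    "signup date": "12/10/1842",  # non-ISO date format the mapping must normalize
}
```

A larger sample, such as a list of many such records, gives the mapping generator more patterns to learn from (date formats, optional fields, and so on).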
Need support for additional data formats? Contact the Lume team for assistance!
Target Schema
A target schema defines the desired output format for your transformed data. It uses the JSON Schema format to specify:
- Expected data types
- Field requirements
- Data validation rules
- Format specifications
Remember: Property names in Lume’s API cannot contain periods (.).
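As a minimal, hypothetical illustration (the field names are invented for this guide, and the schema is written as a Python dict so it can be reused in later sketches), a target schema might look like this. Note that every property name uses underscores rather than periods:

```python
# A hypothetical target schema in JSON Schema form, written as a Python dict.
# Field names are illustrative; none contain periods.
target_schema = {
    "type": "object",
    "properties": {
        "full_name": {"type": "string"},
        "email": {"type": "string", "format": "email"},
        "signup_date": {"type": "string", "format": "date"},  # ISO 8601: YYYY-MM-DD
    },
    "required": ["full_name", "email"],  # field requirements
}
```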
Don’t know JSON Schema? Lume can automatically generate a target schema from a sample CSV file containing your desired output format. This makes it easy for non-technical users to define their data requirements.
Pipeline Components
Flows
Orchestrate your data transformation journey
Mapping
AI-powered data transformation
Runs
Execute and monitor your transformations
Flows
A flow is your complete data transformation pipeline. It can:
- Accept multiple data inputs
- Include multiple transformation steps
- Join and combine data
- Produce final mapped output
Flows help you organize related transformations into logical sequences. Complex transformations can be broken down into manageable steps, making them easier to maintain and modify.
Mapping
Lume generates Python code to transform your data, but you don’t need to be a programmer to use it effectively (a sketch of what generated mapping code can look like appears after the lists below). The platform provides:
For Excel Users:
- Visual data lineage showing how fields map between source and target
- Sample data previews at each transformation step
- Natural language explanations of the transformation logic
- Interactive workshopping interface for adjusting mappings
For Developers:
- Python-based transformation code (with SQL, dbt, and JSONata support coming soon)
- Version control for all mapping logic
- Direct code editing capabilities
- Test-driven development support
Whether you’re Excel-proficient or a seasoned developer, Lume provides the tools you need to understand and control your data transformations.
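To give a feel for the generated code, here is a minimal sketch of a mapping function for the hypothetical source record and target schema shown earlier. The code and the function name (map_record) are invented for this guide; actual Lume output will differ:

```python
from datetime import datetime

def map_record(source: dict) -> dict:
    """Transform one hypothetical source record into the target shape."""
    # Combine the split name fields into a single full_name.
    full_name = f"{source['first name']} {source['last name']}".strip()

    # Normalize the non-ISO date (DD/MM/YYYY) into ISO 8601 (YYYY-MM-DD).
    signup_date = datetime.strptime(source["signup date"], "%d/%m/%Y").date().isoformat()

    return {
        "full_name": full_name,
        "email": source["email_address"],
        "signup_date": signup_date,
    }
```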
Access the Schema Transform node within your flow to workshop and improve your mapper logic. Test changes with sample data before deployment.
Runs
A run represents the actual execution of your flow. Each run tracks:
- Input source data
- Output mapped data
- Mapper version used
- Real-time execution status
- Validation results
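One way to picture the metadata a run carries is as a simple record type. This dataclass is purely illustrative of the fields listed above; it is not Lume's API:

```python
from dataclasses import dataclass, field

@dataclass
class RunRecord:
    """Illustrative shape of the metadata tracked for each run (not Lume's API)."""
    run_id: str
    mapper_version: str  # which version of the mapping logic was used
    status: str          # e.g. "queued", "running", "succeeded", "failed"
    input_records: list = field(default_factory=list)        # source data consumed
    output_records: list = field(default_factory=list)       # mapped data produced
    validation_results: dict = field(default_factory=dict)   # per-field pass/fail
```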
Quality Control
Validation
Ensure data quality and accuracy
Monitoring
Track performance and catch issues
Iteration
Improve and refine your pipelines
Validation
Lume provides comprehensive validation capabilities:
- Schema compliance checking
- Data format verification
- Required field validation
- Custom validation rules
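For example, schema compliance checking can be pictured with the open-source jsonschema package. This sketch validates the output of the hypothetical map_record function against the target_schema defined earlier; it illustrates the idea rather than Lume's internal validator:

```python
from jsonschema import validate, ValidationError

def is_schema_compliant(record: dict, schema: dict) -> bool:
    """Return True if the record satisfies the JSON Schema, else False."""
    try:
        validate(instance=record, schema=schema)  # raises on any violation
        return True
    except ValidationError:
        return False

mapped = map_record(source_record)
assert is_schema_compliant(mapped, target_schema)
```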
Monitoring
Monitor your data transformations through:
- Real-time run status
- Field-level validation results
- Macro statistics
- Performance metrics
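As a toy illustration of field-level results, per-record validation outcomes could be aggregated into a pass rate per field. The data structure here is invented for this guide, not Lume's metrics API:

```python
from collections import Counter

def field_pass_rates(results: list[dict]) -> dict:
    """Compute the fraction of records in which each field validated successfully.

    `results` is a list of per-record dicts mapping field name -> bool,
    an invented structure used only for illustration.
    """
    passes, totals = Counter(), Counter()
    for record_result in results:
        for field_name, ok in record_result.items():
            totals[field_name] += 1
            passes[field_name] += int(ok)
    return {name: passes[name] / totals[name] for name in totals}

# Example: "email" validated in 2 of 3 records -> pass rate of about 0.67.
rates = field_pass_rates([
    {"email": True, "signup_date": True},
    {"email": False, "signup_date": True},
    {"email": True, "signup_date": True},
])
```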
Iteration
Improve your pipelines through:
- Direct code inspection
- Interactive workshopping
- Test-driven development
- Version control
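For test-driven development, a small pytest-style test can pin down expected behavior before you rework mapping logic. This sketch reuses the hypothetical map_record function from earlier:

```python
def test_map_record_normalizes_name_and_date():
    # Arrange: a hypothetical source record with split name and non-ISO date.
    source = {
        "first name": "Ada",
        "last name": "Lovelace",
        "email_address": "ada@example.com",
        "signup date": "12/10/1842",
    }

    # Act: run it through the mapping function.
    result = map_record(source)

    # Assert: fields are combined and the date is ISO 8601.
    assert result["full_name"] == "Ada Lovelace"
    assert result["signup_date"] == "1842-10-12"
```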