The quickest way to get started with flat, tabular data is to upload a sample CSV file with your desired output format. Lume will automatically generate a target schema for you! For nested or complex data structures, we recommend building your schema manually using this guide.
If you prefer to build your schema manually or need to customize an existing one, this guide will walk you through the process.
Schema Basics
Target schemas in Lume use JSON Schema format to define your desired output structure. Each field requires:
A field name
One or more data types
A clear description of the field’s business meaning and context
Basic Field
{
  "customer_name": {
    "type": ["string"],
    "description": "The full legal name of the customer as it appears on official documents"
  }
}
Write clear, specific descriptions that explain your business’s unique requirements and context. For example, specify if “revenue” means monthly recurring revenue, annual revenue, or revenue before returns. Learn more about writing effective descriptions in our Creating Field Descriptions guide.
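The three requirements listed above (a name, one or more types, and a description) can be checked mechanically before a schema is uploaded. The following is a minimal Python sketch, not part of Lume: `find_incomplete_fields` is a hypothetical helper, and the `revenue` field is an invented example that deliberately lacks a description.

```python
import json

# Hypothetical helper (not a Lume API): report fields that lack a type or a description.
def find_incomplete_fields(fields: dict) -> dict:
    problems = {}
    for name, spec in fields.items():
        missing = []
        if not spec.get("type"):
            missing.append("type")
        description = spec.get("description", "")
        if not isinstance(description, str) or not description.strip():
            missing.append("description")
        if missing:
            problems[name] = missing
    return problems

# "revenue" is an invented field that is missing its description on purpose.
schema_text = """
{
  "customer_name": {
    "type": ["string"],
    "description": "The full legal name of the customer as it appears on official documents"
  },
  "revenue": {
    "type": ["number"]
  }
}
"""
fields = json.loads(schema_text)
print(find_incomplete_fields(fields))  # {'revenue': ['description']}
```

A check like this only confirms that a description exists; whether it captures the business meaning still needs a human review.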
Field Types
Common JSON Schema types include:
string: Text data
number: Numeric values
integer: Whole numbers
boolean: True/false values
array: Lists of values
object: Nested structures
null: Missing or undefined values
Data Classification with Enums
Use enums to classify data into specific categories:
{
  "subscription_tier": {
    "type": ["string"],
    "description": "The customer's subscription level",
    "enum": ["free", "basic", "premium", "enterprise"]
  }
}
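To illustrate what the enum constraint means for an output value, here is a small Python sketch. It assumes standard JSON Schema semantics, where a value must equal one of the listed entries exactly (including case); `matches_enum` is a hypothetical helper and says nothing about how Lume maps source values onto the categories.

```python
SUBSCRIPTION_TIERS = ["free", "basic", "premium", "enterprise"]

def matches_enum(value, allowed) -> bool:
    """Return True only if value equals one of the allowed entries and has the same type."""
    # Comparing types as well avoids Python quirks such as True == 1.
    return any(type(value) is type(entry) and value == entry for entry in allowed)

print(matches_enum("premium", SUBSCRIPTION_TIERS))  # True
print(matches_enum("Premium", SUBSCRIPTION_TIERS))  # False: enum matching is case-sensitive
print(matches_enum("gold", SUBSCRIPTION_TIERS))     # False: not a listed category
```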
Validation Rules
JSON Schema provides validation keywords for strings (pattern, minLength, maxLength) and for numbers (minimum, maximum, exclusiveMaximum):
{
  "phone_number": {
    "type": ["string"],
    "description": "Customer's contact phone number",
    "pattern": "^\\+?[1-9]\\d{1,14}$",
    "minLength": 10,
    "maxLength": 15
  }
}
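To see how the three string constraints on phone_number combine, the sketch below reproduces them in Python. It is an approximation under stated assumptions: JSON Schema patterns follow ECMA 262 regular expressions, and Python's `re` module is close but not identical (the `re.ASCII` flag keeps `\d` limited to 0-9). `is_valid_phone` is a hypothetical helper, not a Lume function.

```python
import re

# Same pattern as the schema, written as a Python raw string.
PHONE_PATTERN = re.compile(r"^\+?[1-9]\d{1,14}$", re.ASCII)

def is_valid_phone(value: str, min_length: int = 10, max_length: int = 15) -> bool:
    """Apply minLength, maxLength, and pattern; a value must pass all three."""
    if not (min_length <= len(value) <= max_length):
        return False
    return PHONE_PATTERN.search(value) is not None

print(is_valid_phone("+14155552671"))      # True
print(is_valid_phone("12345"))             # False: matches the pattern but is shorter than minLength
print(is_valid_phone("+0123456789"))       # False: first digit must be 1-9
print(is_valid_phone("1234567890123456"))  # False: longer than maxLength
```

Note that the constraints are independent: the pattern alone would accept a five-digit number, and the length limits alone would accept letters.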
{
  "age": {
    "type": ["integer"],
    "description": "Customer's age in years",
    "minimum": 0,
    "maximum": 120
  },
  "success_rate": {
    "type": ["number"],
    "description": "Success rate as a decimal",
    "minimum": 0,
    "maximum": 1,
    "exclusiveMaximum": true
  }
}
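The numeric keywords can be read as simple range checks. The sketch below assumes the boolean form of exclusiveMaximum used in the example, which is JSON Schema draft-04 syntax: when true, the value must be strictly less than maximum. In draft-06 and later, exclusiveMaximum is itself a number, so check which form applies to your setup. `in_range` is a hypothetical helper.

```python
def in_range(value, minimum=None, maximum=None, exclusive_maximum=False) -> bool:
    """Check a number against minimum / maximum, with draft-04 style exclusiveMaximum."""
    # Booleans are not numbers in JSON, even though bool subclasses int in Python.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    if minimum is not None and value < minimum:
        return False
    if maximum is not None:
        if exclusive_maximum and value >= maximum:
            return False
        if not exclusive_maximum and value > maximum:
            return False
    return True

# age: 0 <= value <= 120
print(in_range(120, minimum=0, maximum=120))  # True
print(in_range(121, minimum=0, maximum=120))  # False

# success_rate: 0 <= value < 1
print(in_range(0.99, minimum=0, maximum=1, exclusive_maximum=True))  # True
print(in_range(1, minimum=0, maximum=1, exclusive_maximum=True))     # False
```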
Complete Example
Here’s a complete target schema example:
{
  "type": "object",
  "properties": {
    "customer_id": {
      "type": ["string"],
      "description": "Unique identifier for the customer",
      "pattern": "^CUST\\d{6}$"
    },
    "full_name": {
      "type": ["string"],
      "description": "Customer's full legal name"
    },
    "email": {
      "type": ["string", "null"],
      "description": "Primary contact email address",
      "format": "email"
    },
    "account_type": {
      "type": ["string"],
      "description": "Type of account held by the customer",
      "enum": ["personal", "business", "enterprise"]
    },
    "monthly_spend": {
      "type": ["number"],
      "description": "Average monthly spend in USD",
      "minimum": 0
    },
    "is_active": {
      "type": ["boolean"],
      "description": "Whether the customer account is currently active"
    }
  },
  "required": ["customer_id", "full_name", "account_type"]
}
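To make the roles of "required", "type", and "enum" concrete, here is a minimal Python sketch that checks one output record against a trimmed copy of the schema above. It is a simplified illustration, not a full JSON Schema validator and not how Lume validates data; `check_record` and the sample record are hypothetical, and pattern, format, and minimum are left out for brevity.

```python
# JSON Schema type name -> check on a decoded Python value.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
    "null": lambda v: v is None,
}

# Trimmed copy of the complete example (descriptions and some fields omitted).
SCHEMA = {
    "type": "object",
    "properties": {
        "customer_id": {"type": ["string"]},
        "full_name": {"type": ["string"]},
        "email": {"type": ["string", "null"]},
        "account_type": {"type": ["string"], "enum": ["personal", "business", "enterprise"]},
        "is_active": {"type": ["boolean"]},
    },
    "required": ["customer_id", "full_name", "account_type"],
}

def check_record(record: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problems found."""
    errors = []
    for name in schema.get("required", []):
        if name not in record:
            errors.append(f"missing required field: {name}")
    for name, value in record.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            continue  # fields that are not in the schema are ignored by this sketch
        types = spec.get("type", [])
        if isinstance(types, str):
            types = [types]
        if types and not any(TYPE_CHECKS[t](value) for t in types):
            errors.append(f"{name}: expected {' or '.join(types)}")
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: not one of {spec['enum']}")
    return errors

record = {
    "customer_id": "CUST000123",
    "email": None,            # allowed because the type list includes "null"
    "account_type": "personal",
    "is_active": "yes",       # wrong type: should be a boolean
}
print(check_record(record, SCHEMA))
# ['missing required field: full_name', 'is_active: expected boolean']
```

The null email passes because its type list is ["string", "null"], while the absent full_name fails because it is listed under "required".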
Best Practices
Clear Descriptions: Write clear, specific descriptions that explain the business meaning of each field
Appropriate Types: Use the most specific type(s) possible for each field
Validation Rules: Add validation rules where appropriate to ensure data quality
Required Fields: Mark essential fields as required in the schema
Consistent Naming: Use consistent field naming conventions throughout your schema
Remember: Focus on describing what each field means, not how to transform it. Lume handles the transformation logic automatically!
Advanced Schema Structures
Nested Objects
Your schema can include nested objects to represent complex data structures:
{
  "billing_address": {
    "type": ["object"],
    "description": "Customer's billing address details",
    "properties": {
      "street": {
        "type": ["string"],
        "description": "Street address including unit number"
      },
      "city": {
        "type": ["string"],
        "description": "City name"
      },
      "state": {
        "type": ["string"],
        "description": "State or province code",
        "minLength": 2,
        "maxLength": 2
      },
      "postal_code": {
        "type": ["string"],
        "description": "Postal or ZIP code"
      }
    }
  }
}
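When reviewing a nested schema, it can help to list every leaf field along with the objects that contain it. The Python sketch below walks "properties" recursively; `leaf_paths` is a hypothetical helper, and paths are returned as tuples rather than joined strings so that no separator character is assumed.

```python
def leaf_paths(fields: dict, prefix: tuple = ()) -> list:
    """Return one tuple per leaf field, e.g. ("billing_address", "city")."""
    paths = []
    for name, spec in fields.items():
        path = prefix + (name,)
        nested = spec.get("properties")
        if nested:
            paths.extend(leaf_paths(nested, path))  # descend into the nested object
        else:
            paths.append(path)
    return paths

# Same structure as the billing_address example, with descriptions omitted.
schema = {
    "billing_address": {
        "type": ["object"],
        "properties": {
            "street": {"type": ["string"]},
            "city": {"type": ["string"]},
            "state": {"type": ["string"], "minLength": 2, "maxLength": 2},
            "postal_code": {"type": ["string"]},
        },
    }
}
for path in leaf_paths(schema):
    print(path)
# ('billing_address', 'street')
# ('billing_address', 'city')
# ('billing_address', 'state')
# ('billing_address', 'postal_code')
```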
Arrays
Use arrays to represent lists of values or objects:
Simple Array
{
  "tags": {
    "type": ["array"],
    "description": "List of tags associated with the customer",
    "items": {
      "type": ["string"]
    }
  }
}
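The "items" keyword applies one schema to every element of the array. As a rough illustration, the Python sketch below reports which elements have a type that "items" does not allow; `invalid_items` is a hypothetical helper that covers only a few types and does not check nested properties.

```python
# Minimal type checks for the element types used in this sketch.
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
}

def invalid_items(values: list, items_spec: dict) -> list:
    """Return the indexes of elements whose type is not allowed by items_spec."""
    allowed = items_spec.get("type", [])
    return [
        index
        for index, value in enumerate(values)
        if not any(CHECKS[t](value) for t in allowed)
    ]

tags_items = {"type": ["string"]}
print(invalid_items(["vip", "newsletter", 42], tags_items))  # [2]

# The same idea applies when each element is an object.
object_items = {"type": ["object"]}
print(invalid_items([{"dept_id": 1}, "sales"], object_items))  # [1]
```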
Database-Based Schemas
Your schema can mirror database tables and relationships:
{
  "user": {
    "type": ["object"],
    "description": "User record from the database",
    "properties": {
      "id": {
        "type": ["integer"],
        "description": "Primary key from users table"
      },
      "departments": {
        "type": ["array"],
        "description": "Departments this user belongs to",
        "items": {
          "type": ["object"],
          "properties": {
            "dept_id": {
              "type": ["integer"],
              "description": "Foreign key to departments table"
            },
            "role": {
              "type": ["string"],
              "description": "User's role in this department",
              "enum": ["member", "lead", "manager"]
            }
          }
        }
      }
    }
  }
}
Field names cannot contain periods (.) because the period is a protected character in Lume. Use underscores or camelCase instead:
❌ user.first.name
✅ user_first_name
✅ userFirstName
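A schema converted from another system may contain dotted names at any depth. The Python sketch below is one way to find them before upload; `find_dotted_names` is a hypothetical helper, the sample schema is invented, and the underscore replacement is only a suggestion. It assumes "items" is a single schema object, as in the examples in this guide.

```python
def find_dotted_names(fields: dict, prefix: tuple = ()) -> list:
    """Return (location, suggested_name) pairs for every field name containing a period."""
    found = []
    for name, spec in fields.items():
        path = prefix + (name,)
        if "." in name:
            found.append(("/".join(path), name.replace(".", "_")))
        # Look inside nested objects and inside arrays of objects.
        found.extend(find_dotted_names(spec.get("properties", {}), path))
        found.extend(find_dotted_names(spec.get("items", {}).get("properties", {}), path))
    return found

# Invented schema with two offending names, one of them nested.
schema = {
    "user.first.name": {"type": ["string"]},
    "billing_address": {
        "type": ["object"],
        "properties": {"postal.code": {"type": ["string"]}},
    },
}
for location, suggestion in find_dotted_names(schema):
    print(f"{location} -> {suggestion}")
# user.first.name -> user_first_name
# billing_address/postal.code -> postal_code
```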