The quickest way to get started with flat, tabular data is to upload a sample CSV file in your desired output format. Lume will automatically generate a target schema for you! For nested or complex data structures, we recommend building your schema manually.

This guide walks you through building a schema manually or customizing an existing one.
Schema Basics  
Target schemas in Lume use the JSON Schema format to define your desired output structure. Each field requires:
A field name 
One or more data types 
A clear description of the field’s business meaning and context 
 
Basic Field
{  
  "customer_name" : {  
    "type" : [ "string" ],  
    "description" :  "The full legal name of the customer as it appears on official documents"  
  }  
}  
 
Write clear, specific descriptions that explain your business’s unique requirements and context. For example, specify whether “revenue” means monthly recurring revenue, annual revenue, or revenue before returns. Learn more about writing effective descriptions in our Creating Field Descriptions guide.
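The three per-field requirements listed above (a name, one or more types, a description) can be checked mechanically before a schema is submitted. The following is a minimal sketch, not part of Lume itself: the helper name `missing_field_parts` and the `revenue` field are hypothetical and exist only for illustration.

```python
def missing_field_parts(fields):
    """Return {field_name: [missing parts]} for fields lacking a type or description."""
    problems = {}
    for name, spec in fields.items():
        missing = []
        types = spec.get("type")
        # The examples in this guide write "type" as a non-empty list of type names.
        if not (isinstance(types, list) and types):
            missing.append("type")
        description = spec.get("description")
        if not (isinstance(description, str) and description.strip()):
            missing.append("description")
        if missing:
            problems[name] = missing
    return problems


fields = {
    "customer_name": {
        "type": ["string"],
        "description": "The full legal name of the customer as it appears on official documents",
    },
    # Hypothetical field that is missing its description.
    "revenue": {"type": ["number"]},
}

print(missing_field_parts(fields))  # {'revenue': ['description']}
```

A check like this only confirms that a description exists. Whether it explains the business meaning well still needs a human review.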
Field Types  
Common JSON Schema types include: 
string: Text data 
number: Numeric values 
integer: Whole numbers 
boolean: True/false values 
array: Lists of values 
object: Nested structures 
null: Missing or undefined values 
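To make the type names above concrete, here is a rough sketch of how each one lines up with a Python value once JSON has been parsed. The function `matches_type` is a hypothetical helper, and it simplifies one detail: recent JSON Schema drafts also treat a value such as 3.0 as an integer, which this sketch does not.

```python
def matches_type(value, type_names):
    """Return True if value fits at least one of the listed JSON Schema type names."""

    def is_number(v):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    checks = {
        "string": lambda v: isinstance(v, str),
        "number": is_number,
        "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
        "boolean": lambda v: isinstance(v, bool),
        "array": lambda v: isinstance(v, list),
        "object": lambda v: isinstance(v, dict),
        "null": lambda v: v is None,
    }
    return any(checks[name](value) for name in type_names)


print(matches_type(None, ["string", "null"]))  # True
print(matches_type(True, ["integer"]))         # False
```

Listing several types, as in `["string", "null"]`, means a value is accepted if it fits any one of them.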
 
Data Classification with Enums  
Use enums to classify data into specific categories: 
{  
  "subscription_tier" : {  
    "type" : [ "string" ],  
    "description" :  "The customer's subscription level" ,  
    "enum" : [ "free" ,  "basic" ,  "premium" ,  "enterprise" ]  
  }  
}  
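An enum restricts a field to an exact set of values, and JSON Schema compares them exactly, so "Premium" does not match "premium". The sketch below illustrates that behavior with a hypothetical helper, `enum_violations`; the sample records are made up.

```python
def enum_violations(record, fields):
    """Return {field_name: value} for values that are not in the field's enum."""
    violations = {}
    for name, spec in fields.items():
        allowed = spec.get("enum")
        if allowed is not None and name in record and record[name] not in allowed:
            violations[name] = record[name]
    return violations


fields = {
    "subscription_tier": {
        "type": ["string"],
        "description": "The customer's subscription level",
        "enum": ["free", "basic", "premium", "enterprise"],
    }
}

print(enum_violations({"subscription_tier": "premium"}, fields))  # {}
print(enum_violations({"subscription_tier": "Premium"}, fields))  # {'subscription_tier': 'Premium'}
```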
 
Validation Rules  
JSON Schema provides several validation options: 
{
  "phone_number" : {
    "type" : [ "string" ],
    "description" :  "Customer's contact phone number" ,
    "pattern" :  "^\\+?[1-9]\\d{1,14}$" ,
    "minLength" :  10 ,
    "maxLength" :  15
  }
}
{  
  "age" : {  
    "type" : [ "integer" ],  
    "description" :  "Customer's age in years" ,  
    "minimum" :  0 ,  
    "maximum" :  120  
  },  
  "success_rate" : {  
    "type" : [ "number" ],  
    "description" :  "Success rate as a decimal" ,  
    "minimum" :  0 ,  
    "maximum" :  1 ,  
    "exclusiveMaximum" :  true  
  }  
}  
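The effect of these keywords can be sketched in a few lines of Python. This is an illustration of the general JSON Schema semantics, not Lume's validator: `check_string` and `check_number` are hypothetical helpers, Python's `re` module stands in for the ECMA 262 regex dialect that JSON Schema specifies, and the boolean form of `exclusiveMaximum` used above is the draft-04 style (later drafts take a number instead). Which draft Lume applies is not stated in this guide.

```python
import re


def check_string(value, rules):
    """Return the names of the string keywords that value violates."""
    errors = []
    if "minLength" in rules and len(value) < rules["minLength"]:
        errors.append("minLength")
    if "maxLength" in rules and len(value) > rules["maxLength"]:
        errors.append("maxLength")
    # JSON Schema patterns are not implicitly anchored, so search rather than fullmatch.
    if "pattern" in rules and re.search(rules["pattern"], value) is None:
        errors.append("pattern")
    return errors


def check_number(value, rules):
    """Return the names of the numeric keywords that value violates."""
    errors = []
    if "minimum" in rules and value < rules["minimum"]:
        errors.append("minimum")
    if "maximum" in rules:
        # Draft-04 style: a boolean exclusiveMaximum makes the "maximum" bound strict.
        strict = rules.get("exclusiveMaximum") is True
        if value > rules["maximum"] or (strict and value == rules["maximum"]):
            errors.append("maximum")
    return errors


phone_rules = {"pattern": r"^\+?[1-9]\d{1,14}$", "minLength": 10, "maxLength": 15}
rate_rules = {"minimum": 0, "maximum": 1, "exclusiveMaximum": True}

print(check_string("+14155552671", phone_rules))  # []
print(check_string("0123", phone_rules))          # ['minLength', 'pattern']
print(check_number(1, rate_rules))                # ['maximum']
```

Note that with the exclusive bound, a success rate of exactly 1 is rejected. Drop `exclusiveMaximum` if 100% should be a valid value.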
 
Complete Example  
Here’s a complete target schema example: 
{
  "type" :  "object" ,
  "properties" : {
    "customer_id" : {
      "type" : [ "string" ],
      "description" :  "Unique identifier for the customer" ,
      "pattern" :  "^CUST\\d{6}$"
    },
    "full_name" : {  
      "type" : [ "string" ],  
      "description" :  "Customer's full legal name"  
    },  
    "email" : {  
      "type" : [ "string" ,  "null" ],  
      "description" :  "Primary contact email address" ,  
      "format" :  "email"  
    },  
    "account_type" : {  
      "type" : [ "string" ],  
      "description" :  "Type of account held by the customer" ,  
      "enum" : [ "personal" ,  "business" ,  "enterprise" ]  
    },  
    "monthly_spend" : {  
      "type" : [ "number" ],  
      "description" :  "Average monthly spend in USD" ,  
      "minimum" :  0  
    },  
    "is_active" : {  
      "type" : [ "boolean" ],  
      "description" :  "Whether the customer account is currently active"  
    }  
  },  
  "required" : [ "customer_id" ,  "full_name" ,  "account_type" ]  
}  
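Two details of the schema above are easy to miss: `required` only checks that a property is present, and `email` may be null because its type list includes "null". The sketch below shows the first point on a trimmed copy of the schema. The helper `missing_required` and the sample record are hypothetical.

```python
def missing_required(record, schema):
    """List required property names that are absent from the record."""
    return [name for name in schema.get("required", []) if name not in record]


# Trimmed copy of the complete example (descriptions omitted for brevity).
schema = {
    "type": "object",
    "properties": {
        "customer_id": {"type": ["string"]},
        "full_name": {"type": ["string"]},
        "email": {"type": ["string", "null"]},
        "account_type": {"type": ["string"], "enum": ["personal", "business", "enterprise"]},
    },
    "required": ["customer_id", "full_name", "account_type"],
}

# Made-up record: email is null (allowed by its type list), account_type is absent.
record = {"customer_id": "CUST000123", "full_name": "Ada Lovelace", "email": None}

print(missing_required(record, schema))  # ['account_type']
```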
 
Best Practices  
Clear Descriptions: Write clear, specific descriptions that explain the business meaning of each field
Appropriate Types: Use the most specific type(s) possible for each field
Validation Rules: Add validation rules where appropriate to ensure data quality
Required Fields: Mark essential fields as required in the schema
Consistent Naming: Use consistent field naming conventions throughout your schema
 
Remember: Focus on describing what each field means, not how to transform it. Lume handles the transformation logic automatically! 
 
Advanced Schema Structures  
Nested Objects  
Your schema can include nested objects to represent complex data structures: 
{  
  "billing_address" : {  
    "type" : [ "object" ],  
    "description" :  "Customer's billing address details" ,  
    "properties" : {  
      "street" : {  
        "type" : [ "string" ],  
        "description" :  "Street address including unit number"  
      },  
      "city" : {  
        "type" : [ "string" ],  
        "description" :  "City name"  
      },  
      "state" : {  
        "type" : [ "string" ],  
        "description" :  "State or province code" ,  
        "minLength" :  2 ,  
        "maxLength" :  2  
      },  
      "postal_code" : {  
        "type" : [ "string" ],  
        "description" :  "Postal or ZIP code"  
      }  
    }  
  }  
}  
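When reviewing a nested schema, it can help to list every leaf field together with the objects that contain it. The following is a small sketch of that traversal. `field_paths` is a hypothetical helper, and it returns paths as tuples rather than joined strings.

```python
def field_paths(properties, prefix=()):
    """Return a list of tuples, one per leaf field, following nested "properties"."""
    paths = []
    for name, spec in properties.items():
        path = prefix + (name,)
        nested = spec.get("properties")
        if nested:
            # An object with its own properties: descend one level.
            paths.extend(field_paths(nested, path))
        else:
            paths.append(path)
    return paths


schema = {
    "billing_address": {
        "type": ["object"],
        "description": "Customer's billing address details",
        "properties": {
            "street": {"type": ["string"], "description": "Street address including unit number"},
            "city": {"type": ["string"], "description": "City name"},
        },
    }
}

print(field_paths(schema))
# [('billing_address', 'street'), ('billing_address', 'city')]
```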
 
Arrays  
Use arrays to represent lists of values or objects: 
Simple Array
{  
  "tags" : {  
    "type" : [ "array" ],  
    "description" :  "List of tags associated with the customer" ,  
    "items" : {  
      "type" : [ "string" ]  
    }  
  }  
}  
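An array can also hold objects: `items` then describes a nested object with its own `properties`, the same shape used by the `departments` field in the database example in this guide. The sketch below parses such a schema to show where the per-item fields live. The `line_items`, `sku`, and `quantity` names are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical array-of-objects field, written in the same style as the guide's examples.
schema_text = """
{
  "line_items": {
    "type": ["array"],
    "description": "Products included in the order",
    "items": {
      "type": ["object"],
      "properties": {
        "sku": {
          "type": ["string"],
          "description": "Stock keeping unit of the product"
        },
        "quantity": {
          "type": ["integer"],
          "description": "Number of units ordered",
          "minimum": 1
        }
      }
    }
  }
}
"""

schema = json.loads(schema_text)
# Every element of the array is described once, under "items".
item_fields = sorted(schema["line_items"]["items"]["properties"])
print(item_fields)  # ['quantity', 'sku']
```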
 
Database-Based Schemas  
Your schema can mirror database tables and relationships: 
{  
  "user" : {  
    "type" : [ "object" ],  
    "description" :  "User record from the database" ,  
    "properties" : {  
      "id" : {  
        "type" : [ "integer" ],  
        "description" :  "Primary key from users table"  
      },  
      "departments" : {  
        "type" : [ "array" ],  
        "description" :  "Departments this user belongs to" ,  
        "items" : {  
          "type" : [ "object" ],  
          "properties" : {  
            "dept_id" : {  
              "type" : [ "integer" ],  
              "description" :  "Foreign key to departments table"  
            },  
            "role" : {  
              "type" : [ "string" ],  
              "description" :  "User's role in this department" ,  
              "enum" : [ "member" ,  "lead" ,  "manager" ]  
            }  
          }  
        }  
      }  
    }  
  }  
}  
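If the schema mirrors existing tables, the field definitions can be drafted from column metadata and the descriptions refined by hand afterwards. The following is a rough sketch under stated assumptions: the `SQL_TO_JSON` mapping, the `table_to_properties` helper, and the `nickname` column are all hypothetical, and a real database will have more column types than the handful listed here.

```python
# Assumed mapping from a few common SQL column types to JSON Schema type names.
SQL_TO_JSON = {
    "INTEGER": "integer",
    "BIGINT": "integer",
    "NUMERIC": "number",
    "VARCHAR": "string",
    "TEXT": "string",
    "BOOLEAN": "boolean",
}


def table_to_properties(columns):
    """Build schema properties from (name, sql_type, nullable, description) tuples."""
    properties = {}
    for name, sql_type, nullable, description in columns:
        types = [SQL_TO_JSON[sql_type.upper()]]
        if nullable:
            # Nullable columns also accept JSON null.
            types.append("null")
        properties[name] = {"type": types, "description": description}
    return properties


columns = [
    ("id", "INTEGER", False, "Primary key from users table"),
    ("nickname", "varchar", True, "Optional display name chosen by the user"),
]

print(table_to_properties(columns)["nickname"]["type"])  # ['string', 'null']
```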
 
Field names cannot contain periods (.) as this is a protected character in Lume. Use underscores or camelCase instead: 
❌ user.first.name 
✅ user_first_name 
✅ userFirstName
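
The naming rule above is simple to check before uploading a schema. Here is a minimal sketch that finds offending names at any nesting depth and suggests an underscore replacement. Both helpers, `names_with_periods` and `to_underscores`, are hypothetical.

```python
def names_with_periods(properties):
    """Return every field name containing a period, including nested fields."""
    bad = []
    for name, spec in properties.items():
        if "." in name:
            bad.append(name)
        # Descend into nested object properties and into array item properties.
        for nested in (spec.get("properties"), spec.get("items", {}).get("properties")):
            if nested:
                bad.extend(names_with_periods(nested))
    return bad


def to_underscores(name):
    """Replace periods with underscores, e.g. user.first.name -> user_first_name."""
    return name.replace(".", "_")


schema = {
    "user.first.name": {"type": ["string"], "description": "User's given name"},
    "user": {
        "type": ["object"],
        "description": "User record",
        "properties": {
            "last.name": {"type": ["string"], "description": "User's family name"},
        },
    },
}

print(names_with_periods(schema))         # ['user.first.name', 'last.name']
print(to_underscores("user.first.name"))  # user_first_name
```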