If you are building a new schema or need to customize an existing one, this guide walks you through the process.

Schema Basics

Target schemas in Lume use YAML to define your desired output structure. A target schema requires a models section, and each model entry requires a name field and a columns section. Each column entry contains the following:

  1. A field name
  2. A clear description of the field’s business meaning and context
  3. A set of tests to validate your transformed data

models:
  - name: customers  # illustrative model name
    columns:
      - name: customer_name
        description: "The full legal name of the customer as it appears on official documents"
        tests:
          - not_null

Write clear, specific descriptions that explain your business’s unique requirements and context. For example, specify if “revenue” means monthly recurring revenue, annual revenue, or revenue before returns. Learn more about writing effective descriptions in our Creating Field Descriptions guide.
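As an illustration of the revenue case above, the sketch below contrasts a vague description with a specific one. The model name, the column names, and the wording of the descriptions are all hypothetical, chosen only to show the level of detail that helps.

```yaml
models:
  - name: subscriptions  # hypothetical model name
    columns:
      # Vague: does not say which kind of revenue is meant
      - name: revenue
        description: "Revenue"

      # Specific: states the period, the currency, and what is excluded
      - name: monthly_recurring_revenue
        description: "Monthly recurring revenue in USD for the subscription, before returns"
```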

Defining Enums for Classifications

Within your target schema, you can define an enum set on a field to trigger Lume’s classification module. The module classifies each transformed source value into one of the options you list, provided one of them fits. Here is an example that defines an enum set of Apparel, Electronics, and Perishable for the field category:

models:
  - name: products  # illustrative model name
    columns:
      - name: category
        description: Category of the product
        tests:
          - accepted_values:
              values:
                - Apparel
                - Electronics
                - Perishable

Classification for SQL projects is coming soon!

Defining Code Generation Language Preference

You can specify, per model, the language in which Lume’s AI engine generates code. Here is a quick example:

Language Specification
models:
  - name: orders
    language: python
    columns:
      - name: order_id

Lume currently supports code generation in both SQL and Python.
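Because the preference is set per model, a single schema can plausibly mix languages. The sketch below assumes that `sql` is the accepted value for SQL, mirroring the lowercase `python` shown above; this guide does not confirm the exact spelling, so check it against your Lume project. The model and column names are borrowed from the complete example later in this guide.

```yaml
models:
  - name: orders
    language: python  # code for this model is generated in Python
    columns:
      - name: order_id

  - name: payments
    language: sql  # assumed value; not confirmed by this guide
    columns:
      - name: payment_id
```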

Types of Default Tests

The YAML schema provides four built-in test options: unique, not_null, accepted_values, and relationships. The orders model in the Complete Example section uses all four.
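A minimal sketch of the four tests on an orders model, condensed from the complete example later in this guide (descriptions omitted for brevity):

```yaml
models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - not_null   # every row must have a value
          - unique     # no two rows may share a value

      - name: customer_id
        tests:
          - relationships:   # each value must exist in customers.customer_id
              to: ref('customers')
              field: customer_id

      - name: status
        tests:
          - accepted_values:   # only the listed values are valid
              values: ["pending", "shipped", "delivered", "cancelled"]
```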

Lume also provides built-in support for dbt-utils tests.
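As a rough sketch, a dbt-utils test would sit alongside the default tests in a column's tests list. The example assumes Lume accepts the standard `dbt_utils.` prefix and the package's usual arguments. `accepted_range` is a real dbt-utils test, but this guide does not list which dbt-utils tests Lume supports, and the `amount` column is hypothetical.

```yaml
models:
  - name: payments
    columns:
      - name: amount  # hypothetical column, not part of the complete example
        description: "Amount paid toward the order"
        tests:
          - not_null
          - dbt_utils.accepted_range:  # assumes standard dbt-utils syntax
              min_value: 0
              inclusive: true
```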

Complete Example

Here’s a complete target schema example:

models:
  - name: customers
    description: "Customer records and metadata"
    columns:
      - name: customer_id
        description: "Unique identifier for each customer"
        tests:
          - not_null
          - unique

      - name: customer_name
        description: "Full legal name of the customer"
        tests:
          - not_null

      - name: customer_type
        description: "Type of customer (e.g., individual, business)"
        tests:
          - accepted_values:
              values: ["individual", "business"]

  - name: orders
    description: "All customer orders"
    columns:
      - name: order_id
        description: "Primary key for the order"
        tests:
          - not_null
          - unique

      - name: customer_id
        description: "Foreign key to customers"
        tests:
          - not_null
          - relationships:
              to: ref('customers')
              field: customer_id

      - name: status
        description: "Current status of the order"
        tests:
          - accepted_values:
              values: ["pending", "shipped", "delivered", "cancelled"]

  - name: payments
    description: "Payments made toward orders"
    columns:
      - name: payment_id
        description: "Unique ID for the payment record"
        tests:
          - not_null
          - unique

      - name: order_id
        description: "Associated order ID"
        tests:
          - relationships:
              to: ref('orders')
              field: order_id

      - name: payment_method
        description: "Method of payment (e.g., credit card, PayPal)"
        tests:
          - accepted_values:
              values: ["credit_card", "paypal", "bank_transfer"]

Best Practices

  1. Clear Descriptions: Write clear, specific descriptions that explain the business meaning of each field
  2. Test Rules: Add test rules where appropriate to ensure data quality
  3. Consistent Naming: Use consistent field naming conventions throughout your schema

Remember: Focus on describing what each field means, not how to transform it. Lume handles the transformation logic automatically!

Property names in Lume’s API cannot contain periods (.).
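In practice, a dotted or nested source name needs a flattened target name. The sketch below assumes that "property names" covers the column names in your target schema; the column names are hypothetical, and an underscore is just one possible separator.

```yaml
models:
  - name: customers
    columns:
      # Not allowed: the name contains a period
      # - name: billing.city

      # Allowed: same meaning, no period
      - name: billing_city
        description: "City of the customer's billing address"
```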

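Lume currently does not support custom tests or macros. In a tests list such as the following, our_custom_macros_test would not be supported, while the built-in not_null and unique would be:

  • our_custom_macros_test
  • not_null
  • unique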