Target schemas in Lume use YAML to define your desired output structure. A target schema requires a models section, and each model entry requires a name field and a columns section. Each column entry contains the following:
A field name
A clear description of the field’s business meaning and context
A set of tests to validate your transformed data
    models:
      - columns:
          - name: customer_name
            description: "The full legal name of the customer as it appears on official documents"
Write clear, specific descriptions that explain your business’s unique requirements and context. For example, specify if “revenue” means monthly recurring revenue, annual revenue, or revenue before returns. Learn more about writing effective descriptions in our Creating Field Descriptions guide.
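As a minimal sketch of the revenue case above, a description can state the definition directly. The model name, column name, and currency here are hypothetical and chosen only for illustration; the structure follows the models, columns, name, and description keys shown on this page.

```yaml
# Hypothetical model and column names, for illustration only.
models:
  - name: subscriptions
    columns:
      - name: revenue
        # Spells out which revenue is meant, the currency, and how returns
        # are treated, rather than a bare "Revenue" label.
        description: "Monthly recurring revenue in USD for the subscription, before returns and refunds are deducted"
```

A reader of this description can tell that annual revenue or net-of-returns revenue would be the wrong source value for the field.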
Within your target schema, you can define an enum set that triggers Lume’s classification module. The classification module assigns the transformed source data for your target field to one of your options when one fits. For example, the field category could use an enum set of Apparel, Electronics, and Perishable.
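A sketch of how the category field might look is given below. It assumes the enum set is expressed through the accepted_values test used in the other examples on this page; this page does not confirm that this is the key that triggers classification, so treat the exact syntax as an assumption. The description text is also illustrative.

```yaml
# Assumption: the enum set is declared with accepted_values, as in the
# test examples elsewhere on this page. Lume's actual key may differ.
models:
  - columns:
      - name: category
        description: "Product category assigned to each item"
        tests:
          - accepted_values:
              values: ["Apparel", "Electronics", "Perishable"]
```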
The YAML schema provides built-in test options: unique, not_null, accepted_values, and relationships. Here is an example using those tests for customers, orders, and payments models:
    models:
      - name: customers
        description: "Customer records and metadata"
        columns:
          - name: customer_id
            description: "Unique identifier for each customer"
            tests:
              - not_null
              - unique
          - name: customer_name
            description: "Full legal name of the customer"
            tests:
              - not_null
          - name: customer_type
            description: "Type of customer (e.g., individual, business)"
            tests:
              - accepted_values:
                  values: ["individual", "business"]
      - name: orders
        description: "All customer orders"
        columns:
          - name: order_id
            description: "Primary key for the order"
            tests:
              - not_null
              - unique
          - name: customer_id
            description: "Foreign key to customers"
            tests:
              - not_null
              - relationships:
                  to: ref('customers')
                  field: customer_id
          - name: status
            description: "Current status of the order"
            tests:
              - accepted_values:
                  values: ["pending", "shipped", "delivered", "cancelled"]
      - name: payments
        description: "Payments made toward orders"
        columns:
          - name: payment_id
            description: "Unique ID for the payment record"
            tests:
              - not_null
              - unique
          - name: order_id
            description: "Associated order ID"
            tests:
              - relationships:
                  to: ref('orders')
                  field: order_id
          - name: payment_method
            description: "Method of payment (e.g., credit card, PayPal)"
            tests:
              - accepted_values:
                  values: ["credit_card", "paypal", "bank_transfer"]
When working with ecommerce data, product catalogs often require specific schema structures to handle product attributes, variants, and categorization. Here’s an example schema that demonstrates common ecommerce patterns:
    models:
      - name: products
        description: "Core product information and metadata"
        columns:
          - name: product_id
            description: "Unique identifier for each product (SKU)"
            tests:
              - not_null
              - unique
          - name: product_name
            description: "Display name of the product"
            tests:
              - not_null
          - name: product_type
            description: "Main product category (e.g., physical, digital, subscription)"
            tests:
              - accepted_values:
                  values: ["physical", "digital", "subscription", "service"]
          - name: brand
            description: "Manufacturer or brand name"
            tests:
              - not_null
          - name: status
            description: "Current product status in the catalog"
            tests:
              - accepted_values:
                  values: ["active", "draft", "archived", "discontinued"]
          - name: category
            description: "Primary product category"
            tests:
              - accepted_values:
                  values: ["clothing", "electronics", "home", "beauty", "sports"]
      - name: product_variants
        description: "Product variations (size, color, etc.)"
        columns:
          - name: variant_id
            description: "Unique identifier for the variant"
            tests:
              - not_null
              - unique
          - name: product_id
            description: "Reference to parent product"
            tests:
              - not_null
              - relationships:
                  to: ref('products')
                  field: product_id
          - name: color
            description: "Product color variant"
            tests:
              - accepted_values:
                  values: ["red", "blue", "green", "black", "white", "yellow"]
          - name: size
            description: "Product size variant"
            tests:
              - accepted_values:
                  values: ["XS", "S", "M", "L", "XL", "XXL"]
          - name: material
            description: "Product material variant"
            tests:
              - accepted_values:
                  values: ["cotton", "polyester", "wool", "silk", "leather"]
Lume currently supports only single-level category hierarchies in the schema definition. If your product catalog requires multiple category levels (e.g., Clothing > Men > Shirts > T-Shirts), please contact Lume support for assistance with implementing a custom solution.