
Validation

How is data validated?


Last updated 7 months ago

Overview

When collating data from multiple sources, the data types of a column may not match. For example, a column may be defined as a string in one source and as an int in another. Preen attempts to coerce each column to the most common data type across all sources, using a majority voting algorithm. If Preen cannot determine the data type of a column, it errors out and requires manual intervention.
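To make the idea of picking the most common type concrete, here is a minimal Go sketch. It is not Preen's actual implementation: the function name majorityType, the use of plain strings for type names, and the choice to treat a tie as an error are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// majorityType returns the type name reported most often for one column.
// Hypothetical helper for illustration only; not taken from Preen's source.
func majorityType(observed []string) (string, error) {
	if len(observed) == 0 {
		return "", errors.New("no type observations for column")
	}

	// Count one vote per source for each reported type.
	counts := make(map[string]int)
	for _, t := range observed {
		counts[t]++
	}

	// Find the type with the most votes and track whether the top count is shared.
	best, bestCount, tied := "", 0, false
	for t, n := range counts {
		switch {
		case n > bestCount:
			best, bestCount, tied = t, n, false
		case n == bestCount:
			tied = true
		}
	}

	// Assumption: a tie means the type cannot be determined, so report an error.
	if tied {
		return "", fmt.Errorf("ambiguous column type: tie at %d votes", bestCount)
	}
	return best, nil
}

func main() {
	// Three sources report a type for the same column.
	t, err := majorityType([]string{"string", "int", "int"})
	fmt.Println(t, err) // prints "int <nil>"

	// Two sources disagree, so no type wins and an error is returned.
	_, err = majorityType([]string{"string", "int"})
	fmt.Println(err != nil) // prints "true"
}
```

In this sketch a tie is surfaced as an error rather than resolved silently, which mirrors the behaviour described above of requiring manual intervention when a type cannot be determined.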

Note: There will be cases where you need to manually cast the data types of the columns in your model.

We store the results of the validation step in a DuckDB table called preen_information_schema. You can use this table to inspect the results of the validation step and to cast the data types of the columns in your model.

CLI Commands

preen source validate

Code References

majority voting algorithm
metadata.go
columns.go