Metadata

What is metadata?

The technical definition of metadata is data that gives information about other data. This is an apt description, but it can be a bit confusing. Within bframe, metadata can be described as data that gives information about a specific business modeling or administrative object. For example, the id column is a piece of metadata that doesn’t have any business value within a SaaS operation, but it gives valuable information about the associated row. It’s much easier to find a specific row using the id than by scanning the entire table manually. Metadata offers shortcuts or additional functionality to a data model. Because bframe lives in the data warehouse, metadata is one of the primary levers for incorporating a new use case. Almost all bframe tables include some type of metadata, as the customers table below demonstrates.

Customers table description

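| column_name    | data_type | is_nullable | is_metadata |
|----------------|-----------|-------------|-------------|
| id             | BIGINT    | NO          | true        |
| org_id         | INTEGER   | NO          | true        |
| env_id         | INTEGER   | NO          | true        |
| branch_id      | INTEGER   | NO          | true        |
| durable_key    | VARCHAR   | NO          | true        |
| ingest_aliases | VARCHAR[] | YES         | false       |
| created_at     | TIMESTAMP | YES         | true        |
| effective_at   | TIMESTAMP | YES         | true        |
| ineffective_at | TIMESTAMP | YES         | true        |
| name           | VARCHAR   | NO          | false       |
| archived_at    | TIMESTAMP | YES         | true        |
| version        | INTEGER   | NO          | true        |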
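To make the id example concrete, here is a minimal sketch of looking up a row by its id. It is not bframe's own code. It assumes an in-memory SQLite database standing in for the warehouse, TEXT columns in place of TIMESTAMP and VARCHAR[], and invented sample rows. The find_customer helper is hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the data warehouse bframe runs in.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        id             INTEGER NOT NULL,  -- BIGINT in the table description
        org_id         INTEGER NOT NULL,
        env_id         INTEGER NOT NULL,
        branch_id      INTEGER NOT NULL,
        durable_key    TEXT    NOT NULL,
        ingest_aliases TEXT,              -- VARCHAR[] in the description; plain text here
        created_at     TEXT,
        effective_at   TEXT,
        ineffective_at TEXT,
        name           TEXT    NOT NULL,
        archived_at    TEXT,
        version        INTEGER NOT NULL
    )
    """
)

# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO customers (id, org_id, env_id, branch_id, durable_key, name, version) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 1, "cust_a", "Acme", 0),
        (2, 1, 1, 1, "cust_b", "Beta LLC", 0),
    ],
)


def find_customer(conn, customer_id):
    """Return (id, durable_key, name) for the given id, or None if absent."""
    return conn.execute(
        "SELECT id, durable_key, name FROM customers WHERE id = ?",
        (customer_id,),
    ).fetchone()
```

The id carries no business meaning, yet it is what lets find_customer(conn, 2) return a single row without reading the business columns at all.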

Why is metadata important?

The metadata stored within bframe tables enables a plethora of functionality. A few salient use cases are uniqueness, slowly changing dimensions and branching logic. More generally, metadata is one of the reasons modern data modeling is possible. File formats, incremental updates and data catalogs are just a few examples of things that depend on additional data about the data. Without metadata, far less information processing would be practical.
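As an illustration of the slowly changing dimension use case, the sketch below picks the version of a customer that was in effect at a given time. It rests on assumptions that this page does not confirm: rows sharing a durable_key are treated as versions of the same customer, effective_at and ineffective_at are read as a half-open interval, and a missing bound means unbounded. The sample rows and the as_of helper are hypothetical.

```python
from datetime import datetime

# Invented sample data: two versions of one customer, tied together by durable_key.
rows = [
    {
        "id": 1,
        "durable_key": "cust_a",
        "name": "Acme",
        "effective_at": datetime(2023, 1, 1),
        "ineffective_at": datetime(2023, 6, 1),
    },
    {
        "id": 2,
        "durable_key": "cust_a",
        "name": "Acme Corp",
        "effective_at": datetime(2023, 6, 1),
        "ineffective_at": None,  # assumption: None means "still in effect"
    },
]


def as_of(rows, durable_key, at):
    """Return the row for durable_key in effect at `at`, or None.

    Assumes the interval is [effective_at, ineffective_at).
    """
    for row in rows:
        if row["durable_key"] != durable_key:
            continue
        started = row["effective_at"] is None or row["effective_at"] <= at
        not_ended = row["ineffective_at"] is None or at < row["ineffective_at"]
        if started and not_ended:
            return row
    return None
```

Neither timestamp describes the customer as a business entity, but together they let the same table answer what the customer looked like on a given date.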

How is metadata modeled in bframe?

Metadata can exist as a column or a table. There is no wrong answer on how to store it, just tradeoffs. At the time of writing, all metadata within bframe is colocated with each model as columns. This means that when new rows are created, metadata will often be a required field. This isn’t always the case, but even when a metadata column is optional it shouldn’t be ignored. Unfortunately, the amount of metadata creates some complexity overhead, but this can be seen as a one-time cost since the majority of use cases can be generated automatically.
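One way to picture automatic generation is a small helper that wraps business fields with the metadata columns before a row is written. This is a sketch under stated assumptions, not bframe's mechanism. The with_metadata function, its parameters, and the defaults (version starting at 0, created_at set to the current UTC time) are all hypothetical. The list of required columns follows the non-nullable metadata columns in the customers table description.

```python
from datetime import datetime, timezone

# Non-nullable metadata columns from the customers table description.
REQUIRED_METADATA = ("id", "org_id", "env_id", "branch_id", "durable_key", "version")


def with_metadata(business_fields, *, row_id, org_id, env_id, branch_id,
                  durable_key, version=0, now=None):
    """Return a full row: the business fields plus generated metadata columns.

    Hypothetical helper; the defaults are illustrative assumptions.
    """
    metadata = {
        "id": row_id,
        "org_id": org_id,
        "env_id": env_id,
        "branch_id": branch_id,
        "durable_key": durable_key,
        "created_at": now or datetime.now(timezone.utc),
        "effective_at": None,
        "ineffective_at": None,
        "archived_at": None,
        "version": version,
    }
    # Refuse business fields that would silently overwrite a metadata column.
    clashes = sorted(set(business_fields) & set(metadata))
    if clashes:
        raise ValueError(f"business fields collide with metadata: {clashes}")
    missing = [col for col in REQUIRED_METADATA if metadata[col] is None]
    if missing:
        raise ValueError(f"required metadata is missing: {missing}")
    return {**metadata, **business_fields}
```

With a helper like this, the caller supplies only the business fields, for example with_metadata({"name": "Acme"}, row_id=1, org_id=1, env_id=1, branch_id=1, durable_key="cust_a"), and the metadata overhead is paid once in one place.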

The corpus of metadata within bframe is too varied for a one-size-fits-all description. The following sections describe the different types of metadata, their use cases and how to model them.