Data freshness

What is data freshness?

Data freshness refers to how up to date the underlying data is. For example, if a newsworthy event occurred and took one hour to appear in print, the freshness of the news would be one hour. Within a business model, data freshness is measured from the time usage occurs to the time the revenue calculated from it becomes available. So if customer usage occurred at 9am and it took until 11am for that usage to show up as revenue, the data freshness of the business model output would be two hours. Data freshness is often directly tied to metering, since ingesting, processing and loading usage can become a performance bottleneck.
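The arithmetic in the 9am to 11am example can be sketched in a few lines of Python. This is only an illustration: the function name `data_freshness`, the variable names and the calendar date are hypothetical and are not part of any particular metering system.

```python
from datetime import datetime, timedelta


def data_freshness(event_time: datetime, available_time: datetime) -> timedelta:
    """Return how long an event took to become visible in the output.

    Hypothetical helper for illustration only.
    """
    if available_time < event_time:
        raise ValueError("available_time must not precede event_time")
    return available_time - event_time


# Usage occurred at 9am and showed up as revenue at 11am.
# The date itself is arbitrary.
usage_occurred = datetime(2024, 1, 15, 9, 0)
revenue_visible = datetime(2024, 1, 15, 11, 0)

freshness = data_freshness(usage_occurred, revenue_visible)
print(freshness)  # 2:00:00
```

Returning a `timedelta` rather than a bare number keeps the unit explicit, so the same value can be reported in hours, minutes or milliseconds as needed.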

Data freshness isn’t always stated as a specific timeframe. One example is describing data freshness as “realtime”. The term is easily misconstrued because its meaning varies with the speaker: it can mean milliseconds, minutes, hours or even days in some contexts. The technical definition of realtime is that events are processed immediately as they arrive, with little to no latency. Under this definition, data freshness is measured in milliseconds. For the purposes of business modeling, realtime often means fresher than a daily cadence.
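To make the ambiguity concrete, the sketch below checks one freshness value against both meanings of realtime. The cutoffs are assumptions chosen for illustration: the text only says the technical definition is measured in milliseconds and the business modeling one is less than a daily cadence, so the one-second limit in particular is hypothetical, as are all the names used.

```python
from datetime import timedelta
from typing import Dict

# Assumed cutoffs for illustration only.
TECHNICAL_REALTIME_LIMIT = timedelta(seconds=1)  # assumption: "milliseconds" taken as sub-second
BUSINESS_REALTIME_LIMIT = timedelta(days=1)      # "less than a daily cadence"


def realtime_labels(freshness: timedelta) -> Dict[str, bool]:
    """Report whether a freshness value counts as realtime under each meaning."""
    return {
        "technical": freshness < TECHNICAL_REALTIME_LIMIT,
        "business_modeling": freshness < BUSINESS_REALTIME_LIMIT,
    }


# A two-hour freshness is realtime for business modeling but not technically.
print(realtime_labels(timedelta(hours=2)))
```

Stating the freshness as a duration, as in this sketch, avoids the misunderstanding that the bare word realtime invites.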

../../_images/realtime_definition.png

Why is data freshness important?

Freshness is important because it dictates how the data can be used. If the data isn’t fresh enough, certain functionality isn’t feasible. The following table lays out different use cases and how well each is supported at different levels of freshness.

../../_images/needs_by_latency.png