{"id":9491,"date":"2021-12-08T14:49:50","date_gmt":"2021-12-08T07:49:50","guid":{"rendered":"https:\/\/bestarion.com\/us\/?p=9491"},"modified":"2025-07-23T17:21:54","modified_gmt":"2025-07-23T10:21:54","slug":"recovering-data-quality-when-big-data-goes-wrong","status":"publish","type":"post","link":"https:\/\/bestarion.com\/us\/recovering-data-quality-when-big-data-goes-wrong\/","title":{"rendered":"Recovering Data Quality When Big Data Goes Wrong"},"content":{"rendered":"

In today’s data-driven world, having data to make decisions gives you a significant advantage… unless the data quality<\/a> is bad. See how Datafold can assist you.<\/p>\n

\"Recovering
Recovering Data Quality When Big Data Goes Wrong – Bestarion<\/figcaption><\/figure>\n

In today’s society, everything is based on data.<\/p>\n

Although John Mashey is credited with coining the term more than two decades ago, Big Data has moved to the forefront of technology only in the last ten years. In pursuit of it, companies have formed teams that use mathematical analysis and inductive statistics to uncover relationships and dependencies. The goal of these Big Data engineers is to use data to predict events and behaviors, giving the company a competitive advantage.<\/p>\n

To use data this way, it must first be sound and dependable. Making decisions based on faulty data can be worse than making them with no data at all.<\/p>\n

\u201cGood business decisions cannot be made with bad data.\u201d<\/em> – Uber Engineering<\/p>\n

In this article, I reflect on a lesson I learned at a past employer that tried to exploit data that turned out to be bad. Building on that lesson, we’ll then fast-forward to modern engineering practices that preserve data quality as part of the development lifecycle.<\/p>\n

<\/span>Taking a Look Back at the Real Estate Industry<\/span><\/h2>\n

Before Big Data, data warehouse (DW) and business intelligence (BI) tools were used to gain insight into the state of a company’s operations. Even earlier, information technologists frequently reinvented the wheel (in silos) in the hope of gaining a competitive advantage through custom programming.<\/p>\n

At that time, I was fortunate to be working for a leader in the real estate industry. Even though the company led its market category, it struggled to stay comfortably ahead of its competitors.<\/p>\n

One of the company’s areas of focus became the length of time required to define, justify, and defend the rent charged to tenants. Rent was not a flat cost per square foot. Additional data factors influenced the final price, which both parties had to agree on.<\/p>\n

Consider the following five data points:<\/p>\n