The Lie of Data-Driven Culture

This is the second part of a series on decision making, data, and strategy. You can find the first part here.


Data-driven decisions, KPIs, indicators, OKRs: all of these buzzwords have been circulating on the internet since everyone started talking about big data. I have noticed this trend especially within the product management community, where much of the conversation is about how to be data-driven. That is no surprise, since the amount of data generated by digitalization is unprecedented.


When the COVID-19 pandemic began in 2020, companies accelerated their adoption of technology and the digitalization of their customer and supply-chain interactions. Consumers moved dramatically towards online channels during the lockdowns, and companies responded in turn. The McKinsey & Company report “How COVID-19 has pushed companies over the technology tipping point—and transformed business forever” describes how COVID changed consumer behavior and forced companies to adapt: “Respondents are three times likelier now than before the crisis to say that at least 80 percent of their customer interactions are digital in nature.”


Furthermore, this shift in behavior forced founders and business owners to realize the value of being digital and how important it now is for a company’s survival. This has resulted in increased funding and investment in digital solutions, faster experimentation and innovation, and more advanced technologies and tools being added to company workflows.


McKinsey & Company Covid Report


In the old corporate office, decisions were made by the highest-paid person in the room dictating what the company should do. This mentality existed because companies were expertise-oriented. Nowadays, with the rise of data, companies are pushing more and more towards a data-driven mindset, because making decisions based on data reduces risk and increases the potential for useful outcomes. However, making decisions based on bad data is as risky as, or even riskier than, making decisions based on assumptions or untested hypotheses.


The Problem with Being Data-Driven


So where is the problem with a data-driven culture? Shouldn’t we all be more data-driven? The truth is that data does not tell you anything by itself. You do not open Google Analytics and have the data start telling you things. Data is just there; it is up to you to interpret it and pull insights out of it. Therein lies the problem. Humans are by nature poor statisticians and full of biases, and any decision based on gut feeling or instinct can be wildly wrong. Additionally, the piles of data that can be collected from all the tools available online can lead decision-makers into even worse decisions than those based solely on opinions or untested hypotheses.


An article in Harvard Business Review discussed the big data analytics market, which was valued at 215.7 billion dollars in 2021 (Figure 2), to nobody’s surprise. What was shocking is an IBM study estimating that the yearly cost of poor-quality data in the US alone was around 3.1 trillion dollars in 2016. My guess, and it is only a guess, is that this number could have doubled by 2021.



The Challenges of Being Data-Driven


Natural human biases in decision making, including confirmation, availability, and anchoring biases, combined with poor data can lead to disasters. The problem with data cultures is that it is up to you to interpret the data and pull insights out of it. Doing this without knowing what you are looking for, or why, will put you in a tough position. But there are more hidden problems in a data-driven culture behind the obvious ones. Here are some of them:


1. “Dirty” Data.


As I mentioned earlier, the rush to adopt technology has resulted in companies acquiring data from multiple sources such as social media, third-party data, and partnerships. Most of this data arrives unstructured and in different forms, and companies assume that a small amount of effort spent cleaning and organizing it will be enough to make it good quality.


But therein lies the problem, because the causes of dirty data are much more complicated than that. We tend to forget the human element in the equation: faulty processes, ad hoc data policies, poor discipline in capturing and storing data, and external factors outside a company’s control. More important than accumulating data are the ways and methods you use to gather and treat it. Whatever the industry, companies usually suffer from lots of inconsistencies. If you are not careful enough, data pollution can become a serious problem that costs you thousands of dollars.





In an article published by the World Economic Forum, David Opolon gives a case study on the danger of dirty data:


“A global financial institution conducted a big-data pilot project and identified what it thought was a pricing opportunity to increase margins by more than $50 million per year, or 10 percent of revenues. But the underlying data contained only invoice line items; it was missing important metadata about how the bank had calculated and applied fees. In the three months it took to correct the data quality issues and implement the pricing strategy, the company lost more than a quarter of its potential profits for the first year, equal to at least $15 million. It also lost agility in seizing an important opportunity.”


Data pollution comes in several forms. It ranges from hardware-related problems such as device pollution, where one user owns and uses multiple devices to browse the internet, to third-party tool pollution, where the bots of those tools trigger each other and inflate your data (for example, when you use multiple analytics tools). In the end, analyzing wrong data is more dangerous than you might think.
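To make the inflation concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the event records, the field names, and the idea that two tools report the same visit are made up, not taken from any real analytics product.

```python
from collections import namedtuple

# Hypothetical event record; every field name is an assumption for illustration.
Event = namedtuple("Event", ["user_id", "device_id", "tool", "page", "minute"])

events = [
    Event("alice", "alice-phone", "tool_a", "/pricing", 1),
    Event("alice", "alice-phone", "tool_b", "/pricing", 1),    # same visit, second tool
    Event("alice", "alice-laptop", "tool_a", "/pricing", 30),
    Event("alice", "alice-laptop", "tool_b", "/pricing", 30),  # same visit, second tool
    Event("bob", "bob-phone", "tool_a", "/home", 5),
    Event("bob", "bob-phone", "tool_b", "/home", 5),           # same visit, second tool
]


def summarize(events):
    """Compare naive counts with counts after basic de-duplication."""
    # Naive numbers: count every row and treat every device as a separate visitor.
    raw_page_views = len(events)
    devices = {e.device_id for e in events}

    # Cleaner numbers: collapse rows that differ only by the reporting tool,
    # and count people by user_id instead of device_id.
    distinct_views = {(e.user_id, e.device_id, e.page, e.minute) for e in events}
    users = {e.user_id for e in events}

    return {
        "raw_page_views": raw_page_views,
        "deduplicated_page_views": len(distinct_views),
        "visitors_by_device": len(devices),
        "visitors_by_user": len(users),
    }


print(summarize(events))
```

In this toy data the naive report shows 6 page views and 3 visitors, while the cleaned one shows 3 page views and 2 visitors. Real pipelines are messier, but the direction of the error is the same.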


2. Vanity Metrics & Data Manipulation.


Vanity metrics are data points that appear impressive to others but do not inform future strategy or reflect real success. These numbers are easy to manipulate, either on purpose or unintentionally, which makes it easy to imply success without producing results that are meaningful for the overall goals. Broad metrics like the number of users or page views are usually the ones used to convince investors and advisors to believe in your company.


There is no definitive list of vanity metrics for your team to avoid, because any metric can become a vanity metric when it is read only at the surface level. They are often large numbers, like the number of downloads, that impress others. Vanity metrics are notorious for being basic and often misleading.


Using surface-level metrics as an indicator of success is the shortest path to crashing the company, because it creates a culture of unrealistic goals that puts employees under insane pressure to keep management satisfied. Marketers, product managers, designers, and others tend to take this easy route with vanity metrics, not because they are lazy, but because they are under pressure to show immediate success to their superiors. Jill Avery, a senior lecturer at Harvard Business School and co-author of HBR’s Go To Market Tools, explains: “CFOs are under tremendous pressure to deliver quarterly earnings, and may not be patient for the longer-term effects of marketing to take hold. You’re asking them to believe in forward movement in a progression through a customer’s purchase journey, and that can take a long time.”


As far as the numbers go, vanity metrics look great on paper. But the sheen on these numbers fades when you use them to explain important business outcomes like ROI or customer lifetime value (CLTV); they become hollow digits that contribute little substance to your business goals.


3. The Illusion of Statistical Significance.


As mentioned at the beginning of this article, the pandemic pushed lots of companies to adopt more technology in their operations and business decision-making. The McKinsey & Company report shows explosive adoption: respondents expected it to take between 635 and 672 days to bring new technologies into operations and business decision-making, but organizations got it done in around 25.4 to 26.5 days.




This rapid adoption of technology in decision making required lots of experimentation. To quote the report: “At the organizations that experimented with new digital technologies during the crisis, and among those that invested more capital expenditures in digital technology than their peers did, executives are twice more likely to report outsize revenue growth than executives at other companies.” This revenue growth has encouraged a culture of experimentation.


The first things that come to my mind when the word experimentation is mentioned are 95% statistical significance and the p-value. These two buzzwords are really popular in the product and marketing communities. But beware: if they are misused, they can become a source of harm to your company.


CXL published an article titled “Statistical Significance Does Not Equal Validity”, which says: “Yes, your testing tool said you had a 95% statistical significance level (or higher). Well, that doesn’t mean much. Statistical significance and validity are not the same.” This is the key point. Lots of senior marketers, product managers, and others are always chasing statistical significance, and as soon as they have it they stop the experiment, because their analytics tool has told them they have reached a 95% statistical significance level.


In addition, ignoring the other factors in an experiment can lead you to make lots of decisions based on incorrect observations or false positives (Type I errors). A great example is Airbnb’s test of changing the maximum value of the price filter on the search page from $300 to $1000. The experiment reached a 95% statistical significance level for one of the variants after one week, but when they let it run longer, the early effect regressed toward the mean and the result stopped being significant, which suggests there was no real difference between the control and the treatment.
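To see why stopping at the first significant reading is risky, here is a small simulation sketch in Python. It is not Airbnb’s analysis. It simulates A/A tests in which both variants share the same true conversion rate, so any “significant” result is a false positive by construction. The traffic numbers, the 10% conversion rate, and the choice of a two-proportion z-test are my own assumptions for illustration.

```python
import math
import random


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test p-value using the normal approximation."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or all conversions) in both arms: no evidence
    z = (conv_a / n_a - conv_b / n_b) / se
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of the standard normal
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rates(experiments=500, days=14, visitors_per_day=200,
                         rate=0.10, alpha=0.05, seed=42):
    """Return (share flagged at any daily peek, share flagged on the last day)."""
    rng = random.Random(seed)
    flagged_at_any_peek = 0
    flagged_at_end = 0
    for _ in range(experiments):
        conv_a = conv_b = n_a = n_b = 0
        peek_hit = False
        p_final = 1.0
        for _ in range(days):
            # Both arms draw from the SAME conversion rate: there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(visitors_per_day))
            conv_b += sum(rng.random() < rate for _ in range(visitors_per_day))
            n_a += visitors_per_day
            n_b += visitors_per_day
            p_final = two_sided_p_value(conv_a, n_a, conv_b, n_b)
            if p_final < alpha:
                peek_hit = True  # a team that stops on first significance stops here
        flagged_at_any_peek += peek_hit
        flagged_at_end += p_final < alpha
    return flagged_at_any_peek / experiments, flagged_at_end / experiments


if __name__ == "__main__":
    peek_rate, end_rate = false_positive_rates()
    print(f"flagged at some daily peek: {peek_rate:.1%}")
    print(f"flagged only at the planned end: {end_rate:.1%}")
```

Checking once at the planned end keeps false positives near the 5% you signed up for. Checking every day and stopping at the first significant reading flags a clearly larger share of these no-effect experiments, which is the trap described above.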


What I’m trying to say is simple: technology does help make our decision-making process more efficient. But that doesn’t mean you can skip your own due diligence to make sure you are making the right decisions.




In the era of the internet, data has become an essential part of the decision-making framework. But if you are not careful in how you gather, treat, and employ your data, it might be the reason your company doesn’t make it. The challenges mentioned in this article are just the tip of the iceberg; I encourage you to educate yourself more about the risks of a data-driven culture.


But the question remains: how can we use data to guide our decision-making process? That is the topic of the third and final part of this series, The Era of Experience-Driven Culture.


I would like to finish this article with a quote from Darrell Huff’s book How to Lie with Statistics: “If you can’t prove what you want to prove, demonstrate something else and pretend that they are the same thing. In the daze that follows the collision of statistics with the human mind, hardly anybody will notice the difference. The semi-attached figure is a device guaranteed to stand you in good stead. It always has.”