4 Ways Poor Data Slows AI Projects — And How to Avoid Them

According to IBM, data-related challenges are the top reasons businesses have halted or canceled AI projects, and Forrester reported that data quality is among the biggest AI project challenges.

A comprehensive data strategy is crucial to the success of enterprise AI projects.


The problem arises because businesses traditionally lack a data strategy and instead gather ad-hoc inputs for specific purposes. A survey of C-level executives from companies such as Ford Motor Company and Johnson & Johnson showed that 50% of companies are not treating data as a business asset. Moreover, business leaders recognized that people and processes, more than any technology, were the problem.

What does the lack of a data strategy look like?

A company’s data strategy includes planning around how data will be collected, stored, cleaned and made accessible to the rest of the company. It entails many other variables, such as data privacy, governance and safety.

When it comes to deploying AI projects without a proper data strategy, companies run into three different problems:


  • Lack of data: there isn’t sufficient data to train the AI/ML algorithm.
  • Incomplete data: values are missing, or no information is stored for certain variables.
  • Limited or no access to data: data sits in silos within the organization and is not available to everyone in the company.
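
The first two problems in the list above can be measured before a project starts. The following is a minimal sketch, not a prescribed method: the function name `audit_records`, the field names, and the `min_rows` threshold are all hypothetical and would need to be chosen for the data set at hand.

```python
def audit_records(records, required_fields, min_rows):
    """Report row count and the share of missing values per field.

    `records` is a list of dicts; a value counts as missing when the key
    is absent, None, or an empty string. All names here are illustrative.
    """
    missing_share = {}
    for field in required_fields:
        missing = sum(1 for row in records if row.get(field) in (None, ""))
        # With no rows at all, treat every field as fully missing.
        missing_share[field] = missing / len(records) if records else 1.0
    return {
        "row_count": len(records),
        "enough_rows": len(records) >= min_rows,  # "lack of data" check
        "missing_share": missing_share,           # "incomplete data" check
    }


# Hypothetical sample: one row lacks a region, another lacks a spend value.
sample = [
    {"customer_id": 1, "region": "EU", "spend": 120.0},
    {"customer_id": 2, "region": None, "spend": 80.0},
    {"customer_id": 3, "region": "US"},
    {"customer_id": 4, "region": "US", "spend": 45.5},
]
report = audit_records(sample, ["customer_id", "region", "spend"], min_rows=1000)
print(report)
```

The third problem, data held in silos, is organizational and does not show up in a check like this; it only becomes visible when the team cannot obtain the records in the first place.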

4 Ways Lack of Data Strategy Can Negatively Impact Enterprise AI Projects

Hinders AI Exploratory Analysis

Exploratory analysis is the process of discovering what is and isn’t possible with AI. Companies can approach it in two ways.

The first is to explore the data they already have and determine what is possible with it. The second starts with a pain point analysis to determine whether AI is the right solution.

Whichever method a business uses, it will need access to the right data to determine the feasibility of the AI project. A non-existent data infrastructure can wreck this effort.

Exploratory analysis also helps unearth potential problems in an organization’s data, such as class imbalance, before it embarks on a full-fledged AI project.
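
One simple way to surface an imbalance issue during exploratory analysis is to count how often each label appears. The sketch below assumes a hypothetical churn data set; the function name `class_balance` and the label values are illustrative only.

```python
from collections import Counter


def class_balance(labels):
    """Return each label's share of the data and the majority/minority ratio."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    # A ratio far above 1 means one class dominates the data set.
    imbalance_ratio = max(counts.values()) / min(counts.values())
    return shares, imbalance_ratio


# Hypothetical labels: 95 customers stayed, 5 churned.
labels = ["no_churn"] * 95 + ["churn"] * 5
shares, ratio = class_balance(labels)
print(shares, ratio)
```

What ratio counts as a problem depends on the use case, so the sketch reports the figure rather than applying a fixed cutoff.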

Stale Predictions and Recommendations

When it comes to making real-time decisions or predictions, businesses need to work with fresh data sets. Stale data can be a good way to compare trends or track progress but shouldn't be used to inform current business decisions.

Extracted data begins to get stale soon after it is generated, manually entered data is subject to human error, and data from a warehouse is only as fresh as its last refresh.

When an AI algorithm is fed stale data, the recommendations it makes are irrelevant to current business outcomes.

For instance, if AI systems used pre-pandemic data sets to make business predictions now, they would miss the mark, because consumer behavior and priorities have evolved since then.

Organizations need to keep their data fresh because of:

  • Changes in customer behavior.
  • Changes in the company, such as growing from startup to maturity.
  • Regulation and policy changes.

A data strategy can help businesses ensure that the data they use to train their AI models is appropriate and relevant to their goals.
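
The freshness concern described in this section can be turned into a routine check. The sketch below is one possible approach under stated assumptions: the source names, the timestamps, and the seven-day limit are hypothetical, and a real limit would depend on how quickly the underlying behavior changes.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_refreshed, max_age, now=None):
    """Return True when the data is older than the allowed age."""
    now = now or datetime.now(timezone.utc)
    return now - last_refreshed > max_age


# Hypothetical sources and last-refresh times; `now` is fixed so the
# example gives the same result whenever it is run.
now = datetime(2021, 6, 1, tzinfo=timezone.utc)
sources = {
    "warehouse_sales": datetime(2021, 5, 31, tzinfo=timezone.utc),
    "manual_survey": datetime(2020, 1, 15, tzinfo=timezone.utc),
}
stale = [
    name
    for name, refreshed in sources.items()
    if is_stale(refreshed, timedelta(days=7), now=now)
]
print(stale)
```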

Low-Quality AI Models

Low-quality AI models are susceptible to making mistakes in their prediction or recommendation tasks.

In 2013, the University of Texas MD Anderson Cancer Center worked with IBM to develop a new “Oncology Expert Advisor” system, a clinical decision support technology powered by IBM Watson.

But IBM’s Watson wasn’t sufficiently trained on the right data sets, such as data from real-life patients, and it gave incorrect and often dangerous cancer treatment advice.

The issue was clearly data quality: the data was not centralized, only a subset of it was available, and the volume was small. As a result, the model could not get an accurate and comprehensive view of the context in which it was used.

Business leaders need to make provisions for data warehousing and for integrating diverse data sources to make the data set more holistic. They also need to ensure data is accessible throughout their company.

Reinforces Bias in Data and Society

A broken data infrastructure can introduce bias in AI algorithms. When data sets are not representative of the people or communities they describe, they become skewed, and that skew is perpetuated in AI and ML models.

For instance, an MIT report found that a popular dataset used to train facial recognition systems was estimated to be 78% male and 84% white, with very little representation of other genders or races. Hence, the facial recognition systems that use this dataset are also biased.
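
A skew like this can be quantified by comparing each group's share of the data set with its share of the population the system is meant to serve. The sketch below is illustrative only: the counts echo the 78% figure quoted above, while the function name `representation_gap` and the 50/50 reference shares are assumptions made for the example.

```python
def representation_gap(dataset_counts, reference_shares):
    """Return dataset share minus reference share for each group.

    Positive values mean the group is over-represented in the data set,
    negative values mean it is under-represented.
    """
    total = sum(dataset_counts.values())
    return {
        group: round(dataset_counts.get(group, 0) / total - share, 3)
        for group, share in reference_shares.items()
    }


# Hypothetical counts and reference shares, for illustration only.
dataset_counts = {"male": 780, "female": 220}
reference_shares = {"male": 0.5, "female": 0.5}
gaps = representation_gap(dataset_counts, reference_shares)
print(gaps)
```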

A report from NIST confirmed that facial recognition algorithms used by top companies have inherent bias: “False positive rates are highest in West and East African and East Asian people, and lowest in Eastern European individuals… We found false positives to be higher in women than men, and this is consistent across algorithms and datasets.”
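
The kind of disparity described in that finding can be checked on any model by computing the false positive rate separately for each group. The following is a minimal sketch with made-up evaluation records; it does not reproduce the NIST methodology, and the group names and function name are hypothetical.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Compute FP / (all actual negatives) for each group.

    `records` is an iterable of (group, actual_match, predicted_match).
    Groups with no actual negatives are omitted from the result.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, actual, predicted in records:
        if not actual:  # only actual non-matches can yield false positives
            negatives[group] += 1
            if predicted:
                false_positives[group] += 1
    return {group: false_positives[group] / n for group, n in negatives.items()}


# Hypothetical evaluation results: every pair is an actual non-match.
records = [
    ("group_a", False, True), ("group_a", False, False),
    ("group_a", False, False), ("group_a", False, False),
    ("group_b", False, True), ("group_b", False, True),
    ("group_b", False, False), ("group_b", False, False),
]
rates = false_positive_rate_by_group(records)
print(rates)
```

A large difference between groups, as between the two hypothetical groups here, is a signal to examine how each group is represented in the training data.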

Business leaders have to ensure that their data is centralized, integrated and complete to represent all customers, employees, products and services. This may not completely eliminate bias but will minimize it.

Businesses appreciate that AI adoption is the way to stay competitive in the marketplace today. But without a well-crafted data strategy, AI projects can experience permanent setbacks. Data strategy is the foundation of all analytics and reporting capabilities in an organization and integral to AI adoption.

Sign up for the MIT SAP Data Strategy: Leverage AI for Business course to learn how to access the new business opportunities that data brings your organization and incorporate a data and AI strategy that embodies your business goals.

Data Strategy: Leverage AI for Business is delivered as part of a collaboration with MIT School of Architecture + Planning and Esme Learning. All personal data collected on this page is primarily subject to the Esme Learning Privacy Policy.


© 2021 Esme Learning Solutions. All Rights Reserved.