Hidden Mistakes that Companies Make in their AI Journey

BrandPost By Keith Shaw
May 23, 2022
Cloud Architecture, IT Leadership

Even with the strongest plan and company buy-in, AI projects can encounter problems if they fail to adapt to hidden data costs and poor infrastructure choices.


As more companies deploy artificial intelligence (AI) initiatives to help transform their businesses, the key areas where projects can go off the rails are becoming clear. Many problems can be avoided with some advance planning, but several hidden obstacles remain that companies often don’t see until it’s too late.

Even as they push for speed, organizations must recognize that almost half of AI projects never make it beyond the proof-of-concept stage. Blame can go in many directions, such as teams lacking necessary skill sets or little to no collaboration among data scientists, IT and business stakeholders. However, there are other reasons projects end up in the AI failure pile.

#1 Watching costs spiral due to data gravity

Many AI teams automatically assume that choosing a cloud-based infrastructure for their models is the best choice in terms of cost and speed. While this may be the case for experiments or initial prototypes, problems can arise when companies attempt to expand AI training to develop a production-ready model or when they see dataset sizes grow exponentially to fuel the AI algorithms.

As data sets grow larger and more complex, data gravity can sink an AI project with unmanageable costs if the infrastructure where data is generated is not close to the infrastructure where the AI models are trained. Data that is created on premises (such as private financial data) or at the edge (such as in robotics or autonomous vehicles) can incur unwieldy storage expenses and add an unnecessary speed bump to the developer workflow when it has to be moved to the cloud for training.

Teams should make sure that the compute resources used for training are located as close to the data as possible. This can mean on-premises only, cloud only (if data is generated in the cloud), or a hybrid cloud model where early, light prototyping is done in the cloud and the work is then moved on premises or to a colocation data center as models and data sets grow.
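To make the data gravity point concrete, here is a minimal back-of-the-envelope sketch in Python. It is not from the article's sources: the function name, the link speed, the efficiency factor and the per-gigabyte fee are all hypothetical placeholders, and real transfer pricing and throughput vary by provider and network.

```python
from dataclasses import dataclass


@dataclass
class TransferEstimate:
    hours: float
    transfer_cost_usd: float


def estimate_transfer(dataset_tb: float,
                      link_gbps: float,
                      cost_per_gb_usd: float,
                      efficiency: float = 0.8) -> TransferEstimate:
    """Rough time and fee to move a data set to where training runs.

    All inputs are illustrative assumptions:
      dataset_tb      - data set size in decimal terabytes
      link_gbps       - nominal network link speed in gigabits per second
      cost_per_gb_usd - hypothetical per-gigabyte transfer fee
      efficiency      - fraction of the nominal link speed actually achieved
    """
    if dataset_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid input")

    dataset_gb = dataset_tb * 1000          # decimal TB -> GB
    dataset_gigabits = dataset_gb * 8       # bytes -> bits
    seconds = dataset_gigabits / (link_gbps * efficiency)

    return TransferEstimate(
        hours=seconds / 3600,
        transfer_cost_usd=dataset_gb * cost_per_gb_usd,
    )


if __name__ == "__main__":
    # Hypothetical scenario: 100 TB over a 10 Gbps link at a made-up fee.
    estimate = estimate_transfer(dataset_tb=100, link_gbps=10,
                                 cost_per_gb_usd=0.05)
    print(f"{estimate.hours:.1f} hours, ${estimate.transfer_cost_usd:,.0f}")
```

Rerunning the estimate as the data set grows shows why a setup that works for a prototype can become a bottleneck once every training cycle needs fresh data moved across the network.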

#2 Treating AI as just another software project

Many companies assume that because AI is basically software, they can manage its development on existing computing, networking and storage infrastructure, just as they have with other software development projects. But with its reliance on growing data sets, its highly iterative workflow and its computationally intensive algorithms, AI development is essentially a high-performance computing use case, and it requires discipline and expertise in that specialized infrastructure.

“It’s like someone who is used to driving a minivan to pick up their kids at school or run to the grocery store is now handed the keys to a Ferrari, and they say ‘I know how to do this – it’s just driving,’” says Matthew Hull, vice president of global AI data center sales at NVIDIA.

“While AI at its core is software, it’s a very different beast, and folks need to spend time learning about the nuanced differences between artificial intelligence at every layer, and building out a specific agenda.”

#3 Having a ‘set it and forget it’ mentality

Companies often think that once a model is successful, they can just keep it running in production and move on to the next project.

“The reality is that AI scales and evolves over time,” says Hull. “You need to scale the size of the models and the number of use cases, and you need to plan ahead for that scalability. If you lock yourself into one set of solutions and don’t plan for growth in the infrastructure and data, you’re not going to succeed.”

As production data changes over time, businesses need to ensure their applications maintain and improve their predictive accuracy, which requires infrastructure that can keep pace. A successful AI strategy involves planning for the short, mid and long terms, as well as monitoring progress through those stages as the AI workflows grow.
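One simple way to picture the ongoing monitoring described above is a rolling accuracy check on a model in production. The Python sketch below is only an illustration under stated assumptions: the class name, the window size and the tolerance are hypothetical, and a real deployment would typically track more than a single accuracy number.

```python
from collections import deque
from typing import Optional


class AccuracyMonitor:
    """Tracks accuracy over the most recent predictions (illustrative only)."""

    def __init__(self, baseline: float, window: int = 100,
                 tolerance: float = 0.05) -> None:
        self.baseline = baseline      # accuracy measured when the model shipped
        self.tolerance = tolerance    # allowed drop before raising a flag
        self.outcomes = deque(maxlen=window)  # True/False for recent predictions

    def record(self, prediction, actual) -> None:
        """Store whether the latest prediction matched the observed outcome."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self) -> Optional[float]:
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self) -> bool:
        """True when recent accuracy falls below baseline minus tolerance."""
        accuracy = self.rolling_accuracy()
        return accuracy is not None and accuracy < self.baseline - self.tolerance


if __name__ == "__main__":
    monitor = AccuracyMonitor(baseline=0.90, window=10)
    for _ in range(10):
        monitor.record("approve", "approve")   # model agrees with reality
    for _ in range(5):
        monitor.record("approve", "decline")   # production data has shifted
    print(monitor.rolling_accuracy(), monitor.needs_attention())
```

A flag from a check like this is a prompt to retrain or revisit the model, which is where the capacity planning Hull describes comes in.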

#4 Choosing to go it alone

With a lot on the line around AI, many companies place the entire burden on the backs of their data scientists and developers. They are often hesitant to reach out to external experts who have run similar projects, and they end up stalling or trying to hire costly data science expertise.

Hull says companies need to find trustworthy outside expertise from different organizations, whether to supplement data science expertise, design the right infrastructure optimized for AI, or implement MLOps in their workflow. Companies like NVIDIA offer purpose-built systems, infrastructure, AI expertise and a comprehensive IT ecosystem so businesses can drive more of their valuable AI ideas into full production deployments.

Expert partners can also help you avoid the other hidden mistakes discussed in this article, and put you on a solid path to successful AI.

Learn more about ways to succeed in your AI strategy with NVIDIA DGX Systems, powered by NVIDIA A100 Tensor Core GPUs and AMD EPYC CPUs.

About Keith Shaw:

Keith is a freelance digital journalist who has written about technology topics for more than 20 years.