CHAPTER

Arrival Of AI Abundance

The growth in NLP research and open-source frameworks reflects what Prolego has witnessed over the past four years: NLP is an exponential AI technology that has tremendous potential, and NLP applications are rapidly improving at solving business problems. This trend will continue as leading companies invest in NLP applications and make the operational changes necessary to take advantage of them.

Figure 10 is my best estimate of the rate of NLP application adoption. It shows the major milestones on the path to abundance.

Let’s analyze each phase in detail.

FOUNDATIONAL PHASE (2016-2020)

In 2016 Google relaunched Google Translate by using neural networks instead of traditional statistical and rule-based methods. The New York Times Magazine dubbed this event the beginning of “The Great AI Awakening.”

Unfortunately Google’s approach required Google-scale data and Google-scale budgets. Like all exponential technologies in the foundational phase, AI was too difficult and too expensive for broader applications.

But researchers slowly chipped away at the biggest obstacles. In 2017 Google researchers introduced a new neural network architecture called the transformer, which is built on the concept of attention. Researchers began using it to set new performance records on almost every NLP task.

Open-source machine learning libraries like PyTorch and TensorFlow made model training easier on NVIDIA’s graphics processing units.

In 2018 two innovations instantly lowered the cost of using transformers to build state-of-the-art NLP applications:

Jeremy Howard and Sebastian Ruder developed a transfer learning approach called Universal Language Model Fine-tuning (ULMFiT).

Google released Bidirectional Encoder Representations from Transformers (BERT) as a free, open-source language model based on transformers.

Developers could now download a free pre-trained transformer model and customize it for specific business problems.

Facebook, OpenAI, and other tech leaders soon produced improved language models. Hugging Face developed an open-source library that made it easier to build and train these models. Snorkel began releasing tools that developers could use to efficiently build NLP training data through weak supervision.

TRANSFORMATIONAL PHASE (2021-2025)

By the beginning of 2021, all major technical obstacles to cost-effectively building deep learning NLP applications had been resolved. The NLP-based pilot projects that Prolego has built are demonstrating clear business impact for our clients. Other AI companies report similar results. These successes are evidence that the transformational phase of AI has begun.

Of course many challenges remain before we reach AI Abundance. Building and deploying AI applications that solve business problems is still incredibly hard. Here are the major problems that the technology will resolve over the next five years:

NLP requires specialized tools and teams.
Deploying NLP at scale is prohibitively difficult.
Important data is inaccessible.

NLP requires specialized tools and teams

The overwhelming majority of corporate NLP projects fail because the engineering teams don’t know how to build training data. Most online resources and tutorials about building NLP applications start with the assumption that the data scientist is working with labeled training data. Many companies attempt to label documents by borrowing the manual labeling techniques used in computer vision. But labeling documents is much harder than labeling images. Currently the most efficient approach relies on weak supervision, in which simple rules and heuristics assign approximate labels automatically.

A few teams know how to use weak supervision to label their training data. As those teams make breakthroughs and begin sharing results, best practices and better tools will soon be available.
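To make the idea of weak supervision concrete, here is a minimal sketch in plain Python. It is an illustration only, not Prolego's method or any particular tool's API: the labeling functions, the label names, and the `weak_label` helper are all hypothetical. Each labeling function encodes one noisy rule of thumb and may abstain. The votes are combined by simple majority. Dedicated weak supervision tools generally use more sophisticated ways of combining votes than this.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when its rule does not apply


# Hypothetical labeling functions: each one encodes a single noisy heuristic.
def lf_mentions_termination(text):
    return "TERMINATION" if "terminate" in text.lower() else ABSTAIN


def lf_mentions_payment(text):
    lowered = text.lower()
    if "invoice" in lowered or "payment" in lowered:
        return "PAYMENT"
    return ABSTAIN


def lf_notice_period(text):
    return "TERMINATION" if "days' notice" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_termination, lf_mentions_payment, lf_notice_period]


def weak_label(text, labeling_functions=LABELING_FUNCTIONS):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in labeling_functions]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting evidence: leave the example unlabeled
    return ranked[0][0]


clauses = [
    "Either party may terminate this agreement with 30 days' notice.",
    "Payment is due within 15 days of the invoice date.",
    "This agreement is governed by the laws of Delaware.",
]
labels = [weak_label(clause) for clause in clauses]
```

In this sketch, clauses that receive a label become approximate training data for a model, while clauses on which every function abstains or the votes tie are left unlabeled. The appeal of the approach is that writing a few dozen such rules is far cheaper than hand-labeling thousands of documents.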

Deploying and scaling AI is difficult

One of the biggest challenges for AI companies is machine learning at scale, often referred to as machine learning operations (MLOps). No established best practices or tools yet exist for deploying and operating NLP transformers at scale. NLP adoption will happen slowly until tools and approaches mature.

Important data is inaccessible

Many companies, especially those that have been in business for decades, have a substantial number of contracts, policies, and correspondence on paper rather than in digital files. Few companies capture and convert customer conversations and other spoken data into digital formats. The data in these resources is effectively inaccessible because companies can’t efficiently analyze it at scale without making large investments to digitize, store, and label it.

Most companies haven’t been able to justify the cost of converting analog data into digital formats. This dilemma will disappear as NLP projects demonstrate initial value. Companies will recognize the value of their data and begin broadly deploying optical character recognition and automated speech recognition to digitize it.

By 2024 most data-access problems will be solved, and we will start seeing high-profile successes in the business press. Companies that currently dedicate large amounts of human labor to reading and extracting information from text documents will begin realizing cost savings from automating 80 percent of this work. For example, financial services companies can begin to unlock the value of large-scale unstructured data in contracts, policies, claims, applications, and forms.

The early adopters will begin to realize data-access efficiencies by 2025. Budgets will shift to AI from other initiatives. As they do, the pace of AI development will increase. The companies that have invested well will be able to evolve their product offerings because they can make better-informed decisions faster and at a fraction of the former cost.

ABUNDANCE PHASE (2026 AND BEYOND)

By 2026 all financial services companies will realize that AI is fundamentally changing the competitive landscape. Late starters will scramble to respond. After years of painfully slow growth, venture-backed startups will leverage their AI investments to steal market share from incumbents. Google, Apple, Amazon, Facebook, and Microsoft will acquire the financial-services leaders and begin offering direct-to-consumer insurance, banking, and financial services.

As the growth curve swings upward, the incumbents that invested in AI during the transformational phase will accelerate their investments to stay competitive. Funding will come from layoffs, spinoffs, mergers, and reorganizations that shift resources from human-driven business processes to AI.

What about companies that make little effort to prepare during the transformational phase? Unfortunately they won’t be able to catch up. When the abundance phase begins, their executives will rush to develop their AI strategy. The CEO and other officers will be fired. Their successors will begin making desperate moves to survive. Some will attempt to build in-house AI capabilities. Others will try to buy AI from third parties. Still others will invest in AI startups at insane valuations or sign one-sided partnerships just to regain their footing. But these efforts will be too expensive, too little, and too late, just as they were for Toys “R” Us in 2000.

After the AI wave hits financial services, the same pattern will repeat in the retail, manufacturing, hospitality and recreation, telecommunications, and media sectors. Healthcare, energy, and the public sector will follow. The leaders will ride the exponential adoption wave, and the laggards will never catch up.