Companies only begin investing in AI when an “AI killer-app”—a must-have competitive asset for every company in the sector—arrives. Computer vision is an AI killer-app for automotive. Recommendation engines are an AI killer-app for ecommerce. Predictive analytics is an AI killer-app for retail. When visionary companies started showing results with these technologies, every competitor soon followed.
But what is the AI killer-app for insurance? Sure, many insurance companies (particularly property and casualty) have pilot projects using computer vision. And most insurance companies have a data science team running offline models to assist sales and marketing.
But there hasn’t been an AI killer-app for insurance—until now. That application is natural-language processing, or NLP. The confluence of three powerful economic and technological forces will drive insurance companies to rapidly modernize their technology infrastructure and invest in AI to keep up with competitors.
“We spend money on lawyers and support”
--CTO at a Fortune 100 financial services company
NLP is an AI technology concerned with the interaction between computers and human (i.e. “natural”) languages. The data used by NLP business applications is the unstructured text generated by every department.
Spend some time with ANY department and you will find unstructured text and hundreds of people interacting with it. In particular, you will notice that much of this labor is dedicated to classifying text: reading it so it can be routed, organized, or acted upon according to the business need.
Lawyers organize contracts based on legal exposure. QA searches for customer calls which may indicate support problems. Marketing and sales want to know which customer emails indicate a risk of cancellation or opportunity for upsell. HR wants to automatically sort resumes in search of potential call center reps most likely to succeed in the job.
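To make the classification task concrete, here is a minimal sketch in plain Python. The categories and keyword lists are illustrative assumptions, not a real taxonomy; a trained model learns these associations from labeled examples instead of relying on hand-picked keywords.

```python
# Minimal sketch of text classification for routing.
# Categories and keywords are hypothetical, for illustration only.

CATEGORY_KEYWORDS = {
    "cancellation_risk": {"cancel", "refund", "switching", "unhappy"},
    "upsell_opportunity": {"upgrade", "add", "coverage", "quote"},
    "support": {"error", "broken", "help", "login"},
}

def classify(text: str) -> str:
    """Return the category whose keywords best match the text."""
    words = set(text.lower().split())
    scores = {
        category: len(words & keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to a default bucket when nothing matches.
    return best if scores[best] > 0 else "general"

print(classify("I want to cancel my policy and get a refund"))
# -> cancellation_risk
```

A rule-based router like this breaks down as language varies; the ML models discussed below replace the keyword lists with patterns learned from your own labeled documents.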
Many companies embarked on initial NLP pilot projects which require text generation, such as automated chat. Unfortunately, text generation technology is still too immature to apply to high-value business situations. Text classification applications are easier to build and integrate into operations.
In 2018 and 2019, researchers developed NLP transfer learning techniques. Transfer learning significantly reduces the amount of training data a model requires by pre-training it on large volumes of general-purpose text. This breakthrough has significantly reduced the cost of building practical NLP models.
A year ago most insurance companies did not have the budgets to build custom NLP applications. Today our engineers have built world-class, customized NLP models for our clients in weeks.
AI skeptics often claim that recent breakthroughs are only the result of bigger data sets and faster computers. A review of recent NLP literature reveals the fallacy of this claim. Innovations such as recurrence, attention, and embeddings are allowing researchers to achieve new NLP breakthroughs daily. The following example illustrates this trend.
In late 2019 the Allen Institute published a paper (see https://www.arxiv-vanity.com/papers/1909.01958/) describing their approach for building NLP applications to correctly answer 90 percent of the questions on the Grade 8 New York Regents Science Exam.
The researchers also charted the results they achieved applying the latest NLP techniques to take the exam since 2014.
Notice how the state-of-the-art results asymptotically approached 75% accuracy each year—until the large breakthrough in 2019. Rarely does technology advance at such rapid rates after decades of research.
Still not impressed? Take a look at the test itself at https://www.nysedregents.org/grade8/science/619/ils62019-examw.pdf. (The researchers only considered the multiple-choice, non-diagram questions.)
Answering these questions requires a deep understanding of words and context. NLP models are starting to understand the meaning of natural language—such as the properties of metals. What do these results mean for insurance companies?
While you don’t need AI models to answer standardized tests, you can apply the same techniques to develop customized NLP applications in your business. You can build customized models which understand the meaning of your products, your contracts, and your customers. These models can be trained to automatically read, listen to, and classify unstructured text about your business—and ultimately automate your business processes.
In this paper we explain how to do it.
The next time you walk through the office, take a look at what most people do. Then ask yourself: how much of their time is spent reading, routing, or classifying text?
NLP will impact the insurance industry the way electric and autonomous vehicles are impacting the automotive industry—it will change everything. You will still have lawyers, but they will spend their time making strategic recommendations rather than reading and evaluating contracts. Your support team will be a smaller group of higher-skilled representatives who leverage algorithms to engage customers. Marketing and sales will be alerted the instant a customer conversation signals a risk of cancellation or an opportunity for upsell. Underwriting will become faster and increasingly automated, with human review required only in the highest-risk cases. Potentially fraudulent claims will be automatically flagged by algorithms which read them.
The insurance companies which cannot make this transition will have a higher cost structure and slower decision process—and will struggle to maintain competitive advantage.
Have no doubt, your competitors are already gearing up for this future.
Your data is fragmented and disorganized. You struggle to recruit and retain technical talent. Your employees and peers don’t want to change. Your budgets are strained.
Fear not, your competitors—even those who boast about their AI capabilities—are in the same situation. NLP technology is only now becoming economically feasible for companies without Google-sized budgets. While the window of opportunity may begin closing within two years, you still have time to be a leader in the NLP revolution.
Most of our clients begin by looking for specific AI use cases—for example, a chatbot, recommendation engine, or contract search tool. While specific applications are easier to understand, most clients quickly realize they need AI capabilities which they can leverage across applications.
For example, building a competency in NLP requires several distinct capabilities.
90% of what you read about AI addresses only one of these capabilities: training machine learning models. But building systems which improve real business processes requires scalable infrastructure and a broader set of capabilities. Rebuilding this same infrastructure for every NLP application is economically infeasible.
Additionally, any specific NLP application can succeed or fail for unpredictable reasons: the data is too messy, business customers or systems are not ready for adoption, etc. For these reasons our clients usually pursue multiple NLP initiatives in parallel.
Finding potential NLP projects will not be hard. Set up a few meetings with your business line leaders (HR, operations, support, claims, legal, underwriting, marketing, sales) and ask them where their teams spend time reading, organizing, and acting on unstructured text.
Most of our clients quickly generate a list of 20-30 potential NLP projects following this technique.
Fortune 1000 insurance companies are actively pursuing a range of NLP projects.
Of course these are only examples of NLP projects—the scope of total AI projects is much, much broader.
Part 3 of our book Become an AI Company in 90 Days (download a copy) provides a framework for investigating and ranking potential AI initiatives. The same approach applies to NLP applications.
Also consider the following when deciding which projects to pursue:
Since creating NLP capabilities will be one of your top priorities, reduce your risk by choosing projects which have a higher probability of success. For example, using chat to fully automate your support is too hard. Instead, start improving your service by creating applications which make your support team more successful.
Building NLP applications requires some time and feedback from your business partners. Start with your business partners who are excited about AI and have a conceptual understanding of how models work.
NLP projects start with an investment in converting the data into a machine-readable format, usually a .txt file.
NLP projects have three distinct phases requiring different skillsets. Your first NLP project should take ~6 months. Subsequent projects will be faster as you build your technical foundation.
Unstructured text needs to be converted from other digital formats (e.g. email), images (e.g. contracts), or speech (e.g. call center recordings) to .txt format to train the ML models. Depending on the variety and complexity of the language involved, you will need at least 1,000 documents.
Time: 0-2 months depending on data format.
Skills: General IT. No expertise required.
Technology: Leverage tools (e.g. OCR, ASR) from the cloud providers.
Outcome: Your unstructured data is converted to .txt format and stored in an accessible location.
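For digital formats such as email, the conversion step can be handled with standard tooling rather than OCR or ASR. The sketch below uses Python's standard-library `email` module to pull the plain-text body out of a raw RFC 5322 message; the sample message and field names are illustrative assumptions.

```python
# Sketch: extract the plain-text body of a raw email (.eml) as NLP training data.
# Images (OCR) and call recordings (ASR) require cloud-provider tools instead.
from email import message_from_string

def email_to_text(raw: str) -> str:
    """Return the plain-text parts of a raw email message, joined together."""
    msg = message_from_string(raw)
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            parts.append(payload.decode(charset, errors="replace"))
    return "\n".join(parts)

# Illustrative sample message.
raw_email = """\
From: customer@example.com
Subject: Policy question
Content-Type: text/plain; charset=utf-8

Can I add my new car to my existing policy?
"""
print(email_to_text(raw_email))
```

The extracted text would then be written to a .txt file in your accessible storage location, one document per source message.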
A data scientist needs to organize the data into a format consumable by the machine learning models. Additionally, the models need to be trained and iteratively improved through feedback with the end-user or business owner.
Time: 2-4 months.
Skills: Senior data scientist with experience building deep learning models with neural networks. NLP background is helpful but not required.
Technology: Server with GPUs, Jupyter notebooks, Python, Pytorch/Tensorflow, other open-source language-processing packages.
Outcome: Prototype machine learning models running in Jupyter notebooks. Results used to evaluate whether to deploy into production.
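The deploy/no-deploy decision rests on measured accuracy against examples the model has not seen. The following sketch shows the hold-out evaluation pattern in plain Python; the tiny labeled dataset and the word-overlap "model" are stand-ins for real feedback-labeled data and a real deep learning model.

```python
# Sketch of hold-out evaluation: train on some labeled examples,
# measure accuracy on the rest. Data and "model" are illustrative stand-ins.

data = [
    ("cancel my policy today", "cancellation"),
    ("please send a refund", "cancellation"),
    ("I want to upgrade my coverage", "upsell"),
    ("quote me for a bigger plan", "upsell"),
    ("cancel and refund please", "cancellation"),
    ("add coverage upgrade", "upsell"),
]

def train(examples):
    """'Train' by collecting the words seen per label (stand-in for a real model)."""
    vocab = {}
    for text, label in examples:
        vocab.setdefault(label, set()).update(text.lower().split())
    return vocab

def predict(vocab, text):
    """Predict the label whose training vocabulary best overlaps the text."""
    words = set(text.lower().split())
    return max(vocab, key=lambda label: len(words & vocab[label]))

def accuracy(vocab, examples):
    correct = sum(predict(vocab, text) == label for text, label in examples)
    return correct / len(examples)

train_set, test_set = data[:4], data[4:]
vocab = train(train_set)
print(f"hold-out accuracy: {accuracy(vocab, test_set):.2f}")
```

In a real project, each feedback cycle with the business owner adds labeled examples and the hold-out accuracy is re-measured until it clears the bar for production.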
Finally data engineers need to build the software to run the models as a production application. While the scope and timeframe can vary, let the following best practices guide your planning:
Time: 3-6 months.
Skills: Senior data engineer with experience building custom server-side pipelines, devops, dataops, model ops, and APIs.
Technology: Basic data processing infrastructure for building cloud-based production applications (servers, databases, code repositories, etc). Where possible, leverage your existing infrastructure and vendors.
Outcome: NLP model running in your production environment with results available to your end-users or applications through APIs.
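At its core, the production API wraps the trained model behind a request handler. The sketch below shows that layer as a pure function, independent of any web framework; the endpoint payload shape, field names, and the stand-in model are assumptions for illustration.

```python
import json

# Stand-in for the trained model, loaded once at service startup.
def model_predict(text: str) -> dict:
    label = "cancellation" if "cancel" in text.lower() else "other"
    return {"label": label, "confidence": 0.9}  # confidence is illustrative

def handle_request(body: str) -> str:
    """Handle one classification request body and return a JSON response."""
    try:
        payload = json.loads(body)
        text = payload["text"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return json.dumps({"error": "expected JSON body with a 'text' field"})
    return json.dumps(model_predict(text))

print(handle_request('{"text": "I want to cancel"}'))
```

In practice your data engineers would mount a handler like this behind your existing API gateway or web framework, add logging and model-ops hooks, and version the model artifact it loads.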
NLP is a fundamental technical capability which will have impact across your organization. Like any technical capability, you will both buy and build your own solutions depending on your technical strategy and use cases.
The general buy vs. build guidelines also apply to NLP:
For example, don’t build an NLP application which automatically populates expense reports with receipts. Expensify (https://www.expensify.com/) has already developed this solution and you can buy it for a fraction of the cost.
Do build an NLP application which automates the support of your financial services products based on your business strategy.
Venture capitalists have invested billions of dollars into NLP companies run by world-class teams with great technology. Unfortunately most of these companies are still struggling to get adoption and burning through investor cash.
After exhaustive searches, our clients typically decide not to adopt them.
Start with fundamental infrastructure projects which you can leverage across all potential NLP initiatives. For example, OCR solutions and text data stores are capabilities you can leverage across NLP projects whether you build and manage your own solutions or buy third-party products.
Did you just carefully read through this entire article? Do you share our vision and see how NLP will completely change your industry? If so, congratulations—most of your peers will see the headline, note it as something interesting, and will move on to check their email or attend to another pressing matter.
But what will you do next? I hope you choose to take action. Take a small step toward helping your company prepare for this revolution.
I hope you act because NLP is coming—fast. Your competitors are investing in it and the technology will only get better.