In this rapidly evolving space of large language models (LLMs), it’s clear that no current model rivals the performance of OpenAI’s GPT-4 for the majority of significant business tasks. Nevertheless, deploying GPT-4 in enterprise applications presents several hurdles, including security, latency, rate limits, and cost. While APIs from OpenAI, Google, and Amazon will undeniably be a part of your infrastructure, there will be times when you need a locally controlled and operated LLM.
The good news is that several recent advancements offer a promising roadmap for developing your own. Let’s take a look at these building blocks, arranged in order of increasing complexity:
1. High-Performance Open-Source LLM: At present, the Falcon model is the top-performing open-source LLM available for commercial use. Visit Hugging Face’s Open LLM Leaderboard to compare various models. The Falcon model might serve your use case sufficiently, without further adaptation.
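As a minimal sketch of what "using it as-is" looks like, here is one way to load an open-source model from the Hugging Face Hub with the `transformers` library and generate text. The model name (`tiiuae/falcon-40b-instruct`) and generation parameters are illustrative assumptions, not a recommendation; a smaller checkpoint works the same way.

```python
# Sketch: load an open-source LLM from the Hugging Face Hub and generate a
# completion locally. Model name and parameters are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

def build_generator(model_name: str = "tiiuae/falcon-40b-instruct"):
    """Return a text-generation callable backed by a locally hosted model."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",       # spread layers across available GPUs
        trust_remote_code=True,  # Falcon ships custom modeling code
    )
    return pipeline("text-generation", model=model, tokenizer=tokenizer)

if __name__ == "__main__":
    generate = build_generator()
    print(generate("Summarize the key risks in this contract:",
                   max_new_tokens=100))
```

Because the weights stay on your own hardware, nothing leaves your network, which addresses the security concern above directly.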
2. Fine-Tuning the Model: If you need to enhance your model’s performance on a specific task, consider fine-tuning it. Until a few weeks ago, this operation was incredibly resource-intensive, both in data and in computation time. However, the QLoRA technique now allows you to fine-tune a model as large as Falcon (40 billion parameters) on a single GPU within 24 hours. In fact, you might even be able to do it on a single MacBook Pro. This should yield GPT-3.5-level performance, if not better, for many tasks.
3. Leverage Multiple Models: The next tier of complexity might require a few weeks of data science or software engineering to adapt to your specific application. Techniques like SMARTGPT or Tree of Thought provide ways to arrange multiple model calls in an architecture that drastically improves performance. You might even combine your local models with a small number of API calls to GPT-4, enhancing performance while sidestepping GPT-4’s substantial constraints.
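One simple way to combine a local model with occasional GPT-4 calls is a confidence-based router: answer locally when the local model is confident, and escalate only the hard cases. The sketch below is a hypothetical illustration of that pattern; the model callables and threshold are stand-ins you would replace with your own.

```python
# Sketch: route prompts to a local model first, escalating to GPT-4 only
# when the local answer's confidence falls below a threshold. The callables
# and the 0.8 cutoff are hypothetical stand-ins for real model clients.
from typing import Callable, Tuple

def route(prompt: str,
          local_llm: Callable[[str], Tuple[str, float]],
          gpt4: Callable[[str], str],
          threshold: float = 0.8) -> str:
    """Return the local answer when confident enough, else call GPT-4."""
    answer, confidence = local_llm(prompt)
    if confidence >= threshold:
        return answer      # no API call: cheaper, faster, data stays local
    return gpt4(prompt)    # escalate only the hard cases
```

For example, `route("2+2?", lambda p: ("4", 0.95), lambda p: "four")` returns the local answer `"4"` without ever touching the API, while a low-confidence local result would fall through to GPT-4.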
These strategies represent the best options available today, considering the current tools and research. Keep in mind, all three developments occurred within the last three weeks — the pace of change in this field is astounding. And given the massive economic incentives for big tech companies like Meta (formerly Facebook), NVIDIA, and Apple, we should expect one of them to invest in and release a highly capable open-source LLM that approaches GPT-4’s performance in the coming months. Meanwhile, you can begin constructing your solutions using these techniques, with the assurance that you can refine them as superior alternatives emerge.
Prolego is an elite consulting team of AI engineers, strategists, and creative professionals guiding the world’s largest companies through the AI transformation. Founded in 2017 by technology veterans Kevin Dewalt and Russ Rands, Prolego has helped dozens of Fortune 1000 companies develop AI strategies, transform their workforce, and build state-of-the-art AI solutions.