Advancing with Generative AI: The Long and Winding Road From Data to Performance
It took ChatGPT just 5 days to hit 1 million users, compared to 3.5 years for Netflix! The pace of generative AI adoption has been so staggering that businesses cannot afford to sit it out. According to one survey, 44% of private sector businesses are already in various stages of AI adoption, and thanks to generative AI, the AI market is expected to balloon into a $407 billion industry worldwide by 2027.
Much has been said about the benefits of AI for the private sector, from improved customer satisfaction to a significant boost in product innovation. Yet as businesses race to embrace this new “tech magic” and unlock its untold “miracles,” they are finding that it doesn’t always work. Just this month, a logistics company in the UK found out, much to its chagrin, that its chat AI had gone rogue and penned a poem ridiculing the company, thanks to prompts from a frustrated customer. The chatbot was promptly shut down. Similar incidents of AIs turning racist, sexist, or outright hallucinating abound in the blogosphere.
Such incidents, although rare, leave users (organizations, employees, and customers alike) deeply dissatisfied, if not disillusioned with AI’s capabilities altogether. They can derail AI adoption initiatives completely, even though such problems can be prevented, or at least minimized, with the right planning.
The quality of the data and training techniques behind a generative AI plays a critical role in the quality of the responses it produces. While we’ll focus on those today, it’s also important to be familiar with the other elements crucial to AI success within organizations, so we’ll briefly touch on them before moving to the crux of this post.
Identify Business Problems Suitable for AI Solutions
AI is the next big buzzword in the tech space, and everyone wants it. That doesn’t mean it’s suitable for your business. AI is good at solving problems that require ingesting massive amounts of data to produce accurate, insightful, and actionable answers, and it is best suited to repetitive tasks performed on a routine basis. Do you have such tasks within your organization, and do you stand to benefit from handing them over to an AI?
Select the Right AI Technology
There is a diverse range of AI technologies: generative, discriminative, transformer-based, and so on. Then there’s the matter of choosing technologies suited to your goals, such as LLMs, image generation models, or text-to-image translation AIs. We’re still quite far from a general AI that can perform a wide range of different tasks, so for now we have to be content with purpose-built AIs that excel at only one or a few related tasks, such as chatting or image generation.
You should also determine whether you’d build a new AI from the ground up or use a third-party AI with suitable customization, purposeful training, and proprietary elements.
Create a Supportive Organizational Culture
Not only do you need the right AI technology and talent with AI expertise, but you also need your employees to be able to extract maximum value from your AI applications. It would also help to make corporate data easily accessible to your AIs. This is only possible when an organization-wide culture fully embraces AI.
Naturally, this must take a top-down approach: everyone from top leadership to middle management should put their weight behind making the AI initiative successful.
Plan the Implementation
Too many organizations rush through their AI deployment only to encounter bizarre issues with their initiative. AI implementation will affect the experiences of customers as well as employees, and both groups need to be able to accommodate those changes seamlessly. The smoother the transition, the higher the likelihood of your AI initiative’s success.
Assess Your Data Readiness
Do you have vast amounts of corporate data to train your AI? And is that data easily accessible to AI algorithms? These questions determine whether your organization can enjoy the full benefits of AI adoption.
As you embark on AI adoption, you will face numerous hurdles and choices. Fortunately, you can anticipate and prepare for some of them, as discussed below:
- Data Collection, Labelling, and Accessibility
Organizations generate large quantities of data as part of their routine operations. This data can originate from a variety of sources, including customers, employees, and operations, and it may live in employees’ heads, discussion forums, internal chats, emails, office policies, reports, memos, payment gateways, and more. The more diverse the sources, the greater the variety of data formats. Organizations must find a way to collect data from all these disparate sources (preferably in real time), clean it, and make it easily accessible to AI models; the sketch below shows a simple version of that consolidation step. They may encounter some internal resistance from siloed departments, but a top-down AI culture should help get the gears turning.
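As a rough illustration, here is a minimal sketch in Python, assuming the pandas library and two hypothetical exports (support_emails.csv and policy_docs.json), that pulls two very different internal sources into one cleaned, de-siloed corpus a downstream AI pipeline can read:

```python
# A minimal sketch, assuming the pandas library and two hypothetical exports
# (support_emails.csv and policy_docs.json), of consolidating disparate
# internal sources into one cleaned corpus that AI tooling can read.
import pandas as pd

# Load two very different internal sources and map them to a common shape.
emails = pd.read_csv("support_emails.csv")[["sent_at", "body"]]
emails = emails.rename(columns={"sent_at": "timestamp", "body": "text"})
emails["source"] = "support_email"

policies = pd.read_json("policy_docs.json")[["updated_at", "content"]]
policies = policies.rename(columns={"updated_at": "timestamp", "content": "text"})
policies["source"] = "policy_doc"

# Combine, do basic cleaning, and drop empty rows and exact duplicates.
corpus = pd.concat([emails, policies], ignore_index=True)
corpus["text"] = corpus["text"].astype(str).str.strip()
corpus = corpus[corpus["text"] != ""].drop_duplicates(subset=["text"])
corpus.to_csv("ai_ready_corpus.csv", index=False)
```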
- Content Curation
The factual accuracy of knowledge content becomes paramount when training LLMs for sensitive areas such as STEM fields, law, and finance, where there is minimal room for error. The content must be highly accurate, reliable, timely, and free of duplicates, constraints that necessitate human curators (though simple checks, like the duplicate filter sketched below, can lighten their load). For instance, Morgan Stanley employed nearly two dozen knowledge managers to curate the content fed to its LLM. Obviously, most organizations do not have resources on that scale.
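Here is a small sketch, using a hypothetical list of curated snippets, of one curation chore that is easy to automate before human review: flagging exact duplicates by hashing normalized text. Near-duplicate detection and fact-checking still require more sophisticated tooling, and people.

```python
# A small sketch, using a hypothetical list of curated snippets, of flagging
# exact duplicates by hashing normalized text before human review.
import hashlib

documents = [
    "Q3 revenue guidance was raised to $1.2B.",
    "q3 revenue guidance was raised to $1.2b.",  # duplicate, different casing
    "The new expense policy takes effect on June 1.",
]

seen, unique_docs = set(), []
for doc in documents:
    digest = hashlib.sha256(doc.lower().strip().encode("utf-8")).hexdigest()
    if digest not in seen:  # keep only the first copy of each document
        seen.add(digest)
        unique_docs.append(doc)

print(f"{len(documents) - len(unique_docs)} duplicate(s) removed")
```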
- Creating Artifacts (Embeddings)
Before generative AIs, such as those built on LLMs, can start churning out perfectly legible responses to questions, they must be able to “interpret” and “understand” both the data they are supposed to analyze and the question being asked. LLMs learn to do that by translating text into a representation they can work with, a.k.a. embeddings. Embeddings are how LLMs grasp the meaning of a word within the context of its usage; for instance, knowing when “run” means “execute” among programmers rather than the physical act of running, and vice versa.
To understand and use a word correctly, LLMs must digest ungodly amounts of text, preferably text that shows each word used in many different ways. The LLM then generates vectors (sets of numerical values) that capture how closely a word is associated with other words in its training data. Once it has generated these vectors for all the words in that data, it is finally ready to use them in its responses; the sketch below gives a simplified taste of how context shapes those vectors.
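For a concrete, if simplified, picture, here is a sketch assuming the Hugging Face transformers library and the public bert-base-uncased model. It shows that the same word receives a different embedding vector depending on its context, which is exactly the “run as execute” versus “run as jog” distinction described above; the example sentences are invented for illustration.

```python
# A simplified sketch, assuming the Hugging Face transformers library and the
# public bert-base-uncased model, showing that the same word gets a different
# embedding vector depending on the context it appears in.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_word(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual embedding of `word` inside `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    return hidden[tokens.index(word)]

run_script = embed_word("please run the deployment script again", "run")
run_tests = embed_word("you can run the unit tests locally", "run")
run_jog = embed_word("she likes to run five miles before work", "run")

cos = torch.nn.functional.cosine_similarity
# The two "execute" usages will typically sit closer together than
# an "execute" usage and a "jog" usage.
print("run(script) vs run(tests):", cos(run_script, run_tests, dim=0).item())
print("run(script) vs run(miles):", cos(run_script, run_jog, dim=0).item())
```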
Likewise, LLMs must also master self-attention so they can resolve pronouns correctly within a passage. Consider these two sentences:
The child did not eat broccoli because it wasn’t hungry.
The child did not eat broccoli because it wasn’t cooked.
Self-attention is what lets an LLM know that “it” refers to the child in the first sentence and to the broccoli in the second. So it is not enough for an LLM to understand individual words; it must also understand language, especially sentence structure, the way humans use it, if it is ever to be useful to the vast majority of technically non-savvy end users, whether customers or employees.
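As an illustrative (not definitive) example, the sketch below, again assuming the transformers library and bert-base-uncased, inspects which tokens the word “it” attends to in the two sentences above. Real attention is spread across many layers and heads, so the printout is only suggestive of how a pronoun gets linked to its referent.

```python
# An illustrative sketch, assuming transformers and bert-base-uncased, that
# inspects which tokens the word "it" attends to in the two example sentences.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

for sentence in [
    "The child did not eat broccoli because it wasn't hungry.",
    "The child did not eat broccoli because it wasn't cooked.",
]:
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    it_index = tokens.index("it")

    # Average the last layer's attention for the "it" token across all heads,
    # then list the tokens it attends to most strongly.
    last_layer = outputs.attentions[-1][0]              # (heads, seq, seq)
    it_attention = last_layer[:, it_index, :].mean(dim=0)
    top = it_attention.topk(3).indices.tolist()
    print(sentence)
    print("  'it' attends most to:", [tokens[i] for i in top])
```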
Getting AI Ready for Performance
Even when organizations manage to cobble together sufficient data to train their AIs, many simply cannot muster the resources required to build highly specialized teams of AI experts and supply the expensive computing power a capable AI demands. Depending on an organization’s ability to invest these resources, it has three AI development strategies to choose from:
- Build and Train an LLM From Scratch
GPT-3.5, the model behind ChatGPT, was reportedly trained on 570GB of data gathered from freely available web sources such as Wikipedia, books, websites, research articles, and other text content. An astonishing 300 billion tokens were fed into its system.
And it’s not just general-purpose LLMs that require training data on this scale. Take BloombergGPT, a proprietary LLM developed by Bloomberg and integrated into its terminal to serve finance-specific content to users. Bloomberg had access to 40 years of financial data, which it fed into the model alongside general-purpose text: a corpus of roughly 700 billion tokens, about half of it from Bloomberg’s own financial sources, powering a model of around 50 billion parameters and requiring some 1.3 million hours of GPU time to train.
If your organization can invest the resources to embark on such an undertaking, the rewards would be worthwhile since the AI would be fully attuned to your unique needs and purposes.
- Fine-Tuning an Existing LLM
Suppose your organization does not have the resources to build an AI solution from scratch. In that case, you can take a third-party AI already trained on general knowledge and language and “fine-tune” it with domain-specific content. Since the LLM does not have to learn everything from scratch (all the words in a language, for instance), it needs only a fraction of the data to gain domain-specific expertise. Instead of millions of documents, the LLM may need only a few thousand to a few hundred thousand documents for training.
This strategy saves organizations tremendous amounts of time and money in many cases, though not always. Still, some data scientists argue that fine-tuning is less suited to feeding new content to an LLM than to teaching it new formats or styles (such as an author’s voice).
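To make the workflow concrete, here is a minimal fine-tuning sketch assuming the Hugging Face transformers and datasets libraries, a small stand-in base model (gpt2), and a hypothetical domain_docs.txt file of in-house content. A production setup would add evaluation, checkpointing, and far more data.

```python
# A minimal fine-tuning sketch: a small stand-in model (gpt2) is further
# trained on a hypothetical file of in-house text (domain_docs.txt).
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2"  # stand-in for whatever base model you license
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load and tokenize the (hypothetical) domain corpus.
dataset = load_dataset("text", data_files={"train": "domain_docs.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="finetuned-domain-llm",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```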
- Prompt-Tuning an Existing LLM
Prompt tuning is fast gaining traction as the most popular AI adoption strategy among businesses because it offers multiple advantages over the other options. For starters, it is the quickest and most resource-efficient of the three. The original LLM stays frozen; it is adapted on a private cloud or server using prompts whose context window carries domain-specific knowledge. Once tuned, the LLM can provide domain-specific answers, as the sketch below illustrates.
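One common way to implement this, sketched below under the assumption that you use Hugging Face transformers with the peft library and a small stand-in model (gpt2), is to train a handful of “soft prompt” vectors while the base model’s weights stay frozen. The initialization text is a made-up example.

```python
# A minimal prompt-tuning sketch: only a handful of "soft prompt" vectors are
# trained on domain-specific examples; the base model's weights stay frozen.
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for a larger licensed model
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(model_name)

peft_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Answer questions about our products and policies:",
    num_virtual_tokens=20,
    tokenizer_name_or_path=model_name,
)
model = get_peft_model(base_model, peft_config)

# Only the 20 virtual prompt tokens are trainable; the LLM itself stays frozen.
# Training then proceeds with a standard loop (for example, the same Trainer
# setup as in the fine-tuning sketch above) on domain-specific examples.
model.print_trainable_parameters()
```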
With this method, not to be confused with RAG (retrieval augmented generation), only a fraction of the content and resources are required for domain-specific training, saving precious time. For instance, Morgan Stanley prompt-tuned OpenAI’s GPT-4 model using only about 100,000 documents, and IBM provides detailed instructions on prompt-tuning LLMs on ModelMesh with Kubeflow Pipelines.
When it comes to costs, this is the cheapest option available for businesses. Morningstar’s Mo LLM was reportedly trained on only 10,000 in-house documents and deployed to its financial advisors and independent investor customers. Over the span of a month, Mo racked up a mere $3,000 in costs to answer 25,000 questions from its users.
What’s Next?
Despite extensive testing and training, AIs continue to suffer from issues such as “hallucinations.” Data scientists are employing techniques like grounding, reinforcement learning from human feedback (RLHF), and retrieval augmented generation (RAG) to address them. Innovations in LLM development are also having a cascading impact on other areas of AI. Transformer models, which can predict the next words or even entire sentences when given partial ones, are now being used to predict other kinds of repeating patterns, including pixels in images, which lets users sharpen blurred photos or restore partially damaged ones.
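Coming back to RAG for a moment: here is a minimal retrieval sketch assuming the sentence-transformers library, its public all-MiniLM-L6-v2 model, and a few hypothetical in-house knowledge snippets. The retrieved passage is prepended to the prompt so the LLM answers from known content rather than guesswork.

```python
# A minimal RAG retrieval sketch with hypothetical in-house knowledge snippets:
# fetch the most relevant document by embedding similarity and ground the
# prompt in it before sending it to an LLM.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Refunds are processed within 5 business days of approval.",
    "Premium support is available 24/7 for enterprise customers.",
    "Shipments to the UK typically arrive within 3 business days.",
]
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

question = "How long do refunds take?"
query_vector = embedder.encode([question], normalize_embeddings=True)[0]

# Cosine similarity reduces to a dot product on normalized vectors.
scores = doc_vectors @ query_vector
best = documents[int(np.argmax(scores))]

prompt = f"Answer using only this context:\n{best}\n\nQuestion: {question}"
print(prompt)  # this grounded prompt is what gets sent to the LLM
```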
As LLM development evolves rapidly, AIs are acquiring new capabilities, and sharp cost-cutting is making them feasible for a growing number of businesses. So even if AI development is prohibitively expensive for some organizations at the moment, it likely won’t be for long. And the clock is ticking on the early-mover advantage for those who can afford it now. So, what’s next for your organization and AI?
— Christina shares candid insights and ideas based on her work, network, and passion for mobile, payments, and commerce. A frequent invited speaker at international events, from entrepreneurial and education settings to executive audiences, she has been recognized as a ‘Top B2B Influencer’, in ‘Who’s Who in Fintech’, and among the ‘40 Under 40 in Silicon Valley’. She focuses on the latest product innovations and growth during the day while teaching students and mentoring entrepreneurs at night. Connect with her on LinkedIn or Twitter/X. All views are her own. —