LLMs in production for African enterprises — three deployment lessons.

AI & Data • 60 min • Jan 29, 2026

2,488 views

Speakers

Dr Akua Sarpong

Akan NLP Toolkit lead

Adaobi Eze

VP Product Strategy

What this session covered

Three production LLM deployments — at a tier-1 bank, a public-sector agency, and an energy utility — and the patterns that worked across all of them.

Key takeaways

Retrieval-augmented generation is the safe entry point. Pure fine-tuning is rarely the right first move for an enterprise deployment.

Hallucination handling has to be product, not just engineering. Build the UX around the assumption that the model will sometimes be wrong.

Vendor lock-in is a real risk. Architect for model portability from day one — the model that performs best today may not be the cheapest in eighteen months.
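
As a rough illustration of the portability takeaway, here is a minimal sketch of a model abstraction layer in Python. Everything in it is an assumption made for illustration: the names `TextModel`, `EchoModel`, and `ModelRouter` are hypothetical, and `EchoModel` is a stand-in for a real provider client, which the session did not show. The point is only that the application depends on one stable `complete` call while the provider behind it can change.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class TextModel(Protocol):
    """Stable interface the application codes against (hypothetical)."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Stand-in provider; a real one would wrap a vendor SDK or a self-hosted model."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class ModelRouter:
    """Routes every call to the currently active provider."""

    def __init__(self, providers: Dict[str, TextModel], active: str) -> None:
        if active not in providers:
            raise KeyError(f"unknown provider: {active}")
        self._providers = providers
        self._active = active

    def switch(self, name: str) -> None:
        # Swapping providers is a configuration change, not an application change.
        if name not in self._providers:
            raise KeyError(f"unknown provider: {name}")
        self._active = name

    def complete(self, prompt: str) -> str:
        return self._providers[self._active].complete(prompt)


router = ModelRouter(
    {"hosted-api": EchoModel("hosted-api"), "self-hosted": EchoModel("self-hosted")},
    active="hosted-api",
)
first = router.complete("Summarise the leave policy.")
router.switch("self-hosted")
second = router.complete("Summarise the leave policy.")
```

The calling code is identical before and after `switch`, which is the property that makes a cost-driven provider change cheap.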

Read the transcript

Auto-generated and lightly edited. Let us know about errors.

Dr Akua Sarpong: Welcome. Today we're going to talk about three production LLM deployments we've done in the last twelve months: at a tier-1 Nigerian bank, a West African public-sector agency, and a South African energy utility. The use cases are different. The patterns are surprisingly similar.

Adaobi Eze: The first lesson: start with RAG, not fine-tuning. We've been asked many times whether we recommend fine-tuning a base model on the customer's data. The honest answer is almost never, for a first deployment. RAG is faster to build, easier to debug, and gives you a clearer audit trail because you can show the customer which source documents the model used.

Dr Akua Sarpong: And the audit trail matters a lot in regulated industries. When a banker asks the assistant a compliance question and gets an answer, the bank needs to be able to explain which policy document that answer came from. RAG makes that trivial. Fine-tuning makes it impossible.

Adaobi Eze: The bank deployment was a customer-service assistant for internal staff. The model is a frontier model called through an API, with retrieval over the bank's internal policy library, about 4,000 documents. The query goes through a retrieval layer that ranks the top ten relevant documents, then the model generates a response grounded in those.

Dr Akua Sarpong: The second lesson: hallucinations are inevitable. The model will sometimes generate content that sounds confident but is wrong. The engineering question is: how do you reduce the rate? The product question is: how do you handle it when it happens?

Adaobi Eze: We use three engineering techniques. One: structured retrieval, so the model has to ground its answer in source documents. Two: confidence calibration, so we show the user how certain the model is. Three: explicit decline patterns. We train the model to say "I don't know" when retrieval doesn't return strong matches.

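Editor's sketch: the retrieve-then-ground flow with an explicit decline can be illustrated in a few lines of Python. This is not the speakers' implementation. The word-overlap `score` is a toy stand-in for a real retriever, the `min_score` threshold and the `generate` callable are hypothetical, and here the decline is enforced in application code rather than trained into the model as described in the session.

```python
from typing import Callable, Dict, List, Tuple


def score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words found in the document."""
    query_words = set(query.lower().split())
    doc_words = set(text.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


def retrieve(query: str, docs: Dict[str, str], k: int = 10) -> List[Tuple[str, float]]:
    """Rank all documents by score and keep the top k."""
    ranked = sorted(
        ((doc_id, score(query, text)) for doc_id, text in docs.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


def answer(
    query: str,
    docs: Dict[str, str],
    generate: Callable[[str, str], str],
    k: int = 10,
    min_score: float = 0.5,  # assumed threshold for a "strong match"
) -> dict:
    """Answer from retrieved sources, or decline when no match is strong enough."""
    hits = [(doc_id, s) for doc_id, s in retrieve(query, docs, k) if s >= min_score]
    if not hits:
        return {"answer": "I don't know.", "sources": []}
    context = "\n".join(docs[doc_id] for doc_id, _ in hits)
    # Returning the source ids alongside the answer is what gives the audit trail.
    return {"answer": generate(query, context), "sources": [d for d, _ in hits]}


policies = {
    "leave-policy": "staff annual leave is 20 days",
    "kyc-policy": "kyc checks are required for new accounts",
}
fake_model = lambda query, context: f"Based on policy: {context}"  # placeholder for a model call

grounded = answer("annual leave days", policies, fake_model)
declined = answer("cafeteria menu", policies, fake_model)
```

In this sketch `grounded` carries the id of the policy document it drew on, while `declined` returns the fixed refusal with an empty source list.
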
Dr Akua Sarpong: And on the product side, we always show the source documents alongside the answer. The user can click through, verify, and challenge. We've found that users develop a healthy skepticism quickly when they have that affordance. They check the source. They don't take the model on faith.

Adaobi Eze: The third lesson is about vendor lock-in. The LLM market is moving fast. The model that's best today may not be the cheapest in eighteen months. Your architecture should let you swap the model provider without rewriting the application.

Dr Akua Sarpong: We use a model abstraction layer. The application talks to that layer with a stable interface. Behind it, we can route to any of the major model providers, or to an open-source model we host ourselves. For one customer we've already swapped providers once based on cost-per-token changes, with no application code changes.

Adaobi Eze: The energy utility deployment is interesting because it's not a chat use case. It's an extraction use case. The utility receives thousands of supplier contracts in PDF. The model reads them and extracts key terms: pricing, term length, renewal clauses, dispute mechanisms. The output goes into a structured database that humans review.

Dr Akua Sarpong: That's a sweet spot for current LLM capability. The model isn't generating advice. It's structuring unstructured content. The human is still in the loop. The productivity gain has been substantial: about 40 hours per week of contract review work compressed to about 8.

Adaobi Eze: One last point. Don't deploy without an evaluation dataset. Every customer engagement starts with us building a ground-truth set of at least 500 queries with known correct answers. We run the model against that set on every change. Without it, you're flying blind.
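
Editor's sketch: a ground-truth evaluation run of the kind described in the closing point might look like the following in Python. The function names, the report fields, and the exact-match grading after whitespace and case normalisation are all assumptions for illustration. The session did not say how answers are graded, and a real harness would likely need a more forgiving comparison.

```python
from typing import Callable, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't count as failures."""
    return " ".join(text.lower().split())


def run_eval(model: Callable[[str], str], cases: List[Tuple[str, str]]) -> dict:
    """Run every (query, expected_answer) pair through the model and report accuracy."""
    failures = []
    for query, expected in cases:
        got = model(query)
        if normalize(got) != normalize(expected):
            failures.append((query, expected, got))
    total = len(cases)
    passed = total - len(failures)
    return {
        "total": total,
        "passed": passed,
        "accuracy": passed / total if total else 0.0,
        "failures": failures,
    }


# Two hypothetical cases; the session recommends at least 500.
ground_truth = [
    ("Is KYC required for new accounts?", "Yes"),
    ("How many days of annual leave?", "20 days"),
]
canned = {
    "Is KYC required for new accounts?": "yes ",
    "How many days of annual leave?": "25 days",
}
report = run_eval(lambda q: canned[q], ground_truth)
```

Running this on every prompt, retrieval, or provider change and comparing `report["accuracy"]` with the previous run is one simple way to catch regressions before they reach users.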

Resources

Full deck used in the session

Companion repository

Code referenced in the session
