Cutting compute costs 62% for a logistics unicorn.

A FinOps and platform engineering reset that took AfriLogistics from a $4.1M monthly cloud bill to $1.55M without retiring a single feature.

Client

AfriLogistics Group (anonymized)

Industry: Logistics

Region: Southern Africa

Duration: 5 months

Team: 9 engineers

Year: 2024

“We were burning a Series C round on AWS and didn't realise it. Spalce came in, mapped where the money was actually going, and stopped the bleeding without us slowing down a single delivery.”
Thandi Mokoena — VP Engineering, AfriLogistics Group

Executive Summary

The bottom line

AfriLogistics had grown its AWS footprint from a startup posture to a 1,200-microservice fleet in eighteen months. Compute spend was outpacing revenue growth. We ran a five-month FinOps and platform engineering engagement: introducing a workload-aware cluster autoscaler, rightsizing 84% of services, moving the analytics tier to Spot + Graviton, and re-architecting the three services that were responsible for 41% of the bill. Monthly compute spend fell 62% and unit economics on parcel delivery improved by $0.34 per shipment.

62%

Compute cost reduction

$4.1M to $1.55M monthly

$0.34

Saved per shipment

Unit economics improvement

84%

Services rightsized

Across 1,200-service fleet

5 mo

To full payback

Of the engagement cost

The Challenge

AfriLogistics had scaled from twelve engineers to two hundred and thirty in eighteen months on the back of a Series C. Its AWS bill had grown faster: from $180k a month to $4.1M, and the curve was still bending up. The CFO's signal to the board had been that gross margin would expand as the company scaled — instead, every additional shipment was bringing in less contribution than the one before it because compute costs were rising faster than revenue per parcel.

The internal platform team knew something was wrong but did not have the bandwidth to dig in. Every engineer had a kanban board full of feature work, and nobody owned the cluster autoscaler, the rightsizing policy, or the analytics workloads that had quietly moved from a nightly batch to a continuously running stream. Half the EKS nodes were running at 18% CPU utilisation; half the EMR clusters sat idle between runs and had been billed at on-demand pricing for nine months.

$4.1M monthly AWS spend growing 11% MoM against 6% MoM revenue growth
1,200 microservices, 84% of which had never had a rightsizing review
Analytics tier running entirely on on-demand x86 instances
No FinOps practice — engineers had no visibility into the cost of their own services
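
To make the first point concrete, here is a minimal sketch of how the gap between the two growth rates compounds. Only the 11% and 6% month-over-month figures come from the engagement; the function name and the twelve-month horizon are illustrative assumptions, and the result describes what would happen if both rates simply held.

```python
def spend_to_revenue_drift(spend_growth: float, revenue_growth: float, months: int) -> float:
    """Factor by which the spend/revenue ratio grows after `months` of compounding.

    Both growth rates are month-over-month fractions (0.11 means 11%).
    """
    if months < 0:
        raise ValueError("months must be non-negative")
    monthly_drift = (1 + spend_growth) / (1 + revenue_growth)
    return monthly_drift ** months


# Hypothetical horizon: twelve months at the stated rates.
drift_after_a_year = spend_to_revenue_drift(0.11, 0.06, 12)
```

At these rates, compute spend as a share of revenue grows by roughly three quarters within a year, which is why the margin story reversed so quickly.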

Our Approach

We treated this as a platform engineering engagement first and a FinOps engagement second. The hypothesis was that no amount of dashboarding would fix a system where engineers had no path to act on what they saw. We started by instrumenting cost-per-service in Datadog tied to deployment metadata, then built a self-service rightsizing recommender that engineers could approve with a single click from a Slack message.
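
A rightsizing recommender of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions, not the recommender we shipped: it assumes CPU usage samples in millicores, sizes the request at the 95th percentile plus a headroom factor, and only raises a recommendation when the saving clears a minimum threshold. All names, the percentile, the headroom, and the threshold are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Recommendation:
    service: str
    current_millicores: int
    proposed_millicores: int

    @property
    def reduction_pct(self) -> float:
        return 100 * (1 - self.proposed_millicores / self.current_millicores)


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile for pct in (0, 100]."""
    if not samples:
        raise ValueError("no usage samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend(
    service: str,
    current_millicores: int,
    usage_samples: Sequence[float],
    headroom: float = 0.25,          # assumed safety margin above p95
    min_reduction_pct: float = 20.0,  # assumed threshold to avoid noisy suggestions
) -> Optional[Recommendation]:
    """Return a smaller CPU request if observed usage justifies one, else None."""
    target = math.ceil(percentile(usage_samples, 95) * (1 + headroom))
    if target >= current_millicores:
        return None
    rec = Recommendation(service, current_millicores, target)
    return rec if rec.reduction_pct >= min_reduction_pct else None
```

The threshold matters for the approval flow: a recommendation that saves only a few percent is not worth an engineer's click, so it is never sent.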

On the architectural side, we identified three services — geocoding, ETA prediction, and the parcel-event stream processor — that together accounted for 41% of the compute bill. Each got a focused two-week rebuild: geocoding moved from a brute-force lookup to a tiered cache, ETA prediction moved from synchronous to event-driven with a pre-computed prefix tree, and the stream processor moved from Kafka Streams to a Flink job on Graviton.
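
The tiered-cache idea behind the geocoding rebuild can be illustrated as follows. This is a simplified sketch, not the production design: it assumes a small in-process LRU tier in front of a shared tier (a plain dict stands in for whatever shared store is used), with the expensive lookup as the last resort. The class name, the key normalisation, and the `resolve` callback are all hypothetical.

```python
from collections import OrderedDict
from typing import Callable, Dict, Tuple

Coordinates = Tuple[float, float]


class TieredGeocoder:
    """Look up an address in L1 (in-process LRU), then L2 (shared), then the origin."""

    def __init__(self, resolve: Callable[[str], Coordinates], l1_capacity: int = 1024):
        self._resolve = resolve                      # expensive origin lookup
        self._l1: "OrderedDict[str, Coordinates]" = OrderedDict()
        self._l1_capacity = l1_capacity
        self._l2: Dict[str, Coordinates] = {}        # stand-in for a shared cache
        self.hits = {"l1": 0, "l2": 0, "origin": 0}

    def lookup(self, address: str) -> Coordinates:
        # Normalise case and whitespace so trivially different inputs share a key.
        key = " ".join(address.lower().split())
        if key in self._l1:
            self._l1.move_to_end(key)                # mark as most recently used
            self.hits["l1"] += 1
            return self._l1[key]
        if key in self._l2:
            self.hits["l2"] += 1
            coords = self._l2[key]
        else:
            self.hits["origin"] += 1
            coords = self._resolve(key)
            self._l2[key] = coords
        self._remember(key, coords)
        return coords

    def _remember(self, key: str, coords: Coordinates) -> None:
        self._l1[key] = coords
        self._l1.move_to_end(key)
        if len(self._l1) > self._l1_capacity:
            self._l1.popitem(last=False)             # evict least recently used
```

The `hits` counters are there so the hit rate per tier can be measured; the compute saving depends on how rarely a request falls through to the origin lookup.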

What We Built

The fleet-wide changes did the volume work: Karpenter replaced cluster-autoscaler with workload-aware bin packing; 78% of stateless workloads moved to Spot, protected by PodDisruptionBudgets (PDBs) so that services stayed available through Spot interruptions; the analytics tier moved to Graviton-backed Spot fleets, with EMR Serverless for spiky workloads. Each engineering team got a weekly cost report tied to their services with a one-click rightsizing approval flow.

Karpenter-based workload-aware autoscaling across 14 EKS clusters
Spot adoption rising from 6% to 78% of stateless capacity with zero customer-facing SLO breaches
Geocoding service rebuilt as a tiered cache, dropping from 320 vCPUs to 22
ETA prediction moved to a pre-computed event-driven model on Graviton
FinOps Slack bot delivering per-team cost reports with one-click rightsizing approval
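
To show why packing by actual resource requests reduces node count, here is a minimal first-fit-decreasing sketch. It is a textbook heuristic used purely for illustration and is not a description of Karpenter's own scheduling logic, which also weighs instance types, zones, and other constraints. The function name and the single CPU dimension are assumptions.

```python
from typing import List


def first_fit_decreasing(pod_cpu_requests: List[int], node_capacity: int) -> List[List[int]]:
    """Pack pod CPU requests onto as few equal-sized nodes as the heuristic finds.

    Returns one list of requests per node.
    """
    nodes: List[List[int]] = []
    free: List[int] = []  # remaining capacity per node, same order as `nodes`
    for request in sorted(pod_cpu_requests, reverse=True):
        if request > node_capacity:
            raise ValueError(f"request {request} exceeds node capacity {node_capacity}")
        for i, remaining in enumerate(free):
            if request <= remaining:
                nodes[i].append(request)
                free[i] -= request
                break
        else:
            # No existing node has room, so open a new one.
            nodes.append([request])
            free.append(node_capacity - request)
    return nodes
```

For example, requests of 5, 4, 3, 2, and 2 vCPUs fit on two 8-vCPU nodes instead of the five a one-pod-per-node layout would need. Rightsizing feeds directly into this: smaller, more accurate requests leave more room to pack.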

“The dashboard didn't fix it. The Slack bot fixed it. Once an engineer could approve a rightsize without leaving a thread, the whole org's habit changed in six weeks.”
Sipho Dlamini — Head of Platform, AfriLogistics Group

The Outcome

Monthly compute spend landed at $1.55M by the end of month five — a 62% reduction against the baseline. The cost curve flattened against revenue, and the CFO presented a revised contribution-margin trajectory at the next board meeting. The three rebuilt services each came in under their target latency budget while running on a fraction of the previous capacity.
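
The headline figures can be checked with simple arithmetic. The helper below is a small sketch whose name is illustrative; the inputs are the before and after figures stated in this case study.

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100 * (before - after) / before


# Monthly compute spend in $M, as stated above.
monthly_saving_musd = 4.1 - 1.55
compute_reduction = reduction_pct(4.1, 1.55)

# Geocoding footprint in vCPUs, as stated above.
geocoding_reduction = reduction_pct(320, 22)
```

The spend figures give a reduction of just over 62% and a monthly saving of about $2.55M; the geocoding rewrite alone removed roughly 93% of that service's vCPUs.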

Equally important, the FinOps practice stuck. Six months after the engagement closed, the cost-per-shipment metric had continued to fall as engineering teams kept acting on weekly cost reports. AfriLogistics has since hired its first dedicated FinOps engineer to own the platform we built.

Results

By the numbers

62%

Compute cost reduction

$2.55M monthly savings

78%

Spot adoption

up from 6%

320 → 22

Geocoding vCPUs

tiered-cache rewrite

0

Customer SLO breaches

during the migration

Lessons Learned

FinOps engagements fail when they remain dashboards. They succeed when there is a low-friction path from insight to action — and that path almost always lives in the tool the engineers already use all day. The Slack-bot loop was a deliberately small piece of work that did more for the cost curve than any of the architectural rebuilds. The architectural work was necessary, but it was the cultural rewiring that compounded after we left.

Put the cost data where engineers already work, not in a separate dashboard
Identify the 3 services that drive 40%+ of cost and rebuild those first
Karpenter + Spot + Graviton is a stack that compounds — start with all three
Hire a FinOps owner before the engagement ends or the savings will regress
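
Finding the handful of services that drive a large share of cost is a simple ranking exercise once cost-per-service data exists. The sketch below assumes a mapping of service name to monthly cost; the function name and the 40% default are illustrative, and the sample data in the usage note is made up.

```python
from typing import Dict, List, Tuple


def top_cost_drivers(monthly_cost: Dict[str, float], share: float = 0.40) -> List[Tuple[str, float]]:
    """Smallest set of top-spending services whose combined cost reaches `share` of the total."""
    total = sum(monthly_cost.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    picked: List[Tuple[str, float]] = []
    running = 0.0
    ranked = sorted(monthly_cost.items(), key=lambda item: item[1], reverse=True)
    for name, cost in ranked:
        if running / total >= share:
            break
        picked.append((name, cost))
        running += cost
    return picked
```

With hypothetical costs of `{"svc-a": 30, "svc-b": 25, "svc-c": 20, "svc-d": 15, "svc-e": 10}`, the function returns `svc-a` and `svc-b`, which together account for 55% of the total.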

Tech Stack

AWS, Kubernetes, Karpenter, Graviton, Flink, Datadog, Terraform

Services Engaged

Cloud & DevOps

Platform Engineering

Custom Software Development
