A practical framework for evaluating whether to build a custom data platform or adopt a modern data stack with off-the-shelf tools — from a team that has helped 20+ Nordic companies answer this question.
Author
Tom Bergström
Published
28 April 2026
Reading time
9 min read
Topics
data-platform, architecture, nordic-tech, build-vs-buy
The build-vs-buy decision in data platform architecture is one of the most consequential infrastructure choices a mid-market company can make. It is also one of the most poorly framed.
The framing that gets companies into trouble goes something like this: we can buy a managed platform and be operational faster, or we can build something custom and own it long-term. Which is cheaper?
That is not the right question. The right question is: which option will we still be able to operate, extend, and trust in three years — given the team we have, the data volumes we expect, and the strategic outcomes we are actually trying to achieve?
Managed data platforms — Snowflake, Databricks, Azure Synapse, and their equivalents — have become genuinely good. They abstract away infrastructure management, scale horizontally, and offer connectors to most of the systems mid-market companies run. The case for them is real.
But 'buy' in data infrastructure is not the same as buying software. It is more like renting a city. You have a lot of freedom inside the walls. The walls still exist.
What managed platforms do well
What they do less well
Egress costs compound quickly at mid-market data volumes. This is the number that surprises most buyers 18 months in.
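To make the compounding concrete, here is a minimal sketch of how a per-GB egress charge accumulates when the volume leaving the platform grows each month. The function name, growth rate, and price are hypothetical placeholders chosen for illustration, not any vendor's actual pricing; substitute the figures from your own contract.

```python
def cumulative_egress_cost(start_gb_per_month, monthly_growth_rate, price_per_gb, months):
    """Total egress spend over `months`, with volume growing by a fixed rate each month.

    All inputs are assumptions supplied by the caller; nothing here reflects
    a real vendor price list.
    """
    total = 0.0
    volume = start_gb_per_month
    for _ in range(months):
        total += volume * price_per_gb       # pay for this month's egress
        volume *= 1 + monthly_growth_rate    # next month's volume is larger
    return total


# Placeholder inputs: 100 GB/month leaving the platform, 0.5 currency units per GB.
flat = cumulative_egress_cost(100, 0.0, 0.5, 18)      # no growth
growing = cumulative_egress_cost(100, 0.05, 0.5, 18)  # 5% monthly growth (assumed)
print(f"18-month cost, flat volume:    {flat:.2f}")
print(f"18-month cost, growing volume: {growing:.2f}")
```

Running the comparison with your own numbers shows how far the 18-month total drifts from a flat-volume estimate made at purchase time.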
Building a custom data platform does not mean starting from a blank terminal. It means assembling a stack from components — orchestration layer, storage layer, transformation layer, serving layer — and making deliberate choices at each level.
The modern open-source data stack (dbt, Apache Airflow or Prefect, Delta Lake or Iceberg, with AWS or Azure as the compute and storage backbone) is mature enough that a well-architected build is no longer the multi-year project it was five years ago. The component selection is still consequential, but the components themselves are battle-tested.
What a well-built platform does well
What it requires
Rather than a binary, the useful model is a set of questions that reveal which direction fits your situation.
| Condition | Direction |
|---|---|
| Your data volumes are under 500GB/month | Managed platform is likely justified — build overhead exceeds benefit |
| You have in-house data engineers | Build is viable. Without them, managed is safer short-term |
| You need results in 90 days | Managed wins on speed. It is not a close contest |
| You have 5+ years of platform horizon | Build economics improve significantly at this timeline |
| Your data is your product | Build. You cannot outsource the core of your product to a vendor roadmap |
| Compliance is primary (healthcare, finance) | Evaluate managed platforms with strong compliance tooling first |
| Cost predictability matters more than speed | Build, with proper architecture. Managed pricing models are variable by design |
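The table can be read as a checklist, and one way to make that explicit is to encode each row as a rule that emits a direction and a reason. The sketch below does that in Python. The `Situation` fields and function names are hypothetical, and the simple count in `tally` is an illustrative assumption: the table itself assigns no weights, so treat the output as a discussion aid rather than a verdict.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Inputs mirroring the table's conditions (field names are hypothetical)."""
    monthly_volume_gb: float
    has_inhouse_data_engineers: bool
    needs_results_in_90_days: bool
    horizon_years: float
    data_is_product: bool
    compliance_is_primary: bool
    cost_predictability_over_speed: bool


def directional_signals(s: Situation) -> list[tuple[str, str]]:
    """Return one (direction, reason) pair for every table row that applies."""
    signals = []
    if s.monthly_volume_gb < 500:
        signals.append(("managed", "volumes under 500 GB/month"))
    if s.has_inhouse_data_engineers:
        signals.append(("build", "in-house data engineers make build viable"))
    else:
        signals.append(("managed", "no in-house data engineers"))
    if s.needs_results_in_90_days:
        signals.append(("managed", "results needed in 90 days"))
    if s.horizon_years >= 5:
        signals.append(("build", "5+ year platform horizon"))
    if s.data_is_product:
        signals.append(("build", "data is the product"))
    if s.compliance_is_primary:
        signals.append(("managed", "compliance is primary, evaluate managed first"))
    if s.cost_predictability_over_speed:
        signals.append(("build", "cost predictability matters more than speed"))
    return signals


def tally(signals: list[tuple[str, str]]) -> dict[str, int]:
    """Count signals per direction. Unweighted by assumption."""
    counts = {"managed": 0, "build": 0}
    for direction, _reason in signals:
        counts[direction] += 1
    return counts


example = Situation(
    monthly_volume_gb=300,
    has_inhouse_data_engineers=False,
    needs_results_in_90_days=True,
    horizon_years=2,
    data_is_product=False,
    compliance_is_primary=False,
    cost_predictability_over_speed=False,
)
for direction, reason in directional_signals(example):
    print(f"{direction}: {reason}")
print(tally(directional_signals(example)))
```

Keeping the reasons alongside the directions matters more than the count: a single "data is the product" signal may reasonably outweigh several signals pointing the other way.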
The single most important thing we have learned from building data platforms is that the architecture decision cannot be deferred. This is true for both build and buy — but the consequences of getting it wrong differ.
With a managed platform, a poor architecture means rework inside the vendor's constraints. You can fix it, but you pay in migration cost and accumulated technical debt. With a custom build, a poor architecture means rework at the infrastructure level. The cost is higher and the timeline longer.
In either case, the structure has to be decided before a single pipeline runs. Requirements need to be understood deeply — not just the immediate use cases, but the likely evolution of those use cases over the next three to five years. Data schemas change. Business logic changes. Query patterns change. An architecture that cannot absorb those changes gracefully will not survive them.
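As a small illustration of what absorbing schema change can look like in practice, the sketch below compares two versions of a table schema and separates additive changes from breaking ones. The function names and the rule that only added columns are safe are assumptions made for this example (a common convention, not something prescribed here); real platforms usually need finer-grained rules, for instance around type widening.

```python
def classify_schema_change(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {column_name: type_name} schemas and group the differences."""
    added = sorted(col for col in new if col not in old)
    removed = sorted(col for col in old if col not in new)
    retyped = sorted(col for col in old if col in new and old[col] != new[col])
    return {"added": added, "removed": removed, "retyped": retyped}


def is_backward_compatible(change: dict[str, list[str]]) -> bool:
    """Assumed rule: a change is safe only if nothing was removed or retyped."""
    return not change["removed"] and not change["retyped"]


# Hypothetical example schemas.
orders_v1 = {"order_id": "string", "amount": "decimal"}
orders_v2 = {"order_id": "string", "amount": "float", "currency": "string"}

change = classify_schema_change(orders_v1, orders_v2)
print(change)
print("backward compatible:", is_backward_compatible(change))
```

A check like this can run before a pipeline deploys, so that breaking changes surface as a review item rather than as a failed downstream query.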
The architecture decision cannot be deferred. The structure has to be decided before a single pipeline runs.
— Tom Bergström, CTO & Co-Founder, Indpro AB
Data platform architecture cannot be evaluated without considering how AI workloads will interact with it. This changes the analysis in a few important ways.
Managed platforms have invested heavily in AI-ready infrastructure — vector stores, embedding pipelines, model serving integrations. If AI workloads are likely to be significant within two years, the managed platform's head start matters more than it would have three years ago.
But the same caution applies: if your data is the substrate for your AI models, vendor lock-in on the data layer means vendor dependency on your model quality. That is a strategic exposure that some companies are comfortable with and some are not.
The principle that applies to AI tooling in general also applies here: adopt what is stable and proven. The companies that integrated agentic AI workflows into their data platforms without architectural controls are the ones with the most expensive technical debt to unwind right now.
Buy if: you need speed, your volumes are moderate, your team does not have deep data engineering capability, and your primary planning horizon is shorter than three to five years.
Build if: data is central to your product, you have the engineering capability to operate it (or can acquire it), you are planning for five-plus years, and cost predictability at scale matters.
Hybrid if: you are at mid-scale, building fast, and want to de-risk the lock-in question. Start with a managed platform for the use cases where it excels; build the components where you need ownership. This is increasingly the architecture that makes sense — but it requires a clear overall design, or you end up with neither the advantages of managed nor the freedom of custom.
If the capability question is about the team rather than the platform, see "Building Scalable Tech Teams with India" for how Nordic companies are solving the data engineering talent gap. For companies that want to hand off data platform delivery entirely, the Data Team as a Service model covers architecture through continuous delivery.
Should a mid-market company build or buy a data platform?
It depends on data volume, team capability, and planning horizon. Buy if you need results within 90 days, have moderate data volumes under 500GB per month, or lack in-house data engineering capability. Build if data is central to your product, you are planning for 5+ years, or cost predictability at scale matters. A hybrid approach — managed platform for standard use cases, custom-built components where you need ownership — increasingly makes sense for mid-market companies.
What are the main risks of building a custom data platform?
The primary risk is architectural: if the structure is wrong before the first pipeline runs, rework is expensive at the infrastructure level. Secondary risks include ongoing operational responsibility (observability, incident response, schema governance), the need for engineers who understand architecture and not just data engineering, and a longer path to first value compared to managed platforms. These risks are manageable with the right team and a disciplined architecture-first approach.
How long does it take to implement a data platform?
A managed platform (Snowflake, Databricks, Azure Synapse) can be ingesting and querying data within days to weeks. A custom-built platform using modern open-source components — dbt, Airflow, Delta Lake — typically takes 2–4 months for the core architecture and initial pipelines, and 4–8 months to reach production-ready state with observability and governance. Timeline varies significantly based on data complexity and team capability.
What does a data engineering team from Indpro cost?
Indpro's Data Team as a Service engagements run 200,000–600,000 SEK per month depending on team size and scope. This covers data engineering, AI/ML model building, analytics and BI, and continuous delivery — with a named delivery lead and weekly output. The AI Readiness Sprint, which takes you from data audit to two AI use cases in production, is priced at 300,000–800,000 SEK.

CTO & Co-Founder
Tom leads Indpro's technology strategy and engineering standards. With 20+ years of experience building and leading engineering teams across the Nordic region, he ensures every engagement delivers at the highest technical level.