How to Set Up an AI Center of Excellence

Most enterprises do not fail at artificial intelligence because the models are too weak. They fail because they have twelve disconnected pilots, three copies of the same retrieval pipeline, no shared evaluation harness, and nobody who owns the question of whether any of it is safe to put in front of a customer. An AI center of excellence exists to fix exactly this: it is the organisational unit that concentrates scarce expertise, sets standards, and turns one-off experiments into a repeatable capability the rest of the business can draw on. Done well, it is the difference between spending two years proving that large language models can summarise a document and spending two years shipping systems that move real numbers.

This article is a practical guide to setting one up, aimed at the people who will actually be accountable for it: engineering leaders, heads of data, founders and CTOs. We will treat the AI CoE not as a slide in a transformation deck but as a working system with a charter, an operating model, a staffing plan, a technology platform and a governance function. The emphasis throughout is on trade-offs and concrete steps rather than aspiration, because the hard part is never writing the mission statement. It is deciding what the team owns versus what it merely advises on, how it funds itself, and how it avoids becoming either a bottleneck or an ivory tower.

Decide what the center of excellence is actually for

Before you hire anyone, write down the mandate in a single page, because the shape of the team follows directly from what you are asking it to do. There are three common flavours, and confusing them is the most frequent early mistake. The first is a delivery model, where the CoE builds and ships AI products directly. The second is an enablement model, where it builds shared platforms, patterns and training so that product teams ship their own work. The third is a governance-and-standards model, where it sets policy, reviews high-risk use cases and stays out of day-to-day delivery. Most successful teams are a deliberate blend, but the blend must be explicit and it must change over time.

A useful sequencing is to start delivery-heavy and shift towards enablement. In the first year the CoE ships two or three flagship use cases itself, precisely because doing the work is how you discover which patterns, tools and guardrails are actually needed. Once those patterns exist, the centre of gravity moves outward: the CoE productises what it learned into reusable components and coaches embedded teams to use them. If you start in pure enablement mode, you tend to build platforms nobody asked for; if you never leave delivery mode, you become a permanent internal agency and the rest of the organisation never develops muscle.

Write the mandate as a set of decisions the team is allowed to make, not a list of values. For example: the AI CoE owns the standard for how models are evaluated before release, owns the shared retrieval and inference platform, and has veto authority over customer-facing generative features until a defined review is passed. It does not own the product roadmap of any business unit. Clarity on these boundaries is what prevents the political friction that quietly kills these teams.

Choose an operating model: centralised, federated or hub-and-spoke

The single biggest structural choice is how the AI operating model relates to the rest of the organisation. A fully centralised model puts all AI talent in one team that takes requests from the business. It maximises consistency and simplifies governance, but it becomes a queue, and domain context is lost in translation. A fully federated model embeds AI engineers inside each business unit. It maximises domain fluency and speed, but you get duplicated infrastructure, inconsistent quality, and no coherent view of risk. Neither extreme survives contact with a mid-sized enterprise.

The hub-and-spoke pattern is the pragmatic default and worth adopting deliberately rather than by accident. A small central hub owns the platform, the standards, the evaluation tooling and the governance process. Spokes are AI engineers or product-embedded specialists who sit with business units but adhere to hub standards and contribute back to shared components. The hub is measured on how much it accelerates the spokes; the spokes are measured on business outcomes. A lightweight guild or community of practice ties them together with a shared code repository, regular design reviews and a common backlog for platform requests.

Whichever model you pick, decide funding at the same time, because funding shapes behaviour more than any org chart. A centrally funded CoE can invest in long-horizon platform work but risks building things no one values. A chargeback model, where business units pay for the CoE's time, forces relevance but starves foundational investment. A common compromise is central funding for platform and governance, with delivery work co-funded by the requesting unit so they have skin in the game.

Get the enterprise AI team structure and roles right

Enterprise AI team structure is where good intentions meet reality, because the skills you need are broader than most people assume and genuinely scarce. Beyond the obvious machine learning and applied research roles, a functioning CoE needs strong data engineering, because most AI failures are data failures in disguise. It needs machine-learning platform or MLOps engineers who own deployment, monitoring and cost. It needs software engineers who can wrap models into reliable services with proper error handling and fallbacks. And increasingly it needs people fluent in building on foundation models and agent frameworks, which is a different discipline from training models from scratch.

Two roles are chronically underweighted. The first is the product manager or product owner who translates a vague business ask into a scoped, measurable problem and who says no to work that will not pay off. Without this role the CoE drowns in demos that never reach production. The second is an evaluation or quality specialist who owns test sets, offline and online metrics, and red-teaming. In systems built on probabilistic models, the person who can tell you whether the thing actually works is worth more than another model builder.

Resist the urge to hire a large team on day one. A tight founding group of five to eight strong generalists who can each cover two roles will outperform a bigger team of narrow specialists, and it lets you learn what you actually need before you scale. Hire for judgement and range early, then specialise as the platform and portfolio mature. Plan explicitly for how you will retain these people, because the market for this talent is brutal: interesting problems, real production ownership and a clear path to impact matter more than any perk.

Learn from practitioners in Dubai

Previous editions of World AI Technology Expo Dubai have brought together senior AI practitioners and leaders. Speakers below are shown for reference from previous editions; the 2026 line-up will be announced ahead of the event.

Nitin Akarte, Microsoft. AI Network Director. United States.
Akshay Singh Dalal, Google. Head of Regional Risk & Compliance. United Arab Emirates.
James Hunter, IBM. Program Director, Driving DevOps Automation and AI. United Kingdom.
Abhinav Sharma, Cisco. CTO & Director - AI & Automation Leader. India.

Build the shared platform and reusable patterns

The clearest justification for centralising anything is the platform, because rebuilding the same plumbing in every team is pure waste. A mature AI CoE provides a small number of paved roads: a standard way to access foundation models and large language models with logging, rate limiting and cost controls; a shared retrieval stack including vector databases and document-processing pipelines; experiment-tracking tools and a model registry; and a deployment path that includes monitoring for latency, cost, drift and quality regressions. The goal is that a new use case starts with roughly eighty per cent of its infrastructure already in place and spends its effort on the twenty per cent that is domain-specific.
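As a rough sketch of what the first of those paved roads might look like, the wrapper below puts logging, a per-team rate limit and a budget check in front of any model call. Everything in it is hypothetical: the ModelGateway class, the call_model callable, the flat price per thousand tokens and the whitespace-based token estimate are illustrative assumptions, not any vendor's API.

```python
import logging
import time
from collections import defaultdict, deque
from typing import Callable

logger = logging.getLogger("model_gateway")


class RateLimited(RuntimeError):
    """Raised when a team exceeds its allowed calls per window."""


class BudgetExceeded(RuntimeError):
    """Raised when a team has used up its (hypothetical) budget."""


class ModelGateway:
    """Hypothetical single entry point for model calls across teams."""

    def __init__(self, call_model: Callable[[str], str], max_calls: int,
                 window_seconds: float, budget_usd: float,
                 usd_per_1k_tokens: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._call_model = call_model
        self._max_calls = max_calls
        self._window = window_seconds
        self._budget = budget_usd
        self._price = usd_per_1k_tokens
        self._clock = clock
        self._calls = defaultdict(deque)  # team -> timestamps of recent calls
        self.spend = defaultdict(float)   # team -> USD spent so far

    def complete(self, team: str, prompt: str) -> str:
        now = self._clock()
        recent = self._calls[team]
        # Sliding window: forget calls older than the window.
        while recent and now - recent[0] >= self._window:
            recent.popleft()
        if len(recent) >= self._max_calls:
            raise RateLimited(f"{team} exceeded {self._max_calls} calls per window")
        if self.spend[team] >= self._budget:
            raise BudgetExceeded(f"{team} has used its budget")

        recent.append(now)
        response = self._call_model(prompt)
        # Crude token estimate (whitespace split); a real gateway would use
        # the usage figures reported by the model provider.
        tokens = len(prompt.split()) + len(response.split())
        cost = tokens / 1000 * self._price
        self.spend[team] += cost
        logger.info("team=%s tokens=%d cost_usd=%.6f", team, tokens, cost)
        return response
```

The design choice worth noting is that teams never call a model directly: they call the gateway, so logging, limits and cost attribution come for free and can be changed in one place.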

Be disciplined about what you standardise versus what you leave open. Standardise the things where inconsistency creates risk or duplicated cost: identity and access, data handling, evaluation harnesses, secrets management, and how models are called and monitored. Leave teams freedom in the areas where experimentation still has value, such as prompting strategies, choice of orchestration approach, or which retrieval technique fits a given corpus. Over-standardising too early freezes decisions before you understand the problem space and drives teams to build shadow systems around your platform.

Treat the platform as a product with internal customers, not a mandate. That means a real backlog, versioning, documentation, office hours and a feedback loop from the spokes. Track adoption honestly: if embedded teams are routing around your paved road, that is data about your platform, not about their discipline. The strongest signal that a CoE is working is that shipping an AI feature on the shared platform is genuinely faster and safer than doing it independently.

Stand up an AI governance team without becoming a blocker

Governance is where most people either do far too little or turn the CoE into a compliance checkpoint that everyone learns to avoid. The right approach is risk-tiered and proportionate: classify use cases by potential impact, and apply lighter or heavier review accordingly. An internal tool that drafts text a human always edits carries very different risk from an automated decision that affects a customer with no human in the loop. Your AI governance team should spend its energy on the high-tier cases and get out of the way on the low-tier ones, ideally through self-service checklists and automated policy checks baked into the platform.
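One way to make the tiering concrete is a small classifier that product teams can run themselves. The sketch below is illustrative only: the three attributes, the tier names and the rules that map one to the other are assumptions chosen to mirror the two examples above, and a real policy would be set by the governance owner.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = "self-service checklist"
    MEDIUM = "peer review plus automated policy checks"
    HIGH = "full governance review"


@dataclass(frozen=True)
class UseCase:
    name: str
    customer_facing: bool
    makes_consequential_decisions: bool
    human_reviews_every_output: bool


def classify(use_case: UseCase) -> RiskTier:
    """Assign a review tier using assumed, illustrative rules."""
    exposed = use_case.customer_facing or use_case.makes_consequential_decisions
    if exposed and not use_case.human_reviews_every_output:
        return RiskTier.HIGH      # no human in the loop on an exposed system
    if exposed:
        return RiskTier.MEDIUM    # exposed, but a human checks each output
    return RiskTier.LOW           # internal and low impact


drafting_tool = UseCase("internal drafting assistant", False, False, True)
auto_decision = UseCase("automated customer decision", True, True, False)
print(classify(drafting_tool).name, classify(auto_decision).name)
```

Keeping the rules this small is deliberate: if a team cannot predict its own tier in a minute, the self-service path will not be used.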

A practical governance function covers a defined set of concerns: what data may be used and how it is handled; how systems are evaluated before and after release; how outputs are monitored for quality, bias and failure modes; how human oversight is designed for consequential decisions; and how incidents are detected and rolled back. Rather than a separate bureaucracy, embed these as gates in the delivery workflow, so passing review is a step in shipping rather than a parallel process. The most effective governance is invisible because it is built into the paved road.
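To illustrate what a gate in the delivery workflow could look like, here is a minimal sketch of a release check that a deployment pipeline might call before shipping. The gate names, the metric names and the thresholds are hypothetical placeholders; the point is only that review becomes a function the pipeline runs rather than a meeting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    name: str
    passed: bool
    detail: str


def run_release_gates(metrics: dict[str, float],
                      minimums: dict[str, float],
                      data_approved: bool,
                      rollback_tested: bool) -> list[GateResult]:
    """Evaluate every gate and return all results, not just the first failure."""
    results = [
        GateResult("data_handling", data_approved,
                   "approved" if data_approved else "data sources not approved"),
        GateResult("rollback", rollback_tested,
                   "tested" if rollback_tested else "no tested rollback path"),
    ]
    for metric, minimum in minimums.items():
        value = metrics.get(metric)
        passed = value is not None and value >= minimum
        detail = ("metric missing" if value is None
                  else f"{value:.2f} vs minimum {minimum:.2f}")
        results.append(GateResult(f"eval:{metric}", passed, detail))
    return results


def can_ship(results: list[GateResult]) -> bool:
    """A release proceeds only when every gate has passed."""
    return all(r.passed for r in results)
```

Returning every result, rather than stopping at the first failure, lets a team fix all the gaps in one pass instead of discovering them one review at a time.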

Make sure the governance function is staffed by people who understand both the technology and the domain, and give it teeth without giving it a veto over everything. A single accountable owner for AI risk, supported by domain experts pulled in per case, beats a standing committee that meets monthly and slows everyone down. Keep the standards documented, versioned and public inside the company, and revisit them regularly, because the regulatory and technical landscape in 2026 is still moving quickly and a policy written a year ago is probably already stale.

Fund it, measure it and prove value early

A CoE that cannot demonstrate value in its first two or three quarters will lose its budget in the next planning cycle, so treat early proof as a design constraint. Pick initial use cases using a simple two-axis filter: business value and feasibility. The sweet spot is high-value problems where the data exists, the success metric is clear, and a human can validate outputs. Avoid the seductive moonshot as your first project; a boring internal workflow that saves measurable hours is a far better opening because it ships, it teaches the platform, and it builds credibility you can spend later.
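The two-axis filter can be sketched in a few lines. The 1-to-5 scales, the cut-off of 4 on each axis and the example candidates below are all assumptions for illustration; what matters is that both axes must clear the bar, so a high-value but infeasible moonshot is excluded rather than averaged in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    business_value: int  # 1 (low) to 5 (high), scored with the sponsor
    feasibility: int     # 1 (low) to 5 (high): data exists, metric is clear


def shortlist(candidates: list[Candidate],
              min_value: int = 4,
              min_feasibility: int = 4) -> list[Candidate]:
    """Keep candidates that clear both thresholds, strongest first."""
    picked = [c for c in candidates
              if c.business_value >= min_value and c.feasibility >= min_feasibility]
    return sorted(picked,
                  key=lambda c: c.business_value * c.feasibility,
                  reverse=True)


candidates = [
    Candidate("autonomous pricing moonshot", business_value=5, feasibility=1),
    Candidate("internal ticket triage", business_value=4, feasibility=5),
    Candidate("meeting-notes summariser", business_value=2, feasibility=5),
]
for c in shortlist(candidates):
    print(c.name)
```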

Measure the CoE on outcomes, not activity. The number of pilots run and the number of models trained are vanity metrics; the ones that matter are production deployments, business impact of those deployments, time-to-production for new use cases, and reuse of shared components across teams. The quiet but powerful one is the trend in time-to-production: if a fresh use case gets from idea to production faster each quarter, the platform and patterns are actually compounding. Report these honestly, including the things that were killed, since a healthy portfolio kills more ideas than it ships.
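A minimal sketch of how these outcome metrics might be computed from a simple portfolio log follows. The record fields and the choice of the median as the summary statistic are assumptions; business impact is left out because it depends on how each deployment's value is measured.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class UseCaseRecord:
    name: str
    proposed: date
    in_production: Optional[date]  # None if killed or still in flight
    killed: bool
    shared_components_used: int


def portfolio_metrics(records: list[UseCaseRecord]) -> dict:
    """Summarise outcomes rather than activity for a set of use cases."""
    shipped = [r for r in records if r.in_production is not None]
    days = [(r.in_production - r.proposed).days for r in shipped]
    reusing = sum(1 for r in shipped if r.shared_components_used > 0)
    return {
        "production_deployments": len(shipped),
        "killed": sum(1 for r in records if r.killed),
        "median_days_to_production": median(days) if days else None,
        "share_reusing_components": reusing / len(shipped) if shipped else None,
    }
```

Computing the same summary per quarter and comparing the medians gives the time-to-production trend; the killed count is reported alongside shipped work rather than hidden.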

Budget for the unglamorous costs that sink projects: data preparation and labelling, inference and compute spend at production scale, evaluation infrastructure, and ongoing maintenance of things already live. Running costs for systems built on foundation models can dwarf development costs, and a CoE that does not track cost per use case will be surprised by its cloud platform bill. Build a simple internal chargeback or at least a cost-visibility dashboard so that value can be weighed against spend for every deployment.
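The arithmetic behind that surprise is simple enough to sketch. The function below estimates monthly inference spend for one use case; the request volume, token counts and per-thousand-token prices in the example are placeholder assumptions, not real provider pricing.

```python
def monthly_inference_cost(requests_per_day: float,
                           input_tokens: float,
                           output_tokens: float,
                           usd_per_1k_input: float,
                           usd_per_1k_output: float,
                           days: int = 30) -> float:
    """Estimate monthly spend for one use case from average per-request usage."""
    per_request = (input_tokens / 1000 * usd_per_1k_input
                   + output_tokens / 1000 * usd_per_1k_output)
    return requests_per_day * days * per_request


# Hypothetical workload: 20,000 requests a day, 1,500 input tokens and
# 300 output tokens per request, at placeholder prices.
cost = monthly_inference_cost(20_000, 1_500, 300,
                              usd_per_1k_input=0.003,
                              usd_per_1k_output=0.015)
print(f"Estimated monthly inference cost: ${cost:,.2f}")
```

Under these assumed figures each request costs under a cent, yet the monthly total comes to roughly 5,400 dollars for a single use case, which is why per-use-case cost visibility belongs on the dashboard from the first deployment.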

Scale, embed and keep the organisation learning

Once the first flagship deliveries land, the CoE's job shifts from doing to multiplying, and the teams that miss this transition ossify. Scaling means deliberately moving capability outward: rotating embedded engineers back into the hub and out again, running internal enablement programmes, publishing reusable templates, and creating a community of practice where practitioners across the business share what works. The aim is that in two to three years, a good portion of AI delivery happens in the business units on the CoE's rails, while the hub focuses on the hardest problems, the platform and the standards.

Invest continuously in people, because the field moves faster than any onboarding document. That means protected time for the team to evaluate new techniques, a clear internal channel for sharing lessons from failed experiments, and connections to the outside world so the organisation does not calcify around one way of doing things. Sending practitioners to hands-on industry gatherings pays for itself in avoided mistakes. Teams working on this can meet peers, vendors and investors and go deeper on operating-model questions at World AI Technology Expo Dubai (17-19 November 2026, Millennium Airport Hotel, Dubai).

Finally, plan for the CoE to change shape as AI stops being exceptional and becomes ordinary infrastructure. In the most mature organisations the centralised team eventually shrinks, because the standards, platform and skills have diffused into the business. That is not failure; it is the goal. The AI CoE is a mechanism for concentrating scarce capability until the organisation can carry it broadly, and the best ones plan their own gradual dissolution from the start rather than defending their existence long after the rest of the company has caught up.

Key takeaways

An AI center of excellence exists to turn scattered pilots into a repeatable, governed capability; write a one-page mandate defining what it owns versus advises on before hiring.
Hub-and-spoke is the pragmatic default operating model: a central hub owns platform, standards and governance while embedded spokes deliver domain-specific work on shared rails.
Underweighted roles sink CoEs; invest early in a product owner who scopes and says no, and an evaluation specialist who can prove whether systems actually work.
Build the platform as a product with internal customers, standardising only where inconsistency creates real risk or duplicated cost, and leaving room to experiment elsewhere.
Make governance risk-tiered and embedded as gates in the delivery workflow, focusing scrutiny on high-impact use cases and enabling self-service for low-risk ones.
Prove value in the first two or three quarters with high-value, feasible use cases, and measure production impact, time-to-production and component reuse rather than pilot counts.

Frequently asked questions

What is an AI center of excellence?

An AI center of excellence, or AI CoE, is a dedicated organisational unit that concentrates scarce AI expertise, sets shared standards and platforms, and turns one-off experiments into a repeatable capability the wider business can use. It typically blends direct delivery, enablement of other teams, and governance of high-risk use cases. Its purpose is to avoid duplicated infrastructure and inconsistent quality while accelerating safe deployment.

How big should the founding team be?

Start small, with a tight founding team of roughly five to eight strong generalists who can each cover more than one role. A compact team ships faster, learns what the organisation actually needs, and avoids over-hiring narrow specialists before the platform and portfolio exist. Scale and specialise only once you have real production deliveries and validated patterns to build on.

Which operating model works best?

For most mid-sized to large enterprises, a hub-and-spoke AI operating model is the pragmatic default. A small central hub owns the platform, standards, evaluation tooling and governance, while embedded spokes deliver domain-specific work in the business units using the hub's shared components. This balances consistency and risk control against the domain fluency and speed you only get close to the business.

How should an AI CoE be measured?

Measure outcomes rather than activity. The metrics that matter are production deployments and their business impact, time-to-production for new use cases, reuse of shared components across teams, and cost per deployment. Vanity metrics like the number of pilots or models trained should be ignored, and a healthy portfolio openly reports the ideas it killed as well as the ones it shipped.

How do you govern AI without slowing delivery?

Make governance risk-tiered and proportionate: classify use cases by potential impact and apply heavier review only to high-risk, consequential or customer-facing systems. Embed lightweight checks and self-service checklists into the delivery platform so passing review is a step in shipping rather than a separate process. Assign a single accountable owner for AI risk supported by domain experts per case, instead of a slow standing committee.