Chapter 7.1: Team Structure Models
“Organizing a company around AI is like organizing around electricity. It’s not a department—it’s a capability that powers everything.” — Andrew Ng
The right team structure is essential for MLOps success. This chapter explores proven organizational models, their trade-offs, and when to use each.
7.1.1. The Organizational Challenge
ML organizations face a fundamental tension:
- Centralization enables consistency, governance, and efficiency.
- Decentralization enables speed, autonomy, and domain expertise.
The best structures balance both.
Common Anti-Patterns
| Anti-Pattern | Symptoms | Consequences |
|---|---|---|
| Ivory Tower | Central ML team isolated from business | Models built but never deployed |
| Wild West | Every team does ML their own way | Redundancy, technical debt, governance gaps |
| Understaffed Center | 1-2 people “supporting” 50 data scientists | Bottleneck, burnout, inconsistent support |
| Over-Centralized | Central team must approve everything | Speed killed, talent frustrated |
7.1.2. Model 1: Centralized ML Team
All data scientists and ML engineers in one team, serving the entire organization.
Structure
┌─────────────────────────────────────────────┐
│         Chief Data Officer / VP AI          │
├─────────────────────────────────────────────┤
│  Data Science  │ ML Engineering │   MLOps   │
│      Team      │      Team      │    Team   │
├─────────────────────────────────────────────┤
│           Serving Business Units            │
│  (Sales, Marketing, Operations, Product)    │
└─────────────────────────────────────────────┘
When It Works
- Early stage: <10 data scientists.
- Exploratory phase: ML use cases still being discovered.
- Regulated industries: Governance is critical.
- Resource-constrained: Can’t afford duplication.
Pros and Cons
| Pros | Cons |
|---|---|
| Consistent practices | Bottleneck for business units |
| Efficient resource allocation | Far from domain expertise |
| Strong governance | Prioritization conflicts |
| Career community for DS/ML | Business units feel underserved |
Key Success Factors
- Strong intake process for requests.
- Embedded liaisons in business units.
- Clear prioritization framework (see the scoring sketch after this list).
- Executive sponsorship for priorities.
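To make the intake and prioritization points concrete, here is a minimal weighted-scoring sketch; the criteria, weights, and example request are illustrative assumptions, not a prescribed standard.

```python
# Illustrative weighted-scoring sketch for an ML request intake process.
# The criteria, weights, and example request are assumptions for this sketch.

CRITERIA_WEIGHTS = {
    "business_value": 0.4,   # expected revenue impact or cost savings
    "feasibility": 0.3,      # data availability, technical risk
    "strategic_fit": 0.2,    # alignment with company priorities
    "urgency": 0.1,          # regulatory deadline, market timing
}

def priority_score(scores: dict[str, int]) -> float:
    """Combine 1-5 criterion scores into a single weighted priority score."""
    return sum(CRITERIA_WEIGHTS[c] * scores[c] for c in CRITERIA_WEIGHTS)

# Example: a churn-prediction request as scored by the intake committee.
request = {"business_value": 4, "feasibility": 3, "strategic_fit": 5, "urgency": 2}
print(f"Priority score: {priority_score(request):.2f}")  # 3.70
```

Scoring every incoming request the same way gives the central team a defensible ordering to present to executive sponsors.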
7.1.3. Model 2: Embedded Data Scientists
Data scientists sit within business units, with dotted-line reporting to a central function.
Structure
┌─────────────────────────────────────────────┐
│             Chief Data Officer              │
│           (Standards, Governance)           │
├───────────┬───────────┬───────────┬─────────┤
│ Marketing │  Product  │ Operations│ Finance │
│   Team    │   Team    │   Team    │  Team   │
│  2 DS, 1  │  3 DS, 1  │  2 DS, 1  │  1 DS   │
│ MLE embed │ MLE embed │ MLE embed │         │
└───────────┴───────────┴───────────┴─────────┘
      │           │           │
      └───────────┼───────────┘
                  ▼
      Central ML Platform Team
      (Tools, Infra, Standards)
When It Works
- Mature organization: Clear ML use cases per business unit.
- Domain-heavy problems: Deep business knowledge required.
- Fast-moving business: Speed more important than consistency.
- 15-50 data scientists: Large enough to embed.
Pros and Cons
| Pros | Cons |
|---|---|
| Close to business domain | Inconsistent practices |
| Fast iteration | Duplication of effort |
| Clear ownership | Career path challenges |
| Business trust in “their” DS | Governance harder |
Key Success Factors
- Central platform team sets standards.
- Community of practice connects embedded DS.
- Rotation programs prevent silos.
- Clear escalation path for cross-cutting needs.
7.1.4. Model 3: Hub-and-Spoke (Federated)
Central team provides platform and standards; business units provide domain-specific ML teams.
Structure
┌─────────────────────────────────────────────┐
│            ML Platform Team (Hub)           │
│   Platform, Tools, Standards, Governance    │
│                 5-10 people                 │
└──────────────────────┬──────────────────────┘
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
     ┌─────────┐  ┌─────────┐  ┌─────────┐
     │  Spoke  │  │  Spoke  │  │  Spoke  │
     │  Team A │  │  Team B │  │  Team C │
     │  BU DS  │  │  BU DS  │  │  BU DS  │
     └─────────┘  └─────────┘  └─────────┘
When It Works
- Scale: 50+ data scientists.
- Diverse use cases: Different domains need different approaches.
- Mature platform: Central platform is stable and self-service.
- Strong governance need: Must balance autonomy with control.
Pros and Cons
| Pros | Cons |
|---|---|
| Best of both worlds | Requires mature platform |
| Scalable model | Hub team can become bottleneck |
| Domain expertise + standards | Coordination overhead |
| Clear governance | Spoke teams may resist standards |
Key Success Factors
- Hub team focused on enablement, not gatekeeping.
- Self-service platform reduces hub bottleneck.
- Clear interface contract between hub and spokes.
- Metrics for both hub (platform health) and spokes (business outcomes); a sketch follows below.
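As an illustration of the last two points, here is a minimal sketch of how hub and spoke metrics might be codified; the metric names and targets are assumptions, not recommendations.

```python
# Illustrative hub/spoke scorecard; metric names and targets are assumptions.
from dataclasses import dataclass

@dataclass
class Metric:
    name: str
    target: float
    unit: str

HUB_METRICS = [  # platform health, owned by the hub team
    Metric("platform_uptime", 99.9, "%"),
    Metric("median_time_to_first_deployment", 5, "days"),
    Metric("support_ticket_resolution_time", 2, "business days"),
]

SPOKE_METRICS = [  # business outcomes, owned by each spoke team
    Metric("models_in_production", 3, "count"),
    Metric("attributed_incremental_revenue", 500_000, "USD per quarter"),
    Metric("model_refresh_latency", 30, "days"),
]
```

Reviewing both sets in the same forum keeps the hub accountable for enablement and the spokes accountable for outcomes.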
7.1.5. Model 4: Platform + Product Teams
ML Platform team provides infrastructure; ML Product teams build specific products.
Structure
┌─────────────────────────────────────────────┐
│              ML Product Teams               │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │ Recommen-│  │  Fraud   │  │  Search  │...│
│  │  dation  │  │ Detection│  │   Team   │   │
│  │   Team   │  │   Team   │  │          │   │
│  └──────────┘  └──────────┘  └──────────┘   │
├─────────────────────────────────────────────┤
│              ML Platform Team               │
│   Feature Store, Training, Serving, etc.    │
├─────────────────────────────────────────────┤
│              Data Platform Team             │
│    Data Lake, Streaming, Orchestration      │
└─────────────────────────────────────────────┘
When It Works
- Product-led organization: Clear ML products (recommendations, search, fraud).
- Large scale: 100+ ML practitioners.
- Mission-critical ML: ML is the product, not a support function.
- Fast-moving market: Competitive pressure on ML capabilities.
Pros and Cons
| Pros | Cons |
|---|---|
| Full ownership by product teams | Requires large investment |
| Clear product accountability | Coordination across products |
| Deep expertise per product | Platform team can feel like “cost center” |
| Innovation at product level | Duplication between products |
Key Success Factors
- Platform team treated as product team (not cost center).
- Clear API contracts between layers (see the interface sketch after this list).
- Strong product management for platform.
- Cross-team collaboration forums.
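To illustrate what a clear API contract between the platform and product layers can look like, here is a hypothetical typed interface; the class and method names are invented for this sketch and do not refer to any real library.

```python
# Hypothetical contract between the ML Platform layer and ML Product teams.
# Names and signatures are invented for this sketch; a real platform team
# would publish and version its own interface.
from typing import Any, Protocol

class ModelServingClient(Protocol):
    def deploy(self, model_uri: str, endpoint_name: str) -> str:
        """Deploy a registered model and return the endpoint URL."""
        ...

    def predict(self, endpoint_name: str, features: dict[str, Any]) -> dict[str, Any]:
        """Send one feature payload to an endpoint and return the prediction."""
        ...

    def rollback(self, endpoint_name: str, target_version: str) -> None:
        """Revert an endpoint to a previously deployed model version."""
        ...
```

Publishing the contract as code lets product teams mock the platform in tests, and lets the platform team evolve internals without breaking consumers.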
7.1.6. The MLOps Team Specifically
Regardless of overall model, you need a dedicated MLOps/ML Platform team.
MLOps Team Roles
| Role | Responsibilities | Typical Count |
|---|---|---|
| Platform Lead | Strategy, roadmap, stakeholder management | 1 |
| Platform Engineer | Build and maintain platform infrastructure | 2-5 |
| DevOps/SRE | Reliability, operations, monitoring | 1-2 |
| Developer Experience | Documentation, onboarding, support | 1 |
Sizing the MLOps Team
| Data Scientists | MLOps Team Size | Ratio |
|---|---|---|
| 5-15 | 2-3 | 1:5 to 1:7 |
| 15-50 | 4-8 | 1:6 to 1:8 |
| 50-100 | 8-15 | 1:7 to 1:10 |
| 100+ | 15-25+ | 1:8 to 1:12 |
Rule of thumb: 1 MLOps engineer per 6-10 data scientists/ML engineers.
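A back-of-the-envelope sketch of this sizing rule; the ratio bounds come from the rule of thumb above, and the helper function itself is only an illustration.

```python
# Back-of-the-envelope sizing from the 1:6 to 1:10 rule of thumb above.
import math

def mlops_team_size(num_data_scientists: int,
                    min_ratio: int = 6, max_ratio: int = 10) -> tuple[int, int]:
    """Return a (low, high) MLOps headcount range for a given DS headcount."""
    return (math.ceil(num_data_scientists / max_ratio),
            math.ceil(num_data_scientists / min_ratio))

print(mlops_team_size(75))  # (8, 13), consistent with the 50-100 row above
```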
MLOps Team Skills
| Skill | Priority | Notes |
|---|---|---|
| Kubernetes | High | Core infrastructure |
| Python | High | ML ecosystem |
| CI/CD | High | Automation |
| Cloud (AWS/GCP/Azure) | High | Infrastructure |
| ML fundamentals | Medium | Understand users |
| Data engineering | Medium | Pipelines, Feature Store |
| Security | Medium | Governance, compliance |
7.1.7. Transitioning Between Models
Organizations evolve. Here’s how to transition.
From Centralized to Hub-and-Spoke
| Phase | Actions | Duration |
|---|---|---|
| 1: Prepare | Build platform, define standards | 3-6 months |
| 2: Pilot | Embed 2-3 DS in one business unit | 3 months |
| 3: Expand | Expand to other business units | 6 months |
| 4: Stabilize | Refine governance, complete transition | 3 months |
From Embedded to Federated
| Phase | Actions | Duration |
|---|---|---|
| 1: Assess | Document current practices, identify gaps | 1-2 months |
| 2: Platform | Build/buy central platform | 4-6 months |
| 3: Standards | Define and communicate standards | 2 months |
| 4: Migration | Migrate teams to platform | 6-12 months |
7.1.8. Governance Structures
Model Risk Management
For regulated industries (banking, insurance, healthcare):
| Function | Responsibility |
|---|---|
| Model Owners (1st line) | Development, monitoring |
| Model Risk Management (2nd line) | Independent validation |
| Internal Audit (3rd line) | Periodic review |
ML Steering Committee
| Member | Role |
|---|---|
| CTO/CDO | Executive sponsor |
| Business unit heads | Priority input |
| ML Platform Lead | Technical updates |
| Risk/Compliance | Governance oversight |
Meeting cadence: Monthly for steering, weekly for working group.
7.1.9. Key Takeaways
- There's no one-size-fits-all: Choose a model based on size, maturity, and needs.
- Plan for evolution: What works at 10 DS won't work at 100.
- Always have a platform team: The alternative is chaos.
- Balance centralization and speed: Too much of either fails.
- Governance is essential: Especially in regulated industries.
- Invest in community: DS across teams need to connect.
- Size MLOps at 1:6 to 1:10: Don't understaff the platform.
Next: 7.2 Skills & Career Development — Growing ML talent.