Chapter 7.1: Team Structure Models

“Organizing a company around AI is like organizing around electricity. It’s not a department—it’s a capability that powers everything.” — Andrew Ng

Team structure strongly shapes whether MLOps efforts succeed. This chapter describes four common organizational models, their trade-offs, and when each one fits.


7.1.1. The Organizational Challenge

ML organizations face a fundamental tension:

  • Centralization enables consistency, governance, and efficiency.
  • Decentralization enables speed, autonomy, and domain expertise.

The best structures balance both.

Common Anti-Patterns

| Anti-Pattern | Symptoms | Consequences |
| --- | --- | --- |
| Ivory Tower | Central ML team isolated from business | Models built but never deployed |
| Wild West | Every team does ML their own way | Redundancy, technical debt, governance gaps |
| Understaffed Center | 1-2 people “supporting” 50 data scientists | Bottleneck, burnout, inconsistent support |
| Over-Centralized | Central team must approve everything | Speed killed, talent frustrated |

7.1.2. Model 1: Centralized ML Team

All data scientists and ML engineers sit in one team that serves the entire organization.

Structure

┌─────────────────────────────────────────────┐
│         Chief Data Officer / VP AI          │
├────────────────┬──────────────────┬─────────┤
│  Data Science  │  ML Engineering  │  MLOps  │
│      Team      │       Team       │  Team   │
├────────────────┴──────────────────┴─────────┤
│           Serving Business Units            │
│   (Sales, Marketing, Operations, Product)   │
└─────────────────────────────────────────────┘

When It Works

  • Early stage: <10 data scientists.
  • Exploratory phase: ML use cases still being discovered.
  • Regulated industries: Governance is critical.
  • Resource-constrained: Can’t afford duplication.

Pros and Cons

| Pros | Cons |
| --- | --- |
| Consistent practices | Bottleneck for business units |
| Efficient resource allocation | Far from domain expertise |
| Strong governance | Prioritization conflicts |
| Career community for DS/ML | Business units feel underserved |

Key Success Factors

  • Strong intake process for requests.
  • Embedded liaisons in business units.
  • Clear prioritization framework.
  • Executive sponsorship for priorities.

7.1.3. Model 2: Embedded Data Scientists

Data scientists sit within business units, with dotted-line reporting to a central function.

Structure

┌─────────────────────────────────────────────┐
│             Chief Data Officer              │
│           (Standards, Governance)           │
├───────────┬───────────┬───────────┬─────────┤
│ Marketing │  Product  │ Operations│ Finance │
│   Team    │   Team    │   Team    │  Team   │
│  2 DS, 1  │  3 DS, 1  │  2 DS, 1  │  1 DS   │
│ MLE embed │ MLE embed │ MLE embed │         │
└───────────┴───────────┴───────────┴─────────┘
            │           │           │
            └───────────┴───────────┘
                        ▼
            Central ML Platform Team
            (Tools, Infra, Standards)

When It Works

  • Mature organization: Clear ML use cases per business unit.
  • Domain-heavy problems: Deep business knowledge required.
  • Fast-moving business: Speed more important than consistency.
  • 15-50 data scientists: Large enough to embed.

Pros and Cons

| Pros | Cons |
| --- | --- |
| Close to business domain | Inconsistent practices |
| Fast iteration | Duplication of effort |
| Clear ownership | Career path challenges |
| Business trust in “their” DS | Governance harder |

Key Success Factors

  • Central platform team sets standards.
  • Community of practice connects embedded DS.
  • Rotation programs prevent silos.
  • Clear escalation path for cross-cutting needs.

7.1.4. Model 3: Hub-and-Spoke (Federated)

A central team provides the platform and standards; business units run their own domain-specific ML teams.

Structure

┌─────────────────────────────────────────────┐
│            ML Platform Team (Hub)           │
│   Platform, Tools, Standards, Governance    │
│                 5-10 people                 │
└──────────────────────┬──────────────────────┘
                       │
      ┌────────────────┼────────────────┐
      ▼                ▼                ▼
 ┌─────────┐      ┌─────────┐      ┌─────────┐
 │  Spoke  │      │  Spoke  │      │  Spoke  │
 │ Team A  │      │ Team B  │      │ Team C  │
 │  BU DS  │      │  BU DS  │      │  BU DS  │
 └─────────┘      └─────────┘      └─────────┘

When It Works

  • Scale: 50+ data scientists.
  • Diverse use cases: Different domains need different approaches.
  • Mature platform: Central platform is stable and self-service.
  • Strong governance need: Must balance autonomy with control.

Pros and Cons

| Pros | Cons |
| --- | --- |
| Best of both worlds | Requires mature platform |
| Scalable model | Hub team can become bottleneck |
| Domain expertise + standards | Coordination overhead |
| Clear governance | Spoke teams may resist standards |

Key Success Factors

  • Hub team focused on enablement, not gatekeeping.
  • Self-service platform reduces hub bottleneck.
  • Clear interface contract between hub and spokes.
  • Metrics for both hub (platform health) and spokes (business outcomes).
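
One way to make the "clear interface contract between hub and spokes" concrete is to write it down as a typed interface. The sketch below is illustrative only: `ModelSubmission`, `HubPlatform`, `InMemoryHub`, `ship`, and every field name are hypothetical and do not come from any particular platform. It assumes the hub exposes two self-service operations (register and deploy) and enforces one standard (a submission must carry evaluation metrics).

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class ModelSubmission:
    """What a spoke team hands to the hub (all fields are hypothetical)."""
    name: str
    owner_team: str
    artifact_uri: str
    metrics: Dict[str, float]


class HubPlatform(Protocol):
    """Self-service operations the hub offers to every spoke (hypothetical)."""

    def register(self, submission: ModelSubmission) -> str: ...

    def deploy(self, model_id: str, environment: str) -> None: ...


class InMemoryHub:
    """Toy hub implementation, used only to show the contract in action."""

    def __init__(self) -> None:
        self.registry: Dict[str, ModelSubmission] = {}
        self.deployments: Dict[str, str] = {}

    def register(self, submission: ModelSubmission) -> str:
        # Assumed standard: every submission must include evaluation metrics.
        if not submission.metrics:
            raise ValueError("standards check failed: no evaluation metrics")
        model_id = f"{submission.owner_team}/{submission.name}"
        self.registry[model_id] = submission
        return model_id

    def deploy(self, model_id: str, environment: str) -> None:
        if model_id not in self.registry:
            raise KeyError(f"unknown model: {model_id}")
        self.deployments[model_id] = environment


def ship(hub: HubPlatform, submission: ModelSubmission) -> str:
    """Spoke-side code depends only on the contract, not on hub internals."""
    model_id = hub.register(submission)
    hub.deploy(model_id, "staging")
    return model_id


if __name__ == "__main__":
    hub = InMemoryHub()
    submission = ModelSubmission(
        name="churn",
        owner_team="marketing",
        artifact_uri="s3://example-bucket/churn",  # placeholder URI
        metrics={"auc": 0.9},
    )
    print(ship(hub, submission))
```

Because the spoke only calls `register` and `deploy`, the hub can change its internals without breaking spoke teams, which is the enablement-over-gatekeeping stance the list above describes.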

7.1.5. Model 4: Platform + Product Teams

An ML Platform team provides the infrastructure; ML Product teams build specific products on top of it.

Structure

┌──────────────────────────────────────────────────┐
│                 ML Product Teams                 │
│  ┌───────────┐ ┌───────────┐ ┌───────────┐       │
│  │  Recomm-  │ │   Fraud   │ │  Search   │  ...  │
│  │ endation  │ │ Detection │ │   Team    │       │
│  │   Team    │ │   Team    │ │           │       │
│  └───────────┘ └───────────┘ └───────────┘       │
├──────────────────────────────────────────────────┤
│                 ML Platform Team                 │
│      Feature Store, Training, Serving, etc.      │
├──────────────────────────────────────────────────┤
│                Data Platform Team                │
│       Data Lake, Streaming, Orchestration        │
└──────────────────────────────────────────────────┘

When It Works

  • Product-led organization: Clear ML products (recommendations, search, fraud).
  • Large scale: 100+ ML practitioners.
  • Mission-critical ML: ML is the product, not a support function.
  • Fast-moving market: Competitive pressure on ML capabilities.

Pros and Cons

| Pros | Cons |
| --- | --- |
| Full ownership by product teams | Requires large investment |
| Clear product accountability | Coordination across products |
| Deep expertise per product | Platform team can feel like “cost center” |
| Innovation at product level | Duplication between products |

Key Success Factors

  • Platform team treated as product team (not cost center).
  • Clear API contracts between layers.
  • Strong product management for platform.
  • Cross-team collaboration forums.
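
The headcount guidance in the four "When It Works" lists can be collected into a small lookup. The following is a minimal sketch, not a decision procedure: it encodes only the headcount bands stated above (fewer than 10, 15-50, 50+, 100+) and ignores the other criteria such as regulation, domain depth, and platform maturity. The names `Band`, `BANDS`, and `models_for_headcount` are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class Band(NamedTuple):
    model: str
    min_ds: int            # inclusive lower bound on headcount
    max_ds: Optional[int]  # inclusive upper bound; None means open-ended


# Headcount bands copied from the "When It Works" lists in 7.1.2 to 7.1.5.
BANDS = [
    Band("Centralized ML Team", 0, 9),            # "<10 data scientists"
    Band("Embedded Data Scientists", 15, 50),     # "15-50 data scientists"
    Band("Hub-and-Spoke (Federated)", 50, None),  # "50+ data scientists"
    Band("Platform + Product Teams", 100, None),  # "100+ ML practitioners"
]


def models_for_headcount(n_ds: int) -> List[str]:
    """Return every model whose stated headcount band contains n_ds."""
    if n_ds < 0:
        raise ValueError("headcount must be non-negative")
    return [
        band.model
        for band in BANDS
        if n_ds >= band.min_ds and (band.max_ds is None or n_ds <= band.max_ds)
    ]


if __name__ == "__main__":
    for n in (8, 12, 50, 120):
        print(n, models_for_headcount(n))
```

The bands overlap at the top end (an organization of 120 matches both Hub-and-Spoke and Platform + Product) and leave 10-14 uncovered, so headcount alone narrows the choice rather than settling it.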

7.1.6. The MLOps Team Specifically

Regardless of overall model, you need a dedicated MLOps/ML Platform team.

MLOps Team Roles

| Role | Responsibilities | Typical Count |
| --- | --- | --- |
| Platform Lead | Strategy, roadmap, stakeholder management | 1 |
| Platform Engineer | Build and maintain platform infrastructure | 2-5 |
| DevOps/SRE | Reliability, operations, monitoring | 1-2 |
| Developer Experience | Documentation, onboarding, support | 1 |

Sizing the MLOps Team

| Data Scientists | MLOps Team Size | Ratio (MLOps : Data Scientists) |
| --- | --- | --- |
| 5-15 | 2-3 | 1:5 to 1:7 |
| 15-50 | 4-8 | 1:6 to 1:8 |
| 50-100 | 8-15 | 1:7 to 1:10 |
| 100+ | 15-25+ | 1:8 to 1:12 |

Rule of thumb: 1 MLOps engineer per 6-10 data scientists/ML engineers.
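
The rule of thumb translates directly into arithmetic. The sketch below applies only the 1:6 to 1:10 ratio, rounding up; it does not encode the per-band figures from the sizing table. The function names are illustrative.

```python
from typing import Tuple


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-numerator // denominator)


def mlops_headcount_range(practitioners: int,
                          min_ratio: int = 6,
                          max_ratio: int = 10) -> Tuple[int, int]:
    """Return (low, high) MLOps engineers for a practitioner count.

    Uses 1 engineer per `max_ratio` practitioners for the low end and
    1 per `min_ratio` for the high end, rounding up in both cases.
    """
    if practitioners < 0:
        raise ValueError("practitioners must be non-negative")
    if not 0 < min_ratio <= max_ratio:
        raise ValueError("ratios must satisfy 0 < min_ratio <= max_ratio")
    low = ceil_div(practitioners, max_ratio)
    high = ceil_div(practitioners, min_ratio)
    return low, high


def is_understaffed(mlops_engineers: int, practitioners: int) -> bool:
    """True if the team is below the low end of the rule-of-thumb range."""
    low, _ = mlops_headcount_range(practitioners)
    return mlops_engineers < low


if __name__ == "__main__":
    for n in (12, 50, 100):
        print(n, mlops_headcount_range(n))
    print(is_understaffed(2, 50))
```

For 50 practitioners the range is 5 to 9 engineers, so the "Understaffed Center" anti-pattern of 1-2 people supporting 50 data scientists falls well below the low end.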

MLOps Team Skills

SkillPriorityNotes
KubernetesHighCore infrastructure
PythonHighML ecosystem
CI/CDHighAutomation
Cloud (AWS/GCP/Azure)HighInfrastructure
ML fundamentalsMediumUnderstand users
Data engineeringMediumPipelines, Feature Store
SecurityMediumGovernance, compliance

7.1.7. Transitioning Between Models

Organizations evolve. Here’s how to transition.

From Centralized to Hub-and-Spoke

| Phase | Actions | Duration |
| --- | --- | --- |
| 1: Prepare | Build platform, define standards | 3-6 months |
| 2: Pilot | Embed 2-3 DS in one business unit | 3 months |
| 3: Expand | Expand to other business units | 6 months |
| 4: Stabilize | Refine governance, complete transition | 3 months |

From Embedded to Federated

| Phase | Actions | Duration |
| --- | --- | --- |
| 1: Assess | Document current practices, identify gaps | 1-2 months |
| 2: Platform | Build/buy central platform | 4-6 months |
| 3: Standards | Define and communicate standards | 2 months |
| 4: Migration | Migrate teams to platform | 6-12 months |
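
Adding up the phase durations gives a rough end-to-end estimate for each transition. The sketch below assumes the phases run strictly one after another with no overlap, which the tables do not state; if phases overlap in practice, the totals are upper bounds. The constant and function names are illustrative.

```python
from typing import List, Tuple

# (phase name, minimum months, maximum months), copied from the tables above.
Phase = Tuple[str, int, int]

CENTRALIZED_TO_HUB_AND_SPOKE: List[Phase] = [
    ("Prepare", 3, 6),
    ("Pilot", 3, 3),
    ("Expand", 6, 6),
    ("Stabilize", 3, 3),
]

EMBEDDED_TO_FEDERATED: List[Phase] = [
    ("Assess", 1, 2),
    ("Platform", 4, 6),
    ("Standards", 2, 2),
    ("Migration", 6, 12),
]


def total_duration(phases: List[Phase]) -> Tuple[int, int]:
    """Sum minimum and maximum months, assuming sequential phases."""
    shortest = sum(min_months for _, min_months, _ in phases)
    longest = sum(max_months for _, _, max_months in phases)
    return shortest, longest


if __name__ == "__main__":
    print("Centralized -> Hub-and-Spoke:", total_duration(CENTRALIZED_TO_HUB_AND_SPOKE))
    print("Embedded -> Federated:", total_duration(EMBEDDED_TO_FEDERATED))
```

Under that sequential assumption, moving from Centralized to Hub-and-Spoke takes 15-18 months and moving from Embedded to Federated takes 13-22 months.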

7.1.8. Governance Structures

Model Risk Management

For regulated industries (banking, insurance, healthcare):

| Function | Role |
| --- | --- |
| Model Risk Management (2nd line) | Independent validation |
| Model Owners (1st line) | Development, monitoring |
| Internal Audit (3rd line) | Periodic review |

ML Steering Committee

| Member | Role |
| --- | --- |
| CTO/CDO | Executive sponsor |
| Business unit heads | Priority input |
| ML Platform Lead | Technical updates |
| Risk/Compliance | Governance oversight |

Meeting cadence: Monthly for steering, weekly for working group.


7.1.9. Key Takeaways

  1. There is no one-size-fits-all model: Choose based on size, maturity, and needs.

  2. Plan for evolution: What works at 10 DS won’t work at 100.

  3. Always have a platform team: The alternative is chaos.

  4. Balance centralization and decentralization: Too much of either fails.

  5. Governance is essential: Especially in regulated industries.

  6. Invest in community: DS across teams need to connect.

  7. Size MLOps at 1:6 to 1:10: Don’t understaff the platform.


Next: 7.2 Skills & Career Development — Growing ML talent.