41.1. Unity/Unreal CI/CD (Headless Builds)
Status: Draft
Version: 1.0.0
Tags: #Sim2Real, #Unity, #UnrealEngine, #CICD, #Docker
Author: MLOps Team
Table of Contents
- The “Game” is actually a “Simulation”
- The Headless Build: Running Graphics without a Monitor
- Unity CI/CD Pipeline
- C# Implementation: Automated Build Script
- Unreal Engine: Pixel Streaming & Vulkan
- Determinism: The PhysX Problem
- Infrastructure: Dockerizing a 40GB Engine
- Troubleshooting: Common Rendering Crashes
- Future Trends: NeRF-based Simulation
- MLOps Interview Questions
- Glossary
- Summary Checklist
Prerequisites
Before diving into this chapter, ensure you have the following installed:
- Unity Hub / Unreal Engine 5: For local testing.
- GameCI: A community toolset for Unity Actions.
- Docker: With NVIDIA Container Toolkit support.
The “Game” is actually a “Simulation”
In Traditional MLOps, “Environment” means a Python venv or Docker container.
In Embodied AI (Robotics), “Environment” means a 3D World with physics, lighting, and collision.
This world is usually built in a Game Engine (Unity or Unreal). The problem? Game Engines are GUI-heavy, Windows-centric, and hostile to CLI automation.
Sim2Real Pipeline:
- Artist updates the 3D model of the warehouse (adds a shelf).
- Commit .fbx and .prefab files to Git (LFS).
- CI triggers a “Headless Build” of the Linux Server binary.
- Deploy to a fleet of 1000 simulation pods.
- Train the Robot Policy (RL) in these parallel worlds.
The Headless Build: Running Graphics without a Monitor
You cannot simply launch the Unity Editor on a bare EC2 instance. With no display attached, it crashes while looking for one.
You must run it in Batch Mode with Headless flags.
The Command Line:
/opt/unity/Editor/Unity \
-batchmode \
-nographics \
-silent-crashes \
-logFile /var/log/unity.log \
-projectPath /app/MySimProject \
-executeMethod MyEditor.BuildScript.PerformBuild \
-quit
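- -batchmode: Don’t pop up windows or wait for user input.
- -nographics: Don’t initialize a graphics device at all, which lets the Editor run on machines without a GPU or display. Drop this flag for jobs that must actually render (for example, camera observations).
- -executeMethod: Run a C# static function.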
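A CI wrapper usually assembles these flags programmatically and fails the job when the Editor returns a non-zero exit code. The following is a minimal Python sketch under that assumption. `unity_batch_command` and `run_unity` are hypothetical helper names, and the paths are the ones used in the command above.

```python
import subprocess
from typing import List


def unity_batch_command(editor: str, project: str, method: str, log_file: str) -> List[str]:
    """Assemble the headless Editor invocation as an argv list."""
    return [
        editor,
        "-batchmode",
        "-nographics",
        "-silent-crashes",
        "-logFile", log_file,
        "-projectPath", project,
        "-executeMethod", method,
        "-quit",
    ]


def run_unity(cmd: List[str]) -> int:
    """Run the Editor and return its exit code (non-zero means failure)."""
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    cmd = unity_batch_command(
        editor="/opt/unity/Editor/Unity",
        project="/app/MySimProject",
        method="MyEditor.BuildScript.PerformBuild",
        log_file="/var/log/unity.log",
    )
    # Print instead of executing, so the sketch runs without Unity installed.
    print(" ".join(cmd))
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the project path contains spaces.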
Unity CI/CD Pipeline
Using GitHub Actions and game-ci.
# .github/workflows/build-sim.yaml
name: Build Simulation
on: [push]
jobs:
  build:
    name: Build for Linux
    runs-on: ubuntu-latest
    container: unityci/editor:ubuntu-2022.3.10f1-linux-il2cpp
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          lfs: true # Critical for 3D assets
      - name: Cache Library
        uses: actions/cache@v4
        with:
          path: Library
          key: Library-${{ hashFiles('Packages/manifest.json') }}
      - name: Activate License
        # You need a valid Unity Serial (PRO/PLUS) for headless builds
        env:
          UNITY_SERIAL: ${{ secrets.UNITY_SERIAL }}
          UNITY_USERNAME: ${{ secrets.UNITY_USERNAME }}
          UNITY_PASSWORD: ${{ secrets.UNITY_PASSWORD }}
        run: |
          /opt/unity/Editor/Unity \
            -quit \
            -batchmode \
            -nographics \
            -serial "$UNITY_SERIAL" \
            -username "$UNITY_USERNAME" \
            -password "$UNITY_PASSWORD"
      - name: Build
        run: |
          /opt/unity/Editor/Unity \
            -batchmode \
            -nographics \
            -projectPath . \
            -executeMethod BuildScript.BuildLinuxServer \
            -quit
      - name: Upload Artifact
        # v3 of this action has been retired; v4 is the supported release
        uses: actions/upload-artifact@v4
        with:
          name: SimBuild
          path: Builds/Linux/
Git LFS Note:
Unity projects are huge. The Library/ folder is a generated cache; Assets/ is the source.
Never commit Library/. Always cache it in CI instead.
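A repository guard can enforce both rules before a build starts. The sketch below is one possible check, not part of GameCI or Git LFS: `audit_paths` is a hypothetical helper, and the extension list is an assumption that each project should adjust.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Set

# Hypothetical list of binary extensions this project routes through Git LFS.
LFS_EXTENSIONS = {".fbx", ".png", ".psd", ".wav"}


def audit_paths(tracked: Iterable[str], lfs_tracked: Set[str]) -> List[str]:
    """Return one message per tracked path that violates the repository rules."""
    problems = []
    for name in tracked:
        path = PurePosixPath(name)
        if path.parts and path.parts[0] == "Library":
            problems.append(f"{name}: Library/ is a cache and must not be committed")
        elif path.suffix.lower() in LFS_EXTENSIONS and name not in lfs_tracked:
            problems.append(f"{name}: binary asset is not tracked by Git LFS")
    return problems


if __name__ == "__main__":
    tracked = ["Assets/Models/Shelf.fbx", "Assets/Editor/BuildScript.cs", "Library/ArtifactDB"]
    for line in audit_paths(tracked, lfs_tracked=set()):
        print(line)
```

In CI, the two inputs could come from the output of `git ls-files` and `git lfs ls-files -n`.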
C# Implementation: Automated Build Script
You need a C# script inside an Editor folder to handle the build logic.
Project Structure
MySimProject/
├── Assets/
│   ├── Editor/
│   │   └── BuildScript.cs
│   └── Scenes/
│       └── Warehouse.unity
└── ProjectSettings/
Assets/Editor/BuildScript.cs:
using System;
using System.Linq;
using UnityEditor;
using UnityEditor.Build.Reporting; // BuildReport, BuildSummary, BuildResult
using UnityEngine;

// This class must be public for Unity's CLI to find it via reflection.
public class BuildScript
{
    /// <summary>
    /// The entry point for our CI/CD pipeline.
    /// Usage: -executeMethod BuildScript.BuildLinuxServer
    /// </summary>
    public static void BuildLinuxServer()
    {
        Console.WriteLine("---------------------------------------------");
        Console.WriteLine("       Starting Build for Linux Server       ");
        Console.WriteLine("---------------------------------------------");

        // 1. Define Scenes
        // We only fetch scenes that are enabled in the Build Settings UI.
        string[] scenes = EditorBuildSettings.scenes
            .Where(s => s.enabled)
            .Select(s => s.path)
            .ToArray();
        if (scenes.Length == 0)
        {
            Console.WriteLine("Error: No scenes selected for build.");
            EditorApplication.Exit(1);
            return; // Never fall through into the build step.
        }

        // 2. Configure Options
        // Just like clicking File -> Build Settings -> Build
        BuildPlayerOptions buildPlayerOptions = new BuildPlayerOptions();
        buildPlayerOptions.scenes = scenes;
        buildPlayerOptions.locationPathName = "Builds/Linux/SimServer.x86_64";
        buildPlayerOptions.target = BuildTarget.StandaloneLinux64;
        // Critical for RL: "Server Build" removes Audio/GUI overhead.
        // This makes the binary smaller and faster.
        // Also enables the "BatchMode" friendly initialization.
        // Requires the Dedicated Server build support module in the Editor.
        buildPlayerOptions.subtarget = (int)StandaloneBuildSubtarget.Server;
        // Fail if compiler errors exist. Don't produce a broken binary.
        buildPlayerOptions.options = BuildOptions.StrictMode;

        // 3. Execute
        Console.WriteLine("Invoking BuildPipeline...");
        BuildReport report = BuildPipeline.BuildPlayer(buildPlayerOptions);
        BuildSummary summary = report.summary;

        // 4. Report Results
        if (summary.result == BuildResult.Succeeded)
        {
            Console.WriteLine("---------------------------------------------");
            Console.WriteLine($"Build succeeded: {summary.totalSize} bytes");
            Console.WriteLine($"Time: {summary.totalTime}");
            Console.WriteLine("---------------------------------------------");
        }
        else
        {
            // Failed, Cancelled, and Unknown all mean there is no usable binary.
            Console.WriteLine("---------------------------------------------");
            Console.WriteLine($"Build failed: {summary.result}");
            foreach (var step in report.steps)
            {
                foreach (var msg in step.messages)
                {
                    // Print build messages to stdout so CI logs capture them
                    Console.WriteLine($"[{msg.type}] {msg.content}");
                }
            }
            Console.WriteLine("---------------------------------------------");
            // Exit code 1 so CI fails
            EditorApplication.Exit(1);
        }
    }
}
Unreal Engine: Pixel Streaming & Vulkan
Unreal (UE5) is heavier but more photorealistic. Ops for Unreal involves compiling C++ code and a very large number of shaders.
Shader Compilation Hell:
UE5 compiles shaders on startup.
In a Docker container, this can take 20 minutes and consume 32GB RAM.
Fix:
Compile shaders once and publish the DerivedDataCache (DDC) to a shared NFS volume or S3 bucket.
Configure UE5 to read DDC from there.
Pixel Streaming: For debugging the Robot, you often want to see what it sees. Unreal Pixel Streaming creates a WebRTC server. You can view the simulation in Chrome.
- Ops: Deploy a separate “Observer” pod with GPU rendering enabled, strictly for human debugging.
Determinism: The PhysX Problem
RL requires Determinism. Run 1: Robot moves forward 1m. Run 2: Robot moves forward 1m. If Run 2 moves 1.0001m, the policy gradient becomes noisy.
Sources of Non-Determinism:
- Floating Point Math: Addition is not associative, so in general $(a + b) + c \neq a + (b + c)$.
- Physics Engine (PhysX): Often sacrifices determinism for speed.
- Variable Timestep: If FPS drops, Time.deltaTime changes, and the integration result changes with it.
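Two of these sources are easy to reproduce outside any engine. The following Python sketch is a toy illustration, not Unity or PhysX code: it shows that regrouping a floating-point sum changes the result, and that integrating the same simulated second with different frame pacing moves a falling body to a different position. The helper `fall_distance` and the timestep sequences are made up for the example.

```python
from typing import Iterable

# 1. Floating-point addition is not associative.
a, b, c = 0.1, 0.2, 0.3
left_to_right = (a + b) + c
right_to_left = a + (b + c)
print(left_to_right == right_to_left)  # False


def fall_distance(dts: Iterable[float], gravity: float = -9.81) -> float:
    """Integrate a falling body with semi-implicit Euler over the given timesteps."""
    position, velocity = 0.0, 0.0
    for dt in dts:
        velocity += gravity * dt
        position += velocity * dt
    return position


# 2. Same simulated second, different frame pacing, different end position.
steady = fall_distance([0.02] * 50)                 # constant 50 Hz
stutter = fall_distance([0.02] * 25 + [0.05] * 10)  # frame rate drops halfway
print(steady, stutter)
```

With these inputs the two end positions differ by roughly 7 cm after one simulated second, far more than the 0.0001 m discrepancy that already makes gradients noisy.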
Fix:
- Fix Timestep: Set Time.fixedDeltaTime = 0.02 (50Hz).
- Seeding: Set Random.InitState(42).
- Physics: Enable “Deterministic Mode” in Project Settings (Unity Physics / Havok).
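One way to confirm that these settings hold is a regression check in CI: run the same seeded episode twice and compare a digest of the trajectories. The Python sketch below uses a toy rollout as a stand-in for the real simulation. `rollout` and `trajectory_digest` are hypothetical names; only the fixed step of 0.02 and the seed 42 come from the text.

```python
import hashlib
import random
import struct
from typing import List

FIXED_DT = 0.02  # mirrors Time.fixedDeltaTime = 0.02 (50 Hz)


def rollout(seed: int, steps: int = 500) -> List[float]:
    """Toy stand-in for one simulated episode: a seeded random force on a point mass."""
    rng = random.Random(seed)
    position, velocity = 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        force = rng.uniform(-1.0, 1.0)
        velocity += force * FIXED_DT
        position += velocity * FIXED_DT
        trajectory.append(position)
    return trajectory


def trajectory_digest(trajectory: List[float]) -> str:
    """Hash the exact bit pattern of every position, so any drift changes the digest."""
    packed = struct.pack(f"<{len(trajectory)}d", *trajectory)
    return hashlib.sha256(packed).hexdigest()


if __name__ == "__main__":
    first = trajectory_digest(rollout(seed=42))
    second = trajectory_digest(rollout(seed=42))
    print("deterministic" if first == second else "NON-DETERMINISTIC")
```

Comparing digests rather than using a tolerance is deliberate: a tolerance would hide exactly the small drifts this check is meant to catch.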
Infrastructure: Dockerizing a 40GB Engine
You don’t want to install Unity on every Jenkins agent. You use Docker. But the Docker image is 15GB.
# Dockerfile for Unity Simulation
# Stage 1: Editor (Huge Image, 15GB+)
FROM unityci/editor:ubuntu-2022.3.10f1-linux-il2cpp AS builder
WORKDIR /project
# 1. Copy Manifest (for Package Manager resolution)
# We copy this first to leverage Docker Layer Caching for dependencies
COPY Packages/manifest.json Packages/manifest.json
COPY Packages/packages-lock.json Packages/packages-lock.json
# 2. Copy Source
COPY Assets/ Assets/
COPY ProjectSettings/ ProjectSettings/
# 3. Build
# Unity does not reliably write its log to stdout, so we log to build.log
# and print that file if the build fails.
# Note: the Editor must already be licensed inside this stage.
RUN /opt/unity/Editor/Unity \
    -batchmode \
    -nographics \
    -projectPath . \
    -executeMethod BuildScript.BuildLinuxServer \
    -quit \
    -logFile build.log || (cat build.log && exit 1)
# Stage 2: Runtime (Small Image, <1GB)
FROM ubuntu:22.04
WORKDIR /app
COPY --from=builder /project/Builds/Linux/ .
# Libraries needed for Unity Player (Vulkan/OpenGL drivers)
# vulkan-tools replaces vulkan-utils, which Ubuntu 22.04 no longer ships
RUN apt-get update && apt-get install -y \
    libglu1-mesa \
    libxcursor1 \
    libxrandr2 \
    vulkan-tools \
    && rm -rf /var/lib/apt/lists/*
# Run in Server Mode (Headless)
ENTRYPOINT ["./SimServer.x86_64", "-batchmode", "-nographics"]
Troubleshooting: Common Rendering Crashes
Scenario 1: “Display not found”
- Symptom: [HeadlessRender] Failed to open display.
- Cause: You forgot -batchmode or -nographics. Or your code is trying to access Screen.width in a static constructor.
- Fix: Ensure you strictly use Headless flags. Wrap GUI code in #if !UNITY_SERVER.
Scenario 2: The Shader Compilation Hang
- Symptom: CI hangs for 6 hours at “Compiling Shaders…”.
- Cause: Linux builder has no GPU. Software compilation of 10,000 shaders is slow.
- Fix: Pre-compile shaders on a Windows machine with a GPU and share the resulting Library/ShaderCache through the CI cache (Library/ itself stays out of Git), or use a Shared DDC.
Scenario 3: Memory Leaks in Simulation
- Symptom: Pod crashes after 1000 episodes.
- Cause: You are instantiating GameObjects (Instantiate(Bullet)) but never destroying them (Destroy(Bullet)).
- Fix: Use Object Pooling. Avoid allocating memory inside gameplay loops.
Scenario 4: License Activation Failure
- Symptom: User has no authorization to use Unity.
- Cause: The Docker container cannot reach Unity Licensing Servers, or the .ulf file is invalid.
- Fix: Use “Manual Activation” via .ulf file in secrets, or set up a local Unity Floating License Server.
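Because headless failures only show up in logs, a CI step can scan the log for the symptom strings listed in these scenarios and print the matching hint. The Python sketch below is an assumption about how such a triage step might look. `triage` and `KNOWN_SYMPTOMS` are hypothetical names, and the table covers only the two scenarios that quote an exact log message.

```python
from typing import Dict, List

# Symptom strings are the ones quoted in the scenarios; hints summarise the fixes.
KNOWN_SYMPTOMS: Dict[str, str] = {
    "Failed to open display":
        "Pass -batchmode/-nographics and wrap GUI code in #if !UNITY_SERVER",
    "User has no authorization to use Unity":
        "Check license activation (.ulf file or floating license server)",
}


def triage(log_text: str) -> List[str]:
    """Return the hints for every known symptom that appears in the log."""
    return [hint for symptom, hint in KNOWN_SYMPTOMS.items() if symptom in log_text]


if __name__ == "__main__":
    sample_log = "Initializing...\n[HeadlessRender] Failed to open display.\n"
    for hint in triage(sample_log):
        print(hint)
```

Keeping the table as data means a new crash signature becomes a one-line addition once its fix is known.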
Future Trends: NeRF-based Simulation
Traditional Sim uses polygons (Triangles). Reality is not made of triangles. Neural Radiance Fields (NeRFs) and Gaussian Splatting allow reconstructing real environments (scan a room) and using that as the simulation.
- Ops Challenge: NeRF rendering is far more compute-intensive than rasterizing polygons, because every pixel requires many network evaluations along its camera ray. Expect heavy GPU inference just to render the background.
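A back-of-the-envelope count shows where the cost comes from. The Python sketch below assumes a classic NeRF renderer that evaluates the network at a fixed number of sample points along each camera ray. The resolution, sample count, and frame rate are illustrative assumptions, not measurements, and Gaussian Splatting has a different cost profile.

```python
def nerf_queries_per_frame(width: int, height: int, samples_per_ray: int) -> int:
    """Network evaluations needed to render one frame, one ray per pixel."""
    return width * height * samples_per_ray


if __name__ == "__main__":
    per_frame = nerf_queries_per_frame(width=640, height=480, samples_per_ray=64)
    per_second = per_frame * 30  # assumed 30 FPS camera
    print(f"{per_frame:,} network evaluations per frame")
    print(f"{per_second:,} network evaluations per second")
```

Even at this modest resolution the count is in the tens of millions per frame, which is why the rendering budget has to be planned alongside the training budget.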
MLOps Interview Questions
- Q: Why not just run the simulation on the training node (GPU)?
  A: CPU bottleneck. Physics runs on CPU. Rendering runs on GPU. If you run both on the training node, the GPU waits for Physics. It’s better to Scale Out simulation (1000 CPU pods) and feed one Training GPU pod over the network.
- Q: How do you handle “Asset Versioning”?
  A: 3D assets are binary blobs. Git is bad at diffing them. We use Git LFS (Large File Storage) and Lock mechanisms (“I am editing the MainMenu.unity, nobody else touch it”).
- Q: What is “Isaac Gym”?
  A: NVIDIA’s simulator that runs Physics entirely on the GPU. This avoids the CPU-GPU bottleneck. It can run 10,000 agents in parallel on a single A100.
- Q: Explain “Time Scaling” in Simulation.
  A: In Sim, we can run Time.timeScale = 100.0. 100 seconds of experience happen in 1 second of wall-clock time. This is the superpower of RL. Ops must verify that physics remains stable at high speed.
- Q: How do you test a Headless build?
  A: You can’t see it. You must add Application Metrics (Prometheus):
  - sim_fps
  - sim_episode_reward
  - sim_collisions
  If sim_collisions spikes to infinity, the floor collider is missing.
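The metrics in the last answer can be exposed in the Prometheus text exposition format, which is simple enough to render without a client library. The Python sketch below is a minimal illustration under that assumption. `render_metrics` and `collisions_look_wrong` are hypothetical helpers, every metric is treated as a gauge for simplicity, and the alert threshold is an arbitrary placeholder.

```python
from typing import Dict


def render_metrics(values: Dict[str, float]) -> str:
    """Render gauges as Prometheus text: a TYPE line, then 'name value'."""
    lines = []
    for name, value in sorted(values.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collisions_look_wrong(collisions_per_episode: float, threshold: float = 1000.0) -> bool:
    """Flag episodes whose collision count suggests a missing floor collider."""
    return collisions_per_episode > threshold


if __name__ == "__main__":
    snapshot = {"sim_fps": 50.0, "sim_episode_reward": 12.5, "sim_collisions": 3.0}
    print(render_metrics(snapshot), end="")
    print("alert:", collisions_look_wrong(snapshot["sim_collisions"]))
```

Serving this string over HTTP from the simulation pod is enough for a Prometheus scrape job to collect it.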
Glossary
- Headless: Running software without a Graphical User Interface (GUI).
- Prefab: A reusable Unity asset (template for a GameObject).
- IL2CPP: Intermediate Language to C++. Unity’s scripting backend that converts compiled C# (IL) into C++, which is then compiled to native code for performance.
- Git LFS: Git extension for versioning large files.
- Pixel Streaming: Rendering frames on a server and streaming video to a web client.
Summary Checklist
- License: Unity requires a Pro License for Headless CI. Ensure you activate the serial number via environment variable $UNITY_SERIAL.
- Caching: Cache the Library folder (Unity) or DerivedDataCache (Unreal). It saves 30+ minutes per build.
- Tests: Write Unity Test Runner tests (PlayMode) to verify physics stability before building.
- Artifacts: Store the built binary in S3/Artifactory with a version tag (sim-v1.0.2). RL training jobs should pull specific versions.
- Logs: Redirect logs to stdout (-logFile /dev/stdout) so Kubernetes/Datadog can scrape them.
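For the artifact item, a training job can refuse anything that is not a pinned version before it downloads a binary. The Python sketch below assumes tags follow the `sim-v1.0.2` pattern from the checklist. `parse_sim_tag`, `artifact_key`, and the `sim-builds/` prefix are hypothetical and would need to match the real bucket layout.

```python
import re
from typing import Tuple

TAG_PATTERN = re.compile(r"^sim-v(\d+)\.(\d+)\.(\d+)$")


def parse_sim_tag(tag: str) -> Tuple[int, int, int]:
    """Split a tag such as 'sim-v1.0.2' into (major, minor, patch)."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a pinned simulation version: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def artifact_key(tag: str) -> str:
    """Build the storage key for a validated tag (hypothetical layout)."""
    parse_sim_tag(tag)  # raises on floating tags such as 'latest'
    return f"sim-builds/{tag}/SimServer.x86_64"


if __name__ == "__main__":
    print(parse_sim_tag("sim-v1.0.2"))
    print(artifact_key("sim-v1.0.2"))
```

Rejecting floating tags keeps a training run reproducible, since the simulation binary cannot change underneath it.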