Introduction

Earlier this year, Unity made the first official release of their ML-Agents Toolkit, a framework for developing, training and testing Artificial Intelligence algorithms in a Unity environment. It’s open-source and comes with plenty of examples to help you get started, and there are lots of cool demos built by the community.

One of the main features of Unity ML-Agents is Reinforcement Learning, where AI agents are free to observe their environment, take actions, and are rewarded according to the outcomes of those actions. The training process takes the form of a feedback loop: agents experience many different environments, take actions, record the rewards they receive, and repeat. Over time, the training policy adapts the actions so as to maximize the reward the agent receives. By careful selection of the reward and tuning of the training policy, agents can be trained to perform a wide range of tasks such as balancing balls, solving puzzles, or playing tennis.

Alongside the training policy and the reward function, the third key component is the environment. Normally, environments have to be built by hand in Unity for a specific task. What if we could use the WRLD map as a ready-made general purpose training environment for real-world tasks? This post shows you how to do just that!

Getting Started

First of all, we’re going to need to set up our environment. Unity provides a full installation guide but we’ll cover the basics here. You’ll need:

  • The Unity Editor (this tutorial uses the standard “3D” project template)
  • The WRLD Unity SDK .unitypackage and a WRLD API key
  • The Unity ML Agents package (installed from the Package Manager) and the mlagents Python package

If you don’t already have a WRLD Game Development account, you can start a 10-day free trial - plenty of time to work through this tutorial!

Setting up your project

Let’s get started by setting up Unity.

  1. Create a new project in the Unity Editor using the “3D” template. Give it a name like “ML Agents Tutorial”.
  2. Import the WRLD Unity SDK by selecting Assets → Import Package → Custom Package…, choose the WRLD .unitypackage file you downloaded earlier and hit Import. We recommend you increase the default shadow distance setting when prompted. You’ll find detailed instructions here.
  3. Open the UnityWorldSpace scene under Assets/Wrld/Scenes/ in the Project panel. Select the WrldMap GameObject and paste your API Key into the field shown in the Inspector.
  4. Add the Unity ML Agents package. You can find it under Window → Package Manager. Look for “ML Agents” in the Unity Registry and click Install. Unity provide further instructions here if needed.

With the UnityWorldSpace scene open, press play to run the project in the editor. If you’ve followed the steps correctly you’ll see a map of San Francisco stream in and you can pan and zoom the camera around with the mouse.

Configuring ML Agents

Now that we have our environment - the WRLD map - we need to set up the ML Agents framework to learn how to perform a task. In this case we’re going to train a model to navigate to a specific location on the map. We’ll be basing our setup on Unity’s RollerBall tutorial, so it’s worth running through that first to understand all of the concepts involved.

Adding a Target object

First we’re going to add the target that our agent should navigate towards.

  1. To make life easier, we’re going to start with the camera at the location we want to navigate to. Select the WrldMap GameObject and, in the Inspector, set the Start Latitude to 56.45987 and the Start Longitude to -2.977954. This is a location just outside the WRLD offices in Dundee, Scotland. Make sure the Terrain Collisions and Building Collisions settings are checked.
  2. Right-click on the WrldMap in the Hierarchy view and choose 3D Object → Cube.
  3. Click on the Cube in the Hierarchy view and in the Inspector, name it “Target”.
  4. To make it a bit easier to see on the map, set the Scale to (1, 3, 1).
  5. To position our target correctly, we’re going to attach a Positioner script. Select Assets → Import New Asset… and select the TargetPositioner.cs file from the tutorial zip file. The script does two key things - firstly, it calls WorldToGeographicPoint() to retrieve the lat/lon coordinates of the origin in Unity space (which is the location the camera is initially looking at); secondly, it registers an OnTransformedPointChanged callback so that when the map terrain is loaded, the object is moved to the correct elevation on the ground. A simplified stand-in is sketched after this list.
  6. Drag the TargetPositioner script from the Assets panel onto the Target object in the Hierarchy view.
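
We haven’t reproduced the full TargetPositioner.cs here, so the snippet below is only a rough stand-in for it: instead of the Positioner callback described in step 5, it snaps the object onto the ground with a plain Unity physics raycast (which works because Terrain Collisions and Building Collisions are enabled). The class name and details are illustrative assumptions; the script in the tutorial zip is the authoritative version.

using UnityEngine;

// Rough stand-in for the provided TargetPositioner.cs. The real script uses
// WRLD's WorldToGeographicPoint() and an OnTransformedPointChanged positioner
// callback to place the object; this sketch instead polls with a physics
// raycast and snaps the object onto the first collider it finds below it.
public class SnapToGroundSketch : MonoBehaviour
{
    bool m_snapped;

    void Update()
    {
        if (m_snapped)
        {
            return;
        }

        // Cast straight down from high above the object's X/Z position.
        var rayStart = new Vector3(transform.position.x, 1000.0f, transform.position.z);
        RaycastHit nearest = default(RaycastHit);
        bool found = false;

        foreach (var hit in Physics.RaycastAll(rayStart, Vector3.down, 2000.0f))
        {
            // Skip our own collider and keep the closest remaining hit.
            if (hit.transform == transform)
            {
                continue;
            }
            if (!found || hit.distance < nearest.distance)
            {
                nearest = hit;
                found = true;
            }
        }

        if (found)
        {
            // Sit the object on top of the streamed-in ground (or roof) below it.
            transform.position = nearest.point + Vector3.up * (transform.localScale.y * 0.5f);
            m_snapped = true;
        }
    }
}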

Now if you run the project you should see the map of Dundee load, and if you zoom in you’ll see the target object standing outside the WRLD offices.

Adding an agent

Next we’re going to add the agent. Like in Unity’s RollerBall example, we will use a sphere that the agent will apply forces to in order to roll it around the map towards the target.

  1. Right-click on the WrldMap in the Hierarchy view and choose 3D Object → Sphere. Name it “AgentBall”.
  2. Click on AgentBall in the Hierarchy view, and in the Inspector click Add Component and choose Physics → Rigidbody.
  3. To make it a bit easier to see on the map, set the Scale to (2, 2, 2).
  4. All of the logic which defines our agent’s behaviour is included in the provided AgentBall.cs script. Select Assets → Import New Asset… and select the AgentBall.cs file from the tutorial zip file. Similarly to the TargetPositioner script, we use a positioner to make sure the ball is placed on the surface of the map. To set up the navigation task, at the start of each training episode the ball is placed at a random outdoor location within 0.002 degrees (roughly 200m) of the target. The observations collected by the agent are the position and velocity of the ball (the target is always positioned at the origin), and the actions received by the agent are converted into forces which are applied to the ball. For simplicity, we ignore the vertical (Y) direction. A simplified sketch of the agent code follows this list. The reward accumulated by the agent is as follows:

    • If the agent is > 300m from the target, it is penalized by -500 and the training episode ends.
    • If the agent reaches the target, it gains a reward of 500 and the training episode ends.
    • Each time the agent receives an action, it receives a penalty proportional to the distance to the target (maximum of -1).
  5. Drag the AgentBall script from the Assets panel onto the AgentBall object in the Hierarchy view. Set the Max Step parameter to 5000. NB: This will also add a Behaviour Parameters component that we will configure later.
  6. Drag the Target GameObject from the Hierarchy into the Target field of the AgentBall script component.
  7. Click Add Component in the Inspector and select ML Agents → Decision Requester. Set the Decision Period parameter to 10.
  8. Set the Behavior Parameters in the Inspector as follows:
    • Behavior Name = AgentBall
    • Vector Observation → Space Size = 4
    • Vector Action → Space Type = Continuous
    • Vector Action → Space Size = 2
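
To make the behaviour described in step 4 concrete, here is a trimmed-down sketch of what the key Agent overrides in AgentBall.cs might look like, written against the ML-Agents 1.0-era API (float[] actions, matching the Behavior Parameters in step 8). The force multiplier, the “reached the target” threshold and the exact shape of the per-step penalty are illustrative assumptions, and the random respawn and map-surface positioning are omitted; the AgentBall.cs file in the tutorial zip is the authoritative version.

using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

// Trimmed-down sketch of the AgentBall behaviour: observations, actions and rewards.
public class AgentBallSketch : Agent
{
    public Transform Target;             // assigned in the Inspector (step 6)
    public float ForceMultiplier = 10f;  // illustrative value

    Rigidbody m_rigidbody;

    void Start()
    {
        m_rigidbody = GetComponent<Rigidbody>();
    }

    public override void OnEpisodeBegin()
    {
        // Reset the ball; the real script also respawns it at a random outdoor
        // location within ~0.002 degrees of the target and re-snaps it to the map.
        m_rigidbody.velocity = Vector3.zero;
        m_rigidbody.angularVelocity = Vector3.zero;
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // 4 observations (Vector Observation Space Size = 4): planar position and
        // velocity. The target always sits at the origin, so position relative to
        // the origin is also position relative to the target.
        sensor.AddObservation(transform.localPosition.x);
        sensor.AddObservation(transform.localPosition.z);
        sensor.AddObservation(m_rigidbody.velocity.x);
        sensor.AddObservation(m_rigidbody.velocity.z);
    }

    public override void OnActionReceived(float[] vectorAction)
    {
        // 2 continuous actions (Vector Action Space Size = 2) become a horizontal force.
        var force = new Vector3(vectorAction[0], 0f, vectorAction[1]);
        m_rigidbody.AddForce(force * ForceMultiplier);

        // Planar (X/Z) distance to the target; the vertical direction is ignored.
        var toTarget = new Vector2(Target.localPosition.x - transform.localPosition.x,
                                   Target.localPosition.z - transform.localPosition.z);
        float distance = toTarget.magnitude;

        if (distance > 300f)        // wandered too far: -500 and end the episode
        {
            AddReward(-500f);
            EndEpisode();
        }
        else if (distance < 2f)     // reached the target: +500 and end the episode
        {
            AddReward(500f);
            EndEpisode();
        }
        else
        {
            // Per-action penalty proportional to distance, capped at -1.
            AddReward(-Mathf.Min(distance / 300f, 1f));
        }
    }

    public override void Heuristic(float[] actionsOut)
    {
        // Drive the ball with the cursor keys when no trained model is attached.
        actionsOut[0] = Input.GetAxis("Horizontal");
        actionsOut[1] = Input.GetAxis("Vertical");
    }
}

The Heuristic override is what lets you steer the ball with the cursor keys when no trained model is attached, which is handy for sanity-checking the setup before training.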

Now when you run the project you’ll find that the ball has been spawned somewhere nearby on the map (you might have to move the camera around a bit to locate it, as it might be behind a building). You can use the cursor keys to apply a force and move it around until you reach the target, at which point it will disappear and respawn.

Training the agent

Before we can train our agent, we need to configure the learning process. A detailed explanation of all the options is outside the scope of this tutorial, but you can find full descriptions here if you’re interested. We found that the following parameters, in the provided rollerballwrld_config.yaml file, work reasonably well for our scenario:

behaviors:
  AgentBall:
    trainer_type: ppo
    hyperparameters:
      batch_size: 500 
      buffer_size: 5000 
      learning_rate: 3.0e-4 
      beta: 5.0e-3 
      epsilon: 0.2
      lambd: 0.99 
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true 
      hidden_units: 256
      num_layers: 2 
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 500000
    time_horizon: 500 
    summary_freq: 1000

If you haven’t already, you’ll need to install the mlagents Python package into your Python environment.
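A minimal way to do that is with pip, ideally inside a dedicated Python virtual environment:

pip install mlagents

Once the package is installed you can start training by running: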

mlagents-learn rollerballwrld_config.yaml --run-id=FirstRun

And then pressing play in the Unity Editor when prompted.

If all is well, you should see the agent moving by itself as it explores the environment and learns to maximize its reward as shown in the video below. You can monitor the training process using TensorBoard.
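
To launch TensorBoard, point it at the directory that mlagents-learn writes its summaries to - this is results/ in recent releases of the mlagents package, while older releases wrote to summaries/ - then open the URL it prints (usually http://localhost:6006):

tensorboard --logdir results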

Next steps

Hopefully this tutorial has shown you how easy it is to get a simple ML Agent up and running with a WRLD environment, which opens up a wide range of possibilities. Here are some of the things we’ve thought of:

  • Adding more realistic agent physics - for example a car with steering and acceleration / braking, or a pedestrian model.
  • Adding more observations, such as raycasts, to enable the agent to perceive its environment without having to crash into buildings.
  • Modifying the reward function and training parameters to improve the training speed and agent performance.
  • Training an agent to follow a moving target - which could be used as the basis of a simple WRLD hide-and-seek game.

I’m sure there are lots of other applications that you’ll want to try out that we haven’t even thought of yet! If you have any questions, just get in touch with our support team and we’ll do our best to help.

Have fun building with WRLD, and don’t forget to share your results by tagging @wrld3d in your social posts!