Lars Augustin

RealityKit Basics - Part 2

2019 12 27

This tutorial was originally written for and published in Futureproofd on Medium

After disappearing for five months, I finally got around to writing part two.

In this part of my series on RealityKit, we’ll look at loading models and interactions.

Loading models

In part one of this series, I went over RealityKit’s entity system and generating meshes. This time, instead of generating meshes, we’ll load custom models as entities from a .usdz file in our app’s bundle.

Let’s get started with the simplest of loading methods:

let model = try! Entity.loadModel(named: "modelName") // force-try crashes if the file can’t be loaded

Unlike generating a mesh, loading a model from a file creates an entity, not a mesh. We can add this entity to our anchor:
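Assuming the anchor from part one is an AnchorEntity named anchor (a name chosen here for illustration), adding the loaded model is a single call:

```swift
// Attach the loaded model entity to an existing anchor in the scene.
anchor.addChild(model)
```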


This way, we can load a single model once. If we need the same model more than once, we can clone it instead of loading it twice. Cloning creates an identical copy, which references the same model.

let clone = model.clone(recursive: true)

Now we can add the clone to our anchor the same way we added the original model.

When loading models this way, the scene freezes until the model is fully loaded. If you want to load a model after the experience has already started, you should use the asynchronous loader.

Quick note about async loading

When you start a scene, you shouldn’t show the camera feed until all objects are loaded. Starting with an empty scene can be confusing and disorienting for the user.

The asynchronous loader uses a similar syntax to the normal one:

_ = Entity.loadModelAsync(named: "modelName")

This time we don’t need to store the loader in a variable, since the loader uses Apple’s new Combine framework. To receive the model once it’s fully loaded, subscribe with sink. If we need to add another model to the load request, we can append another request:

_ = Entity.loadModelAsync(named: "modelName")
       .append(Entity.loadModelAsync(named: "secondModelName"))

Before processing all of the models, we need to collect them by calling collect. This time we also need to handle multiple models in the sink callback.

You can append multiple load requests to load lots of models at the same time.
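Putting the async pieces together, here is a sketch of the whole pipeline. It assumes anchor is your AnchorEntity, and note that the returned Combine subscription (the AnyCancellable) must be kept alive until loading finishes:

```swift
import Combine
import RealityKit

var cancellable: AnyCancellable?

cancellable = Entity.loadModelAsync(named: "modelName")
    .append(Entity.loadModelAsync(named: "secondModelName"))
    .collect() // wait for every request, then deliver all models as one array
    .sink(receiveCompletion: { completion in
        if case .failure(let error) = completion {
            print("Couldn’t load models: \(error)")
        }
        cancellable = nil
    }, receiveValue: { models in
        // All models are ready; add them to the scene at once.
        for model in models {
            anchor.addChild(model)
        }
    })
```

In a real app you would store the cancellable in a property of your view controller so the load request isn’t deallocated early.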

All of this might have been a bit complex, but in general, you only need the first loading function to start out. If you have any further questions about the loading system, please put them in the comments to this post.


Interactions

When building AR experiences, you’ll probably need to observe taps on objects. To determine which object a user tapped, the 2D position of the tap on the screen needs to be converted into the 3D space of the AR scene. RealityKit does this with ray casting: it sends a ray from the tapped screen position into the virtual scene, and the first object the ray hits is the one that was tapped.

To make an entity tappable, we first need to generate a collision shape for it:

entity.generateCollisionShapes(recursive: true)

To let the user tap on an object, we first need to add a UITapGestureRecognizer to our ARView. Within the callback function, we get the position of the touch. Now we can convert it using the following RealityKit function:

if let entity = arView.entity(at: tapLocation) {
    // do stuff with the tapped entity
}
This function saves us the step of determining which entity sits at a given 3D position ourselves. We get direct access to the entity and can modify it or call any other function in our app.
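Putting the pieces together, here is a sketch of the full tap-handling setup. It assumes a view controller that owns the arView; the handleTap name is chosen here for illustration:

```swift
import UIKit
import RealityKit

class ViewController: UIViewController {
    @IBOutlet var arView: ARView!

    override func viewDidLoad() {
        super.viewDidLoad()
        // Register a tap gesture recognizer on the ARView.
        let tap = UITapGestureRecognizer(target: self, action: #selector(handleTap(_:)))
        arView.addGestureRecognizer(tap)
    }

    @objc func handleTap(_ recognizer: UITapGestureRecognizer) {
        // Convert the screen-space tap into the entity underneath it.
        let tapLocation = recognizer.location(in: arView)
        if let entity = arView.entity(at: tapLocation) {
            // React to the tap, e.g. highlight or move the entity.
        }
    }
}
```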

RealityKit’s input system is pretty basic for now. The only other option is to use ARView.entities(at: tapLocation) to get an array of all objects at the user’s tap position. These two functions will cover most use cases, but having more options would definitely be nice.
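If you need everything under the tap rather than just the frontmost entity, the second function can be used like this (again assuming arView and tapLocation from the handler above):

```swift
// entities(at:) returns every entity whose collision shape lies under the tap.
for entity in arView.entities(at: tapLocation) {
    // Inspect or modify each entity under the tap.
}
```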

Wrapping up

Since the explanation of RealityKit’s loading mechanism turned out to be way longer than expected, I’ll end this part here. In the next parts, I’ll cover animation, physics, and audio.