My coworkers and I are guessing at what data defines an anchor. I tried searching but struggled to find anything helpful.
Our best guess was a combination of Triangular Irregular Networks (TIN), GPS, magnetic compass direction, and maybe elevation sensors.
Is this documented anywhere? If not, can a definition or description be provided?
Apple published a set of examples for using system gestures to interact with RealityKit entities. I've been using DragGesture a lot in my apps and noticed an issue when using it in an immersive space.
When dragging an entity, if I turn my body to face another direction, the dragged entity does not stay relative to my hand. This can lead to situations where the entity is pulled very close to me, pushed far away, or even ends up behind me.
In the examples linked above, there are two versions of how they use drag.
handleFixedDrag: This is similar to what I'm doing now. It uses the value from value.gestureValue.translation3D as the basis for the drag
handlePivotDrag: This version aims to solve the problem I described above by using value.inputDevicePose3D as the basis of the gesture.
I've tried the example from handlePivotDrag, but it has one limitation. Using this version, I can move the entity around me as if it were on the inside of an arc or sphere. However, I can no longer move the entity further or closer. It stays within a similar (though not exact) distance relative to me while I drag.
Is there a way to combine these concepts? Ideally, I would like to use a gesture that behaves the same way that visionOS windows do. When I drag windows, I can move them around relative to myself, pull them closer, and push them farther away, all while avoiding the issues described above.
Example from handleFixedDrag
mutating private func handleFixedDrag(value: EntityTargetValue<DragGesture.Value>) {
    let state = EntityGestureState.shared
    guard let entity = state.targetedEntity else { fatalError("Gesture contained no entity") }
    if !state.isDragging {
        state.isDragging = true
        state.dragStartPosition = entity.scenePosition
    }
    let translation3D = value.convert(value.gestureValue.translation3D, from: .local, to: .scene)
    let offset = SIMD3<Float>(x: Float(translation3D.x),
                              y: Float(translation3D.y),
                              z: Float(translation3D.z))
    entity.scenePosition = state.dragStartPosition + offset
    if let initialOrientation = state.initialOrientation {
        state.targetedEntity?.setOrientation(initialOrientation, relativeTo: nil)
    }
}
Example from handlePivotDrag
mutating private func handlePivotDrag(value: EntityTargetValue<DragGesture.Value>) {
    let state = EntityGestureState.shared
    guard let entity = state.targetedEntity else { fatalError("Gesture contained no entity") }
    // The transform that the pivot will be moved to.
    var targetPivotTransform = Transform()
    // Set the target pivot transform depending on the input source.
    if let inputDevicePose = value.inputDevicePose3D {
        // If there is an input device pose, use it for positioning and rotating the pivot.
        targetPivotTransform.scale = .one
        targetPivotTransform.translation = value.convert(inputDevicePose.position, from: .local, to: .scene)
        targetPivotTransform.rotation = value.convert(AffineTransform3D(rotation: inputDevicePose.rotation), from: .local, to: .scene).rotation
    } else {
        // If there is not an input device pose, use the location of the drag for positioning the pivot.
        targetPivotTransform.translation = value.convert(value.location3D, from: .local, to: .scene)
    }
    if !state.isDragging {
        // If this drag just started, create the pivot entity.
        let pivotEntity = Entity()
        guard let parent = entity.parent else { fatalError("Non-root entity is missing a parent.") }
        // Add the pivot entity into the scene.
        parent.addChild(pivotEntity)
        // Move the pivot entity to the target transform.
        pivotEntity.move(to: targetPivotTransform, relativeTo: nil)
        // Add the targeted entity as a child of the pivot without changing the targeted entity's world transform.
        pivotEntity.addChild(entity, preservingWorldTransform: true)
        // Store the pivot entity.
        state.pivotEntity = pivotEntity
        // Indicate that a drag has started.
        state.isDragging = true
    } else {
        // If this drag is ongoing, move the pivot entity to the target transform.
        // The animation duration smooths the noise in the target transform across frames.
        state.pivotEntity?.move(to: targetPivotTransform, relativeTo: nil, duration: 0.2)
    }
    if preserveOrientationOnPivotDrag, let initialOrientation = state.initialOrientation {
        state.targetedEntity?.setOrientation(initialOrientation, relativeTo: nil)
    }
}
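To make the "combine these concepts" idea concrete, here is a minimal math-only sketch, not Apple's implementation and not taken from the sample project. It assumes you can obtain a head (device) position in scene space from somewhere, which is not shown, and that the hand position comes from something like value.inputDevicePose3D. The entity is swung around the head by the same rotation the hand made, and its distance from the head is scaled by how much the hand moved closer or farther. All names here (Vec3, rotate, combinedDragPosition) are hypothetical.

```swift
typealias Vec3 = SIMD3<Double>

func length(_ v: Vec3) -> Double { (v * v).sum().squareRoot() }
func dot(_ a: Vec3, _ b: Vec3) -> Double { (a * b).sum() }
func cross(_ a: Vec3, _ b: Vec3) -> Vec3 {
    Vec3(a.y * b.z - a.z * b.y,
         a.z * b.x - a.x * b.z,
         a.x * b.y - a.y * b.x)
}

/// Rotates `v` by the shortest rotation taking unit vector `from` to unit vector `to`
/// (Rodrigues' rotation formula). The exactly antiparallel case is not handled in this sketch.
func rotate(_ v: Vec3, from: Vec3, to: Vec3) -> Vec3 {
    let axis = cross(from, to)
    let sinAngle = length(axis)
    let cosAngle = dot(from, to)
    guard sinAngle > 1e-9 else { return v } // directions (nearly) parallel: nothing to rotate
    let k = axis / sinAngle
    return v * cosAngle + cross(k, v) * sinAngle + k * (dot(k, v) * (1 - cosAngle))
}

/// Hypothetical helper: new entity position for the current hand position.
/// Assumes the hand never coincides with the head (no zero-length guard here).
func combinedDragPosition(head: Vec3, handStart: Vec3, handNow: Vec3, entityStart: Vec3) -> Vec3 {
    let h0 = handStart - head
    let h1 = handNow - head
    // Swing the entity around the head by the same angle the hand swung (the pivot behaviour).
    let swung = rotate(entityStart - head, from: h0 / length(h0), to: h1 / length(h1))
    // Scale the distance by the hand's reach ratio (the push/pull behaviour).
    return head + swung * (length(h1) / length(h0))
}
```

With the head at the origin, turning the hand from straight ahead to the right swings the entity to the right at the same distance, and doubling the hand's reach doubles the entity's distance. The one-to-one distance ratio is an arbitrary choice; a real gesture would probably want a tunable gain and some smoothing.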
Hi all,
I'm working on an ARKit-based iOS app where I need to accurately determine the direction the device is facing to localize objects in the real world. I'm using:
let config = ARWorldTrackingConfiguration()
config.worldAlignment = .gravityAndHeading
Thus, I would expect the world alignment to behave as given in the gravityAndHeading page.
The AR session is started after verifying that CLLocationManager.headingAccuracy <= 20, and the compass appears to be calibrated.
However, I'm seeing a major inconsistency:
When the rear camera is physically pointed toward true North, I would expect:
cameraTransform.columns.2.z ≈ -1 // (i.e. ARKit's -Z pointing North)
But instead, I'm consistently seeing:
cameraTransform.columns.2.z ≈ +0.97 // Implies camera is facing South
Meanwhile, the translation vector behaves as expected:
As I physically move North, cameraTransform.columns.3.z becomes more negative, matching the world’s +Z = South assumption.
For example, let's say I have the device in landscapeRight (or landscapeLeft for UIDeviceOrientation). Let's say the device rear camera is pointing towards True North, and I start moving towards True North. I get something like this:
Camera Transform = simd_float4x4(
    [
        [0.98446155, -0.030119859, 0.172998, 0.0],
        [0.023979114, 0.9990097, 0.037477385, 0.0],
        [-0.17395553, -0.032746706, 0.98420894, 0.0],
        [0.024039675, -0.037087332, -0.22780673, 0.99999994]
    ])
As you can see, cameraTransform.columns.2.z is positive despite the rear camera pointing towards True North, while cameraTransform.columns.3.z is correctly negative as the device is moving towards True North.
So here is my question:
Why is cameraTransform.columns.2.z positive when the rear camera is physically facing North?
Any clarity would be deeply appreciated. I've read the documentation and tested with different heading accuracies and AR session resets, but I keep running into this orientation mismatch.
Thanks in advance!
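A small sketch that may help when reading the matrix above. ARKit's camera space has its z-axis pointing away from the device on the screen side, so columns.2 is the direction out of the screen, and the rear camera looks along the negated columns.2. The snippet below just redoes that arithmetic on the numbers from the post using plain Swift SIMD types; the variable names are mine, and the heading formula assumes the .gravityAndHeading convention of -Z = north and +X = east.

```swift
import Foundation

// Columns of the camera transform quoted in the post (column-major order).
let columns: [SIMD4<Float>] = [
    SIMD4(0.98446155, -0.030119859, 0.172998, 0.0),
    SIMD4(0.023979114, 0.9990097, 0.037477385, 0.0),
    SIMD4(-0.17395553, -0.032746706, 0.98420894, 0.0),
    SIMD4(0.024039675, -0.037087332, -0.22780673, 0.99999994)
]

// columns[2] is the camera's local +Z axis in world space: it points out of the screen.
let screenSide = SIMD3(columns[2].x, columns[2].y, columns[2].z)

// The rear camera looks along local -Z, so negate it to get the viewing direction.
let forward = -screenSide

// Heading in degrees clockwise from north, assuming -Z is north and +X is east.
let headingDegrees = atan2(forward.x, -forward.z) * 180 / .pi

print("forward:", forward, "heading:", headingDegrees)
```

Under these assumptions the forward vector has a negative z component, i.e. it points roughly north, which would be consistent with a positive columns.2.z rather than contradicting it.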
Hi, I'm trying to create a virtual movie theater, but after running computeDiffuseReflectionUVs.py and applying the attenuation map, I noticed the light falloff effect just covers the objects. I used the Apple-provided attenuation map (I did not specify the attenuation map name in the Python script) with a sample size of 6000. I thought the Python script would process the vertices and create shadows for, say, the backs of the chairs. Am I understanding this wrong?
I am new to spatial computing, and I am using ARKit and RealityKit to develop a Vision Pro app.
My goal is this: if the user's hand touches an object (an entity in a RealityView) on the table, the app opens a window. But I do not know how to handle the event "the user's hand touches the object". Should I use the hand tracking feature and do the computation myself, or is there an API I can use directly?
Thank you!
In a SharePlay session, shared objects are not placed at the same position in the shared space for each participant. I would like them to appear in the same place for everyone. (Vision Pro)
Topic: Spatial Computing, SubTopic: ARKit
I have been referencing the Object Tracking tutorial from WWDC 2024 on visionOS, which shows how to use Create ML to create a reference object that can then be tracked in an ARSession.
I would like to build this feature into an AR app for iPhone; I am using an iPhone 13 Pro Max. I have created a couple of reference objects with Create ML.
Hi, I am a new developer. I want to add articulated objects and deformable objects to my AR game, and I would like the user to be able to interact with them. I haven't found any tutorial on this. Please let me know if this is available in visionOS.
I’m working on a Vision Pro app using Metal and need to implement multi-pass rendering. Specifically, I want to render intermediate results to a texture, then use that texture in a second pass for post-processing before presenting the final output.
What’s the best approach in visionOS? Should I use multiple render passes in a single command buffer or separate command buffers? Any insights on efficiently handling this in RealityKit or Metal?
Thanks!
I thought the ARCoachingOverlayView was a nice touch because it made each app's ARKit coaching recognizable, and I used it in my ARView/ARSCNView-based apps.
Now with RealityView, is there any replacement planned?
Or should we just use UIViewRepresentable and wrap ARCoachingOverlayView?
That title would have made a great WWDC session. Unfortunately, it seems like nothing is new in Reality Composer Pro this year. I've noticed that all versions of the Xcode beta this summer have shipped with Reality Composer Pro version 2.0. There have been slight bumps in the build number, but I haven't found any new features or seen any documentation to indicate that anything has changed.
So the question is, what is the state of Reality Composer Pro? Should we continue to use this tool or start doing everything in code? A huge number of Sample Projects use Reality Composer Pro, so it seems like Apple is still using it even if they didn't update it this year.
Topic: Spatial Computing, SubTopic: Reality Composer Pro
There seems to be a contradiction between how ARKit describes UIDeviceOrientation.landscapeRight and the actual definition of UIDeviceOrientation.landscapeRight in the UIKit documentation.
In the ARKit documentation for ARCamera.transform, it says the following:
This transform creates a local coordinate space for the camera that is constant with respect to device orientation. In camera space, the x-axis points to the right when the device is in UIDeviceOrientation.landscapeRight orientation—that is, the x-axis always points along the long axis of the device, from the front-facing camera toward the Home button. The y-axis points upward (with respect to UIDeviceOrientation.landscapeRight orientation), and the z-axis points away from the device on the screen side.
Going through the same link, we see the definition of UIDeviceOrientation.landscapeRight given as:
The device is in landscape mode, with the device held upright and the front-facing camera on the right side.
There seems to be a conflict between the two definitions, which has already been asked about and visualized in this StackOverflow thread.
The accepted answer there says that ARKit's landscapeRight, unlike UIDeviceOrientation.landscapeRight, has the Home button on the right, as stated in the ARCamera.transform documentation.
It says that more details are given in this StackOverflow thread, but that thread discusses the discrepancy between the definitions of landscapeRight in UIDeviceOrientation and UIInterfaceOrientation, not anything related to ARKit.
So I am wondering: why does ARKit's definition of landscapeRight contradict that of UIDeviceOrientation despite explicitly mentioning it? Is it just a mistake by Apple developers that hasn't been resolved after all this time?
Hi, we would like to create something where you can open multiple volumetric windows and place them in a room. Our biggest issue is that we want these windows to be persistent, so that when I close and reopen the app, the windows are in the same position. We can't use immersive spaces because we also want to be able to access the shared space.
Is it possible to do that with the current features and capabilities? If yes, do you have any advice on how we can achieve this?
The alternative would be to open the virtual display in an immersive space, if that is possible, or to implement our own virtual display.
When I've made an animated USDZ, at what frame rate will the animation be rendered in QuickLook? Is it the same across all devices (iPhone, Apple Vision Pro, etc.) and viewing environments (QuickLook, inside an ARView, etc.)?
Suppose I export my file at 30fps and the device draws at 60fps, does the device interpolate between frames automatically, animate at a lower frame rate, or play it at twice the speed? What if it were 24fps?
My primary concern with understanding frame rates is a bit of trouble I've had making perfectly looping animations. There always seems to be the slightest stutter between iterations.
Thanks in advance for any insights you're able to provide!
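To make the frame-rate arithmetic concrete, here is a small sketch. It assumes, and this is only an assumption about the player rather than documented QuickLook behaviour, that playback is driven by elapsed time and that the animation is sampled at whatever time each display frame lands on. The function name and parameters are hypothetical.

```swift
/// Maps elapsed playback time to a (possibly fractional) authored frame index
/// for a looping clip. `frameCount` is the number of unique frames in one loop.
func sourceFrame(atTime seconds: Double, sourceFPS: Double, frameCount: Int) -> Double {
    let loopDuration = Double(frameCount) / sourceFPS
    // Wrap the time back into the first loop, then convert seconds to frames.
    let wrapped = seconds.truncatingRemainder(dividingBy: loopDuration)
    return wrapped * sourceFPS
}

// A 30 fps, 30-frame loop sampled by a 60 Hz display: every other display
// frame falls halfway between two authored frames.
let displayHz = 60.0
let sampled = (0..<4).map { tick in
    sourceFrame(atTime: Double(tick) / displayHz, sourceFPS: 30, frameCount: 30)
}
print(sampled) // roughly 0, 0.5, 1, 1.5
```

Under that assumption, a time-driven player never runs the clip at double speed; it either interpolates the in-between samples or holds the nearest authored frame. One thing worth checking for the loop stutter is whether the exported clip ends on a duplicate of its first frame: if it does, the same pose is shown at the end of one iteration and again at the start of the next, which can read as a brief hitch.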
Hi there,
I’m building a workplace experience that requires using the virtual desktop. Is there a way to launch it from my code, so the user doesn’t have to do it manually?
Thanks in advance!
I have a visionOS 2 project created in Xcode 16. After I updated to Xcode 26 beta 5, I can't build it any more; every build gets stuck at the step shown in the picture below:
I have already tried several fixes, such as cleaning the build folder, but none of them work.
MacBook Air M2 / macOS 26 beta 5 / Xcode 26 beta 5
I’ve been having some issues removing anchors. I can add anchors with no issue. They will be there the next time I run the scene. I can also get updates when ARKit sends them. I can remove anchors, but not all the time. The method I’m using is to call removeAnchor() on the data provider.
worldTracking.removeAnchor(forID: uuid)
// Yes, I have also tried `removeAnchor(_ worldAnchor: WorldAnchor)`
This works if there is more than one anchor in a scene. When I’m down to one remaining anchor, I can remove it. It seems to succeed (it does not raise an error), but the next time I run the scene the removed anchor is back. This only happens when there is only one remaining anchor.
do {
    // This always runs, but it doesn't seem to "save" the removal when there is only one anchor left.
    try await worldTracking.removeAnchor(forID: uuid)
} catch {
    // I have never seen this block fire!
    print("Failed to remove world anchor \(uuid) with error: \(error).")
}
I posted a video on my website if you want to see it happening.
https://stepinto.vision/labs/lab-051-issues-with-world-tracking/
Here is the full code. Can you see if I’m doing something wrong? Is this a bug?
struct Lab051: View {
    @State var session = ARKitSession()
    @State var worldTracking = WorldTrackingProvider()
    @State var worldAnchorEntities: [UUID: Entity] = [:]
    @State var placement = Entity()
    @State var subject : ModelEntity = {
        let subject = ModelEntity(
            mesh: .generateSphere(radius: 0.06),
            materials: [SimpleMaterial(color: .stepRed, isMetallic: false)])
        subject.setPosition([0, 0, 0], relativeTo: nil)
        let collision = CollisionComponent(shapes: [.generateSphere(radius: 0.06)])
        let input = InputTargetComponent()
        subject.components.set([collision, input])
        return subject
    }()
    var body: some View {
        RealityView { content in
            guard let scene = try? await Entity(named: "WorldTracking", in: realityKitContentBundle) else { return }
            content.add(scene)
            if let placementEntity = scene.findEntity(named: "PlacementPreview") {
                placement = placementEntity
            }
        } update: { content in
            for (_, entity) in worldAnchorEntities {
                if !content.entities.contains(entity) {
                    content.add(entity)
                }
            }
        }
        .modifier(DragGestureImproved())
        .gesture(tapGesture)
        .task {
            try! await setupAndRunWorldTracking()
        }
    }
    var tapGesture: some Gesture {
        TapGesture()
            .targetedToAnyEntity()
            .onEnded { value in
                if value.entity.name == "PlacementPreview" {
                    // If we tapped the placement preview cube, create an anchor
                    Task {
                        let anchor = WorldAnchor(originFromAnchorTransform: value.entity.transformMatrix(relativeTo: nil))
                        try await worldTracking.addAnchor(anchor)
                    }
                } else {
                    Task {
                        // Get the UUID we stored on the entity
                        let uuid = UUID(uuidString: value.entity.name) ?? UUID()
                        do {
                            try await worldTracking.removeAnchor(forID: uuid)
                        } catch {
                            print("Failed to remove world anchor \(uuid) with error: \(error).")
                        }
                    }
                }
            }
    }
    func setupAndRunWorldTracking() async throws {
        if WorldTrackingProvider.isSupported {
            do {
                try await session.run([worldTracking])
                for await update in worldTracking.anchorUpdates {
                    switch update.event {
                    case .added:
                        let subjectClone = subject.clone(recursive: true)
                        subjectClone.isEnabled = true
                        subjectClone.name = update.anchor.id.uuidString
                        subjectClone.transform = Transform(matrix: update.anchor.originFromAnchorTransform)
                        worldAnchorEntities[update.anchor.id] = subjectClone
                        print("🟢 Anchor added \(update.anchor.id)")
                    case .updated:
                        guard let entity = worldAnchorEntities[update.anchor.id] else {
                            print("No entity found to update for anchor \(update.anchor.id)")
                            // Skip this update but keep listening; `return` here would end the whole update loop.
                            continue
                        }
                        entity.transform = Transform(matrix: update.anchor.originFromAnchorTransform)
                        print("🔵 Anchor updated \(update.anchor.id)")
                    case .removed:
                        worldAnchorEntities[update.anchor.id]?.removeFromParent()
                        worldAnchorEntities.removeValue(forKey: update.anchor.id)
                        print("🔴 Anchor removed \(update.anchor.id)")
                        if let remainingAnchors = await worldTracking.allAnchors {
                            print("Remaining Anchors: \(remainingAnchors.count)")
                        }
                    }
                }
            } catch {
                print("ARKit session error \(error)")
            }
        }
    }
}
In Xcode 26 and visionOS 26, Apple made Entity observable, so we can easily share Entity state between a RealityKit scene and SwiftUI. However, there is an issue:
Observing an Entity's position and scale properties in a Slider works fine, but observing its orientation property in a Slider does not.
MacBook Air M2 / Xcode 26 beta 6
I am encountering an issue while using the multiview video demo provided at this link "https://developer.apple.com/documentation/avkit/creating-a-multiview-video-playback-experience-in-visionos/". Specifically, when running on versions of visionOS prior to 2.2, navigating back results in a blank screen. Has anyone else experienced this problem and found a solution? Any advice or workaround would be greatly appreciated.
Hello,
I'm developing a LiDAR-based scanning app using Swift, where I can successfully perform scans and export the results as .obj files. My goal is to have the scan's colors and textures closely resemble real-world visuals as captured by the camera, similar to the results shown in this repository.
In the referenced repository, the result is demonstrated with a single screenshot, but I want to display the textures and colors throughout the entire scanning process, not just at the final result. To clarify, I'm not focused on scanning individual objects but rather larger environments like rooms, houses, or outdoor spaces such as streets.
Here’s what I’m aiming for:
Realistic colors and textures that match what the camera sees during the scan.
Continuous texture rendering during the scanning process, not just in the final exported model.
Could anyone share guidance, sample code, or point me to relevant documentation to achieve this? Any help would be greatly appreciated!
Thank you!