Hello,
I checked following documentations.
Vision | Apple Developer Documentation
Discover Swift enhancements in the Vision framework - WWDC24 - Videos - Apple Developer
I saw Vision Framework is available on visionOS.
So I want to know that if it's possible using Vision Framework on visionOS for tracking human and animal body poses. Or are there some limits to use this on visionOS?
Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Hi,
I am in the process of implementing SharePlay into our app. The shared experience opens an Immersive Space and we set systemCoordinator.configuration.supportsGroupImmersiveSpace = true
Now visionOS establishes a shared coordinate space for the immersive space.
From the docs:
To achieve consistent positioning of RealityKit entities across multiple devices in an immersive space during a SharePlay session
There are cases where we want to position content in front of the user (independent of the shared session, and for each user individually). Normally to do that we use the transform retrieved via worldTrackingProvider.queryDeviceAnchor.originFromAnchorTransform
to position content in front of the user (plus some Z Offset and smooth interpolation).
This works fine in non-SharePlay instances and the device transform is where I would expect it to be but during the FaceTime call deviceAnchor.originFromAnchorTransform seems to use the shared origin of the immersive space and then I end up with a transform that might be offset.
Here is a video of the issue in action: https://streamable.com/205r2p
The blue rect is place using AnchorEntity(.head, trackingMode: .continuous). This works regardless of the call and the entity is always placed based on the head position.
The green rect is adjusted on every frame using the transform I get from worldTrackingProvider.queryDeviceAnchor. As you can see it's offset.
Is there any way I can query query this transform locally for the user during a FaceTime call?
Also I would like to know if it's possible to disable this automatic entity transform syncing behavior?
Setting entity.synchronization = nil results in the entity not showing up at all.
https://developer.apple.com/documentation/realitykit/synchronizationcomponent
Is SynchronizationComponent only relevant for the legacy MultiPeerConnectivity approach?
Thank you!
I created a new Spatial Rendering App from the template in Xcode 26.0.1. When I run the app, click 'Show Immersive Space' and select my Vision Pro from the pop-up dialog, the content in the dialog flickers (which seems to indicate something crashed) and nothing appears on my Vision Pro.
I'm running the released macOS 26.0 (25A354) and visionOS 26.0 (23M336). Filed as FB20397093.
I'm working on a project that uses imageTrackingProvider through ARKit on VisionPro, and I want to detect multiple images(about 5) and show info at the same time.
However, I found that it seems only 1 image could be detected by device at one time.
And the api of maximumNumberOfTrackedImages doing this seems not available for visionOS but only iOS.
Anyone knows possible ways to detect multiple images at the same time on VisionPro?
Topic:
Spatial Computing
SubTopic:
ARKit
Hello,
I have downloaded and run the sample object tracking app for visionos.
Now I'm working on my own objects for tracking. I have made a model using Create ML using images of my object.
However, I cannot see how to convert the Create ML output file (xxx.mlmodel) into a reference object like the files in the sample project.
is there a tool for converting them?
TIA
Topic:
Spatial Computing
SubTopic:
ARKit
it looks like one week after accepting as a nearby other AVP device... it expires
since we are providing our clients for a timeless app to walk inside archtiecture, it's a shame that not technical staff should connect every week 5 devices to work together
is there any roundabout for this issue or straight to the wishlist ?
thanks for the support !!
Hello, I am currently developing a Vision Pro VR application with Unreal Engine 5.5. Is it possible to interact with objects (grabbing, clicking on buttons)? I cannot find any information on this. Thank you.
Topic:
Spatial Computing
SubTopic:
General
How do I configure a Unity project for a fully immersive VR app on Apple Vision Pro using Metal Rendering, and add a simple pinch-to-teleport-where-looking feature? I've tried the available samples and docs, but they don't cover this clearly (to me).
So far, I've reviewed Unity XR docs, Apple dev guides, and tutorials, but most emphasize spatial apps. Metal examples exist but don't include teleportation. Specifically:
visionOS sample "XRI_SimpleRig" – Deploys to device/simulator, but no full immersion or teleport.
XRI Toolkit sample "XR Origin Hands (XR Rig)" – Pinch gestures detect, but not linked to movement.
visionOS "XR Plugin" sample "Metal Sample URP" – Metal setup works, but static scene without locomotion.
I'm new in Unity XR development and would appreciate a simple, standalone scene or document focused only on the essentials for "teleport to gaze on pinch" in VR mode—no extra features. I do have some experience in unreal, world toolkit, cosmo, etc from the 90's and I'm ok with code.
Please include steps for:
Setting up immersive VR (disabling spatial defaults if needed).
Integrating pinch detection with ray-based teleport.
Any config changes or basic scripts.
Project Configuration:
Unity Editor Version: 6000.2.5f1.2588.7373 (Revision: 6000.2/staging 43d04cd1df69)
Installed Packages:
Apple visionOS XR Plugin: 2.3.1
AR Foundation: 6.2.0
PolySpatial XR: 2.3.1
XR Core Utilities: 2.5.3
XR Hands: 1.6.1
XR Interaction Toolkit: 3.2.1
XR Legacy Input Helpers: 2.1.12
XR Plugin Management: 4.5.1
Imported Samples:
Apple visionOS XR Plugin 2.3.1: Metal Sample - URP
XR Hands 1.6.1
XR Interaction Toolkit 3.2.1: Hands Interaction Demo, Starter Assets, visionOS
Build Platform Settings:
Target: Apple visionOS
App Mode: Metal Rendering with Compositor Services
Selected Validation Profiles: visionOS Metal
Documentation: Enabled
Xcode Version: 26.01
visionOS SDK: 26
Mac Hardware: Apple M1 Max
Target visionOS Version: 20 or 26
Test Environment: Model: Apple Vision Pro, visionOS 26.0.1 (23M341), Apple M1 Max
No errors in builds so far; just missing the desired functionality.
Thanks for a complete response with actionable steps.
Is there any way to convert TextureResource to Image
I am running a Spatial Rendering App template demo, it shows “No People Found ” “There is no one nearby to share with”.
How can I stream videos rendered by Mac to my vision pro
I am using macOS 26.0, visionOS 26, Xcode 26
Topic:
Spatial Computing
SubTopic:
General
Dear Apple Engineers,
I am working on a project in visionOS and need to implement a curved surface effect for video playback, where the width of the surface can be dynamically adjusted. Specifically, I want the video to be displayed on a curved surface (similar to a scroll unfolding), and the user should be able to adjust the width of this surface.
I have the following specific questions:
How can I implement a curved surface for video playback and ensure the video content is not stretched or distorted on the surface?
How can I create a dynamic curved surface (such as a bending plane) in RealityKit or visionOS, where the width can be adjusted by the user?
Is it possible to achieve more complex curved surface effects (such as scroll unfolding or bending) using Shaders or other techniques?
Thank you very much for your help!
When I run my app from Xcode on a device running iOS 26, the roomplan capture is corrupted and the recording is green and purple. This issue does not occur when I use an older version of iOS or when I run the app via testFlight or the App Store.
While using Screen Mirroring in developer mode within my immersive space, I noticed an alignment issue with the computer cursor (transparent circle). When I move it toward an attachment view, the cursor remains horizontal instead of aligning with the surface of the attachment view. It shows correctly on a 2D window only wrong on attachment view.
Is this behavior a bug, or could it be caused by a missing or incorrect configuration on the attachment view?
Want help, thanks.
I'm placing sphere at finger tip and updating its position as hand move.
Finger joint tracking functions correctly, but I’ve observed noticeable latency in hand tracking updates whenever a UITextView becomes active. This lag happens intermittently during app usage, lasting about 5–10 seconds, after which the latency disappears and the sphere starts following the finger joints immediately.
When I open the immersive space for the first time, the profiler shows a large performance spike upto 328%. After that, it stabilizes and runs smoothly.
Note: I don’t observe any lag when CPU usage spikes to 300% (upon immersive view load)
yet the lag still occurs even when CPU usage remains below 100%.
I’m using the following code for hand tracking:
private func processHandTrackingUpdates() async {
for await update in handTracking.anchorUpdates {
let handAnchor = update.anchor
if handAnchor.isTracked {
switch handAnchor.chirality {
case .left:
leftHandAnchor = handAnchor
updateHandJoints(for: handAnchor, with: leftHandJointEntities)
case .right:
rightHandAnchor = handAnchor
updateHandJoints(for: handAnchor, with: rightHandJointEntities)
}
} else {
switch handAnchor.chirality {
case .left:
leftHandAnchor = nil
hideAllJoints(in: leftHandJointEntities)
case .right:
rightHandAnchor = nil
hideAllJoints(in: rightHandJointEntities)
}
}
await MainActor.run {
handTrackingData.processNewHandAnchors(
leftHand: self.leftHandAnchor,
rightHand: self.rightHandAnchor
)
}
}
}
And here’s the function I’m using to update the joint positions:
private func updateHandJoints(
for handAnchor: HandAnchor,
with jointEntities: [HandSkeleton.JointName: Entity]
) {
guard handAnchor.isTracked else {
hideAllJoints(in: jointEntities)
return
}
// Check if the little finger tip and intermediate base are both tracked.
if let tipJoint = handAnchor.handSkeleton?.joint(.littleFingerTip),
let intermediateBaseJoint = handAnchor.handSkeleton?.joint(.littleFingerIntermediateTip),
tipJoint.isTracked,
intermediateBaseJoint.isTracked,
let pinkySphere = jointEntities[.littleFingerTip] {
// Convert joint transforms to world space.
let tipTransform = handAnchor.originFromAnchorTransform * tipJoint.anchorFromJointTransform
let intermediateBaseTransform = handAnchor.originFromAnchorTransform * intermediateBaseJoint.anchorFromJointTransform
// Extract positions from the transforms.
let tipPosition = SIMD3<Float>(tipTransform.columns.3.x,
tipTransform.columns.3.y,
tipTransform.columns.3.z)
let intermediateBasePosition = SIMD3<Float>(intermediateBaseTransform.columns.3.x,
intermediateBaseTransform.columns.3.y,
intermediateBaseTransform.columns.3.z)
// Calculate the midpoint.
let midpointPosition = (tipPosition + intermediateBasePosition) / 2.0
// Position the sphere at the midpoint and make it visible.
pinkySphere.isEnabled = true
pinkySphere.transform.translation = midpointPosition
} else {
// If either joint is not tracked, hide the sphere.
jointEntities[.littleFingerTip]?.isEnabled = false
}
// Update the positions of all other hand joint spheres.
for (jointName, entity) in jointEntities {
if jointName == .littleFingerTip {
// Already handled the pinky above.
continue
}
guard let joint = handAnchor.handSkeleton?.joint(jointName),
joint.isTracked else {
entity.isEnabled = false
continue
}
entity.isEnabled = true
let jointTransform = handAnchor.originFromAnchorTransform * joint.anchorFromJointTransform
entity.transform.translation = SIMD3<Float>(jointTransform.columns.3.x,
jointTransform.columns.3.y,
jointTransform.columns.3.z)
}
}
I’ve attached both a profiler trace and a video recording from Vision Pro that clearly demonstrate the issue.
Profiler: https://drive.google.com/file/d/1fDWyGj_fgxud2ngkGH_IVmuH_kO-z0XZ
Vision Pro Recordings:
https://drive.google.com/file/d/17qo3U9ivwYBsbaSm26fjaOokkJApbkz-
https://drive.google.com/file/d/1LxTxgudMvWDhOqKVuhc3QaHfY_1x8iA0
Has anyone else experienced this behavior? My thought is that there might be some background calculations happening at the OS level causing this latency. Any guidance would be greatly appreciated.
Thanks!
Hello,
We discovered that a bunch of our old animated models were no longer animated on iOS15 and onwards.
After a few days of playing spot the difference between usda files I noticed that all the broken models had an xform called "Scene". Lo and behold, changing the name of that xform fixed the issue on all the models. Even lowercase "scene" makes the animations work again. Is "Scene" a reserved keyword or something? What other keywords do we need to avoid so we can create more robust USDZ files?
I'm surprised this issue isn't more widespread considering Blender wraps models in a "Scene" node.
At the drive link below you can find two animated cube USDZs. The only difference is the name of one of the xforms. The one with a "Scene" xform is not animated in quicklook (replicated on iPhone 13 iOS v15.2, iPhone 13 iOS v 18.3, and various devices on Browserstack including iPhone 16 iOS v18.3).
https://drive.google.com/drive/folders/1dch1WaM9O6mbHy29S6NGWgnSHkZkPiBf?usp=sharing
Hello,
In my project, I have attached a ManipulationComponent to Entity A and as expected, I'm able interact with it using the built-in gestures. I have another Entity B which is a child of A that I would like to interact with as well, so I attempted to add a ManipulationComponent to B. However, no gestures seem to be registered on B; I can still interact with A but B cannot be interacted with despite having ManipulationComponents on both entities.
So I'm wondering if I'm just doing something wrong, if this is an issue with the ManipulationComponent, or if this is a limitation of the API.
Attached is the code used to add the ManipulationComponent to an Entity and it was done on both A and B:
let mc = ManipulationComponent()
model.components.set(mc)
var boxShape = ShapeResource.generateBox(width: 0.25, height: 0.05, depth: 0.25)
boxShape = boxShape.offsetBy(translation: simd_float3(0, -0.05, -0.25))
ManipulationComponent.configureEntity(model, collisionShapes: [boxShape])
if var mc = model.components[ManipulationComponent.self] {
mc.releaseBehavior = .stay
mc.dynamics.inertia = .low
model.components.set(mc)
}
I am using visionOS 26.0; let me know if there's any additional information needed.
Here is my code in visionOS 2.3
NavigationSplitView {
List {
}
.navigationTitle("Passwords")
} detail: {
Text("Hello")
.navigationTitle("All")
}
The font size of "Passwords" and "All" are smaller than the ones in Passwords app.
iOS currently restricts background Bluetooth advertising and scanning in order to preserve battery life and protect user privacy. While these restrictions serve important purposes, they also limit legitimate use cases where users have explicitly opted in to proximity-based experiences.
The core challenge is that modern social applications need a way to detect when users are physically present at the same location or event without requiring every participant to keep their app in the foreground. Under the current system, background BLE advertising is heavily throttled and can only transmit a limited payload, background scanning intervals are sparse and unpredictable, peer-to-peer proximity detection cannot be maintained reliably when apps are in the background, and Background App Refresh is non-deterministic, making any kind of time-based proximity validation impossible.
A proposed enhancement would be to introduce an “Enhanced Proximity Permission.” This would allow developers to enable reliable background BLE advertising and scanning for declared time windows, such as a maximum of eight hours. It would also allow devices running the same app to detect each other’s proximity using ephemeral, rotating identifiers that preserve privacy, with clear user consent and prominent indicators whenever the feature is active.
Unlocking this capability would open up new categories of applications. Live events could offer automatic attendance tracking at concerts, conferences, or sports venues. Retail environments could support opt-in foot traffic analysis and dwell-time insights. Social apps could allow users to find friends at festivals, campuses, or other large venues. Safety applications could extend to crowd density monitoring and contact tracing beyond COVID-era needs. Gaming could offer real-world multiplayer experiences based on physical proximity, and transportation providers could verify rideshare pickups or measure public transit flows automatically.
Privacy safeguards would remain central. Permissions would be time-boxed and expire after an event or session. A mandatory visual indicator would be displayed whenever proximity tracking is active. A user-facing dashboard would show all apps granted enhanced proximity access. Permissions would automatically be revoked after a period of non-use, and only ephemeral tokens not permanent identifiers would be broadcast.
The industry impact would be significant. With this enhancement, iOS could power the next generation of location-aware social platforms while maintaining Apple’s leadership in privacy through explicit user control and transparency. Current alternatives, such as requiring users to keep apps in the foreground or deploying dedicated hardware beacons, produce poor user experiences and constrain innovation in spatial computing and social applications.
Can anyone from Apple consider this change? Having to buy iBeacons is brutal and means slower adoption. Please reconsider this for users who opt in.
I want to render a 3d/stereoscopic video in an Apple Vision Pro window using RealityKit/RealityView. The video is a left-right stereo. The straight forward approach would be to spawn a quad, and give it a custom Shader Graph material, which has a CameraIndexSwitch. The CameraIndexSwitch chooses between the right texture vs the left texture.
https://i.sstatic.net/XawqjNcg.png
The issue I have here is that I have to extract the video frames from my AVSampleBufferVideoRenderer. This should work ok, but not if I'm playing FairPlay content.
So, my question is, how to render stereo FairPlay videos in a SwiftUI RealityView?
Topic:
Spatial Computing
SubTopic:
Reality Composer Pro
Tags:
Metal
MetalKit
RealityKit
AVFoundation
I am creating a vision pro app with a 3D model, it has a mesh hierarchy of head, hands, feet etc. I want the character to look towards the camera, but am not able to access head of character through sceneKit nor reality kit. when I try to print names of the child meshes, it only prints till the character, it does iterate through all the body parts. Can anyone help?