Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.

All subtopics
Posts under Media Technologies topic

Post

Replies

Boosts

Views

Created

Hosting x86 Audio Units on Silicon Mac
My app encountered problems when trying to open an x86 audioUnit v2 on a Silicon Mac (although Rosetta is installed). There seems to be a XPC connection issue with the AUHostingService that I don't know how to fix. I observed other host apps opening the same plugins without problem, so there is probably something wrong or incompatible in my codes. I noticed that: The issue occurs whether or not the app is sandboxed. The issue does no longer occur when the app itself runs under Rosetta. There is no error reported by CoreAudio during allocation and initialization of the audio unit. The first notified errors appears when the unit calls AudioUnitRender from the rendering callback. With most x86 plugins, the error is on first call: kAudioUnitErr_RenderTimeout and on any subsequent call: kAudioComponentErr_InstanceInvalidated On the UI side, when the Cocoa View is loaded, it appears shortly, then disappears immediately leaving its superview empty. With another x86 plugin, the Cocoa View is loaded normally, but CoreAudio still emits kAudioUnitErr_NoConnection from AudioUnitRender, whether the view has been loaded or not, and the plugin produces no sound. I also find these messages in the console (printed in that order): CLIENT ERROR: RemoteAUv2ViewController does not override - and thus cannot react to catastrophic errors beyond logging them AUAudioUnit_XPC.mm:641 Crashed AU possible component description: aumu/Helm/Tyte My app uses the AUv2 API and I suspect that working with the AUv3 API would spare me these problems. However, considering how my audio system is built (audio units are wrapped into C++ classes and most connections between units are managed on the fly from the rendering callback), it would be a lot of work to convert, and I’m even not sure that all I do with the AUv2 API would be possible with the AUv3 API. I could possibly find an intermediate solution, but in the immediate future I'm looking for the simplest and fastest possible fix. If I cannot find better, I see two fallback options: In this part of the doc: “Beginning with macOS 11, the system loads audio units into a separate process that depends on the architecture or host preference”, does “host preference” means that it would be possible to disable the “out of process” behavior, for example from the app entitlements or info.plist? Otherwise, as a last resort, I could completely disable the use of x86 audioUnits when my app runs under ARM64, for at least making things cleaner. But the Audio Component API doesn’t give any info about the plugin architecture, how could I found it? Any tip or idea about this issue will be much appreciated. Thanks in advance!
2
0
424
Nov ’25
Error when capturing a high-resolution frame with depth data enabled in ARKit
Problem Description (1) I am using ARKit in an iOS app to provide AR capabilities. Specifically, I'm trying to use the ARSession's captureHighResolutionFrame(using:) method to capture a high-resolution frame along with its corresponding depth data: open func captureHighResolutionFrame(using photoSettings: AVCapturePhotoSettings?) async throws -> ARFrame (2) However, when I attempt to do so, the call fails at runtime with the following error, which I captured from the Xcode debugger: [AVCapturePhotoOutput capturePhotoWithSettings:delegate:] settings.depthDataDeliveryEnabled must be NO if self.isDepthDataDeliveryEnabled is NO Code Snippet Explanation (1) ARConfig and ARSession Initialization The following code configures the ARConfiguration and ARSession. A key part of this setup is setting the videoFormat to the one recommended for high-resolution frame capturing, as suggested by the documentation. func start(imagesDirectory: URL, configuration: Configuration = Configuration()) { // ... basic setup ... let arConfig = ARWorldTrackingConfiguration() arConfig.planeDetection = [.horizontal, .vertical] // Enable various frame semantics for depth and segmentation if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) { arConfig.frameSemantics.insert(.smoothedSceneDepth) } if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) { arConfig.frameSemantics.insert(.sceneDepth) } if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) { arConfig.frameSemantics.insert(.personSegmentationWithDepth) } // Set the recommended video format for high-resolution captures if let videoFormat = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing { arConfig.videoFormat = videoFormat print("Enabled: High-Resolution Frame Capturing by selecting recommended video format.") } arSession.run(arConfig, options: [.resetTracking, .removeExistingAnchors]) // ... } (2) Capturing the High-Resolution Frame The code below is intended to manually trigger the capture of a high-resolution frame. The goal is to obtain both a high-resolution color image and its associated high-resolution depth data. To achieve this, I explicitly set the isDepthDataDeliveryEnabled property of the AVCapturePhotoSettings object to true. func requestImageCapture() async { // ... guard statements ... print("Manual image capture requested.") if #available(iOS 16.0, *) { // Assuming 16.0+ for this API if let defaultSettings = arSession.configuration?.videoFormat.defaultPhotoSettings { // Create a mutable copy from the default settings, as recommended let photoSettings = AVCapturePhotoSettings(from: defaultSettings) // Explicitly enable depth data delivery for this capture request photoSettings.isDepthDataDeliveryEnabled = true do { let highResFrame = try await arSession.captureHighResolutionFrame(using: photoSettings) print("Successfully captured a high-resolution frame.") if let initialDepthData = highResFrame.capturedDepthData { // Process depth data... } else { print("High-resolution frame was captured, but it contains no depth data.") } } catch { // The exception is caught here print("Error capturing high-resolution frame: \(error.localizedDescription)") } } } // ... } Issue Confirmation & Question (1) Through debugging, I have confirmed the following behavior: If I call captureHighResolutionFrame without providing the photoSettings parameter, or if photoSettings.isDepthDataDeliveryEnabled is set to false, the method successfully returns a high-resolution ARFrame, but its capturedDepthData is nil. (2) The error message clearly indicates that settings.depthDataDeliveryEnabled can only be true if the underlying AVCapturePhotoOutput instance's own isDepthDataDeliveryEnabled property is also true. (3) However, within the context of ARKit and ARSession, I cannot find any public API that would allow me to explicitly access and configure the underlying AVCapturePhotoOutput instance that ARSession manages. (4) My question is: Is there a way to configure the ARSession's internal AVCapturePhotoOutput to enable its isDepthDataDeliveryEnabled property? Or, is simultaneously capturing a high-resolution frame and its associated depth data simply not a supported use case in the current ARKit framework?
1
0
347
Nov ’25
Video Audio + Speech To Text
Hello, I am wondering if it is possible to have audio from my AirPods be sent to my speech to text service and at the same time have the built in mic audio input be sent to recording a video? I ask because I want my users to be able to say "CAPTURE" and I start recording a video (with audio from the built in mic) and then when the user says "STOP" I stop the recording.
1
0
467
Oct ’25
AVAssetExportSession ignores frameDuration 60fps and exports at 30fps, but AVPlayer playback is correct
Hey everyone, I'm stuck on a really frustrating AVFoundation problem. I'm building a video editor that uses a custom AVVideoCompositor to add effects, and I need the final output to be 60 FPS. So basically, I create an AVMutableComposition to sequence my video clips. I create an AVMutableVideoComposition and set the frame rate to 60 FPS: videoComposition.frameDuration = CMTime(value: 1, timescale: 60) I assign my CustomVideoCompositor class to the videoComposition. I create an AVPlayerItem with the composition and video composition. The Problem: Playback Works: When I play the AVPlayerItem in an AVPlayer, it's perfect. It plays at a smooth 60 FPS, and my custom compositor's startRequest method is called 60 times per second. Export Fails: When I try to export the exact same composition and video composition using AVAssetExportSession, the final .mp4 file is always 30 FPS (or 29.97). I've logged inside my custom compositor during the export, and it's definitely being called 30 times per second, so it's generating the 30 frames. It seems like AVAssetExportSession is just dropping every other frame when it encodes the video. My source videos are screen recordings which I recorded using ScreenCaptureKit itself with the minimum frame interval to be 60. Here is my export function. I'm using the AVAssetExportPresetHighestQuality preset :- func exportVideo(to outputURL: URL) async throws { guard let composition = composition, let videoComposition = videoComposition else { throw VideoCompositionError.noValidVideos } try? FileManager.default.removeItem(at: outputURL) guard let exportSession = AVAssetExportSession( asset: composition, presetName: AVAssetExportPresetHighestQuality // Is this the problem? ) else { throw VideoCompositionError.trackCreationFailed } exportSession.outputFileType = .mp4 exportSession.videoComposition = videoComposition // This has the 60fps setting try await exportSession.export(to: outputURL, as: .mp4) } I've created a bare bones sample project that shows this exact bug in action. The resulting video is 60fps during playback, but only 30fps during the export. https://github.com/zaidbren/SimpleEditor My Question: Why is AVAssetExportSession ignoring my 60 FPS frameDuration and defaulting to 30 FPS, even though AVPlayer respects it?
1
0
395
Oct ’25
Control system video effects support for CMIO extension
We're distributing a virtual camera with our app that does not profit in the slightest from automatically applied system video effects both to the video going in (physical camera device) or out (virtual camera device). I'm aware of setting NSCameraReactionEffectGesturesEnabledDefault in Info.plist and determining active video effects via AVCaptureDevice API. Those are obviously crutches, because having to tell users to go look for and click around in menu bar apps is the opposite of a great UX. To make our product's video output more deterministic, I'm looking for a way to tell the CMIO subsystem that our virtual camera does not support any of the system video effects. I'm seeing properties like AVCaptureDevice.Format.isPortraitEffectSupported and AVCaptureDevice.Format.isStudioLightSupported whose documentation refers to the format's ability to support these effects. Since we're setting a CMFormatDescription via CMIOExtensionStreamSource.formats I was hoping to find something in the extensions, but wasn't successful so far. Can this be done?
2
0
225
Oct ’25
AVAudioEngine installTap stops working after phone call interruption on iPhone 16e
Environment Device: iPhone 16e iOS Version: 18.4.1 - 18.7.1 Framework: AVFoundation (AVAudioEngine) Problem Summary On iPhone 16e (iOS 18.4.1-18.7.1), the installTap callback stops being invoked after resuming from a phone call interruption. This issue is specific to phone call interruptions and does not occur on iPhone 14, iPhone SE 3, or earlier devices. Expected Behavior After a phone call interruption ends and audioEngine.start() is called, the previously installed tap should continue receiving audio buffers. Actual Behavior After resuming from phone call interruption: Tap callback is no longer invoked No audio data is captured No errors are thrown Engine appears to be running normally Note: Normal pause/resume (without phone call interruption) works correctly. Steps to Reproduce Start audio recording on iPhone 16e Receive or make a phone call (triggers AVAudioSession interruption) End the phone call Resume recording with audioEngine.start() Result: Tap callback is not invoked Tested devices: iPhone 16e (iOS 18.4.1-18.7.1): Issue reproduces ✗ iPhone 14 (iOS 18.x): Works correctly ✓ iPhone SE 3 (iOS 18.x): Works correctly ✓ Code Initial Setup (Works) let inputNode = audioEngine.inputNode inputNode.installTap(onBus: 0, bufferSize: 4096, format: nil) { buffer, time in self.processAudioBuffer(buffer, at: time) } audioEngine.prepare() try audioEngine.start() Interruption Handling NotificationCenter.default.addObserver( forName: AVAudioSession.interruptionNotification, object: AVAudioSession.sharedInstance(), queue: nil ) { notification in guard let userInfo = notification.userInfo, let typeValue = userInfo[AVAudioSessionInterruptionTypeKey] as? UInt, let type = AVAudioSession.InterruptionType(rawValue: typeValue) else { return } if type == .began { self.audioEngine.pause() } else if type == .ended { try? self.audioSession.setActive(true) try? self.audioEngine.start() // Tap callback doesn't work after this on iPhone 16e } } Workaround Full engine restart is required on iPhone 16e: func resumeAfterInterruption() { audioEngine.stop() inputNode.removeTap(onBus: 0) inputNode.installTap(onBus: 0, bufferSize: 4096, format: nil) { buffer, time in self.processAudioBuffer(buffer, at: time) } audioEngine.prepare() try audioSession.setActive(true) try audioEngine.start() } This works but adds latency and complexity compared to simple resume. Questions Is this expected behavior on iPhone 16e? What is the recommended way to handle phone call interruptions? Why does this only affect iPhone 16e and not iPhone 14 or SE 3? Any guidance would be appreciated!
0
0
161
Oct ’25
Mute behavior of Volume button on AVPlayerViewController iOS 26
With older iOS versions, when user taps Mute/Volume button on AVPLayerViewController to unmute, the system restores the sound volume of device to the level when user muted before. On iOS 26, when user taps unmute button on screen, the volume starts from 0 (not restore). (but it still restores if user unmutes by pressing physical volume buttons). As I understand, the Volume bar/button on AVPlayerViewController is MPVolumeView, and I can not control it. So this is a feature of the system. But I got complaints that this is a bug. I did not find documents that describe this change of Mute button behavior. I need some bases to explain this situation. Thank you.
0
1
167
Oct ’25
AVSpeechSynthesizer pulls words out of thin air.
Hi, I'm working on a project that uses the AVSpeechSynthesizer and AVSpeechUtterance. I discovered by chance that the AVSpeechSynthesizer automatically completes some words instead of just outputting what it's supposed to. These are abbreviations for days of the week or months, but not all of them. I don't want either of them automatically completed, but only the specified text. The completion transcends languages. I have written a short example program for demonstration purposes. import SwiftUI import AVFoundation import Foundation let synthesizer: AVSpeechSynthesizer = AVSpeechSynthesizer() struct ContentView: View { var body: some View { VStack { Button { utter("mon") } label: { Text("mon") } .buttonStyle(.borderedProminent) Button { utter("tue") } label: { Text("tue") } .buttonStyle(.borderedProminent) Button { utter("thu") } label: { Text("thu") } .buttonStyle(.borderedProminent) Button { utter("feb") } label: { Text("feb") } .buttonStyle(.borderedProminent) Button { utter("feb", lang: "de-DE") } label: { Text("feb DE") } .buttonStyle(.borderedProminent) Button { utter("wed") } label: { Text("wed") } .buttonStyle(.borderedProminent) } .padding() } private func utter(_ text: String, lang: String = "en-US") { let utterance = AVSpeechUtterance(string: text) let voice = AVSpeechSynthesisVoice(language: lang) utterance.voice = voice synthesizer.speak(utterance) } } #Preview { ContentView() } Thank you Christian
1
0
214
Oct ’25
AVFoundation Custom Video Compositor Skipping Frames During AVPlayer Playback Despite 60 FPS Frame Duration
I'm building a Swift video editor with AVFoundation and a custom compositor. Despite setting AVVideoComposition.frameDuration to 60 FPS, I'm seeing significant frame skipping during playback. Console Output Shows Frame Skipping Frame #0 at 0.0 ms (fps: 60.0) Frame #2 at 33.333333333333336 ms (fps: 60.0) Frame #6 at 100.0 ms (fps: 60.0) Frame #10 at 166.66666666666666 ms (fps: 60.0) Frame #32 at 533.3333333333334 ms (fps: 60.0) Frame #62 at 1033.3333333333335 ms (fps: 60.0) Frame #96 at 1600.0 ms (fps: 60.0) Instead of frames every ~16.67ms (60 FPS), I'm getting irregular intervals, sometimes 33ms, 67ms, or hundreds of milliseconds apart. Renderer.swift (Key Parts) @MainActor class Renderer: ObservableObject { @Published var playerItem: AVPlayerItem? private let assetManager: ProjectAssetManager? private let compositorId: String func buildComposition() async { // ... load mouse moves/clicks data ... let composition = AVMutableComposition() let videoTrack = composition.addMutableTrack( withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid ) var currentTime = CMTime.zero var layerInstructions: [AVMutableVideoCompositionLayerInstruction] = [] // Insert video segments for videoURL in videoURLs { let asset = AVAsset(url: videoURL) let tracks = try await asset.loadTracks(withMediaType: .video) let assetVideoTrack = tracks.first let duration = try await asset.load(.duration) try videoTrack.insertTimeRange( CMTimeRange(start: .zero, duration: duration), of: assetVideoTrack, at: currentTime ) let layerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: videoTrack) let transform = try await assetVideoTrack.load(.preferredTransform) layerInstruction.setTransform(transform, at: currentTime) layerInstructions.append(layerInstruction) currentTime = CMTimeAdd(currentTime, duration) } let videoComposition = AVMutableVideoComposition() videoComposition.frameDuration = CMTime(value: 1, timescale: 60) // 60 FPS // Set render size from first video if let firstURL = videoURLs.first { let firstAsset = AVAsset(url: firstURL) let firstTrack = try await firstAsset.loadTracks(withMediaType: .video).first let naturalSize = try await firstTrack.load(.naturalSize) let transform = try await firstTrack.load(.preferredTransform) videoComposition.renderSize = CGSize( width: abs(naturalSize.applying(transform).width), height: abs(naturalSize.applying(transform).height) ) } let instruction = CompositorInstruction() instruction.timeRange = CMTimeRange(start: .zero, duration: currentTime) instruction.layerInstructions = layerInstructions instruction.compositorId = compositorId videoComposition.instructions = [instruction] videoComposition.customVideoCompositorClass = CustomVideoCompositor.self let playerItem = AVPlayerItem(asset: composition) playerItem.videoComposition = videoComposition self.playerItem = playerItem } } class CompositorInstruction: NSObject, AVVideoCompositionInstructionProtocol { var timeRange: CMTimeRange = .zero var enablePostProcessing: Bool = false var containsTweening: Bool = false var requiredSourceTrackIDs: [NSValue]? var passthroughTrackID: CMPersistentTrackID = kCMPersistentTrackID_Invalid var layerInstructions: [AVVideoCompositionLayerInstruction] = [] var compositorId: String = "" } class CustomVideoCompositor: NSObject, AVVideoCompositing { var sourcePixelBufferAttributes: [String : Any]? = [ kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA) ] var requiredPixelBufferAttributesForRenderContext: [String : Any] = [ kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA) ] func renderContextChanged(_ newRenderContext: AVVideoCompositionRenderContext) {} func startRequest(_ asyncVideoCompositionRequest: AVAsynchronousVideoCompositionRequest) { guard let sourceTrackID = asyncVideoCompositionRequest.sourceTrackIDs.first?.int32Value, let sourcePixelBuffer = asyncVideoCompositionRequest.sourceFrame(byTrackID: sourceTrackID), let outputBuffer = asyncVideoCompositionRequest.renderContext.newPixelBuffer() else { asyncVideoCompositionRequest.finish(with: NSError(domain: "VideoCompositor", code: -1)) return } let videoComposition = asyncVideoCompositionRequest.renderContext.videoComposition let frameDuration = videoComposition.frameDuration let fps = Double(frameDuration.timescale) / Double(frameDuration.value) let compositionTime = asyncVideoCompositionRequest.compositionTime let seconds = CMTimeGetSeconds(compositionTime) let frameInMilliseconds = seconds * 1000 let frameNumber = Int(round(seconds * fps)) print("Frame #\(frameNumber) at \(frameInMilliseconds) ms (fps: \(fps))") asyncVideoCompositionRequest.finish(withComposedVideoFrame: outputBuffer) } func cancelAllPendingVideoCompositionRequests() {} } VideoPlayerViewModel @MainActor class VideoPlayerViewModel: ObservableObject { let player = AVPlayer() private let renderer: Renderer func loadVideo() async { await renderer.buildComposition() if let playerItem = renderer.playerItem { player.replaceCurrentItem(with: playerItem) } } } What I've Tried Frame skipping is consistent—exact same timestamps on every playback Issue persists even with minimal processing (just passing through buffers) Occurs regardless of compositor complexity Please note that I need every frame at exact millisecond intervals for my application. Frame loss or inconsistent frameInMillisecond values are not acceptable.
1
0
314
Oct ’25
APMP & Photography?
Hi, I'm a fan of the gallery in vision pro which has video as well as still photography but I'm wondering if Apple has considered adding the projected media tags to heic so that we can go that next step from Spatial photos to Immersive photos. I have a device that can give me 12k x 6k fisheye images in HDR, but it can't do it at a framerate or resolution that's good enough for video, so I want to cut my losses and show off immersive photos instead. Is there something Apple is already working on for APMP stills or should I create my own app that reads metadata inside a HEIC that I infer in a similar way to the demo "ProjectedMediaConversion" is doing for Video. It would be great to have 180VR photos, which could show as Spatial in a gallery view, but going immersive would half-surround you instead of floating in the blurred view. I think that would be a pretty amazing effect.
2
0
317
Oct ’25
FaceTime Hang Up
When I’m on FaceTime, my phone will randomly end my call? I have an iPhone 17, iOS 26.1. Some times I won’t even be touching my phone screen and it’ll hang up. I’m not sure if this is a universal issue or just a me problem. It’s getting really annoying.
1
0
175
Oct ’25
VTLowLatencyFrameInterpolationConfiguration supported dimensions
Is there limits on the supported dimension for VTLowLatencyFrameInterpolationConfiguration. Querying VTLowLatencyFrameInterpolationConfiguration.maximumDimensions and VTLowLatencyFrameInterpolationConfiguration.minimumDimensions returns nil. When I try the WWDC sample project EnhancingYourAppWithMachineLearningBasedVideoEffects with a 4k video this statement try frameProcessor.startSession(configuration: configuration) executes but try await frameProcessor.process(parameters: parameters) throws error Error Domain=VTFrameProcessorErrorDomain Code=-19730 "Processor is not initialized" UserInfo={NSLocalizedDescription=Processor is not initialized}. Also, why is VTLowLatencyFrameInterpolationConfiguration able to run while app is backgrounded but VTFrameRateConversionParameters can't (due to gpu usage)?
2
0
352
Oct ’25
CoreMIDI driver - flow control
Hi, when a CoreMIDI driver controls physical HW it is probably quite commune to have to control the amount of MIDI data received from the system. What comes to mind is to just delay returning control of the MIDIDriverInterface::Send() callback to the calling process. While the application trying to send MIDI really stalls until the callback returns it seems only to be a side effect of a generally stalled CoreMIDI server. Between the callbacks the application can send as much MIDI data as it wants to CoreMIDI, it's buffering seems to be endless... However the HW might not be able to play out all the data. It seems there is no way to indicate an overflow/full buffer situation back the application/CoreMIDI. How is this supposed to work? Thanks, any hints or pointers are highly appreciated! Hagen.
0
0
217
Oct ’25
New API to control front camera orientation like native Camera app on iPhone 17 with iOS 26?
The iPhone 17’s front camera with the new 18MP square sensor and iOS 26’s Center Stage feature can auto-rotate between portrait and landscape like the native iOS Camera app. Is there a Swift or AVFoundation API that allows developers to manually control front camera orientation in the same way the native Camera app does? Or is this auto-rotation strictly handled by the system without public API access?
0
0
299
Oct ’25
Email "Updates to FairPlay Streaming Server SDK" makes no sense
Hi, While I don't normally use FairPlay, I got this email that is so strangely worded I am wondering if it makes sense to people who do know it or if it has some typo or what. You can only generate certificates for SDK 4, also SDK 4 is no longer supported? (Also "will not expire" is imprecise phrasing, certificates presumably will expire, I think it meant to say are still valid / are not invalidated.)
3
0
347
Oct ’25
How-to highlight people in a Vision Pro app using Compositor Services
Fundamentally, my questions are: is there a known transform I can apply onto a given (pixel) position (passed into a Metal Fragment Function) to correctly sample a texture provided by the main cameras + processed by a Vision request. If so, what is it? If not, how can I accurately sample my masks? My goal is to highlight people in a Vision Pro app using Compositor Services. To start, I asynchronously receive camera frames for the main left and right cameras. This is the breakdown of the specific CameraVideoFormat I pass along to the CameraFrameProvider: minFrameDuration: 0.03 maxFrameDuration: 0.033333335 frameSize: (1920.0, 1080.0) pixelFormat: 875704422 cameraType: main cameraPositions: [left, right] cameraRectification: mono From each camera frame sample, I extract the left and right buffers (CVReadOnlyPixelBuffer.withUnsafebuffer ==> CVPixelBuffer). I asynchronously process the extracted buffers by performing a VNGeneratePersonSegmentationRequest on both of them: // NOTE: This block of code and all following code blocks contain simplified representations of my code for clarity's sake. var request = VNGeneratePersonSegmentationRequest() request.qualityLevel = .balanced request.outputPixelFormat = kCVPixelFormatType_OneComponent8 ... let lHandler = VNSequenceRequestHandler() let rHandler = VNSequenceRequestHandler() ... func processBuffers() async { try lHandler.perform([request], on: lBuffer) guard let lMask = request.results?.first?.pixelBuffer else {...} try rHandler.perform([request], on: rBuffer) guard let rMask = request.results?.first?.pixelBuffer else {...} appModel.latestPersonMasks = (lMask, rMask) } I store the two resulting CVPixelBuffers in my appModel. For both of these buffers aka grayscale masks: width (in pixels) = 512 height (in pixels) = 384 byters per row = 512 plane count = 0 pixel format type = 1278226488 I am using Compositor Services to render my content in Immersive Space. My implementation of Compositor Services is based off of the same code from Interacting with virtual content blended with passthrough. Within the Shaders.metal, the tint's Fragment Shader is now passed the grayscale masks (converted from CVPixelBuffer to MTLTexture via CVMetalTextureCacheCreateTextureFromImage() at the beginning of the main render pipeline). fragment float4 tintFragmentShader( TintInOut in [[stage_in]], ushort amp_id [[amplification_id]], texture2d<uint> leftMask [[texture(0)]], texture2d<uint> rightMask [[texture(1)]] ) { if (in.color.a <= 0.0) { discard_fragment(); } float2 uv; if (amp_id == 0) { // LEFT uv = ??????????????????????; } else { // RIGHT uv = ??????????????????????; } constexpr sampler linearSampler (mip_filter::linear, mag_filter::linear, min_filter::linear); // Sample the PersonSegmentation grayscale mask float maskValue = 0.0; if (amp_id == 0) { // LEFT if (leftMask.get_width() > 0) { maskValue = rightMask.sample(linearSampler, uv).r; } } else { // RIGHT if (rightMask.get_width() > 0) { maskValue = rightMask.sample(linearSampler, uv).r; } } if (maskValue > 250) { return (1.0, 1.0, 1.0, 0.5) } return in.color; } I need to correctly sample the masks for a given fragment. The LayerRenderer.Layout is set to .layered. From Developer Documentation. A layout that specifies each view’s content as a slice of a single texture. Using the Metal debugger, I know that the final render target texture for each view / eye is 1888 x 1792 pixels, giving an aspect ratio of 59:56. The initial CVPixelBuffer provided by the main left and right cameras is 1920x1080 (16:9). The grayscale CVPixelBuffer returned by the VNPersonSegmentationRequest is 512x384 (4:3). All of these aspect ratios are different. My questions come down to: is there a known transform I can apply onto a given (pixel) position to correctly sample a texture provided by the main cameras + processed by a Vision request. If so, what is it? If not, how can I accurately sample my masks? Within the tint's Vertex Shader, after applying the modelViewProjectionMatrix, I have tried every version I have been able to find that takes the pixel space position (= vertices[vertexID].position.xy) and the viewport size (1888x1792) to compute the correct clip space position (maybe = pixel space position.xy / (viewport size * 0.5)???) of the grayscale masks but nothing has worked. The "highlight" of the person segmentations is off: scaled a little too big, offset little to far up and off to the side.
1
0
440
Oct ’25
Process to request the restricted entitlement behind “DJ with Apple Music” (tempo control / time-stretch on Apple Music streams)?
Hi, I’m an iOS developer building an app with an use case that needs advanced playback on Apple Music subscription streams, specifically: • Real-time tempo change (BPM) during playback — i.e., time-stretch with key-lock, not just crossfade. • Beat-matched transitions between tracks. From what I can tell, this capability seems to exist only for approved partners and isn’t available through public MusicKit. Question: What’s the official request path to be evaluated for that restricted partner entitlement (application form, questionnaire, NDA, or internal team/BD contact)? If the entitlement identifier is internal, how can I get my account routed to the right Apple Music team? For reference, publicly announced partners include Algoriddim djay, Serato DJ Pro, rekordbox (AlphaTheta), and Engine DJ—all of which appear to implement mixing features that imply advanced playback (tempo/beat-matching) on Apple Music content. I’d prefer not to share product details publicly for the moment and can provide specifics privately if needed. Thanks in advance!
0
1
280
Oct ’25