My current app implements a custom video player, based on an AVSampleBufferRenderSynchronizer synchronising two renderers:
an AVSampleBufferDisplayLayer receiving decoded CVPixelBuffer-based video CMSampleBuffers,
and an AVSampleBufferAudioRenderer receiving decoded lpcm-based audio CMSampleBuffers.
The AVSampleBufferRenderSynchronizer is started when the first image (in presentation order) is decoded and enqueued, using avSynchronizer.setRate(_ rate: Float, time: CMTime), with rate = 1 and time set to the presentation timestamp of the first decoded image.
Presentation timestamps of video and audio sample buffers are consistent, and on most streams, the audio and video are correctly synchronized.
However, on some network streams on iOS, the audio and video aren't synchronized, and the time difference seems to increase over time.
On the other hand, with the same player code and network streams on macOS, the synchronization always works fine.
This reminds me of something I've read, about cases where an AVSampleBufferRenderSynchronizer could not synchronize audio and video, causing them to run with independent and potentially drifting clocks, but I cannot find it again.
So, any help / hints on this sync problem will be greatly appreciated! :)
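To tell a constant offset apart from real drift, it may help to log the audio/video offset at a few points during playback and fit a line through it. The following is only a minimal sketch: SyncSample and driftRate(of:) are hypothetical names, and how the offset itself is measured (for example by comparing the timestamps of what is currently being rendered) is left as an assumption.

```swift
import Foundation

/// One measurement: how far audio leads video (in seconds) at a given playback time (in seconds).
/// Hypothetical type, not part of AVFoundation.
struct SyncSample {
    let playbackTime: Double
    let audioMinusVideo: Double
}

/// Least-squares slope of offset versus playback time,
/// expressed as seconds of drift per second of playback.
/// Returns nil when the slope cannot be estimated.
func driftRate(of samples: [SyncSample]) -> Double? {
    guard samples.count >= 2 else { return nil }
    let count = Double(samples.count)
    let meanTime = samples.reduce(0.0) { $0 + $1.playbackTime } / count
    let meanOffset = samples.reduce(0.0) { $0 + $1.audioMinusVideo } / count

    var numerator = 0.0
    var denominator = 0.0
    for sample in samples {
        let timeDelta = sample.playbackTime - meanTime
        numerator += timeDelta * (sample.audioMinusVideo - meanOffset)
        denominator += timeDelta * timeDelta
    }
    // All samples taken at the same playback time: no slope can be computed.
    guard denominator > 0 else { return nil }
    return numerator / denominator
}
```

A slope close to zero would point to a fixed offset (for instance a start-time mismatch), while a clearly non-zero slope would suggest the two renderers are not following the same clock.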
The app registers a periodic time observer on the AVPlayer when playback starts, and it works fine. When switching to AirPlay during playback, the periodic time observation continues working as expected.
However, when switching back to local playback, the periodic time observer does not fire anymore until a seek is performed. The app removes the periodic time observer only when the playback stops.
I can see that when switching back to local playback, the timeControlStatus successively changes
to .waitingToPlayAtSpecifiedRate (reason: .evaluatingBufferingRate)
then to .waitingToPlayAtSpecifiedRate (reason: .toMinimizeStalls)
and finally to .playing
But the time observation does not work anymore.
Also, the issue is systematic with Live and VOD streams providing a program date (with HLS property #EXT-X-PROGRAM-DATE-TIME), with or without any DRM, and is never reproduced with other VOD streams.
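I am developing an app that sends video captured on an iPhone to a browser app and displays it there.
When this app is used on an iPhone 17 Pro or 17 Pro Max, the video shown in the browser is solid green, or a colorful image dominated by green.
Looking into it, the resolution of the ultra-wide and telephoto cameras changed on the 17 Pro and 17 Pro Max (12 MP → 48 MP), so I suspect that encoding is failing because of this.
Any information is welcome.
Environment
WebRTC library: GoogleWebRTC version 1.1 (installed via CocoaPods)
Signaling server: AWS Kinesis Video Streams
Devices where the problem occurs:
Model name: iPhone18,1, OS: 26.0
Model name: iPhone18,1, OS: 26.1
Devices where the problem does not occur:
Many models up to and including iPhone17,5
Model name: iPhone18,1, OS: 26.0
Model name: iPhone18,3, OS: 26.0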
Hi everyone,
We are working on a prototype app for Apple Vision Pro that is similar in functionality to Omegle or Chatroulette, but exclusively for Vision Pro owners.
The core idea is:
– a matching system where one user connects to another through a virtual persona;
– real-time video and audio transmission;
– time limits for sessions with the ability to extend them;
– users can skip a match and move on to the next one.
We have explored WebRTC and Twilio, but unfortunately, they don’t fit our use case.
Question:
What alternative services or SDKs are available for implementing real-time video/audio communication on Vision Pro that would work with this scenario?
Has anyone encountered a similar challenge and can recommend which technologies or tools to use?
Thanks in advance!
Starting in iOS 18.4 (and still in the iOS 18.5 beta), AVPlayer seems to freeze when we:
Replace the current AVPlayerItem (replaceCurrentItemWithPlayerItem:), and then
Call seek very shortly afterwards (seekToTime:toleranceBefore:toleranceAfter: / seek(to:)).
Subsequent calls to play then have no effect. However, scrubbing to seek afterwards seems to work, and changing the playback rate (i.e. fast forward) tends to clear up the frozen state.
Our primary workflow involves video playback, replacing video to show new clips and, in some cases, seeking to specific frames. This appears to occur only while streaming video; all reports indicate that playback of locally downloaded video remains fine.
This same code path has worked without issue on 17.x and 18.3.2 and for years before that.
What is particularly strange is that time observers log that video is still playing or feeding frames. The reported status is ReadyToPlay, IsLikelyToKeepUp is true, and there are no indications of stalling or buffering.
A similar issue affects our web application in Safari. On Sonoma with Safari 17.x there is no issue, but after updating to macOS Sequoia 15.4.1 and Safari 18.4, similar freezing appears. The same does not occur in Chrome or other tested browsers.
The release notes for Safari 18.4 contain several "fix" notes that seem similar to what we are now experiencing:
https://developer.apple.com/documentation/safari-release-notes/safari-18_4-release-notes
"Fixed an issue where playback doesn’t always resume after a seek. (140097993)"
"Fixed playing video generating non-monotonic ‘timeupdate’ events. (142275184) (FB16222910)"
"Fixed websites calling play() during a seek() is allowed by the specification so that the play event is fired even if the seek hasn’t completed. (142517488)"
"Fixed seek not completing for WebM under some circumstances. (143372794)"
"Fixed MediaRecorderPrivateEncoder writing frames out of order. (143956063)"
Hi all!
I have been experiencing some issues when using the AVAudioEngine to play audio and record input while doing a voice chat (through the PTT Interface).
I noticed if I connect any players to the AudioGraph OR call start that the audio session becomes active (this is on iOS).
I don't see anything in the docs or the AVFoundation header files, but is it possible that calling the stop method on an engine deactivates the audio session too?
In a normal app this behavior seems logical, but when using PTT all activation and deactivation of the audio session must go through the framework and its delegate methods.
The issue I am debugging occurs when the engine with the tapped input node is stopped and there is a gap between the end of the input and the server's reply with inbound audio to be played: something seems to put the hardware/audio session into a jammed state.
Thanks for any feedback and/or confirmation on this behavior!
We have the necessary background recording entitlements, and many users do not run into any issues.
However, a subset of users routinely have their recordings end unexpectedly. We have narrowed this down and believe it to be the work of the watchdog.
First, we removed the entire view hierarchy when the app is backgrounded. All that remains is 'Text("Recording")'.
This got the CPU usage in profiler down to 0%. We saw massive improvements to recording success rate.
We walked away assuming that was enough. However we are still seeing the same sort of crashes. All in the background. We're using Observation to drive audio state changes to a Live Activity.
Are those Observations causing the problem? Why doesn't Apple provide a better API for background audio? The internet is full of weird issues:
https://stackoverflow.com/questions/76010213/why-is-my-react-native-app-sometimes-terminated-in-the-background-while-tracking
https://stackoverflow.com/questions/71656047/why-is-my-react-native-app-terminating-in-the-background-while-recording-ios-r
https://github.com/expo/expo/issues/16807
This is such a terrible user experience. And we have very little visibility into what is happening and why.
Nowhere does Apple's documentation state that, for background recording to work, the app's UI can be no more than 'Text("Recording")'.
It does not outline a CPU or memory threshold. It just kills us.
I am creating an app that decodes H.265 elementary streams on iOS.
I use VideoToolbox to decode from H.265 to NV12.
The decoded data is enqueued in the AVSampleBufferDisplayLayer as a CMSampleBuffer.
However, nothing is displayed in the VideoPlayerView. It remains black.
The decoding in VideoToolbox is successful. I confirmed this by saving the NV12 data in the CMSampleBuffer to a file and displaying it using a tool.
Why is nothing displayed in the VideoPlayerView?
I can provide other source code as well.
//
// ContentView.swift
// H265Decoder
//
// Created by Kohshin Tokunaga on 2025/02/15.
//
import SwiftUI

struct ContentView: View {
    var body: some View {
        VStack {
            Text("H.265 Player (temp.h265)")
                .font(.headline)
            VideoPlayerView()
                .frame(width: 360, height: 640) // Adjust or make it responsive for iOS
        }
        .padding()
    }
}

#Preview {
    ContentView()
}
//
// VideoPlayerView.swift
// H265Decoder
//
// Created by Kohshin Tokunaga on 2025/02/15.
//
import SwiftUI
import AVFoundation

struct VideoPlayerView: UIViewRepresentable {
    // Return an H265Player as the coordinator, and start playback there.
    func makeCoordinator() -> H265Player {
        H265Player()
    }

    func makeUIView(context: Context) -> UIView {
        let uiView = UIView(frame: .zero)
        // Base layer for attaching sublayers
        uiView.backgroundColor = .black // Screen background color (for iOS)
        // Create the display layer and add it to uiView.layer
        let displayLayer = context.coordinator.displayLayer
        displayLayer.frame = uiView.bounds
        displayLayer.backgroundColor = UIColor.clear.cgColor
        uiView.layer.addSublayer(displayLayer)
        // Start playback
        context.coordinator.startPlayback()
        return uiView
    }

    func updateUIView(_ uiView: UIView, context: Context) {
        // Reset the frame of the AVSampleBufferDisplayLayer when the view's size changes.
        let displayLayer = context.coordinator.displayLayer
        displayLayer.frame = uiView.layer.bounds
        // Optionally update the layer's background color, etc.
        uiView.backgroundColor = .black
        displayLayer.backgroundColor = UIColor.clear.cgColor
        // Flush transactions if necessary
        CATransaction.flush()
    }
}
//
// H265Player.swift
// H265Decoder
//
// Created by Kohshin Tokunaga on 2025/02/15.
//
import Foundation
import AVFoundation
import CoreMedia

class H265Player: NSObject, VideoDecoderDelegate {
    let displayLayer = AVSampleBufferDisplayLayer()
    private var decoder: H265Decoder?

    override init() {
        super.init()
        // Initial configuration for the display layer
        displayLayer.videoGravity = .resizeAspect
        // Initialize the decoder (delegate = self)
        decoder = H265Decoder(delegate: self)
        // For simple playback, set isBaseline to true
        decoder?.isBaseline = true
    }

    func startPlayback() {
        // Load the bundled file "temp2.h265"
        guard let url = Bundle.main.url(forResource: "temp2", withExtension: "h265") else {
            print("File not found")
            return
        }
        do {
            let data = try Data(contentsOf: url)
            // Set FPS and video size as needed
            let packet = VideoPacket(data: data,
                                     type: .h265,
                                     fps: 30,
                                     videoSize: CGSize(width: 1080, height: 1920))
            // Decode as a single packet
            decoder?.decodeOnePacket(packet)
        } catch {
            print("Failed to load file: \(error)")
        }
    }

    // MARK: - VideoDecoderDelegate

    func decodeOutput(video: CMSampleBuffer) {
        // When decoding is complete, send the output to AVSampleBufferDisplayLayer
        displayLayer.enqueue(video)
    }

    func decodeOutput(error: DecodeError) {
        print("Decoding error: \(error)")
    }
}
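One thing that may be worth checking with the player above (only a guess, since H265Decoder is not shown): if the decoded sample buffers carry no usable timing and the layer has no control timebase, the layer may never consider a frame due for display. A minimal sketch, assuming that is the situation, marks each buffer for immediate display and enqueues it on the main queue. enqueueForImmediateDisplay(_:on:) is a hypothetical helper name.

```swift
import AVFoundation
import CoreMedia

/// Hypothetical helper: flags the sample buffer with
/// kCMSampleAttachmentKey_DisplayImmediately and enqueues it on the main queue.
func enqueueForImmediateDisplay(_ sampleBuffer: CMSampleBuffer,
                                on layer: AVSampleBufferDisplayLayer) {
    // Fetch (or create) the per-sample attachment dictionaries.
    if let attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer,
                                                                 createIfNecessary: true),
       CFArrayGetCount(attachments) > 0 {
        let dictionary = unsafeBitCast(CFArrayGetValueAtIndex(attachments, 0),
                                       to: CFMutableDictionary.self)
        // Ask the layer to show this frame as soon as it is enqueued,
        // regardless of its presentation timestamp.
        CFDictionarySetValue(dictionary,
                             Unmanaged.passUnretained(kCMSampleAttachmentKey_DisplayImmediately).toOpaque(),
                             Unmanaged.passUnretained(kCFBooleanTrue).toOpaque())
    }
    // The decoder callback may arrive on a background thread,
    // so hop to the main queue before touching the layer.
    DispatchQueue.main.async {
        layer.enqueue(sampleBuffer)
    }
}
```

In decodeOutput(video:) this would replace the direct displayLayer.enqueue(video) call. If frames still do not appear, the layer's status and error properties might give a further hint.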
Movies taken with Android phones store their location metadata (and probably others) in ways that are ignored by Apple's ecosystem (QuickTime Player, Photos.app).
I am considering creating a Spotlight importer so that this metadata is available to the system. But I have a couple of questions:
Can a Spotlight importer add new data (like location) to the data that the standard importer already captured? Or would the new importer need to take over the whole data gathering? If so, would macOS allow that?
Would that Spotlight importer be somehow used by e.g. Photos.app and QT Player to capture the location? Or would this end up in Spotlight "knowing" the location but Photos.app ignoring it?
If so, maybe there is something more broadly useful than a Spotlight importer?
On an iPhone running iOS 26 beta 5, url(for: FilePath("subdir/asset.mov")) almost always throws this error:
The URL for “subdir/asset.mov” couldn’t be retrieved: “asset.mov” couldn’t be copied to “subdir” because an item with the same name already exists.
Yet, contents(at: FilePath("subdir/asset.mov")) always returns Data for a playable AVMovie.
How can I avoid this url(for:) error?
The asset pack in question is downloaded. The error persists even after pack deletion, redownload, relaunch, and combinations of that.
// Assets repo root
subdir.aar
subdir/asset.mov
subdir/asset_thumb.heic
subdir/Manifest.json
// Manifest.json
{
    "assetPackID": "subdir",
    "downloadPolicy": {
        "onDemand": {}
    },
    "fileSelectors": [
        {
            "directory": "subdir",
        },
    ],
    "platforms": [
        "iOS",
        "visionOS"
    ]
}
xcrun ba-package subdir/Manifest.json -o subdir.aar
xcrun ba-serve --host 192.168.0.10 -p 443 subdir.aar
Capturing more than one display is no longer working with macOS Sequoia.
We have a product that allows users to capture up to 2 displays/screens. Our application uses GStreamer, which in turn is based on AVFoundation.
I found a quick way to replicate the issue by just running 2 captures from separate terminals. Assuming display 1 has device index 0, and display 2 has device index 1, here are the steps:
install gstreamer with
brew install gstreamer
Then open 2 terminal windows and launch the following processes:
terminal 1 (device-index:0):
gst-launch-1.0 avfvideosrc -e device-index=0 capture-screen=true ! queue ! videoscale ! video/x-raw,width=640,height=360 ! videoconvert ! osxvideosink
terminal 2 (device-index:1):
gst-launch-1.0 avfvideosrc -e device-index=1 capture-screen=true ! queue ! videoscale ! video/x-raw,width=640,height=360 ! videoconvert ! osxvideosink
The first process that is launched will show the screen, the second process launched will not.
Testing this on macOS Ventura and Sonoma works as expected, showing both screens.
I submitted the same issue on Feedback Assistant: FB15900976
I have a crash related to playing video in AVPlayerViewController and AVQueuePlayer. I download the video locally from the network and then initialize it using AVAsset and AVPlayerItem. I can't reproduce it locally, but Firebase Crashlytics reports crashes only for users on iOS 18.4.0 and later, with this trace:
Crashed: com.apple.avkit.playerControllerBackgroundQueue
0 libobjc.A.dylib 0x1458 objc_retain + 16
1 libobjc.A.dylib 0x1458 objc_retain_x0 + 16
2 AVKit 0x12afdc __77-[AVPlayerController currentEnabledAssetTrackForMediaType:completionHandler:]_block_invoke + 108
3 libdispatch.dylib 0x1aac _dispatch_call_block_and_release + 32
4 libdispatch.dylib 0x1b584 _dispatch_client_callout + 16
5 libdispatch.dylib 0x6560 _dispatch_continuation_pop + 596
6 libdispatch.dylib 0x5bd4 _dispatch_async_redirect_invoke + 580
7 libdispatch.dylib 0x13db0 _dispatch_root_queue_drain + 364
8 libdispatch.dylib 0x1454c _dispatch_worker_thread2 + 156
9 libsystem_pthread.dylib 0x4624 _pthread_wqthread + 232
10 libsystem_pthread.dylib 0x19f8 start_wqthread + 8
Hello Apple Developer Community,
I am seeking clarification on the intended display behavior of HLS audio tracks within the iOS 26 (or current beta) native player, specifically concerning the NAME and LANGUAGE attributes of the EXT-X-MEDIA tag.
In our HLS manifests, we define alternative audio tracks using EXT-X-MEDIA tags, like so:
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-1",DEFAULT=YES,AUTOSELECT=YES,URI="audio_ja.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="en",NAME="AUDIO-2",URI="audio_en.m3u8"
Our observation is that when an audio track is selected and its name is displayed in the native iOS media controls (e.g., Control Center or within a full-screen video player's UI), the value specified in the NAME attribute ("AUDIO-1", "AUDIO-2") does not seem to be used. Instead, the display appears to derive from the LANGUAGE attribute ("ja", "en"), often showing the system's localized string for that language (e.g., "Japanese", "English").
We would like to understand the official or intended behavior regarding this.
Is it the expected behavior for the iOS native player to prioritize the LANGUAGE attribute (or its localized equivalent) over the NAME attribute for displaying the selected audio track's label?
If this is the intended design, what is the recommended best practice for developers who wish to present a custom, human-readable name for audio tracks (beyond the standard language name) in the native iOS UI?
Are there any specific AVPlayer properties or AVMediaSelectionOption considerations that would allow more granular control over this display, or is this entirely managed by the system based on the LANGUAGE attribute?
Any insights or official guidance on this behavior in iOS 26 (and potentially previous versions) would be greatly appreciated.
Thank you for your time and assistance.
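For double-checking what a manifest actually declares, the NAME and LANGUAGE attributes of each EXT-X-MEDIA line can be pulled apart with a few lines of Swift. This is only a sketch for inspecting the playlist text: parseMediaAttributes(_:) is a hypothetical helper, not an AVFoundation API, and it says nothing about which attribute the system UI chooses to display.

```swift
import Foundation

/// Parses the attribute list of one EXT-X-MEDIA line into a dictionary.
/// Quoted values may contain commas, so the line is scanned character by
/// character instead of being split on ",".
func parseMediaAttributes(_ line: String) -> [String: String] {
    guard let colon = line.firstIndex(of: ":") else { return [:] }
    var attributes: [String: String] = [:]
    var key = ""
    var value = ""
    var readingKey = true
    var insideQuotes = false

    for character in line[line.index(after: colon)...] {
        if readingKey {
            // Everything up to "=" belongs to the attribute name.
            if character == "=" { readingKey = false } else { key.append(character) }
        } else if character == "\"" {
            // Quotes delimit the value but are not part of it.
            insideQuotes.toggle()
        } else if character == "," && !insideQuotes {
            // An unquoted comma ends the current attribute.
            attributes[key] = value
            key = ""
            value = ""
            readingKey = true
        } else {
            value.append(character)
        }
    }
    // Store the final attribute, which has no trailing comma.
    if !key.isEmpty { attributes[key] = value }
    return attributes
}
```

Running this over every EXT-X-MEDIA line and printing the NAME, LANGUAGE and URI values side by side makes it easy to spot entries whose language tag does not match the rendition they point to.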
So,
I've been wondering how fast an offline STT -> ML Prompt -> TTS roundtrip would be.
Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS.
E.g.
InteractionStatistics:
- listeningStarted: 21:24:23 4480 2423
- timeTillFirstAboveNoiseFloor: 01.794
- timeTillLastNoiseAboveFloor: 02.383
- timeTillFirstSpeechDetected: 02.399
- timeTillTranscriptFinalized: 04.510
- timeTillFirstMLModelResponse: 04.938
- timeTillMLModelResponse: 05.379
- timeTillTTSStarted: 04.962
- timeTillTTSFinished: 11.016
- speechLength: 06.054
- timeToResponse: 02.578
- transcript: This is a test.
- mlModelResponse: Sure! I'm ready to help with your test. What do you need help with?
Here, between my audio input ending and the text-to-speech starting to play (using AVSpeechUtterance), the total response time was 2.5s.
Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly).
I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now?
I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
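For reference, the per-stage numbers can be reproduced from the statistics dump with simple subtraction. This sketch assumes that all "timeTill..." values are seconds since listening started and that the response time is measured from the last noise above the floor to the start of TTS; the variable names are mine.

```swift
import Foundation

// Values copied from the InteractionStatistics dump, in seconds since listening started.
let lastNoiseAboveFloor = 2.383
let transcriptFinalized = 4.510
let firstModelResponse = 4.938
let ttsStarted = 4.962

// End of speech until the transcript is final (the STT share).
let sttLatency = transcriptFinalized - lastNoiseAboveFloor
// Final transcript until the first model output.
let modelLatency = firstModelResponse - transcriptFinalized
// First model output until speech synthesis starts playing.
let ttsLatency = ttsStarted - firstModelResponse
// End of speech until the spoken response begins.
let totalLatency = ttsStarted - lastNoiseAboveFloor

print(String(format: "STT %.3f s, model %.3f s, TTS %.3f s, total %.3f s",
             sttLatency, modelLatency, ttsLatency, totalLatency))
```

With these inputs the STT share comes out at roughly 2.1 s of the roughly 2.6 s total, which matches the breakdown described above.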
I'm developing a photo backup app.
To detect newly added or edited photos since the app launched, I keep a local dictionary in the format [localIdentifier: modification_date].
However, PHAsset.modificationDate is not reliable.
It often changes unexpectedly, possibly due to system operations like iCloud metadata updates.
Is there a more reliable way to detect whether a photo has been modified by the user since the last app launch?
I'm thinking about using content hash instead, but I'm not sure how heavy this operation is in terms of performance.
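To keep hashing affordable, one option might be to treat a changed modification date only as a hint and hash just those candidates instead of the whole library. A minimal sketch of the snapshot comparison, assuming the stored dictionary has the [localIdentifier: modification date] shape described above; LibraryDiff and diffSnapshots(previous:current:) are hypothetical names.

```swift
import Foundation

/// Result of comparing two [localIdentifier: modificationDate] snapshots.
struct LibraryDiff {
    var added: [String] = []
    var modified: [String] = []
    var removed: [String] = []
}

/// Compares the snapshot stored at the previous launch with a fresh one.
/// "modified" only means the date differs; whether the content really
/// changed would still have to be confirmed, for example with a content hash.
func diffSnapshots(previous: [String: Date], current: [String: Date]) -> LibraryDiff {
    var result = LibraryDiff()
    for (identifier, date) in current {
        if let oldDate = previous[identifier] {
            if oldDate != date { result.modified.append(identifier) }
        } else {
            result.added.append(identifier)
        }
    }
    for identifier in previous.keys where current[identifier] == nil {
        result.removed.append(identifier)
    }
    // Dictionary iteration order is unspecified, so sort for stable output.
    result.added.sort()
    result.modified.sort()
    result.removed.sort()
    return result
}
```

Only the identifiers in `modified` would then need the more expensive content check, and a stored hash per asset could filter out date changes that did not alter the image data.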
Hi
This is one of our top crashes. The stack trace does not contain any of our code, and we can't reproduce it, which makes this crash very hard to understand and fix. We know that most of the crashes happen on iPhone 13 with iOS 18.x.x. We also see that many cases happen when the app goes into the background (the stack trace contains -[UIApplication _applicationDidEnterBackground]).
2025-03-04_16-06-00.3670_-0500-6a273c7d5da97f098b5cc24898bb9761dc45208e.crash
2025-03-04_20-21-08.6609_-0500-2c08f640900f8a62c4f8a4f6f2a61feb052e66dd.crash
2025-03-04_20-46-27.7138_+0000-4d7ea89b1b564eda22ca63e708f7ad3909c7b768.crash
When using AVSampleBufferDisplayLayer to play uncompressed H.264 and H.265 video with more than 7 B-frames, frame drops occur. The more B-frames there are, the more noticeable the frame drops become, for example with 15 B-frames.
Use FFmpeg to transcode a video file with visible timestamps and frame numbers (x264 or x265):
ffmpeg -i test.mp4 -vf "drawtext=fontsize=45:text=%{pts} %{n}:y=400" -c:v libx264 -x264-params "bframes=15:b-adapt=0" -crf 30 -y x264_bf15.mp4
ffmpeg -i test.mp4 -vf "drawtext=fontsize=45:text=%{pts} %{n}:y=400" -c:v libx265 -x265-params "bframes=15:b-adapt=0" -crf 30 -y x265_bf15.mp4
Use the demo player from this repository to reproduce the issue: https://github.com/msfrms/CustomPlayer
Frame drops can be observed, and the following log can be found in the device's console:
mediaserverd <<<< IQ-CA >>>> piqca_gmstats_dump: FIQCA(0x1266f4000) recent frames: enqueued: 184, displayed: 138, dropped: 42, flushed: 0, evicted: 3, >16ms late: 2
P.S. I used an iPhone 11 running iOS 14.6 to reproduce this issue.
May I ask why frame drops occur in this case?
Is there any configuration or API usage change that could help fix the frame drop issue?
Many thanks!
I added an RPSystemBroadcastPickerView to the app.
After tapping it, no method of SampleHandler is triggered.
In iOS 26 beta 5, AVAudioSessionCategoryOptionAllowBluetooth is marked as deprecated in iOS 8, although this option was not deprecated in iOS 18.6. I think this is a mistake and the deprecation actually starts in iOS 26. Am I right?
It seems that the substitute for this option is "AVAudioSessionCategoryOptionAllowBluetoothHFP". The documentation does not make clear if the behaviour is exactly the same or if any difference should be expected... Has anyone used this option in iOS 26? Should I expect any difference with the current behaviour of "AVAudioSessionCategoryOptionAllowBluetooth"?
Thank you.
The documentation for the Apple Music API indicates that the genreNames field for a given artist (see https://developer.apple.com/documentation/applemusicapi/artists/attributes-data.dictionary) is an array of strings. However, it appears that only ONE SINGLE GENRE is returned per artist, regardless of how many genres might be attached to that artist's albums.
Am I missing something? Is there an artist where multiple genres may be returned, or is this a bug in the documentation?
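Whatever the answer turns out to be, client code can stay safe by decoding genreNames as an array, as documented, so that one or several entries are both handled. A minimal sketch with a made-up payload: ArtistAttributes and decodeArtistAttributes(from:) are hypothetical names, and only the two fields used here are modelled.

```swift
import Foundation

/// Subset of the artist attributes object: just the fields used here.
struct ArtistAttributes: Decodable {
    let name: String
    let genreNames: [String]
}

/// Decodes an attributes object from its JSON text.
func decodeArtistAttributes(from json: String) throws -> ArtistAttributes {
    try JSONDecoder().decode(ArtistAttributes.self, from: Data(json.utf8))
}

// Made-up sample payload, not a real API response.
let sampleJSON = """
{ "name": "Example Artist", "genreNames": ["Rock"] }
"""

do {
    let attributes = try decodeArtistAttributes(from: sampleJSON)
    print("\(attributes.name): \(attributes.genreNames.joined(separator: ", "))")
} catch {
    print("Decoding failed: \(error)")
}
```

Modelling the field as [String] means nothing breaks if the service ever does return more than one genre for an artist.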