I am trying to learn Metal development on my MacBook Pro M1 Pro (macOS Sequoia 15.3.1) in an Xcode Playground, but when I write these two lines of code:
import Metal
let device = MTLCreateSystemDefaultDevice()!
I get the error "The LLDB RPC server has crashed." Any ideas as to what I can do to solve this? I have already rebooted the machine and reinstalled Xcode...
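If it helps isolate the problem, the same call can be exercised outside the Playground/LLDB path from a plain command-line target; a minimal sketch using metal-cpp (assuming metal-cpp is linked; a regular Swift app target serves the same purpose):

#define NS_PRIVATE_IMPLEMENTATION
#define MTL_PRIVATE_IMPLEMENTATION
#include <Foundation/Foundation.hpp>
#include <Metal/Metal.hpp>
#include <cstdio>

// Build: clang++ -std=c++17 main.cpp -framework Foundation -framework Metal
int main() {
    // The same call the Playground makes; a valid device here suggests the
    // crash lives in the LLDB RPC layer rather than in Metal itself.
    MTL::Device* device = MTL::CreateSystemDefaultDevice();
    std::printf("device: %s\n", device ? device->name()->utf8String() : "(none)");
    if (device) device->release();
    return 0;
}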
Recently, I adopted MetalFX for its upscaling feature.
However, I have encountered a persistent build failure for the iOS Simulator with the error message, 'MetalFX is not available when building for iOS Simulator.'
To address this, I set MetalFX.framework to 'Optional' within Build Phases > Link Binary With Libraries, which adds the linker option (-weak_framework). Despite this adjustment, the build process continues to fail.
Furthermore, I observed that the MetalFX sample application provided by Apple, specifically the one found at https://developer.apple.com/documentation/metalfx/applying-temporal-antialiasing-and-upscaling-using-metalfx, also fails to build for the iOS Simulator target.
Has anyone encountered this issue?
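One approach that may help, sketched below under the assumption that the MetalFX calls can be isolated in one translation unit: compile the MetalFX usage out of Simulator builds entirely with TARGET_OS_SIMULATOR (Swift's equivalent is #if !targetEnvironment(simulator)), since weak linking alone does not remove the compile-time references. The wrapper function here is hypothetical:

#include <TargetConditionals.h>

#if !TARGET_OS_SIMULATOR
#include <MetalFX/MetalFX.h>    // only referenced in device and macOS builds
#endif

// Hypothetical guard so call sites degrade gracefully in the Simulator
// instead of referencing MetalFX symbols that cannot build there.
static bool temporalUpscalingAvailable() {
#if TARGET_OS_SIMULATOR
    return false;               // MetalFX never exists in the Simulator
#else
    return true;                // create the MTLFX scaler as usual here
#endif
}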
View Layout
Add the following views in a view controller:
Label
View A, with a subview of the same size: MTKView A
View B, with a subview of the same size: MTKView B
Refresh Rates of Each View
The label view refreshes at 60fps (driven by CADisplayLink).
MTKView A and B refresh at 15fps.
MTKView Implementation Details
The corresponding CAMetalLayer's maximumDrawableCount is set to 2, i.e., switched to double buffering.
The scheduling mechanism is modified; drawing is not driven by the internal loop but is done manually. The draw call is triggered immediately upon receiving a frame.
self.metalView.enableSetNeedsDisplay = NO;
self.metalView.paused = YES;
A new high-priority queue is created for drawing, instead of handling it on the main queue.
MTKView Latency Tracking
The GPU completion time T1 is observed through the addCompletedHandler callback of the CommandBuffer.
The presentation time T2 of the frame is observed through the addPresentedHandler callback of the currentDrawable in MTKView.
Testing shows that T2 - T1 > 16.6 ms (the VSync period at 60 Hz). This means that after the GPU rendering in the MTKView finishes, the frame is not actually displayed at the next VSync, but only at the one after that.
I believe there is an extra 16.6ms of latency here, which I want to eliminate by adjusting the rendering mechanism.
Observation from Instruments
From Instruments, the surface presentation timeline matches the above test results: after the Metal encoder finishes, the surface in Display switches only after the next-next VSync. See the image in the link for details.
Questions
My beginner's understanding is that once MTKView's GPU rendering finishes, the frame should become visible at the next VSync. However, this is not what I observe. Does the subview MTKView need to wait an extra VSync cycle before being drawn to the actual display buffer?
The label updates its text at 60fps, so the entire interface should be refreshed at 60fps. Is the MTKView content not picked up in the same display update?
Explanation of the Reasoning Behind Some MTKView Code Details
Changing from the default triple buffering to double buffering helps reduce the latency introduced by rendering.
I trigger the draw method manually instead of using MTKView's own scheduling because that scheduling is driven by CADisplayLink: if a frame arrives within a VSync window, it has to wait for the next VSync window before the draw is triggered, which introduces waiting latency.
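For readers reproducing the T1/T2 measurement, a minimal sketch of the instrumentation described above, written against metal-cpp for brevity (the Objective-C addCompletedHandler/addPresentedHandler calls map one-to-one, and both timestamps share CACurrentMediaTime's timebase, so they compare directly):

#include <Metal/Metal.hpp>
#include <cstdio>

// T1: GPU completion time, from the command buffer.
// T2: on-glass presentation time, from the drawable.
void instrumentFrame(MTL::CommandBuffer* commandBuffer, MTL::Drawable* drawable) {
    commandBuffer->addCompletedHandler([](MTL::CommandBuffer* cb) {
        std::printf("T1 (GPU end): %f\n", cb->GPUEndTime());
    });
    drawable->addPresentedHandler([](MTL::Drawable* d) {
        std::printf("T2 (presented): %f\n", d->presentedTime());
    });
    commandBuffer->presentDrawable(drawable);
    commandBuffer->commit();
}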
Description:
In the official visionOS 26 Hover Effect sample code project, I encountered an issue where the event.trackingAreaIdentifier returned by onSpatialEvent does not reset as expected.
Steps to Reproduce:
Select an object with trackingAreaID = 6 in the sample app.
Look at a blank space (outside any tracking area) and perform a pinch gesture.
Expected Behavior:
The event.trackingAreaIdentifier should return 0 when interacting with a non-tracking area.
Actual Behavior:
The event.trackingAreaIdentifier still returns 6, even after restarting the app or killing the process. This persists regardless of where the pinch gesture is performed.
Hi,
My application hits the crash backtrace below at a very low reproduction rate among public users; I don't see it correlate with a specific iOS version or iPhone model. The last line of my own code in the trace is a call to the CAMetalLayer nextDrawable API.
From some basic investigation, I suspect it may relate to a wrong CAMetalLayer configuration, such as:
frame property w or h <= 0.0
bounds property w or h <= 0.0
drawableSize w or h <= 0.0 or w or h > max value (like 16384)
Is my above thinking right or not? Could the UIView that my CAMetalLayer is attached to cause such a nextDrawable crash? (A defensive-check sketch follows the backtrace below.)
Thanks a lot
Main Thread - Crashed
libsystem_kernel.dylib   __pthread_kill
libsystem_c.dylib        abort
libsystem_c.dylib        __assert_rtn
Metal                    MTLReportFailure.cold.1
Metal                    MTLReportFailure
Metal                    _MTLMessageContextEnd
Metal                    -[MTLTextureDescriptorInternal validateWithDevice:]
AGXMetalA13              0x245b1a000 + 4522096
QuartzCore               allocate_drawable_texture(id<MTLDevice>, __IOSurface*, unsigned int, unsigned int, MTLPixelFormat, unsigned long long, CAMetalLayerRotation, bool, NSString*, unsigned long)
QuartzCore               get_unused_drawable(_CAMetalLayerPrivate*, CAMetalLayerRotation, bool, bool)
QuartzCore               CAMetalLayerPrivateNextDrawableLocked(CAMetalLayer*, CAMetalDrawable**, unsigned long*)
QuartzCore               -[CAMetalLayer nextDrawable]
SpaceApp                 -[MetalRender renderFrame:] MetalRenderer.mm:167
SpaceApp                 -[FrameBuffer acceptFrame:] VideoRender.mm:173
QuartzCore               CA::Display::DisplayLinkItem::dispatch_(CA::SignPost::Interval<(CA::SignPost::CAEventCode)835322056>&)
QuartzCore               CA::Display::DisplayLink::dispatch_items(unsigned long long, unsigned long long, unsigned long long)
QuartzCore               CA::Display::DisplayLink::dispatch_deferred_display_links(unsigned int)
UIKitCore                _UIUpdateSequenceRun
UIKitCore                schedulerStepScheduledMainSection
UIKitCore                runloopSourceCallback
CoreFoundation           __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__
CoreFoundation           __CFRunLoopDoSource0
CoreFoundation           __CFRunLoopDoSources0
CoreFoundation           __CFRunLoopRun
CoreFoundation           CFRunLoopRunSpecific
GraphicsServices         GSEventRunModal
UIKitCore                -[UIApplication _run]
UIKitCore                UIApplicationMain
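Regarding the suspected configuration causes above, a defensive check like the following sketch (metal-cpp spelling; the 16384 cap is an assumption matching A13-class maximum texture dimensions) can at least confirm whether degenerate layer geometry is the trigger:

#include <QuartzCore/QuartzCore.hpp>

// The assert in -[MTLTextureDescriptorInternal validateWithDevice:] fires on
// zero-sized or oversized textures, which nextDrawable allocates internally.
bool canRequestDrawable(CA::MetalLayer* layer) {
    CGSize size = layer->drawableSize();
    const double kMaxDim = 16384.0;   // assumed device texture-size limit
    return size.width >= 1.0 && size.height >= 1.0 &&
           size.width <= kMaxDim && size.height <= kMaxDim;
}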
Hello, I'm tracking down a bug where useResource doesn't seem to apply proper synchronization when a resource is produced by the render pass and then consumed by the compute pass, but when I use an MTLFence to signal and wait between the render and compute encoders, the artifact goes away.
The resource is created with MTLHazardTrackingModeTracked and useResource is called on the compute encoder after the render pass. Metal API Validation doesn't report any warnings/errors.
Am I misunderstanding the difference between the two APIs? I dug through the Metal documentation, and it looks like useResource should handle synchronization given that the resource has MTLHazardTrackingModeTracked; on the other hand, MTLFence should be used to ensure proper synchronization between command encoders. Can someone clarify the difference between the two APIs and when to use each?
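For concreteness, the explicit-fence arrangement described above looks roughly like this (sketched with metal-cpp; the Objective-C calls are updateFence:afterStages: and waitForFence:, and the resource/pipeline names are placeholders):

#include <Metal/Metal.hpp>

// Render pass produces the resource, compute pass consumes it; the fence
// orders the two encoders explicitly instead of relying on hazard tracking.
void encodeFrame(MTL::CommandBuffer* cb,
                 MTL::RenderPassDescriptor* rpd,
                 MTL::ComputePipelineState* pso,
                 MTL::Texture* produced,
                 MTL::Fence* fence) {
    MTL::RenderCommandEncoder* render = cb->renderCommandEncoder(rpd);
    // ... draw into `produced` ...
    render->updateFence(fence, MTL::RenderStageFragment);
    render->endEncoding();

    MTL::ComputeCommandEncoder* compute = cb->computeCommandEncoder();
    compute->waitForFence(fence);
    compute->setComputePipelineState(pso);
    compute->useResource(produced, MTL::ResourceUsageRead);
    // ... dispatch that reads `produced` ...
    compute->endEncoding();
}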
Hi there,
Is it possible to customize the Metal Performance HUD on Apple TV, similar to how it can be done on iPhone & iPad?
I would like to see things like compiled shaders for my apps on tvOS.
I mean… I want to use defaults rather than launching apps via open with saved environment variables.
This is pretty easy on iOS and other platforms. So what about in macOS?
I am trying to learn the new Metal Performance Primitives APIs. I have added the MetalPerformancePrimitives framework and included the header in my shader code as per the documentation:
#include <MetalPerformancePrimitives/MetalPerformancePrimitives.h>
Unfortunately, Xcode complains that the header cannot be found. How do I include it properly?
I am using Xcode 26 on Tahoe. The MetalPerformancePrimitives framework is present on my machine and I can inspect the headers in the filesystem.
I work on a Qt/QML app that uses the Esri Maps SDK for Qt and is deployed to both Windows and iPad. With the recent iPadOS 26.1 upgrade, many iPad users are reporting the application freezing after panning and/or identifying features in the map. It runs fine for our Windows users.
I was able to reproduce this and grabbed the following error messages when the freeze happens:
IOGPUMetalError: Caused GPU Address Fault Error (0000000b:kIOGPUCommandBufferCallbackErrorPageFault)
IOGPUMetalError: Invalid Resource (00000009:kIOGPUCommandBufferCallbackErrorInvalidResource)
Environment:
Qt 6.5.4 (Qt for iOS)
Esri Maps SDK for Qt 200.3
iPadOS 26.1
Because it appears to be a Metal error, I tried using OpenGL (Qt offers a way to easily set the target graphics API):
QQuickWindow::setGraphicsApi(QSGRendererInterface::GraphicsApi::OpenGL)
This worked! No more freezing. But I've seen many posts saying that Apple has deprecated OpenGL (ES). It still seems to be available as of iPadOS 26.1; if so, will this fix just cause problems with a future iPadOS update?
Any other suggestions to address this issue? Upgrading our versions of Qt and the Esri SDK to the latest releases is not an option for us. We are in the process of upgrading the full application, but that is a year or two out. So we just need a fix to buy us some time for now.
Appreciate any thoughts/insights....
The current metal-cpp examples target the desktop and use AppKit, which is not supported on iOS. Are there any tips for developing with metal-cpp on mobile devices?
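One common pattern (a sketch, not an official recipe): keep the C++ core platform-agnostic and hand it a CAMetalLayer from a thin UIKit shim in an Objective-C++ (.mm) file, since metal-cpp binds CA::MetalLayer but not UIKit. The function name is hypothetical:

#include <Metal/Metal.hpp>
#include <QuartzCore/QuartzCore.hpp>

// C++ side: everything below compiles for iOS and macOS alike. The layer
// pointer is bridged across from a UIView whose +layerClass is CAMetalLayer.
void drawFrame(CA::MetalLayer* layer, MTL::CommandQueue* queue) {
    CA::MetalDrawable* drawable = layer->nextDrawable();
    if (!drawable) return;                    // e.g. app in background
    MTL::CommandBuffer* cb = queue->commandBuffer();
    // ... encode render passes targeting drawable->texture() ...
    cb->presentDrawable(drawable);
    cb->commit();
}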
There is a sample project from Apple here. It has a scene of a city at night and you can move in it.
It basically has 2 parts:
application code, written in what looks like Objective-C (I am more familiar with C++), which inherits from things like NSObject, MTKView, NSViewController and so on; it processes input and all app-related and window-related stuff.
rendering code, which also looks like Objective-C. By the way, both parts are mostly in .mm files (Objective-C++, AFAIK). The application part directly uses only one class from the rendering part: AAPLRenderer.
I want to move the rendering part to C++ using metal-cpp. For that I need to link metal-cpp to the project. I did it successfully with blank projects several times before using this tutorial. But with this sample project Xcode can't find Foundation/Foundation.hpp (and other metal-cpp headers). The error says this:
Did not find header 'Foundation.hpp' in framework 'Foundation' (loaded from '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.0.sdk/System/Library/Frameworks')
Please help.
I am making a framework in C++ using metal-cpp, basically a small game engine. I am also consequently using metal-cpp-extensions provided in LearnMetalCPP to make applications work.
For one of my classes, I needed to include AppKit.hpp in a public header file, so I moved it and its associated headers (NSApplication.hpp, NSMenu.hpp, etc.) from Project to Public in Build Phases > Headers. However, this started producing the error "cast of C pointer type 'void *' to Objective-C pointer type 'Class' requires a bridged cast" at several points in the AppKit headers. The errors don't appear when AppKit and its associates are Project headers, or when they are Private headers and no header imports them.
I assumed that disabling Objective-C ARC and enabling "Use __bridge casts outside of ARC" in Build Settings would solve it, but it didn't budge.
I also assumed the answer wouldn't involve actively changing the headers, but even when I put __bridge before the problematic casts, __bridge wasn't recognized.
How do I solve this? And why is it only happening in Public and not Project headers?
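In case it helps: one way to sidestep this, sketched below with hypothetical names, is to keep AppKit.hpp out of public headers entirely and forward-declare the few NS types the public interface mentions; only implementation files then include the metal-cpp-extensions headers, so they are never compiled in a context that objects to the casts:

// EngineApp.hpp (public header): no AppKit.hpp include required.
namespace NS {
    class Application;   // forward declarations suffice for pointer use
    class Menu;
}

class EngineApp {
public:
    void attach(NS::Application* app);   // hypothetical interface
    NS::Menu* buildMenuBar();
};

// EngineApp.cpp (project file): #include <AppKit/AppKit.hpp> here instead.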
I have an M1 Pro with a 16-core GPU. When I run a shader with 8193 threads, atomic_thread_fence is violated across the boundary between thread 8191 (the last thread in the 8th threadgroup of 1024) and thread 8192 (the first thread in the 9th).
I've attached the Metal and Swift files, but I'll repost the relevant kernel here. It's a function that launches N threads to iterate through a binary tree from the leaves, where the first thread to reach a parent terminates and the second one populates it with the sum of the node's two children.
#include <metal_stdlib>
using namespace metal;

// clang-format off
kernel void sum(device const int& size,
                device const int* __restrict__ in,
                device int* __restrict__ out,
                device atomic_int* visited,
                uint i [[thread_position_in_grid]]) {
// clang-format on
    // Write this thread's leaf value, then fence before touching the parent.
    int val = in[i];
    uint cur = (size + i - 1);
    out[cur] = val;
    atomic_thread_fence(mem_flags::mem_device, memory_order_seq_cst);
    cur = (cur - 1) / 2;
    // The second thread to arrive at a parent (fetch_add returns 1) proceeds.
    int proceed = atomic_fetch_add_explicit(&visited[cur], 1, memory_order_relaxed);
    while (proceed == 1) {
        uint left = 2 * cur + 1;
        uint right = 2 * cur + 2;
        uint val_left = out[left];
        uint val_right = out[right];
        uint val_cur = val_left + val_right;
        out[cur] = val_cur;
        if (cur == 0) {
            break;
        }
        cur = (cur - 1) / 2;
        atomic_thread_fence(mem_flags::mem_device, memory_order_seq_cst);
        proceed = atomic_fetch_add_explicit(&visited[cur], 1, memory_order_relaxed);
    }
}
What I'm observing is that thread 8192 hits the atomic_fetch_add first and terminates, while thread 8191 hits it second (observes that thread 8192 had incremented it by 1) and proceeds into the loop. Thread 8191 reads out[16383] (which it populated with 8191) and out[16384] (which thread 8192 populated with 8192 prior to the atomic_thread_fence). Instead of reading 8192 from out[16384] though, it reads 0.
Maybe I'm missing something but this seems like a pretty clear violation of the atomic_thread_fence which (I thought) was supposed to guarantee that the write from thread 8192 to out[16384] would be visible to any thread observing the effects of the following atomic_fetch_add. Is atomic_fetch_add not a store operation? Modifying it to an atomic_store or atomic_exchange still results in the bug. Adding another atomic_thread_fence between the atomic_fetch_add and reading of out also doesn't change anything.
I only begin to observe this on grid sizes of 8193 and upwards. That's 9 threadgroups per grid, which I assume could be related to my M1 Pro GPU having 16 cores.
Running the same example on an A17 Pro GPU doesn't show any of this behavior up through a tested grid size of 4194303 (2^22-1), at which point testing larger grid sizes starts to run into other issues so I can't test anything larger.
Removing the atomic_thread_fences on both the M1 and A17 causes the test to fail at much smaller grid sizes, as expected.
sum.metal
main.swift
I'm trying to pass a buffer of float2 items from CPU to GPU.
In the kernel, I can provide a parameter for the buffer:
device const float2* values
for example.
How do I specify float2 as the type for the MTL::Buffer?
I managed to get the code to work by "cheating": defining a simple class that has the same data members as a float2, but there is probably a better way.
class Coord_f { public: float x{0.0f}; float y{0.0f}; };
then using code to allocate like this:
NS::TransferPtr(device->newBuffer(n_elements * sizeof(Coord_f), MTL::ResourceStorageModeManaged))
The headers for metal-cpp do not appear to define vector objects like float2, but I'm doubtless missing something.
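For reference, the CPU-side counterparts of Metal's vector types live in <simd/simd.h> rather than in metal-cpp itself: simd::float2 has the same 8-byte size and layout the shader expects for float2, so the Coord_f stand-in isn't needed. A minimal sketch (the helper name is made up):

#include <simd/simd.h>
#include <Metal/Metal.hpp>

// simd::float2 matches the shader-side float2 layout exactly.
NS::SharedPtr<MTL::Buffer> makeCoordBuffer(MTL::Device* device, size_t n_elements) {
    return NS::TransferPtr(device->newBuffer(n_elements * sizeof(simd::float2),
                                             MTL::ResourceStorageModeManaged));
}

One subtle difference from Coord_f: a plain struct of two floats is only 4-byte aligned, while float2 requires 8-byte alignment, so the simd type is the safer match.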
Thanks.
Hello, Apple!
This post is a bug report for the Metal driver in macOS Sequoia.
I'm working on an open-source game engine, and one of my users reported a bug on macOS Sequoia with an AMD Radeon RX 6900 XT. The engine crashes when linking a compute pipeline, with the following NSError:
"Compiler encountered an internal error: I"
Offended shader (depth aware blur): https://shader-playground.timjones.io/27565de40391f62f078c891077ba758c
On my end, compiling the same shader on an M1 (Sonoma) or with the offline compiler doesn't reproduce the issue.
Post with bug-report on github: https://github.com/Try/OpenGothic/issues/712
Looking forward to your help and a driver fix ;)
I am interested in learning the Metal framework for rendering development. However, most of Apple’s official documentation uses Objective-C code. Therefore, I am seeking guidance on whether it is more advantageous for me to focus solely on learning Swift to gain proficiency in Metal.
Hey everyone,
I'm trying to run Kingdom Come: Deliverance 2 using the Game Porting Toolkit, but I'm encountering a black screen when launching the game. From what I know about the game's requirements, it may be using Shader Model 6.5, which supports advanced features like DirectX Raytracing (DXR) Tier 1.1. This leads me to suspect that the issue could be related to missing support for DirectX 12.1 features or Shader Model 6.5 in GPTK.
Does anyone know if these features are currently supported by GPTK? If not, are there any plans to implement them in future updates? Alternatively, is there any workaround for games that rely on Shader Model 6.5 and ray tracing?
Thanks a lot for your help!
Our app uses Metal for image processing. We have found that if our app (and its potentially intensive image processing) is started soon after the user logs in, calls into Metal may hang for a good while.
Example: it can take 1-2 minutes for something that usually takes 3-5 seconds! Metal threads are just hanging in a memmove...
In Activity Monitor we see a lot happening right after log-in. But why the Metal calls block for so long is unknown to us...
The workaround is to wait a minute before starting our app and its intensive Metal-based image processing. But this workaround is hard to explain to end users...
It doesn't happen on all computers, but it is fairly easy to reproduce on some.
We are using macOS 15.3.1 on M1/M3 Max machines.
Any good ideas for how to proceed with this problem and possibly reach Apple engineers?
Thanks! :)
My app is running Compute Shaders that use non-uniform thread groups.
When I run the app in the debugger with a Simulator target, the app crashes on encoder.dispatchThreads and the error message is:
Dispatch Threads with Non-Uniform Threadgroup Size is not supported on this device.
Earlier, the log output states:
Metal Shader Validation is unsupported for Simulator.
However:
When I run the app in the Simulator without the debugger attached, it runs fine and does not crash.
The SwiftUI Preview that also triggers the Compute Shader when preparing data also just runs fine without a crash.
I can run and debug on a real device no problem - I just don't have all sizes available.
Is there anything I need to check in my lldb/simulator configuration? It obviously does work; it's just that the debugger cannot really deal with it?
Any input would be nice, as this really slows me down: I have to be extremely careful when debugging on the simulator.
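Until then, one workaround is a runtime guard: non-uniform threadgroup sizes require GPU family Apple4 or later, which the Simulator does not report, so the code can fall back to uniform dispatch there (the kernel then needs its own bounds check for the rounded-up grid). A sketch in metal-cpp; the Swift version checks device.supportsFamily(.apple4) the same way:

#include <Metal/Metal.hpp>

// Prefer dispatchThreads where supported; otherwise round the grid up to
// whole threadgroups and rely on a bounds check inside the kernel.
void dispatchCompute(MTL::ComputeCommandEncoder* enc, MTL::Device* device,
                     MTL::Size grid, MTL::Size tg) {
    if (device->supportsFamily(MTL::GPUFamilyApple4)) {
        enc->dispatchThreads(grid, tg);   // non-uniform threadgroups OK
    } else {
        MTL::Size groups = MTL::Size::Make((grid.width  + tg.width  - 1) / tg.width,
                                           (grid.height + tg.height - 1) / tg.height,
                                           (grid.depth  + tg.depth  - 1) / tg.depth);
        enc->dispatchThreadgroups(groups, tg);
    }
}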