Hello! I'm currently porting a videogame console emulator to iOS and I'm trying to make the renderer (tested on MacOS) work on iOS as well.
The emulator core is written in C++ and uses metal-cpp for rendering, whereas the iOS frontend is written in Swift with SwiftUI. I have an Objective-C++ bridging header for bridging the Swift and C++ sides.
On the Swift side, I create an MTKView. Inside the MTKView delegate, I run the emulator for 1 video frame and pass it the view's backing layer for it to render the final output image with. The emulator runs and returns, but when it returns I get a crash in Swift land (callstack attached below), inside objc_release, which indicates I'm doing something wrong with memory management.
My bridging interface (ios_driver.h):
#pragma once
#include <Foundation/Foundation.h>
#include <QuartzCore/QuartzCore.h>
void iosCreateEmulator();
void iosRunFrame(CAMetalLayer* layer);
Bridge implementation (ios_driver.mm):
#import <Foundation/Foundation.h>
extern "C" {
#include "ios_driver.h"
}
<...>
#define IOS_EXPORT extern "C" __attribute__((visibility("default")))
std::unique_ptr<Emulator> emulator = nullptr;
IOS_EXPORT void iosCreateEmulator() { ... }
// Runs 1 video frame of the emulator and
IOS_EXPORT void iosRunFrame(CAMetalLayer* layer) {
void* layerBridged = (__bridge void*)layer;
// Pass the CAMetalLayer to the emulator
emulator->getRenderer()->setMTKLayer(layerBridged);
// Runs the emulator for 1 frame and renders the output image using our layer
emulator->runFrame();
}
My MTKView delegate:
class Renderer: NSObject, MTKViewDelegate {
var parent: ContentView
var device: MTLDevice!
init(_ parent: ContentView) {
self.parent = parent
if let device = MTLCreateSystemDefaultDevice() {
self.device = device
}
super.init()
}
func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {}
func draw(in view: MTKView) {
var metalLayer = view.layer as! CAMetalLayer
// Run the emulator for 1 frame & display the output image
iosRunFrame(metalLayer)
}
}
Finally, the emulator's render function that interacts with the layer:
void RendererMTL::setMTKLayer(void* layer) {
metalLayer = (CA::MetalLayer*)layer;
}
void RendererMTL::display() {
CA::MetalDrawable* drawable = metalLayer->nextDrawable();
if (!drawable) {
return;
}
MTL::Texture* texture = drawable->texture();
<rest of rendering follows here using the drawable & its texture>
}
This is the Swift callstack at the time of the crash:
To my understanding, I shouldn't be violating ARC rules as my bridging header uses CAMetalLayer* instead of void* and Swift will automatically account for ARC when passing CoreFoundation objects to Objective-C. However I don't have any other idea as to what might be causing this. I've been trying to debug this code for a couple of days without much success.
If you need more info, the emulator code is also on Github
Metal renderer: https://github.com/wheremyfoodat/Panda3DS/blob/ios/src/core/renderer_mtl/renderer_mtl.cpp#L58-L68
Bridge implementation: https://github.com/wheremyfoodat/Panda3DS/blob/ios/src/ios_driver.mm
Bridging header: https://github.com/wheremyfoodat/Panda3DS/blob/ios/include/ios_driver.h
Any help is more than appreciated. Thank you for your time in advance.
Metal
RSS for tagRender advanced 3D graphics and perform data-parallel computations using graphics processors using Metal.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I notice some metal-cpp classes have static funtion like
static URL* fileURLWithPath(const class String* pPath);
static class ComputePassDescriptor* computePassDescriptor();
static class AccelerationStructurePassDescriptor* accelerationStructurePassDescriptor();
which return a new object.
these classes also provide 'alloc' and 'init' function to create object by default.
for object created by 'alloc' and 'init', I use something like NS::Shaderd_Ptr or call release directly to free memory. Because 'alloc' and 'init' not explicit call on these static function.
I wonder how to correctly free object created by these static function? did they managed by autorelease pool?
Hello ladies and gentlemen, I'm writing a simple renderer on the main actor using Metal and Swift 6. I am at the stage now where I want to create a render pipeline state using asynchronous API:
@MainActor
class Renderer {
let opaqueMeshRPS: MTLRenderPipelineState
init(/*...*/) async throws {
let descriptor = MTLRenderPipelineDescriptor()
// ...
opaqueMeshRPS = try await device.makeRenderPipelineState(descriptor: descriptor)
}
}
I get a compilation error if try to use the asynchronous version of the makeRenderPipelineState method:
Non-sendable type 'any MTLRenderPipelineState' returned by implicitly asynchronous call to nonisolated function cannot cross actor boundary
Which is understandable, since MTLRenderPipelineState is not Sendable. But it looks like no matter where or how I try to access this method, I just can't do it - you have this API, but you can't use it, you can only use the synchronous versions.
Am I missing something or is Metal just not usable with Swift 6 right now?
Hey all! I'm got my hands on a refurbished mac mini m1 and already diving into metal. At the moment, i'm currently studying graphics programming with opengl and got to a point where I can almost create a 3d cube. However, I noticed there aren't many tutorials for metal cpp but rather demos. One thing I love about graphic programming, is skinning/skeletal animation. At the moment, I can't find any sources or tutorials on how to load skeletal animations into metal-cpp. So, if I create my character in blender and had all types of animations all loaded into a .FBX or maybe .DAE and load this into metal api with metal-cpp, how can I go on about how this works?
How many 32-bit variables can I use concurrently in a single thread of a Metal compute kernel without worrying about the variables getting spilled into the device memory? Alternatively: how many 32-bit registers does a single thread have available for itself?
Let's say that each thread of my compute kernel needs to store and work with its own array of N float variables, where N can be 128, 256, 512 or more. To achieve maximum possible performance, I do not want to the local thread variables to get spilled into the slow device memory. I want all N variables to be stored "on-chip", in the thread memory space.
To make my question more concrete, let's say there is an array thread float localArray[N]. Assuming an unrealistic hypothetical scenario where localArray is the only variable in the whole kernel, what is the maximum value of N for which no portion of localArray would get spilled into the device memory?
I searched in the Metal feature set tables, but I could not find any details.
After following the instructions here:
https://developer.apple.com/metal/cpp/
I attempted building my project and Xcode presented several errors. In essence it's complaining about some redeclarations in the Metal-CPP headers.
NSBundle.hpp and NSError.hpp are included in the metal-cpp/foundation directory from the metal-cpp download.
Any help in getting these issues resolved is appreciated.
Thanks!
*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[NSBundle allFrameworks]: unrecognized selector sent to instance
NS::Bundle* Bundle = NS::Bundle::mainBundle(); Bundle->allFrameworks();
call to allFrameworks() and allBundles() will throw exception, but other functions work well.
I am working on a custom resolve tile shader for a client. I see a big difference in performance depending on where we write to:
1- the resolve texture of the color attachment
2- a rw tile shader texture set via [renderEncoder setTileTexture: myResolvedTexture]
Option 2 is more than twice as slow than option 1.
Our compute shader writes to 4 UAVs so just using the resolve texture entry is not possible.
Why such a difference as there is no more data being written? Can option 2 be as fast as option 1?
I can demonstrate the issue in a modified version of the Multisample code sample.
In my project I need to do the following:
In runtime create metal Dynamic library from source.
In runtime create metal Executable library from source and Link it with my previous created Dynamic library.
Create compute pipeline using those two libraries created above.
But I get the following error at the third step:
Error Domain=AGXMetalG15X_M1 Code=2 "Undefined symbols:
_Z5noisev, referenced from: OnTheFlyKernel
" UserInfo={NSLocalizedDescription=Undefined symbols:
_Z5noisev, referenced from: OnTheFlyKernel
}
import Foundation
import Metal
class MetalShaderCompiler {
let device = MTLCreateSystemDefaultDevice()!
var pipeline: MTLComputePipelineState!
func compileDylib() -> MTLDynamicLibrary {
let source = """
#include <metal_stdlib>
using namespace metal;
half3 noise() {
return half3(1, 0, 1);
}
"""
let option = MTLCompileOptions()
option.libraryType = .dynamic
option.installName = "@executable_path/libFoundation.metallib"
let library = try! device.makeLibrary(source: source, options: option)
let dylib = try! device.makeDynamicLibrary(library: library)
return dylib
}
func compileExlib(dylib: MTLDynamicLibrary) -> MTLLibrary {
let source = """
#include <metal_stdlib>
using namespace metal;
extern half3 noise();
kernel void OnTheFlyKernel(texture2d<half, access::read> src [[texture(0)]],
texture2d<half, access::write> dst [[texture(1)]],
ushort2 gid [[thread_position_in_grid]]) {
half4 rgba = src.read(gid);
rgba.rgb += noise();
dst.write(rgba, gid);
}
"""
let option = MTLCompileOptions()
option.libraryType = .executable
option.libraries = [dylib]
let library = try! self.device.makeLibrary(source: source, options: option)
return library
}
func runtime() {
let dylib = self.compileDylib()
let exlib = self.compileExlib(dylib: dylib)
let pipelineDescriptor = MTLComputePipelineDescriptor()
pipelineDescriptor.computeFunction = exlib.makeFunction(name: "OnTheFlyKernel")
pipelineDescriptor.preloadedLibraries = [dylib]
pipeline = try! device.makeComputePipelineState(descriptor: pipelineDescriptor, options: .bindingInfo, reflection: nil)
}
}
There is a sample project from Apple here. It has a scene of a city at night and you can move in it.
It basically has 2 parts:
application code written in what looks like Objective-C (I am more familiar with C++), which inherits from things like NSObject, MTKView, NSViewController and so on - it processes input and all app-related and window-related stuff.
rendering code that also looks like Objective-C. Btw both parts are mostly in .mm files (Obj-C++ AFAIK). The application part directly uses only one class from the rendering part - AAPLRenderer.
I want to move the rendering part to C++ using metal-cpp. For that I need to link metal-cpp to the project. I did it successfully with blank projects several times before using this tutorial. But with this sample project Xcode can't find Foundation/Foundation.hpp (and other metal-cpp headers). The error says this:
Did not find header 'Foundation.hpp' in framework 'Foundation' (loaded from '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.0.sdk/System/Library/Frameworks')
Pls help
Hey everyone,
I’m trying to run Kingdom Come: Deliverance 2 using the Game Porting Toolkit, but I’m encountering a black screen when launching the game. From what I know about the game’s requirements, it might be using Shader Model 6.5, which supports advanced features like DirectX Raytracing (DXR) Tier 1.1. This leads me to suspect that the issue could be related to missing support for DirectX 12.1 Features or Shader Model 6.5 in GPTK.
Does anyone know if these features are currently supported by GPTK? If not, are there any plans to implement them in future updates? Alternatively, is there any workaround for games that rely on Shader Model 6.5 and ray tracing?
Thanks a lot for your help!
Our app uses Metal for image processing. We have found that if our app (and its possible intensive image processing) is started quickly after user is logged in, then calls to Metal may be hanging/stuck for a good while.
Example: it can take 1-2 minutes for something that usually takes 3-5 seconds! Metal threads are just hanging in a memmove...
In Activity Monitor we see a lot of things are happening right after log-in. But why Metal calls are blocking for so long is unknown to us...
The workaround is to wait a minute before we start our app and start intensive image processing using Metal. But hard to explain this workaround to end-users...
It doesn't happen on all computers but fairly easy to reproduce on some computers.
We are using macOS 15.3.1. M1/M3 Max.
Any good ideas for how to proceed with this problem and possible reach out to Apple engineers?
Thanks! :)
The title is self-exploratory. I wasn't able to find the CAMetalDisplayLink on the most recent metal-cpp release (metal-cpp_macOS15_iOS18-beta). Are there any plans to include it in the next release?
Now the examples of metal-cpp are target on desktop and using AppKit which is not supported on iOS. Is there any tips for developing with metal-cpp on mobile device?
When inspecting the geometry in Xcode's metal debugger, I noticed that the shown "frustrum box" didn't make sense. Since Metal uses depth range 0,1 in NDC space, I would expect a vertex that is projected to z:0 to be on the front clipping plane of the frustrum shown in the geometry inspector. This is however not the case. A vertex with ndc z:0 is shown halfway inside the frustrum. Vertices with ndc z less than 0 are correctly culled during rendering, while the geometry inspector's frustrum shows that the vertex is stil inside the frustrum.
The image shows vertices that are visually in the middle of the frustrum on z axis, but at the same time the out position shows that they are projected to z:0. How is this possible, unless there's a bug in the geometry inspector?
Hello!
I need to "draw" a set of particles into the texture. It would be trivial in render encoder of course. However, I would like to implement the task in compute kernel. Every particle draw operation is expected to set 5 texels - "center" one and left/right/upper/lower. Particles can and will overlap, so concurrent draws are to be expected.
I tried using texture atomics - atomic_store() to be more precise. This worked, albeit pretty slowly - too slow for my purpose.
Just to test what would happen, I tried using normal texture write(). I was expecting to see some kind of visual artefacts, but to my surprise, it worked very well (and much faster).
My question: is it safe? I understand that calling write() doesn't guarantee any ordering of the operations, so if multiple threads write to the same texel, the final value may come from any of those threads. But suppose all the threads were to write the very same color? Can I assume that the texel in question will have said color after the compute kernel finishes?
I am using M2 Pro MacBook, but ideally I would love to get the answer for the all Apple Silicon devices. My texture format is R32Int (so as to be able to use atomics), but I could do with any single-channel format, the purpose of the texture is to be binary mask of sorts.
Thanks!
I have a bare-bones Metal app setup where I attach a CAMetalLayer to a window that inherits from a NSWindow with a custom delegate. Everything else is vanilla. I'm also using metal-cpp and metal shader converter.
I'm running into a issue where the application runs fine in the beginning, but once I resize the window, it starts hitching. It turns out that [CAMetalLayer nextDrawable:] frequently (but not always) takes around a full second (plus or minus a few milliseconds) to return once drawableSize has been updated.
I've tried setting allowsNextDrawableTimeout to false which doesn't work; it returns a valid drawable after a second instead of nil. Setting displaySyncEnabled to false reduces the likelihood of this happening to around 50% from 90%+ but does not eliminate it. Setting maximumDrawableCount to 2 or 3 does not seem to make a difference.
By dumping the resource IDs of the returned textures I've noticed something interesting: Before resizing, the layer seems to shuffle between 2 textures or at least 2 resource IDs, but after resizing it starts to create new textures for each returned drawable. Occasionally it seems to reuse a previous resource ID, but it does not seem to have anything to do with whether the method returns quickly or not.
Why does this happen, and how can I fix it? Should I create a new CAMetalLayer when resizing the window instead of updating drawableSize?
I am interested in learning the Metal framework for rendering development. However, most of Apple’s official documentation uses Objective-C code. Therefore, I am seeking guidance on whether it is more advantageous for me to focus solely on learning Swift to gain proficiency in Metal.
I am making a framework in C++ using metal-cpp, basically a small game engine. I am also consequently using metal-cpp-extensions provided in LearnMetalCPP to make applications work.
For one of my classes, I needed to add AppKit.hpp inside a public header file, so I moved it and its associate headers(NSApplication.hpp, NSMenu.hpp, etc.) from Project headers to Public in Build Phases' Headers, however, it started giving me the error "cast of C pointer type 'void *' to Objective-C pointer type 'Class' requires a bridged cast" at several points in the AppKit headers. They don't appear when AppKit and its associates are in the Project headers, or when they are in the Private headers and no headers import it.
I imagined that disabling Objective-C ARC and Using __bridge casts outside of ARC in Build Settings would solve it, but it didn't budge.
I imagined it wouldn't involve actively changing the headers would be the answer, but even if I try to put __bridge before the problematic casts, it didn't recognize __bridge.
How do I solve this? And why is it only happening in Public and not Project headers?
Hello, Apple!
This post is a bug report for Metal driver in MacOS Sequoia.
I'm working on opensource game engine and one of my users reported a bug, on "MacOS Sequoia" + "AMD Radeon RX 6900 XT". Engine crashes, when liking a compute pipeline, with following NSError:
"Compiler encountered an internal error: I"
Offended shader (depth aware blur): https://shader-playground.timjones.io/27565de40391f62f078c891077ba758c
On my end, compiling same shader on M1 (Sonoma) or with offline compiler doesn't reproduce the issue.
Post with bug-report on github: https://github.com/Try/OpenGothic/issues/712
Looking forward for your help and driver fix ;)
Topic:
Graphics & Games
SubTopic:
Metal