Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

Updated DetectHandPoseRequest revision from WWDC25 doesn't exist
I watched this year's WWDC25 session "Read Documents using the Vision framework". At the end of the video there is a mention of a new DetectHandPoseRequest model for hand pose detection in the Vision API. I looked through Apple's documentation and I don't see a new revision. Moreover, the name in the video is probably a typo, because there are only DetectHumanHandPoseRequest (Swift-based) and VNDetectHumanHandPoseRequest (Objective-C-based); notice the missing Human prefix in the WWDC video. The first one has a single revision, added in iOS 18+: https://developer.apple.com/documentation/vision/detecthumanhandposerequest/revision-swift.enum/revision1 The second one has a single revision, added in iOS 14+: https://developer.apple.com/documentation/vision/vndetecthumanhandposerequestrevision1 I don't see any new revision targeting iOS 26+.
0
0
122
Oct ’25
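One way to check the question in the post above on a given machine is to ask Vision which revisions it exposes at runtime. The following is a minimal sketch for the Objective-C-based request only; it assumes it runs as top-level code (for example in main.swift) on an OS that ships Vision, and it says nothing about what the WWDC video intended.

```swift
import Vision

// Revisions that the installed OS exposes for the Objective-C-based request.
// `supportedRevisions` is an IndexSet, so map it to a plain array for printing.
let supported = VNDetectHumanHandPoseRequest.supportedRevisions
print("Supported revisions:", supported.map { $0 })

// The revision used when none is set explicitly, and the newest one
// known to the SDK this code was built against.
print("Default revision:", VNDetectHumanHandPoseRequest.defaultRevision)
print("Current revision:", VNDetectHumanHandPoseRequest.currentRevision)
```

If a newer revision ships in a later OS release, it should show up in this list without any change to the code.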
Code along with the Foundation Models framework
In this online session, you can code along with us as we build generative AI features into a sample app live in Xcode. We'll guide you through implementing core features like basic text generation, as well as advanced topics like guided generation for structured data output, streaming responses for dynamic UI updates, and tool calling to retrieve data or take an action. Check out these resources to get started: Download the project files: https://developer.apple.com/events/re... Explore the code along guide: https://developer.apple.com/events/re... Join the live Q&A: https://developer.apple.com/videos/pl... Agenda – All times PDT 10 a.m.: Welcome and Xcode setup 10:15 a.m.: Framework basics, guided generation, and building prompts 11 a.m.: Break 11:10 a.m.: UI streaming, tool calling, and performance optimization 11:50 a.m.: Wrap up All are welcome to attend the session. To actively code along, you'll need a Mac with Apple silicon that supports Apple Intelligence running the latest release of macOS Tahoe 26 and Xcode 26. If you have questions after the code along concludes please share a post here in the forums and engage with the community.
0
0
279
Sep ’25
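As a rough illustration of the guided generation topic listed in the agenda above, the sketch below asks the on-device model for a typed value instead of free text. The `TripIdea` type, its fields, the instructions, and the prompt are all hypothetical and are not taken from the session's project files; the code assumes a deployment target of iOS 26 or macOS 26 with Apple Intelligence enabled.

```swift
import FoundationModels

// Hypothetical output type, for illustration only.
@Generable
struct TripIdea {
    @Guide(description: "A short, catchy title for the trip")
    var title: String

    @Guide(description: "A few activities to do on the trip")
    var activities: [String]
}

// Asks the model to fill in a TripIdea rather than return free-form text.
func suggestTrip(to destination: String) async throws -> TripIdea {
    let session = LanguageModelSession(
        instructions: "You suggest short weekend trips."
    )
    let response = try await session.respond(
        to: "Suggest a weekend trip to \(destination).",
        generating: TripIdea.self
    )
    return response.content
}
```

Because the result is a Swift value, the UI can bind to `title` and `activities` directly instead of parsing a string.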
VNDetectFaceRectanglesRequest does not use the Neural Engine?
I'm on macOS Tahoe 26.1 on an M3 MacBook Air. I'm using VNDetectFaceRectanglesRequest as properly as I can, as in the minimal command-line program below. For some reason, I always get this message printed: MLE5Engine is disabled through the configuration. I couldn't find any note in the developer documentation saying that VNDetectFaceRectanglesRequest cannot use the Apple Neural Engine. I assume there is something wrong with my code, but I couldn't find any remark in the documentation about what it might be, and I couldn't find the above message online either. I would appreciate your help a lot, and thank you in advance.
The code below accesses video from AVCaptureDevice.DeviceType.builtInWideAngleCamera. Currently it directly chooses the 0th format, which has the largest resolution (Full HD on my M3 MBA) and the "420v" pixel format (4:2:0 chroma subsampling, video range). After accessing video, it performs a VNDetectFaceRectanglesRequest. It prints "VNDetectFaceRectanglesRequest completion Handler called" many times, then prints the message above, then continues printing "VNDetectFaceRectanglesRequest completion Handler called" until the user quits it.
To run it in Xcode: File > New project > Mac command line tool. Paste the code below, then click on the root file > Targets > Signing & Capabilities > Hardened Runtime > Resource Access > Camera.
A possible explanation could be that Apple's internal Core ML code for this function works on the GPU/CPU only, or that it doesn't accept 420v as supplied by the MacBook Air camera.

    import AVKit
    import Vision

    var videoDataOutput: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput()
    var detectionRequests: [VNDetectFaceRectanglesRequest]?
    var videoDataOutputQueue: DispatchQueue = DispatchQueue(label: "queue")

    class XYZ: /*NSViewController or NSObject*/NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
        func viewDidLoad() {
            //super.viewDidLoad()
            let session = AVCaptureSession()
            let inputDevice = try! self.configureFrontCamera(for: session)
            self.configureVideoDataOutput(for: inputDevice.device, resolution: inputDevice.resolution, captureSession: session)
            self.prepareVisionRequest()
            session.startRunning()
        }

        fileprivate func highestResolution420Format(for device: AVCaptureDevice) -> (format: AVCaptureDevice.Format, resolution: CGSize)? {
            let deviceFormat = device.formats[0]
            print(deviceFormat)
            let dims = CMVideoFormatDescriptionGetDimensions(deviceFormat.formatDescription)
            let resolution = CGSize(width: CGFloat(dims.width), height: CGFloat(dims.height))
            return (deviceFormat, resolution)
        }

        fileprivate func configureFrontCamera(for captureSession: AVCaptureSession) throws -> (device: AVCaptureDevice, resolution: CGSize) {
            let deviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [AVCaptureDevice.DeviceType.builtInWideAngleCamera], mediaType: .video, position: AVCaptureDevice.Position.unspecified)
            let device = deviceDiscoverySession.devices.first!
            let deviceInput = try! AVCaptureDeviceInput(device: device)
            captureSession.addInput(deviceInput)
            let highestResolution = self.highestResolution420Format(for: device)!
            try! device.lockForConfiguration()
            device.activeFormat = highestResolution.format
            device.unlockForConfiguration()
            return (device, highestResolution.resolution)
        }

        fileprivate func configureVideoDataOutput(for inputDevice: AVCaptureDevice, resolution: CGSize, captureSession: AVCaptureSession) {
            videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue)
            captureSession.addOutput(videoDataOutput)
        }

        fileprivate func prepareVisionRequest() {
            let faceDetectionRequest: VNDetectFaceRectanglesRequest = VNDetectFaceRectanglesRequest(completionHandler: { (request, error) in
                print("VNDetectFaceRectanglesRequest completion Handler called")
            })
            // Start with detection
            detectionRequests = [faceDetectionRequest]
        }

        // MARK: AVCaptureVideoDataOutputSampleBufferDelegate
        // Handle delegate method callback on receiving a sample buffer.
        public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
            var requestHandlerOptions: [VNImageOption: AnyObject] = [:]
            let cameraIntrinsicData = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil)
            if cameraIntrinsicData != nil {
                requestHandlerOptions[VNImageOption.cameraIntrinsics] = cameraIntrinsicData
            }
            let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
            // No tracking object detected, so perform initial detection
            let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: CGImagePropertyOrientation.up, options: requestHandlerOptions)
            try! imageRequestHandler.perform(detectionRequests!)
        }
    }

    let X = XYZ()
    X.viewDidLoad()
    sleep(9999999)
0
0
332
Nov ’25
Foundational Model - Image as Input? Timeline
Hi all, I am interested in unlocking unique applications with the new foundational models. I have a few questions regarding the availability of the following features: Image Input: The update in June 2025 mentions "image" 44 times (https://machinelearning.apple.com/research/apple-foundation-models-2025-updates) - however I can't seem to find any information about having images as the input/prompt for the foundational models. When will this be available? I understand that there are existing Vision ML APIs, but I want image input into a multimodal on-device LLM (VLM) instead for features like "Which player is holding the ball in the image", etc (image understanding) Cloud Foundational Model - when will this be available? Thanks! Clement :)
1
0
494
Sep ’25
Mistral/LLaMa Core ML Conversion
Hi, I am new to developing on Apple’s platforms, and I want to familiarize myself with Core ML and Core ML Tools. I was watching the WWDC24 video "Bring your machine learning and AI models to Apple Silicon" and trying to follow along. After multiple attempts and much reading of the documentation, I am still unable to get a coherent script running that converts the Mistral model the host used into a valid Core ML model. Here is a pastebin of what I have currently: https://pastebin.com/04cVjF1v If you require the output as well, please let me know.
0
0
119
Apr ’25
CoreML Model Conversion Help
I’m trying to follow Apple’s “WWDC24: Bring your machine learning and AI models to Apple Silicon” session to convert the Mistral-7B-Instruct-v0.2 model into a Core ML package, but I’ve run into a roadblock that I can’t seem to overcome. I’ve uploaded my full conversion script here for reference: https://pastebin.com/T7Zchzfc When I run the script, it progresses through tracing and MIL conversion but then fails at the backend_mlprogram stage with this error: https://pastebin.com/fUdEzzKM The core of the error is: ValueError: Op "keyCache_tmp" (op_type: identity) Input x="keyCache" expects list, tensor, or scalar but got state[tensor[1,32,8,2048,128,fp16]] I’ve registered my KV-cache buffers in a StatefulMistralWrapper subclass of nn.Module, matching the keyCache and valueCache state names in my ct.StateType definitions, but Core ML’s backend pass reports the state tensor as an invalid input. I’m using Core ML Tools 8.3.0 on Python 3.9.6, targeting iOS18, and forcing CPU conversion (MPS wasn’t available). Any pointers on how to satisfy the handle_unused_inputs pass or properly declare/cache state for GQA models in Core ML would be greatly appreciated! Thanks in advance for your help, Usman Khan
0
0
179
May ’25
How to test for VisualIntelligence available on device?
I'm adding Visual Intelligence support to my app, and now want to add a Tip using TipKit to guide users to this feature from within my app. I want to add a Rule to my Tip which will only show this Tip on devices where Visual Intelligence is supported (ex. not iPhone 14 Pro Max). What is the best way for me to determine availability to set this TipKit rule? Here's the documentation I'm following for Visual Intelligence: https://developer.apple.com/documentation/visualintelligence/integrating-your-app-with-visual-intelligence
0
0
605
Sep ’25
Embedding model missing once transferred to Xcode
I've created a "Transfer Learning BERT Embeddings" model with the default "Latin" language family and "Automatic" Language setting. This model performs exceptionally well against the test data set and functions as expected when I preview it in Create ML. However, when I add it to the Xcode project of the application to which I am deploying it, I am getting runtime errors that suggest it can't find the embedding resources: Failed to locate assets for 'mul_Latn' - '5C45D94E-BAB4-4927-94B6-8B5745C46289' embedding model Note, I am adding the model to the app project the same way that I added an earlier "Maximum Entropy" model. That model had no runtime issues. So it seems there is an issue getting hold of the embeddings at runtime. For now, "runtime" means in the Simulator. I intend to deploy my application to iOS devices once GM 26 is released (the app also uses AFM). I'm developing on Tahoe 26 beta, running on iOS 26 beta, using Xcode 26 beta. Is this a known/expected issue? Are the embeddings expected to be a resource in the model? Is there a workaround? I did try opening the model in Xcode and saving it as an mlpackage, then adding that to my app project, but that also didn't resolve the issue.
1
0
369
Sep ’25
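The "Failed to locate assets" message in the post above mentions the 'mul_Latn' embedding model, which suggests the embedding is a system asset rather than part of the model file. That is an assumption, not something the post or the documentation quoted there confirms. Under that assumption, a minimal sketch for checking and requesting the Latin-script contextual embedding assets from inside an app might look like this:

```swift
import NaturalLanguage

// Assumption: the Create ML BERT-embedding model depends on the system's
// Latin-script contextual embedding, which may need to be downloaded.
if let embedding = NLContextualEmbedding(script: .latin) {
    if embedding.hasAvailableAssets {
        print("Embedding assets are already on this device")
    } else {
        // Ask the system to fetch the assets; the handler runs when it finishes.
        embedding.requestAssets { result, error in
            switch result {
            case .available:
                print("Embedding assets downloaded")
            case .notAvailable:
                print("Embedding assets are not available")
            case .error:
                print("Asset request failed: \(String(describing: error))")
            @unknown default:
                print("Unexpected asset result")
            }
        }
    }
} else {
    print("No contextual embedding found for the Latin script")
}
```

Running the same check on a physical device as well as in the Simulator would show whether the problem is specific to the Simulator.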
ImagePlayground: Programmatic Creation Error
Hardware: MacBook Pro M4, Nov 2024
Software: macOS Tahoe 26.0 & Xcode 26.0
Apple Intelligence is activated and the Image Playground macOS app works. Running the following in Xcode throws ImagePlayground.ImageCreator.Error.creationFailed. Any suggestions on how to make this work?

    import Foundation
    import ImagePlayground

    Task {
        let creator = try await ImageCreator()
        guard let style = creator.availableStyles.first else {
            print("No styles available")
            exit(1)
        }
        let images = creator.images(
            for: [.text("A cat wearing mittens.")],
            style: style,
            limit: 1)
        for try await image in images {
            print("Generated image: \(image)")
        }
        exit(0)
    }

    RunLoop.main.run()
0
0
274
Sep ’25
A specific mlmodelc model runs on iPhone 15, but not on iPhone 16
As described in the title, the model I built works fine on iPhone 15 (A16 Bionic), but it does not run on iPhone 16 (A18) and fails with the following error message:

    E5RT encountered an STL exception. msg = MILCompilerForANE error: failed to compile ANE model using ANEF. Error=_ANECompiler : ANECCompile() FAILED.
    E5RT: MILCompilerForANE error: failed to compile ANE model using ANEF. Error=_ANECompiler : ANECCompile() FAILED (11)

Loading the model consumes 1.5 to 1.6 GB of RAM, then consumption drops to less than 100 MB on both iPhone 15 and iPhone 16. After that, only on iPhone 16, the above error appears in the Xcode log, memory consumption surges to 5 to 6 GB, and the system kills the app. It works well only on iPhone 15. The model is built with Core ML Tools. So far I have tried deployment targets from iOS 16 to iOS 18 and the compute units CPU_AND_NE and ALL, but none of these solved the issue. What kind of fix should I try?

    minimum_deployment_target = ct.target.iOS18
    compute_units = ct.ComputeUnit.ALL
    compute_precision = ct.precision.FLOAT16
2
0
161
May ’25
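Since the error in the post above comes from the ANE compiler, one diagnostic step is to load the same compiled model with the Neural Engine excluded and see whether the failure and the memory spike disappear. The sketch below only shows how to pick compute units at load time on the Swift side; `modelURL` and both function names are placeholders, and this narrows down the cause rather than fixing the ANE compilation itself.

```swift
import CoreML

// Builds a configuration restricted to the given compute units.
func makeConfiguration(units: MLComputeUnits) -> MLModelConfiguration {
    let config = MLModelConfiguration()
    config.computeUnits = units
    return config
}

// Loads a compiled model (.mlmodelc) with the chosen compute units.
// `modelURL` is a placeholder for the model's location in the app bundle.
func loadModel(at modelURL: URL, units: MLComputeUnits) throws -> MLModel {
    try MLModel(contentsOf: modelURL, configuration: makeConfiguration(units: units))
}
```

Calling `loadModel(at:units:)` with `.cpuAndGPU` on the iPhone 16 and comparing against `.all` would indicate whether only the Neural Engine path is affected.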
AppShortcuts.xcstrings does not translate each invocation phrase option separately, just the first
Due to our min iOS version, this is my first time using .xcstrings instead of .strings for AppShortcuts. When using the migrate .strings to .xcstrings Xcode context menu option, an .xcstrings catalog is produced that, as expected, has each invocation phrase as a separate string key. However, after compilation, the catalog changes to group all invocation phrases under the first phrase listed for each intent (see attached screenshot). It is possible to hover in blank space on the right and add more translations, but there is no 1:1 key matching requirement to the phrases on the left, nor a requirement that there are the same number of keys in one language vs. another. (The lines just happen to align due to my window size.)
What does that mean, practically?
Do all sub-phrases in each language in AppShortcuts.xcstrings get processed during compilation, even if there isn't an equivalent phrase key declared in the AppShortcut (e.g., the ja translation has more phrases than the English)? (That makes some logical sense, as these phrases need not be 1:1 across languages.)
In the AppShortcut declaration, if I delete all but the top invocation phrase, does nothing change with Siri?
Is there something I'm doing incorrectly?

    struct WatchShortcuts: AppShortcutsProvider {
        static var appShortcuts: [AppShortcut] {
            AppShortcut(
                intent: QuickAddWaterIntent(),
                phrases: [
                    "\(.applicationName) log water",
                    "\(.applicationName) log my water",
                    "Log water in \(.applicationName)",
                    "Log my water in \(.applicationName)",
                    "Log a bottle of water in \(.applicationName)",
                ],
                shortTitle: "Log Water",
                systemImageName: "drop.fill"
            )
        }
    }
0
0
276
Aug ’25
`LanguageModelSession.respond()` never resolves in Beta 5
Hi all, I noticed on Friday that on the new Beta 5, when using FoundationModels in a simulator, LanguageModelSession.respond() neither resolves nor throws most of the time. The SwiftUI test app below was working perfectly in Xcode 26 Beta 4 and iOS 26 Beta 4 (simulator).

    import SwiftUI
    import FoundationModels

    struct ContentView: View {
        var body: some View {
            VStack {
                Image(systemName: "globe")
                    .imageScale(.large)
                    .foregroundStyle(.tint)
                Text("Hello, world!")
            }
            .padding()
            .onAppear {
                Task {
                    do {
                        let session = LanguageModelSession()
                        let response = try await session.respond(to: "are cats better than dogs ???")
                        print(response.content)
                    } catch {
                        print("error")
                    }
                }
            }
        }
    }

After updating to Xcode 26 Beta 5 and iOS 26 Beta 5 (simulator), the code now often hangs. Occasionally it will work if I toggle Apple Intelligence on and off in Settings, but it’s unreliable.
2
0
350
Aug ’25
Error with guardrailViolation and underlyingErrors
Hi, I am a new iOS developer, trying to learn to integrate the Apple Foundation Model. My setup is:
Mac M1 Pro
macOS 26 Beta, Version 26.0 beta 3
Apple Intelligence & Siri --> On
Here is the code:

    func generate() {
        Task {
            isGenerating = true
            output = "⏳ Thinking..."
            do {
                let session = LanguageModelSession(
                    instructions: """
                    Extract time from a message. Example
                    Q: Golfing at 6PM
                    A: 6PM
                    """)
                let response = try await session.respond(to: "Go to gym at 7PM")
                output = response.content
            } catch {
                output = "❌ Error:, \(error)"
                print(output)
            }
            isGenerating = false
        }
    }

and I get this error:

    guardrailViolation(FoundationModels.LanguageModelSession.GenerationError.Context(debugDescription: "Prompt may contain sensitive or unsafe content", underlyingErrors: [Asset com.apple.gm.safety_embedding_deny.all not found in Model Catalog]))

Can you help me get through this?
2
0
302
Aug ’25
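The underlying error in the post above ("Asset ... not found in Model Catalog") hints that some model assets may not be present on the machine yet, although the post does not confirm that. A minimal sketch for checking the system model's state before creating a session, assuming iOS 26 or macOS 26, could look like this; the function name and the returned strings are illustrative only.

```swift
import FoundationModels

// Returns a short, human-readable description of the on-device model's state.
func describeModelAvailability() -> String {
    switch SystemLanguageModel.default.availability {
    case .available:
        return "available"
    case .unavailable(.deviceNotEligible):
        return "this device does not support Apple Intelligence"
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Apple Intelligence is turned off in Settings"
    case .unavailable(.modelNotReady):
        return "model assets are not ready yet (possibly still downloading)"
    case .unavailable(let other):
        return "unavailable: \(other)"
    }
}
```

Logging this value right before calling `respond(to:)` makes it easier to tell a genuine guardrail rejection apart from a setup problem.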
VNDetectTextRectanglesRequest not detecting text rectangles (includes image)
Hi everyone, I'm trying to use VNDetectTextRectanglesRequest to detect text rectangles in an image. Here's my current code:

    guard let cgImage = image.cgImage(forProposedRect: nil, context: nil, hints: nil) else {
        return
    }

    let textDetectionRequest = VNDetectTextRectanglesRequest { request, error in
        if let error = error {
            print("Text detection error: \(error)")
            return
        }
        guard let observations = request.results as? [VNTextObservation] else {
            print("No text rectangles detected.")
            return
        }
        print("Detected \(observations.count) text rectangles.")
        for observation in observations {
            print(observation.boundingBox)
        }
    }
    textDetectionRequest.revision = VNDetectTextRectanglesRequestRevision1
    textDetectionRequest.reportCharacterBoxes = true

    let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: [:])
    do {
        try handler.perform([textDetectionRequest])
    } catch {
        print("Vision request error: \(error)")
    }

The request completes without error, but no text rectangles are detected; the observations array is empty (count = 0). Here's a sample image I'm testing with: I expected VNTextObservation results, but I'm not getting any. Is there something I'm missing in how this API works? Or could it be a limitation of this request or revision? Thanks for any help!
2
0
135
May ’25
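For comparison with the post above, the same image can be run through VNRecognizeTextRequest, which also reports a bounding box per detected text region. This is a sketch of an alternative request, not an explanation of why VNDetectTextRectanglesRequest returns nothing; the function name and the tuple shape are illustrative.

```swift
import Vision
import CoreGraphics

// Runs text recognition on an image and returns each recognized string
// together with its normalized bounding box (origin at the bottom-left).
func recognizeText(in cgImage: CGImage) throws -> [(text: String, box: CGRect)] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate

    let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: [:])
    try handler.perform([request])

    let observations = request.results ?? []
    return observations.compactMap { observation -> (text: String, box: CGRect)? in
        // Keep only the most likely transcription for each region.
        guard let best = observation.topCandidates(1).first else { return nil }
        return (best.string, observation.boundingBox)
    }
}
```

If this returns regions for the sample image while the rectangle request stays empty, that would point to the request or revision rather than the image or the handler setup.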
Accessing Apple Intelligence APIs: Custom Prompt Support and Inference Capabilities
Hello Apple Developer Community, I'm exploring the integration of Apple Intelligence features into my mobile application and have a couple of questions regarding the current and upcoming API capabilities: Custom Prompt Support: Is there a way to pass custom prompts to Apple Intelligence to generate specific inferences? For instance, can we provide a unique prompt to the Writing Tools or Image Playground APIs to obtain tailored outputs? Direct Inference Capabilities: Beyond the predefined functionalities like text rewriting or image generation, does Apple Intelligence offer APIs that allow for more generalized inference tasks based on custom inputs? I understand that Apple has provided APIs such as Writing Tools, Image Playground, and Genmoji. However, I'm interested in understanding the extent of customization and flexibility these APIs offer, especially concerning custom prompts and generalized inference. Additionally, are there any plans or timelines for expanding these capabilities, perhaps with the introduction of new SDKs or frameworks that allow deeper integration and customization? Any insights, documentation links, or experiences shared would be greatly appreciated. Thank you in advance for your assistance!
3
0
313
Jun ’25