I have a question. In China, long-pressing a picture in the Photos album can segment the subject out of the image. Is this a local (on-device) model? Is there any documentation about it? Can developers use it?
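For reference, the closest public API I'm aware of is the Vision framework's subject-lifting request (available since iOS 17). Whether the Photos long-press feature uses exactly this model is my assumption, but a minimal on-device sketch looks roughly like this:
import Vision

func subjectMask(for cgImage: CGImage) throws -> CVPixelBuffer? {
    // Runs entirely on device: produces a mask for the foreground subject(s).
    let request = VNGenerateForegroundInstanceMaskRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage)
    try handler.perform([request])
    guard let observation = request.results?.first else { return nil }
    // Scale the mask up to the original image's resolution.
    return try observation.generateScaledMaskForImage(forInstances: observation.allInstances,
                                                      from: handler)
}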
According to the Tool documentation, the arguments to the tool are specified as a static struct type T, which is given to tool.call(argument: T). However, if the arguments are not known until runtime, is it still possible to create a Tool object with the proper parameters? For example, if a JSON-style dictionary is passed into the Tool init function to specify T, is this achievable?
I’m working on generating images using Image Playground. The code works fine for other styles but fails when using an external provider. I don’t see any other requirements mentioned in the documentation. Has anyone else encountered a similar issue?
Documentation: https://developer.apple.com/documentation/imageplayground/imageplaygroundstyle/externalprovider?changes=_2
The error message is also not very helpful. It simply states that the creation failed.
Note: I have enabled ChatGPT Plus, and image generation using ChatGPT styles works fine in the Playground app.
Here's the relevant code snippet:
do {
    let creator = try await ImageCreator()
    let concept = ImagePlaygroundConcept.text("Love")
    let images = creator.images(for: [concept], style: .externalProvider, limit: 1)
    for try await image in images {
        // Handle image
        break
    }
} catch {
    // Handle error
}
I'm using the iOS 26 RC, and when I print creator.availableStyles, it doesn't include the externalProvider style:
[ImagePlayground.ImagePlaygroundStyle(id: "animation", _representationInfo: nil), ImagePlayground.ImagePlaygroundStyle(id: "emoji", _representationInfo: nil), ImagePlayground.ImagePlaygroundStyle(id: "illustration", _representationInfo: nil), ImagePlayground.ImagePlaygroundStyle(id: "sketch", _representationInfo: nil), ImagePlayground.ImagePlaygroundStyle(id: "messages-background", _representationInfo: nil)]
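In case it helps others hitting the same thing, a small defensive sketch I'd try inside the same do/catch (this assumes ImagePlaygroundStyle supports equality checks, and the fallback style is just an example):
// Fall back to a built-in style when the external provider isn't offered
// on this device/account (it is missing from availableStyles above).
let style: ImagePlaygroundStyle =
    creator.availableStyles.contains(.externalProvider) ? .externalProvider : .animation
let images = creator.images(for: [concept], style: style, limit: 1)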
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
This is my code:
switch SystemLanguageModel.default.availability {
case .available:
    ContentView()
        .popover(isPresented: $showSettings) {
            SettingsView().presentationCompactAdaptation(.popover)
        }
case .unavailable(.modelNotReady):
    ContentUnavailableView("Apple Intelligence is unavailable",
                           systemImage: "apple.intelligence.badge.xmark",
                           description: Text("Please come back later."))
case .unavailable(.appleIntelligenceNotEnabled):
    ContentUnavailableView("Apple Intelligence is unavailable",
                           systemImage: "apple.intelligence.badge.xmark",
                           description: Text("Please turn on Apple Intelligence."))
case .unavailable(.deviceNotEligible):
    ContentUnavailableView("Apple Intelligence is unavailable",
                           systemImage: "apple.intelligence.badge.xmark",
                           description: Text("This device is not eligible for Apple Intelligence."))
case .unavailable:
    ContentUnavailableView("Apple Intelligence is unavailable",
                           systemImage: "apple.intelligence.badge.xmark")
}
When I switch off Apple Intelligence, I expected "Please turn on Apple Intelligence.", but instead I get "Please come back later."
Is this the wrong error being reported?
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I'm testing the Foundation Models framework with a health-focused recipe generation app. The on-device approach is appealing, but performance is rough: it takes 20+ seconds just to get a recipe name and description, while the same content from the Claude API takes about 4 seconds.
I know it's beta and on-device has different tradeoffs, but this is approaching unusable territory for a real-time user experience. Streaming helps psychologically but doesn't mask the underlying latency. The privacy/cost benefits are compelling, but not if users abandon the feature before it completes.
Anyone else seeing similar performance? Is this expected for beta, or are there optimization techniques I'm missing?
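For what it's worth, one thing that may reduce the perceived latency (a sketch based on my reading of the API, not a confirmed fix) is calling prewarm() as soon as a request looks likely, so the model is already loaded when the user taps generate:
import FoundationModels

let session = LanguageModelSession(instructions: "You write short, healthy recipe names with one-sentence descriptions.")

// Hint that a request is coming so the system can load the model ahead of time.
session.prewarm()

// Later, when the user actually asks for a recipe:
let response = try await session.respond(to: "Suggest a healthy dinner recipe name and a one-sentence description.")
print(response.content)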
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hello
I’m experimenting with Apple’s on‑device language model via the FoundationModels framework in Xcode (using LanguageModelSession in my code). I’d like to confirm a few points:
• Is the language model provided by FoundationModels designed and trained by Apple? Or is it based on an open‑source model?
• Is this on‑device model available on iOS (and iPadOS), or is it limited to macOS?
• When I write code in Xcode, is code completion powered by this same local model? If so, why isn’t the same model available in the left‑hand chat sidebar in Xcode (so that I can use it there instead of relying on ChatGPT)?
• Can I grant this local model access to my personal data (photos, contacts, SMS, emails) so it can answer questions based on that information? If yes, what APIs, permission prompts, and privacy constraints apply?
Thanks
Hello,
I'm trying to write a model with PyTorch and convert it to Core ML.
I have written other models that convert successfully, and even the problematic one converts, but I can't open it in Xcode to visualize where it runs.
The error that appears is:
There was a problem decoding this Core ML document
validator error: unable to open file for read
Does anyone know why this is happening?
Thanks a lot,
Álvaro Corrochano
Topic:
Machine Learning & AI
SubTopic:
Core ML
Hey guys 👋
I’ve been thinking about a feature idea for iOS that could totally change the way we interact with apps like Twitter/X.
Imagine if we could define our own recommendation algorithm, and have an AI on the iPhone that replaces the suggested tweets in the feed with ones that match our personal interests — based on public tweets, and without hacking anything.
Kinda like a personalized "AI skin" over the app that curates content you actually care about. Feels like this would make content way more relevant and less algorithmically manipulative.
Would love to know what you all think — and if Apple could pull this off 🔥
Topic:
Machine Learning & AI
SubTopic:
General
Hey,
It would be great to have an equivalent of toolCallId for both toolCall and toolResult in the transcript. Otherwise, it is hard to connect tool calls with their respective responses when there are multiple parallel calls to the same tool.
Thanks!
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hey Devs,
I'm trying to create my own real-time text detection, like this Apple sample project: https://developer.apple.com/documentation/vision/extracting-phone-numbers-from-text-in-images
I want to use the new iOS 18 RecognizeTextRequest instead of the old VNRecognizeTextRequest in my SwiftUI project.
This is my delegate code with the camera setup. I removed the region of interest for debugging, but I'm trying to scan English words in books. The idea is to eventually get one word inside the ROI, but I can't even get proper words, so I'm testing without an ROI in case my math is wrong.
@Observable
class CameraManager: NSObject, AVCapturePhotoCaptureDelegate {

    ...

    override init() {
        super.init()
        setUpVisionRequest()
    }

    private func setUpVisionRequest() {
        textRequest = RecognizeTextRequest(.revision3)
    }

    ...

    func setup() -> Bool {
        captureSession.beginConfiguration()

        guard
            let captureDevice = AVCaptureDevice.default(
                .builtInWideAngleCamera, for: .video, position: .back)
        else {
            return false
        }
        self.captureDevice = captureDevice

        guard let deviceInput = try? AVCaptureDeviceInput(device: captureDevice)
        else {
            return false
        }

        /// Check whether the session can add input.
        guard captureSession.canAddInput(deviceInput) else {
            print("Unable to add device input to the capture session.")
            return false
        }

        /// Add the input and output to session
        captureSession.addInput(deviceInput)

        /// Configure the video data output
        videoDataOutput.setSampleBufferDelegate(
            self, queue: videoDataOutputQueue)
        if captureSession.canAddOutput(videoDataOutput) {
            captureSession.addOutput(videoDataOutput)
            videoDataOutput.connection(with: .video)?
                .preferredVideoStabilizationMode = .off
        } else {
            return false
        }

        // Set zoom and autofocus to help focus on very small text
        do {
            try captureDevice.lockForConfiguration()
            captureDevice.videoZoomFactor = 2
            captureDevice.autoFocusRangeRestriction = .near
            captureDevice.unlockForConfiguration()
        } catch {
            print("Could not set zoom level due to error: \(error)")
            return false
        }

        captureSession.commitConfiguration()

        // potential issue with background vs dispatchqueue ??
        Task(priority: .background) {
            captureSession.startRunning()
        }

        return true
    }
}

// Issue here ???
extension CameraManager: AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(
        _ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer,
        from connection: AVCaptureConnection
    ) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

        Task {
            textRequest.recognitionLevel = .fast
            textRequest.recognitionLanguages = [Locale.Language(identifier: "en-US")]
            do {
                let observations = try await textRequest.perform(on: pixelBuffer)
                for observation in observations {
                    let recognizedText = observation.topCandidates(1).first
                    print("recognized text \(recognizedText)")
                }
            } catch {
                print("Recognition error: \(error.localizedDescription)")
            }
        }
    }
}
The results I get look like this (a full page of English from any book):
recognized text Optional(RecognizedText(string: e bnUI W4, confidence: 0.5))
recognized text Optional(RecognizedText(string: ?'U, confidence: 0.3))
recognized text Optional(RecognizedText(string: traQt4, confidence: 0.3))
recognized text Optional(RecognizedText(string: li, confidence: 0.3))
recognized text Optional(RecognizedText(string: 15,1,#, confidence: 0.3))
recognized text Optional(RecognizedText(string: jllÈ, confidence: 0.3))
recognized text Optional(RecognizedText(string: vtrll, confidence: 0.3))
recognized text Optional(RecognizedText(string: 5,1,: 11, confidence: 0.5))
recognized text Optional(RecognizedText(string: 1141, confidence: 0.3))
recognized text Optional(RecognizedText(string: jllll ljiiilij41, confidence: 0.3))
recognized text Optional(RecognizedText(string: 2f4, confidence: 0.3))
recognized text Optional(RecognizedText(string: ktril, confidence: 0.3))
recognized text Optional(RecognizedText(string: ¥LLI, confidence: 0.3))
recognized text Optional(RecognizedText(string: 11[Itl,, confidence: 0.3))
recognized text Optional(RecognizedText(string: 'rtlÈ131, confidence: 0.3))
Even with the ROI set to a specific rectangle (normalized for Vision), I get the same results, with single characters returning gibberish.
Any help would be amazing, thank you.
Am I using the buffer right?
Am I using the new perform(on: CVPixelBuffer) right?
Maybe I didn't set up my camera properly? I can provide more code.
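One thing that might be worth checking (my assumption, not something confirmed here): AVCaptureVideoDataOutput delivers buffers in landscape orientation by default, so the recognizer may be looking at rotated text, which would explain the gibberish. A minimal sketch of rotating the connection so frames arrive upright when the phone is held in portrait (videoRotationAngle requires iOS 17+):
// In setup(), after adding videoDataOutput to the session:
if let connection = videoDataOutput.connection(with: .video),
   connection.isVideoRotationAngleSupported(90) {
    // 90 degrees maps the landscape sensor output to portrait.
    connection.videoRotationAngle = 90
}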
When @Generable is applied to a Swift struct declared within another struct, and that nested struct is used as the type of a property of another @Generable type that is in turn the output format of a Foundation Models session, the model can get stuck in a loop trying to create an infinitely nested response until the context-window-limit-exceeded error is triggered.
I have filed feedback FB19987191 with a demo project. Is this expected behavior?
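For anyone trying to reproduce, a minimal sketch of the shape described above (all type names here are made up for illustration):
struct Outer {
    @Generable
    struct Inner {              // @Generable applied to a struct declared within another struct
        var label: String
    }
}

@Generable
struct ModelOutput {            // used as the session's output format
    var detail: Outer.Inner     // property typed as the nested struct
}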
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I keep getting the error “An unsupported language or locale was used.”
Is there any documentation that specifies the accepted languages or locales for the Foundation Models framework?
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hello,
We find that models sometimes load very fast (<< 1 second) and sometimes encounter very long load times (>> 120 seconds). During such slow load times, the model is being compiled.
We would greatly appreciate the ability to check cache validity via CoreML and determine that we are about to encounter long load times so that we can mitigate and provide a good user experience.
A secondary issue: sometimes the cache is corrupted (typically a .mpsgraphpackage yielding Metal cold asserts). This yields load failures and OS errors that persist between launches, and we have to manually nuke the cache (~/Library/..../my-app/...) for the Core ML assets. A Core ML API for clearing caches, and hardening of the load paths against these asserts, would be appreciated.
Topic:
Machine Learning & AI
SubTopic:
Core ML
Hi,
I’m developing an app targeting iOS 26, using the new FoundationModels framework to perform on-device LLM inference. I’m currently testing memory usage.
Does the memory used by FoundationModels—including model weights, KV cache, and any inference-related buffers—count toward my app’s Jetsam memory limit, or is any of it managed separately by the system?
I may need to run two concurrent inferences, each with a 4096-token context window. Is this explicitly supported or allowed by FoundationModels on iOS 26? Would this significantly increase the risk of memory-based termination?
Thanks in advance for any clarification.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hi friends,
I have just found that the inference speed dropped to only about 1/10 of the original model's.
Has anyone encountered this?
Thank you.
Topic:
Machine Learning & AI
SubTopic:
Core ML
iOS 26 is supported by a wider range of devices than can run Apple Intelligence; e.g. the iPhone 12 runs iOS 26 but does not support Apple Intelligence.
How do we determine in code whether Apple Intelligence is supported on a device?
How do we determine which features use Apple Intelligence under the hood?
Thanks,
Steve.
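A minimal sketch of one way to check at runtime, mirroring the SystemLanguageModel availability switch shown in an earlier post (this only covers features backed by the on-device foundation model; other frameworks may expose their own checks):
import FoundationModels

switch SystemLanguageModel.default.availability {
case .available:
    print("The Apple Intelligence on-device model is available.")
case .unavailable(.deviceNotEligible):
    print("This device cannot run Apple Intelligence (e.g. iPhone 12 on iOS 26).")
case .unavailable(.appleIntelligenceNotEnabled):
    print("Apple Intelligence is supported here but not turned on.")
case .unavailable:
    print("Unavailable for another reason (e.g. the model is not ready yet).")
}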
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Dear Apple Foundation Models Development Team,
I am a developer integrating Apple Foundation Models (AFM) into my app and encountered the exceededContextWindowSize error when exceeding the 4096-token limit.
Proposal:
I suggest Apple develop a tool to estimate the token count of a prompt before sending it to the model. This tool could be integrated into FoundationModels Framework for ease of use.
Benefits:
A token estimation tool would help developers manage the context window limit and optimize performance. I hope Apple considers this proposal soon.
Thank you!
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I'm building a new feature with Visual Intelligence framework. My implementation for IndexedEntity and IntentValueQuery worked as expected and I can see a list of objects in visual search result.
However, my OpenIntent doesn't work. When I tap on the object, I get a message on screen saying "Sorry, something went wrong ...", and the breakpoint in perform() is never triggered.
Things I've tried:
I added @MainActor before perform(); this didn't change anything.
I set static let openAppWhenRun: Bool = true and static var supportedModes: IntentModes = [.foreground(.immediate)]; still nothing.
I created a different intent for the "see more" button at the end of the feed. That AppIntent with schema: .visualIntelligence.semanticContentSearch worked, and perform() is executed.
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Hi, DataScannerViewController doesn't recognize currency values less than 1.00 (e.g. 0.59 USD, 0.99 EUR, etc.). Why? How can I solve this?
This behavior is not described in the Apple documentation; is there a solution?
This is my code:
func makeUIViewController(context: Context) -> DataScannerViewController {
    let dataScanner = DataScannerViewController(recognizedDataTypes: [.text(textContentType: .currency)])
    return dataScanner
}
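In case it helps while waiting for an answer, a hedged workaround sketch (my assumption, not a documented fix): recognize plain text instead of the .currency content type and filter price-like strings yourself in the delegate:
func makeUIViewController(context: Context) -> DataScannerViewController {
    // Recognize all text, then filter for price-like values ourselves.
    let dataScanner = DataScannerViewController(recognizedDataTypes: [.text()])
    dataScanner.delegate = context.coordinator
    return dataScanner
}

// In the coordinator (DataScannerViewControllerDelegate):
func dataScanner(_ dataScanner: DataScannerViewController,
                 didAdd addedItems: [RecognizedItem],
                 allItems: [RecognizedItem]) {
    for case .text(let text) in addedItems {
        // Matches values like "0.59", "$0.99", "1.00"; adjust to the formats you need.
        if text.transcript.range(of: #"\$?\d+[.,]\d{2}"#, options: .regularExpression) != nil {
            print("Possible price: \(text.transcript)")
        }
    }
}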
I am experimenting with Foundation Models in my time tracking app to analyze users' tracked events, but I am finding that the model struggles with even basic computation of time, specifically converting from seconds to hours and minutes.
To give just one example, when I prompt:
"Convert 3672 seconds to hours, minutes, and seconds. Don't include the calculations in the resulting output"
I get this:
"3672 seconds is equal to 1 hour, 0 minutes, and 36 seconds".
Which is clearly wrong - it should be 1 hour, 1 minute, and 12 seconds. Another issue that I saw a lot is that seconds were considered to be minutes, or that the hours were just completely off.
What can I do to make the support for math better? Or is that just something that the model is not meant to be used for?
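For what it's worth, the on-device model is generally weak at exact arithmetic, so one option (a sketch of the general approach, not an official recommendation) is to do the conversion in plain Swift, either directly or inside a Tool the model can call, and let the model only phrase the result:
// Deterministic conversion; 3672 -> 1 hour, 1 minute, 12 seconds.
func formatDuration(_ totalSeconds: Int) -> String {
    let hours = totalSeconds / 3600
    let minutes = (totalSeconds % 3600) / 60
    let seconds = totalSeconds % 60
    return "\(hours) hour(s), \(minutes) minute(s), and \(seconds) second(s)"
}

print(formatDuration(3672))  // "1 hour(s), 1 minute(s), and 12 second(s)"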
Topic:
Machine Learning & AI
SubTopic:
Foundation Models