A Protocol-Based Approach to Mixing Local and Cloud AI Services

Big Brother season is back, and so is my superfan app, Hamster Soup. I rebuild this app every few years—not just for fans, but as my personal playground to explore new ideas, designs, and technologies. Last year that new technology was AI.
In 2024 I built AI features on top of OpenAI’s cloud-based LLM APIs, but the plan was always to use built-in AI as much as possible. Since Apple only unveiled their Foundation Models SDK in June, that capability won’t be available until iOS 26 launches near the tail end of this season of Big Brother.
Because I want my iOS 26 app update to launch on day 1, I’ve already started building this in a branch. However, even after launch, only about 20% of current iPhones and iPads will be able to run the local language model. For the foreseeable future, that means my AI features either need to be limited to devices that support local models, or I need the flexibility to choose a cloud or local model as needed. After all, even if the device supports the model, it may be in use by something else and unavailable.
The AI service protocol
Most of the functions an AI model performs fall into a few major categories: text generation, summarization, translation, and even media generation such as images and audio. These can easily be described with an enum and a protocol.
You can even be clever: add a flag for local services, making it easy to select them later, and a quality property to distinguish your really expensive (monetizable!) AI services from your basic, free, and fast ones.
enum AIServiceFeature: String, CaseIterable {
case summarization = "Summarization"
case translation = "Translation"
case textGeneration = "Text Generation"
case imageGeneration = "Image Generation"
}
enum AIServiceQuality: String, CaseIterable {
case high = "High"
case medium = "Medium"
case basic = "Basic"
}
protocol AIServiceProtocol {
/// Unique identifier for the AI service
var id: UUID { get }
/// User-friendly name for the AI service
var name: String { get }
/// A short description of the AI service
var description: String { get }
/// The quality level of the AI service.
var quality: AIServiceQuality { get }
/// The features supported by the AI service.
var supportedFeatures: [AIServiceFeature] { get }
/// Returns `true` when the service is a local service.
var isLocal: Bool { get }
/// Determines whether the service is available for the given features.
/// - Parameter features: The features to check availability for.
/// - Returns: `true` if the service is available for all of the given features, otherwise `false`.
/// - Note: Always check availability before using the service.
func isAvailable(for features: [AIServiceFeature]) -> Bool
/// Generates a summary for the given text.
/// - Parameter text: The text to summarize.
/// - Parameter params: Additional parameters for the summary generation.
/// - Returns: A summary of the text.
/// - Throws: An error if the summary generation fails.
func generateSummary(for text: String, params: [String: Any]) async throws -> String?
/// Generates a translation for the given text.
/// - Parameter text: The text to translate.
/// - Parameter language: The target language for the translation.
/// - Parameter params: Additional parameters for the translation generation.
/// - Returns: A translation of the text.
/// - Throws: An error if the translation generation fails.
func generateTranslation(for text: String, to language: String, params: [String: Any]) async throws -> String?
/// Generates text based on the given prompt.
/// - Parameter prompt: The prompt to generate text from.
/// - Parameter params: Additional parameters for the text generation.
/// - Returns: Generated text based on the prompt.
/// - Throws: An error if the text generation fails.
func generateText(from prompt: String, params: [String: Any]) async throws -> String?
/// Generates an image based on the given prompt.
/// - Parameter prompt: The prompt to generate an image from.
/// - Parameter params: Additional parameters for the image generation.
/// - Returns: Generated image data based on the prompt.
/// - Throws: An error if the image generation fails.
func generateImage(from prompt: String, params: [String: Any]) async throws -> Data?
}
You’ll probably want to define some errors,
enum AIServiceError: Error {
case noAvailableService
case serviceUnavailable
case generationFailed(Error)
case unknownError(String)
}
and maybe some default implementations so that every service you create only has to implement functions that make sense. (And let you, the developer, know when you accidentally call the wrong one!)
extension AIServiceProtocol {
func generateSummary(for text: String, params: [String: Any] = [:]) async throws -> String? {
fatalError("This method must be implemented by conforming types.")
}
func generateTranslation(for text: String, to language: String, params: [String: Any] = [:]) async throws -> String? {
fatalError("This method must be implemented by conforming types.")
}
func generateText(from prompt: String, params: [String: Any] = [:]) async throws -> String? {
fatalError("This method must be implemented by conforming types.")
}
func generateImage(from prompt: String, params: [String: Any] = [:]) async throws -> Data? {
fatalError("This method must be implemented by conforming types.")
}
}
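With those defaults in place, a conforming type only implements the calls that make sense for it. Here’s a minimal hypothetical stub (the EchoSummaryService name and its echo behavior are mine, not from the app) that supports summarization only; calling any other generate function falls through to the fatalError defaults:

```swift
import Foundation

/// Hypothetical stub service that supports only summarization.
/// Useful in tests or previews where no real model is needed.
struct EchoSummaryService: AIServiceProtocol {
    let id = UUID()
    let name = "Echo Summary Service"
    let description = "A stub that returns a truncated copy of the input."
    let quality: AIServiceQuality = .basic
    let isLocal = true
    let supportedFeatures: [AIServiceFeature] = [.summarization]

    func isAvailable(for features: [AIServiceFeature]) -> Bool {
        // Available only when every requested feature is one we support
        features.allSatisfy { supportedFeatures.contains($0) }
    }

    func generateSummary(for text: String, params: [String: Any]) async throws -> String? {
        // "Summarize" by echoing the first 100 characters
        String(text.prefix(100))
    }
}
```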
Once you define your protocol, creating a manager of objects conforming to that protocol is simple.
The AI service manager
Creating a manager is a lot simpler than you might think. It basically maintains a collection of objects, with functions to add and remove objects.
class AIServiceManager {
static let shared = AIServiceManager()
var services: [AIServiceProtocol]
/// Initializes the AIServiceManager with an optional array of services.
/// - Parameter services: An array of AIServiceProtocol conforming services to initialize with.
/// If no services are provided, it initializes with an empty array.
init(services: [AIServiceProtocol] = []) {
self.services = services
}
/// Registers a new AI service.
/// - Parameter service: The AIServiceProtocol conforming service to register.
/// This method adds the service to the internal services array.
func registerService(_ service: AIServiceProtocol) {
services.append(service)
}
/// Unregisters an existing AI service.
/// - Parameter service: The AIServiceProtocol conforming service to unregister.
/// This method removes the service from the internal services array.
func unregisterService(_ service: AIServiceProtocol) {
services.removeAll { $0.id == service.id }
}
A manager can also expose functions to quickly get a list of services based on needs like availability, features, and quality.
/// Retrieves all available AI services.
/// - Returns: An array of AIServiceProtocol conforming services that are currently available.
func getAvailableServices() -> [AIServiceProtocol] {
// Check each service against its own supported features; passing
// allCases here would exclude services that only support some features.
return services.filter { $0.isAvailable(for: $0.supportedFeatures) }
}
/// Retrieves a specific AI service by its unique identifier.
/// - Parameter id: The unique identifier of the AI service to retrieve.
/// - Returns: An optional AIServiceProtocol conforming service if found, `nil` otherwise.
func getService(by id: UUID) -> AIServiceProtocol? {
return services.first { $0.id == id }
}
/// Retrieves available AI services that support a specific feature and optionally, service quality.
/// - Parameters:
/// - feature: The feature that the AI services must support.
/// - quality: An optional quality level to filter the AI services.
/// - Returns: An array of AIServiceProtocol conforming services that support the specified feature and match the quality.
func getAvailableServices(feature: AIServiceFeature, quality: AIServiceQuality? = nil) -> [AIServiceProtocol] {
return getServices(features: [feature], availability: true, quality: quality)
}
/// Retrieves AI services that support a specific feature and optionally, service availability and quality.
/// - Parameters:
/// - feature: The feature that the AI services must support.
/// - availability: An optional boolean indicating whether to filter by availability.
/// - quality: An optional quality level to filter the AI services.
/// - Returns: An array of AIServiceProtocol conforming services that support the specified feature and match the availability and quality.
func getServices(feature: AIServiceFeature, availability: Bool? = nil, quality: AIServiceQuality? = nil) -> [AIServiceProtocol] {
return getServices(features: [feature], availability: availability, quality: quality)
}
/// Retrieves AI services that support specific features, availability, and quality.
/// - Parameters:
/// - features: An array of features that the AI services must support.
/// - availability: An optional boolean indicating whether to filter by availability.
/// - quality: An optional quality level to filter the AI services.
/// - Returns: An array of AIServiceProtocol conforming services that match the specified features, availability, and quality.
func getServices(
features: [AIServiceFeature] = [], availability: Bool? = nil, quality: AIServiceQuality? = nil) -> [AIServiceProtocol] {
return services.filter { service in
(features.isEmpty || features.allSatisfy { service.supportedFeatures.contains($0) }) &&
(availability == nil || service.isAvailable(for: features) == availability) &&
(quality == nil || service.quality == quality)
}
}
This is all you need to manage a collection of services, find one that does what you need it to do, and go on your merry way using the service to implement features.
However, the really useful managers also provide functions to access service features when you don’t actually care which service handles the request (though maybe you want to prioritize the local ones first).
// MARK: - AI Service Features
/// Selects the first available AI service that supports a selected feature.
/// - Parameters:
/// - feature: The AI service feature to select.
/// - preferLocal: A boolean indicating whether to prefer local services over cloud services.
/// - Returns: An optional AIServiceProtocol conforming service that supports the specified feature.
private func selectAvailableService(feature: AIServiceFeature, preferLocal: Bool = true) -> AIServiceProtocol? {
let sortedServices = getAvailableServices(feature: feature)
.sorted { $0.isLocal && preferLocal && !$1.isLocal }
return sortedServices.first
}
/// Generates a summary for the provided text using the current or first available AI service.
/// - Parameters:
/// - text: The text to summarize.
/// - params: Optional parameters to customize the summarization process.
/// - preferLocal: A boolean indicating whether to prefer local services over cloud services.
/// - Returns: An optional string containing the generated summary.
/// - Throws: An error if no available service is found or if the summarization fails.
func generateSummary(for text: String, params: [String: Any] = [:], preferLocal: Bool = true) async throws -> String? {
guard let service = selectAvailableService(feature: .summarization, preferLocal: preferLocal) else {
throw AIServiceError.noAvailableService
}
do {
let summary = try await service.generateSummary(for: text, params: params)
return summary
} catch {
throw AIServiceError.generationFailed(error)
}
}
/// Generates a translation for the provided text using the current or first available AI service.
/// - Parameters:
/// - text: The text to translate.
/// - language: The target language for the translation.
/// - params: Optional parameters to customize the translation process.
/// - preferLocal: A boolean indicating whether to prefer local services over cloud services.
/// - Returns: An optional string containing the generated translation.
/// - Throws: An error if no available service is found or if the translation fails.
func generateTranslation(for text: String, to language: String, params: [String: Any] = [:], preferLocal: Bool = true) async throws -> String? {
guard let service = selectAvailableService(feature: .translation, preferLocal: preferLocal) else {
throw AIServiceError.noAvailableService
}
do {
let translation = try await service.generateTranslation(for: text, to: language, params: params)
return translation
} catch {
throw AIServiceError.generationFailed(error)
}
}
/// Generates text based on the provided prompt using the current or first available AI service.
/// - Parameters:
/// - prompt: The prompt to generate text from.
/// - params: Optional parameters to customize the text generation process.
/// - preferLocal: A boolean indicating whether to prefer local services over cloud services.
/// - Returns: An optional string containing the generated text.
/// - Throws: An error if no available service is found or if the text generation fails.
func generateText(from prompt: String, params: [String: Any] = [:], preferLocal: Bool = true) async throws -> String? {
guard let service = selectAvailableService(feature: .textGeneration, preferLocal: preferLocal) else {
throw AIServiceError.noAvailableService
}
do {
let text = try await service.generateText(from: prompt, params: params)
return text
} catch {
throw AIServiceError.generationFailed(error)
}
}
/// Generates an image based on the provided prompt using the current or first available AI service.
/// - Parameters:
/// - prompt: The prompt to generate an image from.
/// - params: Optional parameters to customize the image generation process.
/// - preferLocal: A boolean indicating whether to prefer local services over cloud services.
/// - Returns: An optional Data object containing the generated image.
/// - Throws: An error if no available service is found or if the image generation fails.
func generateImage(from prompt: String, params: [String: Any] = [:], preferLocal: Bool = true) async throws -> Data? {
guard let service = selectAvailableService(feature: .imageGeneration, preferLocal: preferLocal) else {
throw AIServiceError.noAvailableService
}
do {
let data = try await service.generateImage(from: prompt, params: params)
return data
} catch {
throw AIServiceError.generationFailed(error)
}
}
}
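As a quick sketch of how the lookup functions compose (hypothetical usage, assuming services have already been registered with the shared manager): say you want a free, basic-quality translator first, with a fallback to anything that can translate.

```swift
// Hypothetical lookup: prefer an available basic-quality translation
// service, otherwise fall back to any service that supports translation.
let manager = AIServiceManager.shared
let basicTranslators = manager.getServices(feature: .translation, availability: true, quality: .basic)
let translator = basicTranslators.first ?? manager.getServices(feature: .translation).first
```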
This AIServiceManager is now one of the core components of the Hamster Soup app. Bringing last year’s AISummarizer class into this new architecture was almost a straight copy/paste job.
From this,
//
// AISummarizer.swift
// Hamster Soup Forever
//
// Created by Dan Murrell Jr on 7/16/24.
//
import Foundation
import Combine
import Alamofire
struct AISummaryResponse: Decodable {
let summary: String?
let hash: String?
let index: Int?
}
class AISummarizer: ObservableObject {
private let summarizerEndpoint = "..."
func summarize(
text: String,
tone: String,
emotion: String,
contentFocus: String,
conciseness: String,
duration: Int,
maxTokens: Int = 125
) async -> String? {
guard let url = URL(string: summarizerEndpoint) else {
return nil
}
let parameters: [String: Any] = [
"text": text,
"tone": tone,
"emotion": emotion,
"contentFocus": contentFocus,
"conciseness": conciseness,
"duration": duration,
"maxTokens": maxTokens
]
return await withCheckedContinuation { continuation in
AF.request(url, method: .post, parameters: parameters, encoding: JSONEncoding.default)
.validate()
.responseDecodable(of: AISummaryResponse.self) { [self] response in
switch response.result {
case .success(let value):
continuation.resume(returning: value.summary)
case .failure(let error):
continuation.resume(returning: nil)
}
}
}
}
}
to this,
//
// CloudHamsterAIService.swift
// Hamster Soup Forever
//
// Created by Dan Murrell Jr on 6/27/25.
//
import Foundation
import Alamofire
/// A basic implementation of an AI service that can be used to generate summaries.
/// This service is designed to be simple and may not have advanced capabilities.
/// This is the original implementation of HamsterAI, and can be used as a fallback or for testing purposes.
class CloudHamsterAIService: AIServiceProtocol {
var id: UUID
var name: String = "HamsterAI Service"
var description: String = "A simple AI service that generates basic summaries."
var quality: AIServiceQuality = .basic
var isLocal = false // This service is not local
var supportedFeatures: [AIServiceFeature] = [.summarization]
private let summarizerEndpoint = "..."
/// Initializes the basic HamsterAI service with default parameters.
init() {
self.id = UUID()
}
func isAvailable(for features: [AIServiceFeature]) -> Bool {
// This service only supports summarization
return features == [.summarization]
}
func generateSummary(for text: String, params: [String: Any] = [:]) async throws -> String? {
guard let url = URL(string: summarizerEndpoint) else {
return nil
}
let parameters: [String: Any] = [
"text": text,
"tone": params["tone"] as? String ?? "neutral",
"emotion": params["emotion"] as? String ?? "neutral",
"contentFocus": params["contentFocus"] as? String ?? "balanced",
"conciseness": params["conciseness"] as? String ?? "balanced",
"duration": 2,
"maxTokens": 125
]
return await withCheckedContinuation { continuation in
AF.request(url, method: .post, parameters: parameters, encoding: JSONEncoding.default)
.validate()
.responseDecodable(of: AISummaryResponse.self) { response in
switch response.result {
case .success(let value):
continuation.resume(returning: value.summary)
case .failure:
continuation.resume(returning: nil)
}
}
}
}
}
In my iOS 26 branch, there is also this service:
//
// LocalHamsterAIService.swift
// Hamster Soup Forever
//
// Created by Dan Murrell Jr on 6/27/25.
//
import Foundation
import FoundationModels
/// This service provides access to the on-device SystemLanguageModel for private, offline use.
@available(iOS 26, *)
class LocalHamsterAIService: AIServiceProtocol {
var id: UUID = UUID()
var name: String = "Local Hamster AI"
var description: String = "A local AI service that generates summaries."
var quality: AIServiceQuality = .medium
var isLocal = true // This service uses local models
var supportedFeatures: [AIServiceFeature] = [.summarization]
private let appSettings: AppSettings
init(appSettings: AppSettings) {
self.appSettings = appSettings
}
func isAvailable(for features: [AIServiceFeature]) -> Bool {
let model = SystemLanguageModel.default
return model.isAvailable
}
func generateSummary(for text: String, params: [String: Any]) async throws -> String? {
let session = getSession(for: .summarization)
let prompt = getSummaryPrompt(for: text, params: params)
let maxTokens = params["maxTokens"] as? Int ?? 500
let options = GenerationOptions(maximumResponseTokens: maxTokens)
if let summary = try await session?.respond(to: prompt, options: options).content {
return "Foundation: \(summary)"
} else {
return nil
}
}
// MARK: - Set up foundation models
private func getSession(for service: AIServiceFeature) -> LanguageModelSession? {
let model = SystemLanguageModel.default
guard model.isAvailable else {
return nil
}
switch service {
case .summarization:
let session = LanguageModelSession(model: model, instructions: summaryInstructions)
return session
default:
return nil
}
}
// MARK: - Configure foundation models
private func getSummaryPrompt(for text: String, params: [String: Any]) -> String {
guard text.isEmpty == false else {
return ""
}
let toneParam = params["tone"] as? String ?? appSettings.selectedTone.rawValue
let emotionParam = params["emotion"] as? String ?? appSettings.selectedEmotion.rawValue
let contentFocusParam = params["contentFocus"] as? String ?? appSettings.selectedContentFocus.rawValue
let concisenessParam = params["conciseness"] as? String ?? appSettings.selectedConciseness.rawValue
let anyParams = !toneParam.isEmpty || !emotionParam.isEmpty || !contentFocusParam.isEmpty || !concisenessParam.isEmpty
var prompt = ""
if anyParams {
prompt += "Summarize using the following parameters:\n"
if !toneParam.isEmpty {
prompt += "Tone: \(toneParam)\n"
}
if !emotionParam.isEmpty {
prompt += "Emotion: \(emotionParam)\n"
}
if !contentFocusParam.isEmpty {
prompt += "Content Focus: \(contentFocusParam)\n"
}
if !concisenessParam.isEmpty {
prompt += "Conciseness: \(concisenessParam)\n"
}
prompt += "\n\n"
}
prompt += "Summarize the following text:\n\n\(text)"
return prompt
}
private var summaryInstructions =
"""
You are an AI summarizer. Your task is to summarize the provided text based on the provided parameters. Focus on factual events, interactions, and strategies without embellishing or adding fictional elements.
The provided text will contain updates from a reality TV show. You MUST NOT make your summary sound like a TV episode recap. Instead, focus on summarizing key events, interactions, and strategies happening in real-time. Base your summary only on the information provided in the text.
If the updates mention fish or animals or FOTH (front of the house), that means that the production team is blocking the feeds and showing prerecorded content. Do not refer to fish, animals, or FOTH in your summary, other than to say production is blocking the feeds.
Emphasize tone and conciseness as primary guidelines, then content focus, and emotion the least. For content focus, only emphasize when houseguests are doing and talking about things that match the content focus preference. Provide a natural, flowing summary without referencing these instructions or your methodology.
IMPORTANT: DO NOT make up extra details or events that are not present in the text. Focus solely on summarizing the provided text without adding any fictional elements. If the source text is short or lacks sufficient detail, provide a concise summary without embellishing.
"""
}
Now, using this manager is simple. A summary is generated by whichever service is available.
// LocalHamsterAIService is marked @available(iOS 26, *), so it must be
// registered conditionally when the app still supports earlier versions
var services: [AIServiceProtocol] = [CloudHamsterAIService()]
if #available(iOS 26, *) {
services.insert(LocalHamsterAIService(appSettings: appSettings), at: 0)
}
let aiServiceManager = AIServiceManager(services: services)
@Published var summary: String = ""
...
do {
if let summary = try await aiServiceManager.generateSummary(
for: combinedText,
params: [
"tone": tone,
"emotion": emotion,
"contentFocus": contentFocus,
"conciseness": conciseness,
"duration": duration,
"maxTokens": maxTokens
]
) {
self.summary = summary
}
} catch {
self.summary = "Failed to generate summary"
}
Here’s a gist with the enums, protocol, and manager code, so you can implement and extend this yourself.
Tech notes
I make no guarantees that I’m following the absolute best practices, but this code works really well.
I use cloud functions to communicate with OpenAI. Putting your API calls directly in your app’s code is probably a bad idea unless you’re really smart about managing API keys, prompts, and parameters like max tokens. You don’t want to expose your API keys in public code (once your app is on the App Store, a hacker will eventually find them and blow up your usage costs), and updating prompts and parameters is much easier in the cloud than on devices.
Still, use account budget limits and monitor them in case you need to disable endpoints, rotate keys, or adjust your prompts and parameters (which I did a lot last year while dialing in my AI-generated summaries during the season).
You can also make it more efficient to reduce your costs. I also cache cloud results in a database. For any given source material being summarized, I cache a number of variations. If 100 users' apps request the same exact summary, only the first 5 actually generate a summary. The other 95 get a random, previously-generated summary, limiting my costs significantly. I’m planning to also upload locally-generated summaries, filling the cache even faster and reducing my OpenAI bill even further.
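One way to sketch that cache lookup (an illustration, not the app’s actual schema; the function name, the parameter set, and SHA-256 keying are my assumptions) is to derive a deterministic key from the source text plus the parameters that change the output, with a small variant index to allow a handful of alternates per source:

```swift
import CryptoKit
import Foundation

/// Builds a deterministic cache key from the source text and the parameters
/// that affect the generated summary. Identical requests hash to the same
/// key, so previously generated summaries can be served from the cache.
func summaryCacheKey(text: String, tone: String, conciseness: String, variant: Int) -> String {
    let material = [text, tone, conciseness, String(variant)].joined(separator: "|")
    let digest = SHA256.hash(data: Data(material.utf8))
    // Hex-encode the digest for use as a database key
    return digest.map { String(format: "%02x", $0) }.joined()
}

// Pick one of 5 cached variants at random; only a cache miss
// would trigger an actual generation call.
let key = summaryCacheKey(text: sourceText, tone: "neutral", conciseness: "balanced", variant: Int.random(in: 0..<5))
```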
For the local Hamster service, I’ll ship sensible defaults for the prompt and parameters, but use CloudKit for overrides (see my cloud AI experience from last year) so I can dial things in without having to update the app itself.
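A hedged sketch of what that CloudKit override might look like (the PromptOverride record type and its field names are assumptions, not the app’s actual schema): fetch an override from the public database, and fall back to the bundled default when none exists.

```swift
import CloudKit

/// Fetches an instructions override from CloudKit's public database,
/// falling back to the bundled default when no override record exists
/// or the fetch fails.
@available(iOS 15, *)
func fetchSummaryInstructions(bundledDefault: String) async -> String {
    let database = CKContainer.default().publicCloudDatabase
    let query = CKQuery(
        recordType: "PromptOverride",
        predicate: NSPredicate(format: "feature == %@", "summarization")
    )
    do {
        let (matchResults, _) = try await database.records(matching: query, resultsLimit: 1)
        if case .success(let record)? = matchResults.first?.1,
           let instructions = record["instructions"] as? String {
            return instructions
        }
    } catch {
        // Network or schema problems: just use the shipped default
    }
    return bundledDefault
}
```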
Apple’s Foundation Models SDK is AWESOME and easy to use. The very first time I tried generating a summary it worked, and I was so skeptical I actually prefixed the summary with “Foundation: ” to prove to myself that yes, the local model generated it (remember, I’m calling it through the service manager function, which doesn’t care which service is used). Summaries generate in 1-5 seconds on my M4 iPad Pro running iOS 26 beta 2 (the iOS simulator doesn’t have the model), depending on how much content I throw at it. It’s fast, and the quality of the summaries is much better than I had hoped.
I’ve experimented with on-device open source models and have been disappointed with the results for the smaller 0.6B and 1B models, and expecting a user to download a multi-gigabyte model for acceptable quality seems like a non-starter. A cloud model makes way more sense as long as you can manage the cost.
I’d love to hear if you’re mixing local and cloud AI services, or if you’ve built similar managers for your apps!