Introducing Aurora Toolkit: Simple AI Integration for IOS and Mac Projects

A highly realistic view of a coder codng with AI bathed in golden sunlight

Over the summer of 2024, I conducted an experiment to see if I could use Artificial Intelligence to do in a month what took several in previous years, which is to rewrite my long-running Big Brother superfan app to incorporate the latest technologies. I’ve been shipping this app since the App Store opened and every 2-3 years would rewrite it from scratch, to use the latest frameworks and explore fresh ideas for design, user experience, and content. This year, the big feature was adding AI and it seemed only natural to use AI extensively to do so – that was the experiment. The results blew my mind.

Not only was AI the productivity accelerator we’ve all been promised, but it invigorated and filled my well of ideas for what you can do with app development. As the show wound down for this year (Big Brother is a summer TV show), I had gained both a ton of experience and perspective developing with AI and ideas for how I could make this easier for others wanting to integrate it.

So even before Big Brother aired its finale, I already moved on to my next project, which became Aurora Toolkit, and the first major component of it, AuroraCore. AuroraCore is a Swift Package, and the foundational library of Aurora Toolkit.

I originally envisioned Aurora as an AI assistant, something that lived in the menu bar, or a widget, or some other quick-to-access space. AuroraCore was to be the foundation of the assistant app, but the more I developed my ideas for it, it I realized it was really a framework for any developer building any kind of app for iOS/Mac. So the AI assistant became a Swift Package for anyone to use instead.

My goals for the package:

  • Written in Swift with no external dependencies outside of what comes with Swift/iOS/MacOS, making it portable to other languages and platforms
  • Lightweight and fast
  • Support all the major AI platforms like those from OpenAI, Anthropic, and Google
  • Support open source models via Ollama, the leading platform for OSS AI models
  • Include support for multiple LLMs at once
  • Include some kind of workflow system where AI tasks could be chained together easily

When I use ChatGPT I treat it like Slack with access to multiple developers to collaborate with. I use multiple chat conversations, each fulfilling different roles. One of those conversations is solely for generating git commit descriptions. I gave it my preferred format for commit descriptions, and when I’m ready to make a commit, I simply paste in the diffs and ask, “Give me a commit description in my preferred format using these diffs”.

My initial use case for AuroraCore was to replicate that part of my workflow while developing my Big Brother app. I envisioned a system that watches my git folder for changes, detects updates, and uses a local llama model to generate commit descriptions in my ideal format—continuously. The goal was to streamline the process, so I could copy and paste the description whenever I was ready to commit. Ideally, there would even be a big button I could press to handle it all on demand.

I haven’t actually built that yet, but I built a framework that would make it possible. Also, ChatGPT significantly outperforms llama3 at writing good commit descriptions. I’ve got a lot more prompt work to do on that one, I’m afraid.

But approaching it with these goals and this particular use case helped me press forward down what I feel are some really productive paths. There are three major features of AuroraCore so far that I’m especially excited about and proud of.

Support for multiple Large Language Models at once

From the start, I wanted to use multiple active models, and a way to feed AI prompts to each. I envisioned mixing Anthropic Claude with OpenAI GPT, and one or more open source models via Ollama. The LLMManager in AuroraCore can support as many models as you want. Under the hood, they can all be the same claude, gpt-4o, llama3, gemma2, mistral, etc. models, but each can have its own setup, including token limits and system prompt. LLMManager can handle a whole team of models with ease, and understands which models have what token limits and even supported domains, and send appropriate requests to the appropriate models.

Automatic domain routing

Once I started exploring using multiple models each with different domains, I made an example using a small model to review a prompt and spit out the most appropriate domain that matched a list of domains I gave it. Then I had models set up for each of those domains, so that LLMManager would use the right model to respond. It became obvious that this shouldn’t be an example, but built in behavior in LLMManager, so first I added support for a model that was consulted first for domain routing. That led to creating a first class object LLMDomainRouter, that takes an LLM service model, a list of supported domains, and a customizable set of instructions with a reasonable default. When you route a request through the LLMManager, it will consult with the domain router, then send the request to the appropriate model. The domain router is as fast as whatever model you give it to work with, and you can essentially build your very own “mixture of experts” simply by setting up several models with high quality prompts tailored to different domains.

Declarative workflows with tasks and task groups

A robust workflow system was a key goal for this project, enabling developers to chain multiple tasks, including AI-based actions. Each task can produce outputs, which feed into subsequent tasks as inputs. Inspired by the Apple Shortcuts app (originally Workflow), I envisioned AuroraCore workflows providing similar flexibility for developers.

The initial version of Workflow was a simple system that took an array of WorkflowTask classes, and tied the input mappings together with another dictionary referencing different tasks by name. I quietly made AuroraCore public with that, but what I really wanted was a declarative system similar to SwiftUI. It took a few more days, but Workflow has been refactored to that declarative system, with an easier way to map values between tasks and task groups.

The TVScriptWorkflowExample in the repository demonstrates a workflow with tasks that fetch an RSS feed from Associated Press’s Technology news, parse the feed, limit it to the 10 most recent articles, fetch those articles, extract summaries, and then use AI to generate a TV news script. The script includes 2-3 anchors, studio directions, and that familiar local TV news banter we all know and love—or cringe at. It even wraps up on a lighter note, because even the most intense news programs like to end with something uplifting to make viewers smile. The entire process takes about 30-40 seconds to complete, and I’ve been amazed by how realistic the dozens of test runs have sounded.

What’s next

So that’s Aurora Toolkit so far. There are no actual tools in the kit yet, just this foundational library. Other than a simple client I wrote to test a lot of the functionality, no apps currently use this. My Big Brother app will certainly use it starting next year, but in the meantime, I’d love to see what other developers make of it. If someone can help add multimodal support or integrate small on-device LLMs to take advantage of all these neural cores our phones are packing these days—even better.

AuroraCore is backed by a lot of unit tests, but probably not enough. It feels solid, but it is still tagged as prerelease software for now so YMMV. If you try this Swift package out, I’d love to know what you think!

GitHub link: https://github.com/AuroraToolkit/AuroraCore