INTRODUCING AURORA TOOLKIT: SIMPLE AI INTEGRATION FOR IOS AND MAC PROJECTS
Over the summer of 2024, I conducted an experiment to see if I could use Artificial Intelligence to do in a month what took several in previous years, which is to rewrite my long-running Big Brother superfan app to incorporate the latest technologies. I’ve been shipping this app since the App Store opened and every 2-3 years would rewrite it from scratch, to use the latest frameworks and explore fresh ideas for design, user experience, and content. This year, the big feature was adding AI and it seemed only natural to use AI extensively to do so – that was the experiment. The results blew my mind.
Not only was AI the productivity accelerator we’ve all been promised, but it invigorated and filled my well of ideas for what you can do with app development. As the show wound down for this year (Big Brother is a summer TV show), I had gained both a ton of experience and perspective developing with AI and ideas for how I could make this easier for others wanting to integrate it.
So even before Big Brother aired its finale, I had already moved on to my next project, which became Aurora Toolkit, and the first major component of it, AuroraCore. AuroraCore is a Swift Package, and the foundational library of Aurora Toolkit.
I originally envisioned Aurora as an AI assistant, something that lived in the menu bar, or a widget, or some other quick-to-access space. AuroraCore was to be the foundation of the assistant app, but the more I developed my ideas for it, the more I realized it was really a framework for any developer building any kind of app for iOS/Mac. So the AI assistant became a Swift Package for anyone to use instead.
My goals for the package:
- Written in Swift with no external dependencies outside of what comes with Swift/iOS/macOS, making it portable to other languages and platforms
- Lightweight and fast
- Support all the major AI platforms like those from OpenAI, Anthropic, and Google
- Support open source models via Ollama, the leading platform for OSS AI models
- Include support for multiple LLMs at once
- Include some kind of workflow system where AI tasks could be chained together easily
When I use ChatGPT I treat it like Slack with access to multiple developers to collaborate with. I use multiple chat conversations, each fulfilling different roles. One of those conversations is solely for generating git commit descriptions. I gave it my preferred format for commit descriptions, and when I’m ready to make a commit, I simply paste in the diffs and ask, “Give me a commit description in my preferred format using these diffs”.
My initial use case for AuroraCore was to replicate that part of my workflow while developing my Big Brother app. I envisioned a system that watches my git folder for changes, detects updates, and uses a local llama model to generate commit descriptions in my ideal format—continuously. The goal was to streamline the process, so I could copy and paste the description whenever I was ready to commit. Ideally, there would even be a big button I could press to handle it all on demand.
I haven’t actually built that yet, but I built a framework that would make it possible. Also, ChatGPT significantly outperforms llama3 at writing good commit descriptions. I’ve got a lot more prompt work to do on that one, I’m afraid.
But approaching it with these goals and this particular use case helped me press forward down what I feel are some really productive paths. There are three major features of AuroraCore so far that I’m especially excited about and proud of.
Support for multiple Large Language Models at once
From the start, I wanted to use multiple active models, and a way to feed AI prompts to each. I envisioned mixing Anthropic Claude with OpenAI GPT, and one or more open source models via Ollama. The LLMManager in AuroraCore can support as many models as you want. Under the hood, they can all be the same claude, gpt-4o, llama3, gemma2, mistral, etc. models, but each can have its own setup, including token limits and system prompt. LLMManager can handle a whole team of models with ease: it understands which models have what token limits and even supported domains, and sends appropriate requests to the appropriate models.
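To make the idea concrete, here is a minimal sketch of a registry that holds several differently configured setups and picks one by domain and token limit. The `ModelConfig` and `ModelRegistry` types are hypothetical, written for illustration only; they are not AuroraCore's actual API, and the selection rule (smallest sufficient token limit) is just one reasonable assumption.

```swift
import Foundation

// Hypothetical types for illustration; these are not AuroraCore's actual API.
struct ModelConfig {
    let name: String          // registry key for this particular setup
    let vendorModel: String   // underlying model, e.g. "llama3" or "gpt-4o"
    let maxTokens: Int        // per-request token limit for this setup
    let systemPrompt: String
    let domains: Set<String>  // topics this setup is tuned for
}

struct ModelRegistry {
    private(set) var models: [String: ModelConfig] = [:]

    mutating func register(_ config: ModelConfig) {
        models[config.name] = config
    }

    /// Picks a setup that supports `domain` and can fit `estimatedTokens`,
    /// preferring the smallest token limit that is still large enough.
    func model(for domain: String, estimatedTokens: Int) -> ModelConfig? {
        models.values
            .filter { $0.domains.contains(domain) && $0.maxTokens >= estimatedTokens }
            .min { $0.maxTokens < $1.maxTokens }
    }
}

// Two setups share the same underlying model but differ in limits and prompts.
var registry = ModelRegistry()
registry.register(ModelConfig(name: "sports-small", vendorModel: "llama3",
                              maxTokens: 2_048,
                              systemPrompt: "You are a concise sports expert.",
                              domains: ["sports"]))
registry.register(ModelConfig(name: "sports-large", vendorModel: "llama3",
                              maxTokens: 8_192,
                              systemPrompt: "You are a thorough sports analyst.",
                              domains: ["sports"]))
registry.register(ModelConfig(name: "code", vendorModel: "gpt-4o",
                              maxTokens: 4_096,
                              systemPrompt: "You are a senior Swift engineer.",
                              domains: ["programming"]))
```

Calling `registry.model(for: "sports", estimatedTokens: 4_000)` would skip the small setup because the request does not fit its limit, which is the kind of bookkeeping the manager takes off your hands.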
Automatic domain routing
Once I started exploring using multiple models each with different domains, I made an example using a small model to review a prompt and spit out the most appropriate domain that matched a list of domains I gave it. Then I had models set up for each of those domains, so that LLMManager would use the right model to respond. It became obvious that this shouldn’t be an example, but built-in behavior in LLMManager, so first I added support for a model that was consulted first for domain routing. That led to creating a first-class object, LLMDomainRouter, that takes an LLM service model, a list of supported domains, and a customizable set of instructions with a reasonable default. When you route a request through the LLMManager, it will consult with the domain router, then send the request to the appropriate model. The domain router is as fast as whatever model you give it to work with, and you can essentially build your very own “mixture of experts” simply by setting up several models with high-quality prompts tailored to different domains.
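The routing flow described above can be sketched roughly as follows. Everything here is an assumption made for illustration: `DomainRouter`, `ExpertTeam`, and the `classify` closure are hypothetical names, and a keyword stub stands in for the small model so the example runs offline. AuroraCore's real LLMDomainRouter may look quite different.

```swift
import Foundation

// Hypothetical sketch; names here are illustrative, not AuroraCore's API.
struct DomainRouter {
    let supportedDomains: [String]
    let instructions: String
    /// Stand-in for a call to a small, fast model: takes the full prompt
    /// and returns the model's raw text reply.
    let classify: (String) -> String

    /// Returns the supported domain named in the reply, or nil when the
    /// reply does not match any supported domain.
    func route(_ request: String) -> String? {
        let prompt = """
        \(instructions)
        Domains: \(supportedDomains.joined(separator: ", "))
        Request: \(request)
        """
        let reply = classify(prompt)
            .trimmingCharacters(in: .whitespacesAndNewlines)
            .lowercased()
        return supportedDomains.first { $0.lowercased() == reply }
    }
}

/// Maps routed domains to expert model names, with a general fallback.
struct ExpertTeam {
    let router: DomainRouter
    let expertsByDomain: [String: String]
    let fallbackModel: String

    func modelName(for request: String) -> String {
        guard let domain = router.route(request),
              let expert = expertsByDomain[domain] else { return fallbackModel }
        return expert
    }
}

// A keyword stub plays the role of the routing model in this example.
let router = DomainRouter(
    supportedDomains: ["sports", "programming", "cooking"],
    instructions: "Reply with exactly one domain from the list that best fits the request.",
    classify: { prompt in prompt.contains("Swift") ? " Programming\n" : "unknown" }
)
let team = ExpertTeam(
    router: router,
    expertsByDomain: ["sports": "sports-expert", "programming": "code-expert"],
    fallbackModel: "general"
)
```

Normalizing the reply before matching matters in practice, since small models often add whitespace or change capitalization; anything that still fails to match falls through to the general model rather than failing the request.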
Declarative workflows with tasks and task groups
A robust workflow system was a key goal for this project, enabling developers to chain multiple tasks, including AI-based actions. Each task can produce outputs, which feed into subsequent tasks as inputs. Inspired by the Apple Shortcuts app (originally Workflow), I envisioned AuroraCore workflows providing similar flexibility for developers.
The initial version of Workflow was a simple system that took an array of WorkflowTask classes, and tied the input mappings together with another dictionary referencing different tasks by name. I quietly made AuroraCore public with that, but what I really wanted was a declarative system similar to SwiftUI. It took a few more days, but Workflow has been refactored to that declarative system, with an easier way to map values between tasks and task groups.
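For readers who have not built a SwiftUI-style declarative API before, the usual Swift mechanism is a result builder. The sketch below shows the general technique with made-up `Step`, `StepBuilder`, and `Pipeline` types; it is not AuroraCore's Workflow implementation, and the shared-dictionary way of passing outputs to later steps is a simplifying assumption.

```swift
import Foundation

// Hypothetical sketch of a result-builder workflow; not AuroraCore's types.
struct Step {
    let name: String
    /// Receives the outputs gathered so far and returns this step's outputs.
    let run: ([String: String]) -> [String: String]
}

@resultBuilder
enum StepBuilder {
    // Collects the steps listed in the builder closure into an array.
    static func buildBlock(_ steps: Step...) -> [Step] { steps }
}

struct Pipeline {
    let steps: [Step]

    init(@StepBuilder _ content: () -> [Step]) {
        steps = content()
    }

    /// Runs steps in order; each step's outputs are merged into a shared
    /// dictionary so later steps can read them as inputs.
    func run() -> [String: String] {
        var outputs: [String: String] = [:]
        for step in steps {
            let produced = step.run(outputs)
            outputs.merge(produced) { _, new in new }
        }
        return outputs
    }
}

// Steps are declared in order, much like views in a SwiftUI body.
let pipeline = Pipeline {
    Step(name: "FetchHeadlines") { _ in
        ["headlines": "Chip maker unveils new phone processor"]
    }
    Step(name: "WriteScript") { inputs in
        let headline = inputs["headlines"] ?? "no news"
        return ["script": "ANCHOR: Tonight, \(headline)."]
    }
}
let results = pipeline.run()
```

In a real workflow the second step would call a model instead of formatting a string, but the declaration would read the same way.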
The TVScriptWorkflowExample in the repository demonstrates a workflow with tasks that fetch an RSS feed from Associated Press’s Technology news, parse the feed, limit it to the 10 most recent articles, fetch those articles, extract summaries, and then use AI to generate a TV news script. The script includes 2-3 anchors, studio directions, and that familiar local TV news banter we all know and love (or cringe at). It even wraps up on a lighter note, because even the most intense news programs like to end with something uplifting to make viewers smile. The entire process takes about 30-40 seconds to complete, and I’ve been amazed by how realistic the dozens of test runs have sounded.
What’s next
So that’s Aurora Toolkit so far. There are no actual tools in the kit yet, just this foundational library. Other than a simple client I wrote to test a lot of the functionality, no apps currently use this. My Big Brother app will certainly use it starting next year, but in the meantime, I’d love to see what other developers make of it. If someone can help add multimodal support or integrate small on-device LLMs to take advantage of all these neural cores our phones are packing these days—even better.
AuroraCore is backed by a lot of unit tests, but probably not enough. It feels solid, but it is still tagged as prerelease software for now so YMMV. If you try this Swift package out, I’d love to know what you think!
GitHub link: https://github.com/AuroraToolkit/AuroraCore
FROM IDEA TO SHIPPED FEATURE IN A FLASH: HOW THE SAUSAGE GETS MADE WITH AN AI ASSISTANT
I’ve written at length about working with AI (ChatGPT-4o to be precise), but I haven’t shown the process in action yet. This will give you a sense of how quickly you can go from idea for a feature to shipped code. But first, I want to rewind a week.
Last Thursday night was the first eviction episode of the season on Big Brother 26. Eviction episodes are game-changing and live, so they always garner a lot of interest from viewers. My app’s usage spikes the most during eviction episodes and the aftermath, and last Thursday was no exception. Because it was the first one of the season and under this new app, there was a detail I had forgotten. Spoilers.
I, like probably many others, watch shows delayed on a DVR (in my case YouTube TV), including this one. Sometimes I start late and skip commercials to catch up, or sometimes I’m just late to the party altogether. Such was the case here, and I quickly checked the app to make sure it wasn’t having any issues on such a heavy-usage night, and immediately saw the spoilers for who got evicted in the episode. Dang. This issue is magnified for West Coast viewers, because even though Big Brother is filmed in Los Angeles, they get the show delayed by three hours because it airs live on East Coast time.
So after I watched the show, I told ChatGPT my problem and how I wanted to fix it, and by Saturday we had two new features live on the App Store: Hide Spoilers, and Hide Spoilers During Eviction. These features automatically blur the Hamster AI summary, and if During Eviction is on, it happens automatically from 8 PM EST to 9 PM PST. For anyone using the app, I recommend turning that feature on if you worry about spoilers. You can always reveal them by tapping the show/hide icon above.
ChatGPT suggested code to handle this, we iterated back and forth to dial it in, and by Friday night I’d submitted it. A nice, convenient feature to keep you from accidentally seeing something you wanted to wait for.
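The app's actual implementation isn't shown here, but the time-window check behind a feature like Hide Spoilers During Eviction can be sketched in a few lines. This version assumes eviction night is Thursday, and it uses the IANA zone identifiers so that daylight saving time is handled by the system; the function names and the `evictionWeekday` parameter are hypothetical.

```swift
import Foundation

/// Builds a Date from wall-clock components in a specific time zone.
func makeDate(in zone: TimeZone, year: Int, month: Int, day: Int,
              hour: Int, minute: Int = 0) -> Date? {
    var calendar = Calendar(identifier: .gregorian)
    calendar.timeZone = zone
    return calendar.date(from: DateComponents(year: year, month: month, day: day,
                                              hour: hour, minute: minute))
}

/// True when `now` falls between 8 PM Eastern and 9 PM Pacific on eviction
/// night. Assumption: evictions air on Thursdays (weekday 5, where Sunday = 1).
func isInEvictionSpoilerWindow(_ now: Date, evictionWeekday: Int = 5) -> Bool {
    guard let eastern = TimeZone(identifier: "America/New_York"),
          let pacific = TimeZone(identifier: "America/Los_Angeles") else { return false }

    var easternCalendar = Calendar(identifier: .gregorian)
    easternCalendar.timeZone = eastern

    // The whole window falls on the Eastern calendar day of the broadcast,
    // so that day anchors both ends of the window.
    let day = easternCalendar.dateComponents([.year, .month, .day, .weekday], from: now)
    guard day.weekday == evictionWeekday,
          let year = day.year, let month = day.month, let dayOfMonth = day.day,
          let start = makeDate(in: eastern, year: year, month: month, day: dayOfMonth, hour: 20),
          let end = makeDate(in: pacific, year: year, month: month, day: dayOfMonth, hour: 21)
    else { return false }

    return now >= start && now < end
}
```

A view could then blur the summary whenever the user has the setting on and `isInEvictionSpoilerWindow(Date())` returns true, while still letting the show/hide icon override it.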
That brings us to this morning. Last night was another live eviction, and the episode featured one player a bit more than usual, T’kor. Her name gives the app problems because of the apostrophe, and the live feeders' various ways to write it. The end result was there were zero updates for her in the newly released Player Profile feature, because the search couldn’t find results using the proper apostrophe, and CloudKit changes it to a dash.
So while getting ready for work this morning, I paused a moment to run this problem by ChatGPT. Here’s how features like this go from idea to shipped in almost no time flat:
Me:
Ok that update is released to the App Store now.
I noticed that T’kor isn’t showing any latest updates on her profile page. That name has given me problems because of the apostrophe.
What do you think about adding search terms as a string array to the Houseguest record type, and using them for the “Latest updates for (houseguest)” feature on player profiles.
We could use that to improve the search and highlight feature, as well as possibly for another feature I want to add, which is to summarize this player’s game so far from the available updates.
This way I can add multiple variations on the player’s name, and account for nicknames, shorthand, and misspellings by the live feeders. And also fix searching for T’kor’s unusual name.
ChatGPT-4o:
Adding a search terms array to the Houseguest record type is a great idea to improve the functionality and accuracy of your features, especially for handling names with special characters like T’kor. Here’s how this approach can enhance your app’s features:
Benefits of Using a Search Terms Array
- Improved Search Accuracy:
  - Variation Handling: You can account for variations in name spelling, nicknames, and common misspellings, ensuring that all relevant updates are captured.
  - Special Characters: By specifying multiple variations, you can avoid issues caused by special characters or unusual name formats.
- Enhanced Highlighting:
  - Consistent Highlighting: The search terms array allows you to highlight the correct variations in the updates, making it easier for users to recognize references to a houseguest.
- Customizable Summarization:
  - Targeted Summaries: By using search terms, you can more accurately summarize a player’s game by capturing all mentions, even when they are referenced differently.
- Future-Proofing:
  - Scalability: As your app grows, this system will allow you to easily adapt to new variations or nicknames, providing flexibility for future updates and expansions.
Implementation Steps
- Update the Houseguest Record Type:
  - Add a new property searchTerms as an array of strings to your Houseguest model. This will store various search terms associated with each player.

    struct Houseguest {
        var id: CKRecord.ID
        var name: String
        var profileImageURL: URL?
        var age: Int?
        var hometown: String?
        var currentCity: String?
        var occupation: String?
        var season: Int?
        var searchTerms: [String]
    }

- Update Your CloudKit Fetching Logic:
  - Modify the logic to fetch and save the searchTerms from CloudKit. Make sure your CloudKit schema includes this field.

    func fetchHouseguests() {
        let query = CKQuery(recordType: "Houseguest", predicate: NSPredicate(value: true))
        publicDB.fetch(withQuery: query) { result in
            switch result {
            case .success(let records):
                DispatchQueue.main.async {
                    self.houseguests = records.map { record in
                        Houseguest(
                            id: record.recordID,
                            name: record["name"] as? String ?? "",
                            profileImageURL: (record["profile_image"] as? CKAsset)?.fileURL,
                            age: record["age"] as? Int,
                            hometown: record["hometown"] as? String,
                            currentCity: record["current_city"] as? String,
                            occupation: record["occupation"] as? String,
                            season: record["season"] as? Int,
                            searchTerms: record["searchTerms"] as? [String] ?? [record["name"] as? String ?? ""]
                        )
                    }
                }
            case .failure(let error):
                print("Error fetching records: \(error)")
            }
        }
    }

- Modify the Feed Filtering Logic:
  - Use the searchTerms array to filter feed updates.

    private func filterUpdates(for houseguest: Houseguest) -> [FeedItem] {
        return feed.items.filter { item in
            houseguest.searchTerms.contains { term in
                item.title.localizedCaseInsensitiveContains(term) ||
                item.body.localizedCaseInsensitiveContains(term)
            }
        }
    }

- Update Highlighting Logic:
  - Ensure your text highlighting logic takes the searchTerms into account to highlight all variations.

    private func highlightedText(for text: String, with searchTerms: [String]) -> AttributedString {
        var attributedString = AttributedString(text)
        for term in searchTerms {
            if let range = attributedString.range(of: term, options: .caseInsensitive) {
                attributedString[range].foregroundColor = .yellow
            }
        }
        return attributedString
    }

- Summarization Feature:
  - Use the search terms when generating AI summaries to ensure all relevant mentions are captured.

    private func generateSummary(for houseguest: Houseguest) {
        let combinedText = filterUpdates(for: houseguest)
            .map { "\($0.title) \($0.body)" }
            .joined(separator: " ")

        aiSummarizer.summarize(
            text: combinedText,
            tone: appSettings.selectedTone,
            emotion: appSettings.selectedEmotion,
            contentFocus: appSettings.selectedContentFocus,
            conciseness: appSettings.selectedConciseness
        ) { result in
            switch result {
            case .success(let summary):
                self.summary = summary
            case .failure(let error):
                print("Failed to generate summary: \(error)")
            }
        }
    }

Considerations
- CloudKit Schema: Ensure your CloudKit record schema is updated to include the searchTerms field and that you have migrated existing records as needed.
- UI Updates: Update any relevant UI components to reflect these changes, particularly where houseguest information is displayed or interacted with.
- Testing: Thoroughly test the implementation to ensure it handles various edge cases, especially for names with special characters or multiple variations.
By implementing these changes, you will improve the robustness and user experience of your app, allowing for more accurate and comprehensive search functionality and content presentation.
Now right now, I don’t know that this code will actually work yet. But it’s clearly on the right track, and if it doesn’t work, I’m positive a few minutes back and forth will get it across the finish line. We’ll find out after work today, when I start adding these features to the app.
So that in a nutshell is a lot of how this sausage has been made – I tell ChatGPT what I want, it hands me some code and discussion back, I either plug it in as-is or I ask for changes, and we go back and forth honing this new feature. It could take ten minutes or it could take a few hours, but to me the big win is it didn’t take ten days. Or even two.
This could easily be looked at as a case of “veteran coder is too lazy to spend the time writing the code”. To that I say, hell yes, you got it exactly right – except for the “lazy” bit. I’d swap that out for “smart” instead. Why should I spend time writing all the code when I could instead describe what I want, and evaluate and adjust minor bits instead?
The fact is, I’m still using my software engineering skills, but I’m also able to seriously exercise my creativity, and react in nearly real-time to issues that arise and address them swiftly for the app’s users' benefit. After all, the primary stakeholders of mobile apps are always the users themselves. If that’s not the rule, it should be.
I think working like this is on the cutting edge of AI-augmented software engineering, and that is far more exciting to me today than sitting down and writing a ton of code.
Update from about an hour after work:
The code from ChatGPT was in the ballpark, but needed a little work, because I didn’t want to have to set search terms for every houseguest. So we generate some default terms for all houseguests to combine with what comes from CloudKit, and voila. Coming soon to an app near you!
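The post doesn't show the final code, so the following is only a sketch of one way to generate default terms and merge them with whatever comes from CloudKit. The function names are hypothetical, and the particular defaults (full name, first name, and common apostrophe variants) are assumptions based on the T’kor problem described earlier.

```swift
import Foundation

/// Generates default search terms from a display name: the full name, the
/// first name, and variants of any apostrophe (curly, straight, dash, none).
func defaultSearchTerms(for name: String) -> [String] {
    let firstName = name.split(separator: " ").first.map(String.init) ?? name
    let apostrophes = ["’", "'"]
    let replacements = ["’", "'", "-", ""]

    var terms: [String] = []
    for base in [name, firstName] {
        terms.append(base)
        for apostrophe in apostrophes where base.contains(apostrophe) {
            for replacement in replacements {
                terms.append(base.replacingOccurrences(of: apostrophe, with: replacement))
            }
        }
    }
    return terms
}

/// Combines hand-entered CloudKit terms with the generated defaults, dropping
/// empty strings and case-insensitive duplicates while keeping order.
func mergedSearchTerms(name: String, cloudKitTerms: [String]) -> [String] {
    var seen = Set<String>()
    return (cloudKitTerms + defaultSearchTerms(for: name)).filter { term in
        !term.isEmpty && seen.insert(term.lowercased()).inserted
    }
}

// Only nicknames need to be stored in CloudKit; the variants come for free.
let tkorTerms = mergedSearchTerms(name: "T’kor", cloudKitTerms: ["Tkor", "TK"])
```

With this shape, a houseguest with no searchTerms in CloudKit still gets sensible matching, and an update written as "T-kor" or "T'kor" is found without anyone having to anticipate that spelling by hand.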
CAN A SEASONED SOFTWARE ENGINEER AND AI WRITE A FULLY-FEATURED, MATURE APP IN A MONTH?
Exactly one month ago, on June 27, I decided to try an experiment – could I rewrite my long-term hobby app with AI? My app Hamster Soup is an app for superfans of the TV show Big Brother, which was set to begin around July 17. That gave me as much as three weeks to get something going, even if it wasn’t a complete release or very robust, in time for the show’s premiere. Initially, I thought I might get 70-80 percent of the way there, and gave myself until July 15 to ship the app to the App Store.
Oh, how wrong I was!
As it turned out, on July 8 – barely a week and a half – I had my app “shippable”, by my self-imposed standards. That gave me plenty of time to add a feature or two before my self-imposed July 15 date. Again, wrong. On July 11 I had added more new features than I thought, and on July 12 I released Hamster Soup 2.0.0 (which really is about the 12th new version of this app over the last 15 years).
That was two weeks ago, and since then I have shipped 10 versions, nearly all of which added one or more significant features. One was mainly a bug fix for a timezone issue. That’s shipping a significant update nearly every single day since launch. Even with a full-time job, I’ve managed to keep up a routine where I think of an idea and build 75% or more of it in the morning before starting my job, mess around with it a little in the evening, get it across the finish line, submit it before going to bed, and wake up to it approved and released the next morning.
It has been a remarkable two weeks working this way, as a kind of super Software Engineer/Product Director mind-melding with ChatGPT. Since I’ve been using the same conversations all this time, the Star Trek reference makes a lot of sense as ChatGPT and I are so dialed in we know how each other thinks and rarely have to explain what we mean to each other.
This is especially apparent when I drop a screenshot into the chat without an explanation and it just knows what I mean or want to show it. That was an experiment the first time I tried it (“I wonder if it can figure out what I want it to address”), but now it’s just the way I work with the tool. I drop screenshots from the app, compiler errors in Xcode, deprecation warnings, and more. It just knows what I want and a second later starts addressing it.
Not only have I used AI to co-develop this app, about a week ago I introduced the first simple AI feature to the product, and have since layered more and more functionality on top of it. Now there is a basic, free AI-enabled experience and an Enhanced Hamster AI in-app purchase that enables user customization, deeper functionality, and a smart summary feature that provides a unique perspective to every user, fine-tuned to their preferences. And I don’t have to write a word of it myself. (Except for the Gen AI prompts, more on that in a moment.) The Enhanced Hamster AI feature is a platform for me to build upon without giving away too much AI for free.
The AI features aren’t free. Because iOS 18 and built-in models aren't available until the fall, I’m using OpenAI to provide AI-based features. Originally it was based on gpt-3.5-turbo, the cheapest model at launch, but shortly afterwards gpt-4o-mini was released as a drop-in replacement. Switching to gpt-4o-mini halved my costs instantly, while providing better performance and quality.
Speaking of working with OpenAI, I knew that putting the AI API code directly in the app was a non-starter. Not only does the App Review process make it difficult to update when you need to change something, but embedding your API key in an app on someone else’s device is inviting significant problems. I knew the right way to deal with it, but asked ChatGPT for its opinion. It gave me a few different options, and in about 5 minutes not only had we picked the best implementation as a cloud function, but we’d already written a working prototype that only needed minor tweaks to turn into the initial production implementation.
As a cloud function, not only are you able to quickly deploy changes, but you can also lock it down to minimize the amount of mischief a bad actor can do on your dime. You have precise control over your costs, and can adjust in real-time, unlike a mobile app. Over time, we’ve iterated and improved the implementation, added additional functions for analytics and other useful features, and added algorithms to maximize the variety and customizability while minimizing the costs.
For example, I noticed that most of the time, Hamster AI referred to the happenings in the Big Brother house as “episodes”. Which makes sense, since it’s based on a TV show. But these are the live feeds, which is just houseguest time passing by in front of dozens of cameras. After reading so many “On the latest episode, …” updates, I tweaked the various prompts (there are several, for more variety) to avoid that kind of phrasing.
For another, I noticed that as a tense argument broke out the first Saturday morning after nominations, the live feeders documenting it went into a lot more detail in their descriptions, blowing past the token budget I was imposing on my functions. It took about 30 minutes to perfect an algorithm that balances input to output to both cover what was happening, and provide an effective summary to the user who had only a few seconds at a time to stay up to date. We fixed that issue in real time during the argument and its aftermath.
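The algorithm itself isn't shown in this post (and the production version lives in a cloud function), but the general shape of a token budget can be sketched as follows. This is an assumption-laden illustration: the function names are made up, the four-characters-per-token estimate is only a rough heuristic for English text, and keeping the newest updates first is one policy among several.

```swift
import Foundation

/// Rough heuristic: about four characters per token for English text.
/// A real tokenizer would give different counts.
func estimatedTokens(_ text: String) -> Int {
    max(1, (text.count + 3) / 4)
}

/// Keeps the newest updates that fit in the input share of the budget.
/// `updates` is assumed to be ordered newest first; `reservedForOutput`
/// is the number of tokens held back for the generated summary.
func trimToBudget(_ updates: [String], totalBudget: Int, reservedForOutput: Int) -> [String] {
    let inputBudget = max(0, totalBudget - reservedForOutput)
    var used = 0
    var kept: [String] = []
    for update in updates {
        let cost = estimatedTokens(update)
        if used + cost > inputBudget { break }  // stop at the first update that won't fit
        used += cost
        kept.append(update)
    }
    return kept
}
```

Reserving output tokens up front is what keeps a burst of unusually long updates from crowding out the summary: the input shrinks to fit, and the response always has room.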
One funny incident occurred as I was developing a new Enhanced AI customization feature, letting the user dial in Hamster AI based on preferences on tone, emotion, content focus, and conciseness. I found that if you set a content focus on Drama instead of balanced, Gen AI did what it does, and spun out a wild tale of intrigue, suspicion, and backstabbing – which was completely made up and inferred from the most minor detail. Big Brother will block the feeds occasionally for production reasons, and historically they would play theme music and show the front of the house, the fishtank, hamsters, and nowadays they show live feeds from local animal shelters to encourage people to adopt pets. Hamster AI interpreted the mention of animals as if they were unleashed in the house, causing chaos and arguments as houseguests picked sides and fought each other for control. Gen AI, you so crazy! Fixed with some light prompt work.
A month in, I have a brand new, but pretty mature app, and a platform to continue innovating upon. But also I have a lot of experience co-developing a project with Artificial Intelligence. And it’s given me a lot of ideas for my next project. More on that in a future post.
For my personal enjoyment, the best things about this are:
- My technical expertise remains highly relevant and required: I’m constantly checking ChatGPT’s work, modifying it, and making precise suggestions for improvements.
- I get to concentrate mainly on creation and idea-generation, and leave the bulk of the basic coding work to ChatGPT.
- I work with a virtual partner that knows what I’m thinking without overexplaining most of the time, and infinite patience when I give more and more detailed requirements. It writes the tenth and final solution just as dutifully as the first attempt.
- I finally have a proper iPad app.
- I have written an app that will run “forever”, with almost no direct involvement from me in generating its content. I set a few URLs in its CloudKit database, and that’s it. Well, I still have to pay the bill for the annual Apple Developer fee, cloud functions, and OpenAI API usage. (Until iOS 18 and beyond, when we can do all of this on device.)
- Not having to be responsible for fresh content regularly has rebooted my fandom for Big Brother. I keep the live feeds on in the background now, just like the old days. Expect the unexpected! 😂
The Numbers, T-plus 30 days
- 1 Month of development
- Time to initial 2.0.0 release - 14 days
- Time to 2.0.9 Release (10th release, 9th feature release) - 14 days
- 6,586 Lines of (apparently) bug-free Swift code
- 112 Detailed git commits
- 868 Lines of unit tests (but definitely not 100% coverage)
- 281 Lines of JavaScript code (cloud functions)
- 3 ChatGPT conversations (2 for app development, 1 for cloud functions)
- 1 Maxed-out ChatGPT conversation (hit the limit of the 128,000 token context)
- 300+ daily users and rising (organic, I haven’t started advertising yet)
- 2,225,696 OpenAI API tokens used and counting
JULY 15, MY SELF-IMPOSED SHIP DATE - HOW DID I DO?
“Real artists ship.”
When Steve Jobs said that phrase, he was talking about literal artists like Picasso and Matisse, but his point extended to anyone being creative. For someone developing their creative ideas, whether a painting, sketch, or software, Jobs felt the whole point was not to keep their creations to themselves, but to share them with others.
Since June 27, I have conducted an AI experiment born out of apathy, using my 15-year-long side project, which is an app for superfans of the TV show Big Brother. When I first created this app in 2008, I was as big a fan of the TV show as anyone could be. The iPhone had launched the year before, and the App Store opened that summer. As a senior software engineer at Webmaster, Inc, I was working on our mobile app for DriveShare, an enterprise-grade file encryption product similar to Dropbox. Our software made it possible to securely access your files from anywhere, including your mobile phone – which was a novelty at that time.
That was cool, but when I looked around for an app to keep up with Big Brother while I was working, pursuing my college degree, or coaching one of my kids’ little league teams, there wasn’t an app for that. Like so many software ideas, Hamster Soup came about because its creator needed it. The name was a mishmash of the “hamsters” nickname we superfans had for the houseguests, and my fictional, long-time personal software brand Mutant Soup. (With influence from Talk Soup, which was still big at the time, probably.)
That first version showed the cast bios, live updates from JokersUpdates, and a kind of newsletter I started writing called The Daily Dish.
Who knew you could pack so much in a 3.5” screen! It was like a tiny little tabloid in your pocket, and I loved it – so did a pretty large number of users in those early App Store days. I sold this app for $1.99 – a bargain given that I was doing a ton of work writing those daily dishes, which could sometimes go as much as 2,000 words! Along with just the general upkeep of it all.
That first version synced to an xml file on my server that contained everything but the live updates from JokersUpdates, which is like a micro-blog written by the live feeders who watch the online feeds and take shifts writing down the details. Jokers gave me a special data feed in exchange for promotion and some traffic. All these years later, the Jokers Updates feed still anchors the app. There’s nothing else like it out there.
Every year I would add a little something new, and every two or three I rewrote the app completely from scratch. It’s fun to go back through these old screenshots and see how the app evolved over time. But it also kind of stole my excitement for the game, little by little each year too. It became too much like a job, and writing those articles would just burn me out.
This year should have been a rewrite year, as it was overdue, but I just wasn’t much into it. Tiny little secret, I didn’t watch a single minute of Big Brother 25. Not an episode. Not the live feeds, either. I know Jag won, but I couldn’t tell you the first thing about who he is or how he did it. It’s just a mystery to me.
I’ve always used this app to explore ideas in technology and design that were interesting to me, and right now, like many people I am all about Artificial Intelligence. It’s a year too soon to incorporate AI into this app (* maybe), as those features won’t ship to the public until this fall. But I thought maybe a neat idea would be to see how much of this rewrite I could do with AI instead. Turns out, quite a lot!
I started on June 27, and gave myself a July 15 ship date (today) for whatever I had, because the new season premieres on July 17. My goal was to let ChatGPT write as much of the app as possible – all of it if it could (it couldn’t), and just guide it along. By the time it was shippable, I estimate it wrote at least 90%, but not more than 95%. I had to do 5-10% of direct writing, along with a lot of my own expertise to correct and guide it to the solutions for ideas.
However, on July 8, less than two weeks after I started this experiment, we had a shippable iPhone and iPad app! That gave me a whole week to try to add more features, and I ended up shipping the first build with even more features on July 11. Since then, I’ve shipped two more updates, and have one queued up to go out tomorrow. These aren’t bug fixes; each build adds a few more features.
The speed with which I’ve been able to use AI to go from an idea for a feature to completing it with a git commit has surprised even me, an AI optimist. And I’m having more fun creating this app than I’ve had in years!
* I might actually add real artificial intelligence features into the app, but it will likely require some kind of monetization to cover the costs. Either ad-supported (blech!) or a tip-jar, both of which I’ve used in the past. If you see a tip-jar show up in an upcoming build of the app, look for an AI feature or two to arrive soon.
WAIT, DID WE JUST FINISH THIS THING??
12 days.
12 days, working an hour or two in the mornings and several over the weekend. That’s all it’s taken to build my app from scratch with the help of ChatGPT. When we got the iPad version dialed in just right a few minutes ago, I couldn’t believe it. A proper iPad app with the sidebar, too. (My app has always been iPad-able, but in that lazy, “just a blown up phone app” way.)
While I thought I was going to slide in just in time for my self-imposed July 15 app submission date, here I am on July 8 with nothing left to do for the MVP but add a launch screen and update the icon. I am beside myself (but in a good way)!
However, it wasn’t all roses.
Although I’m obviously very impressed with how well this worked out and probably did no more than 5-10% of the development work myself, I can confidently say these tools aren’t replacing software engineers any time soon. A person with little experience could not pull this off.
There were very few instances where I could take what ChatGPT gave me verbatim and get the results I wanted. Usually I got there only after several back-and-forth exchanges where I had to repeatedly coax the right code out of the model. I caught it making mistakes, using deprecated code, and suggesting implementations that were just plain wrong. I often corrected code and pasted it back into ChatGPT to get it to write something that worked. Sometimes it was shockingly good – and sometimes it just needed to collaborate with an experienced engineer.
But, the MVP is good to go, a week ahead of schedule. I’m going to finish up these assets, submit the app for review, and start thinking about other features to work on next.
HUMAN-DRIVEN AI DEVELOPMENT EXPERIMENT CONTINUES...
"Hey how well do you know mobile app development, particularly for iOS using Swift?”
That’s how my experiment started, and I’ve probably asked ChatGPT or a local LLM a version of this question a few dozen times or more, just to get an answer or an idea. This time around, I was considering the idea of guiding AI through development of a brand new version of my long-running mobile app.
It’s been going well for about a week and a half, but I think it’s super interesting that I can scroll all the way back up and see the whole conversation. So far it entirely fits within the context window, which means AI can reference not just earlier conversation, but earlier iterations of code and different files I’ve shared with it.
"I have a solid understanding of mobile app development, including iOS development using Swift. I can help with various aspects of the development process, such as setting up projects, UI design using SwiftUI or UIKit, managing app lifecycle, integrating APIs, handling data persistence, debugging, and deploying apps to the App Store. If you have specific questions or need assistance with a particular part of your iOS project, feel free to ask!"
Yesterday, after posting an article about this project, ChatGPT and I worked through a tricky, complex HTML file to get the specific data to drive my user interface, then it wrote the user interface. All of this with unit tests to make sure nothing goes off the rails. It’s not perfect just yet, but it’s damn close!
Then I took the dog for his morning walk and got ready for my jobby-job (h/t to @CaseyLiss for embedding that silly term permanently in my brain), all by 9am.
I’m positive I’ll hit my July 15 ship date, but now I have a new idea, post-launch. I’m pretty sure I’ll stay under the full context window, so what if I export the full chat, drop it on a local LLM, and see what it can learn from reading how the sausage got made?
💡 – hey, is this the year I do a proper Android version of my app that’s not React Native?
Pro-tip:
Don’t feel like you have to always chat with the model. Often the best way to make it understand something, whether it’s an alignment issue in your UI, warnings and errors in your IDE, or a feature from a previous version of your software, is to just take a screenshot and show it. You don’t even have to always explain it – it’ll figure it out on its own!
IS CHATGPT SMART ENOUGH TO REWRITE MY APP IN A MONTH? AN EXPERIMENT
For 15 years, I’ve had at least one app in the App Store – some version or another of Hamster Soup, my app for superfans of the TV show Big Brother (US). I’ve worked on a lot of apps – a few my own – but there’s always been a Hamster Soup out there. It was the original and first Big Brother fan app and spawned at least a few competitors, the best of which was Pocket Big Brother. I talked to its author once, and he’s the same sort of fan of the show as me. I think all of us were superfans ourselves. Over the years, it went from a passion project to a hobby project to a playground project – I have used the many incarnations of the app to explore ideas I’m interested in.
For the first few years, it was a paid app, then it became ad-supported, then tip-supported. I estimate that over the first 10 years, I made about $50k after Apple’s cut, which means by App Store standards, it was actually mildly successful. Most apps don’t make enough to justify the $99/year fee. A number of massive franchises (usually games) skew the stats. We all do it for the love of creating something new.
For the last several years, it’s just been a free app, with no attempt to monetize it, mainly out of guilt because I just no longer have the time to put into it like I used to. In the early days, like that month I netted almost $14k, I was writing something like 2,000-word daily updates several days a week. I had recently graduated with a Communication degree, so not only did it help me practice my software engineering craft but also my journalism writing too. Today it is simply my software playground.
But I have even less time to create content for the app than I did only five years ago, and these days, 90 percent of the app is aggregated content from various Big Brother-focused news sources. I’ve asked permission to use a few over the years and they’ve been happy to provide access because tapping links in my app will load their site and monetized ads.
This year, with the rapid growth of Artificial Intelligence, I’m looking to automate that remaining 10 percent. But maybe not in the obvious way. I keep my apps as close to state-of-the-art as possible, with the latest version of iOS as the base SDK. Apple has announced a lot of new AI features coming in iOS 18, but because that won’t launch until the fall and Big Brother is a summer show, I’ll have to skip most of the AI features this year and base the new app on iOS 17. Interestingly, ChatGPT was more than happy to help me write an iOS 18 app after it reviewed the latest updates on Apple’s developer site. I love its ambition but pulled it back to what we can actually do.
But that doesn’t mean I can’t use AI.
This year I am running an experiment: can I rewrite my app with Artificial Intelligence as the main developer? I’ll do some of the work, but mostly will be directing it, copying the code it generates based on my specifications, and giving it feedback to keep it on task with what I need it to do.
I actually already started it about a week ago with ChatGPT 4.0, with a goal to ship whatever I have in a little over 2 weeks. The new season of Big Brother starts on July 17th, so my goal is to ship v1.0 by July 15th.
So far, I’ve coached it through writing several of the feeds, which includes fetching the RSS feed, parsing out the content we want, and displaying it in an infinite scrolling list view (as far back as the feed goes, anyway). Tapping on a list item will load the associated URL to the blog article in a web view. I’ve also included unit tests that verify it’s parsing correctly. I use SwiftSoup to parse feeds and HTML, which would have been my only dependency, but ChatGPT wanted to use Alamofire as well. I haven’t used it since the old Objective-C days, so I figured why not, and I’ll see how much simpler it is for writing network code compared to URLRequests.
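To give a sense of what the parsing step involves, here is a minimal sketch. It is not the app’s actual code: it swaps in Foundation’s built-in XMLParser in place of SwiftSoup so it runs with no dependencies, it skips the network fetch entirely, and the names FeedItem, RSSItemCollector, parseFeed, and the sample feed are all hypothetical, made up for illustration.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

// Hypothetical model for one row in the feed list.
struct FeedItem: Equatable {
    var title = ""
    var link = ""
}

// Collects the <item> elements of an RSS 2.0 document.
final class RSSItemCollector: NSObject, XMLParserDelegate {
    private(set) var items: [FeedItem] = []
    private var current: FeedItem?   // non-nil only while inside an <item>
    private var text = ""            // character data of the element being read

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        if elementName == "item" { current = FeedItem() }
        text = ""
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        text += string
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        let value = text.trimmingCharacters(in: .whitespacesAndNewlines)
        switch elementName {
        case "title": current?.title = value   // ignored outside an <item>
        case "link": current?.link = value
        case "item":
            if let item = current { items.append(item) }
            current = nil
        default: break
        }
        text = ""
    }
}

// Returns the parsed items, or an empty array if the XML is malformed.
func parseFeed(_ data: Data) -> [FeedItem] {
    let collector = RSSItemCollector()
    let parser = XMLParser(data: data)
    parser.delegate = collector
    return parser.parse() ? collector.items : []
}

// Usage with a made-up feed; a real app would fetch this data over the network.
let sample = """
<rss version="2.0"><channel><title>Example Feed</title>
<item><title>First post</title><link>https://example.com/1</link></item>
<item><title>Second post</title><link>https://example.com/2</link></item>
</channel></rss>
"""
let items = parseFeed(Data(sample.utf8))
for item in items {
    print(item.title, item.link)
}
```

Keeping the parser as a plain function from Data to an array of items is what makes this kind of code easy to unit test: a test can hand it a fixed string like the sample above and check the result, with no network involved.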
I still haven’t got it to do a proper iPad UI yet, so for now, iPad support is TBD.
So will I hit my July 15 date with all the features I have in mind? Probably not all of them at launch, especially since I’m doing this on the weekend and in the morning for about an hour before work, while drinking coffee and eating cereal. But I’ve seen enough to know whatever I do ship with will be good enough, and then we can iterate. In other words, typical app development.
The ease with which this is progressing is compelling, exciting, and a little bit scary. With tools this capable, how much longer will software engineers be valued for their programming skills? I think we’ve got some ways to go yet, but if you’re a software engineer starting out today, learning how to get these AIs to do what you need with the least amount of fuss is going to take you far.
I think we’ll find a combination of programming skills (to check AI’s work and keep it on track) and communication skills (for effectively talking to the LLMs) will be the true killer combo for the next generation of software engineers.