Speech Recognition and Core Data in iOS 10

Written by: Sean Coleman on September 13, 2016

We’re always excited to play with the new APIs that arrive with each iOS release. This year Apple introduced a new Speech Recognition API that lets us tap into the server-side speech-processing capabilities that power Siri, and Core Data received significant changes that make it much easier and safer to work with. Let’s make a mashup app that experiments with both: a contrived example called “Tell Core Data” that lets users add colors to a table with voice commands.

Speech Recognition

The iOS platform has had speech-to-text in the system keyboard since 2011. The new Speech Recognition APIs expose that functionality decoupled from the keyboard so we can programmatically process dictation results. This gives us the ability to do pattern-matching on what a user is dictating. My first programming experiences were inspired by text-based adventure games like “Tass Times in Tonetown” in which you issue text commands like “GO EAST” and “WEAR JUMPSUIT”—it would have been fun to be able to tap into speech-to-text processing for my games.

The first step to using the Speech Recognition API is to add a couple of entries to Info.plist declaring that the app requires permission to send data to Apple’s servers, and then import the Speech module. There’s a little bit of setup required to initialize speech recognition. Take a look at the SpeechService class in the example project, which is based on Apple’s SpeakToMe sample, to see how it works.
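Concretely, the two keys are NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription (the usage strings are whatever wording you want shown to the user), and the imports look like this:

    // Info.plist additions (usage strings are your own wording):
    //   NSSpeechRecognitionUsageDescription - why speech is sent to Apple's servers
    //   NSMicrophoneUsageDescription        - why the microphone is needed

    import Speech
    import AVFoundation   // AVAudioEngine and AVAudioSession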

The basic flow: first we request authorization with SFSpeechRecognizer.requestAuthorization. When the user presses a button to begin speaking, we kick off a series of calls, starting by configuring an AVAudioSession. Then we create an SFSpeechAudioBufferRecognitionRequest and hand it to an SFSpeechRecognitionTask. Finally we start an AVAudioEngine. After that series of calls we have fast, live speech recognition, with the recognized text delivered in the recognition task’s callback. In our example app, when a user says a color like “red”, we dispatch that command to Core Data to make changes in the persistent store and then refresh our table view. Here’s our example after calling out a series of colors.
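Condensed down, that flow looks roughly like the following. This is a sketch rather than the project’s actual SpeechService: the class and method names here are ours, error handling is trimmed, and the AVAudioSession calls use the Swift 3 spellings that shipped with Xcode 8.

    final class ColorSpeechService {
        // Force-unwrapping the recognizer is for brevity only
        private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
        private let audioEngine = AVAudioEngine()
        private var recognitionTask: SFSpeechRecognitionTask?

        func requestAuthorization(_ completion: @escaping (Bool) -> Void) {
            SFSpeechRecognizer.requestAuthorization { status in
                completion(status == .authorized)
            }
        }

        func startRecording() throws {
            // 1. Put the shared audio session into record mode
            let session = AVAudioSession.sharedInstance()
            try session.setCategory(AVAudioSessionCategoryRecord)
            try session.setActive(true, with: .notifyOthersOnDeactivation)

            // 2. A request that accepts streaming audio buffers
            let request = SFSpeechAudioBufferRecognitionRequest()
            request.shouldReportPartialResults = true

            // 3. The recognition task delivers live transcriptions in its callback
            recognitionTask = recognizer.recognitionTask(with: request) { result, _ in
                if let text = result?.bestTranscription.formattedString {
                    // e.g. match "red", "blue", ... and dispatch the command to Core Data
                    print(text)
                }
            }

            // 4. Tap the microphone and start the engine
            guard let inputNode = audioEngine.inputNode else { return }
            let format = inputNode.outputFormat(forBus: 0)
            inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
                request.append(buffer)
            }
            audioEngine.prepare()
            try audioEngine.start()
        }
    }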


This new speech-processing capability is another data point in a trend of Apple giving us more power to do interesting things. As fun as it is to play with speech recognition, the new Core Data APIs are far more germane to the apps we build today. Core Data is getting some really terrific updates this year.

Core Data

Core Data is getting much better about how it handles multiple concurrent requests with regard to locking. A typical Core Data stack uses multiple Managed Object Contexts (MOCs), each attached to a Persistent Store Coordinator (PSC). Usually a main-queue MOC is used for reading, and writes are done on background-queue MOCs. Core Data’s concurrency model has required locking at the PSC level to coordinate multiple requests. For apps that need to download large payloads and write them in the background, we have benefited from maintaining two independent stacks with independent PSCs. The big change this year is that the lock on the PSC moves down to the SQLite layer, which results in a much more responsive Core Data stack and may eliminate the cases where two stacks are needed.
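For reference, here’s a bare-bones version of that traditional hand-built stack, with a read context and a write context sharing one PSC. This is a sketch; the model name, file name, and variable names are placeholders.

    import CoreData

    // Load the model and attach a coordinator backed by a SQLite store
    let modelURL = Bundle.main.url(forResource: "Model", withExtension: "momd")!
    let model = NSManagedObjectModel(contentsOf: modelURL)!
    let coordinator = NSPersistentStoreCoordinator(managedObjectModel: model)
    let storeURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
        .appendingPathComponent("Model.sqlite")
    try coordinator.addPersistentStore(ofType: NSSQLiteStoreType, configurationName: nil,
                                       at: storeURL, options: nil)

    // Main-queue context for reads that drive the UI
    let viewContext = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
    viewContext.persistentStoreCoordinator = coordinator

    // Private-queue context for background writes
    let writeContext = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    writeContext.persistentStoreCoordinator = coordinator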

Another exciting change this year is the introduction of NSPersistentContainer, which wraps up an NSManagedObjectModel, NSPersistentStoreCoordinator, and NSManagedObjectContext in one object. This makes setup much easier than building and attaching each of those components separately. Setting up an NSPersistentContainer couldn’t be easier.
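Something like the following is all it takes. This is a minimal sketch; “TellCoreData” stands in for whatever the data model file is named.

    // One object owns the model, the coordinator, and the view context
    let container = NSPersistentContainer(name: "TellCoreData")
    container.loadPersistentStores { _, error in
        if let error = error {
            fatalError("Failed to load persistent store: \(error)")
        }
    }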



When we use an NSPersistentContainer, we get support for the common workflow of using a main-queue MOC for fetches and private-queue MOCs for background updates. For the main-queue context, look for the aptly named viewContext property. Private-queue contexts are provided as a block parameter when you call the handy performBackgroundTask() method.
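In our color-logging scenario that might look roughly like this, assuming a Color entity with Xcode-generated accessors and the container created above:

    // Read on the main queue to populate the table view
    let fetchRequest: NSFetchRequest<Color> = Color.fetchRequest()
    let colors = try container.viewContext.fetch(fetchRequest)   // drives the table view

    // Write on a private queue when the user dictates a new color
    container.performBackgroundTask { context in
        let color = Color(context: context)
        color.name = "red"
        try? context.save()
    }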



There’s a philosophy shift behind this year’s Core Data revisions: strong conventions and sane defaults mean we do less configuration ourselves. NSPersistentContainer exemplifies that philosophy, as do other new additions such as automaticallyMergesChangesFromParent, which relieves us of writing code to handle merge notifications. NSPersistentContainer will simplify Core Data usage for many apps, and where it fits we’re looking forward to using it. Other Core Data features we’re happy to see are Swift generics in Core Data types and UICollectionViewDataSourcePrefetching support in NSFetchedResultsController.
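For example, opting the container’s view context into that automatic merging is a single line:

    // Saves made in performBackgroundTask contexts show up in viewContext automatically
    container.viewContext.automaticallyMergesChangesFromParent = true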

Check out the example app if you’d like to read through more of the implementation of the new Core Data stack. That wraps up our mashup of new Speech Recognition and Core Data.

This article is part of our Welcome to iOS 10 series.

Sean Coleman

Sean Coleman is a Technology Lead at POSSIBLE Mobile where he leads multi-platform projects with large teams for well-known brands. Outside of writing native apps for Apple platforms you’ll find Sean playing with his two kids or hitting balls on the tennis court.
