What is type erasure in Swift? An explanation with code samples
Published on: May 18, 2020
Swift's type system is (mostly) fantastic. Its tight constraints and flexible generics allow developers to express complicated concepts in an extremely safe manner because the Swift compiler will detect and flag any inconsistencies within the types in your program.
While this is great most of the time, there are times where Swift's strict typing gets in the way of what we're trying to build. This is especially true if you're working on code that involves protocols and generics.
With protocols and generics, you can express ideas that are insanely complex and flexible. But sometimes you're coding along happily and the Swift compiler starts yelling at you. You've hit one of those scenarios where your code is so flexible and dynamic that Swift isn't having it.
Let's say you want to write a function that returns an object that conforms to a protocol that has an associated type. That's not going to happen unless you use an opaque result type.
But what if you don't want to return the exact same concrete type from your function all the time? Unfortunately, opaque result types won't help you there. Luckily, Swift 5.7, which came out in 2022, allows us to define so-called primary associated types, which allow us to specialize our opaque (some) and existential (any) return types where needed.
It's important to note that primary associated types remove many of the reasons to use type erasure in your app, but they don't make type erasure completely obsolete.
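To make primary associated types a little more concrete, here's a minimal sketch. The Container protocol and both conforming types are hypothetical names I made up for illustration; they aren't part of any framework. As far as I know, constrained existentials like any Container<Int> also need a recent OS runtime on Apple platforms (iOS 16 / macOS 13), so treat this as a sketch rather than a drop-in solution.

```swift
// Hypothetical protocol. The name in angle brackets declares `Item`
// as the primary associated type (Swift 5.7 and up).
protocol Container<Item> {
    associatedtype Item
    var items: [Item] { get }
}

// Two hypothetical conforming types with the same Item type.
struct ArrayContainer: Container {
    let items: [Int]
}

struct RangeContainer: Container {
    let range: ClosedRange<Int>
    var items: [Int] { Array(range) }
}

// `some Container<Int>` hides one fixed concrete type,
// but callers still know that Item is Int.
func makeFixedContainer() -> some Container<Int> {
    return ArrayContainer(items: [1, 2, 3])
}

// `any Container<Int>` may return a different concrete type per branch.
func makeContainer(useRange: Bool) -> any Container<Int> {
    if useRange {
        return RangeContainer(range: 1...3)
    }
    return ArrayContainer(items: [10, 20])
}

let total = makeFixedContainer().items.reduce(0, +)
let dynamicTotal = makeContainer(useRange: true).items.reduce(0, +)
print(total, dynamicTotal) // 6 6
```

Both functions hide the concrete container, yet the element type stays visible to callers, which is exactly the information a plain some Container would throw away.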
So when the Swift compiler keeps yelling at you and you have no idea how to make it stop, it might be time to apply some type erasure.
In this week's blog post I will explain what type erasure is and show an example of how type erasure can be used to craft highly flexible code that the Swift compiler will be happy to compile.
There are multiple scenarios where type erasure makes sense and I want to cover two of them.
Using type erasure to hide implementation details
The most straightforward way to think of type erasure is to consider it a way to hide an object's "real" type. Some examples that come to mind immediately are Combine's AnyCancellable and AnyPublisher. An AnyPublisher in Combine is generic over an Output and a Failure. If you're not familiar with Combine, you can read up in the Combine category on this blog. All you really need to know about AnyPublisher is that it conforms to the Publisher protocol and wraps another publisher. Combine comes with tons of built-in publishers like Publishers.Map, Publishers.FlatMap, Future, Publishers.Filter, and many, many more.
Often when you're working with Combine, you will write functions that set up a chain of publishers. You usually don't want to expose the publishers you used to callers of your function. In essence, all you want to expose is that you're creating a publisher that emits values of a certain type (Output) or fails with a specific error (Failure). So instead of writing this:
func fetchData() -> URLSession.DataTaskPublisher {
    return URLSession.shared.dataTaskPublisher(for: someURL)
}
You will usually want to write this:
func fetchData() -> AnyPublisher<(data: Data, response: URLResponse), URLError> {
    return URLSession.shared.dataTaskPublisher(for: someURL)
        .eraseToAnyPublisher()
}
By applying type erasure to the publisher created in fetchData we are now free to change its implementation as needed, and callers of fetchData don't need to care about the exact publisher that's used under the hood.
When you think about how you can refactor this code, you might be tempted to try and use a protocol instead of an AnyPublisher. And you'd be right to wonder why we wouldn't.
Since a Publisher has an Output and Failure that we want to be able to use, using some Publisher wouldn't work. We wouldn't be able to return Publisher due to its associated type constraints, so returning some Publisher would allow the code to compile but it would be pretty useless:
func fetchData() -> some Publisher {
    return URLSession.shared.dataTaskPublisher(for: someURL)
}
fetchData().sink(receiveCompletion: { completion in
    print(completion)
}, receiveValue: { output in
    print(output.data) // Value of type '(some Publisher).Output' has no member 'data'
})
Because some Publisher hides the true type of the generics used by Publisher, there is no way to do anything useful with the output or completion in this example. An AnyPublisher hides the underlying type just like some Publisher does, except you can still define what the Output and Failure types are for the publisher by writing AnyPublisher<Output, Failure>.
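Here's a minimal sketch of what that buys you in practice. The numbers(cached:) function and its values are hypothetical and only exist for illustration. Each branch builds a different concrete publisher, but callers only ever see AnyPublisher<Int, Never>:

```swift
import Combine

// Hypothetical function: the name and the values are made up for illustration.
func numbers(cached: Bool) -> AnyPublisher<Int, Never> {
    if cached {
        // The concrete type here is Just<Int>
        return Just(42).eraseToAnyPublisher()
    } else {
        // A different concrete publisher, erased to the same AnyPublisher type
        return [1, 2, 3].publisher
            .map { $0 * 2 }
            .eraseToAnyPublisher()
    }
}

var received = [Int]()
let cancellable = numbers(cached: false).sink { value in
    // Output is visible to the caller, so `value` is known to be an Int
    received.append(value)
}

print(received) // [2, 4, 6]
```

Because Output and Failure are part of the erased type, the sink closure can work with real Int values even though the publisher that produced them stays hidden.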
With primary associated types in Swift 5.7, you can write the following code:
func fetchData() -> any Publisher<(data: Data, response: URLResponse), URLError> {
    return URLSession.shared.dataTaskPublisher(for: someURL)
}
The only problem with using primary associated types on a Publisher is that not all methods that exist on publishers like an AnyPublisher are added to the Publisher protocol. This means that we might lose some functionality by not using AnyPublisher.
I will show you how type erasure works in the next section. But first I want to show you a slightly different application of type erasure from the Combine framework. In Combine, you'll find an object called AnyCancellable. If you use Combine, you will encounter AnyCancellable when you subscribe to a publisher using one of Combine's built-in subscription methods.
Without going into too much detail, Combine has a protocol called Cancellable. This protocol requires that conforming objects implement a cancel method that can be called to cancel a subscription to a publisher's output. Combine provides three objects that conform to Cancellable:
AnyCancellable
Subscribers.Assign
Subscribers.Sink
The Assign and Sink subscribers match up with two of Publisher's methods:
assign(to:on:)
sink(receiveCompletion:receiveValue:)
These two methods both return AnyCancellable instances rather than Subscribers.Assign and Subscribers.Sink. Apple could have chosen to make both of these methods return Cancellable instead of AnyCancellable.
But they didn't.
The reason Apple applies type erasure in this example is that they don't want users of assign(to:on:) and sink(receiveCompletion:receiveValue:) to know which type is returned exactly. It simply doesn't matter. All you need to know is that it's an AnyCancellable. Not just that it's Cancellable, but that it could be any Cancellable.
Because AnyCancellable erases the type of the original Cancellable by wrapping it, you don't know if the AnyCancellable wraps a Subscribers.Sink or some other kind of internal, private Cancellable that we're not supposed to know about.
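A practical upside of this is that subscriptions created in different ways can live in the same collection. The sketch below assumes a hypothetical Counter class that I made up so assign(to:on:) has something to write to:

```swift
import Combine

// Hypothetical model object, only here so assign(to:on:) has a target.
final class Counter {
    var value = 0
}

let counter = Counter()
let subject = PassthroughSubject<Int, Never>()
var cancellables = Set<AnyCancellable>()
var received = [Int]()

// Both subscription methods hand back an AnyCancellable,
// so both results fit in the same Set.
subject
    .sink { received.append($0) }
    .store(in: &cancellables)

subject
    .assign(to: \.value, on: counter)
    .store(in: &cancellables)

subject.send(5)
print(counter.value, received) // 5 [5]

// Releasing the AnyCancellable instances cancels both subscriptions.
cancellables.removeAll()
subject.send(10)
print(counter.value, received) // 5 [5]
```

The Set doesn't know or care which subscriber sits behind each element. All it holds on to is something that can be cancelled.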
If you need to hide implementation details in your code, or if you want to return an object that conforms to a protocol with an associated type that you need to access, without exposing the object's concrete type, type erasure just might be what you're looking for.
Applying type erasure in your codebase
To apply type erasure to an object, you need to define a wrapper. Let's look at an example:
protocol DataStore {
    associatedtype StoredType
    func store(_ object: StoredType, forKey key: String)
    func fetchObject(forKey key: String) -> StoredType?
}
class AnyDataStore<StoredType>: DataStore {
    // Closures that forward to the wrapped store's methods
    private let storeObject: (StoredType, String) -> Void
    private let fetchObject: (String) -> StoredType?

    init<Store: DataStore>(wrappedStore: Store) where Store.StoredType == StoredType {
        self.storeObject = wrappedStore.store
        self.fetchObject = wrappedStore.fetchObject
    }

    func store(_ object: StoredType, forKey key: String) {
        storeObject(object, key)
    }

    func fetchObject(forKey key: String) -> StoredType? {
        // The unlabeled call resolves to the stored closure, not to this method
        return fetchObject(key)
    }
}
This example defines a DataStore protocol and a type erasing wrapper called AnyDataStore. The purpose of the AnyDataStore is to provide an abstraction that hides the underlying data store entirely, much like Combine's AnyPublisher. The AnyDataStore object makes extensive use of generics, and if you're not too familiar with them this object probably looks a little bit confusing.
The AnyDataStore itself is generic over StoredType. This is the type of object that the underlying DataStore stores. The initializer for AnyDataStore is generic over Store where Store conforms to DataStore and the objects that are stored in the Store must match the objects stored by the AnyDataStore. Due to the way this wrapper is set up that should always be the case but Swift requires us to be explicit.
We want to forward any calls on AnyDataStore to the wrapped store, but we can't hold on to the wrapped store since that would require making AnyDataStore generic over the underlying data store, which would expose the underlying data store. Instead, we capture references to the methods we need in the storeObject and fetchObject properties and forward any calls to store(_:forKey:) and fetchObject(forKey:) to their respective stored references.
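If capturing a method in a property looks like magic, here's the mechanism in isolation. Greeter is a hypothetical type that I made up for this sketch; the point is only that referencing a method without calling it gives you a closure you can store and call later:

```swift
// Hypothetical type, made up to show the mechanism in isolation.
final class Greeter {
    let greeting: String

    init(greeting: String) {
        self.greeting = greeting
    }

    func greet(name: String) -> String {
        return "\(greeting), \(name)!"
    }
}

let greeter = Greeter(greeting: "Hello")

// Referencing the method without calling it produces a closure that
// holds on to `greeter`. The argument label is not part of the closure's type.
let greet: (String) -> String = greeter.greet

print(greet("Swift")) // Hello, Swift!
```

The closure's type only mentions String, so nothing about Greeter leaks out. One thing to keep in mind is that when the wrapped object is a struct, the closure holds on to a copy of that value rather than a shared reference.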
It's quite a generics feast and again, if you're not too familiar with them this can look confusing. I wrote about generics a while ago so make sure to click through to that post if you want to learn more.
Let's see how this AnyDataStore can be used in an example:
import UIKit

class InMemoryImageStore: DataStore {
    var images = [String: UIImage]()

    func store(_ object: UIImage, forKey key: String) {
        images[key] = object
    }

    func fetchObject(forKey key: String) -> UIImage? {
        return images[key]
    }
}

struct FileManagerImageStore: DataStore {
    typealias StoredType = UIImage

    func store(_ object: UIImage, forKey key: String) {
        // write image to file system
    }

    func fetchObject(forKey key: String) -> UIImage? {
        return nil // grab image from file system
    }
}

class StorageManager {
    func preferredImageStore() -> AnyDataStore<UIImage> {
        if Bool.random() {
            let fileManagerStore = FileManagerImageStore()
            return AnyDataStore(wrappedStore: fileManagerStore)
        } else {
            let memoryStore = InMemoryImageStore()
            return AnyDataStore(wrappedStore: memoryStore)
        }
    }
}
In the code snippet above I create two different data stores and a StorageManager that is responsible for providing a preferred storage solution. Since the StorageManager decides which storage we want to use it returns an AnyDataStore that's generic over UIImage. So when you call preferredImageStore() all you know is that you'll receive an object that conforms to DataStore and stores and provides UIImage objects.
Of course, the StorageManager I wrote is pretty terrible. When you're working with data and storing it you need a lot more control over what happens and whether data is persisted. And more importantly, a StorageManager that will randomly switch between stores is not that useful. However, the important part here is not whether or not my StorageManager is good. It's that you can use type erasure to hide what's happening under the hood while making your code more flexible in the process.
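To see that flexibility without pulling in UIKit, here's a small sketch that uses the same DataStore and AnyDataStore shapes with String values. DictionaryStore and UppercasingStore are hypothetical stores that I made up for this sketch, and I named the closure property fetchObjectClosure to keep it visually separate from the method:

```swift
protocol DataStore {
    associatedtype StoredType
    func store(_ object: StoredType, forKey key: String)
    func fetchObject(forKey key: String) -> StoredType?
}

final class AnyDataStore<StoredType>: DataStore {
    private let storeObject: (StoredType, String) -> Void
    private let fetchObjectClosure: (String) -> StoredType?

    init<Store: DataStore>(wrappedStore: Store) where Store.StoredType == StoredType {
        storeObject = wrappedStore.store
        fetchObjectClosure = wrappedStore.fetchObject
    }

    func store(_ object: StoredType, forKey key: String) {
        storeObject(object, key)
    }

    func fetchObject(forKey key: String) -> StoredType? {
        return fetchObjectClosure(key)
    }
}

// Hypothetical store that keeps values as they are.
final class DictionaryStore: DataStore {
    private var values = [String: String]()
    func store(_ object: String, forKey key: String) { values[key] = object }
    func fetchObject(forKey key: String) -> String? { return values[key] }
}

// Hypothetical store that uppercases values before keeping them.
final class UppercasingStore: DataStore {
    private var values = [String: String]()
    func store(_ object: String, forKey key: String) { values[key] = object.uppercased() }
    func fetchObject(forKey key: String) -> String? { return values[key] }
}

// Two different concrete stores, one element type for the array.
let stores: [AnyDataStore<String>] = [
    AnyDataStore(wrappedStore: DictionaryStore()),
    AnyDataStore(wrappedStore: UppercasingStore())
]

for store in stores {
    store.store("hello", forKey: "greeting")
}

let results = stores.compactMap { $0.fetchObject(forKey: "greeting") }
print(results) // ["hello", "HELLO"]
```

The loop treats both stores identically, and the only hint that they behave differently is in what comes back out.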
The example of AnyDataStore I just showed you is very similar to the AnyPublisher scenario that I described in the previous section. It's pretty complex but I think it's good to know this exists and how it (possibly) looks under the hood.
In the previous section, I also mentioned AnyCancellable. An object like that is much simpler to recreate because it doesn't involve any generics or associated types. Let's try to create something similar except my version will be called AnyPersistable:
protocol Persistable {
    func persist()
}

class AnyPersistable: Persistable {
    private let wrapped: Persistable

    init(wrapped: Persistable) {
        self.wrapped = wrapped
    }

    func persist() {
        wrapped.persist()
    }
}
An abstraction like the one I showed could be useful if you're dealing with a whole bunch of objects that need to be persisted but you want to hide what these objects really are. Since there are no complicated generics involved in this example it's okay to hold on to the Persistable object that's wrapped by AnyPersistable.
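Here's a short usage sketch of that idea. UserProfile, AppSettings and PersistenceLog are hypothetical types that I made up so there's something observable to persist:

```swift
protocol Persistable {
    func persist()
}

final class AnyPersistable: Persistable {
    private let wrapped: Persistable

    init(wrapped: Persistable) {
        self.wrapped = wrapped
    }

    func persist() {
        wrapped.persist()
    }
}

// Hypothetical log so the effect of persist() can be observed.
final class PersistenceLog {
    var entries = [String]()
}

// Two hypothetical, unrelated types that both need persisting.
struct UserProfile: Persistable {
    let name: String
    let log: PersistenceLog
    func persist() { log.entries.append("profile:\(name)") }
}

struct AppSettings: Persistable {
    let log: PersistenceLog
    func persist() { log.entries.append("settings") }
}

let log = PersistenceLog()

// Callers only ever see AnyPersistable, never the wrapped types.
let queue = [
    AnyPersistable(wrapped: UserProfile(name: "Sam", log: log)),
    AnyPersistable(wrapped: AppSettings(log: log))
]

queue.forEach { $0.persist() }
print(log.entries) // ["profile:Sam", "settings"]
```

Whoever receives the queue can persist everything in it without ever learning that a UserProfile or an AppSettings is involved.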
In summary
In this post, you learned about type erasure. I showed you what type erasing is, and why it's used. You saw how Apple's Combine framework uses type erasure to abstract Publisher and Cancellable objects and hide their implementation details. This can be really useful, especially if you're working on a framework or library where you don't want others to know which objects you are using internally to prevent users from making any assumptions about how your API works internally.
After explaining how type erasure is used, I showed you two examples. First, you saw a complicated example that uses generics and stores references to functions as closures. It's pretty complex if you haven't seen anything like it before so don't feel bad if it looks a little crazy to you. I know that with time and experience, a construction like the one I showed you will start to make more sense. Type erasure can be a pretty complicated topic.
The second example I showed you was simpler because it doesn't involve any generics. It mimics what Apple does with Combine's AnyCancellable to hide the underlying Cancellable objects from developers.
If you have any questions about this post or if you have feedback for me, reach out to me on Twitter.