How to decide between a Set and Array in Swift?
Published on: May 15, 2024
Collections are a key component in any programming language. We often refer to collections as Array or Set, but there are several other kinds of collections in programming, like String (a collection of Character values) and ArraySlice (referring to a part of an array).
In this post, I’d like to explore two of the most common collection types: Set and Array. We’ll take a look at the key characteristics of each and explore the use cases that each one fits best.
We’ll cover the following topics:
- Understanding Array’s key characteristics
- Understanding Set’s key characteristics
- Exploring performance considerations
- Use cases for Set and Array
Understanding Array’s key characteristics
An Array
in Swift is defined as follows:
let myList = ["one", "two", "three"]
If we fully write out the type for myList
, we’d write let myList: Array<String>
. That’s because arrays in Swift can only contain a homogeneous collection of objects. In other words, it can only contain objects of a single type. In this case that type is String
.
We can have any kind of object in an Array
; the only restriction is that all objects in the array must be of the same type. In other words, we can’t have an array that contains both Int
and String
, but we can have an array that contains a custom enum:
enum MixedValue {
    case int(Int)
    case string(String)
}
let myList: [MixedValue] = [.int(1337), .string("Hello")]
Our array in this example only contains values of type MixedValue. Even though the associated values of its elements have different types (Int and String), Swift allows this because every element is still a MixedValue.
Items in an array are ordered. This means that items in an array will always be in the same order, no matter how many times you iterate over your array. For example, if you use a for loop to iterate your array thousands of times, the ordering of your elements won’t change.
You can reorder your array if you’d like by sorting it, and from that point on the new sorting will remain as the single ordering for your array.
Arrays can also contain duplicate values. This means that you can have multiple objects that are equal in the same array.
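To make the ordering and duplicate behavior concrete, here's a minimal sketch. The scores array and the other names in it are made up for illustration.

```swift
// Hypothetical sample data for illustration.
var scores = [3, 1, 2, 3]

// Duplicates are kept: the value 3 appears twice.
let numberOfThrees = scores.filter { $0 == 3 }.count

// Iterating twice visits the elements in the same order both times.
let firstPass = scores.map { $0 }
let secondPass = scores.map { $0 }

// Sorting in place establishes a new ordering that then stays put.
scores.sort()
```

After sort() runs, scores keeps both copies of 3; sorting changes the order, not the contents.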
If we want to find an item in an array we can use the first(where:)
function to iterate the array until we find what we’re looking for:
let myList: [Int] = [1337, 1338, 1339]
let item = myList.first(where: { $0 == 1340 })
The code above would iterate all items, not find a match based on my comparison and set item
to nil
.
There’s a lot more to know about working with arrays and collections in general, but to keep this post focused on the comparison between set and array, these are the key characteristics that I wanted to show you on array.
Arrays are meant to hold data that is ordered, and this data doesn’t have to be unique.
Understanding Set’s key characteristics
A Set
in Swift holds a single type of object, just like Array
does. For example, we can have a Set
of strings like this:
let mySet: Set<String> = ["hello", "world"]
Notice how defining the set looked pretty much the same as defining an array which would have looked as follows in this specific case:
let myArray: Array<String> = ["hello", "world"]
Both sets and arrays can be initialized using array literal syntax.
One key difference between sets and arrays is that elements in a Set
must be Hashable
, and a Set
only contains unique values.
This means that we can add items like String
to a Set
because String
is Hashable
. We can also add custom types to a Set
as long as the type is Hashable
.
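As a small sketch of a custom type in a Set: the Tag struct below is a hypothetical example. Because its only stored property is Hashable, Swift can synthesize the Hashable conformance for us.

```swift
// Hypothetical type for illustration. Swift synthesizes Hashable
// (and Equatable) because every stored property is itself Hashable.
struct Tag: Hashable {
    let name: String
}

let tags: Set<Tag> = [Tag(name: "swift"), Tag(name: "ios")]

// Lookups work with any value that is equal to a stored element.
let hasSwift = tags.contains(Tag(name: "swift"))
```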
Also note that I wrote earlier that items in a Set
must be unique. A Set uses an item’s hash value to locate it and == to confirm a match. When you insert an item that’s equal to one that’s already in your set, the set keeps the existing item and ignores the new one; if you want the new item to replace the old one, use update(with:) instead.
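To see how a Set treats an item that’s equal to one it already holds, here’s a minimal sketch. The User type and its id-only equality are assumptions made purely for illustration; they let us tell the old and new values apart by name.

```swift
// Hypothetical type: two users are considered equal when their ids match,
// so the name lets us observe which value the set actually stores.
struct User: Hashable {
    let id: Int
    let name: String

    static func == (lhs: User, rhs: User) -> Bool { lhs.id == rhs.id }
    func hash(into hasher: inout Hasher) { hasher.combine(id) }
}

var users: Set<User> = [User(id: 1, name: "Old")]

// insert(_:) leaves the existing element in place and reports
// that nothing was inserted.
let insertResult = users.insert(User(id: 1, name: "New"))
let nameAfterInsert = users.first?.name

// update(with:) replaces the existing element and returns the old one.
let replaced = users.update(with: User(id: 1, name: "New"))
let nameAfterUpdate = users.first?.name
```

In this sketch insertResult.inserted is false and the set still holds "Old" after the insert; only update(with:) swaps in "New".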
If we want to find out whether an item in our Set
exists we can use contains
and pass the value we’re looking for:
let mySet: Set<String> = ["hello", "world"]
let hasValue = mySet.contains("hello")
If we want to find a specific item in our Set
we can use the same first(where:)
method that you saw earlier on Array
. That’s because this method is defined on the Sequence protocol, which both Array and Set conform to (through Collection).
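Here’s a short sketch of both kinds of lookup on a Set. The words set is sample data; note that first(where:) still walks the elements one by one, while contains uses hashing.

```swift
let words: Set<String> = ["hello", "world"]

// Hash-based membership check.
let hasHello = words.contains("hello")

// Predicate-based search; visits elements until one matches.
let startsWithW = words.first(where: { $0.hasPrefix("w") })

// No element is longer than five characters, so there is no match.
let longWord = words.first(where: { $0.count > 5 })
```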
When you iterate over a set, the order of elements is not guaranteed. Iterating the same, unmodified set repeatedly gives you the same order, but that order can differ between launches of your app, and it can change whenever you add or remove items. That’s expected, so your code should never rely on it.
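If you need a predictable order for display or testing, one option is to sort the set’s elements into an array at the point where you need them. A minimal sketch with made-up data:

```swift
let fruits: Set<String> = ["pear", "apple", "fig"]

// sorted() returns an Array, so the result has a stable, defined order.
let alphabetical = fruits.sorted()
```

This works for any Set whose elements are Comparable; for other element types you can pass your own comparison with sorted(by:).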
A Set is meant to hold on to unique, unordered data that conforms to Hashable.
If you require Set
semantics but also need ordering, you could consider pulling in the swift-collections package and using its OrderedSet type, which holds unique Hashable items but also maintains an ordering. In a way, OrderedSet
is an Array
that enforces unique items and has O(1)
lookup. Kind of the best of both worlds.
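If you only need to remove duplicates from an existing array once while keeping the original order, you can approximate the idea with a plain Set and no extra dependency. To be clear, this is not OrderedSet’s API, just a sketch of a common idiom using made-up data:

```swift
// Hypothetical sample data for illustration.
let visited = ["home", "search", "home", "profile", "search"]

// Tracks which values we've already encountered.
var seen = Set<String>()

// insert(_:) reports whether the value was new, so filter keeps
// only the first occurrence of each value, in the original order.
let uniqueInOrder = visited.filter { seen.insert($0).inserted }
```

Unlike OrderedSet, the result is an ordinary array, so nothing stops duplicates from being appended later.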
Performance considerations
It’s hard to give you a complete overview and advice for performance comparisons between Set
and Array
because there are loads of things we can do with them.
The key aspect of performance that we can reason about is looking up items in either.
An array performs an item lookup in O(n)
time. This means that in a worst case scenario we’ll need to look at every element in our array before we find our item. A Set
on the other hand performs a lookup in O(1)
. This means that, on average, the time it takes to find an item doesn’t grow with the number of items in the set. That can be dramatically faster than O(n), especially when you’re dealing with large data sets.
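As a rough sketch of what this looks like in practice: if you need to run many membership checks against the same data, you can build a Set from your array once and query that instead. The names and sizes here are arbitrary, and building the set has its own O(n) cost, so it’s worth measuring before you commit to it.

```swift
// Hypothetical data: ten thousand integer ids.
let ids = Array(1...10_000)

// One-time O(n) conversion.
let idSet = Set(ids)

let target = 10_000

// Linear scan; the target is the last element, so this is the worst case.
let inArray = ids.contains(target)

// Hash-based lookup; doesn't depend on where the element was in the array.
let inSet = idSet.contains(target)
```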
In Summary
In the end, the decision between Set
and Array
is one that I believe is best made based on semantics. If you have Hashable items that need to be unique and you don’t care about ordering, you’re thinking of a Set. If you care about order, or you can’t make the items Hashable, you’re probably thinking of an Array.
There is of course the exception where you might want to have unique items that are Hashable
while maintaining order, in which case you can choose to use an OrderedSet
from swift-collections
.
I would always base my decision on the above and not on things like performance unless I’m working on a performance-critical piece of code where I can measure a difference in performance between Set
and Array
.