Picking the right data structure in Swift | Swift by Sundell

https://www.swiftbysundell.com/articles/picking-the-right-data-structure-in-swift

Deciding which data structure to use to represent a given collection of values can often be trickier than it seems. Since each kind of data structure is optimized for a certain number of use cases, finding the right match for each set of data can often have a big impact on how efficient our code ends up becoming.

The Swift standard library ships with three main data structures — Array, Dictionary and Set — that each comes with a different set of optimizations, pros and cons. This week, let’s take a look at some of those characteristics, and also how we sometimes might need to venture outside the realm of the standard library to find the right data structure for our needs.

The linearity of arrays

Array is arguably one of the most commonly used data structures in Swift, and for good reasons. It keeps its elements in sequence, it’s easy to iterate through in a predictable fashion, and it can store any kind of value — from structs, to class instances, to other collections.

For example, here we’re using an array to store a collection of shapes that are placed on a Canvas within a drawing app. Then, when asked to render our canvas into an image, we simply iterate through our array in order to draw each element using a DrawingContext — like this:

struct Canvas {
    var shapes: [Shape]

    func render() -> Image {
        let context = DrawingContext()
        shapes.forEach(context.draw)
        return context.makeImage()
    }
}

When it comes to linearly drawing all of our shapes, like we do above, using an array is a perfect fit. Not only do arrays store their elements in a very efficient manner, they also have a guaranteed order of iteration, which gives us a predictable drawing order without having to do any extra work.

However, just like all other data structures, arrays also have downsides. In our case, we’ll start encountering one such downside when we want to start removing shapes from our canvas. Since array elements are stored by index, we will always need to look up which index that a given shape is associated with before we can remove it:

extension Canvas {
    mutating func remove(_ shape: Shape) {
        guard let index = shapes.firstIndex(of: shape) else {
            return
        }

        shapes.remove(at: index)
    }
}

At first the above code might not seem that problematic, but it’s very likely to become a performance bottleneck for any canvas that contains a large number of shapes — since firstIndex is linear (O(n)) in terms of time complexity.

While we could work around that limitation wherever we’re using our Canvas type — for example by always referring to shapes by index, rather than by value or ID — doing so would make our code both more complex and more fragile, since we’d always need to make sure that our indexes won’t become stale whenever the canvas we’re working with is mutated.

Instead, let’s see if we can optimize Canvas itself by changing its underlying data structure. Looking at the above problem, one of our initial ideas might be to use a Set instead of an Array. Like we took a look at in “The power of sets in Swift”, one of the big advantages that sets have over arrays is that both inserts and removals can always be performed in constant (O(1)) time, since members are stored by hash value, rather than by index.

Updating Canvas to use a set instead would make it look like this:

struct Canvas {
    var shapes: Set<Shape>

    func render() -> Image {
        let context = DrawingContext()
        shapes.forEach(context.draw)
        return context.makeImage()
    }

    mutating func remove(_ shape: Shape) {
        shapes.remove(shape)
    }
}

Again, the above code might look right, and it even compiles without a hitch. However, while we’ve solved our removal problem, we’ve also lost our stable drawing order — since, unlike arrays, sets do not give us a guaranteed order of iteration — which is a dealbreaker in this situation, as we’d start drawing the user’s shapes in a seemingly random order.

Indexing indexes

Let’s keep experimenting. Next, let’s see if we can optimize Canvas by introducing a Dictionary that lets us look up any shape’s index based on its ID. We’ll start by making our array of shapes private to be able to take control over how elements are inserted — using a new add method — and each time a new shape is added we also add its index to our dictionary: