Custom Sequence and Extending Array in Swift
In this article I will describe how to extend Array in Swift by creating a custom sequence. I will also describe how type system in Swift provides requirements and allows to optimize the standard library.
The Idea
Let’s write a method that flattens nested (multidimensional) array. I will call it recursiveFlatten
becase it is supposed to concatenate all elements in nested arrays recursively. Element of an array can be a collection or arbitrary object:
[1, [2, 3, [[4], 5]]].recursiveFlatten()
[1, 2, 3, 4, 5]
Array has joined
method that concatenates elements of two-dimensional array. For multidimensional array it has to be repeated and it won’t work with a mix of sequences and other types.
[[1], [2, 3]]
joined: [1, 2, 3]
recursiveFlatten: [1, 2, 3][[1], [2, [3]]]
joined: [1, 2, [3]]
recursiveFlatten: [1, 2, 3][1, 2, [3]]
joined: won't work
recursiveFlatten: [1, 2, 3]
Designing With Swift Type System
Standard library uses Swift type system extensively in its API design. This allows to declare specific requirements and opens opportunities for optimizations.
Let’s first have a look how joined
method is declared:
- The declaration is in an extension of
Array
that requires its elements for be collections. - The return type is
FlattenCollection
. This is an example of optimization possible in regards to Swift type system. At a glance we expectjoined
to return anArray
object. This will create a copy.FlattenCollection
wraps original array and iterates over it. This allows to alter the way iteration works without memory allocation.
To replicate this design the return type of recursiveFlatten
is RecursiveFlattenSequence
, my custom sequence.
- The declaration is in an extension of
Array
without additional requirements for its elements. - The return type is
RecursiveFlattenSequence
, a wrapper for the base collection. This avoids creating unnecessary copy. RecursiveFlattenSequence
is a generic type that conforms toSequence
.- The return sequence uses
Array<Any>
in a type parameter, because my array can be heterogeneous, contain elements of different types.
Here’s a closer look on how RecursiveFlattenSequence
declared:
Being Lazy
Returning a custom sequence or a collection for this operations makes our code lazy. This approach is used by many standard library methods.
Consider return type of reversed
method:
func reversed() -> ReversedCollection<Array<Element>>
ReversedCollection
is a custom collection that wraps an array. It doesn’t make a copy, but allows working with reversed array as you would work with the original. ReversedCollection
is also immutable like other custom collections.
We can always explicitly create an array using init
with sequence:
let reversedCopy = Array(originalArray.reversed())
Iterator and Conforming to Sequence
Iterator is an object used to traverse a sequence. It provides iteration interface for a sequence and encapsulates its iteration state.
The iterator pattern provides ability to hide how sequence is implemented and also create multiple traversals. This is exactly what I am want to use. Here is a basic implementation:
Conforming to Sequence requires Element
, Iterator
and makeIterator
method to instantiate the iterator.
Element
is a type alias to the type used in the base collection.
Iterator
must implement IteratorProtocol
. Fairly simple, it requires a type of an element and next
method.
Iterator
has reference to its base sequence and stores the current position. This implementation returns the type of the base element. The next
method simply returns an element and increments the position. Of course there is a guard
statement to check if the position is not out of bounds.
This implementation iterates over a one-dimensional array. Let’s see how we can iterate multidimensional array.
Iterating Multidimensional Array
I have two kinds of objects in the array and here’s how next
method must process them:
- Arbitrary objects:
next
method returns arbitrary object when encounters it. If no arbitrary objects foundnext
method returnsnil
. - Nested array:
next
method must recursively look inside nested arrays till it finds an arbitrary object. Ifnext
method reaches the end of the nested array without finding an object to return, it must step back to the nesting array if possible.
Now the key word here is recursively and this can be a bit challenging: next
method returns when it finds an element. Therefore Iterator
must store iteration and recursion states. We can do this in two ways:
- Store a reference to the iterator of a nested array;
- Use a stack.
Both solutions are alike and difference is in using reference types or value types, so I name them accordingly.
The Reference Approach
In this solution Iterator
stores references to iterators of inner and outer arrays. This way algorithm can traverse nested arrays and step back when needed.
The Stack Approach
In this approach Iterator
stores iteration state on a stack every time it encounteres a nested array. I use inner type Node
to store a collection with the position in it.
Time Complexity
Both algorithms are lookups for a non-collection element. Time complexity is alike in both cases. The only difference is a starting point.
Reference type algorithm depends on previous runs. It must traverse its list of inner iterators to find the previous one. This is fairly simple to optimize by additional reference to a previous iterator. The is linear O(n) and the worst case is an array deeply nested.
In both cases next
method must process every element once, skipping empty arrays and diving into nested ones. Consider having and array of empty arrays with a single element at the very end. The method must traverse every element till it reaches the last one. This makes complexity O(n).
From space complexity both list of references and a stack takes O(n) space in the worst scenario: many nested arrays.
You can find the source code on my github: https://github.com/dmytro-anokhin/recursive-flatten
Thank you for reading. If you like the article, please follow me and share.
Show authors more ❤️ with 👏’s
And hope to see you next time 😊