Swift. Collections
Here is a quick introduction, along with the full articles list
Array
An array is an ordered, random-access collection
// Creating an array[Int]() // []
let array: [Int] = [] // []
Array<Int>() // []
let array2: [Int] = Array() // []
Array<Int>(repeating: 0, count: 5) // [0, 0, 0, 0, 0]
[1, 2, 3, 4, 5] // [1, 2, 3, 4, 5]
The count property shows an actual amount of elements
The capacity property shows the total amount of elements an array can contain before exceeding and having to allocate new memory
An array reserves a specific amount of memory to store initial elements. Once a new element is added, and the memory gets exceeded, the array allocates a new memory region, twice more than the current one. It copies old elements there, and only after that, it adds the new element to that region. It can be time-consuming. The original storage size grows exponentially, which is called the geometric growth pattern
var array = [1, 2, 3, 4, 5]
// count = 5, capacity = 5array.append(6)
// count = 6, capacity = 10array += [7, 8, 9, 10, 11]
// count = 11, capacity = 20
The reserveCapacity(_:) method allows to set the initial capacity of an array. If you know how many items the array will store, then explicit specifying capacity can reduce the number of memory allocations
var array: [Int] = []array.reserveCapacity(30)array += [1, 2, 3, 4, 5]
// count = 5, capacity = 30array.append(6)
// count = 6, capacity = 30array += [7, 8, 9, 10, 11]
// count = 11, capacity = 30array += Array.init(repeating: 0, count: 20)
// count = 31, capacity = 60
Reinitializing an array resets its capacity
var array: [Int] = []
array.reserveCapacity(30) // capacity = 30
array = [1, 2, 3, 4, 5] // capacity = 5
// Adding new elementsvar array: [Int] = []array.append(1)
// [1]array.append(contentsOf: [2, 3])
// [1, 2, 3]array += [4, 5]
// [1, 2, 3, 4, 5]array.insert(100, at: 1)
// [1, 100, 2, 3, 4, 5]array.insert(contentsOf: [200, 300], at: 3)
// [1, 100, 2, 200, 300, 3, 4, 5]
append(element) = O(n), where n = 1
append(contentsOf: [element2, element3]) = O(n), where n = 2
insert(element, at: index) and insert(contentsOf: [element2, element3]) = O(k), where k = (count — index) + 2
// Removing elementsvar array = [1, 2, 3, 4, 5]
// [1, 2, 3, 4, 5]array.remove(at: 2)
// [1, 2, 4, 5]let el = array.removeFirst()
// el = 1, [2, 4, 5], throws an error if no elementslet el2 = array.removeLast()
// el2 = 5, [2, 4], throws an error if no elementsarray.removeAll(keepingCapacity: true)
// capacity = 5, []array.removeAll()
// capacity = 0, []let el3 = array.popLast()
// el3 = nil, but it could be an optional numbervar array2 = [1, 2, 3, 4, 5]
// [1, 2, 3, 4, 5]array.removeSubrange(1..<3)
// [1, 4, 5]
// Indexing elementsvar array = [1, 2, 3, 4, 5]
// [1, 2, 3, 4, 5]array.first
// Optional(1)array.last
// Optional(5)array.startIndex
// 0array.endIndex
// 5, endIndex == countarray.indices
// 0..<5array.firstIndex(of: 2)
// Optional(1)array.firstIndex { $0 > 3 }
// Optional(3)array.lastIndex(of: 2)
// Optional(1)array.lastIndex { $0 > 3 }
// Optional(4)
count vs. endIndex: there is no performance difference between count and endIndex in random-access collections (arrays) — O(1). For bidirectional ones (trees, linked lists), count performance could be O(n), but endIndex — still is O(1)
// Replacing elementsvar array = [1, 2, 3, 4, 5]
// [1, 2, 3, 4, 5]array[0] = 10
// [10, 2, 3, 4, 5]array[1..<5] = [20, 30, 40, 50]
// [10, 20, 30, 40, 50]array[...] = [1, 2, 3, 4, 5, 6]
// [1, 2, 3, 4, 5, 6], here is the range < valuesToReplacearray[...] = [1, 2, 3]
// [1, 2, 3], here is the range > valuesToReplacearray.swapAt(0, 4)
// [5, 2, 3, 4, 1]
ArraySlice
ArraySlice is “View onto Array” ArraySlice is an array subsequence, which means a set of consecutive elements from the origin array. If you have [1, 2, 3, 4, 5], it is impossible to get a subsequence like this [1, 5] (it is not a subsequence at all). So, the subsequence could be: [1, 2, 3], [2, 3, 4], [4, 5]
var array = [1, 2, 3, 4, 5]
// Creating a slice via droparray.dropFirst()
// ArraySlice<Int>[2, 3, 4, 5]array.dropFirst(2)
// ArraySlice<Int>[3, 4, 5]array.dropLast()
// ArraySlice<Int>[1, 2, 3, 4]array.dropLast(2)
// ArraySlice<Int>[1, 2, 3]array.drop { $0 < 3 }
// ArraySlice<Int>[3, 4, 5] <- drops the first elements
// till the condition
// Creating a slice via prefixarray.prefix(3)
// ArraySlice<Int>[1, 2, 3] <- first 3 elementsarray.prefix(upTo: 3)
// ArraySlice<Int>[1, 2, 3] <- first elements until index == 3,
// excluding the element at index == 3array.prefix(through: 3)
// ArraySlice<Int>[1, 2, 3, 4] <- the same as prefix(upTo:)
// + the element at index == 3array.prefix { $0 < 3 }
// ArraySlice<Int>[1, 2] <- first elements till the condition
// Creating a slice via suffix (oppoiste of prefix)array.suffix(3)
// ArraySlice<Int>[3, 4, 5]array.suffix(from: 3)
// ArraySlice<Int>[4, 5]
// Creating a slice via Rangearray[2...4]
// ArraySlice<Int>[3, 4, 5]array[2..<4]
// ArraySlice<Int>[3, 4]array[...3]
// ArraySlice<Int>[1, 2, 3, 4]array[..<3]
// ArraySlice<Int>[1, 2, 3]array[3...]
// ArraySlice<Int>[4, 5]array[...]
// ArraySlice<Int>[1, 2, 3, 4, 5]
ArraySlice maintains the same indices of its array
var array = [1, 2, 3, 4, 5]
// [1, 2, 3, 4, 5]var slice = array[1...3]
// ArraySlice<Int>[2, 3, 4]print(slice[0])
// index out of bounds errorprint(slice[slice.startIndex])
// 2 <- first slice indexprint(slice[slice.endIndex - 1])
// 4 <- last slice index
ArraySlice holds a strong reference to the original array’s entire storage, even if the array consists of thousands of elements, but the slice has just a few ones. Long-term storage of a slice may therefore prolong the lifetime of elements that are no longer otherwise accessible, which can appear to be memory and object leakage
var array: Array<Element>? = [Element(), Element()]
var slice: ArraySlice<Element>? = array!.dropFirst()array = nil // no deinit was called
slice = nil // two deinits were called
Set
A set is an unordered collection of unique elements
A set represents a hash table. Insertion, deletion, and search for elements in a set take on average constant time (O(1)). A set computes a hash as an index based on the content of the object. The index can be treated as the direct address in memory. Thus, each item can be directly retrieved, deleted, and added by its hash
Hash collision means that two different objects have the same hash. There are few solutions to resolve collisions:
- chaining — getting by hash a list, instead of a single item
- open addressing — each item is a single object; all items are stored directly in a hash table. They are flattened. Swift uses open addressing with linear probing — each item is a single object, on collision find the next free slot
// Linear probingIf slot hash(x) % S is full, then we try (hash(x) + 1) % S
If (hash(x) + 1) % S is also full, then we try (hash(x) + 2) % S
If (hash(x) + 2) % S is also full, then we try (hash(x) + 3) % S
...
Constant time on average means that if a set has collisions, it should check all elements with the same hash to return the correct one. Thus, the more collisions a set has, the less performant getting of its elements is
A good hash function should distribute elements as best as it can. So, on collision, when a hash table uses open addressing with linear probing, and the hash function is good, it really will take just the next slot
To be stored in a set, each object must conform to Hashable (and also Equatable) protocol. A set stores elements by hash and verifies their uniqueness by == function
struct Point {
let x: Int
let y: Int
}extension Point: Hashable {
func hash(into hasher: inout Hasher) {
hasher.combine(x)
}
}let p1 = Point(x: 1, y: 0)
let p2 = Point(x: 1, y: 1)p1.hashValue == p2.hashValue // true
p1 == p2 // falsevar set: Set<Point> = [p1, p2] // [p1, p2]// vsstruct Point {
let x: Int
let y: Int
}extension Point: Hashable {
func hash(into hasher: inout Hasher) {
hasher.combine(x)
} static func ==(lhs: Point, rhs: Point) -> Bool {
return lhs.hashValue == rhs.hashValue
// by default for structures, Swift compares properties
// instead
}
}let p1 = Point(x: 1, y: 0)
let p2 = Point(x: 1, y: 1)p1.hashValue == p2.hashValue // true
p1 == p2 // truevar set: Set<Point> = [p1, p2] // [p1]
In previous versions (less than 4.1) of Swift, conforming to the protocols took quite a bit of boilerplate code to satisfy the requirements for storing a custom type in a set. So, developers had to apply XOR overall object properties when computing their hash. Also XOR is commutative (a ^ b == b ^ a) operation. There was quite a high chance to get a hash collision
struct Color {
let red: UInt8
let green: UInt8
let blue: UInt8
}// Swift < 4.1extension Color: Equatable {
static func ==(lhs: Color, rhs: Color) -> Bool {
return lhs.red == rhs.red &&
lhs.green == rhs.green &&
lhs.blue == rhs.blue
}
}extension Color: Hashable {
var hashValue: Int {
return red.hashValue ^ green.hashValue ^ blue.hashValue
}
}let cyan = Color(red: 0x00, green: 0xFF, blue: 0xFF)
let yellow = Color(red: 0xFF, green: 0xFF, blue: 0x00)cyan.hashValue == yellow.hashValue // true, a collision
In Swift 4.1, the compiler automatically synthesizes conformance to the Equatable and Hashable protocols for types, which adopt these protocols in their declaration, if all their members also conform to those protocols (works for structures only, for classes you must always override hashValue by yourself). And default implementation calls the standard library’s _mixInt function on each member’s hash value and then combine them with exclusive-or (^) — the same hash collision problem
// Swift == 4.1struct Color {
let red: UInt8
let green: UInt8
let blue: UInt8
}let cyan = Color(red: 0x00, green: 0xFF, blue: 0xFF)
let yellow = Color(red: 0xFF, green: 0xFF, blue: 0x00)cyan.hashValue == yellow.hashValue // true, a collision
// Swift >= 4.2struct Color: Hashable {
let red: UInt8
let green: UInt8
let blue: UInt8// synthesized by the compiler
// func hash(into hasher: inout Hasher) {
// hasher.combine(red)
// hasher.combine(green)
// hasher.combine(blue)
// }// var hashValue: Int {
// var hasher = Hasher()
// hash(into: &hasher)
// return hasher.finalize()
// }}let cyan = Color(red: 0x00, green: 0xFF, blue: 0xFF)
let yellow = Color(red: 0xFF, green: 0xFF, blue: 0x00)cyan.hashValue == yellow.hashValue // false, no collision
Today, if you want to customize how your type implements Hashable, you can override the hash(into:) method instead of hashValue. The hash(into:) method takes a Hasher object as in-out, upon which you make combine(_:) method call to add the essential state information of your type
Again, all properties must be hashable to be able to use functionality generated by the compiler. Otherwise, you must override a hash function yourself
Code generation, in this case, works for structures only. Classes require hash function override anyway. Otherwise, you will get the compile-time error
class Color: Hashable {
let red: UInt8
let green: UInt8
let blue: UInt8 init(red: UInt8, green: UInt8, blue: UInt8) {
self.red = red
self.green = green
self.blue = blue
} func hash(into hasher: inout Hasher) {
hasher.combine(red)
hasher.combine(green)
hasher.combine(blue)
}
}
// Creating a setSet<Int>() // []
var set: Set<Int> = Set() // []
var set2: Set<Int> = [] // []
var set3: Set = [1, 2, 3, 4, 5] // [1, 2, 3, 4, 5]
Sets have a fixed amount of elements they can store before allocating new memory for more elements — just like dynamic arrays. But allocation here is more costly because along with copying all elements and putting them to the new memory region (like arrays do), sets also rehash each element
var set: Set = [1, 2, 3, 4, 5] // [1, 2, 3, 4, 5]
// count = 5, capacity = 6set.insert(6)
// count = 6, capacity = 6set.insert(7)
// count = 7, capacity = 12
The reserveCapacity(_:) method allows to set a minimal capacity for elements. It means that if you specify capacity that equals 30, you can get capacity that will equal 48
var set: Set<Int> = []
set.reserveCapacity(30) // capacity = 48
// Adding elementsvar set: Set = [1, 2, 3] // [1, 2, 3]
set.insert(4) // (true, 4), [1, 2, 3, 4]
set.insert(4) // (false, 4), [1, 2, 3, 4]
// Removing elementsvar set: Set = [1, 2, 3]
// [1, 2, 3]set.remove(3)
// 3, [1, 2]set.removeFirst()
// 2, [1], the element may not be the first element, added to the setset.removeAll(keepingCapacity: true)
// capacity = 3, []set.removeAll()
// capacity = 0, []
// Modifying elements
// (these functions mutate the original set)var set: Set = [1, 2, 3]
// [1, 2, 3]set.formUnion([4, 5, 6])
// set is [2, 3, 6, 5, 4, 1]set.formIntersection([1, 2, 3])
// set is [1, 2, 3], removes elements,
// which are not included to [1, 2, 3]set.formSymmetricDifference([1, 2, 4])
// set is [3, 4], difference between [1, 2, 3] and [1, 2, 4]set.subtract([3])
// set is [4]
// Set operations (these functions create new set)var set: Set = [1, 2, 3] // [1, 2, 3]
set.union([4, 5, 6]) // [5, 3, 4, 1, 6, 2]
set.intersection([1, 2]) // [2, 1]
set.symmetricDifference([1, 2, 4]) // [4, 3]
set.subtracting([3]) // [1, 2]
// Comparing elementsvar set1: Set = [1, 2, 3, 4, 5]
var set2: Set = [5, 4, 3, 2, 1] set1 == set2 // true
set1.elementsEqual(set2) // false, the order isn't the same
// Comparing elementsvar set: Set = [1, 2, 3, 4, 5]set.isSubset(of: [1, 2, 3, 4, 5, 6])
// trueset.isStrictSubset(of: [1, 2, 3, 4, 5, 6])
// trueset.isStrictSubset(of: [1, 2, 3, 4, 5])
// false, must have different elementset.isSuperset(of: [1, 2, 3])
// trueset.isStrictSuperset(of: [1, 2, 3])
// trueset.isStrictSuperset(of: [1, 2, 3, 4, 5])
// false, must have different elementset.isDisjoint(with: [6, 7])
// true
Dictionary
Dictionary is an unordered collection of key-value pairs
Dictionary, like Set, represents a hash table. Keys of a dictionary must conform to Hashable (described in Set)
// Creating a dictionaryDictionary<Int, String>()
var dictionary: Dictionary<Int, String> = Dictionary()
var dictionary2: Dictionary<Int, String> = [:]
var dictionary3 = [1: "One", 2: "Two", 3: "Three"]
Like an array and set, a dictionary has a fixed amount of elements it can store before allocating new memory. When the capacity gets exceeded, the dictionary allocates a new storage and copies all elements there. Since any dictionary is a hash table, it also rehashes all items when copying them to a new memory region
var dictionary = [1: "One", 2: "Two", 3: "Three"]
// count = 3, capacity = 3dictionary[4] = "Four"
// count = 4, capacity = 6
// Manipulating dictionary elementsvar dictionary = [1: "One", 2: "Two", 3: "Three"]
// [2: "Two", 1: "One", 3: "Three"]dictionary[1]
// Optional("One")dictionary[4] = "Four"
// [2: "Two", 1: "One", 3: "Three", 4: "Four"]dictionary[2] = nil
// [1: "One", 3: "Three", 4: "Four"]dictionary[4]?.append("!")
// [1: "One", 3: "Three", 4: "Four!"]
// Iterating over the contents of a dictionaryvar dictionary = [1: "One", 2: "Two", 3: "Three"]for (key, value) in dictionary {
print("\(key): \(value)")
}