Introduction
The some keyword was introduced in Swift 5.1, and the any keyword followed in Swift 5.6. Since Swift 5.7, some can also be used in function parameters, so both keywords can now appear in parameter positions and in type annotations, as follows:
protocol P {}
struct S: P {}
// 'any P' is an explicit existential type.
let p: any P = S()
// 'some P' is an opaque type.
let q: some P = S()
func f(_ p: any P) {}
func f(_ p: some P) {}
In Swift 5, the any keyword can be used to explicitly denote an existential type. The original plan was to require this explicit spelling for all existential types starting with Swift 6; as of Swift 6, the requirement is opt-in through the ExistentialAny upcoming feature flag.
An opaque type helps describe the expected return type without defining a specific concrete type. This way, the compiler can get access to the actual type information and can potentially perform optimizations in this context.
Existential types incur notably higher costs than concrete types. An existential type can store any value that conforms to the protocol, and the type of the stored value can change dynamically, so values too large for the container’s inline buffer require dynamic memory allocation. Code that uses existential types also incurs pointer indirection and dynamic method dispatch, which the compiler generally cannot optimize away.
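To make the contrast concrete, here is a minimal sketch of the same function written three ways. The Greeter protocol, the English struct, and the function names are hypothetical and exist only for illustration; the some parameter spelling assumes Swift 5.7 or later.

```swift
protocol Greeter {
    func greet() -> String
}

struct English: Greeter {
    func greet() -> String { "hello" }
}

// Classic generic parameter: the concrete type is known at compile time.
func viaGeneric<T: Greeter>(_ greeter: T) -> String {
    greeter.greet()
}

// Opaque parameter: shorthand for the generic version above.
func viaSome(_ greeter: some Greeter) -> String {
    greeter.greet()
}

// Existential parameter: the value is boxed and dispatched dynamically.
func viaAny(_ greeter: any Greeter) -> String {
    greeter.greet()
}

let results = [viaGeneric(English()), viaSome(English()), viaAny(English())]
print(results)
```

All three calls produce the same result; the difference lies in how much type information the compiler keeps at the call site.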
Understanding the some Keyword
The some keyword represents an opaque type. An opaque result type is an implicit generic placeholder satisfied by the implementation, as in this example:
protocol P {}
struct S1 : P {}
func f() -> some P {
    return S1()
}
The key takeaway here is that a function declared to return some P always returns a value of a single concrete type conforming to P. If the function attempts to return different conforming types, the compiler reports an error, because the implicit generic placeholder cannot be satisfied by multiple types.
struct F1: P {}
struct F2: P {}
// error: Function declares an opaque return type, but the return
// statements in its body do not have matching underlying types.
func f(_ value: Bool) -> some P {
    if value {
        return F1()
    } else {
        return F2()
    }
}
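If a function really does need to return different conforming types, one option is to return an existential instead, at the cost of losing the static type. A minimal sketch, using hypothetical Shape, Circle, and Square types that are not part of the examples above:

```swift
protocol Shape {
    func name() -> String
}

struct Circle: Shape {
    func name() -> String { "circle" }
}

struct Square: Shape {
    func name() -> String { "square" }
}

// Returning `any Shape` lets each branch produce a different concrete type.
func makeShape(round: Bool) -> any Shape {
    if round {
        return Circle()
    } else {
        return Square()
    }
}

let names = [makeShape(round: true).name(), makeShape(round: false).name()]
print(names)
```

The caller now only knows that it holds some value conforming to Shape, so each call to name() is dispatched dynamically.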
Let’s consider the benefits that opaque types offer over protocol return types.
- Opaque result types can be used with PATs. Before Swift 5.7, protocols with associated types (PATs) could not be used as types at all, and even now they must be spelled with any, with the associated types erased. This means that the following code doesn’t compile:
func collection() -> Collection { return ["1", "2", "3"] }
As for opaque types, they are merely generic placeholders that can be used in such scenarios:
// protocol Collection<Element> : Sequence
func collection() -> some Collection {
    return ["1", "2", "3"]
}
- Opaque result types have identity. Because opaque types guarantee that only one type will be returned, the compiler knows that a function must return the same type on several calls:
func method() -> some Equatable {
    return "method"
}
let x = method()
let y = method()
print(x == y) // true
- Opaque result types compose with generic placeholders. Contrary to conventional protocol-typed values, opaque result types integrate effectively with standard generic placeholders. For instance:
protocol P {
    var message: String { get }
}
struct M: P {
    var message: String
}
func makeM() -> some P {
    return M(message: "message")
}
func bar<T: P, U: P>(_ p1: T, _ p2: U) -> Bool {
    return p1.message == p2.message
}
let m1 = makeM()
let m2 = makeM()
print(bar(m1, m2))
However, this doesn’t work if makeM() returns different types that conform to P:
protocol P {
    var message: String { get }
}
struct M: P {
    var message: String
}
struct T: P {
    var message: String
}
// error: function declares an opaque return type 'some P', but the return
// statements in its body do not have matching underlying types
func makeM() -> some P {
    if Bool.random() {
        return M(message: "M message")
    } else {
        return T(message: "T message")
    }
}
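Returning to the first point about PATs: on Swift 5.7 and later, an opaque result type can also constrain the protocol’s primary associated type, so callers keep the element type. A small sketch, assuming Swift 5.7 or later; the function name strings is hypothetical:

```swift
// The caller does not learn the concrete collection type,
// but it does know that the elements are String values.
func strings() -> some Collection<String> {
    return ["1", "2", "3"]
}

let values = strings()
// joined(separator:) is available because Element is known to be String.
let joined = values.joined(separator: ",")
print(values.count, joined)
```

With a plain some Collection result, the element type would stay opaque to the caller as well.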
Understanding the any Keyword
Let’s consider the following example. Suppose we have a Drawable protocol and two concrete implementations of this protocol.
protocol Drawable {
    func draw()
}
struct Line: Drawable {
    let x1: Int
    let y1: Int
    let x2: Int
    let y2: Int
    func draw() {
        print("Draw Line")
    }
}
struct Point: Drawable {
    let x: Int
    let y: Int
    func draw() {
        print("Point")
    }
}
We have two structs, Line and Point. Let’s create a variable and store a Drawable value in it:
var p1: any Drawable = Line(x1: 0, y1: 0, x2: 5, y2: 5) // 'any Drawable' is an explicit existential type
p1.draw() // print "Draw Line"
p1 = Point(x: 0, y: 0)
p1.draw() // print "Point"
We can switch between different implementations during runtime. Let’s consider another example:
let array: [any Drawable] = [
    Line(x1: 0, y1: 0, x2: 5, y2: 5),
    Line(x1: 0, y1: 0, x2: 5, y2: 5),
    Point(x: 0, y: 0)
]
As we know, random access to any element of an array in constant time is possible because every element has the same size in memory. In our example, however, we store values of different sizes, so you might wonder how an element can be retrieved from such an array.
print(MemoryLayout<Line>.size) // 32
print(MemoryLayout<Point>.size) // 16
This is possible because any indicates that we are working with existential containers, which all have the same size regardless of the value they hold. The existential container consists of five machine words: three for storing the value or a pointer to it, one for a pointer to the Value Witness Table, and one for a pointer to the Protocol Witness Table. (In the actual ABI, the fourth word holds a pointer to the type metadata, through which the Value Witness Table is reached.)
The existential container takes 5 machine words (on a 64-bit system, 5 * 8 = 40 bytes):
- Value buffer is space for the instance
- VWT is a pointer to Value Witness Table
- PWT is a pointer to Protocol Witness Table
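The fixed container size can be checked with MemoryLayout. The sketch below redeclares simplified versions of Drawable, Line, and Point so that it is self-contained, and expresses sizes in machine words rather than bytes so the comparison does not depend on the platform:

```swift
protocol Drawable {
    func draw()
}

struct Line: Drawable {
    let x1, y1, x2, y2: Int
    func draw() {}
}

struct Point: Drawable {
    let x, y: Int
    func draw() {}
}

// Int is one machine word wide on Swift's supported platforms.
let word = MemoryLayout<Int>.size

let lineWords = MemoryLayout<Line>.size / word
let pointWords = MemoryLayout<Point>.size / word
// The existential container has the same size whatever it stores.
let containerWords = MemoryLayout<any Drawable>.size / word
let strideWords = MemoryLayout<any Drawable>.stride / word

print(lineWords, pointWords, containerWords, strideWords)
```

Because every element of [any Drawable] occupies the same number of words, indexing into the array stays a constant-time operation.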
Whenever a value of a protocol type is used in code, the compiler generates a box called an existential container, one box per value. Since the container can store any value whose type conforms to the protocol, and the type of the stored value can change dynamically, it requires dynamic memory allocation unless the value is small enough to fit into the 3-word buffer. In addition to heap allocation and reference counting, every use of an existential container involves pointer indirection and dynamic dispatch, which the compiler generally cannot optimize away.
The three buffer words hold the value inline if it fits into 3 machine words. If the value is larger, an ARC-managed box is created, the value is copied into the box, and a pointer to the box is stored in the first word of the buffer; the remaining two buffer words are not used. The words after the buffer point to the Value Witness Table (VWT) and the Protocol Witness Table (PWT), respectively.
The container holds one Protocol Witness Table pointer for each protocol statically associated with the existential type. If the type of a value is, say, any P & Q, there will be two entries, one for each protocol. As long as we have the Protocol Witness Table (PWT) that describes the type’s conformance to the protocol, we can create an existential container, pass it around, and use it for dynamic dispatch.
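The extra entry for a protocol composition shows up in the container size. A minimal sketch, with hypothetical Named and Aged protocols standing in for P and Q, and sizes again measured in machine words:

```swift
protocol Named {
    var name: String { get }
}

protocol Aged {
    var age: Int { get }
}

// A composition of two protocols needs two witness table pointers.
typealias NamedAndAged = any Named & Aged

let word = MemoryLayout<Int>.size
let singleWords = MemoryLayout<any Named>.size / word
let composedWords = MemoryLayout<NamedAndAged>.size / word

print(singleWords, composedWords)
```

The composed container is one word larger than the single-protocol one, which matches one additional witness table pointer.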
So, we have an existential container and a protocol witness table. However, we don’t yet know where the value is stored: directly in the container, or on the heap with only a reference kept in the container. To answer this question, we turn to an auxiliary table called the Value Witness Table (VWT). This table contains function pointers for allocate, copy, destruct, and deallocate. Every type has such a table, and it is responsible for creating instances of the type, deciding whether the value is stored inline or on the heap, and so on.
Now we have a fixed-size container, which solves the issue of passing parameters of heterogeneous sizes, and a way to move this container. The remaining question is, what do we do with it?
Well, at the very least, we want to be able to invoke the necessary members of the stored value in the container, as specified in the protocol (initializers, properties — whether stored or computed, functions, and subscripts).
However, each corresponding type can implement these members differently. For instance, some may satisfy the method requirement by directly defining it, while another, say a class, may satisfy it by inheriting the method from a superclass. Some may implement a property as a stored property, others as a computed one, and some retroactively (via an extension).
Handling all this diversity is the primary purpose of the Protocol Witness Table (PWT). There is precisely one such table for each protocol conformance. It contains a set of function pointers that point to the implementations of the protocol’s requirements. While the functions may be located in different places, the location of the PWT is unique for the conformance it describes. As a result, the caller can inspect the PWT of the container, use it to locate the implementation of the protocol’s methods, and invoke them.
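The dispatch mechanism described above can be imitated by hand. The following is a deliberately simplified, hypothetical model and not the real runtime layout: ManualDrawable plays the role of the container, its storage property stands in for the value buffer, and a single closure stands in for one PWT entry.

```swift
// Hypothetical hand-written stand-in for an existential container.
struct ManualDrawable {
    private let storage: Any                 // stands in for the value buffer
    private let drawWitness: (Any) -> String // stands in for one PWT entry

    init<T>(_ value: T, draw: @escaping (T) -> String) {
        storage = value
        // The witness recovers the concrete type before calling the implementation.
        drawWitness = { draw($0 as! T) }
    }

    // The caller only sees the container and goes through the witness.
    func draw() -> String {
        drawWitness(storage)
    }
}

struct Line { let length: Int }
struct Point { let x: Int; let y: Int }

let shapes = [
    ManualDrawable(Line(length: 5)) { "line of length \($0.length)" },
    ManualDrawable(Point(x: 1, y: 2)) { "point at (\($0.x), \($0.y))" }
]

let output = shapes.map { $0.draw() }
print(output)
```

Each element carries its own implementation pointer, so the call site never needs to know which concrete type it is dealing with.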
It’s important to know that existentials are relatively expensive to use because the compiler and runtime can’t pre-determine how much memory should be allocated for the concrete object that will fill in the existential.
Summary
We considered the differences between the some and any keywords. On the one hand, they significantly improve the syntax and readability of our generic code; on the other hand, they open up new ways for us to write generic code more efficiently.
Thanks for reading.