Swift.org - Swift Everywhere: Using Interoperability to Build on Windows

Saleem Abdulrasool is a member of the Swift Core Team and a Software Engineer at The Browser Company. He previously worked at Google Brain, Facebook, and Microsoft, and currently focuses on cross-platform and embedded Swift.

This post was originally published at Speaking in Swift by The Browser Company under the title “Interoperability: Swift’s Super Power”.

Swift’s deliberate design choices over the years have resulted in a language that showcases how flexibility and compatibility do not need to come at the cost of usability. One of these design choices was Swift’s focus on native interoperability with other languages. The flexibility that this enables makes it a joy to build rich, native experiences in Swift across a variety of environments.

Traditionally, when two languages need to interoperate, the function calls at the boundary between them, also known as the Foreign Function Interface (FFI), go through C using a library like libffi. This approach has drawbacks, such as runtime performance costs and possibly extra boilerplate code. Instead, Swift embeds a copy of clang, the C and C++ compiler, which can translate directly between the languages, avoiding penalties in code size and runtime performance. This level of interoperability composes wonderfully with existing systems and enables building complex software atop existing C libraries.
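As a small illustration of this direct translation, the sketch below calls the C standard library function strlen from Swift with no hand-written binding or wrapper. The conditional imports are an assumption about which C library module the host platform provides (Darwin, Glibc, Musl, or ucrt on Windows); they are not taken from the original post.

```swift
// Pick whichever C library module the current platform provides.
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(Musl)
import Musl
#elseif canImport(ucrt)
import ucrt
#endif

let message = "Hello, Windows!"

// strlen is declared in a C header and imported by the embedded clang.
// Swift passes the String as a temporary NUL-terminated C string for the call.
let length = strlen(message)

print("strlen reports \(length) bytes")
```

The call site looks like any other Swift function call: the compiler reads the C declaration itself, so there is no generated glue code to maintain.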

The Windows API

When building rich, native applications, one important use case of interoperability is the ability to invoke platform-specific APIs. The Windows API surface reflects its extensive history; the requirement to maintain backwards compatibility has resulted in the accretion of APIs of different shapes. As such, a significant portion of the API is old and low-level enough to be defined in C.

Since Swift uses clang rather than libffi to access C functions and data types, the Swift compiler uses a feature of clang known as (header) modules. Clang modules bundle a set of declarations together, identifying which declarations belong to a particular library, what other modules it may depend on, and what language the declarations are for. This is done by introducing an auxiliary file named module.modulemap which contains the definition of the module.

As such, to access the Windows APIs, we must modularize the Windows SDK into one or more clang modules. Fortunately, this is not just a theoretical idea. The Swift toolchain contains a module definition for the Windows SDK in the form of the WinSDK clang module. To further refine these definitions, a Swift module overlays the clang definitions to provide more Swift-friendly definitions in some cases. This exposes the C API surface area of the Windows SDK and, although it does not contain all of the more modern APIs, it enables us to build a variety of command line and GUI apps on Windows.
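A minimal sketch of what using the WinSDK module can look like is shown here. GetCurrentProcessId is a C function from the Windows SDK; the helper name currentProcessDescription is hypothetical and exists only for this example, as does the fallback branch that lets the snippet compile on other platforms.

```swift
#if os(Windows)
import WinSDK
#endif

// Hypothetical helper for illustration: describes the current process
// using a C function imported through the WinSDK clang module.
func currentProcessDescription() -> String {
    #if os(Windows)
    // DWORD is imported into Swift as a 32-bit unsigned integer.
    let pid: DWORD = GetCurrentProcessId()
    return "Running as Windows process \(pid)"
    #else
    return "WinSDK is only available when targeting Windows"
    #endif
}

print(currentProcessDescription())
```

No module map or bridging header has to be written by the application author in this case, since the toolchain already ships the module definition.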

A GUI application using Swift/Win32, which provides a layer of Swift syntactic conveniences for older, C-based Windows UI APIs.

Modern APIs aren’t exposed using just C, however, and there are large portions of the Windows SDK which are exposed as C++. Realising there is a vast software ecosystem of C++ code that Swift developers may want access to, Swift 5.9 has introduced support for extending its language-level interoperability to C++. Although virtual methods and move-only types are not yet available, as Swift’s C++ interoperability matures, the native platform API surface available to Swift will also grow to include the majority of the C++ APIs in the Windows SDK.

This C++ Interop enables a new set of libraries, beyond just the platform APIs, to become available to Swift. This allows Swift code to also take advantage of a variety of high-performance, cross-platform libraries written over decades by the C++ community. Firebase, for example, is a commonly used cloud computing service and is used in many modern products, including The Browser Company’s browser, Arc. Although there is a Swift SDK for Firebase, it is limited to the Apple platforms and is based on Objective-C. However, there is also a cross-platform C++ SDK available. Now with C++ Interop, it is possible to expose this C++ SDK to Swift clients. Such a bridge is being built up with swift-firebase. Taking advantage of these C++ libraries enables cross-platform Swift software that would be difficult to build otherwise.

Component Object Model (COM)

While libraries are one mechanism for sharing code, they are not the only approach. Another style of code sharing is possible via inter-process communication (IPC), which allows two separate applications to communicate with each other and expose functionality to each other. One implementation of this technique that is prevalent on Windows is known as COM (Component Object Model).

Microsoft explored this idea at a higher level in 1990, evolving DDE (Dynamic Data Exchange) into “Object Linking and Embedding” or OLE. The approach was to enable sharing of custom document handlers which could be embedded into new applications without having to rewrite parsers and renderers for the formats. To share the implementation of applications across processes, an application could implement well-defined interfaces (e.g. IOleObject) that could be consumed by other processes. Eventually, OLE would evolve into what would come to be known as the Component Object Model, or COM.

COM’s design was flexible and powerful, and resulted in it being adopted as a common design pattern across a multitude of environments. CoreFoundation adopted it for its plugin model. CFLite, and various forks thereof, brought an implementation of COM to Linux. XPCOM (Cross-Platform Component Object Model) is similar to COM and would gain popularity through Mozilla’s extensive usage, as would Open Office’s UNO technology. The model even found its way into driver development, with the IOKit framework using a COM-based model for kernel drivers.

At COM’s core is the idea of defining interfaces (which is normally done in the Interface Definition Language or IDL) that expose functionality either through a library in the same address space or to another process through IPC. Interfaces are identified by globally unique Interface IDs, and all inherit from a base interface called IUnknown. IUnknown exposes the two fundamental operations of COM:

object lifetime management
access to the object’s functionality

Similar to Swift, object lifetime management is implemented through reference counting, exposed in COM via the AddRef and Release methods. Access to the object’s functionality is implemented via the QueryInterface method, allowing consumers to dynamically request the object’s functionality. Because consumers dynamically query for a specific COM interface, we cannot statically identify the operations at build time. However, the cost is limited to a couple of pointer indirections, similar to C++’s virtual methods, which gives COM a negligible performance overhead.
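The shape of these operations can be sketched in plain Swift. This is a simplified model written for illustration, not the real COM ABI: interface IDs are strings rather than 128-bit GUIDs, the names UnknownLike, Greeter, GreeterObject, and IID are hypothetical, and the object’s memory is still managed by Swift rather than being freed when the count reaches zero.

```swift
// Stand-ins for globally unique interface IDs (real COM uses 128-bit GUIDs).
enum IID {
    static let unknown = "IUnknown"
    static let greeter = "IGreeter"
}

// Models the three IUnknown operations described above.
protocol UnknownLike: AnyObject {
    func queryInterface(_ iid: String) -> UnknownLike?
    func addRef() -> Int
    func release() -> Int
}

// A hypothetical interface that extends the base interface.
protocol Greeter: UnknownLike {
    func greet() -> String
}

final class GreeterObject: Greeter {
    // The creator holds the first reference.
    private(set) var referenceCount = 1

    func queryInterface(_ iid: String) -> UnknownLike? {
        // Unsupported interfaces are refused (E_NOINTERFACE in real COM).
        guard iid == IID.unknown || iid == IID.greeter else { return nil }
        _ = addRef()  // a successful query hands out a new reference
        return self
    }

    func addRef() -> Int {
        referenceCount += 1
        return referenceCount
    }

    func release() -> Int {
        // A real COM object would destroy itself when this reaches zero.
        referenceCount -= 1
        return referenceCount
    }

    func greet() -> String { "Hello from a COM-style object" }
}

let object: UnknownLike = GreeterObject()

// The consumer discovers functionality at run time by interface ID.
var greeting = ""
if let greeter = object.queryInterface(IID.greeter) as? Greeter {
    greeting = greeter.greet()
    _ = greeter.release()  // balance the reference taken by the query
}

let unsupported = object.queryInterface("IMissing")
let remaining = object.release()  // drop the creator's reference

print(greeting, unsupported == nil, remaining)
```

The consumer only ever holds the base interface and asks for more specific ones by ID, which is why the available operations cannot be known at build time in this model either.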