CI: Thinning iOS Build Artifacts

1. Introduction

1.1. Pipeline Time Improvement

As engineers, we always want to land our changes on master as quickly as possible. Apart from the time it takes to resolve code review comments from peers, there’s one more constraint engineers face before getting their changes merged: the CI pipeline time (i.e. the time it takes for a CI pipeline to run against a given change).

For iOS development, a typical pre-merge CI pipeline involves building the project, then running unit tests and UI tests. The engineering work to reduce pipeline time can therefore be broken down into 2 major problems: build time improvement and test time improvement. While build time improvement is a classic problem tackled by many initiatives (mostly driven by community efforts), there is still a lot of room for test time improvement.

1.2. Separation of Build and Test Jobs in iOS

I once shared a tip to utilize test parallelism within a single CI job (see: Tackling UI tests execution time imbalance for Xcode parallel testing). Parallelism among CI jobs can be achieved by splitting tests into smaller sets and running them in multiple CI jobs. The pipeline we want has the following form:

[Figure: the desired pipeline, with build and test work split into separate CI jobs]

However, the separation of build and test jobs in iOS is not that straightforward because of 2 prominent factors:

(F1) It’s not feasible to obtain the list of tests we have at compile time. We can only use heuristic approaches to extract such info.

(F2) Running unit/UI tests requires some build products from the build job. This is a hard constraint for platforms with statically-typed languages. What makes it harder for iOS development is that the iOS build artifacts are relatively big. Unfortunately, the size of artifacts we pass from one job to another is constrained by the size limit set by the CI infra.
The size of an iOS project grows in proportion to not only the number of lines of code but also the number of targets we have.

This post introduces a tip to overcome the artifacts size constraint in (F2).

2. A Closer Look at iOS Build Artifacts

Now, let’s take a look at a project that has both hosted and non-hosted test targets. Assume we’re using CocoaPods to manage dependencies in the project. When we build the project, the build products folder has the following structure.

|-- App.app / -- App (*)
|           |
|           |-- Frameworks / -- DynamicFW_A.framework
|           |               |-- DynamicFW_B.framework / -- DynamicFW_B.bundle
|           |
|           |-- Plugins / -- HostedTestTarget.xctest
|           |
|           |-- StaticFW_C.bundle
|
|-- NonHostedTestTarget.xctest

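The .app bundles and .xctest bundles are both executable bundles. They have the same folder structure. Inside such a bundle, as the tree above shows, we find the executable binary (marked with (*)), a Frameworks folder holding the dynamic frameworks used by the target, a Plugins folder holding the hosted test bundles (for an app bundle), and the resource bundles of static frameworks (such as StaticFW_C.bundle).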

3. Duplicated Contents

3.1. Dynamic Frameworks

From the structure above, we can easily spot the duplication: if a dynamic framework is used in both the app and a test target, that framework exists in 2 places, once in the Frameworks folder of the app bundle and once in the Frameworks folder of the test bundle.

In general, if a framework is used in N targets, it appears N times in the executable bundles. If we check the checksums of the frameworks in those executable bundles, they are all identical.
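One way to verify this observation is to hash every embedded framework and group the results. The sketch below is only an illustration: the function names `bundle_checksum` and `find_duplicate_frameworks` are hypothetical, and hashing each file's relative path plus its bytes with SHA-256 is an assumed definition of "checksum" (any stable digest would do).

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def bundle_checksum(bundle: Path) -> str:
    """Hash the relative path and bytes of every file in the bundle, in sorted order."""
    digest = hashlib.sha256()
    # Assumption: bundles are flat iOS-style bundles; symlinks get no special handling.
    for path in sorted(p for p in bundle.rglob("*") if p.is_file()):
        digest.update(path.relative_to(bundle).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def find_duplicate_frameworks(products_dir: Path) -> dict[str, list[Path]]:
    """Group embedded frameworks (those inside a Frameworks folder) by checksum."""
    groups = defaultdict(list)
    for framework in sorted(products_dir.rglob("*.framework")):
        if framework.is_dir() and framework.parent.name == "Frameworks":
            groups[bundle_checksum(framework)].append(framework)
    # Keep only the checksums that occur in more than one place.
    return {checksum: paths for checksum, paths in groups.items() if len(paths) > 1}
```

Running `find_duplicate_frameworks(Path("Debug-iphonesimulator"))` on a build products folder should return one group per framework that is embedded in several executable bundles.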

To understand why this happens, we can take a look at how CocoaPods integrates frameworks into the project.

For dynamic frameworks added to a target, CocoaPods adds a build phase called [CP] Embed Pods Frameworks at the end of the target build phases.

This build phase executes a script "${PODS_ROOT}/Target Support Files/Pods-App/Pods-App-frameworks.sh" that copies all dynamic frameworks (managed by CocoaPods) belonging to the target into the Frameworks folder inside the executable bundle of that target. Those frameworks are copied from the framework build products located in the same folder as the executable bundle.

Debug-iphonesimulator / -- App.app / -- Frameworks / -- Dynamic_A.framework
                       |
                       |-- Dynamic_A / -- Dynamic_A.framework 👈 👈 👈

One thing worth mentioning is that the checksum of the framework in the framework build products (e.g. Debug-iphonesimulator/Dynamic_A) is different from the one in the Frameworks folder of the app bundle. This is because CocoaPods strips some unnecessary content from frameworks while copying them to the app bundles. The stripped content includes the Headers, PrivateHeaders, and Modules folders inside the .framework bundle, among others.

install_framework()
{
  ...
  rsync --delete -av "${RSYNC_PROTECT_TMP_FILES[@]}" --links --filter "- CVS/" --filter "- .svn/" --filter "- .git/" --filter "- .hg/" --filter "- Headers" --filter "- PrivateHeaders" --filter "- Modules" "${source}" "${destination}"
  ...
}
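To see why the embedded copy no longer matches its source, it can help to mimic the filters in isolation. The following is a rough Python stand-in for the rsync call above, not what CocoaPods actually runs: `embed_framework` is a hypothetical name, and `shutil.ignore_patterns` matches both files and folders by name, which is slightly broader than the directory-only rsync filters such as "- CVS/".

```python
import shutil
from pathlib import Path

# Names excluded by the rsync filters in install_framework().
STRIPPED_NAMES = ("CVS", ".svn", ".git", ".hg", "Headers", "PrivateHeaders", "Modules")


def embed_framework(source: Path, destination_dir: Path) -> Path:
    """Copy a .framework bundle into destination_dir, leaving out the stripped names."""
    target = destination_dir / source.name
    if target.exists():
        # Start from a clean copy, roughly what rsync --delete achieves for stale files.
        shutil.rmtree(target)
    shutil.copytree(source, target, ignore=shutil.ignore_patterns(*STRIPPED_NAMES))
    return target
```

Because the copy lacks Headers, PrivateHeaders, and Modules, any checksum computed over the whole bundle differs from that of the source bundle, even though the framework binary itself is copied unchanged by this step.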

3.2. Resources/Resource Bundles of Static Frameworks

For pods integrated as static frameworks, their resources and resource bundles are copied to the executable bundles. These contents get duplicated in the same way dynamic frameworks do.

App.app / -- App (*)
         |
         |-- StaticFW_C.bundle
         |
         |-- ResourcesOfStaticFW_D / -- an_image.png

4. Reducing Artifacts Based on Duplications

Based on the observations above, we can thin the artifacts by moving the bundle contents into a shared storage, where each bundle is stored once and identified by its checksum.

To keep track of where each bundle came from, we need a mapping from the original location of a bundle to its place in the storage. This way, after thinning the artifacts, we can easily restore them to their original state, with the integrity of the contents preserved.

storage/ -- <hash_a>-A.framework
        |-- <hash_b>-B.framework
        |-- <hash_c>-C.framework
        |-- ...
        |-- storage.json # Contains the mapping
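The thinning step could be sketched as follows. This is a minimal illustration under several assumptions: the names `thin`, `find_bundles`, and `bundle_checksum` are hypothetical, the storage folder lives outside the build products folder, only .framework and .bundle folders are deduplicated, and storage.json is assumed to be a flat JSON object mapping each original relative path to a storage entry name (the actual format is up to you).

```python
import hashlib
import json
import shutil
from pathlib import Path

BUNDLE_SUFFIXES = {".framework", ".bundle"}  # assumption: the bundle kinds worth storing


def bundle_checksum(bundle: Path) -> str:
    """Hash the relative path and bytes of every file in the bundle, in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in bundle.rglob("*") if p.is_file()):
        digest.update(path.relative_to(bundle).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def find_bundles(root: Path) -> list[Path]:
    """Return the outermost bundles under root; nested bundles travel with their parent."""
    found = []
    for child in sorted(root.iterdir()):
        if not child.is_dir():
            continue
        if child.suffix in BUNDLE_SUFFIXES:
            found.append(child)
        else:
            found.extend(find_bundles(child))  # descend into .app, .xctest, Frameworks, ...
    return found


def thin(products_dir: Path, storage_dir: Path) -> dict[str, str]:
    """Move each bundle into storage once per checksum and record its original location."""
    storage_dir.mkdir(parents=True, exist_ok=True)
    mapping = {}
    for bundle in find_bundles(products_dir):
        entry = f"{bundle_checksum(bundle)}-{bundle.name}"
        stored = storage_dir / entry
        if stored.exists():
            shutil.rmtree(bundle)  # duplicate: identical contents are already stored
        else:
            shutil.move(str(bundle), str(stored))
        mapping[bundle.relative_to(products_dir).as_posix()] = entry
    (storage_dir / "storage.json").write_text(json.dumps(mapping, indent=2, sort_keys=True))
    return mapping
```

After `thin(products_dir, storage_dir)` runs, the build products folder contains only the non-duplicated remainder (executables, plists, and so on), and the storage folder holds one copy of each distinct bundle plus the mapping.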

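With this storage mapping solution, there are 2 additional steps running on CI: thinning the artifacts at the end of the build job (before they are uploaded), and restoring them at the start of each test job (after they are downloaded).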
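The restore side, which a test job would run once the thinned artifacts are available, might look like the sketch below. It assumes storage.json is a flat JSON object mapping each original relative path to the name of a storage entry; both that layout and the function name `restore` are illustrative choices, not a fixed format.

```python
import json
import shutil
from pathlib import Path


def restore(products_dir: Path, storage_dir: Path) -> int:
    """Copy every stored bundle back to each original location listed in storage.json."""
    mapping = json.loads((storage_dir / "storage.json").read_text())
    for original, entry in mapping.items():
        target = products_dir / original
        target.parent.mkdir(parents=True, exist_ok=True)
        # Copy rather than move: one stored bundle may map to several locations.
        shutil.copytree(storage_dir / entry, target, symlinks=True)
    return len(mapping)
```

Since each stored bundle is copied back byte for byte, the checksums of the restored bundles should match those computed before thinning, which makes for a cheap sanity check in CI.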

5. Discussion

First, the technique of removing duplications described above is nothing new and is not specific to iOS development. It can be applied to any project. The problem is just more noticeable with iOS projects because of the way Xcode structures the build products.

Second, the number of dynamic frameworks in the project plays an important role in how effective this solution is. Normally, inside a .framework bundle, e.g. A.framework, the framework binary A.framework/A takes up most of the space. For static frameworks, however, the binaries are merged into the executable binaries during the linking step (done by the ld linker). That means we cannot save much for static frameworks beyond their resources and resource bundles.

In some cases, we observe 2 frameworks with the same name but having different checksums in the storage.

storage / -- <hash_1>-A.framework
         |-- <hash_2>-A.framework

This sometimes happens when you declare a pod in different forms in different targets. For example, one target uses a pod with subspec A/Child_1 and another target uses A/Child_2. Another scenario where this might happen is when you use a different dependency manager that strips framework bundles differently. In such cases, you can run another round of optimization on the storage.
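Before running such an extra round, it helps to know which bundles are affected. The sketch below lists the bundle names that appear in the storage under more than one checksum. It assumes entries are named `<hash>-<name>` as in the listing above, with a hash that contains no dash, and that storage.json maps original paths to entry names; `name_collisions` is a hypothetical helper.

```python
import json
from collections import defaultdict
from pathlib import Path


def name_collisions(storage_dir: Path) -> dict[str, list[str]]:
    """Return bundle names that are stored under more than one checksum."""
    mapping = json.loads((storage_dir / "storage.json").read_text())
    by_name = defaultdict(set)
    for entry in mapping.values():
        # Split on the first dash only, since bundle names may contain dashes.
        _checksum, _, name = entry.partition("-")
        by_name[name].add(entry)
    return {name: sorted(entries) for name, entries in by_name.items() if len(entries) > 1}
```

The returned entries are the candidates worth inspecting, for instance to check whether the variants differ only in a few files.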

Last, does this work if we don’t use CocoaPods as the dependency manager in the project? Actually, what other dependency managers do is similar in essence. This tip relies on the duplication in the executable bundle structure that appears as we add more targets to the project. It should be general and independent of which dependency manager we’re using.