Bazel interview at Software Engineering Daily
A detailed summary of the many topics we discussed during this fun interview
Just a bit over 2 months ago, on October 5th, 2023, Jordi Mon Companys interviewed me about Bazel for an episode of the Software Engineering Daily podcast. The episode finally came out on December 18th, 2023, so here is your announcement to stop by and listen to it!
If you don’t have time to listen to the whole 45 minutes, or if you want to get a sense of what you will get out of it, here is a recap of everything we touched on. Every paragraph is annotated with the rough time where the discussion starts so that you can jump right into whatever interests you the most.
My background
[01:10] Introduction about myself. Really brief overview of my roles and the companies I’ve been at throughout the years.
[01:58] A retrospective about how I got into computers and why I think things were much easier to dive into years ago. In other words, a plug for EndBASIC.
[03:34] My history with open source projects and how I got into Linux, the BSDs, and why I ended up as a contributor to NetBSD.
[05:05] How my background helped me join Google as a Site Reliability Engineer (SRE), even though I did not have much experience in systems administration. SRE is a really cool position because of the many paths that lead into it!
[05:51] Why I moved years later to the Bazel team as a software engineer while remaining at Google.
Introduction to Bazel
[06:57] Google and internal tools. Why Google invented “everything” in house, including Bazel and the distributed build services it relies on.
[08:35] Brief introduction to what Bazel is: a polyglot build system.
[10:11] Thoughts on the distinction—or rather, lack thereof—between build systems and CI systems. While they have been traditionally thought of as separate, they are pretty intertwined and you need your CI system to be as aware of the build process as possible to be maximally efficient.
[10:56] Brief introduction to what the Starlark language is, why it exists, and how it is used to define Bazel rules. Includes a brief explanation of major Bazel concepts such as targets, rules, and actions.
[13:28] Thoughts on why Bazel is a good fit for a monorepo.
Build incrementality
[14:45] Description of how build incrementality works and how Bazel determines which parts of the build graph to rebuild. Covers how traditional file systems use timestamps, how Bazel tries to do something better, and how you can do even fancier stuff if you have the right support from the file system.
[17:36] Deeper dive into Skyframe: the component within Bazel that tracks the build graph and helps decide which parts need to be rebuilt every time.
[19:02] Separation of local machine vs. remote machine resources: what parts of the build happen where? Is the build graph “in the cloud” too?
[20:06] Notes on how Bazel decides how to behave (spoiler alert: manual configuration) and a mini-rant about Bazel having too many flags.
Dependency management
[21:12] Exploring dependency management for third-party packages, focusing on how Google had traditionally done this in their monorepo.
[22:16] Deeper dive into the history of dependency management by looking into how the workspace file came to be and why it has limitations.
[24:01] Extending Bazel, or rather the difference between the Java core vs. the Starlark ecosystem, and how the Build API bridges the gap between the two.
[25:34] Brief digression into the newly-released Buck2.
Bazel migrations
[26:23] Migrating to Bazel at a company like Snowflake. Exploration of where the difficulties arise.
[28:50] Discussion on how platform engineering is becoming its own thing, with the goal of building developer tools and experiences as company-internal products.
[30:44] Highlights of the primary (positive) benefits for the users after a Bazel migration: fast builds and no more “make clean”.
Remote execution
[32:59] Integration of Bazel with remote services. Differences between remote caching and remote execution, and an exploration of different implementations of each.
[34:25] Brief discussion on generating SBOMs and how Bazel’s dependency tracking is perfectly aligned to provide these.
[35:26] Sandboxing. A discussion on the different levels of what you can restrict per platform and the trade-offs in complexity and performance. In other words: sandboxing is a spectrum.
Forward-looking plans
[37:15] Bazel’s future plans. Not my plans nor predictions, but rather what’s coming up in Bazel 7. (Already released at the time of this writing.)
[38:14] My wishlist for Bazel, which is basically what I brought up back in 2015: it’d be awesome if it was smaller so that it could be usable in the smaller projects that lie in the foundations of the Unix systems we use today.
[38:49] Thoughts on the upcoming BazelCon. (It was awesome. See my attendance report!)
Source control
[39:30] Digression on source control, how Git is not great for monorepos, and how Google ended up building their own thing after using Perforce for many years. Also a brief explanation of how Bazel can leverage source control system integrations for better performance.
[42:28] And a related follow-up topic: test selection strategies in a massive monorepo, starting with how unit tests are easy to handle but doing something smart about integration tests requires heuristics or maybe some sort of AI.