A big problem with Bazel not mentioned here is the complexity. It's just really hard for many people to grasp, and adopting Bazel at the two places I worked was a ~10 person-year effort for the rollout with ongoing maintenance after. That's a lot of effort!
IMO Bazel has a lot of good ideas to it: hierarchical graph-based builds, pure hermetic build steps, and so on. Especially at the time, these were novel ideas. But in Bazel they are buried behind a sea of other concepts that may not be so critical: `query` vs `aquery` vs `cquery`, action-graph vs configured-action-graph vs target-graph, providers vs outputs, macro vs rule-impl, etc. Some of these are necessary for ultra-large-scale builds, some are compromises due to legacy, but for the vast majority of non-Google-scale companies there may be a better way.
I'm hoping the next generation of build tools can simplify things enough that you don't need a person-decade of engineering work to adopt it. My own OSS project Mill (https://mill-build.org/) is one attempt in that direction, by re-using ideas from functional and object-oriented programming that people are already familiar with to make build graphs easier to describe and work with. Maybe a new tool won't be able to support million-file monorepos with 10,000 active contributors, but I think for the vast majority of developers and teams that's not a problem at all.
> other concepts that may not be so critical: `query` vs `aquery` vs `cquery`, action-graph vs configured-action-graph vs target-graph, providers vs outputs, macro vs rule-impl, etc
Almost all of the distinctions you mentioned are related to the way that Bazel has the concept of a "target", which lets the build graph work at a higher level than individual files.
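Suppose you write something like the following in a BUILD file (a minimal sketch; the file names are made up, but cc_library and its attributes are standard Bazel):

    cc_library(
        name = "foo",
        srcs = ["foo.cc"],
        hdrs = ["foo.h"],
    )

    cc_library(
        name = "bar",
        srcs = ["bar.cc"],
        deps = [":foo"],  # bar depends on foo at the target level
    )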
This lets us define, at a high level, that ":foo" and ":bar" are C/C++ libraries, and that bar depends on foo. This is the build graph of targets, and it's independent of any particular files that these rules may produce (.o, .a, .so, etc).
It's nice to be able to query the build graph at this high level. It lets you see the relationship between components in the abstract, rather than a file-by-file level. That is what "bazel query" does.
But sometimes you might want to dig deeper into the specific commands (actions) that will be executed when you build a target. That is what "bazel aquery" is for.
Macros vs. rules is basically a question of whether the build logic runs before or after the target graph is built. A macro lets you declare a bit of logic where something that looks like a target will actually expand into multiple targets (or have the attributes munged a bit). It is expanded before the target graph is built, so you won't see it in the output of "bazel query."
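For illustration, here's a rough sketch of what such a macro might look like in a .bzl file (the macro and target names are hypothetical, not from any real ruleset):

    # defs.bzl (hypothetical names, just to show the shape of a macro)
    def cc_library_with_test(name, srcs, test_srcs, **kwargs):
        # Looks like a single "target" in a BUILD file, but expands into two
        # real targets before the target graph is built, so `bazel query`
        # shows :<name> and :<name>_test rather than the macro itself.
        native.cc_library(
            name = name,
            srcs = srcs,
            **kwargs
        )
        native.cc_test(
            name = name + "_test",
            srcs = test_srcs,
            deps = [":" + name],
        )

A rule implementation, by contrast, runs after the target graph exists, during analysis, and is what actually emits actions.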
If you took away the target graph, I think you'd take away a lot of what makes Bazel powerful. A key idea behind Bazel is to encapsulate build logic, so that you can use a rule like cc_library() without having to know how it's implemented or exactly what actions will run.
I don't say this to minimize any of the pain people experience when adopting Bazel. I'm actually curious to learn more about what the biggest pain points are that make it difficult to adopt.
This great comment is another example of what's bad about Bazel: it has the least enlightening documentation. Bazel's docs are thorough and useless. Every page you read assumes you already understand the concepts described on that page.
This comment explains queries, actions, and macros pretty decently, and I doubt you could find an explanation of these things in the Bazel docs that a new user could understand.
I care a ton about fast and accurate build systems, but one issue I think we haven't solved is that people do not want to use other build tools for their language. "Why isn't it Cargo? Why not use NPM/Yarn? Why not use Pip? Why not CMake?" These questions are often rhetorical because they do not care about build systems. They don't care about the design. They don't care if your CI could be 5x faster. You will never make them care. It's good enough. You must have absolutely zero externalized cost (and therefore put in a lot of effort) to get over this hurdle. There's seemingly no way around it.
The reason a lot of people like Bazel is, I think, tools like Gazelle -- which reduce that whole problem back to "Run gazelle" and all the crap is taken care of for you. Dependencies, BUILD files, etc. People constantly talk about the "complexity" aspect, but very few people appreciate how complex Cargo, NPM, Yarn, Cabal, Dune, internally are. Because they just run "build", and it works. Bazel, Buck2, Mill, etc will all have this problem unless huge effort is put in.
TBH, this is one of the reasons why I think Nix has wildly succeeded in the past few years while more fine-grained and scalable systems have had adoption problems -- despite its numerous, numerous flaws. You get to Bring-Your-Own-Build-System, and Nix along with the blood of 10,000 upstream contributors keeps the juice flowing, and it's cached and hermetic so you see real savings. That greatly eases people into it. So they adopt it at all points on the curve (small, medium, huge projects), because it works with what they have at all those points. That makes them willing to get deeper and use the tool more.
Bazel runs into the problem that it expects to have the complete well-defined understanding of the inputs and outputs for your project. This might have made sense when Blaze was first designed and projects were done in languages with compilers that had rigid inputs and outputs. But now we're in a world where more and more systems are becoming layers of compilers, where each compiler layer wants to just have a bunch of dependencies thrown at it. In a frontend project, it wouldn't be weird for Tailwind CSS to be compiled and embedded in SCSS, where it's pulled into a JSX module via some import that magically provides type checking with the CSS under the hood. And so you either need to handwave over it and lose some of the benefits of incremental builds, or spend time getting it to work and making it continue to work as you add new layers.
So in my mind, Bazel is no longer worth it unless the savings are so great that you can afford to staff a build team to figure these things out. Most teams would benefit more from using simple command runners instead of fully-fledged build systems.
I'm glad to see someone else describe their experience this way too.
bazel has arrived at $WORK and it has been a non-trivial amount of work for even the passionate advocates of bazel. I know it was written by the Very Smart People at google. They are clearly smarter than me so I must be the dummy. Especially since I never passed their interview tests. :-)
Of course, given all things google, by the time I'm fully on board the train, the cool kids will be making a new train, and then I'll have to hop onto that one to enjoy the rewards of the promised land that never quite seem to arrive.
> know it was written by the Very Smart People at google
For Google. That's the key. I have the privilege of experiencing both sides, having been at Google for nine years. I never had a problem with Blaze, but using Bazel in a smaller company has been extremely painful. I think there are just very few places that have the exact same problems as Google, where something like Bazel would be a great fit.
That's the rub. It provides scalability for very large organizations, of which there are few. It's similar to running OpenStack. Meta also has some projects like this, such as buck2, which lacks the really good virtual FS acceleration stuff (eden). Megacorp FOSS tends to skew toward offering whizbang features that are incomplete, complicated, poorly documented, and require a lot of extra work.
Actually, if you could make something like github, where all software would be part of a single megarepo and built constantly, that would be incredibly useful, and bazel would be excellent for that (or at least the closest thing we have to something reasonable).
The problem with bazel and almost every other build system (all except the "scan the source files and build a dependency graph" ones) is that you'll be writing build instructions for all your dependencies that aren't using it. If that was done for you, they'd be incredible.
Compiling things and wanting a robust build cache so developers spend less time waiting isn't a problem remotely unique to Google. You might not have Google scale to hire a team of developers to be able to optimize it to the N'th degree like they can, but holy shit we are not gonna use Makefiles for advanced build systems anymore.
> Compiling things and wanting a robust build cache so developers spend less time waiting isn't a problem remotely unique to Google.
That wasn't my argument at all. Plenty of modern tools address this exact need; it isn't unique to Bazel. If you read the article, the author made many interesting remarks on how Bazel reflects the unique design choices of Blaze, which were often picked due to Google's needs.
My point is that when people hit these barriers, they need to understand that it's not because they are unintelligent or incapable of understanding a complex system. That's what the OP I responded to was saying, and I was just providing some advice.

Ah yeah, that's fair.
Bazel is an investment to learn for sure, but your effort estimates are way overblown. We have a Python, Go, and Typescript monorepo that I set up for our team and rarely have to touch anything. Engineers rarely think about Bazel, as we use Gazelle to generate all our build files and have patterns for most everything we need to do.
Compared with build efforts using other tools and non-monorepo setups at other companies, the effort here has felt much reduced.
I find the dismissal of buck2 pretty shallow. Most of the world is not already heavily invested in bazel, so compatibility is imho overstated; I don't see it being that far-fetched for something like buck2 to leapfrog bazel. That being said, buck2 definitely would need some love from outside meta to really be a viable competitor; right now it still feels like a half-complete code drop.
I agree. The biggest issue with Buck2 for me (apart from documentation) is the lack of something like bzlmod. There's actually a decent number of modules available for Bazel now:

https://registry.bazel.build/all-modules

But with Buck2 you're stuck with `http_archive` and vendoring.
Seems like a non-issue to me. I work with a relatively large Bazel monorepo, and we have to vendor pretty much everything anyway. Many 3rd-party rules might need patches to work properly with our custom toolchains/rules.
Bzlmod sounds nice for small projects, but for big monorepos within organizations with established processes it is more of a hassle, and I imagine that most Bazel users are not using it for small projects.
Yeah, it's more of an issue for small projects. But I don't think Buck/Bazel should be reserved for megarepos with thousands of contributors. Why can't small projects use it?

bzlmod feels better at first, but you're left navigating version conflicts anyway. Vendoring deps is the only truly sane way for large repos.
To be honest, I'd rather not see Buck2 repeat the mistakes of Bazel so early on, especially when it took a lot of time before settling on bzlmod.
Frankly, I'd rather it go the other way: just have a gigantic 'buck2pkgs' repo that has everything inside of it -- literally fucking everything -- just like Nix and Nixpkgs have done. I've watched and committed to Nixpkgs for over 10 years now and I think being a monorepo has massively contributed to its success, ease of contribution, and general usability. And in practice it means 99% of projects only need one "dependency" and they are done and have everything they could want.
In theory a buck2pkgs could even be completely divorced from Meta's existing Prelude, because you can just reimplement it all yourself, without being constrained by their backwards compatibility with Buck1. That would reduce the surface area between the two projects substantially and let you do a lot of huge cleanups in the APIs.
I actually started doing this a while back but it is of course a massive amount of work. I do think it's a much better way to go, though...
I miss Meta's arc + buck2 + eden + "hg" + "phabricator" + testing infrastructure that does a lot of acceleration of builds, testing, and landing commits without rebasing.
The biggest problem with everything NotBazel in this space is IDE support. JetBrains have already moved their build for IntelliJ to Bazel, and are aggressively adding native first-party support.

Yeah. The author misses that buck2 is Nix-like; he seems to have previous-generation build systems in mind.

Buck2 doesn't seem to support JS/TS, so it's not ready for our use case.

In what sense do you mean support? buck2 is, afaik, completely language agnostic; you can use it to drive any compilers/tools you want.

Probably means a lack of out-of-the-box rulesets. Writing your own rules to use a build system for a typical project is an unrealistic ask, IMO.
Great article, that gets the critique exactly right. The most frustrating part of Bazel is how shoddy the workmanship is. For example, Bazel throws away your analysis cache when you change flags that have nothing to do with what's being built or how, like flags that change what tests are run.
Of course, the biggest issue is that tiny operations take more time than necessary. For example, at $PreviousJob we wrote custom CLI tools and put them in our monorepo, but since you need Bazel to run them, running a basic tool took 9-12 seconds. So this revelation from the article was totally unsurprising:
> This rewrite was carefully crafted to optimize memory layouts, avoiding unnecessary pointer chasing (which was impossible to avoid in Java). The results of this experiment proved that Blaze could be made to analyze large portions of Google’s build graph in just a fraction of the time, without any sort of analysis caching or significant startup penalties.
Yeah, with attentive engineering it's not surprising that you can get massive speedups.
Finally, Bazel is ok at building multiple languages, but awful at getting them to depend on each other. I don't know what's going on here, but whatever magic makes it possible for any language to depend on C and C++ was not extended to other possible dependencies. So now you get to fight or rewrite all the rules, which is another bag of worms.
> The most frustrating part of Bazel is how shoddy the workmanship is.
Without commenting on the above statement:
> For example, Bazel throws away your analysis cache when you change flags that have nothing to do with what's being built or how, like flags that change what tests are run.
I don't think this is a good example. Bazel's analysis cache clearing is there to preserve correctness in the context of a legitimately difficult algorithmic problem. The fact that Bazel has this limitation is a testament to its correctness strengths. I'm not aware of systems that have solved that algorithmic problem, but I'm curious if anyone knows of any.
Also Bazel avoids that problem for most "flags that change which tests are run", since that subset of the problem is more solvable. --test_env was a notable exception, which was fixed (https://github.com/fmeum/bazel/commit/eb494194c1c466f7fd7355...) but not sure if it's in the latest Bazel release yet? But generally changing --test* has much smaller performance impact.
> On the other hand, we need a tiny build system that does all of the work locally and that can be used by the myriad of open-source projects that the industry relies on. This system has to be written in Rust (oops, I said it) with minimal dependencies and be kept lean and fast so that IDEs can communicate with it quickly. This is a niche that is not fulfilled by anyone right now and that my mind keeps coming to;

Yes please!

Everyone says they want a "tiny", "minimal", "lean" build system, but there is lots of real complexity in these areas:

- Cross-compilation and target platform information

- Fetching dependencies

- Toolchains

I'm not sure a system that solves these would still be considered minimal by most, but those are table-stakes features in my view.

If you don't need these things, maybe stick with Make?
I believe that the biggest problem is that different "compilers" do different amounts of work. In the race to win the popularity contest, many languages, especially newer ones, offer compilers packaged with a "compiler frontend", i.e. a program that discovers dependencies between files, links individual modules into the target programs or libraries, does code generation, etc. This prevents the creation of universal build systems.
For example, javac can be fed individual Java source files, similar to the GCC suite of compilers, but the Go compiler needs a configuration for the program or library it compiles. Then there are also systems like Cargo (in Rust) that do part of the job that the build system has to do for other languages.
From a perspective of someone who'd like to write a more universal build system, encountering stuff like Cargo is extremely disappointing: you immediately realize that you will have to either replace Cargo (and nobody will use your system because Cargo is already the most popular tool and covers the basic needs of many simple projects), or you will have to add a lot of work-arounds and integrations specific to Cargo, depend on their release cycle, patch bugs in someone else's code...
And it's very unfortunate because none of these "compiler frontends" come with support for other languages, CI, testing etc. So, eventually, you will need an extra tool, but by that time the tool that helped you to get by so far will become your worst enemy.
I have seen this first hand with Bazel. You have lots of Bazel rules that are partial reimplementations of the language specific tooling. It usually works better - until you hit a feature that isn’t supported.
I think the idea behind these words is more about preferring speed over remote-execution and large-build-caching types of features, not about limiting the subset of toolchain functionality, etc. In theory, if you scoped your build tool to only support builds of sufficiently small size, you could probably remove a lot of the complexity you'd otherwise have to deal with.
Intelligent caching is also table-stakes though. It requires a detailed dependency graph and change tracking, and that's not something that can simply be relegated to a plugin— it's fundamental.

I agree, yet Make, CMake, or even Node package scripts are more popular than Bazel.
Right, and I think that's a combination of a few factors— first of all, there's the basic momentum that CMake is widely known and has a huge ecosystem of find modules, so it's a very safe choice— no one got fired for choosing https://boringtechnology.club
But bigger than that is just that a lot of these build system and infrastructure choices are made when a project is small and builds fast anyway. Who cares about incremental builds and aggressive caching when the whole thing is over in two seconds, right? Once a project is big enough that this starts to be a pain point, the build system (especially if it's one like CMake that allows a lot of undisciplined usage) is deeply entrenched and the cost of switching is higher.
Choosing technologies like Nix or Bazel can be seen as excessive upfront complexity or premature optimization, particularly if some or all of the team members would have to actually learn the things— from a manager's point of view, there's the very real risk that your star engineer spends weeks watching tech talks and yak shaving the perfect build setup instead of actually building core parts of the product.
Ultimately, this kind of thing comes back to the importance of competent technical leadership. Infrastructure like build system choice is important enough to be a CTO call, and that person needs to be able to understand the benefits, to weigh activation costs against the 5-10 year plan for the product and team, and be able to say "yes, we plan for this thing to be big enough that investing in learning and using good tools right now is worth it" or "no, this is a throwaway prototype to get us to our seed money, avoid any unnecessary scaffolding."

If it doesn't have those features then why would I even use it at all? Remote build and caching are the entire reason I'd even think about it.

It's not Rust, it's not production ready, and I haven't actually used it (only read the README and blog posts), but I really enjoy the ideas behind https://github.com/256lights/zb - see https://www.zombiezen.com/blog/2024/09/zb-early-stage-build-... for a list of these ideas.
In the 2020s I have used BUCK, BUCK2, and Blaze every working day. I don’t feel like getting in to details but I’ll just say that BUCK2 is the best of the 3 and I’m delighted it’s open source.
I don't have experience with Bazel as it exists externally, but I expect the internal Blaze has advanced far beyond it by now, as the entire Google engineering world depends on it.

https://github.com/facebook/sapling/tree/main/eden

I'm still unsure if a complete, viable eden solution was released with all of its necessary components.
One thing not mentioned is the massive amount of complexity and difficulty involved in Starlark rules. In the next generation of build tools, I really wish Starlark could be replaced with some subset of TypeScript, which would have drastically better IDE support. And I wish the typing would be done in such a way that it's harder to do things that are bad for performance. Starlark is super difficult to read, navigate, and write, and there are a lot of performance gotchas.
I know there are efforts like starpls (language server for Starlark) but in my experience it really falls short. I think buck2 has type annotations but I kind of hate Python-style types, so I wish we could just use TypeScript :P TypeScript is even better positioned to fill this role now that tsc is being re-written in Go.
> Starlark is intended to be simple. There are no user-defined types, no inheritance, no reflection, no exceptions, no explicit memory management. Execution is finite. The language does not allow recursion or unbounded loops.
That has some pretty useful properties for a config language that I'm not sure TypeScript helps much on. That said, Lean also has these properties as an alternate language with good IDE support, and it would be pretty fun to write proofs of your build scripts.

You'll be happy to know that the Starlark team has recently started working on adding static typing! See https://github.com/orgs/bazelbuild/projects/21 and https://bazel.build/rules/language#StarlarkTypes

> a lot of performance gotchas

Can you list some examples?
I worked on a project that used Bazel for a bit (but ended up not being a great fit).
I think the part I hated most about Bazel was that there are in fact two dialects: one for BUILD files and one for .bzl files. The things you do in them are different, but the documentation is always vague on what goes where.

I'm curious, how did you run into performance issues with Starlark?
I find anyone this dismissive of Java hard to take seriously. Java is a cutting-edge language platform where the achievable performance at the limit is on-par with C++. Also, the article's hook didn't grab me. My laptop is way, way more powerful than any "beefy workstation" they ever assigned me at Google, starting with a dual-core 2400MHz Opteron. Bazel's server performance on my laptop or workstation are total non-issues. There are issues with Bazel of course, that's just not one that smells right to me.
Performance of Java in general and Bazel in particular tends to be not CPU-limited but memory-limited. There was a time when MacBooks maxed out at 16 GB, and running Chrome + an IDE + a large Bazel build on a machine with <16 GB of memory was not a great experience.

Yep. It was common lore at Facebook last decade that the reply to "my build is broken" was "raise the JVM heap limit!"

Why do you think this author (a long-time senior Blaze maintainer) believes this? Do you think the Blaze team just doesn't have the necessary expertise?

I have no idea, but dismissing Java as not "a real systems language" just isn't a convincing critique.
As I said in the article, the lack of green threads, the lack of value types, and the inability to avoid boxing primitives are all things that have gotten in the way of optimizing Bazel to the levels shown by the prototype Go reimplementation. These are all things that you'd expect a systems language to have, and Java did not have them. They are finally materializing now, but it's a little too late.
Then you also have the slow startup time of the JVM plus Bazel's initialization, which leads to a possibly-unnecessary client/server process design. This is more wishy-washy though, as this might be optimizable to levels where the startup time is unnoticeable -- but the fact that it hasn't happened in years means that it might not be an easy thing to do.
FWIW, the client server design is also used in Buck2, but it has other advantages than just startup time, like keeping track of the filesystem with inotify or watchman so that it already has fresh information about the state of the build graph by the time you run `build`.
I don't disagree in general but in Bazel's case this path has been heavily optimized. Maybe there are limits to it but "java startup slow" is a 101-level complaint.
In fact, the client program for Bazel isn't even written in Java but in C++, and the Java daemon it talks to is started when you first attach to a Bazel workspace and persists, so subsequent interactions are very fast. Just running `bazel` randomly is not truly indicative of what using it feels like, because that's not what people actually use it for. People use it inside a build workspace; that is the case that matters.
Beyond that, other build systems like Buck2 (Rust) also use the client-daemon architecture for a number of other reasons, including the fact that actually-really-large build graphs are far too large to rebuild on every invocation, so the daemon is necessary to keep the build graph and interactively analyze and incrementally invalidate it on demand. Doesn't matter if it's Rust or Java, these things are architectural. That's also one of the points of the article, of course, and why they're theorizing about server-side analysis caches and graphs, etc.
This all indicates to me that the people designing these systems actually do care about interactivity of their tool. It is not a matter of "java startup slow" to use a client-server design, though it certainly is known to help.
Takes 1.97s on my machine, which is a lot of time for just showing a help message. Maybe you have the setup where Bazel is staying alive in the background or something?
Daemonization introduces a range of potential cache invalidation issues. The issues are solvable, but whose KPIs depend on getting to the bottom of them?
Do you have specific examples in the context of blaze/bazel here? I think "the set of cache invalidation issues" you're describing are basically "the set of cache invalidation issues blaze/bazel intends to solve", so the answer to "whose KPIs" is "the blaze team".
I haven't used Bazel but I have used buck1 extensively, and the daemonized mode was quite buggy and often required the process to be killed. Quite frequently the process was just wedged, and even when it wasn't, memory leaks were quite common. Standard black box debuggers like strace and DTrace also become harder to use (e.g. you need to attribute a particular set of syscalls to a client, and a mindless "strace buck build ..." doesn't do what you want).
Daemonization is sometimes necessary, but it introduces lifecycle management problems that get in the way of robustness. Daemonization simply because your choice of programming language has bad startup times doesn't seem like a great idea to me.
I think on-disk cache invalidation and in-memory cache invalidation are distinctly different in practice.
I asked this because I've used bazel some, and blaze fairly extensively, and despite having done some deeply cursed things to blaze at times, I've never had issues with it as a daemon, to the point where I'd never consider needing to run it in non-daemonized mode as part of a debug process. It just works.
Second, "startup-time" is probably the least relevant complaint here in terms of why to daemonize. Running `help` without a daemon set up does take a few seconds (and yeah that's gross), but constructing the action graph cold for $arbitrary_expensive_thing takes a minute or two the first time I do it, and then around 1 second the next time.
Caching the build graph across builds is valuable, and persisting it in memory makes a lot more sense than on disk, in this case. The article we're discussing even argues in favor of making this even more extreme and moving action graph calculation entirely into a service on another machine because it prevents the situation where the daemon dies and you lose your (extremely valuable!) cached analysis graph:
> Bonanza performs analysis remotely. When traditional Bazel is configured to execute all actions remotely, the Bazel server process is essentially a driver that constructs and walks a graph of nodes. This in-memory graph is known as Skyframe and is used to represent and execute a Bazel build. Bonanza lifts the same graph theory from the Bazel server process, puts it into a remote cluster, and relies on a distributed persistent cache to store the graph’s nodes. The consequence of storing the graph in a distributed storage system is that, all of a sudden, all builds become incremental. There is no more “cold build” effect like the one you see with Bazel when you lose the analysis cache.
If you're worried about cache invalidation and correctness issues due to daemonization, I think you'd want to be even more concerned about moving them entirely off machine.
(I'm also not sure how their proposal intends to manage e.g. me and Phil on another machine trying to build conflicting changes that both significantly impact how the analysis graph is calculated and thrash each other, either you have to namespace so they don't thrash and then you've moved the daemon to the cloud, or you do some very very fancy partial graph invalidation approach, but that isn't discussed and it feels like it would be white paper worthy)
For example, Java's signal handling is not up to par with systems languages. CLI tools that orchestrate operations need to have high-quality signal handling in order to be robust.
Proper, dependency-aware build caching makes or breaks a build system. Meta's buck2 builds tend to be crazy fast because of this and it can also do synthetic checkouts with arc/"hg" via eden.
Workflow engines fundamentally separate and structure code and execution into discrete steps with a graph visualizer (hopefully the graph is acyclic).
Maybe it's just a bad example, but the code I saw looked very, very unseparated. I understand that most people who do builds want to start out with the one-build-file-to-rule-them-all model.
But ultimately you are dealing with a workflow execution if you're doing anything beyond the most trivial single-library build.
The next generation of build tools needs to have that visualization and visualized-execution tooling built in.
There are a great many things that workflow engines fundamentally apply to. Unfortunately they have been politically associated with management trying to replace developer labor, usually because of the snake-oil pitches the workflow-engine vendors make to the C-suite.
A good common visualization and execution set that's cross language would have been really helpful in so many computing domains. It's almost like we need the SQL of workflow engines to be invented
You're not wrong about build systems being workflow engines, but in my experience the risk of over-generalizing is a worse user experience. One of the most valuable lessons I've learned in my career is that a clean UX often requires messy abstractions underneath.
> (I can’t name names because they were never public, but if you probe ChatGPT to see if it knows about these efforts, it somehow knows specific details.)
I'm going to laugh if Google joins the NYT in suing OpenAI for copyright infringement from having trained on proprietary data. That said: I also wonder how that actually made it in
These were not well-guarded secrets (and the ideas were pretty "obvious" if you had faced the problems before in dealing with Bazel and large codebases). In talking to people at the conference, several knew about these efforts (and some even knew the codenames), so I assume these had been mentioned here-and-there in discussion threads.
I can say with full certainty that thousands of engineers at Google were feeding their day-to-day LLM-assisted coding queries into the ChatGPT web UI for about 2 years before a custom Chrome extension was pushed to employee laptops to stop it.
So GPT has likely post-trained on a lot of Google code knowledge.
I just don't understand how the decision of which bits of a project need rebuilding can be so complex.
If I edit 50 lines of code in a 10GB project, then rebuild, the parts that need rebuilding are the parts that read those files when they were last built.
So... the decision of what to rebuild should take perhaps a millisecond and certainly be doable locally.
50 lines in a single .cc source file which is only compiled once to produce the final artifact - sure, easy to handle.
Now consider that you are editing 50 lines of source for a tool which will then need to be executed on some particular platform to generate some parts of your project.
Now consider that you are editing 50 lines defining the structure and dependency graph of your project itself.
• Adding a file. It hasn't been read before, so no tasks in your graph know about it. If you can intercept and record what file patterns a build tool is looking for it helps, but you can't easily know that because programs often do matching against directory contents themselves, not in a way you can intercept.
• File changes that yield no-op changes, e.g. editing a comment in a core utility shouldn't recompile the entire project. More subtly, editing method bodies in a Java program doesn't require the users to be recompiled, but editing class definitions or exposed method prototypes does.
• "Building" test cases.
• You don't want to repeat work that has been done before, so you want to cache it (e.g. switching branches back and forth shouldn't rebuild everything).
If your system is C based, tup [0] fulfills your request by watching the outputs of the specified commands. It isn't, however, appropriate for systems like java that create intermediate files that the developer didn't write [1].
Back to bazel: I am of the impression that some of its complexity comes from a requirement to handle heterogeneous build systems. For example, some Python development requires resolving both Python dependencies and C. Being good at either is a bunch of work, but handling both means rolling your own polyglot system or coordinating both as second-class citizens.
It is entirely possible that by changing 50 lines of code in a 10GB project you have to rebuild the entire project, if everything (indirectly) depends on what you just changed.
It is not at all uncommon to have changes percolate out into larger impacts than you'd expect, though. Especially in projects that attempt whole program optimizations as part of the build.
Consider anything that basically builds a program that is used at build time. Which is not that uncommon when you consider that ML models have grown significantly. Change that tool, and suddenly you have to rebuild the entire project if you didn't split it out into a separate graph. (I say ML, but really any simple linter/whatever is the same here.)
> the parts that need rebuilding are the parts that read those files when they were last built...
...and the transitive closure of those parts, which is where things get complicated. It may be that the output didn't actually change, so you can prune the graph there with some smarts. It may be that the thing changed was a tool used in many other rules.
And you have to know the complete set of outputs and inputs of every action.
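To make the point concrete, here is a toy sketch (in Python, and emphatically not how Bazel implements it) of the minimal bookkeeping you need even for the simple "rebuild whatever read the changed files" idea, assuming you somehow already have every action's complete inputs and outputs:

    # Toy incremental rebuild: a sketch of the idea, not how any real build system works.
    # Assumes we already know every action's complete inputs and outputs (the hard part).
    def incremental_rebuild(actions, changed_files, run, output_hash):
        """actions: list of {'id', 'inputs', 'outputs'} dicts in dependency (topological) order."""
        # Map each file to the actions that read it.
        consumers = {}
        for action in actions:
            for f in action["inputs"]:
                consumers.setdefault(f, []).append(action["id"])

        # Seed the dirty set with actions that read an edited source file.
        dirty = {a for f in changed_files for a in consumers.get(f, [])}

        for action in actions:  # walk in dependency order
            if action["id"] not in dirty:
                continue
            before = {f: output_hash(f) for f in action["outputs"]}
            run(action)  # re-execute the action
            for f in action["outputs"]:
                if output_hash(f) != before[f]:
                    # The output really changed, so everything that reads it is dirty too.
                    dirty.update(consumers.get(f, []))
            # If no output changed, downstream work is pruned ("early cutoff").

Even this toy ignores added files, changed tools, and configuration flags, which is exactly where the real complexity shows up.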
I'm super into the component model, but honestly the tooling is still not ready. Basically only LLVM (and really just C) has good-enough tooling. The rest of the Wasm toolchains are still heavily focused on core modules, and if they support components it's only `wasi:cli/run`.
Wasm, specifically using the component model [1], is really good for plugin use-cases.
- Polyglot; you can write plugins (in this case Bazel rules) in "any" language, and the component model is easily the best system ever created for cross-language interoperability.
- Ecosystem; tooling is still generally immature, but you'd be surprised how much buy-in there is. Web apps are a huge incentive for language developers to support it, and Wasm now runs in a lot of places besides the browser too. Envoy and Istio support Wasm plugins. So does Postgres. So does neovim.
- Component interfaces are richly typed. You get things like lists, structs, "resources" (aka objects with instance methods), etc. all supported idiomatically in whichever language you're using and these can be used across language boundaries. Even between GC and non-GC languages. Even with Rust's ownership model.
- Plugins are fully sandboxed, so a host can allow plugins to have things like `while` loops while enforcing runtime limits on a per-instruction basis. It also means Bazel doesn't have to invent its own bespoke sandboxing solutions for every supported platform, as it does today. And Wasm is designed to handle untrusted Web code, so there are tons of security guarantees.
- Performance is already really good, and there's every reason to believe it will get better. Again, Browser support is big here. The things the Chrome team did to optimize V8 are pretty incredible, and they're now turning their attention to Wasm in a big way.
The author clarifies that he wrote the section about Buck2 to demonstrate the need to be Bazel compatible (as opposed to Buck2 compatible), because the friction of trying it out in a real code base is essentially insurmountable.
A big problem with Bazel not mentioned here is the complexity. It's just really hard for many people to grasp, and adopting Bazel at the two places I worked was a ~10 person-year effort for the rollout with ongoing maintenance after. That's a lot of effort!
IMO Bazel has a lot of good ideas to it: hierarchical graph-based builds, pure hermetic build steps, and so on. Especially at the time, these were novel ideas. But in Bazel they are buried behind a sea of other concepts that may not be so critical: `query` vs `aquery` vs `cquery`, action-graph vs configured-action-graph vs target-graph, providers vs outputs, macro vs rule-impl, etc. Some of these are necessary for ultra-large-scale builds, some are compromises due to legacy, but for the vast majority of non-Google-scale companies there may be a better way.
I'm hoping the next generation of build tools can simplify things enough that you don't need a person-decade of engineering work to adopt it. My own OSS project Mill (https://mill-build.org/) is one attempt in that direction, by re-using ideas from functional and object-oriented programming that people are already familiar with to make build graphs easier to describe and work with. Maybe a new tool won't be able to support million-file monorepos with 10,000 active contributors, but I think for the vast majority of developers and teams that's not a problem at all.
> other concepts that may not be so critical: `query` vs `aquery` vs `cquery`, action-graph vs configured-action-graph vs target-graph, providers vs outputs, macro vs rule-impl, etc
Almost all of the distinctions you mentioned are related to the way that Bazel has the concept of a "target", which lets the build graph work at a higher level than individual files.
Suppose you write the following in a BUILD file:
This lets us define, at a high level, that ":foo" and ":bar" are C/C++ libraries, and that bar depends on foo. This is the build graph of targets, and it's independent of any particular files that these rules may produce (.o, .a, .so, etc).It's nice to be able to query the build graph at this high level. It lets you see the relationship between components in the abstract, rather than a file-by-file level. That is what "bazel query" does.
But sometimes you might want to dig deeper into the specific commands (actions) that will be executed when you build a target. That is what "bazel aquery" is for.
Macros vs. rules is basically a question of whether the build logic runs before or after the target graph is built. A macro lets you declare a bit of logic where something that looks like a target will actually expand into multiple targets (or have the attributes munged a bit). It is expanded before the target graph is built, so you won't see it in the output of "bazel query."
If you took away the target graph, I think you'd take away a lot of what makes Bazel powerful. A key idea behind Bazel is to encapsulate build logic, so that you can use a rule like cc_library() without having to know how it's implemented or exactly what actions will run.
I don't say this to minimize any of the pain people experience when adopting Bazel. I'm actually curious to learn more about what the biggest pain points are that make it difficult to adopt.
This great comment is another example of what's bad about Bazel: It has the least enlightening documentation. Bazels docs are thorough and useless. Every page you read assumes you already understand the concepts described on the page.
This comment explains a query, actions, and macros pretty decently, and I doubt you could find an explanation of these things in the Bazel docs that a new user could understand.
I care a ton about fast and accurate build systems but one issue I think we haven't solved is that: people do not want to use other build tools for their language. "Why isn't it Cargo? Why not use NPM/Yarn? Why not use Pip? Why not CMake?" These questions are often rhetorical because they do not care about build systems. They don't care about the design. They don't care if your CI could be 5x faster. You will never make them care. It's good enough. You must have absolutely zero externalized cost (and therefore put in a lot of effort) to get over this hurdle. There's seemingly no way around it.
The reason a lot of people like Bazel is, I think, tools like Gazelle -- which reduce that whole problem back to "Run gazelle" and all the crap is taken care of for you. Dependencies, BUILD files, etc. People constantly talk about the "complexity" aspect, but very few people appreciate how complex Cargo, NPM, Yarn, Cabal, Dune, internally are. Because they just run "build", and it works. Bazel, Buck2, Mill, etc will all have this problem unless huge effort is put in.
TBH, this is one of the reasons why I think Nix has wildly succeeded in the past few years while more fine-grained and scalable systems have had adoption problems -- despite its numerous, numerous flaws. You get to Bring-Your-Own-Build-System, and Nix along with the blood of 10,000 upstream contributors keeps the juice flowing, and it's cached and hermetic so you see real savings. That greatly eases people into it. So they adopt it at all points on the curve (small, medium, huge projects), because it works with what they have at all those points. That makes them willing to get deeper and use the tool more.
Bazel runs into the problem that it expects to have the complete well-defined understanding of the inputs and outputs for your project. This might have made sense when Blaze was first designed and projects were done in languages with compilers that had rigid inputs and outputs. But now we're in a world where more and more systems are becoming layers of compilers, where each compiler layer wants to just have a bunch of dependencies thrown at it. In a frontend project, it wouldn't be weird for Tailwind CSS to be compiled and embedded in SCSS, where it's pulled into a JSX module via some import that magically provides type checking with the CSS under the hood. And so you either need to handwave over it and lose some of the benefits of incremental builds, or spend time getting it to work and making it continue to work as you add new layers.
So in my mind, Bazel is no longer worth it unless the savings are so great that you can afford to staff a build team to figure these things out. Most teams would benefit out of using simple command runners instead of fully-fledged build systems.
I'm glad to see someone else describe their experience this way too.
bazel has arrived at $WORK and it has been a non-trivial amount of work for even the passionate advocates of bazel. I know it was written by the Very Smart People at google. They are clearly smarter than me so I must be the dummy. Especially since I never passed their interview tests. :-)
Of course given all things google, by the time I'm fully onboard the train, the cool kids will be making a new train and then I'll have to hop onto that way to enjoy the rewards of the promised land that never quite seem to arrive.
> know it was written by the Very Smart People at google
For Google. That's the key. I have the privilege of experiencing both sides, having been at Google for nine years. I never had a problem with Blaze, but using Bazel in a smaller company has been extremely painful. I think there are just very few places that have the exact problems as Google where something like Bazel would be a great fit.
That's the rub. It provides scalability for very large organization, of which, there are few. It's similar to running OpenStack. Meta also has some projects like this, such as buck2 which lacks the really good virtual FS acceleration stuff (eden). Megacorp FOSS tend to skew towards offering whizbang features that are incomplete, complicated, poorly documented, and require a lot of extra work.
Actually if you could make something like github, where all software would be part of a single megarepo and built constantly that would be incredibly useful, and bazel would be excellent for that (or at least the closest thing we have to reasonable)
The problem with bazel and almost every other build system (all except the "scan the source files and build a dependency graph" ones) is that you'll be writing build instructions for all your dependencies that aren't using it. If that was done for you, they'd be incredible.
Compiling things and wanting a robust build cache so developers spend less time waiting isn't a problem remotely unique to Google. You might not have Google scale to hire a team of developers to be able to optimize it to the N'th degree like they can, but holy shit we are not gonna use Makefiles for advanced build systems anymore.
> Compiling things and wanting a robust build cache so developers spend less time waiting isn't a problem remotely unique to Google.
That wasn't my argument at all. Plenty of modern tools address this exact need; it isn't unique to Bazel. If you read the article, the author made many interesting remarks on how Bazel reflects the unique design choices of Blaze, which were often picked due to Google's needs.
My point is that when people hit these barriers, they need to understand that it's not because they are unintelligent or incapable of understanding a complex system. That's what the OP I responded to was saying, and I was just providing some advice.
ah yeah that's fair
Bazel is an investment to learn for sure but your effort estimates are way overblown. We have a Python, Go, and Typescript monorepo that I setup for our team and rarely have to touch anything. Engineers rarely think about Bazel as we use Gazelle to generate all our build files and have patterns for most everything we need to do.
Compared with build efforts using other tools and non-monorepo setups at other companies the effort here has felt much reduced.
I find the dismissal of buck2 pretty shallow. Most of the world is not already heavily invested to bazel, so compatibility is imho overstated; I don't see it being that far-fetched for something like buck2 to leapfrog bazel. That being said, buck2 definitely would need some love from outside meta to really be viable competitor, right now it still feels like half-complete code drop
I agree. The biggest issue with Buck 2 for me (apart from documentation) is the lack of something like bzlmod. There's actually a decent number of modules available for Bazel now:
https://registry.bazel.build/all-modules
But with Buck2 you're stuck with `http_archive` and vendoring.
Seems like a non-issue to me. I work with a relatively large Bazel monorepo, and we have to vendor pretty much anything anyways. Many 3rd-party rules might need patches to work properly with our custom toolchains/rules.
Bzlmod sounds nice for small projects, but for big monorepos within organizations with established processes it is more of a hassle, and I imagine that most Bazel users are not using it for small projects.
Yeah it's more of an issue for small projects. But I don't think Buck/Bazel should be reserved for megarepos with thousands of contributors. Why can't small projects use it?
bzlmod feels better at first but you're left navigating version conflicts anyways.
Vendoring deps is the only true real sane way for large repos
To be honest, I'd rather not see Buck2 repeat the mistakes of Bazel so early on, especially when it took a lot of time before settling on bzlmod.
Frankly, I'd rather it go the other way: just have a gigantic 'buck2pkgs' repo that has everything inside of it -- literally fucking everything -- just like Nix and Nixpkgs have done. I've watched and committed to Nixpkgs for over 10 years now and I think being a monorepo has massively contributed to its success, ease of contribution, and general usability. And in practice it means 99% of projects only need one "dependency" and they are done and have everything they could want.
In theory a buck2pkgs could even be completely divorced from Meta's existing Prelude, because you can just reimplement it all yourself, without being constrained by their backwards compatibility with Buck1. That would reduce the surface area between the two projects substantially and let you do a lot of huge cleanups in the APIs.
I actually started doing this a while back but it is of course a massive amount of work. I do think it's a much better way to go, though...
I miss Meta's arc + buck2 + eden + "hg" + "phabricator" + testing infrastructure that does a lot of acceleration of builds, testing, and landing commits without rebasing.
The biggest problem with everything NotBazel in this space is IDE support. JetBrains have already moved their build for IntelliJ to it, and are aggressively adding native first party support.
yeah. author misses that buck2 is nix like. seems he is in mind of prev gen assembly systems.
Buck2 doesn't seem to support JS/TS, so it's not ready for our use case.
In what sense do you mean support? buck2 is afaik completely language agnostic, you can use it to drive any compilers/tools you want?
Probably means a lack of out of the box rulesets. Writing your own rules to use a build system for a typical project is an unrealistic ask, IMO.
Great article, that gets the critique exactly right. The most frustrating part of Bazel is how shoddy the workmanship is. For example, Bazel throws away your analysis cache when you change flags that have nothing to do with what's being built or how, like flags that change what tests are run.
If course the biggest issue is that tiny operations take more time than necessary. For example, at $PreviousJob we wrote custom CLI tools and put them in our monorepo, but since you need Bazel to run them, running a basic tool took 9-12 seconds. So this revelation from the article was totally unsurprising:
> This rewrite was carefully crafted to optimize memory layouts, avoiding unnecessary pointer chasing (which was impossible to avoid in Java). The results of this experiment proved that Blaze could be made to analyze large portions of Google’s build graph in just a fraction of the time, without any sort of analysis caching or significant startup penalties.
Yeah, with attentive engineering it's not surprising that you can get massive speedups.
Finally, Bazel is ok at building multiple languages, but awful at getting them to depend on each other. I don't know what's going on here, but whatever magic makes it possible for any language to depend on C and C++ was not extended to other possible dependencies. So now you get to fight or rewrite all the rules, which is another bag of worms.
> The most frustrating part of Bazel is how shoddy the workmanship is.
Without commenting on the above statement:
> For example, Bazel throws away your analysis cache when you change flags that have nothing to do with what's being built or how, like flags that change what tests are run.
I don't think this is a good example. Bazel's analysis cache clearing is to preserve correctness in the context of a legitimately algorithmically difficult problem. The fact Bazel has this limitation is a testament toward its correctness strengths. I'm not aware of systems that have solved that algorithmic problem but curious if anyone knows any.
Also Bazel avoids that problem for most "flags that change which tests are run", since that subset of the problem is more solvable. --test_env was a notable exception, which was fixed (https://github.com/fmeum/bazel/commit/eb494194c1c466f7fd7355...) but not sure if it's in the latest Bazel release yet? But generally changing --test* has much smaller performance impact.
>On the other hand, we need a tiny build system that does all of the work locally and that can be used by the myriad of open-source projects that the industry relies on. This system has to be written in Rust (oops, I said it) with minimal dependencies and be kept lean and fast so that IDEs can communicate with it quickly. This is a niche that is not fulfilled by anyone right now and that my mind keeps coming to;
Yes please!
Everyone says they want a "tiny" "minimal" "lean" build-system, but there is lots of real complexity in these areas:
- Cross-compilation and target platform information
- Fetching dependencies
- Toolchains
I'm not sure a system that solves these would still be considered minimal by most, but those are table-stakes features in my view.
If you don't need these things, maybe stick with Make?
I believe that the biggest problem is that different "compilers" do different amount of work. In the race to win the popularity contest many, and especially newer languages offer compilers packaged with "compiler frontend", i.e. a program that discovers dependencies between files, links individual modules into the target programs or libraries, does code generation etc. This prevents creation of universal build systems.
I.e. javac can be fed inputs of individual Java source files, similar to GCC suite compilers, but Go compiler needs a configuration for the program or the library it compiles. Then there are also systems like Cargo (in Rust) that also do part of the job that the build system has to do for other languages.
From a perspective of someone who'd like to write a more universal build system, encountering stuff like Cargo is extremely disappointing: you immediately realize that you will have to either replace Cargo (and nobody will use your system because Cargo is already the most popular tool and covers the basic needs of many simple projects), or you will have to add a lot of work-arounds and integrations specific to Cargo, depend on their release cycle, patch bugs in someone else's code...
And it's very unfortunate because none of these "compiler frontends" come with support for other languages, CI, testing etc. So, eventually, you will need an extra tool, but by that time the tool that helped you to get by so far will become your worst enemy.
I have seen this first hand with Bazel. You have lots of Bazel rules that are partial reimplementations of the language specific tooling. It usually works better - until you hit a feature that isn’t supported.
I think the idea for these words here is more about preferring speed over remote execution and large build caching type of features, but not limiting the subset of toolchain functionality etc.,. In theory if you scoped your build tool to only support builds of sufficiently small size you can probably remove a lot of complexity you have to deal with otherwise.
Intelligent caching is also table-stakes though. It requires a detailed dependency graph and change tracking, and that's not something that can simply be relegated to a plugin— it's fundamental.
I agree, yet Make, CMake or even Node package scripts are more popular than Bazel.
Right, and I think that's a combination of a few factors— first of all, there's the basic momentum that CMake is widely known and has a huge ecosystem of find modules, so it's a very safe choice— no one got fired for choosing https://boringtechnology.club
But bigger than that is just that a lot of these build system and infrastructure choices are made when a project is small and builds fast anyway. Who cares about incremental builds and aggressive caching when the whole thing is over in two seconds, right? Once a project is big enough that this starts to be a pain point, the build system (especially if it's one like CMake that allows a lot of undisciplined usage) is deeply entrenched and the cost of switching is higher.
Choosing technologies like Nix or Bazel can be seen as excessive upfront complexity or premature optimization, particularly if some or all of the team members would have to actually learn the things— from a manager's point of view, there's the very real risk that your star engineer spends weeks watching tech talks and yak shaving the perfect build setup instead of actually building core parts of the product.
Ultimately, this kind of thing comes back to the importance of competent technical leadership. Infrastructure like build system choice is important enough to be a CTO call, and that person needs to be able to understand the benefits, to weigh activation costs against the 5-10 plan for the product and team, and be able to say "yes, we plan for this thing to be big enough that investing in learning and using good tools right now is worth it" or "no, this is a throwaway prototype to get us to our seed money, avoid any unnecessary scaffolding."
If it doesn’t have those features then why would I even use it at all? Remote build and caching are the entire reason I’d even think about it.
It's not Rust, it's not production ready, and I haven't actually used it (only read the README and blog posts), but I really enjoy the ideas behind https://github.com/256lights/zb - see https://www.zombiezen.com/blog/2024/09/zb-early-stage-build-... for a list of these ideas.
In the 2020s I have used BUCK, BUCK2, and Blaze every working day. I don’t feel like getting in to details but I’ll just say that BUCK2 is the best of the 3 and I’m delighted it’s open source.
I don’t have experience with Bazel as it exists externally but I expect the internal Blaze has advanced far beyond it by now as the entire Google engineering world depends on it.
https://github.com/facebook/sapling/tree/main/eden
I'm still unsure if a complete, viable eden solution was released with all of its necessary components.
One thing not mentioned is the massive amount of complexity and difficulty involved in Starlark rules. In the next generation of build tools, I really wish Starlark could be replaced with some subset of TypeScript, which would have drastically better IDE support. And I wish the typing would be done in such a way that it's harder to do things that are bad for performance. Starlark is super difficult to read, navigate, and write, and there are a lot of performance gotchas.
I know there are efforts like starpls (language server for Starlark) but in my experience it really falls short. I think buck2 has type annotations but I kind of hate Python-style types, so I wish we could just use TypeScript :P TypeScript is even better positioned to fill this role now that tsc is being re-written in Go.
> Starlark is intended to be simple. There are no user-defined types, no inheritance, no reflection, no exceptions, no explicit memory management. Execution is finite. The language does not allow recursion or unbounded loops.
That has some pretty useful properties for a config language that I'm not sure TypeScript helps much with. That said, Lean has these properties too, as an alternate language with good IDE support, and it would be pretty fun to write proofs of your build scripts.
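For concreteness, a small illustrative sketch (mine, not from the thread or the Bazel docs) of what those restrictions feel like inside a .bzl file:

```python
# Hypothetical .bzl snippet: Starlark is Python-shaped but deliberately bounded.

def total_size(src_sizes):
    # Bounded iteration over a concrete list is allowed...
    total = 0
    for size in src_sizes:
        total += size
    return total

def countdown(depth):
    # ...but there is no `while` statement at all, and a function that calls
    # itself (directly or indirectly) is rejected as recursion, so evaluating
    # a build file is guaranteed to terminate.
    return [depth - i for i in range(depth)]
```

That boundedness is part of what makes it safe for a build tool to eagerly evaluate thousands of these files.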
You'll be happy to know that the Starlark team has recently started working on adding static typing! See https://github.com/orgs/bazelbuild/projects/21 and https://bazel.build/rules/language#StarlarkTypes
> a lot of performance gotchas
Can you list some examples?
I worked on a project that used Bazel for a bit (but it ended up not being a great fit).
I think the part I hated most about Bazel was the fact that there were in fact two dialects: one for BUILD files and one for .bzl files. The things you do in them are different, but the documentation is always vague on what goes where.
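To make the split concrete, here's a minimal hypothetical example (file and target names made up) of what goes where. Function definitions (macros and rule implementations) can only live in .bzl files; BUILD files can only load and instantiate them, and `def` isn't allowed there at all.

```python
# defs.bzl: the "programming" dialect, where def, macros, and rule logic live.
def my_cc_library(name, srcs, **kwargs):
    # A macro: expands into a real target while the BUILD file is loaded.
    native.cc_library(
        name = name,
        srcs = srcs,
        copts = ["-Wall"],
        **kwargs
    )
```

```python
# BUILD: the "declaration" dialect, just load() calls and target instantiations.
load(":defs.bzl", "my_cc_library")

my_cc_library(
    name = "foo",
    srcs = ["foo.cc"],
)
```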
I'm curious, how did you run into performance issues with starlark?
I find anyone this dismissive of Java hard to take seriously. Java is a cutting-edge language platform where the achievable performance at the limit is on par with C++. Also, the article's hook didn't grab me. My laptop is way, way more powerful than any "beefy workstation" they ever assigned me at Google, starting with a dual-core 2400MHz Opteron. Bazel's server performance on my laptop or workstation is a total non-issue. There are issues with Bazel of course; that's just not one that smells right to me.
Performance of Java in general and Bazel in particular tends to be not CPU-limited but memory-limited. There was a time when MacBooks maxed out at 16 GB, and running Chrome + an IDE + large Bazel build on a machine with <16 GB memory was not a great experience.
Yep. It was common lore at Facebook last decade that the reply to “my build is broken” was “raise the JVM heap limit!”
Why do you think this author (a long-time senior Blaze maintainer) believes this? Do you think the Blaze team just doesn't have the necessary expertise?
I have no idea but dismissing Java as not "a real systems language" just isn't a convincing critique.
As I said in the article, the lack of green threads, the lack of value types, and the inability to avoid boxing primitives have all gotten in the way of optimizing Bazel to the levels shown by the prototype Go reimplementation. These are all things you'd expect a systems language to offer, and Java did not have them. They are finally materializing now, but it's a little too late.
Then you also have the slow startup time of the JVM plus Bazel's initialization, which leads to a possibly-unnecessary client/server process design. This is more wishy-washy though, as this might be optimizable to levels where the startup time is unnoticeable -- but the fact that it hasn't happened in years means that it might not be an easy thing to do.
FWIW, the client-server design is also used in Buck2, but it has advantages beyond startup time, like keeping track of the filesystem with inotify or watchman so that it already has fresh information about the state of the build graph by the time you run `build`.
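A toy Python sketch (not Buck2's or Bazel's actual code) of why the resident daemon pays off beyond startup time: the dependency graph stays in memory, and change notifications only dirty the affected nodes, so there is no cold re-scan when a `build` request arrives. Real daemons subscribe to inotify/watchman/FSEvents; this uses mtime polling purely to stay standard-library-only.

```python
import os
from collections import defaultdict

class GraphDaemon:
    """Keeps a target graph resident and incrementally invalidates it."""

    def __init__(self, deps):
        # deps: target name -> list of source files that target reads.
        self.deps = deps
        self.rdeps = defaultdict(set)                 # file -> targets reading it
        for target, files in deps.items():
            for f in files:
                self.rdeps[f].add(target)
        self.mtimes = {f: os.path.getmtime(f) for f in self.rdeps}
        self.dirty = set(deps)                        # everything is dirty at startup

    def poll(self):
        """Stand-in for an inotify/watchman subscription callback."""
        for f, old in self.mtimes.items():
            new = os.path.getmtime(f)
            if new != old:
                self.mtimes[f] = new
                self.dirty |= self.rdeps[f]           # invalidate only the readers

    def build(self):
        # The dirty set is already up to date by the time a client connects,
        # which is the "fresh information" advantage a one-shot process lacks.
        work, self.dirty = self.dirty, set()
        return sorted(work)
```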
For one, Java programs always take too long to start, and that's not something you want in a program that's supposed to be interactive.
I don't disagree in general but in Bazel's case this path has been heavily optimized. Maybe there are limits to it but "java startup slow" is a 101-level complaint.
In fact, I don't even think the client program for Bazel is written in Java, but C++, and the Java daemon it talks to is started when you first attach to a Bazel workspace and it persists, so subsequent interactions are very fast. Just running `bazel` randomly is not truly indicative of what using it feels like, because that's not what people actually use it for. People use it inside a build workspace, that is the case that matters.
Beyond that, other build systems like Buck2 (Rust) also use the client-daemon architecture for a number of other reasons, including the fact that actually-really-large build graphs are far too large to rebuild on every invocation, so the daemon is necessary to keep the build graph and interactively analyze and incrementally invalidate it on demand. Doesn't matter if it's Rust or Java, these things are architectural. That's also one of the points of the article, of course, and why they're theorizing about server-side analysis caches and graphs, etc.
This all indicates to me that the people designing these systems actually do care about interactivity of their tool. It is not a matter of "java startup slow" to use a client-server design, though it certainly is known to help.
See, that's exactly the type of uncritical statement that will force me to discount the rest of your statements on a whole range of topics.
Takes 1.97s on my machine, which is a lot of time for just showing a help message. Maybe you have the setup where Bazel is staying alive in the background or something?
Literally everyone who uses Bazel uses it in client-server mode, with a persistent daemon.
Daemonization introduces a range of potential cache invalidation issues. The issues are solvable, but whose KPIs depend on getting to the bottom of them?
Do you have specific examples in the context of blaze/bazel here? I think "the set of cache invalidation issues" you're describing are basically "the set of cache invalidation issues blaze/bazel intends to solve", so the answer to "whose KPIs" is "the blaze team".
I haven't used Bazel but I have used buck1 extensively, and the daemonized mode was quite buggy and often required the process to be killed. Quite frequently the process was just wedged, and even when it wasn't, memory leaks were quite common. Standard black box debuggers like strace and DTrace also become harder to use (e.g. you need to attribute a particular set of syscalls to a client, and a mindless "strace buck build ..." doesn't do what you want).
Daemonization is sometimes necessary, but it introduces lifecycle management problems that get in the way of robustness. Daemonization simply because your choice of programming language has bad startup times doesn't seem like a great idea to me.
I think on-disk cache invalidation and in-memory cache invalidation are distinctly different in practice.
I asked this because I've used bazel some, and blaze fairly extensively, and despite having done some deeply cursed things to blaze at times, I've never had issues with it as a daemon, to the point where I'd never consider needing to run it in non-daemonized mode as part of a debug process. It just works.
Second, "startup-time" is probably the least relevant complaint here in terms of why to daemonize. Running `help` without a daemon set up does take a few seconds (and yeah that's gross), but constructing the action graph cold for $arbitrary_expensive_thing takes a minute or two the first time I do it, and then around 1 second the next time.
Caching the build graph across builds is valuable, and persisting it in memory makes a lot more sense than on disk, in this case. The article we're discussing even argues in favor of making this even more extreme and moving action graph calculation entirely into a service on another machine because it prevents the situation where the daemon dies and you lose your (extremely valuable!) cached analysis graph:
> Bonanza performs analysis remotely. When traditional Bazel is configured to execute all actions remotely, the Bazel server process is essentially a driver that constructs and walks a graph of nodes. This in-memory graph is known as Skyframe and is used to represent and execute a Bazel build. Bonanza lifts the same graph theory from the Bazel server process, puts it into a remote cluster, and relies on a distributed persistent cache to store the graph’s nodes. The consequence of storing the graph in a distributed storage system is that, all of a sudden, all builds become incremental. There is no more “cold build” effect like the one you see with Bazel when you lose the analysis cache.
If you're worried about cache invalidation and correctness issues due to daemonization, I think you'd want to be even more concerned about moving them entirely off machine.
(I'm also not sure how their proposal intends to manage, e.g., me and Phil, on different machines, trying to build conflicting changes that both significantly affect how the analysis graph is calculated, thrashing each other's state. Either you namespace the graphs so they don't thrash (and then you've just moved the daemon to the cloud), or you do some very fancy partial graph invalidation, but that isn't discussed, and it feels like it would be white-paper worthy.)
Yeah, moving more things into the cloud makes debuggability even worse than today. I think that's a very large downside.
> I haven't used Bazel
That was the perfect place to end the comment.
Are you saying that daemonization doesn't make black box debugging harder?
And part of why it's a persistent daemon is because Java is slow to start up.
I'll note that bazelisk is written in Go.
Exactly. And a Go program that forks a Java program that makes an RPC is still faster than `cargo help`
For example, Java's signal handling is not up to par with systems languages. CLI tools that orchestrate operations need to have high-quality signal handling in order to be robust.
Proper, dependency-aware build caching makes or breaks a build system. Meta's buck2 builds tend to be crazy fast because of this and it can also do synthetic checkouts with arc/"hg" via eden.
Build systems are workflow engines.
Workflow engines fundamentally separate and structure code and execution into discrete steps, with a graph visualizer (hopefully the graph is acyclic).
Maybe it's just a bad example, but the code I saw looked very, very unseparated. I understand that most people who do builds want to start out with the one-build-file-to-rule-them-all model.
But ultimately you are dealing with a workflow execution, if you're doing anything beyond the most trivial single-library build.
The next generation of build tools needs to have visualization and visualized-execution assets.
There are a great many domains that workflow engines fundamentally apply to. Unfortunately, they have become politically associated with management trying to replace developer labor, usually because of the snake-oil pitches that workflow-engine vendors make to C-suite people.
A good common visualization and execution set that's cross language would have been really helpful in so many computing domains. It's almost like we need the SQL of workflow engines to be invented
You're not wrong about build systems being workflow engines, but in my experience the risk of over-generalizing is a worse user experience. One of the most valuable lessons I've learned in my career is that a clean UX often requires messy abstractions underneath.
> (I can’t name names because they were never public, but if you probe ChatGPT to see if it knows about these efforts, it somehow knows specific details.)
I'm going to laugh if Google joins the NYT in suing OpenAI for copyright infringement from having trained on proprietary data. That said: I also wonder how that actually made it in
These were not well-guarded secrets (and the ideas were pretty "obvious" if you had faced the problems before in dealing with Bazel and large codebases). In talking to people at the conference, several knew about these efforts (and some even knew the codenames), so I assume these had been mentioned here-and-there in discussion threads.
I can say with full certainty that thousands of engineers at Google were feeding their day-to-day LLM-assisted coding queries into the ChatGPT web UI for about two years before a custom Chrome extension was pushed to employee laptops to stop it.
So GPT has likely post-trained on a lot of Google code knowledge.
I just don't understand how the decision of which bits of a project need rebuilding can be so complex.
If I edit 50 lines of code in a 10GB project, then rebuild, the parts that need rebuilding are the parts that read those files when they were last built.
So... The decision of what to rebuild should take perhaps a millisecond and certainly doable locally.
50 lines in a single .cc source file which is only compiled once to produce the final artifact - sure, easy to handle.
Now consider that you are editing 50 lines of source for a tool which will then need to be executed on some particular platform to generate some parts of your project.
Now consider that you are editing 50 lines defining the structure and dependency graph of your project itself.
That rule misses important cases:
• Adding a file. It hasn't been read before, so no tasks in your graph know about it. If you can intercept and record what file patterns a build tool is looking for, that helps, but you often can't, because programs frequently match against directory contents themselves, not in a way you can intercept.
• File changes that yield no-op changes, e.g. editing a comment in a core utility shouldn't recompile the entire project (a content-hash check, like the sketch after this list, catches that case). More subtly, editing method bodies in a Java program doesn't require its users to be recompiled, but editing class definitions or exposed method prototypes does.
• "Building" test cases.
• You don't want to repeat work that has been done before, so you want to cache it (e.g. switching branches back and forth shouldn't rebuild everything).
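A rough Python sketch of the content-hashing idea behind the last two bullets (my illustration, not how Bazel or any particular tool implements it): key each action's result by a digest of its command line plus the digests of its inputs, and skip the action on a hit.

```python
import hashlib
import json

def digest_file(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def action_key(cmd, input_paths):
    """Cache key = hash of the command plus the content hash of every declared input."""
    payload = json.dumps({
        "cmd": cmd,
        "inputs": {p: digest_file(p) for p in sorted(input_paths)},
    }, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

cache = {}  # action key -> outputs (digests or paths into a content-addressed store)

def maybe_run(cmd, input_paths, execute):
    key = action_key(cmd, input_paths)
    if key not in cache:
        cache[key] = execute(cmd)  # actually compile/link/test/...
    # On a hit the work is skipped entirely. If an upstream edit (say, a comment
    # change) produced a byte-identical object file, every downstream action sees
    # the same input digests and hits the cache (early cutoff). Switching branches
    # back and forth also lands on previously seen keys, so nothing reruns.
    return cache[key]
```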
If your system is C based, tup [0] fulfills your request by watching the outputs of the specified commands. It isn't, however, appropriate for systems like java that create intermediate files that the developer didn't write [1].
[0] https://gittup.org/tup/ex_a_first_tupfile.html
[1] https://github.com/gittup/tup/issues/113
Back to Bazel: I am of the impression that some of its complexity comes from a requirement to handle heterogeneous build systems. For example, some Python development requires resolving both Python dependencies and C. Being good at either is a bunch of work; handling both means rolling your own polyglot system or coordinating both as second-class citizens.
It is well possible by changing 50 lines of code in a 10GB project you have to rebuild the entire project, if everything (indirectly) depends on what you just changed.
I mean, easy things are easy to build?
It is not at all uncommon to have changes percolate out into larger impacts than you'd expect, though. Especially in projects that attempt whole program optimizations as part of the build.
Consider anything that basically builds a program that is used at build time, which is not that uncommon when you consider how much ML models have grown. Change that tool, and suddenly you have to rebuild the entire project if you didn't split it out into a separate graph. (I say ML, but really any simple linter/whatever is the same here.)
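As a concrete illustration, here's a hedged Bazel-flavored sketch (target and file names made up) of the build-time-tool case: the generator is itself a target, so touching its sources invalidates the genrule and, transitively, everything built from its output.

```python
# BUILD (hypothetical)
cc_binary(
    name = "codegen",          # a tool that is built and then *run* during the build
    srcs = ["codegen.cc"],
)

genrule(
    name = "gen_api",
    srcs = ["api.def"],
    outs = ["api.h"],
    tools = [":codegen"],      # editing codegen.cc dirties this rule...
    cmd = "$(location :codegen) $(SRCS) > $@",
)

cc_library(
    name = "api",
    hdrs = [":gen_api"],       # ...and therefore everything that depends on :api.
)
```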
> the parts that need rebuilding are the parts that read those files when they were last built...
...and the transitive closure of those parts, which is where things get complicated. It may be that the output didn't actually change, so you can prune the graph there with some smarts. It may be that the thing changed was a tool used in many other rules.
And you have to know the complete set of outputs and inputs of every action.
And on and on.
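A small Python sketch of that walk (an illustration, not Bazel's Skyframe): start from the changed files, follow reverse-dependency edges, and stop descending wherever a rebuilt node's output turns out to be unchanged. It assumes you already know the complete input/output set of every action, which is exactly the hard part noted above.

```python
from collections import defaultdict, deque

# rdeps: file or target -> the targets that directly consume it (reverse edges)
rdeps = defaultdict(set)

def rebuild_affected(changed_files, rebuild):
    """Walk reverse dependencies outward from the changed files.

    `rebuild(target)` re-runs the target's actions and returns True only if its
    output actually differs from last time (a comment-only edit returns False),
    which lets us prune the walk instead of rebuilding the whole closure.
    """
    queue = deque(t for f in changed_files for t in rdeps[f])
    done = set()
    while queue:
        target = queue.popleft()
        if target in done:
            continue
        # A real scheduler would only visit a target once all of its dirty
        # dependencies have finished; that ordering is omitted here for brevity.
        done.add(target)
        if rebuild(target):
            queue.extend(rdeps[target])  # output changed: revisit its consumers
        # else: early cutoff, nothing further downstream is touched
    return done
```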
Speaking as the manager of the original Buck team: rules should not be Starlark. They should be `wasm`.
I'm super into the component model, but honestly the tooling is still not ready. Basically only LLVM (and really just C) has good enough tooling. The rest of the wasm toolchains are still heavily focused on core modules, and if they support components it's only `wasi:cli/run`.
Also dying for an explanation here
Wasm, specifically using the component model [1], is really good for plugin use-cases.
- Polyglot; you can write plugins (in this case Bazel rules) in "any" language, and the component model is easily the best system ever created for cross-language interoperability.
- Ecosystem; tooling is still generally immature, but you'd be surprised how much buy-in there is. Web apps are a huge incentive for language developers to support it, and Wasm now runs in a lot of places besides the browser too. Envoy and Istio support Wasm plugins. So does Postgres. So does neovim.
- Component interfaces are richly typed. You get things like lists, structs, "resources" (aka objects with instance methods), etc. all supported idiomatically in whichever language you're using and these can be used across language boundaries. Even between GC and non-GC languages. Even with Rust's ownership model.
- Plugins are fully sandboxed, so a host can allow plugins to have things like `while` loops while enforcing runtime limits on a per-instruction basis. It also means Bazel wouldn't have to invent its own bespoke sandboxing solutions for every supported platform, as it does today. And Wasm is designed to handle untrusted Web code, so there are tons of security guarantees.
- Performance is already really good, and there's every reason to believe it will get better. Again, Browser support is big here. The things the Chrome team did to optimize V8 are pretty incredible, and they're now turning their attention to Wasm in a big way.
[1]: https://component-model.bytecodealliance.org/design/why-comp...
Why?
I don’t understand
What? Isn't wasm just a bytecode? How could you write rules in wasm?
Mentions Buck, but no mention of Pants or Please.
The author clarifies that he wrote the section about Buck2 to demonstrate the need to be Bazel compatible (as opposed to Buck2), because the friction to try it out in a real code base is essentially insurmountable.