Using Monorepos Is Not That Bad [Case Study]

Deciding whether or not to use a monorepo is difficult for a lot of reasons, especially when you do some quick research and read this article 🔗 titled “Monorepos: please don’t” and then this one 🔗 titled “Monorepos: please do!”. There are a lot of good arguments on both sides. In this article I’m telling a quick story about me and my team at Zoover: why we decided to adopt a monorepo in our projects, what tools we used, what the impact was, and how we worked around the common issues.

The article looks a bit long, but believe me, it is worth every moment you will spend reading it.

A brief about the problem

Let me first tell you what the situation was and what pushed us to that decision in the first place.

I had been working for a company called Zoover 🔗 since August 2019, in the travel industry. Shortly after we worked hard on building our booking system to turn into an OTA (online travel agency), COVID-19 hit hard, governments shut down travel, and obviously we were affected badly by that.

A few months later we were acquired by another company called Vakanties, which left us (the dev team) with 2 brands to support and a lot of services/libs that should be shared. However, the tech stacks did not really match, and here comes the problem.

As you can see, there is some overlap in the tech stacks, but a lot of things are still not easily reusable between the 2 websites. Since we are now a single team developing and supporting 2 different websites, we need to minimise the time it takes to create a new feature or fix a bug for both brands as much as possible.

The core functionality of an OTA, the booking process, is almost identical on our 2 websites, and we needed to share at least that part in the beginning.

And a lot of challenges started to surface:

  1. 2 big repositories with 2 large git histories
  2. Different APIs on different pages and different 3rd-party libs
  3. Different development stacks (state, routing, …etc)
  4. How to manage dependencies for shared parts
  5. Deployments: how, when, and with which resources
  6. Deciding which one is going to be merged into the other
  7. and much more …

The goal here, or the key metric for calling this step a success, was to burn more story points each sprint and to reduce the effort developers need to create, fix, and change things in both brands at the same time, sharing services and common parts so we can use the same 3rd-party libs in both brands.

First thoughts about solutions

I researched how people with a similar problem tried to fix it, and I found a respectable number of people talking about different ways to overcome it. The most obvious ones that I found useful were:

1 - Using monorepos

💡 First, what are monorepos?

Simply put, a monorepo is one big repo that holds multiple apps in a folder structure instead of spreading them across multiple repositories. And not only apps: also libs, docs, tests, build files, backend, frontend, …etc.

🤔 Who is using monorepos?

Many FAANG 🔗 companies and other tech giants (Facebook, Netflix, Twitter, Microsoft, … etc.) reportedly use them, which suggested it might be a good solution for us too: if it works for those giant companies, monorepos could most probably work for us as well.

However, during my research I ran into some drawbacks of monorepos that made me worry about the future of the project. These are some very common issues raised by people who are against monorepos:

  • Git slowdown: as the codebase grows, the performance of simple git commands like git status might suffer
  • Broken master: as all apps live in the same repo, anyone who breaks the master or main branch by mistake affects all other teams’ work
  • No team autonomy: all teams have to use the same tech stack, or at least have limited room to change the tooling because of the shared parts
  • Long build time: building the whole app takes much longer than building only the updated service or lib

Given these results, I decided to park the idea of a monorepo for now, as it has its own problems.

2 - Using polyrepos and publishing shared code as packages (npm/GPR)

Another solution was to keep using multiple repositories (polyrepos) as we did at the time, and to build a small PoC (Proof of Concept) to see how smooth it would be to share code, assets, and APIs between the 2 brands.

The shared code can be packaged as a shared library and published to a private registry on npm or GPR (GitHub Package Registry). Versioning each published lib lets us keep supporting the 2 brands at the same time. Sounds like a good one 👌

I found that it has some very good features and also some drawbacks. Here is what I found:

  • Strong team ownership: each team owns a specific part in a separate repo, which might be useful for splitting responsibilities.
  • Fast build time: with separate repos, each build takes a short time.
  • Isolated master breaks: if someone breaks master, it affects a single app or service and not the others.
  • Multiple versions of each lib: versioning could be helpful.
  • Duplicate work: some code cannot be shared, so it gets copied and pasted.
  • Access to different repos: every team member needs access to each app/lib.
  • Dependency hell: the diamond dependency problem.
  • Overhead of publishing dependencies: on npm or GPR.
  • Heavy setup for newcomers: the onboarding process takes longer.
  • Coding style/architecture silos: each team ends up with a different code standard, and it cannot easily be enforced across the whole codebase.
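
To make the diamond dependency problem from the list above more concrete, here is a minimal sketch of how such a conflict could be detected. The package names, the `DependencyMap` shape, and the `findVersionConflicts` helper are all made up for illustration, and the check compares exact version strings rather than real semver ranges.

```typescript
// Hypothetical input shape: package name -> { dependency name -> requested version }.
type DependencyMap = Record<string, string>;

interface VersionConflict {
  dependency: string;
  // Each requested version mapped to the packages that ask for it.
  requestedBy: Record<string, string[]>;
}

// Reports every dependency that is requested in more than one version.
function findVersionConflicts(
  packages: Record<string, DependencyMap>
): VersionConflict[] {
  const byDependency: Record<string, Record<string, string[]>> = {};

  for (const pkg of Object.keys(packages)) {
    for (const dependency of Object.keys(packages[pkg])) {
      const version = packages[pkg][dependency];
      const versions = byDependency[dependency] ?? {};
      versions[version] = [...(versions[version] ?? []), pkg];
      byDependency[dependency] = versions;
    }
  }

  return Object.keys(byDependency)
    .filter((dependency) => Object.keys(byDependency[dependency]).length > 1)
    .sort()
    .map((dependency) => ({ dependency, requestedBy: byDependency[dependency] }));
}

// Made-up example: two shared libs pull in "date-utils" in different versions.
const conflicts = findVersionConflicts({
  "booking-ui": { "date-utils": "1.4.0", react: "17.0.2" },
  "search-ui": { "date-utils": "2.0.0", react: "17.0.2" },
});
console.log(JSON.stringify(conflicts, null, 2));
```

An app that consumes both libs would end up with two copies of the conflicting dependency, which is exactly the situation that gets harder to spot when every lib lives in its own repo.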

So for polyrepos, I found very good reasons to use them, but there are also some issues that I would have to face, as mentioned. So I had to park this option as well until I checked the last available solution.

3 - Using git submodules

The third proposed solution was to use git submodules. It basically means cloning a repository into a directory inside another repository and running some git commands to make that subdirectory a submodule of the parent git repo. You can read more here 🔗.

After doing some research on the expected results and building a quick PoC, here is what I found:

  • No big setup effort: almost no change to the current 2 repositories, which can be used as they are right now without merging.
  • Steep learning curve: learning new git commands that might be a bit more difficult is a challenge for developers who are used to the normal git commands.
  • Switching branches: a well-known issue with git submodules. When you switch branches in the parent repo, you have to run a command to switch in the submodule as well, which makes it error-prone.
  • Complex to understand: the techniques for working with submodules are a bit complex and hard to understand or explain.

Even this solution involves compromises, and adopting it was not an easy decision, especially after discussing it with the team.

Then I reconsidered the 3 solutions and compared them by the benefits we could gain from them, as each of them has its own problems.

From the comparison above, we found that the monorepo solution is the one that fits our needs most, and that we could work on finding ways to avoid its known problems, so let’s give it another shot. I also found the article “Monorepos: please do! 🔗”, which responds to all the problems raised in the opposing article.

Rise of the monorepo solution “again”

This time, I needed to make sure that we were picking the right tooling and that the problems mentioned above would stay far from our team, as we are a relatively small team and our app is much smaller in scale than the apps at the FAANG companies that have these problems.

In my journey to find a good monorepo tool, I found a lot of solutions by big companies, including Pants 🔗 by Twitter, Bazel 🔗 by Google, Buck 🔗 by Facebook, and Rush 🔗 by Microsoft, as well as other solutions made especially for applications of our size, like Nx, Bit, and Lerna.

Nx 🔗

I started with Nx. I had read very good reviews about the tool and its abilities and I was very enthusiastic to try it out. Here are the good things about using Nx:

  • It can manage projects with different stacks: given that we needed to refactor one app to use some of the newer stack from the other one, this feature might be very good for us.
  • Directed Acyclic Graph (DAG): Nx comes with a tool that draws a graph of your application’s dependencies and shows you what is going to be affected by your changes, very handy.
  • Support is really top: I needed some support and they jumped on a 1:1 call with me immediately. I’ll talk about that later.
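
The idea behind the dependency graph can be sketched in a few lines. This is not how Nx implements it, only an illustration of walking a DAG to find what a change affects; the project names and the `affectedProjects` helper are hypothetical.

```typescript
// Hypothetical graph: project name -> the projects it depends on directly.
type DependencyGraph = Record<string, string[]>;

// Returns the changed projects plus everything that depends on them,
// directly or transitively, sorted by name.
function affectedProjects(graph: DependencyGraph, changed: string[]): string[] {
  // Invert the edges so we can walk from a project to its dependents.
  const dependents: Record<string, string[]> = {};
  for (const project of Object.keys(graph)) {
    for (const dependency of graph[project]) {
      dependents[dependency] = [...(dependents[dependency] ?? []), project];
    }
  }

  // Breadth-first walk over the inverted edges.
  const affected: string[] = [];
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    if (affected.indexOf(current) !== -1) continue;
    affected.push(current);
    queue.push(...(dependents[current] ?? []));
  }
  return affected.sort();
}

// Made-up workspace with two brand apps sharing a booking flow and a UI kit.
const graph: DependencyGraph = {
  "zoover-web": ["booking-flow", "ui-kit"],
  "vakanties-web": ["booking-flow", "ui-kit"],
  "booking-flow": ["api-client"],
  "ui-kit": [],
  "api-client": [],
};

console.log(affectedProjects(graph, ["api-client"]));
console.log(affectedProjects(graph, ["zoover-web"]));
```

A change deep in a shared lib fans out to both brand apps, while a change inside one app stays local, which is the information you want before deciding what to test and build.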

👎 Some bad things (at that moment) that I found:

  • Depends on Angular CLI’s TypeScript releases: to update the TypeScript version in our project we needed to update Nx, which in turn waited for the next Angular CLI release every 6 months (too long).
  • Not ready for React: I faced many issues on the first try. After jumping on a call they helped me work around them, but that gave me the feeling that more of these issues would come in the future and would need a lot of support.

Bit 🔗

Then I decided to give Bit a try. Here are my thoughts at that moment:

  • Very easy to set up: it took about 2 steps to be in the game and start sharing the code you want with your team.
  • Very organised way to use and host code: a great way to try and preview running code, which could let us get rid of Storybook.
  • Very expensive: cost is a very important factor, and at $200/month it was a bit too much for our needs at the time.

Lerna 🔗

Finally, I switched to the great combination (Lerna + Yarn workspaces). It is future-proof and broadly used with our stack, and I had even used it before in a React lib side project, so I had some good experience with it, though none with big projects. Here is a list of good signals about using Lerna:

  • Easy setup: very easy and fast to set up with a few commands
  • Guaranteed: proven for our case and our stack
  • Free: so we got a lot of features without paying a cent
  • Downsides are coming next 👇

Turborepo 🔗 (An option that was not available back then)

When we faced this issue and I was researching solutions in 2020, Turborepo had not been published yet. If you don’t know it, Turborepo is a new build system introduced by Jared Palmer and acquired by Vercel in Dec 2021. I think it is a brilliant solution that could have been a great option to pick if it had been available back then. Here are my takes:

  • Easy setup: it is plug and go, with no major changes and super clear docs.
  • Trusted creators: it was built by learning from other solutions, by great creators backed by Vercel, which also powers Next.js, Turbopack (positioned as a successor to webpack), and more, so we have a history with the creators, the quality of the software they support, and the ecosystem they build around it.
  • Content-aware hashing and incremental builds: two of the features that caught my eye. A shared lib is not rebuilt for another module if its content has not changed since the last build anywhere in the monorepo, and anything already built is skipped, so each round rebuilds incrementally instead of rebuilding everything.
  • Few known problems: based on my research I didn’t find many problems with Turborepo. The most common comment was that it might not be clearly ready for production; however, the comment I found even more often was that any major problem gets fixed super fast.
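
The content-aware hashing idea can be illustrated with a small sketch. This is not Turborepo’s implementation: the `PackageInput` shape, the helper names, and the tiny djb2-style string hash are assumptions chosen to keep the example dependency-free, and a real tool would use a cryptographic hash over real files.

```typescript
// Small non-cryptographic string hash (djb2 style), for illustration only.
function hashText(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(16);
}

// Hypothetical package description: file path -> content, plus internal dependencies.
interface PackageInput {
  files: Record<string, string>;
  dependsOn: string[];
}

// A package's fingerprint covers its own file contents and the fingerprints
// of its dependencies. Assumes the dependency graph has no cycles.
function fingerprint(name: string, packages: Record<string, PackageInput>): string {
  const pkg = packages[name];
  const ownFiles = Object.keys(pkg.files)
    .sort()
    .map((path) => path + ":" + hashText(pkg.files[path]));
  const deps = [...pkg.dependsOn]
    .sort()
    .map((dep) => dep + "=" + fingerprint(dep, packages));
  return hashText(ownFiles.concat(deps).join("|"));
}

// Returns the packages whose fingerprint changed since the last run
// and stores the new fingerprints in the cache.
function packagesToBuild(
  packages: Record<string, PackageInput>,
  cache: Record<string, string>
): string[] {
  const stale: string[] = [];
  for (const name of Object.keys(packages).sort()) {
    const current = fingerprint(name, packages);
    if (cache[name] !== current) {
      stale.push(name);
      cache[name] = current;
    }
  }
  return stale;
}

// Made-up workspace: one app depends on a shared UI kit, docs stand alone.
const workspace: Record<string, PackageInput> = {
  "ui-kit": { files: { "button.tsx": "v1" }, dependsOn: [] },
  "brand-app": { files: { "app.tsx": "v1" }, dependsOn: ["ui-kit"] },
  docs: { files: { "readme.md": "v1" }, dependsOn: [] },
};
const buildCache: Record<string, string> = {};

console.log(packagesToBuild(workspace, buildCache)); // first run: everything is new
console.log(packagesToBuild(workspace, buildCache)); // nothing changed, nothing to build
workspace["ui-kit"].files["button.tsx"] = "v2";
console.log(packagesToBuild(workspace, buildCache)); // the kit and the app that uses it
```

Because the fingerprint is derived from content rather than timestamps, touching a file without changing it does not trigger a rebuild, and the untouched docs package is skipped in the last run.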

Challenges we faced when using a monorepo

Everything comes at a cost, and having a monorepo is no different. Some challenges are well known in the community, and others are specific to our case and were discovered during our trials with our first draft of a monorepo:

  • Learning curve for new tools/commands: whatever tool we picked would differ somewhat from what the team was used to.
  • How to merge the 2 repos and keep the git history: a big hassle. I wrote up my research and solution in an article here that you can check out, a very interesting solution.
  • Cost of building the monorepo setup: to build it in the first place we need to decide on the service (GitHub, Bitbucket, ..etc) and then set up scripts for running and linking the projects to each other. That might be a one-time effort, but it takes time.
  • What do we need to share? Types/Services/Components/…
  • Global types: any *.d.ts file is scoped to its own project only and is not shared with other apps.
  • Jest doesn’t support ES modules OOTB: that is something I had never faced before, and the error messages were not really helping much.
  • Dependency versions: deciding what to hoist and what not to hoist, and keeping the versions aligned across all apps.
  • React-dom errors: multiple versions produce a confusing hooks error that is not easy to trace.
  • Bundle size: tree shaking in webpack is not straightforward.
  • Theming: having 2 brands use the same components was new to the team, and we needed a solution for theming.
  • Deployments: a completely new way of deploying (sophisticated pipelines).
  • Broken master: it is true, any teammate can break the master branch and ruin everyone else’s day.
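
As an illustration of the theming challenge from the list above, one common approach is to keep shared components free of brand-specific values and pass a theme object in. The sketch below is only an assumption about how that could look, not the solution we shipped; the `Theme` fields and all colour and font values are made up.

```typescript
// Hypothetical design tokens that every shared component reads from.
interface Theme {
  brand: string;
  primaryColor: string;
  fontFamily: string;
  borderRadiusPx: number;
}

// Made-up values for the two brands.
const themes: Record<"zoover" | "vakanties", Theme> = {
  zoover: {
    brand: "Zoover",
    primaryColor: "#0a7cff",
    fontFamily: "Arial, sans-serif",
    borderRadiusPx: 4,
  },
  vakanties: {
    brand: "Vakanties",
    primaryColor: "#ff6b00",
    fontFamily: "Georgia, serif",
    borderRadiusPx: 12,
  },
};

// Shared component logic: the style is derived from the theme only,
// so the same button works for both brands.
function buttonStyle(theme: Theme): Record<string, string> {
  return {
    backgroundColor: theme.primaryColor,
    fontFamily: theme.fontFamily,
    borderRadius: theme.borderRadiusPx + "px",
  };
}

console.log(buttonStyle(themes.zoover));
console.log(buttonStyle(themes.vakanties));
```

Each brand app picks its theme once at startup, and the shared package never has to know which brand it is rendering for.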

Well, that was painful, but we finally managed it with some steps and other tooling we decided to adopt. Check that out in the next section.

Benefits we gained with a monorepo

Let’s talk first about the benefits we gained from having the monorepo setup for our 2 brands, with the frontend projects sharing a lot of code between them:

  • Single source of truth: yes, the monorepo has all the code in one place, and dependencies on shared code are finally clear.
  • Automatic linking of apps and packages: very easy, with one command.
  • Atomic commits: this is a very important feature of working in a monorepo. Imagine you need to change something in a shared lib without a monorepo: you commit it, push it, let someone review it, and deploy it, and only after it is deployed can you start making the matching change in the 2 brands. You never know whether the lib change fulfils the needs of both brands, so any modification means repeating the same process again and again. With atomic commits, the change in the shared lib and in all of its consumers lands in one go in the same commit: all succeed or all fail, which is very handy and saves a lot of time.
  • Diamond dependency problem fixed: no dependency is required in 2 different versions.
  • Codebase modernisation: we can now enforce code quality across the whole codebase.
  • Faster feedback loops: with atomic commits, any change in any part quickly tells you whether it works in all clients.
  • No need for extra permissions: you don’t need different permissions for different apps, only one, and you get all the code at once.
  • Enforcing a workflow for the whole team: that helps in avoiding deployment issues.
  • Easier cross-app builds: we can now build multiple apps at once with a single command or a single commit.
  • Way easier environment setup for newcomers: only one permission, one setup, not complex at all.
We also managed to work around the common issues mentioned about monorepos in general:

  • Git slowdown: not the case yet, but you can use Mercurial ✅
  • Broken master: use Git hooks (pre-push, pre-commit) ✅
  • Long build time: split the build (GitHub Actions, Lerna) ✅
  • Codebase complexity: write more docs + comments ✅
  • Bundle sizing: use absolute paths & chunking ✅
  • Still enhancing every time something new appears
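
Both the Git hook and the split build workarounds rely on knowing which packages a change touches. Here is a minimal sketch of that step, assuming the list of changed files is already available (for example from `git diff --name-only`); the `changedPackages` helper and the folder names are hypothetical.

```typescript
// Maps changed file paths to the workspace packages that own them.
function changedPackages(changedFiles: string[], packageRoots: string[]): string[] {
  // Normalise roots so "apps/zoover/" and "apps/zoover" behave the same.
  const roots = packageRoots.map((root) => root.replace(/\/+$/, ""));
  const owners: string[] = [];

  for (const file of changedFiles) {
    let owner: string | undefined;
    for (const root of roots) {
      const isInside = file.indexOf(root + "/") === 0;
      // Prefer the longest matching root so nested packages win.
      if (isInside && (owner === undefined || root.length > owner.length)) {
        owner = root;
      }
    }
    if (owner !== undefined && owners.indexOf(owner) === -1) {
      owners.push(owner);
    }
  }
  return owners.sort();
}

// Made-up workspace layout and change set.
const roots = ["apps/zoover", "apps/vakanties", "packages/ui-kit"];
const touched = changedPackages(
  ["packages/ui-kit/src/button.tsx", "apps/zoover/pages/index.tsx", "README.md"],
  roots
);
console.log(touched);
```

A pre-push hook or a CI job could then run lint, tests, and builds only for the returned packages instead of for the whole repo. Files outside any package, like the root README here, are ignored.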

Well, after all those trials and given the results we had, it is clear that: Using Monorepos Is Not That Bad!

Are monorepos for you?

Some of you might be very enthusiastic now about using a monorepo in your next project, BUT are monorepos for you too? The answer is that they are not for all projects and not for all teams; it always depends.

Here is a list of situations in which using a monorepo is not wise:

  • If you don’t have a lot of shared code: the most important gain from a monorepo is sharing code easily and having atomic commits or atomic deployments across projects. If you don’t need that, please don’t.
  • If you have some private projects/code parts: in a monorepo, anyone can access any project’s code. Some big companies have their own ways and tools to limit that, but most tools don’t, so be careful with private projects that you don’t want to share in a monorepo.
  • If you don’t suffer from dependency hell with polyrepos: as I mentioned earlier, polyrepos come with great benefits. If you have microservices in different repos, for example, and you have no problem with that, don’t bother with a monorepo; it would not bring a big benefit.
  • If you or your team are not ready for it: it is important that everyone in the team is capable of using the tools the team decided on. Otherwise it will turn into bad practices and jeopardise the whole project for nothing but the bad decision of adopting a tool that is not useful enough.
  • If you are going to have millions of lines of code later, think twice: most of the common issues, like slow git commands or long builds, show up at companies with millions of lines of code. Think twice before adopting a tool, and make sure that you can live with the issues that come with monorepos as well.

Conclusion

In this article we talked about the problem we faced at Zoover that pushed us to find a solution for sharing code easily and merging with another team’s codebase, and how a monorepo fit us compared to polyrepos or Git submodules. We also explained what monorepos are and who is using them, and showed some good tools that help in managing monorepos, including Nx, Bit, and Lerna.

Then we talked about the challenges we faced after adopting a monorepo and how we worked around them, covered all the benefits we gained from the monorepo solution, and showed that using monorepos is not that bad, although a monorepo is not for every project.
