Merging 2 git repos while preserving commit history
Recently, for the first time in my career, I had to merge 2 working repos with long commit logs into one repo, and the challenge was to keep the history of both repos after merging.
Doing that without caring much about the history is super easy: just copy/paste one of them into the other. In that case you keep only the history of the target repo. But what if you would like to create a brand new repo that holds both of them with a new setup (which was my case)? Now the problem gets bigger, as the copy/paste technique will make you lose both histories.
To explain the issue better, let's assume that we have a repo Original_A that has a history of 4 years of changes and a repo Original_B that has about 3 years of different changes. We would like to have both of them in a new monorepo (as we decided in the end), because we found quite a few features in monorepos that fit us best. Maybe I will talk about that in a different post.
The target in the end is to have both apps inside a new repo in different folders. To be honest, the article mentioned in the resources was quite helpful in giving me a starting point, but it was misleading and imprecise about which changes to make and in which path. That's why I decided to write this post in a clearer way.
Let's assume that the final result should be:
NewRepo
|_ ProjectA   // a folder that holds the content of the Original_A repo
|_ ProjectB   // a folder that holds the content of the Original_B repo
Step One: Cloning original repos
You should first have the 2 original repos that you want to merge in your projects folder. If you have already cloned them, you can skip this step.
Hint: do that outside the new repo folder, for example at the same level as the new repo folder.
Step Two: Rewriting the git history of both repos
⚠️ Update: after some trials/reads you could use
git subtree. You can sometimes even skip this step, in case you would like to keep the commit ids as they were for tracking or anything else (rewriting the history changes every commit id).
Since the content of each repo will live in a different folder inside the new repo, it is important to rewrite the paths of all files into the right folder before moving them. For that we are going to use the command git filter-branch, which has quite a lot of options but is very dangerous to use; try to avoid it if you can.
Go to the path of the first repo, Original_A, and run the following command:
git filter-branch --index-filter \
  'git ls-files -s | sed "s-\t\"*-&ProjectA/-" |
    GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
      git update-index --index-info &&
   mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD
Don't worry, I'll explain it line by line.
Line 1: git filter-branch is the main command, as I mentioned earlier, and it takes one of its options, --index-filter. You can read more about it in the git filter-branch documentation (see the resources), but concisely it rewrites the index of every commit without checking out the files.
Line 2: we list all the files in the index using git ls-files -s and then rename them with sed so that they start with ProjectA/ as the root folder instead of the original repo root (that is only the folder name from our final setup).
Line 3: we point the GIT_INDEX_FILE variable at a new temporary index file, $GIT_INDEX_FILE.new, just for the next command.
Line 4: git update-index --index-info builds that new index from the renamed list.
Line 5: finally, we move the new index over the original one, so it replaces it.
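Note: if you are using macOS, the sed that ships with it does not understand \t, so replace \t in line 2 with a literal tab character, typed as Control+V followed by Tab. It should become as follows: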
'git ls-files -s | sed "s-control+v, TAB\"*-&ProjectA/-" |
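If you want to see what the sed part does before running it on a real repo, here is a minimal sketch. The sample entry (mode, blob hash and file path) is made up for illustration, and instead of \t it stores a real tab character in a shell variable with printf, which should behave the same with GNU sed and with the sed on macOS.

```shell
# A literal tab, so the example does not depend on sed understanding \t.
tab=$(printf '\t')

# One simulated line of `git ls-files -s` output:
# <mode> <blob hash> <stage><TAB><path>   (values invented for the demo)
entry="100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0${tab}src/app.js"

# Same substitution as in the filter: keep the tab (and an optional opening
# quote) via &, then insert the new folder name right after it.
rewritten=$(printf '%s\n' "$entry" | sed "s-${tab}\"*-&ProjectA/-")

printf '%s\n' "$rewritten"
```

The path in the printed line should now start with ProjectA/. The "* after the tab is there because git wraps paths with unusual characters in double quotes, and the prefix has to land after the opening quote.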
Also, one of the articles in the resources mentions that if the repo you are merging has a single root folder (which rarely happens, IMO), you can use this command instead, where the first argument to mv is that root folder:
git filter-branch --force --tree-filter \
  'mv NewRepo/ ProjectA/ || true' --tag-name-filter cat -- --all
And of course, in both cases you need to repeat that for each repo, with ProjectB/ as the folder name for Original_B.
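Because git filter-branch rewrites history in place, it may be worth rehearsing the command on a throwaway repo first. The sketch below is one possible way to do that and is not part of the original steps: the sandbox repo, its file and its commit messages are invented, TAB is a helper variable introduced here so that sed gets a real tab on any platform, and FILTER_BRANCH_SQUELCH_WARNING only silences the warning that newer git versions print.

```shell
# Throwaway sandbox: the repo name matches the article, the contents are invented.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer git
export TAB="$(printf '\t')"              # helper: a literal tab for sed

# A tiny repo with two commits touching one file at the root.
git init -q Original_A
cd Original_A
echo one > app.txt
git add app.txt
git commit -q -m "first"
echo two >> app.txt
git commit -q -am "second"

# Same rewrite as in Step Two, with $TAB standing in for \t.
git filter-branch --index-filter \
  'git ls-files -s | sed "s-$TAB\"*-&ProjectA/-" |
     GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
       git update-index --index-info &&
   mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD

# Every commit should now only touch paths under ProjectA/.
git log --format=%s --name-only
```

If the log still shows app.txt at the root for any commit, the rewrite did not apply and it is better to find out here than in the real repo.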
Step Three: Create a new Repo
All the previous steps were done in the original repos, but now we are going to create a brand new directory, at the same level as the other repos, that will eventually become the monorepo. First create a new directory (folder), which we can call NewRepo, and open the terminal in its path. In the NewRepo folder we should end up with both repos merged in 2 folders, ProjectA and ProjectB. Now, in the terminal, initialize the NewRepo folder as a git repo simply by running this command in the folder path:
git init
Step Four: Add the 2 repos as remote repos for the "NewRepo" and fetch them
Set these two repos as remotes of the new repo and fetch their content into it as follows:
git remote add --fetch ProjectA ../Original_A/
git remote add --fetch ProjectB ../Original_B/
At this point nothing seems to have happened: you will not find any change in your folder, and it will remain empty. The fetched history is stored inside .git only.
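To convince yourself that the fetch really brought something in, you can list the remotes and their remote-tracking branches. The following is a minimal sketch on a throwaway sandbox, assuming both original repos have a branch called master; the file names and commit messages are invented for the demo.

```shell
# Throwaway sandbox standing in for the real projects folder.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two tiny unrelated repos, one commit each, on a branch named master.
for name in Original_A Original_B; do
  git init -q "$name"
  git -C "$name" symbolic-ref HEAD refs/heads/master
  echo "$name" > "$name/file.txt"
  git -C "$name" add file.txt
  git -C "$name" commit -q -m "$name: initial commit"
done

# The new repo, with both originals added as remotes and fetched.
git init -q NewRepo
cd NewRepo
git remote add --fetch ProjectA ../Original_A/
git remote add --fetch ProjectB ../Original_B/

git remote       # should list ProjectA and ProjectB
git branch -r    # should include ProjectA/master and ProjectB/master
ls -A            # only .git: the working tree is still empty
```

The remote-tracking branches are what the merge in the next step refers to as ProjectA/master and ProjectB/master.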
Step Five: Merge the 2 remote repos
Now we are at the final step: merging the history of both repos. The good thing to do here is to merge them one by one, to make sure that the root files of both do not conflict.
It is good to mention that merging disparate branches (repos) is now disabled by default in git, but it can be enabled with the --allow-unrelated-histories flag.
First run the following command:
git merge ProjectA/master --allow-unrelated-histories
You will find all the content of the Original_A repo in the root of NewRepo, including the history, so if you run the following command you should find the whole log of the original repo:
git log --oneline
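Now, if the files are not already inside a ProjectA folder (for example because you skipped Step Two), you can create a folder ProjectA in the root, move all the merged files into it and commit that move, again to prevent any conflicts after merging the other repo. Then you run the same command for the other repo: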
git merge ProjectB/master --allow-unrelated-histories
Then you do the same thing again: create a folder ProjectB and move all the files merged from Original_B into it. Now by running git log you will also find the log history of Original_B alongside the Original_A history, all together in the same place. The good thing is that there will not be any conflicts, since they are originally 2 different codebases.
Hint: you can replace the branch master in the command with whatever branch you would like to merge from.
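The whole of Step Five can be rehearsed end to end on a throwaway sandbox. This is only a sketch under a few assumptions: both repos have a master branch, Step Two was skipped (so the files arrive at the root and are moved after each merge), and the file names and commit messages are invented. --no-edit is added so that git does not open an editor for the merge message.

```shell
# Throwaway sandbox; repo contents and commit messages are made up.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two unrelated repos, one commit each, both on a branch called master.
for name in A B; do
  git init -q "Original_$name"
  git -C "Original_$name" symbolic-ref HEAD refs/heads/master
  echo "app $name" > "Original_$name/app_$name.txt"
  git -C "Original_$name" add .
  git -C "Original_$name" commit -q -m "$name: initial commit"
done

git init -q NewRepo
cd NewRepo
git remote add --fetch ProjectA ../Original_A/
git remote add --fetch ProjectB ../Original_B/

# Merge A, then move its files into ProjectA/ before touching B.
git merge -q --no-edit --allow-unrelated-histories ProjectA/master
mkdir ProjectA
git mv app_A.txt ProjectA/
git commit -q -m "Move Original_A into ProjectA/"

# Same again for B.
git merge -q --no-edit --allow-unrelated-histories ProjectB/master
mkdir ProjectB
git mv app_B.txt ProjectB/
git commit -q -m "Move Original_B into ProjectB/"

git log --oneline    # commits from both original repos appear here
git ls-files         # ProjectA/app_A.txt and ProjectB/app_B.txt
```

One thing to keep in mind with this variant: because the files are moved after the merge, git log on a single path only goes back to the move unless you add --follow, for example git log --follow ProjectA/app_A.txt.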
In the end, I hope that this article was helpful for you. If you have any comments or need some help with a similar case, you can easily reach out to me on Twitter @med7atdawoud.
Resources
- 🔗 git filter-branch
- 🔗 Merge git repositories into a new repository
- 🔗 Git '--allow-unrelated-histories'
Tot ziens 👋