You will often inherit a repository that contains several binaries, and its size grows rapidly whenever they are modified, since every version of a binary stays in the history. A Git repository should not host binaries. In fact, no VCS should host such files.
Other times a developer (or you) commits something by mistake that shouldn’t be in the repository.
In either case you’ll want to delete those files.
I tried several guides, but they never worked as expected because they don’t rewrite all the branches in the repository, so the file is never really deleted.
So let’s go:
First you should ask all your developers to commit & push before you start playing.
Then clone the repository with the --mirror option so we have a new copy with all the branches and tags (a mirror):
git clone --mirror MY_REPO_URL
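Note that this clone is a bare repository, not a working copy, so you won’t be able to see its contents. Change into the directory it creates (its name ends in .git) before running the commands below.
Now use git filter-branch to remove the path you want from all refs in history: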
git filter-branch --force --index-filter \
'git rm -r --cached --ignore-unmatch MY_PATH' \
--prune-empty --tag-name-filter cat -- --all
Note the -r, which makes the removal recursive; you can drop it if MY_PATH is a single file rather than a directory. Also, MY_PATH must be the complete path from the repository root.
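If you are not sure which paths are worth removing, a small helper can list the largest blobs reachable from any ref, together with their full paths from the repository root. This is only a sketch: the function name largest_blobs and its arguments are made up for illustration and are not part of Git.

```shell
# largest_blobs REPO_DIR [COUNT]
# Hypothetical helper: prints "SIZE PATH" (size in bytes) for the biggest
# blobs reachable from any ref, largest first. Works on bare/mirror clones.
largest_blobs() {
  repo=$1
  count=${2:-10}
  # List every object reachable from all refs, with its path.
  git -C "$repo" rev-list --objects --all |
    # Ask Git for each object's type and size; %(rest) carries the path through.
    git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    # Keep blobs only and drop the leading type column.
    awk '$1 == "blob" { sub(/^blob /, ""); print }' |
    # Sort by size, numerically, descending.
    sort -k1,1nr |
    head -n "$count"
}
```

Run inside the mirror clone, largest_blobs . 20 would print the twenty biggest files ever committed; the path column is what you would pass as MY_PATH.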
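Finally get rid of the real files. filter-branch keeps a backup of the old refs under refs/original/, so delete those first, then expire the reflog and garbage-collect:
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc --prune=now --aggressive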
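To try the whole procedure safely before touching a real repository, the steps so far can be rehearsed on a throwaway repository. The sketch below is a demo under stated assumptions, not part of the procedure itself: the file name big.bin, the temporary paths and the demo identity are hypothetical, FILTER_BRANCH_SQUELCH_WARNING only silences the warning that newer Git versions print, and the refs/original/ cleanup is there because filter-branch keeps backup refs that would otherwise keep the old objects alive.

```shell
#!/bin/sh
# Rehearsal on a throwaway repository; every name here is hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning newer Git prints
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# 1. Build a small "origin" with a binary committed on two branches.
git init -q "$work/origin"
cd "$work/origin"
echo "hello" > README
head -c 100000 /dev/urandom > big.bin
git add README big.bin
git commit -q -m "add readme and binary"
git branch feature                 # second branch that also contains big.bin
echo "more" >> README
git commit -q -am "edit readme"

# 2. Mirror-clone it and work inside the bare copy.
git clone -q --mirror "$work/origin" "$work/mirror.git"
cd "$work/mirror.git"

# 3. Remove big.bin from every ref.
git filter-branch --force --index-filter \
  'git rm -r --cached --ignore-unmatch big.bin' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# 4. Drop the backup refs filter-branch leaves behind, then clean up.
git for-each-ref --format='%(refname)' refs/original/ |
  while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc --quiet --prune=now --aggressive

# 5. Check: no commit on any ref should still touch big.bin.
remaining=$(git log --all --oneline -- big.bin)
echo "commits still touching big.bin: ${remaining:-none}"
git count-objects -vH
```

The same check from step 5, with MY_PATH in place of big.bin, is a reasonable sanity test on the real mirror before pushing.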
If the file was added by mistake, consider adding the path to .gitignore (and committing that change).
And push:
git push
Now your developers should rebase all their local branches onto the rewritten history. However, the safest approach is for them to clone the repository again.
Tell them not to push anything until they have followed one of these two procedures. Otherwise you’ll end up with duplicate commits and a real mess.
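For a developer who has no unpushed work, a third possibility is to fetch the rewritten history and hard-reset each local branch onto it. This is a swapped-in technique, not one of the two procedures above, and it throws away any local commits on that branch. The function name resync_branch and the remote name origin are assumptions for the sake of the sketch.

```shell
# resync_branch BRANCH
# Hypothetical helper: makes the local BRANCH identical to origin/BRANCH
# after history was rewritten. WARNING: discards local commits and
# uncommitted changes on that branch.
resync_branch() {
  branch=$1
  git fetch --quiet origin &&            # pick up the rewritten refs
  git checkout --quiet "$branch" &&      # switch to the branch to repair
  git reset --quiet --hard "origin/$branch"
}
```

It would be called once per local branch from inside the developer’s working copy, for example resync_branch feature.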
Another option is BFG Repo-Cleaner, a Java tool that handles the removal. It’s faster, but keep in mind that BFG will not modify the commit at HEAD, so if the file is still present at HEAD you need to delete it with git, commit and push before starting.