---
title: "Why did I recently rewrite my whole blog Git history ?"
date: 2020-03-18 14:20
url: why-did-i-recently-rewrite-my-whole-blog-git-history
layout: post
category: Articles
image: /img/blog/why-did-i-recently-rewrite-my-whole-blog-git-history_1.png
description: "A quick Git LFS tutorial (justifying an anti-pattern technique usage)"
---

[![A missing blog post image](/img/blog/why-did-i-recently-rewrite-my-whole-blog-git-history_1.png)](/img/blog/why-did-i-recently-rewrite-my-whole-blog-git-history_1.png)

### Introduction

Back in the day, this blog was hosted on and by GitHub Pages, which now belongs to Microsoft ([as does NPM, more recently](https://github.blog/2020-03-16-npm-is-joining-github/)).
But today, we won't be talking about the whole _"embrace, extend, and extinguish"_ capitalist strategy, but rather about Git.
You know Git, that little piece of software first written in a matter of days and now used daily all over the Earth.
We are always looking for the next cool-but-functional graphical (including Web) interface to embellish and represent project sources, but it's ~~always~~ often the same program underneath.

(\[Microsoft's\] GitHub's) Pages is cool and handy, and I actually decided to keep Jekyll as the static HTML generator engine for my blog.
Self-hosting lets you really appreciate some technical constraints, the very ones that are often hidden when services are operated by large ~~corporations~~ platforms.

Basically, let's take _this_ very blog as an example.
For some years now, I have been pushing non-diff-able objects (such as images or minified front-end assets) to the Git tree.
It might be dead simple ~~(and stored by GitHub when using Pages)~~, but it's not _viable_ in the long run.

Here comes the main subject of this post : [Git LFS](https://git-lfs.github.com/).

### Git LFS

Git LFS is a project allowing developers to version files that cannot be versioned efficiently by Git alone (those I called "non-diff-able" above).
By storing short, diff-able pointers in the Git history, we can (finally) version binaries (or equivalent) without inflating the repository size each time we update them.

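To make the "pointer" idea more concrete, here is a minimal sketch of what ends up in the Git history in place of a binary: a three-line text file holding a spec version, the SHA-256 digest of the content and its size in bytes. The `lfs_pointer_preview` function is a hypothetical helper written for this post (it is not part of Git LFS), and it assumes `sha256sum` from GNU coreutils is available.

```shell
#!/usr/bin/env bash
# Illustrative helper (NOT part of Git LFS): print the pointer text
# that would stand in for a given file in the Git history.
lfs_pointer_preview() {
    local file="$1"
    local oid size
    # The object ID is the SHA-256 digest of the file content.
    oid="$(sha256sum "$file" | cut -d ' ' -f 1)"
    # The size is the content length, in bytes.
    size="$(wc -c < "$file" | tr -d '[:space:]')"
    printf 'version https://git-lfs.github.com/spec/v1\n'
    printf 'oid sha256:%s\n' "$oid"
    printf 'size %s\n' "$size"
}
```

Calling `lfs_pointer_preview path/to/picture.png` prints the kind of content Git would track for that picture. Once Git LFS is installed, `git lfs pointer --file=path/to/picture.png` should give you the real thing.
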
And you know what ? It's [packaged in Debian Buster](https://packages.debian.org/source/buster/git-lfs) (and [back-ported to Stretch](https://packages.debian.org/source/stretch-backports/git-lfs) :tada:), so :

{% highlight bash %}
apt install git-lfs
{% endhighlight %}

Git LFS is well supported by popular code hosting services. See some examples below :

* GitHub, [since 2015](https://github.blog/2015-04-08-announcing-git-large-file-storage-lfs/) ;

* GitLab, [since v8.2 (2015 too)](https://about.gitlab.com/blog/2015/11/23/announcing-git-lfs-support-in-gitlab/) ;

* Gitea, [since v1.1.0 (2016)](https://github.com/go-gitea/gitea/pull/122).

Like everything else on this planet, it comes with its own limitations, and before diving in, I'd advise you to [consult them](https://github.com/git-lfs/git-lfs/wiki/Limitations) to check whether they affect you.

### Migrate Existing Repositories

Yeah, LFS is pretty cool and you should think about it **before** creating a new project and/or pushing non-diff-able data to a remote (and often, collaborative) repository.

> But what about existing projects ?
> What am I supposed to do if I want to keep the _whole_ Git history AND migrate existing "binaries" to LFS ?

An awesome project comes with an awesome team : they thought about it.

Below is a very simple procedure to migrate already-referenced content.
Please adapt it, 'cause you know, **YMMV** :

{% highlight bash %}
# When I first attempted to migrate blog assets to LFS, I came across an open issue.
# This was (likely) related to how project tags were named.
# See <https://github.com/git-lfs/git-lfs/issues/3818>.
# Thus, in order to move on (and take advantage of the time freed up by COVID-19), I decided to delete 'em.
git tag -d v1.1.0 v1.2.0 # ...
git push -d origin v1.1.0 v1.2.0 # ...

# This blog has only one branch, which (apparently) drastically simplified the procedure.
# I'd advise you to clean up your repository references too.
git branch -d feature/aint_time fix/not_a_bug # ...
git push -d origin feature/aint_time fix/not_a_bug # ...

# Now is the time to install LFS's hooks into your Git project internals.
git lfs install

# Let's go !
# The command below will show you what kind of files eat up your disk space.
git lfs migrate \
    info \
    --include-ref=refs/heads/master

# If you are more of a Bash person, this could help you too.
find . -type f -not -path './.git*' -exec file --extension -b {} ';' | sort | uniq

# Once you have identified the evil file extensions, you may run something like :
git lfs migrate \
    import \
    --include="*.jpg,*.svg,*.eot,*.ttf,*.woff*,*.min.*" \
    --include-ref=refs/heads/master

# > Is it really... finished ?
# Yes, and now it's verification time !
git lfs ls-files
cat .gitattributes
git log
git # ...

# If you're happy with the results, you may clean up Git internals.
git reflog expire --expire-unreachable=now --all
git gc --prune=now

# It's time to publish these changes, so here is a checklist for you :
# [ ] Disable your CI/CD hooks ;
# [ ] Tell your colleagues **not** to push to the remote ;
# [ ] Make sure LFS is enabled on your Git server ;
# [ ] Make sure the target branch is not protected upstream ;
# [ ] Force push :
git push -f

# Git may have advised you to enable LFS file locking support ; you should.
# See <https://github.com/git-lfs/git-lfs/wiki/File-Locking>.
git config lfs.https://your.code.host/owner/a-repository.git/info/lfs.locksverify true
{% endhighlight %}

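The `find` one-liner in the procedure above only lists file types. As a complement, here is a rough sketch that sums the size of working-tree files per extension, which may help when deciding what to pass to `--include`. The `size_by_extension` name is made up for this post, and the `-printf` option assumes GNU `find` (the one shipped by Debian). It only looks at the current checkout, not at the whole history as `git lfs migrate info` does.

```shell
#!/usr/bin/env bash
# Illustrative helper: print "<total bytes><TAB><extension>" for the files
# under a directory (default: current one), biggest first, skipping .git.
size_by_extension() {
    local root="${1:-.}"
    # GNU find prints "<size in bytes> <file name>" for each regular file.
    find "$root" -type f -not -path '*/.git/*' -printf '%s %f\n' \
        | awk '{
            size = $1
            # Everything after the first field is the file name (may contain spaces).
            name = substr($0, length($1) + 2)
            n = split(name, parts, ".")
            # No dot at all, or a bare dotfile such as ".gitignore" : no extension.
            if (n < 2 || (n == 2 && parts[1] == "")) ext = "(none)"
            else ext = parts[n]
            total[ext] += size
        }
        END { for (e in total) printf "%d\t%s\n", total[e], e }' \
        | sort -rn
}
```

Run `size_by_extension .` from the repository root to see which extensions weigh the most.
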
Wow, you're done too ! Congratulations.

Your next (optional, but recommended) steps :

* Run the garbage collector (if possible) on the remote (see the example below, from the Gitea administration dashboard) ;
[![A missing blog post image](/img/blog/why-did-i-recently-rewrite-my-whole-blog-git-history_2.png)](/img/blog/why-did-i-recently-rewrite-my-whole-blog-git-history_2.png)

* Tell your colleagues to install Git LFS too **BEFORE** properly re-cloning the affected repository ;

* Apply the same operation to "read-only" mirrors (such as your production, for instance) ;

* Re-enable your CI/CD hooks.

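About the "install Git LFS **BEFORE** re-cloning" step above : when a clone is made without LFS set up, the working tree is expected to contain the small pointer text files instead of the actual images. Below is a minimal sketch to spot them. `is_lfs_pointer` is a hypothetical helper written for this post, which only checks the first line of a file against the pointer header.

```shell
#!/usr/bin/env bash
# Illustrative helper: succeed when a working-tree file is still a
# Git LFS pointer (text stub) rather than the real content.
is_lfs_pointer() {
    # A pointer file starts with this exact line.
    head -n 1 -- "$1" | grep -qxF 'version https://git-lfs.github.com/spec/v1'
}
```

For instance, `is_lfs_pointer img/picture.png && echo "still a pointer"` tells you whether a given file was left as a stub. If so, running `git lfs install` followed by `git lfs pull` in that clone should fetch the real content.
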
### Conclusion

**TL;DR** No, I have not been hacked : I recently and deliberately [rewrote the whole blog Git history](https://git.forestier.app/HorlogeSkynet/blog/compare/42bb72dc97209b05ba198c41ecf67146b93fcac1...7849565abeeb83ba947b35f4b5764e835a361a27).

### Sources

* [git-lfs-migrate(1) - Migrate history to or from git-lfs](https://github.com/git-lfs/git-lfs/blob/master/docs/man/git-lfs-migrate.1.ronn)

* [Migrating existing repository data to LFS](https://github.com/git-lfs/git-lfs/wiki/Tutorial#migrating-existing-repository-data-to-lfs)