Git diff images and pdfs
Git is a fantastic tool to version control code. And the advantages of version controlling my work are so evident that I want to version control everything else I do besides code. Most of my work consists of editing text files, and I have even forced my workflow into text files just to be able to version control more of it. Thus, I keep track of my source code, org-mode notes and then some \(\LaTeX\) files.
However, while working on a report, I realized that I generate figures and other data plots that also need to be version controlled. I do commit those images, binary blobs, into git, but I don’t really have a good way to track their changes. GitHub provides a great tool for checking the diff of image files, but I want to do that locally on my machine. So I decided to solve this and found a solution on these websites 1, 2, 3. From now on I can diff image files thanks to ImageMagick and git difftool.
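First, create a .gitattributes file. It can be specific to a project, or global to the user if you save it where git looks for user-wide attributes (by default ~/.config/git/attributes, or the file that core.attributesFile points to). This file tells git how to treat files during version control. In this case I’ll treat svg and png image files as binaries, so that they never show a text diff representation in git, and assign pdf files to a diff driver named pdf.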
    *.svg binary
    *.png binary
    *.pdf diff=pdf
In ~/.gitconfig, my global configuration, I set up how to treat pdf files. For them I only diff the information that pdfinfo can give me about them. I only need to add these 2 lines:
    [diff "pdf"]
        textconv = pdfinfo
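To make the role of textconv a bit more concrete: git runs the configured command on each version of the file and diffs the text that comes out. The sketch below only mimics that idea in Python and is not how git implements it. The names textconv_diff and fake_info are hypothetical, and the byte strings stand in for real pdf files.

```python
import difflib


def textconv_diff(old_blob, new_blob, textconv):
    """Diff two binary blobs by first converting each one to text.

    `textconv` is any callable mapping bytes -> str; it plays the role
    of the command set in the [diff "pdf"] section (pdfinfo there).
    """
    old_text = textconv(old_blob).splitlines()
    new_text = textconv(new_blob).splitlines()
    return list(
        difflib.unified_diff(old_text, new_text, "old", "new", lineterm="")
    )


def fake_info(blob):
    """Hypothetical stand-in converter: report only the blob size."""
    return f"Size: {len(blob)} bytes\n"


if __name__ == "__main__":
    # Two made-up "files" of different length give a one-line text diff.
    for line in textconv_diff(b"abc", b"abcdef", fake_info):
        print(line)
```

With pdfinfo as the converter, the diff is over the metadata it reports rather than over the page content, so changes that leave the metadata untouched would not show up.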
More exciting now is to add the difftool configuration. I call it image_diff and then declare the command cmd that will perform the diff:
    [difftool "image_diff"]
        cmd = compare $REMOTE $LOCAL png:- | montage -geometry 400x -font Liberation-Sans -label "reference" $LOCAL -label "diff" - -label "current--%f" $REMOTE x:
$LOCAL and $REMOTE are variables intrinsic to git and correspond to the old/staged file and the current/unstaged file. compare takes the 2 files, which can be anything imagemagick is able to read, and creates a comparison stream (defined by png:-). The output is piped to montage to create a more informative 3 column image with the reference file to the left, the diff in the middle and the current file to the right. -geometry 400x sets the size of the image, feel free to scale it. -font Liberation-Sans is the font of the labels. I set it because montage seems to default to Helvetica, which I don’t care to install in my system.
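The idea behind the three column image can also be sketched with numpy. This is only an illustration of the layout (reference, diff, current), not a reimplementation of what compare and montage do internally. The function names diff_mask and three_panel are hypothetical, and I assume both images are already loaded as arrays of the same shape with values between 0 and 1.

```python
import numpy as np


def diff_mask(reference, current, tol=0.0):
    """Boolean mask of the pixels that differ between two images.

    Both inputs are arrays of identical shape, (H, W) or (H, W, C).
    """
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    if reference.shape != current.shape:
        raise ValueError("images must have the same shape")
    changed = np.abs(reference - current) > tol
    # For colour images, a pixel counts as changed if any channel changed.
    return changed.any(axis=-1) if changed.ndim == 3 else changed


def three_panel(reference, current):
    """Place reference, diff mask and current image side by side."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    mask = diff_mask(reference, current).astype(float)
    if reference.ndim == 3:
        # Repeat the mask over the channels so the panels can be joined.
        mask = np.repeat(mask[..., None], reference.shape[-1], axis=-1)
    return np.concatenate([reference, mask, current], axis=1)
```

The result of three_panel could be shown with plt.imshow, with changed pixels appearing white in the middle panel.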
When I’m working on my code or any text file I can review/stage and commit all my changes from the shell or with any other tool I use to communicate with git.
Let’s take for example this simple Python code.
    import matplotlib.pyplot as plt
    import numpy as np

    x = np.linspace(0, 4, 122)
    plt.plot(x, np.sin(x))
    plt.xlabel('$x$')
    plt.ylabel(r'$\sin(x)$')
    plt.savefig('sin.png')
    plt.close()
Now I can continue editing my source code. I now plot a second function, include the necessary labels and finally save the plot under the same name.
    import matplotlib.pyplot as plt
    import numpy as np

    x = np.linspace(0, 4, 122)
    plt.plot(x, np.sin(x), 'C0', label=r'$f(x)=\sin(x)$')
    plt.plot(x, np.sin(3*x), 'C1', label=r'$f(x)=\sin(3x)$')
    plt.xlabel('$x$')
    plt.ylabel(r'$f(x)$')
    plt.legend(loc=0)
    plt.savefig('sin.png')
    plt.close()
I review and stage all my changes for text files in the usual way. But for image files I can now review the changes using the git difftool that I configured above:

    git difftool -t image_diff
This will ask me, for every file not staged, if I want to launch image_diff to evaluate its diff. When it comes to the image file I accept, and it immediately displays the diff image.
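If answering the prompt for every unstaged text file gets tedious, the call can be limited to image paths. Below is a small sketch that only builds the command. The helper name image_difftool_command is hypothetical, while --no-prompt, --tool and the pathspec after -- are standard git difftool options.

```python
def image_difftool_command(tool="image_diff", patterns=("*.png",)):
    """Build a `git difftool` call limited to paths matching `patterns`.

    --no-prompt skips the per-file question, and everything after "--"
    is a pathspec, so only matching files are handed to the tool.
    """
    return ["git", "difftool", "--no-prompt", f"--tool={tool}", "--", *patterns]


# Run it from inside the repository, for example with
# subprocess.run(image_difftool_command()).
print(" ".join(image_difftool_command()))
```

Typed by hand in a shell, the pattern needs quotes ('*.png') so that the shell passes it to git unexpanded.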
After reviewing the changes and understanding why they happened, I stage the modified image and make a new commit.