Git is a fantastic tool for version controlling code. The advantages of version controlling my work are so evident that I want to version control everything else I do besides code. Most of my work consists of editing text files, and I have even forced my workflow into text files just to be able to version control more of it. Thus, I keep track of my source code, org-mode notes, and some \(\LaTeX\) files.

However, while working on a report, I realized that I generate figures and other data plots that also need to be version controlled. I do commit those images, binary blobs, into git, but I don’t really have a good way to track their changes. GitHub provides a great tool for checking the diff of image files, but I want to do that locally on my machine. So I decided to solve this and found a solution from these websites 1, 2, 3. From now on I can diff image files thanks to ImageMagick and git difftool.

The configuration

Start by creating a .gitattributes file. It can be specific to a project or global to the user. For the global case, git reads ~/.config/git/attributes by default, or whichever file core.attributesFile points to, so a .gitattributes saved in your home directory only takes effect if that setting names it. This file tells git how to treat files during version control. In this case I’ll define svg and png image files as binaries, so that they never show a text diff representation in git. pdf files, on the other hand, will be handled by a custom diff driver.

*.svg binary
*.png binary
*.pdf diff=pdf
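To make the effect of those three lines concrete, here is a minimal Python sketch of the lookup. It only approximates git's matching with `fnmatchcase` (git's real pattern rules are richer), and the names `RULES` and `attributes_for` are made up for this illustration.

```python
from fnmatch import fnmatchcase

# Pattern -> attribute pairs mirroring the three .gitattributes lines.
RULES = [
    ("*.svg", "binary"),
    ("*.png", "binary"),
    ("*.pdf", "diff=pdf"),
]


def attributes_for(path: str) -> list[str]:
    """Return the attributes whose pattern matches the file name of `path`."""
    # A pattern without a slash is matched against the base name only.
    name = path.rsplit("/", 1)[-1]
    return [attr for pattern, attr in RULES if fnmatchcase(name, pattern)]


for path in ["figures/plot.png", "report.pdf", "notes.org"]:
    print(path, "->", attributes_for(path) or "default text diff")
```

Inside a repository, `git check-attr -a <path>` reports what git itself resolves for a given file.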

Next, in ~/.gitconfig, my global configuration, I set up how to treat pdf files. Their text representation is the information pdfinfo can give me about them. I only need to add these two lines to the .gitconfig file.

[diff "pdf"]
      textconv = pdfinfo
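With a textconv driver, git runs the given program on the old and the new version of the file and then diffs the two text outputs. The sketch below imitates that second step with Python's `difflib`. The two strings are made-up stand-ins for pdfinfo output, not real output, and `textconv_diff` and the `report.pdf` name are hypothetical.

```python
import difflib

# Made-up stand-ins for what pdfinfo might print for two versions of a file.
old_info = "Title: Report\nPages: 3\nPage size: 595 x 842 pts\n"
new_info = "Title: Report\nPages: 4\nPage size: 595 x 842 pts\n"


def textconv_diff(old_text: str, new_text: str) -> str:
    """Diff two text renderings, as git does after converting each side."""
    lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="a/report.pdf",
        tofile="b/report.pdf",
    )
    return "".join(lines)


print(textconv_diff(old_info, new_info))
```

Only the metadata line that changed shows up as a removed and an added line, which is the kind of summary a plain `git diff` then gives for a pdf.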

The more exciting part is the difftool configuration. I call it image_diff and then declare the command, cmd, that will perform the diff action.

[difftool "image_diff"]
      cmd = compare $REMOTE $LOCAL png:- | montage -geometry 400x -font Liberation-Sans -label "reference" $LOCAL -label "diff" - -label "current--%f" $REMOTE x:

$LOCAL and $REMOTE are variables that git difftool sets; they correspond to the old/staged file and the current/unstaged file. compare takes the two files, which must be in formats ImageMagick can read, and creates a comparison png stream (defined by png:-). The output is piped to montage to create a more informative three-column image with the reference file on the left, the diff in the middle, and the current file on the right. -geometry 400x sets the width of each tile; feel free to scale it. -font Liberation-Sans is the font of the labels; I set it because montage seems to default to Helvetica, which I don’t care to install on my system. The final x: sends the montage to an X window for display.
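To see how the pieces of that command fit together, the following Python sketch assembles the same pipeline as a string for one pair of files. It is only an illustration: `image_diff_command` and the example paths are hypothetical, and the quoting with `shlex.quote` is an addition that the configuration above does not have (it guards against paths with spaces).

```python
import shlex


def image_diff_command(local: str, remote: str, width: int = 400,
                       font: str = "Liberation-Sans") -> str:
    """Build the compare | montage pipeline for one pair of files."""
    old, new = shlex.quote(local), shlex.quote(remote)
    # compare writes the difference image as a png stream on stdout.
    compare = f"compare {new} {old} png:-"
    # montage reads that stream as "-" and tiles reference, diff and current.
    montage = (
        f"montage -geometry {width}x -font {shlex.quote(font)} "
        f"-label reference {old} "
        f"-label diff - "
        f"-label 'current--%f' {new} x:"
    )
    return f"{compare} | {montage}"


# Hypothetical paths: git hands the old version over as a temporary file.
print(image_diff_command("/tmp/old_plot.png", "plot.png"))
```

Printing the string rather than running it makes it easy to paste the command into a shell and tweak the tile width or font before committing the change to .gitconfig.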

The workflow

When I’m working on my code or any text file I can review/stage and commit all my changes from the shell or with any other tool I use to communicate with git.

Let’s take for example this simple Python code.

import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 4, 122)  # 122 sample points on [0, 4]
plt.plot(x, np.sin(x))

Figure 1: Image generated by the previous code block

Now I can continue editing my source code. I now plot a second function, include the necessary labels and finally save the plot under the same name.

import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 4, 122)  # 122 sample points on [0, 4]
plt.plot(x, np.sin(x), 'C0', label=r'$f(x)=\sin(x)$')
plt.plot(x, np.sin(3*x), 'C1', label=r'$f(x)=\sin(3x)$')

Figure 2: New plot generated by the updated code

I review and stage all my changes for text files in the usual way. But for image files I can now review the changes using the git difftool that I just defined.

git difftool -t image_diff

For every unstaged file with changes, git asks whether I want to launch image_diff. When it reaches the image file I accept, and the diff image is displayed immediately.

Figure 3: Image diff

After reviewing the changes and understanding why they happened, I stage the modified image and make a new commit.
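Since the prompt appears for every changed file, one possible refinement is to call the difftool only on the image files. The sketch below builds such a command from a list of changed paths. The helper `difftool_argv` and the example paths are hypothetical; `--no-prompt` and `-t` are standard git difftool options.

```python
from pathlib import PurePosixPath

# Image types declared as binary in .gitattributes.
IMAGE_SUFFIXES = {".png", ".svg"}


def difftool_argv(changed_paths: list[str], tool: str = "image_diff") -> list[str]:
    """Return a git difftool command restricted to the changed image files."""
    images = [p for p in changed_paths
              if PurePosixPath(p).suffix.lower() in IMAGE_SUFFIXES]
    if not images:
        return []
    # --no-prompt skips the per-file question; "--" separates options from paths.
    return ["git", "difftool", "--no-prompt", "-t", tool, "--", *images]


changed = ["plot.py", "figures/plot.png", "notes.org"]
print(difftool_argv(changed))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from inside the repository. The list of changed paths could come from `git diff --name-only`.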