Git Doesn’t Set the Modification Time on Files at Checkout

The last time I updated this blog, I noticed that the last updated time on the about page had changed to the last build time. My site generator, Pelican, uses the source file’s modified time for this value, and this was being set by the operating system of the build worker when the source repository was being cloned.

As explained in the Git FAQ, there are good reasons to not preserve the modification times, such as making sure that build systems like make work as expected. Linus Torvalds also explains this in a slightly more boisterous fashion in this thread.

In Pelican’s use case, Git’s behavior also makes sense: Pelican typically rebuilds only those existing output files whose source files have changed. Because my build process starts from a clean slate each time, there are no existing output files, so this optimization does nothing for me. And because every build starts from a fresh clone, the modification time on all the source files is the time that the clone took place.

I decided that, from the perspective of the build system, the modification time should match the commit time. To implement this, I wrote a one-liner to execute as part of the build process that touches the content files with their most recent commit time before generating the site:

git ls-tree -r -z --name-only HEAD content/ \
  | xargs -0 -I {} -- \
    git log --date='format:%Y%m%d%H%M.%S' \
      --format='format:%ad%x00{}%x00' -1 -- {} \
  | xargs -0 -n 2 -- touch -t

This pipeline has three stages. First, git ls-tree prints the names of the files in the content directory known to Git, separated by null bytes. Next, xargs runs git log once per file to find the author date (%ad) of the most recent commit that touched it (the -1 argument means only use the most recent revision) and prints the timestamp and the file name in a form that can be used as arguments to touch. Finally, a second xargs calls touch -t on each file with the appropriate time.
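For anyone who finds the pipeline dense, here is a more verbose sketch of the same idea written as a loop. It is not the one-liner above: it swaps the formatted author date for the committer date as a Unix epoch (%ct), which avoids any time zone conversion. It assumes GNU touch (for -d @EPOCH) and file names that Git prints unquoted (no newlines or other unusual characters). The function name set_mtimes_from_git is hypothetical.

```shell
#!/bin/sh
# Sketch only: set each tracked file's mtime to the committer date of the
# last commit that touched it.
# Assumptions: GNU touch (-d @EPOCH), and plain file names that
# `git ls-tree` prints without quoting.
# Usage (from the repository root): set_mtimes_from_git content/
set_mtimes_from_git() {
  dir=$1
  git ls-tree -r --name-only HEAD "$dir" | while IFS= read -r path; do
    # %ct is the committer date in seconds since the Unix epoch
    epoch=$(git log -1 --format=%ct -- "$path")
    if [ -n "$epoch" ]; then
      touch -d "@$epoch" -- "$path"
    fi
  done
}
```

This runs git log once per file, just like the one-liner, so it is no faster; the trade-off is readability against the null-byte safety that the -z and -0 options provide.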

git log’s --format option supports a pretty wide set of placeholders, but the file name is not among them. To work around this, I used xargs -I {} to substitute the file name literally into the format string. For example, for the file foo.md, the git log invocation sees the format string format:%ad%x00foo.md%x00. Each %x00 placeholder emits a null byte, which delimits the fields for the subsequent xargs -0 command.
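The substitution can be seen in isolation, without Git, by having xargs pass the format string to echo instead of git log. This is only an illustration; the helper name show_format_strings and the sample file names are made up.

```shell
#!/bin/sh
# Illustration only: print the --format string that git log would receive
# for each null-delimited file name read from standard input.
show_format_strings() {
  xargs -0 -I {} -- echo 'format:%ad%x00{}%x00'
}

# Two sample names, one containing a space, separated by null bytes.
printf 'foo.md\0bar baz.md\0' | show_format_strings
# Prints:
#   format:%ad%x00foo.md%x00
#   format:%ad%x00bar baz.md%x00
```

Because the name is pasted straight into the format string, a file name containing a % would presumably be read by git log as the start of a placeholder, so the trick assumes file names without that character.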