I’m a fan of alphabetically ordered lists, in code and in many other places too. Alphabetical order gives an obvious structure to lists of packages, variables, strings, and many other things. It makes checking by eye for the presence or absence of an element very quick. As a code base changes over time, alphabetical order can also produce diffs that are smaller and easier to grok. Plus, I think it appeals on an aesthetic level, aside from the practical considerations.

So I wrote a pre-commit hook for my Jekyll blog that verifies that the category keywords in my blog posts are in alphabetical order. A full code listing follows. (My favorite bit is the embedded Python script, which is a new trick I’m trying out!)

#!/usr/bin/env bash

# debugging switches
# set -o errexit   # abort on nonzero exit status; same as set -e
# set -o nounset   # abort on unbound variable; same as set -u
# set -o pipefail  # don't hide errors within pipes
# set -o xtrace    # show commands being executed; same as set -x
# set -o verbose   # verbose mode; same as set -v

# A pre-commit git hook script to verify that
# category keywords are listed in case-insensitive alphabetical order
# in Jekyll blog posts.

exec 1>&2

# use a temp file and clean up on exit
temp_file="$(mktemp)"
function finish {
    rm -rf "${temp_file}"
}
trap finish EXIT

# get list of staged files
git diff --cached --name-only --diff-filter=ACM > "${temp_file}"

# embedded python script via heredoc
read -r -d '' verify_categories_are_alphabetical_py <<'EOF'
import re
import sys

filename = sys.argv[1]

with open(filename, mode="r", encoding="utf-8") as file:
    for line in file:
        match = re.match(r"^categories:\ *", line)
        if match:
            line = line[len(match.group()) :]
            words = line.strip().split()
            sorted_words = sorted(words, key=str.lower)
            if words != sorted_words:
                print(f"Error: categories in {filename} aren't in alphabetical order")
                print(f"Categories found: {' '.join(words)}")
                print(f"Expected order: {' '.join(sorted_words)}")
                # nonzero exit status tells the hook that this file failed
                sys.exit(1)
EOF

# python script is used to check just the ".markdown" files
while read -r file; do
    if [[ "${file}" =~ \.markdown$ ]]; then
        if ! python3 -c "${verify_categories_are_alphabetical_py}" "${file}"; then
            printf "Commit failed on %s because its categories aren't in alphabetical order.\n" "${file}"
            exit 1
        fi
    fi
done < "${temp_file}"

exit 0
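
To try the ordering check outside the hook, the same comparison can be pulled into a small standalone function. This is only a minimal sketch, not part of the hook itself: the function name `categories_in_order` and its return shape are hypothetical, and it assumes, as the embedded script does, that categories sit on a single `categories:` line separated by spaces.

```python
def categories_in_order(line):
    """Check one front-matter line (hypothetical helper, not used by the hook).

    Returns a tuple (ok, words, expected), where `expected` is the
    case-insensitive alphabetical ordering of `words`.
    """
    prefix = "categories:"
    if not line.startswith(prefix):
        # Not a categories line, so there is nothing to verify.
        return True, [], []
    words = line[len(prefix):].split()
    # Same comparison as the embedded script: sort ignoring case.
    expected = sorted(words, key=str.lower)
    return words == expected, words, expected


if __name__ == "__main__":
    print(categories_in_order("categories: bash Git python"))
    print(categories_in_order("categories: python bash Git"))
```

The first call reports the line as ordered, because `Git` sorts between `bash` and `python` once case is ignored. The second call reports it as out of order and returns the expected ordering alongside the words it found.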