DX: Check trailing spaces in project files only
Previously the `check_trailing_spaces.sh` script was checking files that
do not belong to the project, for example:

 - various `composer.lock` files
 - local development files like `.idea` or temporary files created

In its previous design, the list of files to check for trailing spaces
was generated by `find`, which is not aware of which files are (not) in
the project. To exclude non-project files, it relied on a lengthy
blacklist testing that a file is not in various paths, for example
and most prominently:

  - -not -path "./.git/*"

which is kind of contradictory as the project *is* managed by git, but
that folder can have a totally different name.

This blacklist is pretty verbose and also needs extra knowledge and care,
because `find` operates on the file-system level, not on the project
level.

Files in the project are managed by `git` and `git` knows which files
belong to the project and which do not. Git on ...

 - ... repository level contains the information which project files are
   ignored, for example the `composer.lock` files.
 - ... developer or system level also knows which files are
   ignored, for example the `.idea` folder (see "global .gitignore")
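
As a rough illustration of the repository-level case, the sketch below
builds a throwaway repository and asks git what it tracks and what it
ignores. The temporary directory and the file contents are made up for
the example; only `git ls-files` and `git check-ignore` are the point.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical layout, not the real project).
repo=$(mktemp -d)
cd "$repo"
git init -q .

printf 'composer.lock\n' > .gitignore   # repository-level ignore rule
: > composer.lock                       # exists on disk, but is ignored
: > README.md
git add .gitignore README.md

# Files git considers part of the project (tracked files).
tracked=$(git ls-files)

# Exit status 0 from check-ignore means "this path is ignored".
if git check-ignore -q composer.lock; then ignored=yes; else ignored=no; fi

printf 'tracked:\n%s\n' "$tracked"
printf 'composer.lock ignored: %s\n' "$ignored"
```

A developer-level rule (the "global .gitignore") would be picked up by
the same two commands without any change to the repository.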

The change here is to replace `find`, a file-system-only utility, with
`git` itself, namely the `git-grep` command, for obtaining the list of
files to check.

This includes the exec-call of grep, as git's grep is a little more
powerful, too [1] *and* is compatible in the output format for the
post-processing with sed in the check-script (paths merely lose their
leading `./`).
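
To make the difference in file selection and output format concrete,
here is a minimal sketch in a throwaway repository. The file names
`tracked.txt` and `scratch.tmp` are hypothetical; the two commands are
the ones from the old and the new version of the script, minus the
project-specific exclusions.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical files, not the real project).
repo=$(mktemp -d)
cd "$repo"
git init -q .

printf 'hello \n'   > tracked.txt   # trailing space, tracked by git
printf 'scratch \n' > scratch.tmp   # trailing space, never added to git
git add tracked.txt

# Old approach: every file on disk, paths prefixed with "./".
find_hits=$(find . -type f -not -path "./.git/*" \
    -exec grep -EIHn "\\s$" {} \; | sort)

# New approach: tracked files only, paths relative without "./".
git_hits=$(git grep -EIn "\\s$")

printf 'find + grep:\n%s\n' "$find_hits"
printf 'git grep:\n%s\n' "$git_hits"
```

Both produce `path:line:content` records, which is why the sed
expression in the script only needs to stop expecting the `./` prefix.
Note that `git grep` skips untracked files by default, so a brand-new
file is only checked once it has been added.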

This allows dropping the manually crafted blacklist and specifically
marking the paths/files not to be checked [2], which are (only) the
test fixtures in:

 - tests/Fixtures/
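
A small sketch of how such an exclude pathspec behaves, again in a
throwaway repository. The files `src/a.php` and `tests/Fixtures/b.php`
are invented for the example; the pathspec is the one used in the
script. This assumes a git version that accepts an exclude pathspec
without an accompanying positive one.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical files, not the real project).
repo=$(mktemp -d)
cd "$repo"
git init -q .

mkdir -p src tests/Fixtures
printf 'code \n'    > src/a.php             # should be reported
printf 'fixture \n' > tests/Fixtures/b.php  # should be skipped
git add .

# Without a pathspec: both files are reported.
all_hits=$(git grep -EIn "\\s$")

# With the exclude pathspec (":!" is magic for "exclude").
filtered_hits=$(git grep -EIn "\\s$" -- ':!tests/Fixtures/*')

printf 'all:\n%s\n' "$all_hits"
printf 'filtered:\n%s\n' "$filtered_hits"
```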

Git's knowledge of the project and of how the developer works, together
with the reduced list of files to check, has the additional benefit of
a cache and overall results in much better performance.

[1]: Fun with "git grep" <https://gitster.livejournal.com/27674.html>
[2]: pathspec - gitglossary <https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpathspecapathspec>
ktomk committed May 18, 2020
1 parent e5de921 commit 865d1ed
Showing 1 changed file with 3 additions and 9 deletions.
12 changes: 3 additions & 9 deletions check_trailing_spaces.sh
@@ -2,22 +2,16 @@
 set -eu

 files_with_trailing_spaces=$(
-find . \
--type f \
--not -path "./.git/*" \
--not -path "./dev-tools/bin/*" \
--not -path "./dev-tools/vendor/*" \
--not -path "./vendor/*" \
--not -path "./tests/Fixtures/*" \
--exec grep -EIHn "\\s$" {} \; \
+git grep -EIn "\\s$" \
+':!tests/Fixtures/*' \
 | sort -fh
 )

 if [ "$files_with_trailing_spaces" ]
 then
 printf '\033[97;41mTrailing whitespaces detected:\033[0m\n'
 e=$(printf '\033')
-echo "${files_with_trailing_spaces}" | sed -E "s/^\\.\\/([^:]+):([0-9]+):(.*[^\\t ])?([\\t ]+)$/${e}[0;31m - in ${e}[0;33m\\1${e}[0;31m at line ${e}[0;33m\\2\\n ${e}[0;31m>${e}[0m \\3${e}[41;1m\\4${e}[0m/"
+echo "${files_with_trailing_spaces}" | sed -E "s/^([^:]+):([0-9]+):(.*[^\\t ])?([\\t ]+)$/${e}[0;31m - in ${e}[0;33m\\1${e}[0;31m at line ${e}[0;33m\\2\\n ${e}[0;31m>${e}[0m \\3${e}[41;1m\\4${e}[0m/"

 exit 3
 fi