Abhinav Gupta | About

Handle spaces in file names with xargs

1. Background

Skip to Problem if you’re already familiar with xargs.

xargs is a versatile tool that you should keep handy in your Unix toolbox. It lets you turn the output of one command into arguments for another command.

For example, the following command finds all files in the current directory and its subdirectories that have the extension .md, and adds them to the current git repository.

$ find . -name '*.md' | xargs git add

This is equivalent to running the command,

$ git add ./foo.md ./bar/baz.md ./qux.md ./quux/foo.md  # ...and so on

A couple of flags that you’ll need to know about are:

-n N

Specifies the number of arguments per invocation of the command.

Without this flag, xargs will try to pass as many arguments as the system allows to the target command. With it, xargs will pass at most N arguments to the target command at a time. Try it out:

$ find . -name '*.md' | xargs echo (1)
./qux.md ./foo.md ./quux/foo.md ./bar/baz.md

$ find . -name '*.md' | xargs -n 2 echo (2)
./qux.md ./foo.md
./quux/foo.md ./bar/baz.md
  1. Without a -n flag, all files are listed on a single line because echo received them in a single invocation.

  2. With -n 2, pairs of files are listed together because echo receives them two at a time.
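To see the batching without depending on which files happen to be in your directory, here is a small sketch. It feeds four made-up names (a.md through d.md, placeholders rather than real files) to xargs and counts the lines echo prints, one line per invocation.

```shell
# Four placeholder names, one per line. These are illustrative, not real files.
names() { printf '%s\n' a.md b.md c.md d.md; }

# Without -n, all four names fit in one invocation of echo: one line of output.
all_at_once=$(names | xargs echo | wc -l | tr -d ' ')

# With -n 2, echo runs once per pair of names: two lines of output.
in_pairs=$(names | xargs -n 2 echo | wc -l | tr -d ' ')

echo "invocations without -n: $all_at_once"
echo "invocations with -n 2:  $in_pairs"
```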

-I PLACEHOLDER

Specifies that PLACEHOLDER in the specified command should be replaced with the argument for that invocation. With this flag, xargs runs the command once per line of input.

Try it out:

$ find . -name '*.md' | xargs -I% echo "<%>"
<./qux.md>
<./foo.md>
<./quux/foo.md>
<./bar/baz.md>

Use this to move or rename files in bulk. For example, the following searches for Markdown files tagged with #Recipes, and moves them to the recipes folder.

$ grep --include '*.md' -l '#Recipes' -r . |
>   xargs -I% mv % recipes/

See man xargs for a more comprehensive list.
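If you want to try the -I bulk move without touching real notes, the following sketch runs it in a scratch directory. The file names and contents are made up for illustration, and the --exclude-dir flag is my addition so that grep doesn’t pick up files that have already landed in the destination.

```shell
# Scratch directory with made-up notes; all names here are illustrative.
workdir=$(mktemp -d)
mkdir "$workdir/recipes"
printf 'Soup\n#Recipes\n' > "$workdir/soup.md"
printf 'Bread\n#Recipes\n' > "$workdir/bread.md"
printf 'Groceries\n' > "$workdir/shopping.md"

# -I% runs one mv per matching file, substituting the file name for %.
# --exclude-dir keeps grep from revisiting files already moved into recipes/.
grep --include '*.md' --exclude-dir recipes -l '#Recipes' -r "$workdir" |
  xargs -I% mv % "$workdir/recipes/"

ls "$workdir/recipes"
```

Passing the destination directory to mv, rather than rebuilding a path from the placeholder, means matches found in subdirectories don’t need a matching subdirectory under recipes.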

2. Problem

xargs splits its input on whitespace: spaces, tabs, and newlines. This causes unexpected behavior when you’re operating on files whose names contain spaces.

For example, the following command searches for Markdown files with the string #backup in them and creates a .tar file out of them.

$ grep --include '*.md' -l '#backup' -r . |
>   xargs tar -cvf backup.tar
./Homework.md
./TODO.md

This appears to work, but what happens if there’s a file with a space in its name?

$ grep --include '*.md' -l '#backup' -r .
./How to use xargs.md
./Homework.md
./TODO.md

$ grep --include '*.md' -l '#backup' -r . |
>   xargs tar -cvf backup.tar
tar: ./How: Cannot stat: No such file or directory
tar: to: Cannot stat: No such file or directory
tar: use: Cannot stat: No such file or directory
tar: xargs.md: Cannot stat: No such file or directory
./Homework.md
./TODO.md
tar: Exiting with failure status due to previous errors

The name "How to use xargs.md" got split into four different arguments, and that broke the tar command.
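You can count the damage directly. This sketch swaps tar for a tiny sh one-liner that prints how many arguments it received. The one-liner is just an illustrative probe, and the names are typed in with printf so no real files are needed.

```shell
# sh -c 'echo "$#"' prints how many arguments it received. The trailing "sh"
# only fills $0, so it is not counted.
received=$(printf '%s\n' './How to use xargs.md' './Homework.md' |
  xargs sh -c 'echo "$#"' sh)

# Two lines went in, but the spaces split the first name into four pieces.
echo "lines of input: 2, arguments received: $received"
```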

3. Solution

xargs supports quoting input strings to handle spaces, but it’s rare for commands generating the output (e.g., grep) to quote their output in a compatible manner. Some versions of xargs also support a flag to change the delimiter, but the key phrase there is "some versions."

To solve this reliably across the common versions of xargs (GNU, BSD, and macOS all support it), you can use the -0 flag.

-0

Specifies that the input delimits entries with null characters (\0) instead of blanks. xargs performs no other quoting or splitting if this flag is set.
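Here is a rough sketch of what "no other quoting or splitting" means in practice. It uses a made-up name with both a space and an apostrophe, which xargs without -0 would typically reject as an unmatched quote. printf '%s\0' stands in for a command that emits null-terminated names.

```shell
# A made-up name containing both a space and an apostrophe.
name="Bob's notes.md"

# printf '%s\0' ends the name with a null byte, the format -0 expects.
# The sh one-liner is an illustrative probe that prints its argument count.
count=$(printf '%s\0' "$name" | xargs -0 sh -c 'echo "$#"' sh)

# With -0 the name reaches the command untouched, as a single argument.
echoed=$(printf '%s\0' "$name" | xargs -0 -n 1 echo)

echo "arguments received: $count"
echo "argument as seen by the command: $echoed"
```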

Let’s play with it.

3.1. Fixing grep

First, switch the command above from tar to echo so that we can experiment with it. This is a good practice if you’re planning on performing a destructive operation like moving or deleting files.

$ grep --include '*.md' -l '#backup' -r . |
>   xargs -n1 echo
./How
to
use
xargs.md
./Homework.md
./TODO.md

This splits the file names on spaces like before, which gives us a baseline to verify our fix against.

Looking around man grep, we find this entry:

-Z

Output a zero byte (the ASCII NUL character) instead of the character that normally follows a file name.

This looks like the puzzle piece that fits into xargs -0's input slot. Let’s try it out.

$ grep --include '*.md' -l '#backup' -r . -Z |
>   xargs -0 -n1 echo
./How to use xargs.md
./Homework.md
./TODO.md

That’s better! This works for our original backup command too.

$ grep --include '*.md' -l '#backup' -r . -Z |
>   xargs -0 tar -cvf backup.tar
./How to use xargs.md
./Homework.md
./TODO.md
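To double-check that the archive really contains the file with spaces, you can list it back with tar -tf. The sketch below does the whole round trip in a scratch directory. The files and their contents are made up, and it assumes GNU grep, where -Z ends each file name with a null byte.

```shell
# Scratch directory with made-up files; names and contents are illustrative.
workdir=$(mktemp -d)
printf '#backup\n' > "$workdir/How to use xargs.md"
printf '#backup\n' > "$workdir/TODO.md"
printf 'not tagged\n' > "$workdir/README.md"

# Run the pipeline inside the scratch directory so the archive stores
# relative paths. Assumes GNU grep, where -Z ends each file name with a null.
(
  cd "$workdir" &&
    grep --include '*.md' -l '#backup' -r . -Z |
    xargs -0 tar -cf backup.tar
)

# List the archive instead of trusting the exit status alone.
listing=$(tar -tf "$workdir/backup.tar")
printf '%s\n' "$listing"
```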

3.2. What about commands that aren’t grep?

What if you’re running something besides grep? Several Unix commands support equivalent flags.

Here are some:

Command   Flag
find      -print0
rg        -0
sed       -z
sort      -z

For example, the following command backs up all Markdown files in sorted order.

$ find . -name '*.md' -print0 |
>   sort -z | (1)
>   xargs -0 tar -cvf backup.tar
./Homework.md
./How to use xargs.md
./README.md
./TODO.md
  1. The sort is unnecessary because we’re feeding the output into tar, but you get the point.
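If you’d like to watch the null-delimited names survive each stage of the pipeline, here is a sketch that swaps tar for echo and runs in a scratch directory with made-up files. Setting LC_ALL=C is my addition to keep the sort order byte-wise and predictable.

```shell
# Scratch directory with made-up, empty Markdown files.
workdir=$(mktemp -d)
touch "$workdir/TODO.md" "$workdir/How to use xargs.md" "$workdir/Homework.md"

# Every stage speaks the same null-delimited format:
# find -print0 writes it, sort -z reads and writes it, xargs -0 reads it.
sorted=$(
  cd "$workdir" &&
    find . -name '*.md' -print0 |
    LC_ALL=C sort -z |
    xargs -0 -n 1 echo
)
printf '%s\n' "$sorted"
```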

💡 Tip

If you’re using find to generate your output, you often don’t need xargs. You can use find's -exec flag to accomplish the same thing.

$ find . -name '*.md' -exec tar -cvf backup.tar '{}' '+'
./How to use xargs.md
./Homework.md
./TODO.md
./README.md
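A quick way to convince yourself that -exec with '+' keeps each name intact is to count the arguments the command receives. This sketch uses made-up files in a scratch directory and an illustrative sh one-liner in place of tar.

```shell
# Scratch directory with made-up, empty Markdown files.
workdir=$(mktemp -d)
touch "$workdir/How to use xargs.md" "$workdir/Homework.md" "$workdir/TODO.md"

# '{}' '+' appends as many file names as fit to a single invocation, much
# like xargs does. find passes each name as exactly one argument.
count=$(find "$workdir" -name '*.md' -exec sh -c 'echo "$#"' sh '{}' '+')

echo "arguments received: $count"
```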

The following replaces our use of find with rg.

$ rg -0 --files -g '*.md' |
>   sort -z |
>   xargs -0 tar -cvf backup.tar
Homework.md
How to use xargs.md
README.md
TODO.md

3.3. What about other commands?

If the command you’re running doesn’t support a flag equivalent to -0/-Z, or you don’t remember the flag, you can use the tr command to turn newlines into nulls.

$ grep --include '*.md' -l '#backup' -r . |
>   tr '\n' '\0' | (1)
>   xargs -0 tar -cvf backup.tar
./How to use xargs.md
./Homework.md
./TODO.md
  1. Turn all newlines into null characters.
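The same trick works when the names come from a plain text file rather than a command. The sketch below assumes a hypothetical manifest.txt with one file name per line. Keep in mind that this approach treats every newline as a separator, so it can’t handle file names that themselves contain newlines.

```shell
# Scratch directory with made-up files and contents.
workdir=$(mktemp -d)
printf 'first\n' > "$workdir/My notes.md"
printf 'second\n' > "$workdir/todo.md"

# A hypothetical manifest: one file name per line, with no null-delimited form.
printf '%s\n' "$workdir/My notes.md" "$workdir/todo.md" > "$workdir/manifest.txt"

# tr rewrites every newline as a null byte so xargs -0 sees whole lines.
combined=$(tr '\n' '\0' < "$workdir/manifest.txt" | xargs -0 cat)
printf '%s\n' "$combined"
```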

Personally, I tend to use tr even with commands that support -0/-Z because I don’t always remember whether it was -0 or -Z.

4. Conclusion

In summary,

  • Use xargs -0 when dealing with file names to handle spaces in the file names.

  • Use equivalent flags in other commands to generate and consume compatible output.

  • Use tr "\n" "\0" when there’s no such flag or you don’t remember it.

Written on 2022-06-04.