find(1)
emphasizing on portability and the very details of a few basic issues
2016-11-01 (see recent changes)
Table of contents
Basic issues:
-maxdepth
xargs(1) (-exec ... {} +)
-name '*'
-print
-ls
-path
-printf
-H/-L
..{}..
Pointers to several system independent implementations.
(What else can you do with find?)
An often recommended way to limit find(1) to one level (i.e., not descending into directories) is using the expression -maxdepth 1.
This expression was introduced by GNU findutils.
FreeBSD 4.1, NetBSD 2.0, OpenBSD 2.0 and the AST toolchest adopted this.
Mac OS X has implemented it since 10.2 (Darwin 6), when it switched from NetBSD find to FreeBSD 4.5 find.
But it's also possible to do this with the traditional find [1].
[1] You need a find utility that offers the expression "-prune".
Nowadays this is generally available, because it had already been introduced with 4.3BSD-Reno and SVR4 (i.e., about 1989). An exception to this rule is the busybox multi-call binary, which emphasizes minimalistic replacements: it has implemented -prune and -maxdepth since June 2007 as a compile-time option.
The portable way to avoid descending is
$ find . ! -name . -prune <remaining expressions>
$ find /etc/. ! -name . -prune <remaining expressions>
The first variant is flawless, and the remaining text in this section is about the second variant.
Usually, the second works as well (but I know SVR4.0 v2.1 and Cray Unicos as exceptions).
(But be very careful with other variants like "find /etc [...]" and "find /etc/ [...]".)
The order of arguments is crucial here, because they are not options but expressions.
The explanation sounds obvious: find never lists the '..' entry. If you also exclude the '.' entry and then apply "-prune" to all the remaining entries, find certainly won't descend anymore.
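The effect described above can be tried in a scratch directory. The following is only a minimal sketch: the directory layout and the variable names are made up for illustration, and mktemp(1) is assumed to be available (it is not part of the traditional toolset).

```shell
# Hypothetical scratch directory, for illustration only.
dir=$(mktemp -d)
mkdir "$dir/sub"
: > "$dir/top.txt"
: > "$dir/sub/nested.txt"

# '.' is excluded by "! -name .", everything else is pruned and printed,
# so nothing below ./sub is visited.
one_level=$(cd "$dir" && find . ! -name . -prune -print | LC_ALL=C sort)

printf '%s\n' "$one_level"
rm -rf "$dir"
```

The nested file should not show up in the listing, because ./sub is pruned before find gets a chance to descend into it.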
If you ever need the '.' included (pointed out by Stéphane Chazelas in comp.unix.shell), then you can use "find . \( -name . -o -prune \)".
About the portability:
$ find /etc/. ! -name /etc/. -prune <remaining expressions>
$ find /etc ! -name /etc -prune <remaining expressions>
and so on; because here the string is compared literally (and not with an internal "basename" mechanism).
The results of this call will look like "/etc/./hosts", which resolves without any problems on Unix.
Here, be careful when additionally applying a pattern match to the result of the "-prune". At least the versions from 3.8 up to the stable 4.1 (and up to the alpha release 4.1.5) suffer from a bug which prevents this from working:
$ find . ! -name . -prune -name '<pattern>' <remaining expressions>
"-name" is not only applied to the result of "-prune" but to all existing entries, because it is "optimized" to the left of "-prune".
This violates the fundamental left-to-right order of evaluation.
By the way: if you negate the second "-name", the bug is apparently not triggered anymore.
Thus, undoing the negation by simply doubling it looks like a proper workaround:
$ find . ! -name . -prune ! \( ! -name '<pattern>' \) <remaining expressions>
(See <3D7355C4.23L11TSIK@bigfoot.de> and <3DA23EDA.LHE11XWCK@bigfoot.de>.)
"find /etc ! -name etc -prune <remaining expressions>"
This won't omit files like "/etc/etc/file".
Thanks to Stéphane Chazelas for reminding me of this.
And interestingly, on SCO OpenServer 5.0.6, this requires "! -name /etc" instead of "! -name etc", for top level directories (only).
"find /etc/ ! -name etc -prune <remaining expressions>"
This won't omit files like "/etc/etc/file" either; and the actual output even varies, because on many systems you mustn't append a slash to the path argument. This depends on the find-internal "basename" implementation, which is often simpler than the libc implementation:
The above will fail with a trailing slash for example on AIX 3.2, FreeBSD 4.3, 4.5, GNU-findutils-4.1, Irix 6.5, 5.3, NetBSD 1.5.2, OpenBSD 2.9, SunOS 4.1.4, OpenServer 5.0.6, Solaris 2.1-2.9 and UnixWare 1 (aka SVR4.2).
It will work properly on AIX 4.3, HP-UX 10.x, NetBSD 1.5 and OSF1/V4, /V5.
On some of the former, affected versions (all Solaris, FreeBSD-4.3, OpenBSD-2.9, Irix 5/6, UnixWare 1), you can even illustrate the "basename bug" with an interesting workaround. Both of the following examples yield the same correct result, although an empty argument is provided in the second case:
$ find /etc ! -name etc -prune -print
$ find /etc/ ! -name '' -prune -print
This interesting side effect is another reason to mention these two unportable variants.
Why use find at all? It's helpful for handling '.' and '..' (particularly if you feed the result of find to other commands, like chown, chmod, etc.). Otherwise, you would have to fiddle around with more complicated shell pattern matching [2] or the like.
[2] To match all possible files without . and .. you could use * ..?* .[!.]*
Also see <8ned6b$k8$1@nnrp1.deja.com> and ff.
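As a rough illustration of the three patterns from footnote [2], the following sketch counts the entries they match in a scratch directory. The file names are hypothetical, mktemp(1) is assumed to be available, and default shell globbing settings are assumed.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
: > plain
: > .hidden
: > ..twodots
: > ...threedots

count=0
for f in * ..?* .[!.]*; do
    # A pattern without any match stays literal, so check that the entry exists.
    [ -e "$f" ] || continue
    count=$((count + 1))
done
echo "$count"
cd / && rm -rf "$dir"
```

Each of the four files is matched by exactly one of the patterns, while '.' and '..' are matched by none of them.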
find and xargs(1) (with -exec ... {} +)
Another frequently mentioned feature of GNU findutils [3] is a special combination with xargs, "[...] -print0 | xargs -0".
The purpose is to increase performance by avoiding a fork/exec for each single argument; and the use of null-terminated filenames avoids problems with unexpected filenames.
[3] GNU find introduced it with release 2.0 in Nov '90.
NetBSD 1.0, FreeBSD 2.0.5 (and thus Mac OS X) came with it from the start, and OpenBSD 2.1, the AST toolchest and HP-UX 11.23 incorporated this feature.
However, various find implementations know about the expression -exec + (instead of -exec \;).
This increases performance in the same way, but obsoletes xargs.
It's much simpler then:
$ find . -name xxx -exec command {} +
This has become a standard: it's specified in the SUSv3, aka IEEE 1003.1-2001/2004.
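The difference between the two terminators can be made visible by counting how often the command is started. This is only a sketch: the three file names are hypothetical, mktemp(1) is assumed to be available, and a find that supports "-exec ... {} +" is assumed.

```shell
dir=$(mktemp -d)
: > "$dir/a"
: > "$dir/b"
: > "$dir/c"

# "\;" starts the command once per file: one output line per call.
per_file=$(find "$dir" -type f -exec sh -c 'echo called' sh {} \; | wc -l)

# "+" collects the arguments; with only three files a single call is expected.
batched=$(find "$dir" -type f -exec sh -c 'echo called' sh {} + | wc -l)

echo "per file: $per_file, batched: $batched"
rm -rf "$dir"
```

With a really large number of files, the "+" form may still start the command several times, because each call has to stay below the system's argument size limit.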
"find . -exec echo x {} +" works correctly.
Implementations without this feature:
Having -exec ... + in mind, the usefulness and elegance of -print0 is debatable:
xargs executes once, which might be unexpected.
Stéphane Chazelas pointed out that, however, at least GNU grep (-Z), GNU sort (-z), perl (-0) and zsh (via IFS) understand this output (at the time of this writing).
Apart from these, some utilities from the new AT&T ast toolchest also understand it, and the bash "read" built-in has known an appropriate option (-d delimiter) since release 2.04.
xargs: possible problems with the character set encoding of arguments for xargs.
xargs implementations (e.g. SunOS 5) yield an error if you run them in a multi-byte locale
(find . -name '*')
-name is a pattern. POSIX requires (plain) pattern matching, not filename expansion (file globbing), as in "case $var in pattern)".
So, according to POSIX, "find . -name '*'" shall match leading dots.
On Version 7, find implements matching with glob(3) and thus handles the dot specially.
The same applies to System III and V and to the BSDs until 4.3BSD-Tahoe.
In the BSD line, find on 4.3BSD-Reno switched to fnmatch(3), so the dot was not special anymore.
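The distinction between plain pattern matching and filename expansion can be observed directly in the shell. The sketch below uses hypothetical file names, assumes mktemp(1) and default globbing settings, and assumes a find whose -name follows the POSIX rule.

```shell
# Plain pattern matching, as in "case": '*' also matches a leading dot.
case_result=nomatch
case .hidden in
    *) case_result=match ;;
esac

dir=$(mktemp -d)
: > "$dir/.hidden"
: > "$dir/visible"

# Filename expansion: '*' skips names with a leading dot.
glob_result=$(cd "$dir" && echo *)

# A find that follows POSIX here is expected to list ./.hidden as well.
find_result=$(cd "$dir" && find . -name '*' -print | LC_ALL=C sort)

echo "$case_result"
echo "$glob_result"
printf '%s\n' "$find_result"
rm -rf "$dir"
```

On an implementation where the dot is still special, the last listing would lack the hidden file.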
What about later implementations?
The dot is special on
* before the dot, that is, '*.' matches '.'.
/bin/find and /bin/posix/find
bin/find and /bin/posix/find [sco6]
The dot is not special anymore on:
/u95/bin/find [sco6]
/usr/xpg4/bin/find
"-print"?
find(1) required "-print".
However, it was changed to be the default action on 4.3BSD-Reno and in POSIX.
It's not required on:
It was required on:
Omitting -print:
Be careful with omitting -print when using logical operators (-a or -o):
An implicit -print binds to the whole expression, while an explicit -print binds as if it was added as -a -print.
Example:
$ find . -name omit-directory -prune -o -type f
is equivalent to
$ find . \( -name omit-directory -prune -o -type f \) -print
whereas
$ find . -name omit-directory -prune -o -type f -print
is equivalent to
$ find . \( -name omit-directory -prune \) -o \( -type f -print \)
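The practical consequence of the two bindings is that the pruned directory itself is printed in the implicit case. A small sketch, with a hypothetical directory layout and assuming mktemp(1) and a find where -print is the default action:

```shell
dir=$(mktemp -d)
mkdir "$dir/omit-directory"
: > "$dir/omit-directory/inside"
: > "$dir/keep"

# Implicit -print: applies to the whole expression, so the pruned
# directory (for which -prune is true) is printed, too.
implicit=$(cd "$dir" && find . -name omit-directory -prune -o -type f | LC_ALL=C sort)

# Explicit -print: only bound to "-type f".
explicit=$(cd "$dir" && find . -name omit-directory -prune -o -type f -print | LC_ALL=C sort)

printf 'implicit:\n%s\n' "$implicit"
printf 'explicit:\n%s\n' "$explicit"
rm -rf "$dir"
```

In both cases the file below omit-directory is skipped; only the listing of the directory entry itself differs.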
BSD fast-find
Another issue where omitting -print can be confusing: the BSD fast-find feature.
It was first implemented in 4.3 BSD: If given only one single argument,
find filename
then a database was searched for pathnames containing this filename as component.
But if -print is the default action, this can be confused with the widely known syntax, where one single argument is interpreted as a directory to descend into.
Thus the feature was removed with 4.3 BSD-Reno and implemented with another utility, locate.
Some variants which implemented the fast-find feature:
"-ls"?
find didn't know "-ls".
It is for instance available on
It is not available on
By the way, the resulting output imitates "ls -ldis".
It includes inode number and size in blocks (with system specific blocksize), sometimes it's documented as kilobytes.
"-path"?
find didn't know "-path".
It is for example available on
It is not available on
-printf
"?find -print
") with
-H/-L
"-H" was introduced with 4.4BSD-alpha; "-L" with 4.4BSD-Lite.
Some implementations that support the options -H/-L:
-H
)
-H
)
-H
)
XPG
find)
Several modern implementations substitute {} even if it is embedded in a string like this:
find . -exec echo xx{}xx \;
However, the traditional find requires {} to be a separate argument.
The new behaviour was introduced around 386BSD, according to the FreeBSD manpage archive.
POSIX/SUS only specifies {} standing alone, but allows it to be embedded as implementation-defined behaviour.
These variants for example accept an embedded {}:
but these don't accept an embedded {}:
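If the decorated output is needed regardless of how an implementation treats an embedded {}, one possible approach is to pass {} as a separate argument to a small shell command and to build the string there. This is only a sketch with a hypothetical file name, assuming mktemp(1) and a shell that sets $0 to the first argument after the command string:

```shell
dir=$(mktemp -d)
: > "$dir/file"

# {} stands alone, as POSIX/SUS specifies; the shell does the embedding.
decorated=$(cd "$dir" && find . -type f -exec sh -c 'echo "xx${1}xx"' sh {} \;)

echo "$decorated"
rm -rf "$dir"
```

The price is one additional shell per file, so this is a matter of portability rather than speed.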
What does "find ... -exec sh -c '...' x {}" mean?
With "$SHELL -c 'cmd' x y z ...", how do shells convert the arguments to $0, $1, etc.?
The combination of find and sh is very versatile, like a Swiss army knife.
Consider the call
$SHELL -c 'command' arg0 arg1 ...
Almost all shells set $0 to arg0 and $1 to arg1.
So you set $0 yourself, and you can use "$@" as usual if you supply your arguments starting from arg1.
Thus you will set arg0 to something which makes sense to become $0, e.g. "sh" or "find-sh".
If you set it to something different, keep in mind that an embedded r might trigger a restricted mode, or that zsh switches to different modes depending on $0.
(Update: I can't reproduce any of this, not even with early Bourne shells. Thanks to Stéphane Chazelas for the hint.)
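The assignment of arguments can be checked quickly on the command line. The sketch below assumes a shell with the common behaviour ($0 is set to the first argument after the command string); the argument values are arbitrary.

```shell
# arg0 becomes $0, arg1 becomes $1, and $# counts from $1 on.
result=$(sh -c 'echo "$0 $1 $#"' arg0 arg1)

# "$@" starts at $1, so the name given as $0 ("find-sh") is not part of it.
all=$(sh -c 'printf "<%s>" "$@"' find-sh "one two" three)

echo "$result"
echo "$all"
```

The angle brackets only serve to show that the argument containing a blank arrives as one word.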
However: early variants [4] of both the Almquist shell and the Korn shell (before ksh88f) implement this differently: $0 is set to $SHELL and $1 is set to arg0. Here you need:
$SHELL -c 'command' arg1 arg2 ...
What does this mean, if you need a portable call?
Back to find: if you only want to make use of one argument per call (that is, $1 but not $2, $3, etc.), you can work around this by merging both variants:
find ... -exec sh -c '...' {} {}
But if you want to use more parameters (or just "$@") in the shell:
$SHELL -c 'shift $1; command' 2 1 arg1 arg2 ...
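How the two leading numbers work can be traced as follows. The sketch runs on a shell with the common behaviour; the comments describe what an old shell would do according to the text above, which cannot be verified here. The argument values are arbitrary.

```shell
# Common behaviour: $0=2, $1=1, so "shift $1" shifts by one
# and the real arguments start at $1.
# Old behaviour (per the text): $1=2, $2=1, so "shift $1" shifts by two,
# which again leaves the real arguments starting at $1.
portable=$(sh -c 'shift $1; printf "<%s>" "$@"' 2 1 "first arg" second)

echo "$portable"
```

In both cases the numbers themselves are consumed and never reach the command.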
Example: You want to move all files in one directory to another directory, but get an "arg list too long" due to ARG_MAX (too many files) with "mv * /dir".
For reasons of speed you might want to avoid a shell for-loop if you have a really high number of files.
And if you cannot exclude filenames with blanks or newlines, you will use "find" with its + notation.
You will need the shell to hand over the last argument, the target directory, to the "mv" command separately [5].
Here are two solutions; the first assumes a modern shell implementation, the second is completely robust:
find . ! -name . -prune -exec sh -c 'mv "$@" targetdirectory/' sh {} +
find . ! -name . -prune -exec sh -c 'shift $1; mv "$@" targetdirectory/' 2 1 {} +
You do need the shell, because the following is not valid syntax (it was not allowed, to minimize possible confusion with a valid argument "+"):
find ... -exec mv {} target_directory +
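The first solution can be tried with two scratch directories. This is a sketch under several assumptions: mktemp(1) is available, the shell is a modern one, and the target is handed over in an environment variable named TARGET, which is a name chosen here for illustration only.

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
: > "$src/plain"
: > "$src/with blank"

# The inner shell receives the file names as "$@" and appends the target.
(cd "$src" && TARGET=$dst find . ! -name . -prune \
    -exec sh -c 'mv "$@" "$TARGET"/' sh {} +)

moved=$(ls -A "$dst" | wc -l)
left=$(ls -A "$src" | wc -l)
echo "moved: $moved, left: $left"
rm -rf "$src" "$dst"
```

The target directory lies outside the source directory here; otherwise find would also hand the target itself to mv.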
[4] Shell implementations with the old behaviour about arg0:
Older variants than ksh88f for example exist on
· HP-UX 8-11 ("ksh" but not "sh"): ksh88c
· AIX 3.2: ksh88d
· Ultrix 4.5: ksh88
· Unicos 9: ksh88e
· SVR4.0 v2.1: ksh88d
Older variants of ash exist on
· all traditional BSDs that come with an ash (4.3BSD-Net/2 ... 4.4BSD-Lite2)
· FreeBSD before 2.1.0 (10/95)
· NetBSD before 1.2 (10/96)
· Minix before 3.1.3 (5/06)
[5] GNU mv knows an option to put the target first, which allows
find [...] -exec mv --target-directory=dir {} +
Gunnar Ritter's Heirloom Toolchest implements several traditional variants and POSIX/SUS.
The AT&T AST toolchest is a POSIX/SUS implementation with numerous extensions.
The busybox toolchest is aiming at tiny implementations.
Jörg Schilling's sfind is a POSIX/SUS implementation.
find
.
What else can you do with find?
Find prime numbers!