#! magic, details about the shebang/hash-bang mechanism on various Unix flavours
2001-08-13 .. 2021-10-20
See an old mail from Dennis Ritchie introducing the new feature,
quoted in 4.0 BSD /usr/src/sys/newsys/sys1.c.
The path component newsys was an option.
It is also mentioned in /usr/src/sys/sys/TODO (that is, in the regular path):
6. Exec fixes Implement dmr's #! feature; pass string arguments through faster.
So this #! mechanism originates from Bell Labs, between Version 7 and Version 8,
and was then available on 4.0BSD (~10/'80), although not activated by default.
Two important differences from current implementations are:
The length of the line was limited to 16 (Research Unix) or 32 (BSD) bytes.
"Arguments" were not delivered.
It was then implemented by default on 4.2BSD (~09/'83), in /usr/src/sys/sys/kern_exec.c, by Robert Elz.
This implementation delivered all #! arguments as a single one.
Less than a year after 4.0BSD, but more than two years before 4.2BSD, #! was also added to 2.8BSD (~07/'81), but not active by default.
2.x BSD is a different development line, independent from 4BSD.
It's a 7th edition (V7) kernel with fixes activated by macros.
The macro for the #! code is not present in a makefile, so you had to activate it yourself.
The code wording is slightly different from 4BSD.
On 2.8BSD, #! seems to come from the U.S. Geological Survey in Menlo Park, not from Berkeley.
(Thanks to Gunnar Ritter for pointing out the origins in 4.0 and 4.2BSD in de.comp.os.unix.shell, to Jeremy C. Reed for mentioning Robert Elz, and to Richard Kettlewell for spotting 2.8BSD on the TUHS mailing list.)
In 4.3BSD Net/2 the code was removed due to the license war and had to be reimplemented for the descendants (e.g., NetBSD, 386BSD, BSDI).
In Version 8 (aka 8th edition), #! is implemented in /usr/sys/sys/sys1.c and documented in exec(2).
Among the public releases from Bell Labs, #! was not added until SVR4 ('88) according to a TUHS list discussion.
System III and SVR1 definitely had not implemented it yet.
According to Dennis M. Ritchie (email answer to Alex North-Keys) he got the idea from elsewhere, perhaps from one of the UCB conferences on BSD.
And it seems #! had no name originally.
Doug McIlroy mentions on the TUHS mailing list that the slang for # was "sharp" at the time at Bell Labs.
The paragraph "3.16) Why do some scripts start with #! ... ?" (local copy) emphasizes the history concerning shells, not the kernel.
That document is incorrect about two details (and it seems not to be actively maintained at the moment):
#! was not invented at Berkeley (but they implemented it first in widely distributed releases), see above.
# csh-hack: the document explicitly states that only csh was modified on the BSDs.
Blank after #! required?
There is a rumor that a very few and very special, earlier Unix versions (particularly 4.2BSD derivatives) require you to separate the "#!" from the following path with a blank.
You may also read that (allegedly) such a kernel parses "#! /" as a 32-bit (long) magic.
But it turns out that it is virtually impossible to find a Unix which actually required this.
4.2BSD in fact doesn't require it, although previous versions of the GNU autoconf tutorial claimed this ("10. Portable Shell Programming", corrected with release 2.64, 2009-07-26).
But instead, see 4.2BSD, /usr/src/sys/sys/kern_exec.c (the first regular occurrence).
A blank is accepted but not required.
All this was pointed out by Gunnar Ritter in <3B5B0BA4.XY112IX2@bigfoot.de> (and thanks to the new Caldera license, the code can be cited here now).
Instead, the origin of this myth "of the required blank"
might be a particular release of 4.1 BSD:
There is a manpage in a "4.1.snap" snapshot of 4.1BSD on the CSRG CDs, /usr/man/man2/exec.2 (4/1/81), where a space/tab after the #! is mentioned as mandatory.
However, this is not true: the source itself remained unchanged.
(Hint to the existence of such a manpage from Bruce Barnett in
<ae3m9l$rti$0@208.20.133.66>).
It's not clear whether this is a bug or confusion in documentation or if Berkeley planned to modify the BSD source but eventually did not.
DYNIX is mentioned in the autoconf documentation, too.
It's unclear if this variant might have implemented it in a few releases (perhaps following the above-mentioned manual page).
At least Dynix 3.2.0 or Dynix PTS 1.2.0 were actually 4.2BSD derived and did not require the blank.
I asked David MacKenzie, the author of the autoconf
documentation, about the actual origin of the autoconf note.
But unfortunately neither the reporting author nor the very system is recorded anymore.
Even an intensive search of Usenet archives didn't reveal any further hints to me.
I found no evidence yet that there's an implementation which forbids a blank after #!.
4.4BSD, however, didn't support setuid scripts yet.
The UNIX FAQ claims this (4.7. "How can I get setuid shell scripts to work?"), but it's explicitly denied in kern_exec.c.
setuid for scripts had been disabled with 4.3BSD-Tahoe already.
And the successor to 4.4BSD, 4.4BSD-Lite, lost its execve() implementation due to the license war.
Instead, a very early NetBSD release seems to be the origin concerning free BSDs [1].
[1] | NetBSD already implements it in the first cvs entry for exec_script.c (1994/01/16), some time before release 1.0.
Earlier code has been removed from netbsd.org. The file descriptor filesystem ("fdescfs") had been added with release 0.8 (04/93). NetBSD was influenced by 386BSD, but I couldn't find it there (including patchkit 0.2.4, 06/93). FreeBSD, which is a direct descendant of 386BSD, doesn't implement it either. OpenBSD forked off from NetBSD later (10/95) and thus implements it like NetBSD. Jason Stevens aka Neozeed meanwhile provides NetBSD 0.8 and 0.9 via cvsweb (announcement). |
Set user id support is implemented by means of the fd filesystem for instance on:
SETUIDSCRIPTS
activated)
SETUIDSCRIPTS
activated)
kern.sugid_scripts
.
Set user id support is also implemented on:
#!
mechanism),
#!
interpreter file is used.
-p
. Without this flag, the EUID is set back
to the UID if different.
#!
mechanism,
because you have to be aware of numerous issues.
#! script or: can you nest #!?
Most probably there isn't any Bell-Labs- or Berkeley-derived Unix that accepts the interpreter to be a script which starts with #! again.
However, Linux since 2.6.27.9 [2] and Minix accept this.
Be careful not to confuse whether the kernel accepts it, or whether the kernel has returned with an ENOEXEC and your shell silently tries to take over, parsing the #! line itself.
argv[0]
becomes the invoked script.)
#!
mechanism was not present at compile time
(probably only in unix-like environments like cygwin).
#!
, but only if "BSD" was not defined at compile time.
Later variants de-facto do not recognize it.
[2] | For more information about nested #! on Linux, see the kernel patch (to be applied to 2.6.27.9) and especially see binfmt_script.c, which contains the important parts. Linux allows at most BINPRM_MAX_RECURSION, that is 4, levels of nesting.
(Hint to me about the change by Mantas Mikulėnas.) |
A very few systems deliver only the first argument, some systems split up the arguments like a shell to fill up argv[], and most systems deliver all arguments as a single string. See the table below.
I noticed that for Linux (delivering all arguments as one string), a patch to split them up was suggested on the Linux kernel mailing list, followed by a discussion of some portability issues.
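The three ways of delivering arguments can be imitated in a few lines of shell. This is only a sketch for comparing the conventions, not what any kernel actually does: shebang_args and its mode names (one, first, split) are made up here, tabs are simply treated as blanks, trailing blanks are dropped, and the script name that a real system appends as the last argument is left out.

```shell
#!/bin/sh
# Sketch only: imitate how a "#!" line could be turned into arguments.
# shebang_args is a hypothetical helper, not an existing tool.
# usage: shebang_args MODE LINE     (MODE is one, first or split)
shebang_args() {
    mode=$1
    # drop the two magic characters; treat tabs as blanks (a simplification)
    line=$(printf '%s\n' "${2#??}" | tr '\t' ' ')
    # a blank after "#!" is accepted but not required; drop trailing blanks too
    line=$(printf '%s\n' "$line" | sed 's/^ *//; s/ *$//')
    interp=${line%% *}
    case $line in
        *" "*) rest=$(printf '%s\n' "${line#* }" | sed 's/^ *//') ;;
        *)     rest= ;;
    esac
    printf 'argv[0]: "%s"\n' "$interp"
    [ -n "$rest" ] || return 0
    case $mode in
        one)   printf 'argv[1]: "%s"\n' "$rest" ;;        # all args in one
        first) printf 'argv[1]: "%s"\n' "${rest%% *}" ;;  # only the 1st arg
        split)                                            # separate args
            i=1
            # unquoted on purpose to get word splitting
            # (globbing is not disabled in this sketch)
            for arg in $rest; do
                printf 'argv[%d]: "%s"\n' "$i" "$arg"
                i=$((i + 1))
            done ;;
    esac
}

shebang_args one   '#!/tmp/showargs -1 -2 -3'
shebang_args first '#!/tmp/showargs -1 -2 -3'
shebang_args split '#!/tmp/showargs -1 -2 -3'
```

The mode "one" corresponds to what most systems do, "first" to the few systems that deliver only the first argument, and "split" to the shell-like splitting.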
env(1) is often used with the #! mechanism to start an interpreter, which then only needs to be somewhere in your PATH, e.g. "#!/usr/bin/env perl".
However, the location of env(1) might vary.
Free-, Net-, OpenBSD and some Linux distributions (e.g. Debian) only come with /usr/bin/env.
On the other hand, there's only /bin/env at least on SCO OpenServer 5.0.6 and Cray Unicos 9.0.2 (although the latter is only of historical interest).
On some other Linux distributions (Redhat) it's located in /bin and /usr/bin/ contains a symbolic link pointing to it.
The env mechanism greatly increases convenience, and almost all systems nowadays provide /usr/bin/env. Yet, it cannot strictly assure "portability" of a script.
In practice, env should not be a script. See "can you nest #!" above.
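To see which of the two locations a given system provides, something like the following can be used. It is a minimal sketch; first_executable is a name invented for this example, and it only checks for an executable regular file (following symbolic links), nothing more.

```shell
#!/bin/sh
# Sketch only: print the first argument that names an executable
# regular file. first_executable is a hypothetical helper.
first_executable() {
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# the two locations mentioned in the text
first_executable /usr/bin/env /bin/env ||
    echo "env found neither in /usr/bin nor in /bin" >&2
```

A script that is installed by a setup step could use the printed path to write its own #! line, instead of relying on one fixed location.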
FreeBSD 4.0 introduced a comment-like handling of "#" in the arguments,
but release 6.0 revoked this (see also a discussion
on freebsd-arch).
MacOS X introduced comment-like handling of "#" with release 10.3 (xnu-517, Panther).
#!
line:
sizeof(struct a.out)
" or "sizeof(struct exec)
".
union
, which contains both this struct
a.out (or exec) and a string of the same size which will contain the #!
line.
kern_execve.v
(in the Attic), which
inherited
from 386BSD-0.1 patch 0.2.2, and soon
added
allowing one argument.
kern/exec_script.c
(MAXINTERP
in
<sys/param.h>
or
PATH_MAX
in
<sys/syslimits.h>
,
respectively).
imgact_shell.c
and
<sys/imgact.h>
<machine/param.h>
(i386,
ia64,
sparc64,
amd64,
alpha:
param.h and
alpha_cpu.h,
supported until 6.3) ,
and <sys/param.h>
.
MAXSHELLCMDLEN
now is set to PAGESIZE
, which in turn depends on the architecture.
kern/exec_script.c
(MAXINTERP
in
<sys/param.h>
).
BINPRM_BUF_SIZE
in
load_script()
in
linux/fs/binfmt_script.c
,
<linux/binfmts.h>
and
<uapi/linux/binfmts.h>
).
On Linux, #! was introduced with kernel release 0.09 or 0.10 (0.08 had not implemented it yet).
And in fact, the original maximum length was 1022, see linux/fs/exec.c from Linux 0.10.
But with Linux 0.12, this was changed to 127 (parts of a diff).
limits.h
or syslimits.h
on the respective system.
Exceptions are BIG-IP4.2 (BSD/OS4.1) with 4096 and FreeBSD since 6.0 (PAGE_SIZE) with 4096 or 8192 depending on the architecture.
Minix also uses the limit of PATH_MAX characters (255 here) but the actual limit is 257 characters, because patch_stack() in src/mm/exec.c first skips the "#!" with an lseek() and then reads in the rest.
#!
magic with a multi character constant
#define SCRMAG '#!'
SCRMAG
,
and even added its own multi character constant for a variant of the magic:
# define SCRMAG2 '/*#!'
# define ARGPLACE "$*"
Find more information about this in the end notes [Demos].
sys/i386/i386/exec_machdep.c
) shows an interesting way to construct the magic
    [...]
    switch (magic) {
    /* interpreters (note byte order dependency) */
    case '#' | '!' << 8:
        handler = exec_interpreter;
        break;
    case [...]
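The byte order dependency noted in that comment can be made visible with a little shell arithmetic. This is just an illustration of the numbers involved, assuming ASCII; ord is a small helper defined here, and the variable names are made up.

```shell
#!/bin/sh
# Sketch only: the two magic bytes combined into a 16-bit value,
# once with the first byte in the low-order position (as in the
# case label  '#' | '!' << 8  ) and once the other way round.

# ord: numeric code of a character, using printf's leading-quote syntax
ord() { printf '%d' "'$1"; }

hash=$(ord '#')     # 35 (0x23) in ASCII
bang=$(ord '!')     # 33 (0x21) in ASCII

magic_low_first=$(( hash | (bang << 8) ))
magic_high_first=$(( (hash << 8) | bang ))

printf 'first byte low-order:  0x%04x\n' "$magic_low_first"
printf 'first byte high-order: 0x%04x\n' "$magic_high_first"
```

The first value, 0x2123, is what a little-endian machine sees when it reads the two bytes "#!" as one 16-bit number; with the opposite byte order the same two bytes read as 0x2321, so the case label would not match.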
#!
only as a possible extension:
Shell Introduction
[...]
If the first line of a file of shell commands starts with the characters #!, the results are unspecified.
The construct #! is reserved for implementations wishing to provide that extension. A portable application cannot use #! as the first line of a shell script; it might not be interpreted as a comment.
[...]
Command Search and Execution
[...]
This description requires that the shell can execute shell scripts directly, even if the underlying system does not support the common #! interpreter convention. That is, if file foo contains shell commands and is executable, the following will execute foo:
./foo
There was a Working Group Resolution trying to define the mechanism.
On the other hand, speaking about "#!/bin/sh" on any Unix:
This is a really rock-solid and portable convention by tradition, if you expect anything from the Bourne shell family and its descendants to be called.
#! was a great hack to make scripts look and feel like real executable binaries.
But, as a little summary, what's special about #!? (list mostly courtesy of David Korn)
#! is much smaller than the maximum path length
$PATH is not searched for the interpreter
#! line also accepts a relative path, #!interpreter is equivalent to #!./interpreter,
#! script again
#! line itself is varying
#!$SHELL
ENOENT
.
This error can be misleading, because many shells then print the script name instead of the interpreter in its #! line:
    $ cat script.sh
    #!/bin/notexistent
    $ ./script.sh
    ./script.sh: not found
bash since release 3 subsequently reads the first line itself and gives a diagnostic concerning the interpreter:
    bash: ./script.sh: /bin/notexistent: bad interpreter: No such file or directory
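With a shell that only reports "not found", the interpreter can be checked by hand. The following is a rough sketch of such a check; check_interpreter is a name invented here, the messages are made up, and the test is merely whether the first word after #! is an executable file (a relative path is looked up from the current directory).

```shell
#!/bin/sh
# Sketch only: look at the "#!" line of a script and tell whether the
# interpreter named there exists. check_interpreter is hypothetical.
check_interpreter() {
    script=$1
    first=
    # read the first line (also if it lacks a trailing newline)
    IFS= read -r first < "$script" || [ -n "$first" ] || return 2
    case $first in
        '#!'*) ;;
        *) echo "$script: no #! line"; return 0 ;;
    esac
    # drop the magic and optional blanks, keep only the first word
    interp=$(printf '%s\n' "${first#??}" |
        sed 's/^[[:blank:]]*//; s/[[:blank:]].*$//')
    if [ -x "$interp" ]; then
        echo "$script: interpreter $interp looks usable"
    else
        echo "$script: $interp: bad interpreter"
        return 1
    fi
}

# check the file given on the command line, if any
[ $# -eq 0 ] || check_interpreter "$1"
```

For the script.sh from the example above, this reports /bin/notexistent as the culprit rather than the script itself.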
#!
line is too long, at least three things can happen:
E2BIG
(IRIX, SCO OpenServer)
or ENAMETOOLONG
(FreeBSD, BIG-IP4.2, BSD/OS4.1)
ENOEXEC
.
In some shells this results in a silent failure.
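Because a silent failure is hard to track down, it can help to compare the length of the first line against a limit before relying on it. This is a minimal sketch: shebang_fits is a hypothetical helper, the default of 127 is just one of the values from the table below, and the count is simply the number of characters in the whole first line including the "#!" (whether a particular kernel counts the magic or the newline is not modelled here).

```shell
#!/bin/sh
# Sketch only: compare the length of a script's first line with a limit.
# shebang_fits is a hypothetical helper.
# usage: shebang_fits FILE [LIMIT]      (LIMIT defaults to 127 here)
shebang_fits() {
    limit=${2:-127}
    first=
    IFS= read -r first < "$1" || [ -n "$first" ] || return 2
    len=${#first}       # characters, without the newline
    if [ "$len" -le "$limit" ]; then
        echo "ok: $len (limit $limit)"
    else
        echo "too long: $len (limit $limit)"
        return 1
    fi
}

# check the file given on the command line, if any
[ $# -eq 0 ] || shebang_fits "$@"
```

Called with 32 as the limit, it shows whether a script would still have worked on the systems with the original BSD limit.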
I used the following as program "showargs":
    #include <stdio.h>

    int main(argc, argv)
        int argc;
        char** argv;
    {
        int i;
        for (i=0; i<argc; i++)
            fprintf(stdout, "argv[%d]: \"%s\"\n", i, argv[i]);
        return(0);
    }
and a one line script named "invoker.sh" to call it, similar to this,
#!/tmp/showargs -1 -2 -3
to produce the following results (tried them myself, but I'd like to add your results from yet different systems).
Typically, a result from the above would look like this:
    argv[0]: "/tmp/showargs"
    argv[1]: "-1 -2 -3"
    argv[2]: "./invoker.sh"
... but the following table lists the variations. The meaning of the columns is explained below.
OS (arch) | maximum length of #! line | cut-off (cut), error (error) or ENOEXEC | all args in one, no arguments, only the 1st arg, or separate args | handle # like a comment | argv[0]: invoker, instead of interpreter | not full path in argv[0] | remove trailing whitespace | convert tabulator to space | accept interpreter | do not search current directory | no suid or allow suid or optional |
4.0BSD / 4.1BSD | 32 | no | n/a | X | n/a | suid | |||||
386BSD-0.1p2.3 | 32 | no | n/a | X | n/a | ||||||
4.2BSD | 32 | ? | ? | ? | ? | X | suid | ||||
4.3BSD | 32 | c / - [43bsd] | X | X | suid | ||||||
4.3BSD-Tahoe/Quasijarus | 32 | X | X | ||||||||
AIX 3.2.5/4.3.2 (rs6k) | 256 | X | X | ||||||||
BIG-IP4.2 [big-ip] | 4096 | err | args | ? | ? | X | n/a | ||||
Dynix 3.2 | 32 | ? | ? | X | ? | ||||||
EP/IX 2.2.1 (mips) | 1024 | X | suid | ||||||||
FreeBSD 1.1- / 4.0-4.4 | 64 | args | - / X | X | n/a | ? | |||||
FreeBSD 4.5- | 128 | err | args | X | X | n/a | ? | ||||
FreeBSD 6.0-8.1 (i386/amd64, ia64/sparc64/alpha) | 4096, 8192 | cut | X | X | |||||||
FreeBSD 8.1 9/2010 (i386/amd64, ia64/sparc64/alpha) | 4096, 8192 | X | X | ||||||||
HP-UX A.08.07/B.09.03 | 32 | X | ? | ? | ? | ||||||
HP-UX B.10.10 | 128 | X | X | ? | ? | ? | |||||
HP-UX B.10.20-11.31 | 128 | X | X | ? | |||||||
IRIX 4.0.5 (mips) | 64 | ? | ? | X | X | ||||||
IRIX 5.3/6.5 (mips) | 256 | err | X | suid | |||||||
Linux 0.10 / 0.12-0.99.1 | 1022 / 127 | [early-linux] | [early-linux] | X | ? | ||||||
Linux 0.99.2-2.2.26 | 127 | cut | X | X | ? | ||||||
Linux 2.4.0-2.6.27.8 / 2.6.27.9- | 127 | cut | X | / X | |||||||
MacOS X 10.0/.1/.2, xnu 123.5-344 | 512 | ? | ? | X | ? | ? | ? | ||||
MacOS X 10.3, xnu 517 | 512 | X | ? | ? | X | X | ? | ? | ? | ||
MacOS X 10.4/.5/.6, xnu 792-1504 | 512 | args | X | X | n/a | opt | |||||
Minix 2.0.3-3.1.1 | 257 | args | X | n/a | X | suid | |||||
Minix 3.1.8 | 257 | err | args | X | n/a | suid | |||||
MUNIX 3.1 (svr3.x, 68k) | 32 | X | ? | ? | ? | ||||||
NetBSD 0.9 | 32 | cut [netbsd0.9] | opt [netbsd0.9] | ||||||||
NetBSD 1.0-1.6Q / 1.6R- | 64 / 1024 | opt | |||||||||
OpenBSD 2.0-3.4 | 64 | opt | |||||||||
OSF1 V4.0B-T5.1 | 1024 | X | X | ||||||||
OpenServer 5.0.6 [sco] | 256 | err | 1st | X | X | ||||||
OpenServer 6.0.0: see UnixWare | |||||||||||
SINIX 5.20 (mx300/nsc) | 32 | ? | ? | ||||||||
Plan 9 v4 (i386) | 30 | args | X | X | X | n/a | ? | ||||
SunOS 4.1.4 (sparc) | 32 | cut | X | X | |||||||
SunOS 5.x (sparc) | 1024 | 1st | X | X | suid | ||||||
SVR4.0 v2.1 (x386) | 256 | error | 1st | ? | ? | X | X | suid | |||
Ultrix 4.0 (µvax 3900) | 31 | X | X | suid | |||||||
Ultrix 4.5 (µvax3900) | 32/31(suid) | cut | X | X | suid | ||||||
Ultrix 4.3 (vax/mips), 4.5 (vax3100) | 32 | cut | X | ? | ? | ||||||
Ultrix 4.5 (risc) | 80 | cut | X | ? | ? | ||||||
Unicos 9.0.2.2 (cray) | 32 | X | ? | ? | |||||||
UnixWare 7.1.4, OpenServer 6.0.0 [suid] | 256 | err | 1st | X | X | suid | |||||
GNU Hurd cvs-20020529, 0.3/Mach1.3.99 [hurd] | 4096 | cut | X | X | X | ||||||
UWIN 4.5 (WinXP prof 5.1) [uwin] | 512 | ||||||||||
Cygwin Beta19 (WinXP prof 5.1) [cygwin] | 263 | cut | args | X | n/a | X | ? | ||||
Cygwin 1.7.7 (WinXP prof 5.1) [cygwin] | 32764 | err | X | ||||||||
Cygwin 1.7.35 (Win7) [cygwin] | 65536 | err | X | X | |||||||
Untested, but some information or even source available: | |||||||||||
first implementation between Version 7 and 8 (unreleased, see above) | 16 | no | n/a | ? | X | n/a | ? | suid | |||
Version 8 (aka 8th edition) | 32 | 1st | n/a | ? | X | ? | ? | suid | |||
Demos / "Демос" [Demos] | ? | ? | args | ? | ? | ? | ? | ? | ? | ? | ? |
"all args in one": argv[1]: "-1 -2 -3"
"only the 1st arg": argv[1]: "-1"
"separate args": argv[1]: "-1", argv[2]: "-2", argv[3]: "-3"
"handle # like a comment": if # appears in the arguments, then the # and the rest of the line are ignored
"argv[0]: invoker, instead of interpreter": argv[0] doesn't contain "/tmp/showargs" but "./invoker.sh"
"not full path in argv[0]": argv[0] contains the basename of the called program instead of its full path.
"accept interpreter": the interpreter in the "#!"-line may be an interpreted script itself
"do not search current directory": "#!sh" doesn't work if called from /bin
"n/a" means that the attribute is not relevant in this case.
[orig] | 4.0BSD and 386BSD-0.1 don't hand over any argument at all.
The called interpreter only receives argv[0] with its own path, argv[1] with the script, and optionally further arguments from the call of the script. | |
[43bsd] | The code in kern_exec.c tests if the byte after the struct containing the #! line is null.
Otherwise it throws an ENOEXEC.
However, reading the line from the file is also limited to 32 bytes, and the following byte (not from the file) is often zeroed out by coincidence. It then looks as if the line was cut to 32 bytes. But sometimes, you actually get an ENOEXEC. | |
[netbsd0.9] | If the line is longer than 32 bytes, it triggers a bug:
the scriptname is appended to argv[1] and argv[2] contains an environment variable.
setuid support is a compile time option, however not per Makefile but by activating it in kern_exec.c itself. | |
[big-ip] | This BIG-IP 4.2 (vendor is F5) is based on BSDi BSD/OS 4.1,
probably even with very few modifications:
The tools contain the string "BSD/OS 4.1" and there's also a kernel /bsd-generic, which contains "BSDi BSD/OS 4.1". I had no compiler available on this system, thus some tests are pending. | |
[sco] | John H. DuBois told me that #! was introduced in SCO UNIX 3.2v4.0, but was disabled by default.
If you wanted to use it, it had to be enabled by setting hashplingenable in kernel/space.c ("hashpling" because it was implemented by programmers in Britain). It was apparently enabled by default in 3.2v4.2, but even then there were no #! scripts shipped with the OS, as a customer might disable it.
The first #! scripts (tcl) were then shipped in 3.2v5.0. | |
[early-linux] | On Linux 0.10 until 0.99.1, argv[0] contains both the interpreter and the arguments:
argv[0]: "/tmp/showargs -1 -2 -3" | |
[hurd] | Nesting interpreters this way:
    $ ./script2 -2
    script2: #!/path/script1 -1
    script1: #!/path/showargs -0
results in
    argv[0]: "/path/showargs"
    argv[1]: "-0"
    argv[2]: "/path/script1"
    argv[3]: "-1"
    argv[4]: "./script2"
    argv[5]: "-2" | |
[uwin] | An example for a valid absolute interpreter path is C:/path/to/interpreter
A path with backslashes or without the drive letter is not accepted. Home of the UWIN package at AT&T | |
[cygwin] | Valid absolute interpreter paths are for example C:/path/to/interpreter and /path/to/interpreter.
Backslashes are not accepted. Nested scripts are only possible if a drive letter is used. argv[0] becomes a path in Windows notation, C:\path\to\interpreter.
cygwin.com: Web-Git (formerly Web-CVS)
On Cygwin 1.7.35 the call can even succeed with values greater than 65536, but only occasionally. | |
[Demos] | DEMOS / ДЕМОС was a Soviet variant of 2.9BSD (PDP-11 version), or 4.2BSD (32-bit VAX version), respectively.
See also the Wikipedia entry and gunkies.org. Demos recognizes
    #!CMD A1 $* A2 A3
Demos also knows an alternative magic, /*#!, for interpreters which use /* as comment instead of #.
Thanks to Random821 for pointing out this special implementation on the TUHS list. |
And why shebang? In music, '#' means sharp. So just shorten #! to sharp-bang. Or it might be derived from "shell bang". All this probably under the influence of the American slang idiom "the whole shebang" (everything, the works, everything involved in what is under consideration).
See also the wiktionary, jargon dictionary or Merriam-Webster.
Sometimes it's also called hash-bang, pound-bang, sha-bang/shabang, hash-exclam, or hash-pling (British, isn't it?).
According to Dennis M. Ritchie (email answer to Alex North-Keys) it seems it had no name originally.
And Doug McIlroy mentioned on the TUHS mailing list that the slang for # at Bell Labs most probably was "sharp" at the time.
<http://www.in-ulm.de/~mascheck/various/shebang/>