If you're new to Perl, you should start with perlintro, which is a general intro for beginners and provides some background to help you navigate the rest of Perl's extensive documentation.
For ease of access, the Perl manual has been split up into several sections.
perl Perl overview (this section)
perlintro Perl introduction for beginners
perltoc Perl documentation table of contents
perlreftut Perl references short introduction
perldsc Perl data structures intro
perllol Perl data structures: arrays of arrays
perlrequick Perl regular expressions quick start
perlretut Perl regular expressions tutorial
perlboot Perl OO tutorial for beginners
perltoot Perl OO tutorial, part 1
perltooc Perl OO tutorial, part 2
perlbot Perl OO tricks and examples
perlstyle Perl style guide
perlcheat Perl cheat sheet
perltrap Perl traps for the unwary
perldebtut Perl debugging tutorial
perlfaq Perl frequently asked questions
perlfaq1 General Questions About Perl
perlfaq2 Obtaining and Learning about Perl
perlfaq3 Programming Tools
perlfaq4 Data Manipulation
perlfaq5 Files and Formats
perlfaq6 Regexes
perlfaq7 Perl Language Issues
perlfaq8 System Interaction
perlfaq9 Networking
perlsyn Perl syntax
perldata Perl data structures
perlop Perl operators and precedence
perlsub Perl subroutines
perlfunc Perl built-in functions
perlopentut Perl open() tutorial
perlpacktut Perl pack() and unpack() tutorial
perlpod Perl plain old documentation
perlpodspec Perl plain old documentation format specification
perlrun Perl execution and options
perldiag Perl diagnostic messages
perllexwarn Perl warnings and their control
perldebug Perl debugging
perlvar Perl predefined variables
perlre Perl regular expressions, the rest of the story
perlreref Perl regular expressions quick reference
perlref Perl references, the rest of the story
perlform Perl formats
perlobj Perl objects
perltie Perl objects hidden behind simple variables
perldbmfilter Perl DBM filters
perlipc Perl interprocess communication
perlfork Perl fork() information
perlnumber Perl number semantics
perlthrtut Perl threads tutorial
perlothrtut Old Perl threads tutorial
perlport Perl portability guide
perllocale Perl locale support
perluniintro Perl Unicode introduction
perlunicode Perl Unicode support
perlebcdic Considerations for running Perl on EBCDIC platforms
perlsec Perl security
perlmod Perl modules: how they work
perlmodlib Perl modules: how to write and use
perlmodstyle Perl modules: how to write modules with style
perlmodinstall Perl modules: how to install from CPAN
perlnewmod Perl modules: preparing a new module for distribution
perlutil utilities packaged with the Perl distribution
perlcompile Perl compiler suite intro
perlfilter Perl source filters
perlglossary Perl Glossary
perlembed Perl ways to embed perl in your C or C++ application
perldebguts Perl debugging guts and tips
perlxstut Perl XS tutorial
perlxs Perl XS application programming interface
perlclib Internal replacements for standard C library functions
perlguts Perl internal functions for those doing extensions
perlcall Perl calling conventions from C
perlapi Perl API listing (autogenerated)
perlintern Perl internal functions (autogenerated)
perliol C API for Perl's implementation of IO in Layers
perlapio Perl internal IO abstraction interface
perlhack Perl hackers guide
perlbook Perl book information
perltodo Perl things to do
perldoc Look up Perl documentation in Pod format
perlhist Perl history records
perldelta Perl changes since previous version
perl587delta Perl changes in version 5.8.7
perl586delta Perl changes in version 5.8.6
perl585delta Perl changes in version 5.8.5
perl584delta Perl changes in version 5.8.4
perl583delta Perl changes in version 5.8.3
perl582delta Perl changes in version 5.8.2
perl581delta Perl changes in version 5.8.1
perl58delta Perl changes in version 5.8.0
perl573delta Perl changes in version 5.7.3
perl572delta Perl changes in version 5.7.2
perl571delta Perl changes in version 5.7.1
perl570delta Perl changes in version 5.7.0
perl561delta Perl changes in version 5.6.1
perl56delta Perl changes in version 5.6
perl5005delta Perl changes in version 5.005
perl5004delta Perl changes in version 5.004
perlartistic Perl Artistic License
perlgpl GNU General Public License
perlcn Perl for Simplified Chinese (in EUC-CN)
perljp Perl for Japanese (in EUC-JP)
perlko Perl for Korean (in EUC-KR)
perltw Perl for Traditional Chinese (in Big5)
perlaix Perl notes for AIX
perlamiga Perl notes for AmigaOS
perlapollo Perl notes for Apollo DomainOS
perlbeos Perl notes for BeOS
perlbs2000 Perl notes for POSIX-BC BS2000
perlce Perl notes for WinCE
perlcygwin Perl notes for Cygwin
perldgux Perl notes for DG/UX
perldos Perl notes for DOS
perlepoc Perl notes for EPOC
perlfreebsd Perl notes for FreeBSD
perlhpux Perl notes for HP-UX
perlhurd Perl notes for Hurd
perlirix Perl notes for Irix
perllinux Perl notes for Linux
perlmachten Perl notes for Power MachTen
perlmacos Perl notes for Mac OS (Classic)
perlmacosx Perl notes for Mac OS X
perlmint Perl notes for MiNT
perlmpeix Perl notes for MPE/iX
perlnetware Perl notes for NetWare
perlopenbsd Perl notes for OpenBSD
perlos2 Perl notes for OS/2
perlos390 Perl notes for OS/390
perlos400 Perl notes for OS/400
perlplan9 Perl notes for Plan 9
perlqnx Perl notes for QNX
perlsolaris Perl notes for Solaris
perltru64 Perl notes for Tru64
perluts Perl notes for UTS
perlvmesa Perl notes for VM/ESA
perlvms Perl notes for VMS
perlvos Perl notes for Stratus VOS
perlwin32 Perl notes for Windows
By default, the manpages listed above are installed in the /usr/local/man/ directory.
Extensive additional documentation for Perl modules is available. The default configuration for perl will place this additional documentation in the /usr/local/lib/perl5/man directory (or else in the man subdirectory of the Perl library directory). Some of this additional documentation is distributed standard with Perl, but you'll also find documentation for third-party modules there.
You should be able to view Perl's documentation with your man(1) program by including the proper directories in the appropriate start-up files, or in the MANPATH environment variable. To find out where the configuration has installed the manpages, type:
perl -V:man.dir
If the directories have a common stem, such as /usr/local/man/man1 and /usr/local/man/man3, you need only to add that stem (/usr/local/man) to your man(1) configuration files or your MANPATH environment variable. If they do not share a stem, you'll have to add both stems.
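As a rough illustration of the stem rule above, the following Python sketch derives the MANPATH entries from a list of manpage directories such as those reported by perl -V:man.dir. The helper name manpath_stems and the second sample path are hypothetical, and the sketch assumes every directory ends in a section subdirectory such as man1 or man3.

```python
import posixpath

def manpath_stems(man_dirs):
    """Return the distinct parent directories ("stems") of the given
    manpage section directories, in first-seen order."""
    stems = []
    for directory in man_dirs:
        stem = posixpath.dirname(directory.rstrip("/"))
        if stem not in stems:
            stems.append(stem)
    return stems

# Shared stem: a single MANPATH entry is enough.
print(":".join(manpath_stems(["/usr/local/man/man1", "/usr/local/man/man3"])))

# No shared stem (hypothetical second path): both stems are needed.
print(":".join(manpath_stems(["/usr/local/man/man1", "/opt/perl/man/man3"])))
```

The first call yields one stem and the second yields two, mirroring the two cases described in the text.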
If that doesn't work for some reason, you can still use the supplied perldoc script to view module information. You might also look into getting a replacement man program.
If something strange has gone wrong with your program and you're not sure where you should look for help, try the -w switch first. It will often point out exactly where the trouble is.
Perl combines (in the author's opinion, anyway) some of the best features of C, sed, awk, and sh, so people familiar with those languages should have little difficulty with it. (Language historians will also note some vestiges of csh, Pascal, and even BASIC-PLUS.) Expression syntax corresponds closely to C expression syntax. Unlike most Unix utilities, Perl does not arbitrarily limit the size of your data: if you've got the memory, Perl can slurp in your whole file as a single string. Recursion is of unlimited depth. And the tables used by hashes (sometimes called ``associative arrays'') grow as necessary to prevent degraded performance. Perl can use sophisticated pattern matching techniques to scan large amounts of data quickly. Although optimized for scanning text, Perl can also deal with binary data, and can make dbm files look like hashes. Setuid Perl scripts are safer than C programs through a dataflow tracing mechanism that prevents many stupid security holes.
If you have a problem that would ordinarily use sed or awk or sh, but it exceeds their capabilities or must run a little faster, and you don't want to write the silly thing in C, then Perl may be for you. There are also translators to turn your sed and awk scripts into Perl scripts.
But wait, there's more...
Begun in 1993 (see perlhist), Perl version 5 is nearly a complete rewrite that provides the following additional benefits:
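Modularity and reusability using innumerable modules. Described in perlmod, perlmodlib, and perlmodinstall.
Embeddable and extensible. Described in perlembed, perlxstut, perlxs, perlcall, perlguts, and xsubpp.
Roll-your-own magic variables (including multiple simultaneous DBM implementations). Described in perltie and AnyDBM_File.
Subroutines can now be overridden, autoloaded, and prototyped. Described in perlsub.
Arbitrarily nested data structures and anonymous functions. Described in perlreftut, perlref, perldsc, and perllol.
Object-oriented programming. Described in perlobj, perlboot, perltoot, perltooc, and perlbot.
Support for light-weight processes (threads). Described in perlthrtut and threads.
Support for Unicode, internationalization, and localization. Described in perluniintro, perllocale and Locale::Maketext.
Lexical scoping. Described in perlsub.
Regular expression enhancements. Described in perlre, with additional examples in perlop.
Enhanced debugger and interactive Perl environment, with integrated editor support. Described in perldebtut, perldebug and perldebguts.
POSIX 1003.1 compliant library. Described in POSIX.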
Okay, that's definitely enough hype.
If your Perl success stories and testimonials may be of help to others who wish to advocate the use of Perl in their applications, or if you simply wish to express your gratitude to Larry and the Perl developers, please write to perl-thanks@perl.org.
"@INC" locations of perl libraries
a2p awk to perl translator
s2p sed to perl translator
http://www.perl.org/ the Perl homepage
http://www.perl.com/ Perl articles (O'Reilly)
http://www.cpan.org/ the Comprehensive Perl Archive
http://www.pm.org/ the Perl Mongers
See perldiag for explanations of all Perl's diagnostics. The "use diagnostics" pragma automatically turns Perl's normally terse warnings and errors into these longer forms.
Compilation errors will tell you the line number of the error, with an indication of the next token or token type that was to be examined. (In a script passed to Perl via -e switches, each -e is counted as one line.)
Setuid scripts have additional constraints that can produce error messages such as ``Insecure dependency''. See perlsec.
Did we mention that you should definitely consider using the -w switch?
Perl is at the mercy of your machine's definitions of various operations such as type casting, atof(), and floating-point output with sprintf().
If your stdio requires a seek or eof between reads and writes on a particular stream, so does Perl. (This doesn't apply to sysread() and syswrite().)
While none of the built-in data types have any arbitrary size limits (apart from memory size), there are still a few arbitrary limits: a given variable name may not be longer than 251 characters. Line numbers displayed by diagnostics are internally stored as short integers, so they are limited to a maximum of 65535 (higher numbers usually being affected by wraparound).
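To make the wraparound concrete, here is a small Python sketch. It assumes the counter behaves like an unsigned 16-bit integer, which the text implies but does not state outright; displayed_line is a hypothetical helper and not part of Perl.

```python
def displayed_line(actual_line):
    """Line number as it would be reported if it were stored in an
    unsigned 16-bit counter (an assumption made for this sketch)."""
    return actual_line % 65536

print(displayed_line(65535))  # still representable
print(displayed_line(70000))  # wraps around
```

Under that assumption, line 65535 is reported unchanged, while line 70000 would be reported as 4464.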
You may mail your bug reports (be sure to include full configuration information as output by the myconfig program in the perl source tree, or by "perl -V") to perlbug@perl.org. If you've succeeded in compiling perl, the perlbug script in the utils/ subdirectory can be used to help mail in a bug report.
Perl actually stands for Pathologically Eclectic Rubbish Lister, but don't tell anyone I said that.
The three principal virtues of a programmer are Laziness, Impatience, and Hubris. See the Camel Book for why.
A mutt configuration file consists of a series of "commands". Each line of the file may contain one or more commands. When multiple commands are used, they must be separated by a semicolon (";").
The hash mark, or pound sign ("#"), is used as a "comment" character. You can use it to annotate your initialization file. All text after the comment character to the end of the line is ignored.
Single quotes (') and double quotes (") can be used to quote strings which contain spaces or other special characters. The difference between the two types of quotes is similar to that of many popular shell programs, namely that a single quote is used to specify a literal string (one that is not interpreted for shell variables or quoting with a backslash [see next paragraph]), while double quotes indicate a string which should be evaluated. For example, backquotes are evaluated inside of double quotes, but not single quotes.
\ quotes the next character, just as in shells such as bash and zsh. For example, if you want to put quotes (") inside of a string, you can use "\" to force the next character to be a literal instead of an interpreted character.
"\\" means to insert a literal "\" into the line. "\n" and "\r" have their usual C meanings of linefeed and carriage-return, respectively.
A "\" at the end of a line can be used to split commands over multiple lines, provided that the split points don't appear in the middle of command names.
It is also possible to substitute the output of a Unix command in an initialization file. This is accomplished by enclosing the command in backquotes (`command`).
UNIX environment variables can be accessed the same way as in shells such as sh and bash: prepend a dollar sign ("$") to the name of the variable.
alias [-group name [...]] key address [, address [ ... ]]
unalias [ * | key ]
group [-group name] [-rx EXPR [ ... ]] [-addr address [ ... ]]
ungroup [-group name ] [ * | [[-rx EXPR [ ... ]] [-addr address [ ... ]]]
alternates [-group name] regexp [ , regexp [ ... ]]
unalternates [ * | regexp [ , regexp [ ... ]] ]
alternative_order type[/subtype] [ ... ]
unalternative_order [ * | type/subtype] [...]
auto_view type[/subtype] [ ... ]
unauto_view type[/subtype] [ ... ]
mime_lookup type[/subtype] [ ... ]
unmime_lookup type[/subtype] [ ... ]
color object foreground background [ regexp ]
color index foreground background [ pattern ]
uncolor index pattern [ pattern ... ]
mono object attribute [ regexp ]
mono index attribute [ pattern ]
lists [-group name] regexp [ regexp ... ]
unlists regexp [ regexp ... ]
subscribe [-group name] regexp [ regexp ... ]
unsubscribe regexp [ regexp ... ]
mailboxes filename [ filename ... ]
unmailboxes [ * | filename ... ]
my_hdr string
unmy_hdr field
set [no|inv|&|?]variable[=value] [ ... ]
toggle variable [ ... ]
unset variable [ ... ]
reset variable [ ... ]
In various places with mutt, including some of the abovementioned hook commands, you can specify patterns to match messages.
A simple pattern consists of an operator of the form "~character", possibly followed by a parameter against which mutt is supposed to match the object specified by this operator. For some characters, the ~ may be replaced by another character to alter the behavior of the match. These are described in the list of operators, below.
With some of these operators, the object to be matched consists of several e-mail addresses. In these cases, the object is matched if at least one of these e-mail addresses matches. You can prepend a hat ("^") character to such a pattern to indicate that all addresses must match in order to match the object.
You can construct complex patterns by combining simple patterns with logical operators. Logical AND is specified by simply concatenating two simple patterns, for instance "~C mutt-dev ~s bug". Logical OR is specified by inserting a vertical bar ("|") between two patterns, for instance "~C mutt-dev | ~s bug". Additionally, you can negate a pattern by prepending a bang ("!") character. For logical grouping, use parentheses ("()"). Example: "!(~t mutt|~c mutt) ~f elkins".
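The way simple patterns combine can be sketched in Python by modelling each simple pattern as a predicate over a message. Everything here is illustrative: the dictionary fields, the helper names, and the reading of ~t, ~c, and ~f as matches on the To, Cc, and From headers are assumptions made for the example, and this is not how mutt itself is implemented.

```python
import re

def field_matches(field, expr):
    """Predicate: does the regular expression expr match the given field?"""
    return lambda msg: re.search(expr, msg.get(field, "")) is not None

def all_of(*preds):   # logical AND (concatenation of simple patterns)
    return lambda msg: all(p(msg) for p in preds)

def any_of(*preds):   # logical OR (vertical bar)
    return lambda msg: any(p(msg) for p in preds)

def negate(pred):     # logical NOT (bang)
    return lambda msg: not pred(msg)

# Rough equivalent of the example pattern: !(~t mutt|~c mutt) ~f elkins
pattern = all_of(
    negate(any_of(field_matches("to", "mutt"), field_matches("cc", "mutt"))),
    field_matches("from", "elkins"),
)

# Hypothetical messages for demonstration.
direct = {"from": "elkins@example.org", "to": "alice@example.org", "cc": ""}
listed = {"from": "elkins@example.org", "to": "mutt-dev@mutt.org", "cc": ""}

print(pattern(direct))  # from elkins, not addressed to mutt
print(pattern(listed))  # from elkins, but addressed to mutt
```

The first message satisfies the pattern and the second does not, because the negated group rejects anything sent or copied to an address containing "mutt".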
Mutt understands the following simple patterns:
In the above, EXPR is a regular expression.
With the ~m, ~n, ~X, and ~z operators, you can also specify ranges in the forms <MAX, >MIN, MIN-, and -MAX.
The ~d and ~r operators are used to match date ranges, which are interpreted to be given in your local time zone.
A date is of the form DD[/MM[/[cc]YY]], that is, a two-digit date, optionally followed by a two-digit month, optionally followed by a year specification. Omitted fields default to the current month and year.
Mutt understands either two or four digit year specifications. When given a two-digit year, mutt will interpret values less than 70 as lying in the 21st century (i.e., "38" means 2038 and not 1938, and "00" is interpreted as 2000), and values greater than or equal to 70 as lying in the 20th century.
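The two-digit year rule can be written down as a short Python sketch. The function name expand_year is hypothetical, and treating the input as a string of exactly two or four digits is an assumption made to keep the example small.

```python
def expand_year(text):
    """Expand a two- or four-digit year string using the pivot described
    above: two-digit values below 70 fall in the 2000s, the rest in the 1900s."""
    if not text.isdigit() or len(text) not in (2, 4):
        raise ValueError("expected a two- or four-digit year: %r" % text)
    year = int(text)
    if len(text) == 4:
        return year
    return 2000 + year if year < 70 else 1900 + year

for sample in ("38", "00", "69", "70", "1999"):
    print(sample, expand_year(sample))
```

With this pivot, "69" becomes 2069 while "70" becomes 1970, which is where the Y2.07K remark comes from.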
Note that this behaviour is Y2K compliant, but that mutt does have a Y2.07K problem.
If a date range consists of a single date, the operator in question will match that precise date. If the date range consists of a dash ("-"), followed by a date, this range will match any date before and up to the date given. Similarly, a date followed by a dash matches the date given and any later point of time. Two dates, separated by a dash, match any date which lies in the given range of time.
You can also modify any absolute date by giving an error range. An error range consists of one of the characters +, -, *, followed by a positive number, followed by one of the unit characters y, m, w, or d, specifying a unit of years, months, weeks, or days. + increases the maximum date matched by the given interval of time, - decreases the minimum date matched by the given interval of time, and * increases the maximum date and decreases the minimum date matched by the given interval of time. It is possible to give multiple error margins, which cumulate. Example: 1/1/2001-1w+2w*3d
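As a worked example of cumulative error margins, the Python sketch below computes the window matched by 1/1/2001-1w+2w*3d. It is a simplified model rather than mutt's parser: date_window is a hypothetical name, only fully specified DD/MM/YYYY dates are accepted, and only the w and d units are handled because months and years need calendar arithmetic.

```python
import re
from datetime import date, timedelta

UNIT_DAYS = {"d": 1, "w": 7}  # y and m are deliberately left out of this sketch

def date_window(spec):
    """Return (earliest, latest) for an absolute date with error margins."""
    m = re.match(r"(\d{1,2})/(\d{1,2})/(\d{4})((?:[+*-]\d+[wd])*)$", spec)
    if m is None:
        raise ValueError("unsupported date specification: %r" % spec)
    day, month, year = int(m.group(1)), int(m.group(2)), int(m.group(3))
    earliest = latest = date(year, month, day)
    for sign, count, unit in re.findall(r"([+*-])(\d+)([wd])", m.group(4)):
        delta = timedelta(days=int(count) * UNIT_DAYS[unit])
        if sign in "+*":   # + and * push the maximum date later
            latest += delta
        if sign in "-*":   # - and * pull the minimum date earlier
            earliest -= delta
    return earliest, latest

earliest, latest = date_window("1/1/2001-1w+2w*3d")
print(earliest.isoformat(), latest.isoformat())
```

The margins accumulate to ten days before and seventeen days after 1 January 2001, so the window runs from 2000-12-22 to 2001-01-18.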
You can also specify offsets relative to the current date. An offset is specified as one of the characters <, >, =, followed by a positive number, followed by one of the unit characters y, m, w, or d. > matches dates which are older than the specified amount of time, an offset which begins with the character < matches dates which are more recent than the specified amount of time, and an offset which begins with the character = matches points of time which are precisely the given amount of time ago.
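The relative offsets can be sketched the same way. This Python example is a simplification under stated assumptions: offset_matches is a hypothetical helper, ages are compared at whole-day granularity, and only the w and d units are supported.

```python
import re
from datetime import date

UNIT_DAYS = {"d": 1, "w": 7}  # y and m are omitted from this sketch

def offset_matches(spec, message_date, today):
    """Test a message date against an offset such as '>2w', '<3d' or '=1w'."""
    m = re.fullmatch(r"([<>=])(\d+)([wd])", spec)
    if m is None:
        raise ValueError("unsupported offset: %r" % spec)
    age_days = (today - message_date).days
    limit = int(m.group(2)) * UNIT_DAYS[m.group(3)]
    if m.group(1) == ">":      # older than the given amount of time
        return age_days > limit
    if m.group(1) == "<":      # more recent than the given amount of time
        return age_days < limit
    return age_days == limit   # precisely that long ago

today = date(2005, 6, 15)         # hypothetical "current date"
message_date = date(2005, 6, 1)   # fourteen days earlier

for spec in (">1w", "<1w", "=2w"):
    print(spec, offset_matches(spec, message_date, today))
```

A message that is fourteen days old is older than one week, not more recent than one week, and exactly two weeks old.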
Type: quadoption Default: ask-yes
Type: quadoption Default: yes
Type: path Default: "~/.muttrc"
Type: string Default: "%4n %2f %t %-10a %r"
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: no
Type: string Default: ""
Type: string Default: ""
Type: string Default: "%u%D%I %t%4n %T%.40d%> [%.7m/%.10M, %.6e%?C?, %C?, %s] "
Type: string Default: "\n"
Type: boolean Default: yes
Type: string Default: "On %d, %n wrote:"
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: no
Type: quadoption Default: ask-yes
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: no
Type: string Default: ""
Type: boolean Default: yes
Type: boolean Default: yes
Type: boolean Default: no
Type: string Default: "-- Mutt: Compose [Approx. msg size: %l Atts: %a]%>-"
Type: string Default: ""
Type: boolean Default: yes
Type: boolean Default: yes
Type: number Default: 30
Type: string Default: "text/plain"
Type: quadoption Default: yes
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: yes
Type: string Default: "!%a, %b %d, %Y at %I:%M:%S%p %Z"
Type: string Default: "~f %s !~P | (~P ~C %s)"
Type: quadoption Default: ask-yes
Type: boolean Default: yes
Type: boolean Default: yes
Type: path Default: ""
Type: path Default: "/opt/mutt/bin/mutt_dotlock"
Type: string Default: ""
Type: string Default: ""
Type: boolean Default: yes
Type: boolean Default: no
Type: path Default: ""
Type: boolean Default: no
Type: e-mail address Default: ""
Type: string Default: "~"
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: no
Type: path Default: "~/Mail"
Type: string Default: "%2C %t %N %F %2l %-8.8u %-8.8g %8s %d %f"
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: yes
Type: quadoption Default: yes
Type: string Default: "[%a: %s]"
Type: boolean Default: no
Type: e-mail address Default: ""
Type: regular expression Default: "^[^,]*"
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: yes
Type: number Default: 10
Type: path Default: "~/.mutthistory"
Type: quadoption Default: yes
Type: string Default: ""
Type: boolean Default: no
Type: boolean Default: no
Type: string Default: ""
Type: boolean Default: no
Type: string Default: "/."
Type: string Default: ""
Type: boolean Default: no
Type: number Default: 900
Type: boolean Default: no
Type: string Default: ""
Type: string Default: ""
Type: boolean Default: yes
Type: boolean Default: yes
Type: boolean Default: yes
Type: string Default: ""
Type: boolean Default: no
Type: quadoption Default: ask-yes
Type: boolean Default: no
Type: string Default: "> "
Type: string Default: "%4C %Z %{%b %d} %-15.15L (%?l?%4l&%4c?) %s"
Type: path Default: "/usr/bin/ispell"
Type: boolean Default: no
Type: string Default: "C"
Type: number Default: 5
Type: string Default: ""
Type: boolean Default: yes
Type: path Default: ""
Type: boolean Default: yes
Type: string Default: "16384"
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: yes
Type: regular expression Default: "!^\.[^.]"
Type: path Default: "~/mbox"
Type: folder magic Default: mbox
Type: boolean Default: no
Type: number Default: 0
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: no
Type: string Default: "flagged"
Type: string Default: "replied"
Type: string Default: "unseen"
Type: quadoption Default: no
Type: boolean Default: no
Type: quadoption Default: yes
Type: string Default: "%4n %c %-16s %a"
Type: path Default: "mixmaster"
Type: quadoption Default: ask-no
Type: path Default: ""
Type: boolean Default: no
Type: string Default: "%s"
Type: boolean Default: no
Type: number Default: 10
Type: path Default: "builtin"
Type: number Default: 0
Type: string Default: "-%Z- %C/%m: %-20.20n %s%* -- (%P)"
Type: number Default: 0
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: no
Type: quadoption Default: yes
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: yes
Type: string Default: "%4n %t%f %4l/0x%k %-4a %2c %u"
Type: regular expression Default: ""
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: yes
Type: string Default: ""
Type: boolean Default: yes
Type: number Default: 300
Type: sort order Default: address
Type: quadoption Default: ask-yes
Type: boolean Default: no
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: boolean Default: yes
Type: number Default: 300
Type: string Default: ""
Type: path Default: ""
Type: path Default: ""
Type: path Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: path Default: ""
Type: boolean Default: no
Type: quadoption Default: yes
Type: path Default: "~/.mutt_certificates"
Type: boolean Default: yes
Type: path Default: ""
Type: boolean Default: yes
Type: boolean Default: yes
Type: boolean Default: yes
Type: number Default: 0
Type: path Default: ""
Type: boolean Default: no
Type: boolean Default: no
Type: string Default: "\n"
Type: string Default: ""
Type: boolean Default: yes
Type: number Default: 60
Type: quadoption Default: ask-no
Type: string Default: ""
Type: boolean Default: no
Type: quadoption Default: ask-yes
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: quadoption Default: ask-yes
Type: path Default: "~/postponed"
Type: string Default: ""
Type: quadoption Default: ask-no
Type: path Default: "lpr"
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: yes
Type: path Default: ""
Type: string Default: "%4c %t %-25.25a %-25.25n %?e?(%e)?"
Type: quadoption Default: yes
Type: regular expression Default: "^([ \t]*[|>:}#])+"
Type: number Default: 10
Type: boolean Default: no
Type: string Default: ""
Type: quadoption Default: ask-yes
Type: path Default: "~/sent"
Type: regular expression Default: "^(re([\[0-9\]+])*|aw):[ \t]*"
Type: boolean Default: no
Type: quadoption Default: ask-yes
Type: boolean Default: yes
Type: boolean Default: no
alias juser abd30425@somewhere.net (Joe User)
From: abd30425@somewhere.net
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: yes
Type: number Default: 0
Type: boolean Default: no
Type: boolean Default: yes
Type: number Default: -1
Type: number Default: 9999
Type: number Default: -1
Type: string Default: "us-ascii:iso-8859-1:utf-8"
Type: path Default: "/usr/sbin/sendmail -oem -oi"
Type: number Default: 0
Type: path Default: ""
Type: boolean Default: yes
Type: boolean Default: no
Type: path Default: "~/.signature"
Type: string Default: "~f %s | ~s %s"
Type: boolean Default: yes
Type: regular expression Default: "(>From )|(:[-^]?[][)(><}{|/DP])"
Type: number Default: 1
Type: string Default: ""
Type: string Default: ""
Type: string Default: ""
Type: sort order Default: date
date or date-sent
date-received
from
mailbox-order (unsorted)
score
size
spam
subject
threads
to
Type: sort order Default: alias
address (sort alphabetically by email address)
alias (sort alphabetically by alias name)
unsorted (leave in order specified in .muttrc)
Type: sort order Default: date
Type: sort order Default: alpha
alpha (alphabetically)
date
size
unsorted
Type: boolean Default: yes
Type: string Default: ","
Type: path Default: ""
Type: string Default: "-*%A"
Type: string Default: "-%r-Mutt: %f [Msgs:%?M?%M/?%m%?n? New:%n?%?o? Old:%o?%?d? Del:%d?%?F? Flag:%F?%?t? Tag:%t?%?p? Post:%p?%?b? Inc:%b?%?l? %l?]---(%s/%S)-%>-(%P)---"
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: no
Type: boolean Default: no
Type: number Default: 0
Type: number Default: 600
Type: path Default: ""
Type: string Default: " +TCFL"
Type: string Default: ""
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: no
Type: boolean Default: yes
Type: boolean Default: yes
Type: boolean Default: yes
Type: boolean Default: yes
Type: path Default: ""
Type: boolean Default: yes
Type: boolean Default: yes
Type: number Default: 0
Type: boolean Default: yes
Type: number Default: 0
Type: number Default: 10
Type: boolean Default: yes
iconv(1), iconv(3), mailcap(5), maildir(5), mbox(5), mutt(1), printf(3), regex(7), strftime(3)
The Mutt Manual
The Mutt home page: http://www.mutt.org/
Michael Elkins, and others. Use <mutt-dev@mutt.org> to contact the developers.
mutt [-nRyzZ] [-e cmd] [-F file] [-m type] [-f file]
mutt [-nx] [-e cmd] [-F file] [-H file] [-i file] [-s subj] [-b addr] [-c addr] [-a file [...]] [--] addr [...]
mutt [-n] [-e cmd] [-F file] -p
mutt [-n] [-e cmd] [-F file] -A alias
mutt [-n] [-e cmd] [-F file] -Q query
mutt -v[v]
Mutt is a small but very powerful text-based program for reading and sending electronic mail under Unix operating systems, including support for color terminals, MIME, OpenPGP, and a threaded sorting mode.
None. Mutts have fleas, not bugs.
Suspend/resume while editing a file with an external editor does not work under SunOS 4.x if you use the curses lib in /usr/5lib. It does work with the S-Lang library, however.
Resizing the screen while using an external pager causes Mutt to go haywire on some systems.
Suspend/resume does not work under Ultrix.
The help line for the index menu is not updated if you change the bindings for one of the functions listed while Mutt is running.
For a more up-to-date list of bugs, errm, fleas, please visit the mutt project's bug tracking system under http://bugs.mutt.org/.
curses(3), mailcap(5), maildir(5), mbox(5), mutt_dotlock(1), muttrc(5), ncurses(3), sendmail(1), smail(1).
Mutt Home Page: http://www.mutt.org/
Michael Elkins, and others. Use <mutt-dev@mutt.org> to contact the developers.
Grep searches the named input FILEs (or standard input if no files are named, or the file name - is given) for lines containing a match to the given PATTERN. By default, grep prints the matching lines.
In addition, two variant programs egrep and fgrep are available. Egrep is the same as grep -E. Fgrep is the same as grep -F.
A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions.
Grep understands two different versions of regular expression syntax: ``basic'' and ``extended.'' In GNU grep, there is no difference in available functionality using either syntax. In other implementations, basic regular expressions are less powerful. The following description applies to extended regular expressions; differences for basic regular expressions are summarized afterwards.
The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash.
A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of the list is the caret ^ then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit.
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.
Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their names are self explanatory, and they are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. For example, [[:alnum:]] means [0-9A-Za-z], except the latter form depends upon the C locale and the ASCII character encoding, whereas the former is independent of locale and character set. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket list.) Most metacharacters lose their special meaning inside lists. To include a literal ] place it first in the list. Similarly, to include a literal ^ place it anywhere but first. Finally, to include a literal - place it last.
The period . matches any single character. The symbol \w is a synonym for [[:alnum:]] and \W is a synonym for [^[:alnum:]].
The caret ^ and the dollar sign $ are metacharacters that respectively match the empty string at the beginning and end of a line. The symbols \< and \> respectively match the empty string at the beginning and end of a word. The symbol \b matches the empty string at the edge of a word, and \B matches the empty string provided it's not at the edge of a word.
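The anchors can be tried out with a short Python sketch. Python's re module is used here only as a stand-in for grep: it shares ^, $, \b, and \B, but it has no \< or \> operators and its notion of a word character also includes the underscore. The helper name match_starts and the sample line are made up for the example.

```python
import re

def match_starts(pattern, line):
    """Return the start offset of every match of pattern in line."""
    return [m.start() for m in re.finditer(pattern, line)]

line = "the cat concatenates"

print(match_starts(r"^the", line))     # anchored at the beginning of the line
print(match_starts(r"s$", line))       # anchored at the end of the line
print(match_starts(r"\bcat\b", line))  # "cat" with a word edge on both sides
print(match_starts(r"\Bcat\B", line))  # "cat" with no word edge on either side
```

The whole-word search finds only the standalone "cat" at offset 4, while the \B form finds only the occurrence buried inside "concatenates" at offset 11.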
A regular expression may be followed by one of several repetition operators:
Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions.
Two regular expressions may be joined by the infix operator |; the resulting regular expression matches any string matching either subexpression.
Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
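As a rough illustration of these precedence rules, the sketch below matches whole lines (-x) against extended regular expressions (-E). The helper name match and the sample words are hypothetical.

```shell
sample='ab
abab
abbb
cd
abd
acd'

# Print the sample lines that match the given extended regex in full, space-separated.
match() {
    printf '%s\n' "$sample" | grep -E -x "$1" | paste -s -d ' ' -
}

rep=$(match 'ab+')        # repetition binds to b only: a followed by one or more b
grp=$(match '(ab)+')      # parentheses make the repetition apply to the pair ab
alt=$(match 'ab|cd')      # alternation binds loosest: either ab or cd
par=$(match 'a(b|c)d')    # parentheses confine the alternation to the middle character

printf '%s\n' "$rep" "$grp" "$alt" "$par"
```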
The backreference \n, where n is a single digit, matches the substring previously matched by the nth parenthesized subexpression of the regular expression.
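A small sketch of backreferences in basic regular expressions follows; the input words are arbitrary.

```shell
# \(.\)\1 matches any character immediately followed by a copy of itself.
doubled=$(printf 'hello\nworld\nbook\n' | grep -c '\(.\)\1')

# ^\(.*\)-\1$ matches lines whose two halves around the hyphen are identical.
mirrored=$(printf 'abc-abc\nabc-abd\n' | grep '^\(.*\)-\1$')

echo "$doubled $mirrored"
```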
In basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
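The difference between the two syntaxes can be seen with an interval expression. This is a sketch using made-up input; only the interval operator is shown because its backslashed form is the one defined for basic regular expressions by POSIX.

```shell
sample='aa
a{2}'

# Basic syntax: an unescaped { is an ordinary character, so this finds the literal text a{2}.
bre_literal=$(printf '%s\n' "$sample" | grep 'a{2}')

# Basic syntax: the backslashed braces form an interval, matching two consecutive a's.
bre_interval=$(printf '%s\n' "$sample" | grep 'a\{2\}')

# Extended syntax (-E): the unescaped braces already form an interval.
ere_interval=$(printf '%s\n' "$sample" | grep -E 'a{2}')

printf '%s\n' "$bre_literal" "$bre_interval" "$ere_interval"
```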
Traditional egrep did not support the { metacharacter, and some egrep implementations support \{ instead, so portable scripts should avoid { in egrep patterns and should use [{] to match a literal {.
GNU egrep attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. For example, the shell command egrep '{1' searches for the two-character string {1 instead of reporting a syntax error in the regular expression. POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.
A locale LC_foo is specified by examining the three environment variables LC_ALL, LC_foo, LANG, in that order. The first of these variables that is set specifies the locale. For example, if LC_ALL is not set, but LC_MESSAGES is set to pt_BR, then Brazilian Portuguese is used for the LC_MESSAGES locale. The C locale is used if none of these environment variables are set, or if the locale catalog is not installed, or if grep was not compiled with national language support (NLS).
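The lookup order can be sketched as a small shell function. The name effective_locale is hypothetical, and the sketch assumes that an empty variable counts as unset; it only mimics the precedence and does not check whether the locale is installed.

```shell
# Print the locale that would be chosen for the category named in $1 (e.g. LC_MESSAGES).
effective_locale() {
    category=$1
    # Indirectly read the category variable; empty if it is unset.
    eval "category_value=\${$category-}"
    if [ -n "${LC_ALL-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    elif [ -n "${LANG-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' C
    fi
}

echo "LC_MESSAGES resolves to: $(effective_locale LC_MESSAGES)"
```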
Normally, exit status is 0 if selected lines are found and 1 otherwise. But the exit status is 2 if an error occurred, unless the -q or --quiet or --silent option is used and a selected line is found.
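A script can branch on these exit statuses. The sketch below records each status in a variable; the file name /nonexistent/input is a placeholder assumed not to exist.

```shell
# A selected line is found: status 0.
printf 'alpha\nbeta\n' | grep -q 'beta' && found=0 || found=$?

# No selected line: status 1.
printf 'alpha\nbeta\n' | grep -q 'gamma' && missing=0 || missing=$?

# An error (unreadable input file): status 2.
grep 'x' /nonexistent/input 2>/dev/null && unreadable=0 || unreadable=$?

echo "$found $missing $unreadable"
```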
Email bug reports to bug-grep@gnu.org.
Large repetition counts in the {n,m} construct may cause grep to use lots of memory. In addition, certain other obscure regular expressions require exponential time and space, and may cause grep to run out of memory.
Backreferences are very slow, and may require exponential time.
If no -e, --expression, -f, or --file option is given, then the first non-option argument is taken as the sed script to interpret. All remaining arguments are names of input files; if no input files are specified, then the standard input is read.
After the address (or address-range), and before the command, a ! may be inserted, which specifies that the command shall only be executed if the address (or address-range) does not match.
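As an illustration of the ! modifier, the following sketch negates first a regular-expression address and then a line-number range. The input lines are invented.

```shell
# Print only the lines that do NOT start with #.
non_comments=$(printf '# comment\nalpha\nbeta\n' | sed -n '/^#/!p')

# Delete every line that is NOT in the range 2,3, which keeps lines 2 and 3.
middle=$(printf '1\n2\n3\n4\n' | sed '2,3!d')

printf '%s\n' "$non_comments" "$middle"
```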
The following address types are supported:
GNU sed also supports some special 2-address forms:
E-mail bug reports to bonzini@gnu.org. Be sure to include the word ``sed'' somewhere in the ``Subject:'' field. Also, please include the output of ``sed --version'' in the body of your report if at all possible.
The full documentation for sed is maintained as a Texinfo manual. If the info and sed programs are properly installed at your site, the command
info sed
should give you access to the complete manual.
pgawk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
pgawk [ POSIX or GNU style options ] [ -- ] program-text file ...
Pgawk is the profiling version of gawk. It is identical in every way to gawk, except that programs run more slowly, and it automatically produces an execution profile in the file awkprof.out when done. See the --profile option, below.
The command line consists of options to gawk itself, the AWK program text (if not supplied via the -f or --file options), and values to be made available in the ARGC and ARGV pre-defined AWK variables.
Gawk options may be either traditional POSIX one letter options, or GNU style long options. POSIX options start with a single ``-'', while long options start with ``--''. Long options are provided for both GNU-specific features and for POSIX-mandated features.
Following the POSIX standard, gawk-specific options are supplied via arguments to the -W option. Multiple -W options may be supplied. Each -W option has a corresponding long option, as detailed below. Arguments to long options are either joined with the option by an = sign, with no intervening spaces, or they may be provided in the next command line argument. Long options may be abbreviated, as long as the abbreviation remains unique.
Gawk accepts the following options, listed alphabetically.
Having a list of all the global variables is a good way to look for typographical errors in your programs. You would also use this option if you have a large program with a lot of functions, and you want to be sure that your functions don't inadvertently use global variables that you meant to be local. (This is a particularly easy mistake to make with simple variable names like i, j, and so on.)
In compatibility mode, any other options are flagged as invalid, but are otherwise ignored. In normal operation, as long as program text has been supplied, unknown options are passed on to the AWK program in the ARGV array for processing. This is particularly useful for running AWK programs via the ``#!'' executable interpreter mechanism.
An AWK program consists of a sequence of pattern-action statements and optional function definitions.
pattern { action statements }
function name(parameter list) { statements }
Gawk first reads the program source from the program-file(s) if specified, from arguments to --source, or from the first non-option argument on the command line. The -f and --source options may be used multiple times on the command line. Gawk reads the program text as if all the program-files and command line source texts had been concatenated together. This is useful for building libraries of AWK functions, without having to include them in each new AWK program that uses them. It also provides the ability to mix library functions with command line programs.
The environment variable AWKPATH specifies a search path to use when finding source files named with the -f option. If this variable does not exist, the default path is ".:/usr/local/share/awk". (The actual directory may vary, depending upon how gawk was built and installed.) If a file name given to the -f option contains a ``/'' character, no path search is performed.
Gawk executes AWK programs in the following order. First, all variable assignments specified via the -v option are performed. Next, gawk compiles the program into an internal form. Then, gawk executes the code in the BEGIN block(s) (if any), and then proceeds to read each file named in the ARGV array. If there are no files named on the command line, gawk reads the standard input.
If a filename on the command line has the form var=val it is treated as a variable assignment. The variable var will be assigned the value val. (This happens after any BEGIN block(s) have been run.) Command line variable assignment is most useful for dynamically assigning values to the variables AWK uses to control how input is broken into fields and records. It is also useful for controlling state if multiple passes are needed over a single data file.
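The timing difference between -v and a var=val argument can be observed with a short sketch. The variable names early and late are arbitrary, and the command assumes that awk on the PATH is gawk or another POSIX-compatible awk.

```shell
# early is assigned with -v, before BEGIN runs; late is assigned as a command-line
# argument, only when awk reaches it in ARGV, which is after BEGIN.
order=$(echo 'record' | awk -v early=set '
    BEGIN { print "BEGIN: early=" early ", late=" late }
          { print "main: early=" early ", late=" late }
' late=set)

printf '%s\n' "$order"
```

Because the only operand is an assignment, awk still reads its records from the standard input.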
If the value of a particular element of ARGV is empty (""), gawk skips over it.
For each record in the input, gawk tests to see if it matches any pattern in the AWK program. For each pattern that the record matches, the associated action is executed. The patterns are tested in the order they occur in the program.
Finally, after all the input is exhausted, gawk executes the code in the END block(s) (if any).
As each input record is read, gawk splits the record into fields, using the value of the FS variable as the field separator. If FS is a single character, fields are separated by that character. If FS is the null string, then each individual character becomes a separate field. Otherwise, FS is expected to be a full regular expression. In the special case that FS is a single space, fields are separated by runs of spaces and/or tabs and/or newlines. (But see the discussion of --posix, below). NOTE: The value of IGNORECASE (see below) also affects how fields are split when FS is a regular expression, and how records are separated when RS is a regular expression.
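The three main cases of field splitting can be sketched as follows, using invented input and plain awk.

```shell
# FS is a single character: fields are separated by that character.
by_colon=$(echo 'a:b:c' | awk -F: '{ print NF, $2 }')

# FS is the default single space: runs of blanks separate fields, and
# leading and trailing blanks are ignored.
by_blank=$(echo '  a   b  ' | awk '{ print NF, $1 }')

# FS is longer than one character: it is treated as a regular expression.
by_regex=$(echo 'a1b22c' | awk -F'[0-9]+' '{ print NF, $3 }')

printf '%s\n' "$by_colon" "$by_blank" "$by_regex"
```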
If the FIELDWIDTHS variable is set to a space separated list of numbers, each field is expected to have fixed width, and gawk splits up the record using the specified widths. The value of FS is ignored. Assigning a new value to FS overrides the use of FIELDWIDTHS, and restores the default behavior.
Each field in the input record may be referenced by its position, $1, $2, and so on. $0 is the whole record. Fields need not be referenced by constants:
n = 5
print $n
prints the fifth field in the input record.
The variable NF is set to the total number of fields in the input record.
References to non-existent fields (i.e. fields after $NF) produce the null-string. However, assigning to a non-existent field (e.g., $(NF+2) = 5) increases the value of NF, creates any intervening fields with the null string as their value, and causes the value of $0 to be recomputed, with the fields being separated by the value of OFS. References to negative numbered fields cause a fatal error. Decrementing NF causes the values of fields past the new value to be lost, and the value of $0 to be recomputed, with the fields being separated by the value of OFS.
Assigning a value to an existing field causes the whole record to be rebuilt when $0 is referenced. Similarly, assigning a value to $0 causes the record to be resplit, creating new values for the fields.
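A brief sketch of these field rules, with OFS set to a hyphen so the rebuilt record is easy to recognize:

```shell
# Referencing a field past NF yields the null string.
beyond=$(echo 'a b c' | awk '{ print "[" $7 "]" }')

# Assigning past NF raises NF, creates empty intervening fields, and rebuilds $0 with OFS.
grown=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $5 = "e"; print NF ": " $0 }')

# Assigning to an existing field also rebuilds $0 with OFS.
changed=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $2 = "X"; print }')

# Assigning to $0 resplits the record into new fields.
resplit=$(echo 'a b c' | awk '{ $0 = "x y"; print NF, $2 }')

printf '%s\n' "$beyond" "$grown" "$changed" "$resplit"
```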
Gawk's built-in variables are:
Thus, if IGNORECASE is not equal to zero, /aB/ matches all of the strings "ab", "aB", "Ab", and "AB". As with all AWK variables, the initial value of IGNORECASE is zero, so all regular expression and string operations are normally case-sensitive. Under Unix, the full ISO 8859-1 Latin-1 character set is used when ignoring case. As of gawk 3.1.4, the case equivalencies are fully locale-aware, based on the C <ctype.h> facilities such as isalpha(), and toupper().
Arrays are subscripted with an expression between square brackets ([ and ]). If the expression is an expression list (expr, expr ...) then the array subscript is a string consisting of the concatenation of the (string) value of each expression, separated by the value of the SUBSEP variable. This facility is used to simulate multiply dimensioned arrays. For example:
i = "A"; j = "B"; k = "C"
x[i, j, k] = "hello, world\n"
assigns the string "hello, world\n" to the element of the array x which is indexed by the string "A\034B\034C". All arrays in AWK are associative, i.e. indexed by string values.
The special operator in may be used in an if or while statement to see if an array has an index consisting of a particular value.
if (val in array)
print array[val]
If the array has multiple subscripts, use (i, j) in array.
The in construct may also be used in a for loop to iterate over all the elements of an array.
An element may be deleted from an array using the delete statement. The delete statement may also be used to delete the entire contents of an array, just by specifying the array name without a subscript.
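The array facilities described above can be combined in one short sketch. The names x, key, and part are illustrative only.

```shell
array_demo=$(awk 'BEGIN {
    i = "A"; j = "B"; k = "C"
    x[i, j, k] = "hello, world"

    # Membership test for a multiple subscript.
    if ((i, j, k) in x) print "found"

    # The real index is one string; split it on SUBSEP to recover the parts.
    for (key in x) {
        n = split(key, part, SUBSEP)
        print n, part[1] part[2] part[3]
    }

    # Remove the element, then confirm that it is gone.
    delete x[i, j, k]
    if (!((i, j, k) in x)) print "deleted"
}')

printf '%s\n' "$array_demo"
```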
Variables and fields may be (floating point) numbers, or strings, or both. How the value of a variable is interpreted depends upon its context. If used in a numeric expression, it will be treated as a number; if used as a string, it will be treated as a string.
To force a variable to be treated as a number, add 0 to it; to force it to be treated as a string, concatenate it with the null string.
When a string must be converted to a number, the conversion is accomplished using strtod(3). A number is converted to a string by using the value of CONVFMT as a format string for sprintf(3), with the numeric value of the variable as the argument. However, even though all numbers in AWK are floating-point, integral values are always converted as integers. Thus, given
CONVFMT = "%2.2f"
a = 12
b = a ""
the variable b has a string value of "12" and not "12.00".
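The conversion rule can be checked with a quick sketch that converts one integral and one non-integral value under the same CONVFMT; the variables c and d are additions for contrast.

```shell
converted=$(awk 'BEGIN {
    CONVFMT = "%2.2f"
    a = 12;      b = a ""    # integral value: converted as an integer
    c = 3.14159; d = c ""    # non-integral value: converted through CONVFMT
    print b
    print d
}')

printf '%s\n' "$converted"
```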
Gawk performs comparisons as follows: If two variables are numeric, they are compared numerically. If one value is numeric and the other has a string value that is a ``numeric string,'' then comparisons are also done numerically. Otherwise, the numeric value is converted to a string and a string comparison is performed. Two strings are compared, of course, as strings. Note that the POSIX standard applies the concept of ``numeric string'' everywhere, even to string constants. However, this is clearly incorrect, and gawk does not do this. (Fortunately, this is fixed in the next version of the standard.)
Note that string constants, such as "57", are not numeric strings, they are string constants. The idea of ``numeric string'' only applies to fields, getline input, FILENAME, ARGV elements, ENVIRON elements and the elements of an array created by split() that are numeric strings. The basic idea is that user input, and only user input, that looks numeric, should be treated that way.
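A sketch of how the comparison rules play out: the same two values compare differently depending on whether they arrive as input fields or as string constants. Each command prints 1 for true and 0 for false.

```shell
# Fields that look numeric are numeric strings: 10 > 9 is compared numerically.
field_cmp=$(echo '10 9' | awk '{ r = ($1 > $2); print r }')

# String constants are compared as strings: "10" sorts before "9".
const_cmp=$(awk 'BEGIN { r = ("10" > "9"); print r }')

# Adding 0 forces numeric treatment of string values.
forced_cmp=$(awk 'BEGIN { a = "10"; b = "9"; r = (a + 0 > b + 0); print r }')

echo "$field_cmp $const_cmp $forced_cmp"
```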
Uninitialized variables have the numeric value 0 and the string value "" (the null, or empty, string).
String constants in AWK are sequences of characters enclosed between double quotes ("). Within strings, certain escape sequences are recognized, as in C. These are:
The escape sequences may also be used inside constant regular expressions (e.g., /[ \t\f\n\r\v]/ matches whitespace characters).
In compatibility mode, the characters represented by octal and hexadecimal escape sequences are treated literally when used in regular expression constants. Thus, /a\52b/ is equivalent to /a\*b/.
{ print }
which prints the entire record.
Comments begin with the ``#'' character, and continue until the end of the line. Blank lines may be used to separate statements. Normally, a statement ends with a newline; however, this is not the case for lines ending in a ``,'', {, ?, :, &&, or ||. Lines ending in do or else also have their statements automatically continued on the following line. In other cases, a line can be continued by ending it with a ``\'', in which case the newline will be ignored.
Multiple statements may be put on one line by separating them with a ``;''. This applies to both the statements within the action part of a pattern-action pair (the usual case), and to the pattern-action statements themselves.
BEGIN
END
/regular expression/
relational expression
pattern && pattern
pattern || pattern
pattern ? pattern : pattern
(pattern)
! pattern
pattern1, pattern2
BEGIN and END are two special kinds of patterns which are not tested against the input. The action parts of all BEGIN patterns are merged as if all the statements had been written in a single BEGIN block. They are executed before any of the input is read. Similarly, all the END blocks are merged, and executed when all the input is exhausted (or when an exit statement is executed). BEGIN and END patterns cannot be combined with other patterns in pattern expressions. BEGIN and END patterns cannot have missing action parts.
For /regular expression/ patterns, the associated statement is executed for each input record that matches the regular expression. Regular expressions are the same as those in egrep(1), and are summarized below.
A relational expression may use any of the operators defined below in the section on actions. These generally test whether certain fields match certain regular expressions.
The &&, ||, and ! operators are logical AND, logical OR, and logical NOT, respectively, as in C. They do short-circuit evaluation, also as in C, and are used for combining more primitive pattern expressions. As in most languages, parentheses may be used to change the order of evaluation.
The ?: operator is like the same operator in C. If the first pattern is true then the pattern used for testing is the second pattern, otherwise it is the third. Only one of the second and third patterns is evaluated.
The pattern1, pattern2 form of an expression is called a range pattern. It matches all input records starting with a record that matches pattern1, and continuing until a record that matches pattern2, inclusive. It does not combine with any other sort of pattern expression.
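As a sketch of two of these pattern forms, the commands below use a range pattern and a logical combination. With no action given, awk prints each matching record; the input lines are invented.

```shell
lines='one
start
two
end
three'

# Range pattern: from the record matching /start/ through the record matching /end/, inclusive.
in_range=$(printf '%s\n' "$lines" | awk '/start/, /end/')

# Logical combination: records that contain t but do not contain start.
combined=$(printf '%s\n' "$lines" | awk '/t/ && !/start/')

printf '%s\n' "$in_range" "$combined"
```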
Interval expressions are only available if either --posix or --re-interval is specified on the command line.
The escape sequences that are valid in string constants (see below) are also valid in regular expressions.
Character classes are a new feature introduced in the POSIX standard. A character class is a special notation for describing lists of characters that have a specific attribute, but where the actual characters themselves can vary from country to country and/or from character set to character set. For example, the notion of what is an alphabetic character differs in the USA and in France.
A character class is only valid in a regular expression inside the brackets of a character list. Character classes consist of [:, a keyword denoting the class, and :]. The character classes defined by the POSIX standard are:
For example, before the POSIX standard, to match alphanumeric characters, you would have had to write /[A-Za-z0-9]/. If your character set had other alphabetic characters in it, this would not match them, and if your character set collated differently from ASCII, this might not even match the ASCII alphanumeric characters. With the POSIX character classes, you can write /[[:alnum:]]/, and this matches the alphabetic and numeric characters in your character set.
Two additional special sequences can appear in character lists. These apply to non-ASCII character sets, which can have single symbols (called collating elements) that are represented with more than one character, as well as several characters that are equivalent for collating, or sorting, purposes. (E.g., in French, a plain ``e'' and a grave-accented ``è'' are equivalent.)
These features are very valuable in non-English speaking locales. The library functions that gawk uses for regular expression matching currently only recognize POSIX character classes; they do not recognize collating symbols or equivalence classes.
The \y, \B, \<, \>, \w, \W, \`, and \' operators are specific to gawk; they are extensions based on facilities in the GNU regular expression libraries.
The various command line options control how gawk interprets characters in regular expressions.
The operators in AWK, in order of decreasing precedence, are
The control statements are as follows:
if (condition) statement [ else statement ]
while (condition) statement
do statement while (condition)
for (expr1; expr2; expr3) statement
for (var in array) statement
break
continue
delete array[index]
delete array
exit [ expression ]
{ statements }
The input/output statements are as follows:
Additional output redirections are allowed for print and printf.
The getline command returns 1 on success, 0 on end of file, and -1 on an error. Upon an error, ERRNO contains a string describing the problem.
NOTE: If using a pipe or co-process to getline, or from print or printf within a loop, you must use close() to create new instances of the command. AWK does not automatically close pipes or co-processes when they return EOF.
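The need for close() can be demonstrated with a small sketch that reads the same command three times. The command string and the counter names are arbitrary.

```shell
reads=$(awk 'BEGIN {
    cmd = "echo one; echo two"

    # First pass: reads both lines, then hits end of file.
    while ((cmd | getline line) > 0) first++

    # Second pass without close(): the pipe is still at end of file, so nothing is read.
    while ((cmd | getline line) > 0) again++

    # After close(), the same string starts a new instance of the command.
    close(cmd)
    while ((cmd | getline line) > 0) reopened++

    print first, again + 0, reopened
}')

echo "$reads"
```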
The AWK versions of the printf statement and sprintf() function (see below) accept the following conversion specification formats:
NOTE: When using the integer format-control letters for values that are outside the range of a C long integer, gawk switches to the %g format specifier. If --lint is provided on the command line gawk warns about this. Other versions of awk may print invalid values or do something else entirely.
Optional, additional parameters may lie between the % and the control letter:
The dynamic width and prec capabilities of the ANSI C printf() routines are supported. A * in place of either the width or prec specifications causes their values to be taken from the argument list to printf or sprintf(). To use a positional specifier with a dynamic width or precision, supply the count$ after the * in the format string. For example, "%3$*2$.*1$s".
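A sketch of dynamic width and precision follows. It assumes an awk whose printf accepts * (gawk does); the positional form with count$ is specific to gawk and is not exercised here.

```shell
# %*d   takes its width (5) from the argument list, then formats 42.
# %-*s  takes its width (4) from the argument list, then left-justifies "ab".
# %.*f  takes its precision (2) from the argument list, then formats 3.14159.
formatted=$(awk 'BEGIN { printf "%*d|%-*s|%.*f\n", 5, 42, 4, "ab", 2, 3.14159 }')

echo "$formatted"
```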
When doing I/O redirection from either print or printf into a file, or via getline from a file, gawk recognizes certain special filenames internally. These filenames allow access to open file descriptors inherited from gawk's parent process (usually the shell). These file names may also be used on the command line to name data files. The filenames are:
These are particularly useful for error messages. For example:
print "You blew it!" > "/dev/stderr"
whereas you would otherwise have to use
print "You blew it!" | "cat 1>&2"
The following special filenames may be used with the |& co-process operator for creating TCP/IP network connections.
Other special filenames provide access to information about the running gawk process. These filenames are now obsolete. Use the PROCINFO array to obtain the information they provide. The filenames are:
AWK has the following built-in arithmetic functions:
Gawk has the following built-in string functions:
The default domain is the value of TEXTDOMAIN. If directory is the null string (""), then bindtextdomain() returns the current binding for the given domain.
If you supply a value for category, it must be a string equal to one of the known locale categories described in GAWK: Effective AWK Programming. You must also supply a text domain. Use TEXTDOMAIN if you want to use the current domain.
Functions are executed when they are called from within expressions in either patterns or actions. Actual parameters supplied in the function call are used to instantiate the formal parameters declared in the function. Arrays are passed by reference, other variables are passed by value.
Since functions were not originally part of the AWK language, the provision for local variables is rather clumsy: They are declared as extra parameters in the parameter list. The convention is to separate local variables from real parameters by extra spaces in the parameter list. For example:
function f(p, q, a, b) # a and b are local
{
...
}
/abc/ { ... ; f(1, 2) ; ... }
The left parenthesis in a function call is required to immediately follow the function name, without any intervening white space. This is to avoid a syntactic ambiguity with the concatenation operator. This restriction does not apply to the built-in functions listed above.
Functions may call each other and may be recursive. Function parameters used as local variables are initialized to the null string and the number zero upon function invocation.
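The parameter-passing rules and the local-variable convention can be seen together in a short sketch. The function names sum_to and fill are made up for illustration.

```shell
fn_demo=$(awk '
# i and total are locals, declared as extra parameters after the extra spaces.
function sum_to(n,    i, total) {
    for (i = 1; i <= n; i++) total += i    # total starts out as zero
    return total
}

# arr is passed by reference, so the caller sees the new elements.
function fill(arr, n,    i) {
    for (i = 1; i <= n; i++) arr[i] = i * i
}

BEGIN {
    i = 99               # global i is untouched by the local i in the functions
    fill(squares, 3)
    print sum_to(4), i, squares[3]
}')

echo "$fn_demo"
```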
Use return expr to return a value from a function. The return value is undefined if no value is provided, or if the function returns by ``falling off'' the end.
If --lint has been provided, gawk warns about calls to undefined functions at parse time, instead of at run time. Calling an undefined function at run time is a fatal error.
The word func may be used in place of function.
This function is provided and documented in GAWK: Effective AWK Programming, but everything about this feature is likely to change in the next release. We STRONGLY recommend that you do not use this feature for anything that you aren't willing to redo.
Print and sort the login names of all users:
BEGIN { FS = ":" }
{ print $1 | "sort" }
Count lines in a file:
{ nlines++ }
END { print nlines }
Precede each line by its number in the file:
{ print FNR, $0 }
Concatenate and line number (a variation on a theme):
{ print NR, $0 }
Run an external command for particular lines of data:
tail -f access_log |
awk '/myhome.html/ { system("nmap " $1 ">> logdir/myhome.html") }'
String constants are sequences of characters enclosed in double quotes. In non-English speaking environments, it is possible to mark strings in the AWK program as requiring translation to the native natural language. Such strings are marked in the AWK program with a leading underscore (``_''). For example,
BEGIN { print "hello, world" }
always prints hello, world. But,
BEGIN { print _"hello, world" }
might print bonjour, monde in France.
There are several steps involved in producing and running a localizable AWK program.
BEGIN { TEXTDOMAIN = "myprog" }
This allows gawk to find the .mo file associated with your program. Without this step, gawk uses the messages text domain, which likely does not contain translations for your program.
The internationalization features are described in full detail in GAWK: Effective AWK Programming.
The book indicates that command line variable assignment happens when awk would otherwise open the argument as a file, which is after the BEGIN block is executed. However, in earlier implementations, when such an assignment appeared before any file names, the assignment would happen before the BEGIN block was run. Applications came to depend on this ``feature.'' When awk was changed to match its documentation, the -v option for assigning variables before program execution was added to accommodate applications that depended upon the old behavior. (This feature was agreed upon by both the Bell Laboratories and the GNU developers.)
The -W option for implementation specific features is from the POSIX standard.
When processing arguments, gawk uses the special option ``--'' to signal the end of arguments. In compatibility mode, it warns about but otherwise ignores undefined options. In normal operation, such arguments are passed on to the AWK program for it to process.
The AWK book does not define the return value of srand(). The POSIX standard has it return the seed it was using, to allow keeping track of random number sequences. Therefore srand() in gawk also returns its current seed.
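A minimal sketch of this return value: the second call to srand() reports the seed installed by the first. The seed values are arbitrary.

```shell
# srand(42) installs 42 as the seed; srand(7) then returns the seed it replaces.
previous_seed=$(awk 'BEGIN { srand(42); prev = srand(7); print prev }')

echo "$previous_seed"
```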
Other new features are: The use of multiple -f options (from MKS awk); the ENVIRON array; the \a and \v escape sequences (done originally in gawk and fed back into the Bell Laboratories version); the tolower() and toupper() built-in functions (from the Bell Laboratories version); and the ANSI C conversion specifications in printf (done first in the Bell Laboratories version).
a = length # Holy Algol 60, Batman!
is the same as either of
a = length()
a = length($0)
This feature is marked as ``deprecated'' in the POSIX standard, and gawk issues a warning about its use if --lint is specified on the command line.
The other feature is the use of either the continue or the break statements outside the body of a while, for, or do loop. Traditional AWK implementations have treated such usage as equivalent to the next statement. Gawk supports this usage if --traditional has been specified.
The following features of gawk are not available in POSIX awk.
The AWK book does not define the return value of the close() function. Gawk's close() returns the value from fclose(3), or pclose(3), when closing an output file or pipe, respectively. It returns the process's exit status when closing an input pipe. The return value is -1 if the named file, pipe or co-process was not opened with a redirection.
When gawk is invoked with the --traditional option, if the fs argument to the -F option is ``t'', then FS is set to the tab character. Note that typing gawk -F\t ... simply causes the shell to quote the ``t'', and does not pass ``\t'' to the -F option. Since this is a rather ugly special case, it is not the default behavior. This behavior also does not occur if --posix has been specified. To really get a tab character as the field separator, it is best to use single quotes: gawk -F'\t' ....
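The single-quoted form can be checked with a short sketch on tab-separated input that also contains spaces:

```shell
# The single quotes pass the two characters \t to awk, which reads them as a tab,
# so the spaces inside each field are preserved.
second_field=$(printf 'a b\tc d\n' | awk -F'\t' '{ print $2 }')

echo "$second_field"
```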
If gawk is configured with the --enable-switch option to the configure command, then it accepts an additional control-flow statement:
switch (expression) {
case value|regex : statement
...
[ default: statement ]
}
If POSIXLY_CORRECT exists in the environment, then gawk behaves exactly as if --posix had been specified on the command line. If --lint has been specified, gawk issues a warning message to this effect.
The AWK Programming Language, Alfred V. Aho, Brian W. Kernighan, Peter J. Weinberger, Addison-Wesley, 1988. ISBN 0-201-07981-X.
GAWK: Effective AWK Programming, Edition 3.0, published by the Free Software Foundation, 2001.
Syntactically invalid single character programs tend to overflow the parse stack, generating a rather unhelpful message. Such programs are surprisingly difficult to diagnose in the completely general case, and the effort to do so really is not worth it.
Paul Rubin and Jay Fenlason, of the Free Software Foundation, wrote gawk, to be compatible with the original version of awk distributed in Seventh Edition UNIX. John Woods contributed a number of bug fixes. David Trueman, with contributions from Arnold Robbins, made gawk compatible with the new version of UNIX awk. Arnold Robbins is the current maintainer.
The initial DOS port was done by Conrad Kwok and Scott Garfinkle. Scott Deifik is the current DOS maintainer. Pat Rankin did the port to VMS, and Michal Jaegermann did the port to the Atari ST. The port to OS/2 was done by Kai Uwe Rommel, with contributions and help from Darrel Hankerson. Fred Fish supplied support for the Amiga, Stephen Davies provided the Tandem port, and Martin Brown provided the BeOS port.
Before sending a bug report, please do two things. First, verify that you have the latest version of gawk. Many bugs (usually subtle ones) are fixed at each release, and if yours is out of date, the problem may already have been solved. Second, please read this man page and the reference manual carefully to be sure that what you think is a bug really is, instead of just a quirk in the language.
Whatever you do, do NOT post a bug report in comp.lang.awk. While the gawk developers occasionally read this newsgroup, posting bug reports there is an unreliable way to report bugs. Instead, please use the electronic mail addresses given above.
If you're using a GNU/Linux system or BSD-based system, you may wish to submit a bug report to the vendor of your distribution. That's fine, but please send a copy to the official email address as well, since there's no guarantee that the bug will be forwarded to the gawk maintainer.
Permission is granted to make and distribute verbatim copies of this manual page provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual page under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Permission is granted to copy and distribute translations of this manual page into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Foundation.
The Bash Man Page
Bash is intended to be a conformant implementation of the IEEE POSIX Shell and Tools specification (IEEE Working Group 1003.2). Bash can be configured to be POSIX-conformant by default.
Bash also interprets a number of multi-character options. These options must appear on the command line before the single-character options to be recognized.
An interactive shell is one started without non-option arguments and without the -c option whose standard input and error are both connected to terminals (as determined by isatty(3)), or one started with the -i option. PS1 is set and $- includes i if bash is interactive, allowing a shell script or a startup file to test this state.
The following paragraphs describe how bash executes its startup files. If any of the files exist but cannot be read, bash reports an error. Tildes are expanded in file names as described below under Tilde Expansion in the EXPANSION section.
When bash is invoked as an interactive login shell, or as a non-interactive shell with the --login option, it first reads and executes commands from the file /etc/profile, if that file exists. After reading that file, it looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable. The --noprofile option may be used when the shell is started to inhibit this behavior.
When a login shell exits, bash reads and executes commands from the file ~/.bash_logout, if it exists.
When an interactive shell that is not a login shell is started, bash reads and executes commands from ~/.bashrc, if that file exists. This may be inhibited by using the --norc option. The --rcfile file option will force bash to read and execute commands from file instead of ~/.bashrc.
When bash is started non-interactively, to run a shell script, for example, it looks for the variable BASH_ENV in the environment, expands its value if it appears there, and uses the expanded value as the name of a file to read and execute. Bash behaves as if the following command were executed:
if [ -n "$BASH_ENV" ]; then . "$BASH_ENV"; fi
but the value of the PATH variable is not used to search for the file name.
If bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible, while conforming to the POSIX standard as well. When invoked as an interactive login shell, or a non-interactive shell with the --login option, it first attempts to read and execute commands from /etc/profile and ~/.profile, in that order. The --noprofile option may be used to inhibit this behavior. When invoked as an interactive shell with the name sh, bash looks for the variable ENV, expands its value if it is defined, and uses the expanded value as the name of a file to read and execute. Since a shell invoked as sh does not attempt to read and execute commands from any other startup files, the --rcfile option has no effect. A non-interactive shell invoked with the name sh does not attempt to read any other startup files. When invoked as sh, bash enters posix mode after the startup files are read.
When bash is started in posix mode, as with the --posix command line option, it follows the POSIX standard for startup files. In this mode, interactive shells expand the ENV variable and commands are read and executed from the file whose name is the expanded value. No other startup files are read.
Bash attempts to determine when it is being run by the remote shell daemon, usually rshd. If bash determines it is being run by rshd, it reads and executes commands from ~/.bashrc, if that file exists and is readable. It will not do this if invoked as sh. The --norc option may be used to inhibit this behavior, and the --rcfile option may be used to force another file to be read, but rshd does not generally invoke the shell with those options or allow them to be specified.
If the shell is started with the effective user (group) id not equal to the real user (group) id, and the -p option is not supplied, no startup files are read, shell functions are not inherited from the environment, the SHELLOPTS variable, if it appears in the environment, is ignored, and the effective user id is set to the real user id. If the -p option is supplied at invocation, the startup behavior is the same, but the effective user id is not reset.
The following definitions are used throughout the rest of this document.
metacharacter: one of | & ; ( ) < > space tab
control operator: one of || & && ; ;; ( ) | <newline>
reserved words: ! case do done elif else esac fi for function if in select then until while { } time [[ ]]
A simple command is a sequence of optional variable assignments followed by blank-separated words and redirections, and terminated by a control operator. The first word specifies the command to be executed, and is passed as argument zero. The remaining words are passed as arguments to the invoked command.
The return value of a simple command is its exit status, or 128+n if the command is terminated by signal n.
A pipeline is a sequence of one or more commands separated by the character |. The format for a pipeline is:
[time [-p]] [ ! ] command [ | command2 ... ]
The standard output of command is connected via a pipe to the standard input of command2. This connection is performed before any redirections specified by the command (see REDIRECTION below).
The return status of a pipeline is the exit status of the last command, unless the pipefail option is enabled. If pipefail is enabled, the pipeline's return status is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands exit successfully. If the reserved word ! precedes a pipeline, the exit status of that pipeline is the logical negation of the exit status as described above. The shell waits for all commands in the pipeline to terminate before returning a value.
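The effect of pipefail and of the ! reserved word on a pipeline's return status can be sketched as follows. The variable names are illustrative only.

```shell
# Without pipefail, the status is that of the last command (true).
false | true
status_default=$?

# With pipefail, the rightmost command that failed determines the status.
set -o pipefail
false | true
status_pipefail=$?
set +o pipefail

# The reserved word ! negates the pipeline's status.
! false | true
status_negated=$?

echo "default=$status_default pipefail=$status_pipefail negated=$status_negated"
# prints: default=0 pipefail=1 negated=1
```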
If the time reserved word precedes a pipeline, the elapsed as well as user and system time consumed by its execution are reported when the pipeline terminates. The -p option changes the output format to that specified by POSIX. The TIMEFORMAT variable may be set to a format string that specifies how the timing information should be displayed; see the description of TIMEFORMAT under Shell Variables below.
Each command in a pipeline is executed as a separate process (i.e., in a subshell).
A list is a sequence of one or more pipelines separated by one of the operators ;, &, &&, or ||, and optionally terminated by one of ;, &, or <newline>.
Of these list operators, && and || have equal precedence, followed by ; and &, which have equal precedence.
A sequence of one or more newlines may appear in a list instead of a semicolon to delimit commands.
If a command is terminated by the control operator &, the shell executes the command in the background in a subshell. The shell does not wait for the command to finish, and the return status is 0. Commands separated by a ; are executed sequentially; the shell waits for each command to terminate in turn. The return status is the exit status of the last command executed.
The control operators && and || denote AND lists and OR lists, respectively. An AND list has the form
command1 && command2
command2 is executed if, and only if, command1 returns an exit status of zero.
An OR list has the form
command1 || command2
command2 is executed if and only if command1 returns a non-zero exit status. The return status of AND and OR lists is the exit status of the last command executed in the list.
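A brief sketch of AND and OR lists follows. The variable log is an assumption used only to record which right-hand commands ran.

```shell
log=""
true  && log="${log}A"   # runs: the left side returned zero
false && log="${log}B"   # skipped: the left side returned non-zero
false || log="${log}C"   # runs: the left side returned non-zero
true  || log="${log}D"   # skipped: the left side returned zero
echo "$log"              # prints: AC

# The list's status is that of the last command actually executed.
false && true
and_status=$?
echo "$and_status"       # prints: 1
```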
A compound command is one of the following:
When the == and != operators are used, the string to the right of the operator is considered a pattern and matched according to the rules described below under Pattern Matching. If the shell option nocasematch is enabled, the match is performed without regard to the case of alphabetic characters. The return value is 0 if the string matches (==) or does not match (!=) the pattern, and 1 otherwise. Any part of the pattern may be quoted to force it to be matched as a string.
An additional binary operator, =~, is available, with the same precedence as == and !=. When it is used, the string to the right of the operator is considered an extended regular expression and matched accordingly (as in regex(3)). The return value is 0 if the string matches the pattern, and 1 otherwise. If the regular expression is syntactically incorrect, the conditional expression's return value is 2. If the shell option nocasematch is enabled, the match is performed without regard to the case of alphabetic characters. Substrings matched by parenthesized subexpressions within the regular expression are saved in the array variable BASH_REMATCH. The element of BASH_REMATCH with index 0 is the portion of the string matching the entire regular expression. The element of BASH_REMATCH with index n is the portion of the string matching the nth parenthesized subexpression.
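The following sketch shows =~ filling BASH_REMATCH. The sample string and the variable names are assumptions. Keeping the pattern in a variable is a common way to sidestep quoting questions, not a requirement.

```shell
version="bash-3.2.57"
re='^([a-z]+)-([0-9]+)\.([0-9]+)'     # extended regular expression

if [[ $version =~ $re ]]; then
    whole=${BASH_REMATCH[0]}          # portion matching the entire expression
    name=${BASH_REMATCH[1]}           # first parenthesized subexpression
    major=${BASH_REMATCH[2]}          # second parenthesized subexpression
    minor=${BASH_REMATCH[3]}          # third parenthesized subexpression
fi
echo "$whole $name $major $minor"     # prints: bash-3.2 bash 3 2
```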
Expressions may be combined using the following operators, listed in decreasing order of precedence:
The && and || operators do not evaluate expression2 if the value of expression1 is sufficient to determine the return value of the entire conditional expression.
A shell function is an object that is called like a simple command and executes a compound command with a new set of positional parameters. Shell functions are declared as follows:
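As a hedged illustration, one commonly used definition form is name() followed by a brace group. The function name greet and its message are made up for this sketch.

```shell
# greet is a hypothetical function used only for illustration.
greet() {
    # Inside the function, $1 and $# refer to the function's own arguments.
    echo "hello, $1 ($# argument(s))"
}

set -- outer1 outer2 outer3      # positional parameters of the caller
greeting=$(greet world)
echo "$greeting"                 # prints: hello, world (1 argument(s))
echo "caller still has $# parameters; the first is $1"
```

The caller's positional parameters are unchanged after the call, since the function's arguments replace them only while it runs.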
Each of the metacharacters listed above under DEFINITIONS has special meaning to the shell and must be quoted if it is to represent itself.
When the command history expansion facilities are being used (see HISTORY EXPANSION below), the history expansion character, usually !, must be quoted to prevent history expansion.
There are three quoting mechanisms: the escape character, single quotes, and double quotes.
A non-quoted backslash (\) is the escape character. It preserves the literal value of the next character that follows, with the exception of <newline>. If a \<newline> pair appears, and the backslash is not itself quoted, the \<newline> is treated as a line continuation (that is, it is removed from the input stream and effectively ignored).
Enclosing characters in single quotes preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of $, `, \, and, when history expansion is enabled, !. The characters $ and ` retain their special meaning within double quotes. The backslash retains its special meaning only when followed by one of the following characters: $, `, ", \, or <newline>. A double quote may be quoted within double quotes by preceding it with a backslash. If enabled, history expansion will be performed unless an ! appearing in double quotes is escaped using a backslash. The backslash preceding the ! is not removed.
The special parameters * and @ have special meaning when in double quotes (see PARAMETERS below).
Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows:
The expanded result is single-quoted, as if the dollar sign had not been present.
A double-quoted string preceded by a dollar sign ($) will cause the string to be translated according to the current locale. If the current locale is C or POSIX, the dollar sign is ignored. If the string is translated and replaced, the replacement is double-quoted.
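The differences between single quotes, double quotes, and the $'string' form can be seen in a short sketch. The variable names are illustrative.

```shell
name=world
single='$name\n'       # literal: the dollar sign and backslash are preserved
double="$name\n"       # $name expands; \n stays as two characters
ansi=$'tab\there'      # \t is decoded to a real tab character

printf '%s\n' "$single" "$double" "$ansi"
echo "${#single} ${#double} ${#ansi}"    # prints: 7 7 8
```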
A parameter is set if it has been assigned a value. The null string is a valid value. Once a variable is set, it may be unset only by using the unset builtin command (see SHELL BUILTIN COMMANDS below).
A variable may be assigned to by a statement of the form
name=[value]
If value is not given, the variable is assigned the null string. All values undergo tilde expansion, parameter and variable expansion, command substitution, arithmetic expansion, and quote removal (see EXPANSION below). If the variable has its integer attribute set, then value is evaluated as an arithmetic expression even if the $((...)) expansion is not used (see Arithmetic Expansion below). Word splitting is not performed, with the exception of "$@" as explained below under Special Parameters. Pathname expansion is not performed. Assignment statements may also appear as arguments to the alias, declare, typeset, export, readonly, and local builtin commands.
In the context where an assignment statement is assigning a value to a shell variable or array index, the += operator can be used to append to or add to the variable's previous value. When += is applied to a variable for which the integer attribute has been set, value is evaluated as an arithmetic expression and added to the variable's current value, which is also evaluated. When += is applied to an array variable using compound assignment (see Arrays below), the variable's value is not unset (as it is when using =), and new values are appended to the array beginning at one greater than the array's maximum index. When applied to a string-valued variable, value is expanded and appended to the variable's value.
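The three behaviors of += described above can be sketched as follows, using made-up variable names.

```shell
str="foo"
str+="bar"             # string-valued variable: value is appended

declare -i count=5
count+=3               # integer attribute set: value is added arithmetically

arr=(a b)
arr+=(c d)             # compound assignment: appended after the maximum index

echo "$str $count ${#arr[@]} ${arr[3]}"    # prints: foobar 8 4 d
```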
A positional parameter is a parameter denoted by one or more digits, other than the single digit 0. Positional parameters are assigned from the shell's arguments when it is invoked, and may be reassigned using the set builtin command. Positional parameters may not be assigned to with assignment statements. The positional parameters are temporarily replaced when a shell function is executed (see FUNCTIONS below).
When a positional parameter consisting of more than a single digit is expanded, it must be enclosed in braces (see EXPANSION below).
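A short sketch of why the braces matter for multi-digit positional parameters; the parameter values are arbitrary.

```shell
set -- a b c d e f g h i j k
tenth=${10}      # braces used: the tenth parameter
wrong=$10        # no braces: parsed as $1 followed by a literal 0
echo "$tenth $wrong"    # prints: j a0
```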
The shell treats several parameters specially. These parameters may only be referenced; assignment to them is not allowed.
The following variables are set by the shell:
The following variables are used by the shell. In some cases, bash assigns a default value to a variable; these cases are noted below.
MAILPATH='/var/mail/bfox?"You have mail":~/shell-mail?"$_ has mail!"'
Bash supplies a default value for this variable, but the location of the user mail files that it uses is system dependent (e.g., /var/mail/$USER).
An array is created automatically if any variable is assigned to using the syntax name[subscript]=value. The subscript is treated as an arithmetic expression that must evaluate to a number greater than or equal to zero. To explicitly declare an array, use declare -a name (see SHELL BUILTIN COMMANDS below). declare -a name[subscript] is also accepted; the subscript is ignored. Attributes may be specified for an array variable using the declare and readonly builtins. Each attribute applies to all members of an array.
Arrays are assigned to using compound assignments of the form name=(value1 ... valuen), where each value is of the form [subscript]=string. Only string is required. If the optional brackets and subscript are supplied, that index is assigned to; otherwise the index of the element assigned is the last index assigned to by the statement plus one. Indexing starts at zero. This syntax is also accepted by the declare builtin. Individual array elements may be assigned to using the name[subscript]=value syntax introduced above.
Any element of an array may be referenced using ${name[subscript]}. The braces are required to avoid conflicts with pathname expansion. If subscript is @ or *, the word expands to all members of name. These subscripts differ only when the word appears within double quotes. If the word is double-quoted, ${name[*]} expands to a single word with the value of each array member separated by the first character of the IFS special variable, and ${name[@]} expands each element of name to a separate word. When there are no array members, ${name[@]} expands to nothing. If the double-quoted expansion occurs within a word, the expansion of the first parameter is joined with the beginning part of the original word, and the expansion of the last parameter is joined with the last part of the original word. This is analogous to the expansion of the special parameters * and @ (see Special Parameters above). ${#name[subscript]} expands to the length of ${name[subscript]}. If subscript is * or @, the expansion is the number of elements in the array. Referencing an array variable without a subscript is equivalent to referencing element zero.
The unset builtin is used to destroy arrays. unset name[subscript] destroys the array element at index subscript. Care must be taken to avoid unwanted side effects caused by filename generation. unset name, where name is an array, or unset name[subscript], where subscript is * or @, removes the entire array.
The declare, local, and readonly builtins each accept a -a option to specify an array. The read builtin accepts a -a option to assign a list of words read from the standard input to an array. The set and declare builtins display array values in a way that allows them to be reused as assignments.
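The array rules above can be exercised with a minimal sketch. The array name fruits and its contents are assumptions made for illustration.

```shell
fruits=(apple [3]=cherry date)    # assigns indices 0, 3, and 4
fruits[1]=banana                  # individual element assignment

count=${#fruits[@]}               # number of elements
first_len=${#fruits[0]}           # length of element 0 ("apple")
implicit=$fruits                  # no subscript: same as ${fruits[0]}
joined=$(IFS=,; echo "${fruits[*]}")   # one word, joined with the first character of IFS

words=0
for f in "${fruits[@]}"; do       # one word per element
    words=$((words + 1))
done

unset 'fruits[3]'                 # quoted so the brackets are not treated as a pattern
remaining=${#fruits[@]}

echo "$count $first_len $implicit $joined $words $remaining"
# prints: 4 5 apple apple,banana,cherry,date 4 3
```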
The order of expansions is: brace expansion, tilde expansion, parameter, variable and arithmetic expansion and command substitution (done in a left-to-right fashion), word splitting, and pathname expansion.
On systems that can support it, there is an additional expansion available: process substitution.
Only brace expansion, word splitting, and pathname expansion can change the number of words of the expansion; other expansions expand a single word to a single word. The only exceptions to this are the expansions of "$@" and "${name[@]}" as explained above (see PARAMETERS).
Brace expansion is a mechanism by which arbitrary strings may be generated. This mechanism is similar to pathname expansion, but the filenames generated need not exist. Patterns to be brace expanded take the form of an optional preamble, followed by either a series of comma-separated strings or a sequence expression between a pair of braces, followed by an optional postscript. The preamble is prefixed to each string contained within the braces, and the postscript is then appended to each resulting string, expanding left to right.
Brace expansions may be nested. The results of each expanded string are not sorted; left to right order is preserved. For example, a{d,c,b}e expands into `ade ace abe'.
A sequence expression takes the form {x..y}, where x and y are either integers or single characters. When integers are supplied, the expression expands to each number between x and y, inclusive. When characters are supplied, the expression expands to each character lexicographically between x and y, inclusive. Note that both x and y must be of the same type.
Brace expansion is performed before any other expansions, and any characters special to other expansions are preserved in the result. It is strictly textual. Bash does not apply any syntactic interpretation to the context of the expansion or the text between the braces.
A correctly-formed brace expansion must contain unquoted opening and closing braces, and at least one unquoted comma or a valid sequence expression. Any incorrectly formed brace expansion is left unchanged. A { or , may be quoted with a backslash to prevent its being considered part of a brace expression. To avoid conflicts with parameter expansion, the string ${ is not considered eligible for brace expansion.
This construct is typically used as shorthand when the common prefix of the strings to be generated is longer than in the above example:
mkdir /usr/local/src/bash/{old,new,dist,bugs}
Brace expansion introduces a slight incompatibility with historical versions of sh. sh does not treat opening or closing braces specially when they appear as part of a word, and preserves them in the output. Bash removes braces from words as a consequence of brace expansion. For example, a word entered to sh as file{1,2} appears identically in the output. The same word is output as file1 file2 after expansion by bash. If strict compatibility with sh is desired, start bash with the +B option or disable brace expansion with the +B option to the set command (see SHELL BUILTIN COMMANDS below).
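A few brace expansions, captured into illustrative variables so the results can be inspected:

```shell
list=$(echo a{d,c,b}e)         # order is preserved, not sorted
nums=$(echo {1..5})            # integer sequence expression
chars=$(echo {a..e})           # character sequence expression
nested=$(echo x{1,{a,b}}y)     # nested braces

echo "$list | $nums | $chars | $nested"
# prints: ade ace abe | 1 2 3 4 5 | a b c d e | x1y xay xby
```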
If a word begins with an unquoted tilde character (`~'), all of the characters preceding the first unquoted slash (or all characters, if there is no unquoted slash) are considered a tilde-prefix. If none of the characters in the tilde-prefix are quoted, the characters in the tilde-prefix following the tilde are treated as a possible login name. If this login name is the null string, the tilde is replaced with the value of the shell parameter HOME. If HOME is unset, the home directory of the user executing the shell is substituted instead. Otherwise, the tilde-prefix is replaced with the home directory associated with the specified login name.
If the tilde-prefix is a `~+', the value of the shell variable PWD replaces the tilde-prefix. If the tilde-prefix is a `~-', the value of the shell variable OLDPWD, if it is set, is substituted. If the characters following the tilde in the tilde-prefix consist of a number N, optionally prefixed by a `+' or a `-', the tilde-prefix is replaced with the corresponding element from the directory stack, as it would be displayed by the dirs builtin invoked with the tilde-prefix as an argument. If the characters following the tilde in the tilde-prefix consist of a number without a leading `+' or `-', `+' is assumed.
If the login name is invalid, or the tilde expansion fails, the word is unchanged.
Each variable assignment is checked for unquoted tilde-prefixes immediately following a : or the first =. In these cases, tilde expansion is also performed. Consequently, one may use file names with tildes in assignments to PATH, MAILPATH, and CDPATH, and the shell assigns the expanded value.
The `$' character introduces parameter expansion, command substitution, or arithmetic expansion. The parameter name or symbol to be expanded may be enclosed in braces, which are optional but serve to protect the variable to be expanded from characters immediately following it which could be interpreted as part of the name.
When braces are used, the matching ending brace is the first `}' not escaped by a backslash or within a quoted string, and not within an embedded arithmetic expansion, command substitution, or parameter expansion.
If the first character of parameter is an exclamation point, a level of variable indirection is introduced. Bash uses the value of the variable formed from the rest of parameter as the name of the variable; this variable is then expanded and that value is used in the rest of the substitution, rather than the value of parameter itself. This is known as indirect expansion. The exceptions to this are the expansions of ${!prefix*} and ${!name[@]} described below. The exclamation point must immediately follow the left brace in order to introduce indirection.
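Indirect expansion can be sketched as follows. The names target and HOME_DIR, and the path value, are hypothetical.

```shell
HOME_DIR="/home/example"     # hypothetical value
target="HOME_DIR"            # holds the name of another variable

value=${!target}             # expands HOME_DIR, the variable named by target
echo "$value"                # prints: /home/example
```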
In each of the cases below, word is subject to tilde expansion, parameter expansion, command substitution, and arithmetic expansion. When not performing substring expansion, bash tests for a parameter that is unset or null; omitting the colon results in a test only for a parameter that is unset.
Command substitution allows the output of a command to replace the command name. There are two forms:
$(command)
or
`command`
Bash performs the expansion by executing command and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Embedded newlines are not deleted, but they may be removed during word splitting. The command substitution $(cat file) can be replaced by the equivalent but faster $(< file).
When the old-style backquote form of substitution is used, backslash retains its literal meaning except when followed by $, `, or \. The first backquote not preceded by a backslash terminates the command substitution. When using the $(command) form, all characters between the parentheses make up the command; none are treated specially.
Command substitutions may be nested. To nest when using the backquoted form, escape the inner backquotes with backslashes.
If the substitution appears within double quotes, word splitting and pathname expansion are not performed on the results.
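The following sketch shows trailing-newline removal and nesting in both forms. The variable names are illustrative.

```shell
out=$(printf 'line1\nline2\n\n\n')      # trailing newlines are deleted
nested=$(echo "outer $(echo inner)")    # $(...) nests without extra escaping
old=`echo "outer \`echo inner\`"`       # backquotes need escaped inner backquotes

echo "${#out}"     # prints: 11 (the embedded newline is kept)
echo "$nested"     # prints: outer inner
echo "$old"        # prints: outer inner
```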
Arithmetic expansion allows the evaluation of an arithmetic expression and the substitution of the result. The format for arithmetic expansion is:
$((expression))
The expression is treated as if it were within double quotes, but a double quote inside the parentheses is not treated specially. All tokens in the expression undergo parameter expansion, string expansion, command substitution, and quote removal. Arithmetic expansions may be nested.
The evaluation is performed according to the rules listed below under ARITHMETIC EVALUATION. If expression is invalid, bash prints a message indicating failure and no substitution occurs.
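A minimal sketch of arithmetic expansion, including a nested expansion; the variable names are arbitrary.

```shell
x=7
sum=$((x + 3))                  # shell variables may be referenced by name
nested=$(( $((x * 2)) + 1 ))    # the inner expansion is evaluated first

echo "$sum $nested"             # prints: 10 15
```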
Process substitution is supported on systems that support named pipes (FIFOs) or the /dev/fd method of naming open files. It takes the form of <(list) or >(list). The process list is run with its input or output connected to a FIFO or some file in /dev/fd. The name of this file is passed as an argument to the current command as the result of the expansion. If the >(list) form is used, writing to the file will provide input for list. If the <(list) form is used, the file passed as an argument should be read to obtain the output of list.
When available, process substitution is performed simultaneously with parameter and variable expansion, command substitution, and arithmetic expansion.
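On a system that supports it, process substitution might be used along these lines. The commands and variable names are chosen only for illustration.

```shell
# <(list) expands to a file name from which the output of list can be read.
joined=$(cat <(echo left) <(echo right))

# Feeding a loop from <(list) keeps the loop in the current shell,
# so the counter is still set afterwards.
count=0
while read -r line; do
    count=$((count + 1))
done < <(printf '%s\n' one two three)

echo "$joined"
echo "$count"      # prints: 3
```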
The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting.
The shell treats each character of IFS as a delimiter, and splits the results of the other expansions into words on these characters. If IFS is unset, or its value is exactly <space><tab><newline>, the default, then any sequence of IFS characters serves to delimit words. If IFS has a value other than the default, then sequences of the whitespace characters space and tab are ignored at the beginning and end of the word, as long as the whitespace character is in the value of IFS (an IFS whitespace character). Any character in IFS that is not IFS whitespace, along with any adjacent IFS whitespace characters, delimits a field. A sequence of IFS whitespace characters is also treated as a delimiter. If the value of IFS is null, no word splitting occurs.
Explicit null arguments ("" or '') are retained. Unquoted implicit null arguments, resulting from the expansion of parameters that have no values, are removed. If a parameter with no value is expanded within double quotes, a null argument results and is retained.
Note that if no expansion occurs, no splitting is performed.
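The splitting rules can be checked by counting the words a command receives. The helper count_words is a hypothetical function written for this sketch.

```shell
count_words() { echo $#; }      # hypothetical helper: reports its argument count

data="a:b c:d"
default_split=$(count_words $data)         # default IFS: split at the space
quoted=$(count_words "$data")              # double quotes suppress splitting
colon_split=$(IFS=:; count_words $data)    # IFS is ":": fields are a, "b c", d

empty=""
implicit_null=$(count_words $empty)        # unquoted null expansion is removed
explicit_null=$(count_words "$empty")      # quoted null argument is retained

echo "$default_split $quoted $colon_split $implicit_null $explicit_null"
# prints: 2 1 3 0 1
```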
After word splitting, unless the -f option has been set, bash scans each word for the characters *, ?, and [. If one of these characters appears, then the word is regarded as a pattern, and replaced with an alphabetically sorted list of file names matching the pattern. If no matching file names are found, and the shell option nullglob is disabled, the word is left unchanged. If the nullglob option is set, and no matches are found, the word is removed. If the failglob shell option is set, and no matches are found, an error message is printed and the command is not executed. If the shell option nocaseglob is enabled, the match is performed without regard to the case of alphabetic characters. When a pattern is used for pathname expansion, the character ``.'' at the start of a name or immediately following a slash must be matched explicitly, unless the shell option dotglob is set. When matching a pathname, the slash character must always be matched explicitly. In other cases, the ``.'' character is not treated specially. See the description of shopt below under SHELL BUILTIN COMMANDS for a description of the nocaseglob, nullglob, failglob, and dotglob shell options.
The GLOBIGNORE shell variable may be used to restrict the set of file names matching a pattern. If GLOBIGNORE is set, each matching file name that also matches one of the patterns in GLOBIGNORE is removed from the list of matches. The file names ``.'' and ``..'' are always ignored when GLOBIGNORE is set and not null. However, setting GLOBIGNORE to a non-null value has the effect of enabling the dotglob shell option, so all other file names beginning with a ``.'' will match. To get the old behavior of ignoring file names beginning with a ``.'', make ``.*'' one of the patterns in GLOBIGNORE. The dotglob option is disabled when GLOBIGNORE is unset.
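The effect of nullglob and dotglob on pathname expansion can be sketched in a scratch directory. The file names and the helper count_args are assumptions, and mktemp is assumed to be available.

```shell
count_args() { echo $#; }          # hypothetical helper: reports its argument count

dir=$(mktemp -d)                   # scratch directory
touch "$dir/a.txt" "$dir/b.txt" "$dir/.hidden.txt"

# Each test runs in a subshell so option changes do not leak out.
plain=$(cd "$dir" && echo *.txt)                                      # leading dot not matched
unmatched=$(cd "$dir" && echo *.none)                                 # no match: word left unchanged
with_nullglob=$(cd "$dir" && shopt -s nullglob && count_args *.none)  # no match: word removed
with_dotglob=$(cd "$dir" && shopt -s dotglob && count_args *.txt)     # .hidden.txt now matches

rm -r "$dir"
echo "$plain | $unmatched | $with_nullglob | $with_dotglob"
# prints: a.txt b.txt | *.none | 0 | 3
```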
Pattern Matching
Any character that appears in a pattern, other than the special pattern characters described below, matches itself. The NUL character may not occur in a pattern. A backslash escapes the following character; the escaping backslash is discarded when matching. The special pattern characters must be quoted if they are to be matched literally.
The special pattern characters have the following meanings:
Within [ and ], character classes can be specified using the syntax [:class:], where class is one of the following classes defined in the POSIX.2 standard:
Within [ and ], an equivalence class can be specified using the syntax [=c=], which matches all characters with the same collation weight (as defined by the current locale) as the character c.
Within [ and ], the syntax [.symbol.] matches the collating symbol symbol.
If the extglob shell option is enabled using the shopt builtin, several extended pattern matching operators are recognized. In the following description, a pattern-list is a list of one or more patterns separated by a |. Composite patterns may be formed using one or more of the following sub-patterns:
After the preceding expansions, all unquoted occurrences of the characters \, ', and " that did not result from one of the above expansions are removed.
In the following descriptions, if the file descriptor number is omitted, and the first character of the redirection operator is <, the redirection refers to the standard input (file descriptor 0). If the first character of the redirection operator is >, the redirection refers to the standard output (file descriptor 1).
The word following the redirection operator in the following descriptions, unless otherwise noted, is subjected to brace expansion, tilde expansion, parameter expansion, command substitution, arithmetic expansion, quote removal, pathname expansion, and word splitting. If it expands to more than one word, bash reports an error.
Note that the order of redirections is significant. For example, the command
ls > dirlist 2>&1
directs both standard output and standard error to the file dirlist, while the command
ls 2>&1 > dirlist
directs only the standard output to file dirlist, because the standard error was duplicated as standard output before the standard output was redirected to dirlist.
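The same ordering effect can be observed with a small sketch. The function emit stands in for ls and is hypothetical, and a temporary file takes the place of dirlist.

```shell
emit() { echo out; echo err >&2; }    # hypothetical helper: writes to both streams

file=$(mktemp)                        # scratch file

emit > "$file" 2>&1                   # stdout goes to the file, then stderr follows it
both=$(cat "$file")

# Here stderr is duplicated onto the original stdout (the command
# substitution's pipe) before stdout is pointed at the file.
leaked=$(emit 2>&1 > "$file")
only_out=$(cat "$file")

rm -f "$file"
printf '%s\n' "both: $both" "leaked: $leaked" "file: $only_out"
```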
Bash handles several filenames specially when they are used in redirections, as described in the following table:
A failure to open or create a file causes the redirection to fail.
Redirections using file descriptors greater than 9 should be used with care, as they may conflict with file descriptors the shell uses internally.
Redirection of input causes the file whose name results from the expansion of word to be opened for reading on file descriptor n, or the standard input (file descriptor 0) if n is not specified.
The general format for redirecting input is:
[n]<word
Redirection of output causes the file whose name results from the expansion of word to be opened for writing on file descriptor n, or the standard output (file descriptor 1) if n is not specified. If the file does not exist it is created; if it does exist it is truncated to zero size.
The general format for redirecting output is:
[n]>word
If the redirection operator is >, and the noclobber option to the set builtin has been enabled, the redirection will fail if the file whose name results from the expansion of word exists and is a regular file. If the redirection operator is >|, or the redirection operator is > and the noclobber option to the set builtin command is not enabled, the redirection is attempted even if the file named by word exists.
Redirection of output in this fashion causes the file whose name results from the expansion of word to be opened for appending on file descriptor n, or the standard output (file descriptor 1) if n is not specified. If the file does not exist it is created.
The general format for appending output is:
[n]>>word
Bash allows both the standard output (file descriptor 1) and the standard error output (file descriptor 2) to be redirected to the file whose name is the expansion of word with this construct.
There are two formats for redirecting standard output and standard error:
&>word
and
>&word
Of the two forms, the first is preferred. This is semantically equivalent to
>word 2>&1
This type of redirection instructs the shell to read input from the current source until a line containing only word (with no trailing blanks) is seen. All of the lines read up to that point are then used as the standard input for a command.
The format of here-documents is:
<<[-]word
here-document
delimiter
No parameter expansion, command substitution, arithmetic expansion, or pathname expansion is performed on word. If any characters in word are quoted, the delimiter is the result of quote removal on word, and the lines in the here-document are not expanded. If word is unquoted, all lines of the here-document are subjected to parameter expansion, command substitution, and arithmetic expansion. In the latter case, the character sequence \<newline> is ignored, and \ must be used to quote the characters \, $, and `.
If the redirection operator is <<-, then all leading tab characters are stripped from input lines and the line containing delimiter. This allows here-documents within shell scripts to be indented in a natural fashion.
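The following sketch contrasts an unquoted and a quoted delimiter. The variable names are illustrative; cat is used only as a convenient command that copies its standard input.

```bash
name="world"

# Unquoted delimiter: the body undergoes parameter expansion.
expanded=$(cat <<EOF
hello $name
EOF
)

# Quoted delimiter: the body is passed through without expansion.
literal=$(cat <<'EOF'
hello $name
EOF
)
```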
<<<word
The word is expanded and supplied to the command on its standard input.
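A brief sketch of the <<< form, using read and tr as example consumers of standard input (the variable names are assumptions):

```bash
# read splits the supplied line into fields as it would a line from a file.
read -r first rest <<< "alpha beta gamma"

# The word undergoes expansion before it is supplied as input.
greeting="hello"
shouted=$(tr 'a-z' 'A-Z' <<< "$greeting world")
```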
The redirection operator
[n]<&word
is used to duplicate input file descriptors. If word expands to one or more digits, the file descriptor denoted by n is made to be a copy of that file descriptor. If the digits in word do not specify a file descriptor open for input, a redirection error occurs. If word evaluates to -, file descriptor n is closed. If n is not specified, the standard input (file descriptor 0) is used.
The operator
[n]>&word
is used similarly to duplicate output file descriptors. If n is not specified, the standard output (file descriptor 1) is used. If the digits in word do not specify a file descriptor open for output, a redirection error occurs. As a special case, if n is omitted, and word does not expand to one or more digits, the standard output and standard error are redirected as described previously.
The redirection operator
[n]<&digit-
moves the file descriptor digit to file descriptor n, or the standard input (file descriptor 0) if n is not specified. digit is closed after being duplicated to n.
Similarly, the redirection operator
[n]>&digit-
moves the file descriptor digit to file descriptor n, or the standard output (file descriptor 1) if n is not specified.
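Duplicating, moving, and closing descriptors might be exercised as follows. This is a sketch under the assumption that file descriptors 5 and 6 are free in the running shell; the scratch file and variable names are illustrative.

```bash
log=$(mktemp)

exec 5> "$log"          # open fd 5 for writing
echo "first" >&5        # echo's stdout is a copy of fd 5
exec 6>&5-              # move fd 5 to fd 6; fd 5 is closed afterwards
echo "second" >&6
exec 6>&-               # close fd 6

# fd 5 is no longer open, so duplicating it is a redirection error.
if { echo "third" >&5; } 2>/dev/null; then
    fd5_state=open
else
    fd5_state=closed
fi

log_contents=$(cat "$log")
rm -f "$log"
```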
The redirection operator
[n]<>word
causes the file whose name is the expansion of word to be opened for both reading and writing on file descriptor n, or on file descriptor 0 if n is not specified. If the file does not exist, it is created.
Aliases are created and listed with the alias command, and removed with the unalias command.
There is no mechanism for using arguments in the replacement text. If arguments are needed, a shell function should be used (see FUNCTIONS below).
Aliases are not expanded when the shell is not interactive, unless the expand_aliases shell option is set using shopt (see the description of shopt under SHELL BUILTIN COMMANDS below).
The rules concerning the definition and use of aliases are somewhat confusing. Bash always reads at least one complete line of input before executing any of the commands on that line. Aliases are expanded when a command is read, not when it is executed. Therefore, an alias definition appearing on the same line as another command does not take effect until the next line of input is read. The commands following the alias definition on that line are not affected by the new alias. This behavior is also an issue when functions are executed. Aliases are expanded when a function definition is read, not when the function is executed, because a function definition is itself a compound command. As a consequence, aliases defined in a function are not available until after that function is executed. To be safe, always put alias definitions on a separate line, and do not use alias in compound commands.
For almost every purpose, aliases are superseded by shell functions.
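As an illustration of why a function is needed when arguments are involved, the hypothetical helper below places its arguments in the middle of the generated text, something alias replacement text cannot do.

```bash
# Hypothetical helper, not a bash builtin: builds a backup file name
# from a base name ($1) and a tag ($2).
backup_name() {
    printf '%s.%s.bak\n' "$1" "$2"
}

name=$(backup_name notes.txt 2024)
```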
Variables local to the function may be declared with the local builtin command. Ordinarily, variables and their values are shared between the function and its caller.
If the builtin command return is executed in a function, the function completes and execution resumes with the next command after the function call. Any command associated with the RETURN trap is executed before execution resumes. When a function completes, the values of the positional parameters and the special parameter # are restored to the values they had prior to the function's execution.
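The scoping and return rules above can be sketched as follows; the function bump and all variable names are assumptions made for the example.

```bash
counter=1
shared="before"

bump() {
    local counter=100      # local: the caller's counter is untouched
    shared="after"         # not local: the change is visible to the caller
    return 3
}

set -- a b c               # give the caller three positional parameters
status=0
bump x y || status=$?      # inside bump, $# is 2
argc_after=$#              # restored to the caller's value once bump completes
```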
Function names and definitions may be listed with the -f option to the declare or typeset builtin commands. The -F option to declare or typeset will list the function names only (and optionally the source file and line number, if the extdebug shell option is enabled). Functions may be exported so that subshells automatically have them defined with the -f option to the export builtin. Note that shell functions and variables with the same name may result in multiple identically-named entries in the environment passed to the shell's children. Care should be taken in cases where this may cause a problem.
Functions may be recursive. No limit is imposed on the number of recursive calls.
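A small recursive function, given here only as a sketch (factorial is a hypothetical name, and each level runs in a command substitution, so deep recursion is costly):

```bash
factorial() {
    local n=$1
    if (( n <= 1 )); then
        echo 1
    else
        local rest
        rest=$(factorial $(( n - 1 )))   # recursive call
        echo $(( n * rest ))
    fi
}

result=$(factorial 5)
```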
Shell variables are allowed as operands; parameter expansion is performed before the expression is evaluated. Within an expression, shell variables may also be referenced by name without using the parameter expansion syntax. A shell variable that is null or unset evaluates to 0 when referenced by name without using the parameter expansion syntax. The value of a variable is evaluated as an arithmetic expression when it is referenced, or when a variable which has been given the integer attribute using declare -i is assigned a value. A null value evaluates to 0. A shell variable need not have its integer attribute turned on to be used in an expression.
Constants with a leading 0 are interpreted as octal numbers. A leading 0x or 0X denotes hexadecimal. Otherwise, numbers take the form [base#]n, where base is a decimal number between 2 and 64 representing the arithmetic base, and n is a number in that base. If base# is omitted, then base 10 is used. The digits greater than 9 are represented by the lowercase letters, the uppercase letters, @, and _, in that order. If base is less than or equal to 36, lowercase and uppercase letters may be used interchangeably to represent numbers between 10 and 35.
Operators are evaluated in order of precedence. Sub-expressions in parentheses are evaluated first and may override the precedence rules above.
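The rules for constants, variable references, and parentheses can be tried out with arithmetic expansion. The variable names below are illustrative.

```bash
octal=$(( 010 ))           # leading 0: octal
hex=$(( 0x1F ))            # leading 0x: hexadecimal
binary=$(( 2#1010 ))       # base#n form with base 2
base36=$(( 36#Z ))         # base <= 36: letter case is interchangeable
base64_top=$(( 64#_ ))     # _ is the highest digit in base 64

width=6
area=$(( width * 7 ))      # variable referenced by name, no $ needed
empty=""
from_null=$(( empty + 2 )) # a null value evaluates to 0

no_parens=$(( 2 + 3 * 4 ))
with_parens=$(( (2 + 3) * 4 ))
```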
Unless otherwise specified, primaries that operate on files follow symbolic links and operate on the target of the link, rather than the link itself.
If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment. If any of the assignments attempts to assign a value to a readonly variable, an error occurs, and the command exits with a non-zero status.
If no command name results, redirections are performed, but do not affect the current shell environment. A redirection error causes the command to exit with a non-zero status.
If there is a command name left after expansion, execution proceeds as described below. Otherwise, the command exits. If one of the expansions contained a command substitution, the exit status of the command is the exit status of the last command substitution performed. If there were no command substitutions, the command exits with a status of zero.
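The cases in which no command name remains can be sketched as follows (variable names are assumptions):

```bash
# No command name: the assignment affects the current shell.
greeting="hello"

# No command name, but the expansion contains a command substitution:
# the exit status is that of the last substitution performed.
status=0
captured=$(echo "partial"; exit 7) || status=$?

# No command substitution at all: the exit status is zero.
plain="value"
plain_status=$?
```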
If the command name contains no slashes, the shell attempts to locate it. If there exists a shell function by that name, that function is invoked as described above in FUNCTIONS. If the name does not match a function, the shell searches for it in the list of shell builtins. If a match is found, that builtin is invoked.
If the name is neither a shell function nor a builtin, and contains no slashes, bash searches each element of the PATH for a directory containing an executable file by that name. Bash uses a hash table to remember the full pathnames of executable files (see hash under SHELL BUILTIN COMMANDS below). A full search of the directories in PATH is performed only if the command is not found in the hash table. If the search is unsuccessful, the shell prints an error message and returns an exit status of 127.
If the search is successful, or if the command name contains one or more slashes, the shell executes the named program in a separate execution environment. Argument 0 is set to the name given, and the remaining arguments to the command are set to the arguments given, if any.
If this execution fails because the file is not in executable format, and the file is not a directory, it is assumed to be a shell script, a file containing shell commands. A subshell is spawned to execute it. This subshell reinitializes itself, so that the effect is as if a new shell had been invoked to handle the script, with the exception that the locations of commands remembered by the parent (see hash below under SHELL BUILTIN COMMANDS) are retained by the child.
If the program is a file beginning with #!, the remainder of the first line specifies an interpreter for the program. The shell executes the specified interpreter on operating systems that do not handle this executable format themselves. The arguments to the interpreter consist of a single optional argument following the interpreter name on the first line of the program, followed by the name of the program, followed by the command arguments, if any.
When a simple command other than a builtin or shell function is to be executed, it is invoked in a separate execution environment that consists of the following. Unless otherwise noted, the values are inherited from the shell.
A command invoked in this separate environment cannot affect the shell's execution environment.
Command substitution, commands grouped with parentheses, and asynchronous commands are invoked in a subshell environment that is a duplicate of the shell environment, except that traps caught by the shell are reset to the values that the shell inherited from its parent at invocation. Builtin commands that are invoked as part of a pipeline are also executed in a subshell environment. Changes made to the subshell environment cannot affect the shell's execution environment.
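The isolation of subshell environments can be observed with a short sketch. It assumes the lastpipe shell option is off, which is the default; the variable names are illustrative.

```bash
value="outer"

( value="changed in parentheses" )           # commands grouped with parentheses
: "$(value="changed in substitution")"       # command substitution
echo "changed in pipeline" | read -r value   # builtin run as part of a pipeline

after=$value                                 # none of the changes reached the shell
```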
If a command is followed by a & and job control is not active, the default standard input for the command is the empty file /dev/null. Otherwise, the invoked command inherits the file descriptors of the calling shell as modified by redirections.
The shell provides several ways to manipulate the environment. On invocation, the shell scans its own environment and creates a parameter for each name found, automatically marking it for export to child processes. Executed commands inherit the environment. The export and declare -x commands allow parameters and functions to be added to and deleted from the environment. If the value of a parameter in the environment is modified, the new value becomes part of the environment, replacing the old. The environment inherited by any executed command consists of the shell's initial environment, whose values may be modified in the shell, less any pairs removed by the unset command, plus any additions via the export and declare -x commands.
The environment for any simple command or function may be augmented temporarily by prefixing it with parameter assignments, as described above in PARAMETERS. These assignment statements affect only the environment seen by that command.
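A sketch of the difference between a shell variable, a temporary assignment, and an exported variable follows. The name demo_color is an assumption and is presumed not to be in the inherited environment already; sh -c is used only as a convenient child process.

```bash
demo_color="red"                                   # shell variable, not exported

seen_plain=$(sh -c 'echo "${demo_color:-unset}"')  # child does not see it
seen_prefixed=$(demo_color="blue" sh -c 'echo "$demo_color"')
after_prefix=$demo_color                           # the shell's value is unchanged

export demo_color
seen_exported=$(sh -c 'echo "$demo_color"')        # now part of the environment
```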
If the -k option is set (see the set builtin command below), then all parameter assignments are placed in the environment for a command, not just those that precede the command name.
When bash invokes an external command, the variable _ is set to the full file name of the command and passed to that command in its environment.
If a command is not found, the child process created to execute it returns a status of 127. If a command is found but is not executable, the return status is 126.
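Both statuses can be observed from a script. In this sketch the command name no_such_command_for_demo is assumed not to exist on the system, and the file created by mktemp is assumed to lack execute permission.

```bash
status_missing=0
no_such_command_for_demo 2>/dev/null || status_missing=$?

not_executable=$(mktemp)      # created without execute permission
status_noexec=0
"$not_executable" 2>/dev/null || status_noexec=$?
rm -f "$not_executable"
```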
If a command fails because of an error during expansion or redirection, the exit status is greater than zero.
Shell builtin commands return a status of 0 (true) if successful, and non-zero (false) if an error occurs while they execute. All builtins return an exit status of 2 to indicate incorrect usage.
Bash itself returns the exit status of the last command executed, unless a syntax error occurs, in which case it exits with a non-zero value. See also the exit builtin command below.
Non-builtin commands run by bash have signal handlers set to the values inherited by the shell from its parent. When job control is not in effect, asynchronous commands ignore SIGINT and SIGQUIT in addition to these inherited handlers. Commands run as a result of command substitution ignore the keyboard-generated job control signals SIGTTIN, SIGTTOU, and SIGTSTP.
The shell exits by default upon receipt of a SIGHUP. Before exiting, an interactive shell resends the SIGHUP to all jobs, running or stopped. Stopped jobs are sent SIGCONT to ensure that they receive the SIGHUP. To prevent the shell from sending the signal to a particular job, it should be removed from the jobs table with the disown builtin (see SHELL BUILTIN COMMANDS below) or marked to not receive SIGHUP using disown -h.
If the huponexit shell option has been set with shopt, bash sends a SIGHUP to all jobs when an interactive login shell exits.
If bash is waiting for a command to complete and receives a signal for which a trap has been set, the trap will not be executed until the command completes. When bash is waiting for an asynchronous command via the wait builtin, the reception of a signal for which a trap has been set will cause the wait builtin to return immediately with an exit status greater than 128, immediately after which the trap is executed.
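As a minimal sketch of a trap being run once the current command has completed, the shell below sends itself SIGUSR1. The use of BASHPID assumes bash 4.0 or later, with $$ as a fallback; the events variable is illustrative.

```bash
events=""
trap 'events="$events trap"' USR1

self=${BASHPID:-$$}           # BASHPID assumes bash 4.0 or later
kill -USR1 "$self"            # the signal arrives while the kill builtin runs
events="$events after-kill"   # the trap has already run by this point

trap - USR1                   # restore the default disposition
```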
The shell associates a job with each pipeline. It keeps a table of currently executing jobs, which may be listed with the jobs command. When bash starts a job asynchronously (in the background), it prints a line that looks like:
[1] 25647
indicating that this job is job number 1 and that the process ID of the last process in the pipeline associated with this job is 25647. All of the processes in a single pipeline are members of the same job. Bash uses the job abstraction as the basis for job control.
To facilitate the implementation of the user interface to job control, the operating system maintains the notion of a current terminal process group ID. Members of this process group (processes whose process group ID is equal to the current terminal process group ID) receive keyboard-generated signals such as SIGINT. These processes are said to be in the foreground. Background processes are those whose process group ID differs from the terminal's; such processes are immune to keyboard-generated signals. Only foreground processes are allowed to read from or write to the terminal. Background processes which attempt to read from (write to) the terminal are sent a SIGTTIN (SIGTTOU) signal by the terminal driver, which, unless caught, suspends the process.
If the operating system on which bash is running supports job control, bash contains facilities to use it. Typing the suspend character (typically ^Z, Control-Z) while a process is running causes that process to be stopped and returns control to bash. Typing the delayed suspend character (typically ^Y, Control-Y) causes the process to be stopped when it attempts to read input from the terminal, and control to be returned to bash. The user may then manipulate the state of this job, using the bg command to continue it in the background, the fg command to continue it in the foreground, or the kill command to kill it. A ^Z takes effect immediately, and has the additional side effect of causing pending output and typeahead to be discarded.
There are a number of ways to refer to a job in the shell. The character % introduces a job name. Job number n may be referred to as %n. A job may also be referred to using a prefix of the name used to start it, or using a substring that appears in its command line. For example, %ce refers to a stopped ce job. If a prefix matches more than one job, bash reports an error. Using %?ce, on the other hand, refers to any job containing the string ce in its command line. If the substring matches more than one job, bash reports an error. The symbols %% and %+ refer to the shell's notion of the current job, which is the last job stopped while it was in the foreground or started in the background. The previous job may be referenced using %-. In output pertaining to jobs (e.g., the output of the jobs command), the current job is always flagged with a +, and the previous job with a -. A single % (with no accompanying job specification) also refers to the current job.
Simply naming a job can be used to bring it into the foreground: %1 is a synonym for ``fg %1'', bringing job 1 from the background into the foreground. Similarly, ``%1 &'' resumes job 1 in the background, equivalent to ``bg %1''.
The shell learns immediately whenever a job changes state. Normally, bash waits until it is about to print a prompt before reporting changes in a job's status so as to not interrupt any other output. If the -b option to the set builtin command is enabled, bash reports such changes immediately. Any trap on SIGCHLD is executed for each child that exits.
If an attempt to exit bash is made while jobs are stopped, the shell prints a warning message. The jobs command may then be used to inspect their status. If a second attempt to exit is made without an intervening command, the shell does not print another warning, and the stopped jobs are terminated.
The command number and the history number are usually different: the history number of a command is its position in the history list, which may include commands restored from the history file (see HISTORY below), while the command number is the position in the sequence of commands executed during the current shell session. After the string is decoded, it is expanded via parameter expansion, command substitution, arithmetic expansion, and quote removal, subject to the value of the promptvars shell option (see the description of the shopt command under SHELL BUILTIN COMMANDS below).
In this section, the emacs-style notation is used to denote keystrokes. Control keys are denoted by C-key, e.g., C-n means Control-N. Similarly, meta keys are denoted by M-key, so M-x means Meta-X. (On keyboards without a meta key, M-x means ESC x, i.e., press the Escape key then the x key. This makes ESC the meta prefix. The combination M-C-x means ESC-Control-x, or press the Escape key then hold the Control key while pressing the x key.)
Readline commands may be given numeric arguments, which normally act as a repeat count. Sometimes, however, it is the sign of the argument that is significant. Passing a negative argument to a command that acts in the forward direction (e.g., kill-line) causes that command to act in a backward direction. Commands whose behavior with arguments deviates from this are noted below.
When a command is described as killing text, the text deleted is saved for possible future retrieval (yanking). The killed text is saved in a kill ring. Consecutive kills cause the text to be accumulated into one unit, which can be yanked all at once. Commands which do not kill text separate the chunks of text on the kill ring.
Readline is customized by putting commands in an initialization file (the inputrc file). The name of this file is taken from the value of the INPUTRC variable. If that variable is unset, the default is ~/.inputrc. When a program which uses the readline library starts up, the initialization file is read, and the key bindings and variables are set. There are only a few basic constructs allowed in the readline initialization file. Blank lines are ignored. Lines beginning with a # are comments. Lines beginning with a $ indicate conditional constructs. Other lines denote key bindings and variable settings.
The default key-bindings may be changed with an inputrc file. Other programs that use this library may add their own commands and bindings.
For example, placing
M-Control-u: universal-argument
into the inputrc would make M-C-u execute the readline command universal-argument.
The following symbolic character names are recognized: RUBOUT, DEL, ESC, LFD, NEWLINE, RET, RETURN, SPC, SPACE, and TAB.
In addition to command names, readline allows keys to be bound to a string that is inserted when the key is pressed (a macro).
The syntax for controlling key bindings in the inputrc file is simple. All that is required is the name of the command or the text of a macro and a key sequence to which it should be bound. The name may be specified in one of two ways: as a symbolic key name, possibly with Meta- or Control- prefixes, or as a key sequence.
When using the form keyname:function-name or macro, keyname is the name of a key spelled out in English. For example:
In the above example, C-u is bound to the function universal-argument, M-DEL is bound to the function backward-kill-word, and C-o is bound to run the macro expressed on the right hand side (that is, to insert the text ``> output'' into the line).
In the second form, "keyseq":function-name or macro, keyseq differs from keyname above in that strings denoting an entire key sequence may be specified by placing the sequence within double quotes. Some GNU Emacs style key escapes can be used, as in the following example, but the symbolic character names are not recognized.
In this example, C-u is again bound to the function universal-argument. C-x C-r is bound to the function re-read-init-file, and ESC [ 1 1 ~ is bound to insert the text ``Function Key 1''.
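The two binding forms described above can be collected into a sample initialization file. The sketch below writes one to a temporary file from the shell. The temporary file and the variable names are assumptions for illustration, and the bindings are reconstructed from the descriptions in the text rather than taken from any particular installation.

```bash
sample_inputrc=$(mktemp)

# The quoted delimiter keeps the shell from interpreting the backslashes.
cat > "$sample_inputrc" <<'EOF'
# keyname form
Control-u: universal-argument
Meta-Rubout: backward-kill-word
Control-o: "> output"

# "keyseq" form
"\C-u": universal-argument
"\C-x\C-r": re-read-init-file
"\e[11~": "Function Key 1"
EOF

# Count the binding lines (every binding contains a colon).
binding_count=$(grep -c ':' "$sample_inputrc")
```

A file like this could then be tried by starting a program that uses readline with INPUTRC set to its path; the file is left in place for that purpose and can be removed afterwards.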
The full set of GNU Emacs style escape sequences is
In addition to the GNU Emacs style escape sequences, a second set of backslash escapes is available:
When entering the text of a macro, single or double quotes must be used to indicate a macro definition. Unquoted text is assumed to be a function name. In the macro body, the backslash escapes described above are expanded. Backslash will quote any other character in the macro text, including " and '.
Bash allows the current readline key bindings to be displayed or modified with the bind builtin command. The editing mode may be switched during interactive use by using the -o option to the set builtin command (see SHELL BUILTIN COMMANDS below).
Readline has variables that can be used to further customize its behavior. A variable may be set in the inputrc file with a statement of the form
set variable-name value
Except where noted, readline variables can take the values On or Off (without regard to case). Unrecognized variable names are ignored. When a variable value is read, empty or null values, "on" (case-insensitive), and "1" are equivalent to On. All other values are equivalent to Off. The variables and their default values are:
Readline implements a facility similar in spirit to the conditional compilation features of the C preprocessor which allows key bindings and variable settings to be performed as the result of tests. There are four parser directives used.
$if Bash
# Quote the current or previous word
"\C-xq": "\eb\"\ef\""
$endif
$include /etc/inputrc
Readline provides commands for searching through the command history (see HISTORY below) for lines containing a specified string. There are two search modes: incremental and non-incremental.
Incremental searches begin before the user has finished typing the search string. As each character of the search string is typed, readline displays the next entry from the history matching the string typed so far. An incremental search requires only as many characters as needed to find the desired history entry. The characters present in the value of the isearch-terminators variable are used to terminate an incremental search. If that variable has not been assigned a value the Escape and Control-J characters will terminate an incremental search. Control-G will abort an incremental search and restore the original line. When the search is terminated, the history entry containing the search string becomes the current line.
To find other matching entries in the history list, type Control-S or Control-R as appropriate. This will search backward or forward in the history for the next entry matching the search string typed so far. Any other key sequence bound to a readline command will terminate the search and execute that command. For instance, a newline will terminate the search and accept the line, thereby executing the command from the history list.
Readline remembers the last incremental search string. If two Control-Rs are typed without any intervening characters defining a new search string, any remembered search string is used.
Non-incremental searches read the entire search string before starting to search for matching history lines. The search string may be typed by the user or be part of the contents of the current line.
The following is a list of the names of the commands and the default key sequences to which they are bound. Command names without an accompanying key sequence are unbound by default. In the following descriptions, point refers to the current cursor position, and mark refers to a cursor position saved by the set-mark command. The text between the point and mark is referred to as the region.
When word completion is attempted for an argument to a command for which a completion specification (a compspec) has been defined using the complete builtin (see SHELL BUILTIN COMMANDS below), the programmable completion facilities are invoked.
First, the command name is identified. If a compspec has been defined for that command, the compspec is used to generate the list of possible completions for the word. If the command word is a full pathname, a compspec for the full pathname is searched for first. If no compspec is found for the full pathname, an attempt is made to find a compspec for the portion following the final slash.
Once a compspec has been found, it is used to generate the list of matching words. If a compspec is not found, the default bash completion as described above under Completing is performed.
First, the actions specified by the compspec are used. Only matches which are prefixed by the word being completed are returned. When the -f or -d option is used for filename or directory name completion, the shell variable FIGNORE is used to filter the matches.
Any completions specified by a filename expansion pattern to the -G option are generated next. The words generated by the pattern need not match the word being completed. The GLOBIGNORE shell variable is not used to filter the matches, but the FIGNORE variable is used.
Next, the string specified as the argument to the -W option is considered. The string is first split using the characters in the IFS special variable as delimiters. Shell quoting is honored. Each word is then expanded using brace expansion, tilde expansion, parameter and variable expansion, command substitution, and arithmetic expansion, as described above under EXPANSION. The results are split using the rules described above under Word Splitting. The results of the expansion are prefix-matched against the word being completed, and the matching words become the possible completions.
After these matches have been generated, any shell function or command specified with the -F and -C options is invoked. When the command or function is invoked, the COMP_LINE and COMP_POINT variables are assigned values as described above under Shell Variables. If a shell function is being invoked, the COMP_WORDS and COMP_CWORD variables are also set. When the function or command is invoked, the first argument is the name of the command whose arguments are being completed, the second argument is the word being completed, and the third argument is the word preceding the word being completed on the current command line. No filtering of the generated completions against the word being completed is performed; the function or command has complete freedom in generating the matches.
Any function specified with -F is invoked first. The function may use any of the shell facilities, including the compgen builtin described below, to generate the matches. It must put the possible completions in the COMPREPLY array variable.
Next, any command specified with the -C option is invoked in an environment equivalent to command substitution. It should print a list of completions, one per line, to the standard output. Backslash may be used to escape a newline, if necessary.
After all of the possible completions are generated, any filter specified with the -X option is applied to the list. The filter is a pattern as used for pathname expansion; a & in the pattern is replaced with the text of the word being completed. A literal & may be escaped with a backslash; the backslash is removed before attempting a match. Any completion that matches the pattern will be removed from the list. A leading ! negates the pattern; in this case any completion not matching the pattern will be removed.
Finally, any prefix and suffix specified with the -P and -S options are added to each member of the completion list, and the result is returned to the readline completion code as the list of possible completions.
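A completion function of the kind invoked with -F might look like the sketch below. The command name mytool, its word list, and the function name are hypothetical. Because no filtering is applied to what the function returns, the sketch uses compgen -W to prefix-match the candidates itself, and the final lines call the function by hand with the three arguments described above.

```bash
# Hypothetical completion function for an imaginary command named mytool.
_mytool_complete() {
    local cur=$2        # second argument: the word being completed
    # compgen -W prefix-matches the word list against cur.
    COMPREPLY=( $(compgen -W "start status stop" -- "$cur") )
}
complete -F _mytool_complete mytool

# Simulate the call made for a command line such as: mytool sta
COMPREPLY=()
_mytool_complete mytool sta mytool
match_count=${#COMPREPLY[@]}
```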
If the previously-applied actions do not generate any matches, and the -o dirnames option was supplied to complete when the compspec was defined, directory name completion is attempted.
If the -o plusdirs option was supplied to complete when the compspec was defined, directory name completion is attempted and any matches are added to the results of the other actions.
By default, if a compspec is found, whatever it generates is returned to the completion code as the full set of possible completions. The default bash completions are not attempted, and the readline default of filename completion is disabled. If the -o bashdefault option was supplied to complete when the compspec was defined, the bash default completions are attempted if the compspec generates no matches. If the -o default option was supplied to complete when the compspec was defined, readline's default completion will be performed if the compspec (and, if attempted, the default bash completions) generate no matches.
When a compspec indicates that directory name completion is desired, the programmable completion functions force readline to append a slash to completed names which are symbolic links to directories, subject to the value of the mark-directories readline variable, regardless of the setting of the mark-symlinked-directories readline variable.
On startup, the history is initialized from the file named by the variable HISTFILE (default ~/.bash_history). The file named by the value of HISTFILE is truncated, if necessary, to contain no more than the number of lines specified by the value of HISTFILESIZE. When an interactive shell exits, the last $HISTSIZE lines are copied from the history list to $HISTFILE. If the histappend shell option is enabled (see the description of shopt under SHELL BUILTIN COMMANDS below), the lines are appended to the history file, otherwise the history file is overwritten. If HISTFILE is unset, or if the history file is unwritable, the history is not saved. After saving the history, the history file is truncated to contain no more than HISTFILESIZE lines. If HISTFILESIZE is not set, no truncation is performed.
The builtin command fc (see SHELL BUILTIN COMMANDS below) may be used to list or edit and re-execute a portion of the history list. The history builtin may be used to display or modify the history list and manipulate the history file. When using command-line editing, search commands are available in each editing mode that provide access to the history list.
The shell allows control over which commands are saved on the history list. The HISTCONTROL and HISTIGNORE variables may be set to cause the shell to save only a subset of the commands entered. The cmdhist shell option, if enabled, causes the shell to attempt to save each line of a multi-line command in the same history entry, adding semicolons where necessary to preserve syntactic correctness. The lithist shell option causes the shell to save the command with embedded newlines instead of semicolons. See the description of the shopt builtin below under SHELL BUILTIN COMMANDS for information on setting and unsetting shell options.
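These settings are typically placed in a startup file. The fragment below is only a sketch: the value ignoredups and the colon-separated patterns are example choices, not defaults.

```bash
HISTCONTROL=ignoredups     # example value: skip a line equal to the previous entry
HISTIGNORE='ls:bg:fg'      # example patterns, separated by colons

shopt -s cmdhist           # keep a multi-line command in one history entry
shopt -s histappend        # append to the history file instead of overwriting it
```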
The shell supports a history expansion feature that is similar to the history expansion in csh. History expansion is enabled by default for interactive shells, and can be disabled using the +H option to the set builtin command (see SHELL BUILTIN COMMANDS below). Non-interactive shells do not perform history expansion by default. This section describes the syntax features that are available.
History expansions introduce words from the history list into the input stream, making it easy to repeat commands, insert the arguments to a previous command into the current input line, or fix errors in previous commands quickly.
History expansion is performed immediately after a complete line is read, before the shell breaks it into words. It takes place in two parts. The first is to determine which line from the history list to use during substitution. The second is to select portions of that line for inclusion into the current one. The line selected from the history is the event, and the portions of that line that are acted upon are words. Various modifiers are available to manipulate the selected words. The line is broken into words in the same fashion as when reading input, so that several metacharacter-separated words surrounded by quotes are considered one word. History expansions are introduced by the appearance of the history expansion character, which is ! by default. Only backslash (\) and single quotes can quote the history expansion character.
Several characters inhibit history expansion if found immediately following the history expansion character, even if it is unquoted: space, tab, newline, carriage return, and =. If the extglob shell option is enabled, ( will also inhibit expansion.
Several shell options settable with the shopt builtin may be used to tailor the behavior of history expansion. If the histverify shell option is enabled (see the description of the shopt builtin), and readline is being used, history substitutions are not immediately passed to the shell parser. Instead, the expanded line is reloaded into the readline editing buffer for further modification. If readline is being used, and the histreedit shell option is enabled, a failed history substitution will be reloaded into the readline editing buffer for correction. The -p option to the history builtin command may be used to see what a history expansion will do before using it. The -s option to the history builtin may be used to add commands to the end of the history list without actually executing them, so that they are available for subsequent recall.
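As a rough illustration of the -s and -p options, the sketch below places a command on the history list without running it and then prints what several expansions would produce. The preview function and the tar command line are hypothetical; the expansions are passed in single quotes so that an interactive shell does not expand them before history -p sees them.

```bash
# Hypothetical helper: record a command without executing it (history -s),
# then print the result of one history expansion (history -p).
preview() {
    history -s 'tar czf backup.tgz docs src'   # made-up command line
    history -p "$1"
}

preview '!!'       # the whole previous event
preview '!!:0'     # word 0, the command name
preview '!!:2-3'   # words 2 through 3
preview '!$'       # last word; the : may be omitted before $
```

Words are numbered from 0, so in this sketch word 0 is tar and the last word is src.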
The shell allows control of the various characters used by the history expansion mechanism (see the description of histchars above under Shell Variables).
An event designator is a reference to a command line entry in the history list.
Word designators are used to select desired words from the event. A : separates the event specification from the word designator. It may be omitted if the word designator begins with a ^, $, *, -, or %. Words are numbered from the beginning of the line, with the first word being denoted by 0 (zero). Words are inserted into the current line separated by single spaces.
If a word designator is supplied without an event specification, the previous command is used as the event.
After the optional word designator, there may appear a sequence of one or more of the following modifiers, each preceded by a `:'.
Unless otherwise noted, each builtin command documented in this section as accepting options preceded by - accepts -- to signify the end of the options. The :, true, false, and test builtins do not accept options and do not treat -- specially.
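A short sketch of why -- matters: when the first operand itself begins with a dash, the builtin would otherwise try to parse it as an option. The banner function is a hypothetical name used only for this example.

```bash
# Hypothetical helper: the format string starts with dashes, so -- is used
# to tell printf that no more options follow.
banner() {
    printf -- '--- %s ---\n' "$1"
}

banner "build"

# Without --, the leading dashes are parsed as an option and printf fails.
printf '--- %s ---\n' "build" 2>/dev/null || echo "printf failed without --"
```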
The return value is 0 unless an unrecognized option is given or an error occurred.
The matches will be generated in the same way as if the programmable completion code had generated them directly from a completion specification with the same flags. If word is specified, only those completions matching word will be displayed.
The return value is true unless an invalid option is supplied, or no matches were generated.
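The following sketch shows compgen generating matches from a word list and restricting them to those matching a given word. The list_matches function and its word list are assumptions made for the example.

```bash
# Hypothetical helper: print the words from a fixed list that match $1.
list_matches() {
    compgen -W "start stop status restart" -- "$1"
}

list_matches st     # prints the words beginning with "st", one per line

# When nothing matches, compgen returns a non-zero status.
list_matches zzz || echo "no matches"
```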
The process of applying these completion specifications when word completion is attempted is described above under Programmable Completion.
Other options, if specified, have the following meanings. The arguments to the -G, -W, and -X options (and, if necessary, the -P and -S options) should be quoted to protect them from expansion before the complete builtin is invoked.
The return value is true unless an invalid option is supplied, an option other than -p or -r is supplied without a name argument, an attempt is made to remove a completion specification for a name for which no specification exists, or an error occurs adding a completion specification.
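A minimal sketch of adding, printing, and removing a completion specification, and of the failing return value when no specification exists. The command name myservice is hypothetical.

```bash
# myservice is a made-up command name used only for illustration.
# The word list is quoted so it reaches complete as a single argument.
complete -W 'start stop status' myservice

complete -p myservice     # print the specification just added
complete -r myservice     # remove it

# Removing a specification that no longer exists returns false.
complete -r myservice 2>/dev/null || echo "no specification left to remove"
```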
Using `+' instead of `-' turns off the attribute instead, with the exception that +a may not be used to destroy an array variable. When used in a function, makes each name local, as with the local command. If a variable name is followed by =value, the value of the variable is set to value. The return value is 0 unless an invalid option is encountered, an attempt is made to define a function using ``-f foo=bar'', an attempt is made to assign a value to a readonly variable, an attempt is made to assign a value to an array variable without using the compound assignment syntax (see Arrays above), one of the names is not a valid shell variable name, an attempt is made to turn off readonly status for a readonly variable, an attempt is made to turn off array status for an array variable, or an attempt is made to display a non-existent function with -f.
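The behavior described above can be sketched as follows, using the integer (-i) and readonly (-r) attributes as examples. All function and variable names here are hypothetical.

```bash
# Turning an attribute on with - and off again with +.
attr_demo() {
    declare -i total=2+3     # integer attribute: the value is evaluated
    echo "$total"
    declare +i total         # +i removes the integer attribute
    total=2+3                # now stored as a plain string
    echo "$total"
}

# Inside a function, declare makes the name local.
scope_demo() {
    declare inner="set inside the function"
}

# Assigning to a readonly variable makes declare return non-zero.
readonly_demo() {
    declare -r fixed=1
    declare fixed=2 2>/dev/null
    echo "status=$?"
}

attr_demo
scope_demo
echo "inner is ${inner-unset} outside the function"
readonly_demo
```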
The return value is 0 unless an invalid option is supplied or n indexes beyond the end of the directory stack.
The -n option suppresses the command numbers when listing. The -r option reverses the order of the commands. If the -l option is given, the commands are listed on standard output. Otherwise, the editor given by ename is invoked on a file containing those commands. If ename is not given, the value of the FCEDIT variable is used, and the value of EDITOR if FCEDIT is not set. If neither variable is set, vi is used. When editing is complete, the edited commands are echoed and executed.
In the second form, command is re-executed after each instance of pat is replaced by rep. A useful alias to use with this is ``r="fc -s"'', so that typing ``r cc'' runs the last command beginning with ``cc'' and typing ``r'' re-executes the last command.
If the first form is used, the return value is 0 unless an invalid option is encountered or first or last specify history lines out of range. If the -e option is supplied, the return value is the value of the last command executed or failure if an error occurs with the temporary file of commands. If the second form is used, the return status is that of the command re-executed, unless cmd does not specify a valid history line, in which case fc returns failure.
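The editor fallback described for fc can be approximated with nested parameter expansion, as in the sketch below. The fc_default_editor function is a hypothetical name; note that the :- form also falls through when a variable is set but empty, which is slightly broader than "not set". The alias is the one suggested in the text above.

```bash
# Hypothetical helper: print the editor fc would use when -e ename is not
# given: FCEDIT, then EDITOR, then vi.
fc_default_editor() {
    printf '%s\n' "${FCEDIT:-${EDITOR:-vi}}"
}

fc_default_editor

# The alias from the text: "r cc" re-runs the last command beginning
# with "cc", and "r" alone re-executes the last command.
alias r='fc -s'
```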
When the end of options is encountered, getopts exits with a return value greater than zero. OPTIND is set to the index of the first non-option argument, and name is set to ?.
getopts normally parses the positional parameters, but if more arguments are given in args, getopts parses those instead.
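As a small sketch of passing explicit arguments, the function below parses an array instead of the positional parameters. The array contents, the option letters, and the function name are assumptions made for the example.

```bash
# Hypothetical argument list held in an array rather than in "$@".
sample_args=(-v -o out.txt input.txt)

parse_sample() {
    local OPTIND=1 opt verbose=0 outfile=
    # The extra arguments after the variable name are parsed in place of
    # the positional parameters.
    while getopts "vo:" opt "${sample_args[@]}"; do
        case $opt in
            v) verbose=1 ;;
            o) outfile=$OPTARG ;;
        esac
    done
    # OPTIND counts from 1, so array index OPTIND-1 is the first
    # non-option argument.
    printf 'verbose=%s outfile=%s rest=%s\n' \
        "$verbose" "$outfile" "${sample_args[*]:OPTIND-1}"
}

parse_sample
```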
getopts can report errors in two ways. If the first character of optstring is a colon, silent error reporting is used. In normal operation diagnostic messages are printed when invalid options or missing option arguments are encountered. If the variable OPTERR is set to 0, no error messages will be displayed, even if the first character of optstring is not a colon.
If an invalid option is seen, getopts places ? into name and, if not silent, prints an error message and unsets OPTARG. If getopts is silent, the option character found is placed in OPTARG and no diagnostic message is printed.
If a required argument is not found, and getopts is not silent, a question mark (?) is placed in name, OPTARG is unset, and a diagnostic message is printed. If getopts is silent, then a colon (:) is placed in name and OPTARG is set to the option character found.
getopts returns true if an option, specified or unspecified, is found. It returns false if the end of options is encountered or an error occurs.
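The error-reporting rules above can be sketched with a loop that uses silent mode (a leading colon in optstring) and handles both error cases itself. The option letters and the parse_opts name are hypothetical.

```bash
# Hypothetical parser: -a is a flag, -f takes an argument.
# The leading colon in ":af:" selects silent error reporting.
parse_opts() {
    local OPTIND=1 opt
    while getopts ":af:" opt; do
        case $opt in
            a)  echo "flag a" ;;
            f)  echo "file $OPTARG" ;;
            :)  echo "missing argument for -$OPTARG" ;;   # required argument absent
            \?) echo "invalid option -$OPTARG" ;;         # option not in optstring
        esac
    done
    shift $((OPTIND - 1))        # drop the options that were parsed
    echo "remaining: $*"
}

parse_opts -a -f notes.txt -z rest
parse_opts -f
```

In silent mode the offending option character is available in OPTARG, which lets the script print its own diagnostics instead of the default ones.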
If the HISTTIMEFORMAT variable is set, the time stamp information associated with each history entry is written to the history file. The return value is 0 unless an invalid option is encountered, an error occurs while reading or writing the history file, an invalid offset is supplied as an argument to -d, or the history expansion supplied as an argument to -p fails.
If jobspec is given, output is restricted to information about that job. The return status is 0 unless an invalid option is encountered or an invalid jobspec is supplied.
If the -x option is supplied, jobs replaces any jobspec found in command or args with the corresponding process group ID, and executes command passing it args, returning its exit status.