- <html>
- <head>
- <title>mozilla cross-reference: search help</title>
- <link rel='stylesheet' title='' href='style/style.css' type='text/css'>
- </head>
- <body bgcolor="#ffffff" text="#000000"
- link="#0000EE" vlink="#551a8b" alink="#ff0000">
- <table bgcolor="#000000" width="100%" border=0 cellpadding=0 cellspacing=0>
- <tr><td><a href="//www.mozilla.org/"><img
- src="//www.mozilla.org/images/mozilla-banner.gif" alt=""
- border=0 width=600 height=58></a></td></tr></table>
- <p>
- <table class=desc>
- <tr><td>
- <h1 align=center>search help<br>
- <font size=3>
- for the<br>
- <a href="./"><i>mozilla cross-reference</i></a>
- </font></h1>
- </td></tr></table>
- <p>
- <blockquote><blockquote>
- <i>
- This text is derived from the Glimpse manual page.
- For more information on glimpse, see the
- <a href="http://webglimpse.net/">Glimpse homepage</a>.
- </i>
- </blockquote></blockquote>
- <a name="Patterns"></a><h2>Patterns</h2>
- <ul>
- Glimpse supports a large variety of patterns, including simple
- strings, strings with classes of characters, sets of strings,
- wild cards, and regular expressions (see <a href="#Limitations">Limitations</a>).
- </ul>
- <p> <h3>Strings</h3>
- <ul>
- A string is any sequence of characters, including the special symbols
- `^' for beginning of line and `$' for end of line. The following
- special characters (`$', `^', `*', `[', `|', `(', `)', `!', and
- `\') as well as the following meta characters special to glimpse (and
- agrep): `;', `,', `#', `&gt;', `&lt;', `-', and `.', should be preceded by
- `\' if they are to be matched as regular characters. For example,
- \^abc\\ corresponds to the string ^abc\, whereas ^abc corresponds
- to the string abc at the beginning of a line.
- </ul>
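- <p>
- The escaping rule above can be sketched in Python. This is only an
- illustration: <code>escape_literal</code> and <code>SPECIAL_CHARS</code> are
- hypothetical names, not part of glimpse, and the character set is simply
- the one listed in the paragraph above.

```python
# Characters the text lists as needing a backslash to be matched literally.
# Assumption: this set is taken from the help text, not from glimpse's source.
SPECIAL_CHARS = set("$^*[|()!\\" + ";,#><-.")


def escape_literal(text: str) -> str:
    """Prefix every special character with a backslash (hypothetical helper)."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


if __name__ == "__main__":
    # A caret followed by abc and one backslash gets both specials escaped.
    print(escape_literal("^abc\\"))
    print(escape_literal("ab.de"))
```

- <p>
- Passing the result as the search pattern makes every character match
- itself rather than act as an operator.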
- <p> <h3>Classes of characters</h3>
- <ul>
- A list of characters inside [] (in order) corresponds to any character
- from the list. For example, [a-ho-z] is any character between a and h
- or between o and z. The symbol `^' inside [] complements the list.
- For example, [^i-n] denotes any character in the character set except
- the characters `i' to `n'.
- The symbol `^' thus has two meanings, but this is consistent with
- egrep.
- The symbol `.' (don't care) stands for any symbol (except for the
- newline symbol).
- </ul>
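- <p>
- Since these classes follow egrep conventions, Python's <code>re</code> module
- can serve as a rough stand-in for trying them out. The helper
- <code>matches_class</code> is a hypothetical name, and Python's behaviour is
- assumed, not guaranteed, to agree with glimpse for these simple cases.

```python
import re


def matches_class(char_class: str, ch: str) -> bool:
    """Return True if the single character ch is matched by char_class."""
    return re.fullmatch(char_class, ch) is not None


if __name__ == "__main__":
    for pattern, ch in [("[a-ho-z]", "c"), ("[a-ho-z]", "k"),
                        ("[^i-n]", "a"), ("[^i-n]", "k"),
                        (".", "x"), (".", "\n")]:
        print(repr(pattern), repr(ch), matches_class(pattern, ch))
```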
- <p> <h3>Boolean operations</h3>
- <ul>
- Glimpse supports an `AND' operation denoted by the symbol `;',
- an `OR' operation denoted by the symbol `,',
- a limited version of a `NOT' operation (starting at version 4.0B1)
- denoted by the symbol `~',
- or any combination.
- For example, `pizza;cheeseburger' will output all lines containing
- both patterns, and
- `{political,computer};science' will match `political science'
- or `science of computers'.
- </ul>
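- <p>
- The AND and OR semantics can be illustrated with a small line filter.
- This is a minimal sketch, not glimpse's parser: it handles only
- <code>;</code>, <code>,</code> and one level of braces, uses plain substring
- tests, ignores `~', and assumes that `,' binds more tightly than `;'
- when no braces are given. The function names are hypothetical.

```python
from typing import List


def parse_pattern(pattern: str) -> List[List[str]]:
    """Split 'a;{b,c}' into AND-ed groups of OR-ed alternatives."""
    groups = []
    for term in pattern.split(";"):
        term = term.strip()
        # Braces only group alternatives; drop them before splitting on ','.
        if term.startswith("{") and term.endswith("}"):
            term = term[1:-1]
        groups.append([alt for alt in term.split(",") if alt])
    return groups


def line_matches(pattern: str, line: str) -> bool:
    """True if every AND group has at least one alternative in the line."""
    return all(any(alt in line for alt in group)
               for group in parse_pattern(pattern))


if __name__ == "__main__":
    print(line_matches("{political,computer};science", "political science"))
    print(line_matches("pizza;cheeseburger", "pizza only"))
```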
- <p><h3>Wild cards</h3>
- <ul>
- The symbol '#' is used to denote a sequence
- of any number (including 0)
- of arbitrary characters (see <a href="#Limitations">Limitations</a>).
- The symbol # is equivalent to .* in egrep.
- In fact, .* will work too, because it is a valid regular expression
- (see below), but unless this is part of an actual regular expression,
- # will work faster.
- (Currently glimpse is experiencing some problems with #.)
- </ul>
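- <p>
- The stated equivalence between # and .* can be shown by translating a
- wild-card pattern into a Python regular expression. This is a sketch
- under the assumption that every character other than # is meant
- literally; <code>wildcard_to_regex</code> is a hypothetical helper.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Translate '#' into '.*' and escape everything else as literal text."""
    literal_parts = [re.escape(part) for part in pattern.split("#")]
    return ".*".join(literal_parts)


if __name__ == "__main__":
    regex = wildcard_to_regex("abc#xyz")
    print(regex)
    print(bool(re.search(regex, "abc123xyz")))
    print(bool(re.search(regex, "abxyz")))
```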
- <p><h3>Combination of exact and approximate matching</h3>
- <ul>
- Any pattern inside angle brackets &lt;&gt; must match the text exactly even
- if the match is with errors. For example, &lt;mathemat&gt;ics matches
- mathematical with one error (replacing the last s with an a), but
- mathe&lt;matics&gt; does not match mathematical no matter how many errors are
- allowed. (This option is buggy at the moment.)
- </ul>
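- <p>
- One way to picture this rule is the sketch below. It is not glimpse's
- or agrep's algorithm: it assumes the pattern has already been split into
- the text before the brackets, the exact part, and the text after them,
- and it counts errors with ordinary edit distance. All function names are
- hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # delete from a
                           cur[j - 1] + 1,       # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def fewest_errors(part: str, text: str, at_start: bool) -> int:
    """Fewest edits turning part into some prefix (or suffix) of text."""
    if at_start:
        candidates = [text[:n] for n in range(len(text) + 1)]
    else:
        candidates = [text[n:] for n in range(len(text) + 1)]
    return min(edit_distance(part, c) for c in candidates)


def matches(before: str, exact: str, after: str,
            text: str, max_errors: int) -> bool:
    """The exact part must occur verbatim; only the rest may contain errors."""
    start = text.find(exact)
    while start != -1:
        end = start + len(exact)
        errors = (fewest_errors(before, text[:start], at_start=False)
                  + fewest_errors(after, text[end:], at_start=True))
        if errors <= max_errors:
            return True
        start = text.find(exact, start + 1)
    return False


if __name__ == "__main__":
    print(matches("", "mathemat", "ics", "mathematical", 1))
    print(matches("mathe", "matics", "", "mathematical", 8))
```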
- <h3>Regular expressions</h3>
- <ul>
- Since the index is word based, a regular expression must match words
- that appear in the index for glimpse to find it. Glimpse first strips
- all non-alphabetic characters from the regular expression and
- searches the index for all remaining words. It then applies the
- regular expression matching algorithm to the files found in the index.
- For example, glimpse 'abc.*xyz' will search the index for all files
- that contain both 'abc' and 'xyz', and then search directly for
- 'abc.*xyz' in those files. (If you use glimpse -w 'abc.*xyz', then
- 'abcxyz' will not be found, because glimpse will assume that abc and
- xyz must each match a whole word.) The syntax of regular
- expressions in glimpse is in general the same as that for agrep. The
- union operation `|', Kleene closure `*', and parentheses () are all
- supported. Currently '+' is not supported. Regular expressions are
- currently limited to approximately 30 characters (generally excluding
- meta characters). The maximal number of errors
- for regular expressions that use '*' or '|' is 4.
- </ul>
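- <p>
- The two-stage lookup described above can be sketched as follows. This
- is an illustration only: the "index" is an in-memory dictionary of file
- contents, the index lookup is approximated by a substring test, and
- Python's <code>re</code> stands in for agrep. The function names and file
- names are hypothetical.

```python
import re
from typing import Dict, List


def index_words(regex: str) -> List[str]:
    """Strip non-alphabetic characters and return the remaining words."""
    return re.findall(r"[A-Za-z]+", regex)


def two_stage_search(regex: str, files: Dict[str, str]) -> List[str]:
    """Filter files by index words first, then run the full regex."""
    words = index_words(regex)
    # Stage 1: keep only files that contain every remaining word.
    candidates = [name for name, text in files.items()
                  if all(word in text for word in words)]
    # Stage 2: apply the real regular expression line by line.
    compiled = re.compile(regex)
    return [name for name in candidates
            if any(compiled.search(line)
                   for line in files[name].splitlines())]


if __name__ == "__main__":
    sample = {
        "a.txt": "abc then xyz",
        "b.txt": "xyz then abc",
        "c.txt": "abc only",
    }
    print(index_words("abc.*xyz"))
    print(two_stage_search("abc.*xyz", sample))
```

- <p>
- In this sketch a file that contains both words in the wrong order passes
- the first stage but is rejected by the second, which is why the first
- stage alone is not sufficient.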
- <a name="Limitations"></a><h2>Limitations</h2>
- <ul>
- The index of glimpse is word based. A pattern that contains more than
- one word cannot be found in the index. The way glimpse overcomes this
- weakness is by splitting any multi-word pattern into its set of words
- and looking for all of them in the index.
- For example, <i>'linear programming'</i> will first consult the index
- to find all files containing both <i>linear</i> and <i>programming</i>,
- and then apply agrep to find the combined pattern.
- This is usually an effective solution, but it can be slow for
- cases where both words are very common, but their combination is not.
- <p>
- As was mentioned in the section on <a href="#Patterns">Patterns</a> above, some characters
- serve as meta characters for glimpse and need to be
- preceded by '\' to search for them. The most common
- examples are the characters '.' (which stands for a wild card),
- and '*' (the Kleene closure).
- So, "glimpse ab.de" will match abcde, but "glimpse ab\.de"
- will not, and "glimpse ab*de" will not match ab*de, but
- "glimpse ab\*de" will.
- The meta character - is translated automatically to a hyphen
- unless it appears between [] (in which case it denotes a range of
- characters).
- <p>
- Search patterns are limited to 29 characters.
- Lines are limited to 1024 characters.
- </ul>
- <p>
- <hr>
- <address>
- <a href="mailto:lxr@linux.no">
- Arne Georg Gleditsch and Per Kristian Gjermshus</a>
- </address>
- </body>
- </html>