Regular Expressions

  1. What the heck are regular expressions ?
  2. When would I use them ?
  3. Where can I find more information about them ?
  4. How are Gravity's Regexs different from other packages ?
  5. Using Regexs in Rules
  6. Using Regexs in Display Filters
  7. Important User Pointers
  8. How to Test Regular Expressions
  9. Examples
  10. Tutorial for beginners - Class 100
  11. Tutorial continued - Class 101 Draft Version

What the heck are regular expressions ?

Regular expressions are odd-looking but powerful expressions that match patterns in text strings. In Gravity's case, the string is either the Subject line, the From line, the date, Message ID or text within the article body. Note that the string may contain spaces, which count as characters. You may use regexs in Gravity's rules, display filters, scoring, and search window.

When would I use them ?

Use them when you want to:
  • specify upper or lower case for the target string
  • specify the position of the target within the string
  • match a range or an exact count of characters
  • use wild cards or optional phrases
  • match a pattern rather than a fixed word or words

Where can I find more information about them ?

The "official" manual is in Gravity's on-line help. Look for "reg" and you will find the topics on working with regexs and the Regular Expression reference. This is the only "official" documentation there is.

Regexs are common in UNIX/Linux applications. If you have access to a shell you can read the man pages (or the info pages) for UNIX tools like ed, grep or perl. JavaScript regexs are modeled closely on Perl's, so if you have a JavaScript reference you may find some ideas there. Note that other implementations will not match Gravity's syntax exactly, but the basics should be almost the same. The best way to learn is to interpret examples and try some.

You can find a few tutorials on the web, but most cover the same things. If you are just starting out, I wrote a two-part tutorial for you. It is very basic, and advanced users may wish to skip it entirely. There are also examples listed on the examples page.

How are Gravity's Regexs different from other packages?

Gravity's regex package is REGX PLUS: Regular Expression Search and Replace Routines Copyright 1989, 1990, 1991, 1993 by English Knowledge Systems, Inc. All Rights Reserved. (Version 3.1 I think).

The main difference between Gravity and Perl-type expressions is that Gravity's expressions are not case-sensitive unless used within brackets.
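
As a rough illustration of this rule outside Gravity, the Python sketch below rewrites a pattern so that letters outside brackets match either case, while bracketed text is left exactly as typed. The function name emulate_gravity_case is hypothetical, Python's re module only stands in for Gravity's engine, and escaped or nested brackets are deliberately not handled.

```python
import re

def emulate_gravity_case(pattern: str) -> str:
    """Hypothetical helper: mimic 'case-insensitive unless in brackets'."""
    out = []
    in_brackets = False
    for ch in pattern:
        if in_brackets:
            out.append(ch)              # bracket contents stay exactly as typed
            if ch == "]":
                in_brackets = False
        elif ch == "[":
            in_brackets = True
            out.append(ch)
        elif ch.isascii() and ch.isalpha():
            out.append(f"[{ch.lower()}{ch.upper()}]")  # either case matches
        else:
            out.append(ch)
    return "".join(out)

# Show which spellings each translated pattern accepts.
for gravity_pattern in ["tom", "[T]om", "[T][O][M]"]:
    python_pattern = emulate_gravity_case(gravity_pattern)
    hits = [s for s in ["tom", "Tom", "TOM"] if re.search(python_pattern, s)]
    print(gravity_pattern, "->", python_pattern, "matches", hits)
```

Going the other way, a Perl user who wants a case-sensitive match in Gravity would put each letter that matters in its own brackets.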

Gravity uses simple, ordinary regular expressions. Things like boundaries (\b) and pre-defined character abbreviations (\w, \W, \d ...) like you find in Perl are not supported. This is usually no big deal. We can construct our own; it just takes a little more typing.
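
For instance, the Perl shorthands can be spelled out with ordinary bracket classes. The sketch below uses Python's re module only to check that the hand-built classes pick out the same text as the shorthands. The constant names are made up for illustration, and it assumes Gravity's brackets accept ranges such as 0-9 in the usual way.

```python
import re

# Hand-built bracket classes standing in for the Perl shorthands.
# Both letter ranges are listed because text inside brackets is case-sensitive.
DIGIT = "[0-9]"              # stands in for \d
WORD_CHAR = "[A-Za-z0-9_]"   # stands in for \w

sample = "Re: build 2046 of Gravity_2.6"

# re.ASCII limits \d and \w to the same ASCII characters the classes list.
digits_by_hand = re.findall(DIGIT + "+", sample)
digits_by_shorthand = re.findall(r"\d+", sample, re.ASCII)
words_by_hand = re.findall(WORD_CHAR + "+", sample)
words_by_shorthand = re.findall(r"\w+", sample, re.ASCII)

print(digits_by_hand == digits_by_shorthand)
print(words_by_hand == words_by_shorthand)
```

In a Gravity rule you would type only the bracket form, for example [0-9] wherever a Perl expression would use \d.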

(Note: Gravity Version 2.7 (Open Source versions) uses Perl Compatible Regular Expressions.)

In Gravity the dollar sign $ (assignment) has a special meaning, but it is not available to the end user (that is, you). To match a literal dollar sign, escape it (\$) or put it in brackets like [$].

The beginning-of-line and end-of-line specifiers are different from most packages. To be more precise, it is the positional specifier < > that is different. Many packages have no positional specifier at all, other than end or beginning of line. Otherwise, most of the simple specifiers are the same.

                     Gravity    Most others
Beginning of line    <0>        ^
End of line          <~0>       $
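
To make the mapping concrete, here is a small Python sketch that swaps the two Gravity specifiers for the common ones so a pattern can be tried in another tool. The helper name translate_anchors is hypothetical, and only the two specifiers from the table are handled; other uses of the < > positional are left untouched.

```python
import re

# The two line-position specifiers from the table: Gravity form -> common form.
ANCHORS = {"<0>": "^", "<~0>": "$"}

def translate_anchors(gravity_pattern: str) -> str:
    """Replace <0> and <~0> with ^ and $; everything else is copied as-is."""
    for gravity_form, common_form in ANCHORS.items():
        gravity_pattern = gravity_pattern.replace(gravity_form, common_form)
    return gravity_pattern

starts_with_re = translate_anchors("<0>Re:")
ends_with_gravity = translate_anchors("gravity<~0>")

# Try the translated patterns on two sample subject lines.
print(bool(re.search(starts_with_re, "Re: filters")))
print(bool(re.search(starts_with_re, "Fwd: Re: filters")))
print(bool(re.search(ends_with_gravity, "I like gravity")))
```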

Using Regular Expressions in Rules

Use the rules editor (Tools - Rules, edit or add). On the rules conditions tab, check the box to add your string as a regular expression rather than a plain text string (the default).

NOTE: You must remember to check the checkbox to add the string as a regular expression. Forgetting the checkbox is a common error.

The rules condition should look like this ...

Subject contains reg. expr. "[Gg]ravity"
but NOT this ...
Subject contains "[Gg]ravity"

Using Regular Expressions in Display Filters

You can use regular expressions in Display Filters. Click the Filter button or go to "Newsgroup - Define Display Filter." Use the "Advanced" button in the Edit Filter dialog box. This allows you to create filters like this:
Unread articles
   Subject contains reg. expr. "[Gg]ravity"

(OK, it's a pointless expression, but you get the idea.)

Important User Pointers

  • Unlike Perl or JavaScript, Gravity's regular expressions are NOT case sensitive unless they are placed in brackets. For example:
      tom        - matches tom or TOM
      Tom        - matches tom or TOM (same thing)
      [T]om      - matches Tom, does NOT match tom
      [T][O][M]  - matches TOM, does NOT match Tom or tom

  • When using a regular expression in a rule, make sure you check the checkbox on the rules condition tab. If you forget, the string is added as plain text and the expression will not work.

  • To find a dollar sign $ you must escape it like so: \$ , unless it is used inside brackets like so: [$].

  • The same goes for other meta characters. A good rule of thumb is that any symbol used in parsing an expression such as these

    < $ + * [ { ( ) .

    might need to be escaped, depending on its location. However, if used in brackets, most do not need the escape backslash. Sometimes this depends on their position within the brackets. You will have to experiment to see when this is true. The manual has some special cases for problem characters like dashes and brackets.

  • Gravity does not support the ? (zero or one repetitions). However, you can use {0,1} to do the same thing. Because it is treated as a regular character, the ? does not need to be escaped.

  • If you are a Perl Hacker use <0> and <~0> in place of ^ and $ for beginning and end of line.

  • You can target a whole or complete word (similar to Perl's anchor boundary \b) by using something like the following:
        (<0>| )word_here( |<~0>)

    Be sure to include the spaces. This construction defines a "word" as text that is preceded by a space or the beginning of the line, and followed by a space or the end of the line.
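
The whole-word construction in the last pointer can be compared with Perl's \b in a quick Python sketch. This is only an approximation under stated assumptions: the helper name whole_word is hypothetical, ^ and $ stand in for <0> and <~0>, and re.IGNORECASE mimics Gravity's handling of letters outside brackets.

```python
import re

def whole_word(word: str) -> str:
    """Space-delimited whole-word pattern, written in common (Perl-style) syntax."""
    return f"(^| ){re.escape(word)}( |$)"

spaced = re.compile(whole_word("test"), re.IGNORECASE)
boundary = re.compile(r"\btest\b", re.IGNORECASE)

# Compare the space-delimited construction with a true word boundary.
subjects = ["test post", "a Test post", "last test", "contest results", "a test, again"]
for subject in subjects:
    by_spaces = bool(spaced.search(subject))
    by_boundary = bool(boundary.search(subject))
    print(f"{subject!r}: spaces={by_spaces} boundary={by_boundary}")
```

The two agree except where the word touches punctuation: the space-delimited form does not count "test," as a whole word. If that matters for your filter, adding further alternatives inside the parentheses is one possible way to cover such cases, though you will have to experiment to confirm it in Gravity.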

How to Test Regular Expressions

If you are using Gravity Version 2.6 (super), the easiest way to test regular expressions is to use the "Quick Filter". Open the Quick Filter box by typing a forward slash "/", and be sure to check the regular expression check box.

Keep in mind the quick filter only works with articles available in the current display filter.

Note: There was a small bug in 2.6 builds before 2046, including 2039. If you selected both From AND Subject, the boolean logic was AND rather than OR. Later builds use OR logic.

Another way to test (a little more tedious) is to enter your regular expression in a display filter and use that test filter to see whether the expression shows what you want. Set up a new filter under "Newsgroup - Define Display Filter"; you need to go to "Advanced" to enter a regular expression.

About posting test articles: If you need to post test articles, do NOT post them to Usenet discussion groups. One of the better ways is to post test messages to a local ISP test group. This way the messages won't be propagated over the Net. Or post to test groups like alt.test or alt.alt.test, which are intended for this purpose.

I have Hamster, a local news server, set up on my machine and I can post test articles without using my Internet connection.

Another way to test is this:

Create a rule called test, or whatever, and set the rule action to tag for download (see the following note). Enter your regular expressions in the rule conditions window. Then run the rule manually; you will see which subjects are hit because they will have the tagged symbol next to them. Switch to a tagged article filter to see only the results of the test. To reset the test, switch filters to show tagged articles, press CONTROL-A, then T to untag all.

Note: You cannot use the tag symbol if you are storing article bodies (you can't re-tag them). You will have to use another symbol like important.
