How does it work with the match for the part number ABC? When the regular expression engine is at the position immediately before the uppercase A of the part number ABC, it attempts to match an uppercase A. That matches. Next, an attempt is made to match an uppercase B. That too matches. Next, an attempt is made to match an uppercase C. That too matches. At that stage, the first three characters in the regular expression pattern have been matched. Finally, an attempt is made to match the pattern [0-9]{0,2}, which means “Match a minimum of zero and a maximum of two numeric characters.” Zero numeric digits follow the uppercase C in ABC. Because there are exactly zero numeric digits after the uppercase C of ABC, there is a match (zero numeric digits matches the criterion “a minimum of zero numeric digits” specified by the minimum-occurrence specifier of the {0,2} quantifier). Because the final component of the pattern matches, the whole pattern matches.
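The step-by-step walk described above can be sketched in a few lines of Python. This is a simplified illustration only, not how any real regular expression engine is implemented; the function name match_abc_digits and its parameters are hypothetical, chosen for this example.

```python
def match_abc_digits(text, start=0, min_digits=0, max_digits=2):
    """Try to match ABC followed by min_digits..max_digits digits at `start`.

    Returns the index just past the match, or None if there is no match.
    Hypothetical helper for illustration; real engines are far more general.
    """
    pos = start
    # Match the three literal characters A, B, C one at a time.
    for literal in "ABC":
        if pos >= len(text) or text[pos] != literal:
            return None
        pos += 1
    # Consume digits, but never more than the maximum-occurrence specifier.
    count = 0
    while count < max_digits and pos < len(text) and text[pos] in "0123456789":
        pos += 1
        count += 1
    # The match succeeds only if the minimum-occurrence specifier is met.
    return pos if count >= min_digits else None


print(match_abc_digits("ABC"))  # 3
```

For the part number ABC, the three literals match, zero digits are consumed, and zero satisfies the minimum of zero, so the function reports a match ending at index 3.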
What happens when matching is attempted on the line that contains the part number ABC8899? Why do the first five characters of the part number ABC8899 match? When the regular expression engine is at the position immediately before the A of ABC8899, it attempts to match the next character in the part number with an uppercase A and finds it is a match. Next, an attempt is made to match an uppercase B. That too matches. Then an attempt is made to match an uppercase C, which also matches. At that stage, the first three characters in the regular expression pattern have been matched. Finally, an attempt is made to match the pattern [0-9]{0,2}, which means “Match a minimum of zero and a maximum of two numeric characters.” Four numeric digits follow the uppercase C. Only two of those numeric digits are needed for a successful match. Because there are four numeric digits after the uppercase C of ABC8899, there is a match (of two numeric digits, which meets the criterion “a maximum of two numeric digits”), but the final two numeric digits of ABC8899 are not needed to form a match, so they are not highlighted. Because all components of the pattern match, the whole pattern matches.
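The same behavior can be checked with Python's standard re module, assuming (as is the case for this pattern) that its quantifier syntax agrees with the tool used in the text. The helper name matched_text is hypothetical.

```python
import re

# The pattern discussed above: ABC followed by zero to two digits.
PATTERN = re.compile(r"ABC[0-9]{0,2}")


def matched_text(part_number):
    """Return the portion of part_number matched by PATTERN, or None."""
    m = PATTERN.search(part_number)
    return m.group() if m else None


print(matched_text("ABC"))      # ABC
print(matched_text("ABC8899"))  # ABC88
```

Only the first five characters of ABC8899 are returned, because the quantifier stops after its maximum of two digits.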
{n,m}
The minimum-occurrence specifier in the curly-brace syntax doesn’t have to be 0. It can be any number
you like, provided it is not larger than the maximum-occurrence specifier.
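As a quick check of that constraint, the sketch below asks Python's re module to compile two patterns. It shows how Python reacts when the minimum exceeds the maximum; other tools may report the problem differently. The helper name is_valid_pattern is hypothetical.

```python
import re


def is_valid_pattern(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Python rejects a quantifier whose minimum exceeds its maximum.
        return False


print(is_valid_pattern(r"ABC[0-9]{1,3}"))  # True
print(is_valid_pattern(r"ABC[0-9]{3,1}"))  # False
```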
Look for one to three occurrences of a numeric digit. You can specify this in a problem definition as follows:
Match an uppercase A. If there is a match, attempt to match an uppercase B. If there is a match, attempt to match an uppercase C. If all three uppercase characters match, attempt to match a minimum of one and a maximum of three numeric digits.
So if you wanted to match one to three occurrences of a numeric digit in Parts.txt, you would use the following pattern:
ABC[0-9]{1,3}
Figure A-20 shows the matches in OpenOffice.org Writer. Notice that the part number ABC does not match, because it has zero numeric digits, and you are looking for matches that have one through three numeric digits. Notice, too, that only the first three numeric digits of ABC8899 form part of the match.
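A short Python sketch reproduces both observations. The list sample_lines is a hypothetical stand-in holding only the two part numbers mentioned here, not the actual contents of Parts.txt, and find_match is an illustrative helper name.

```python
import re

# ABC followed by one to three digits.
PATTERN = re.compile(r"ABC[0-9]{1,3}")

# Hypothetical stand-in for lines of Parts.txt (assumption, not the real file).
sample_lines = ["ABC", "ABC8899"]


def find_match(line):
    """Return the text matched by PATTERN in line, or None if no match."""
    m = PATTERN.search(line)
    return m.group() if m else None


for line in sample_lines:
    print(line, "->", find_match(line))
# ABC -> None
# ABC8899 -> ABC889
```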
The explanation in the preceding section for the {0,m} syntax should be sufficient to help you understand what is happening in this example.
Appendix A: Simple Regular Expressions