Testing a Lexer
Be sure to declare an ANTLRTester in your test-case class;
doing so requires defining a subclass first.
(See the usage page for further details.)
Your ANTLR Tester will handle your lexer; all you have to do is
call scanInput(String) on an ANTLR Tester to get the
lexer to do its thing.
Your primary job is to make assertions about the tokens you
scan. ANTLR assertions are defined in
org.norecess.antlr.Assert; statically import the
methods from this class to clean up the code.
assertToken() is your friend:
assertToken(MyOwnLexer.IDENTIFIER, "foo", myTester.scanInput("foo"));
assertToken("should scan 'foo' as an identifier", MyOwnLexer.IDENTIFIER, "foo", myTester.scanInput("foo"));
Much like the
assertX() methods from JUnit,
assertToken() has an optional message string which is
displayed in the output when the assertion fails.
assertToken() has two expected values: the
type of the token (e.g., MyOwnLexer.IDENTIFIER) and
the text recognized by the token (e.g., "foo"). The
produced token is tested against both of these values. The token
stream is also checked to make sure just one token is on it.
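The checks described above (token type, token text, and exactly one token in the stream) can be pictured with a small self-contained sketch. It does not use the real library: SketchToken and SingleTokenCheck.check are hypothetical stand-ins invented for illustration, and the actual assertToken() implementation may well differ.

```java
import java.util.List;

/** Hypothetical stand-in for a scanned token: an integer type plus the matched text. */
final class SketchToken {
    final int type;
    final String text;

    SketchToken(int type, String text) {
        this.type = type;
        this.text = text;
    }
}

/** Illustrative only: mirrors the checks described in the text, not the library's code. */
final class SingleTokenCheck {
    /** Returns null when every check passes, otherwise a description of the first failure. */
    static String check(int expectedType, String expectedText, List<SketchToken> scanned) {
        // The stream must hold exactly one token.
        if (scanned.size() != 1) {
            return "expected exactly one token but found " + scanned.size();
        }
        SketchToken token = scanned.get(0);
        // The token must have the expected type...
        if (token.type != expectedType) {
            return "expected type " + expectedType + " but was " + token.type;
        }
        // ...and the expected text.
        if (!token.text.equals(expectedText)) {
            return "expected text \"" + expectedText + "\" but was \"" + token.text + "\"";
        }
        return null;
    }
}
```

In this sketch a non-null result plays the role of the failure message that a real assertion would report.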
Refuting a Token
It is just as important to make sure that a regular expression rejects the right things.
refuteToken(MyOwnLexer.IDENTIFIER, myTester.scanInput("@"));
refuteToken(MyOwnLexer.IDENTIFIER, myTester.scanInput("123"));
refuteToken(MyOwnLexer.IDENTIFIER, myTester.scanInput("1x"));
refuteToken() accepts any reason for
rejecting the input. In the example above,
@ is probably never valid input;
123 is probably a numeric token;
1x is probably two
valid tokens (an
INTEGER and an
IDENTIFIER). We're only asserting that none of them is a single
identifier, so you can add the assertions now and
leave them in place as the grammar matures.
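The "any reason" rule can be sketched as a single predicate: the refutation holds unless the scan produced exactly one token of the refuted type. This is a minimal illustration under assumptions, not the library's code: RefuteSketch.isRefuted is a hypothetical helper, token types are modeled as plain integers, and an empty list stands in for input the lexer cannot scan at all.

```java
import java.util.List;

/** Illustrative only: models why all three refutations in the example succeed. */
final class RefuteSketch {
    /**
     * Returns true when the scanned token types are anything other than
     * exactly one token of refutedType.
     */
    static boolean isRefuted(int refutedType, List<Integer> scannedTypes) {
        boolean exactlyThatToken =
            scannedTypes.size() == 1 && scannedTypes.get(0) == refutedType;
        return !exactlyThatToken;
    }
}
```

With hypothetical types IDENTIFIER = 1 and INTEGER = 2, an empty list (like "@"), a lone INTEGER (like "123"), and an INTEGER followed by an IDENTIFIER (like "1x") are all refuted as identifiers, while a lone IDENTIFIER is not.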
The expected text is useful to establish exactly what the lexer does with the input.
Consider this assertion, which reminds me that hexadecimal numbers stay in hexadecimal format as they pass through my lexer (at least):
assertToken(MyOwnLexer.INTEGER, "-0x1234", myTester.scanInput("-0x1234"));
If you skip whitespace, the expected text is noticeably different:
assertToken(MyOwnLexer.INTEGER, "8", myTester.scanInput("\t8\t"));
Similarly, if comments are skipped, the expected text should reflect that:
assertToken(MyOwnLexer.INTEGER, "123", myTester.scanInput("123 // comment\n"));
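To see why skipped input never shows up in the expected text, here is a toy hand-written scanner that discards whitespace and line comments and keeps only integer text. It is a sketch under assumptions: SkippingScannerSketch and its regular expression are invented for illustration and are not generated by ANTLR or taken from any real grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a toy scanner, not an ANTLR-generated lexer. */
final class SkippingScannerSketch {
    // group 1 = whitespace, group 2 = line comment, group 3 = integer (decimal or hex)
    private static final Pattern TOKEN =
        Pattern.compile("(\\s+)|(//[^\\n]*)|(-?(?:0x[0-9a-fA-F]+|[0-9]+))");

    /** Returns the text of every integer token; whitespace and comments are dropped. */
    static List<String> scan(String input) {
        List<String> texts = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("cannot scan at offset " + pos);
            }
            // Only integers are kept; skipped matches simply advance the position.
            if (m.group(3) != null) {
                texts.add(m.group(3));
            }
            pos = m.end();
        }
        return texts;
    }
}
```

Scanning "\t8\t" with this sketch yields the single text "8", and "123 // comment\n" yields "123", which matches the expected text in the assertions above. The hexadecimal input "-0x1234" comes back unchanged.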
Each keyword has its own token type:
assertToken(MyOwnLexer.IF, "if", myTester.scanInput("if"));
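One common way a hand-written scanner gives keywords their own token type is a reserved-word lookup applied to each scanned word. The sketch below assumes that technique purely for illustration; KeywordSketch and its constants are hypothetical, and a generated lexer may well achieve the same result differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: reserved-word lookup in a hand-written scanner. */
final class KeywordSketch {
    // Hypothetical token-type constants for this sketch.
    static final int IDENTIFIER = 1;
    static final int IF = 2;

    private static final Map<String, Integer> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("if", IF);
    }

    /** Classifies an already-scanned word: reserved words win over the identifier rule. */
    static int typeOf(String word) {
        Integer keywordType = KEYWORDS.get(word);
        return keywordType != null ? keywordType : IDENTIFIER;
    }
}
```

Under this sketch "if" is classified as IF while "iffy" remains an IDENTIFIER, so a keyword test pairs naturally with a refutation that the keyword is an identifier.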