org.sc3d.apt.sss.v3

Package org.sc3d.apt.sss.v3

This package provides tools and data structures for manipulating data stored in SSS formats.

Class Summary
Bracket	A subclass of Token that represents a string of zero or more Tokens enclosed in brackets.
Calculator	This class is a worked example of how to use SSS in general, and the tools provided in this Java package in particular.
Calculator.Expression	The data structure that represents an arithmetic expression.
Calculator.Negation	An Expression whose outermost operator is a negation.
Calculator.Number	An Expression that consists only of a number.
Calculator.Operation	An Expression whose outermost operator is an addition, subtraction, multiplication or division.
Escape	Represents an escape sequence in a literal string or literal character.
Grammar	Represents an SSS grammar.
Grammar.Keyword	A subclass of Terminal which insists on an exact text match with a Token of type 'Token.TYPE_WORD'.
Grammar.NonTerminal	The subclass of Grammar with 'isTerminal==false'.
Grammar.Production	Represents a production of the grammar.
Grammar.Terminal	The subclass of Grammar with 'isTerminal==true'.
GrammarParser	A subclass of Parser which parses SSS grammar specifications.
Indentation	Represents an indentation analysis of a Sentence.
IntBuffer	Represents an auto-extending buffer for integers.
Lex	Represents a lexical analysis of a Sentence.
Match	Represents a bracket matching analysis of a Sentence.
NDFA	Represents a non-deterministic finite automaton for parsing a particular grammar.
NDFA.State	Represents a state of an NDFA.
NDFA.Transition	Represents a transition of an NDFA.
Parser	Represents a parser for a particular Grammar.
Sentence	Represents a sequence of characters that need to be parsed, and allows error messages to be attached to it.
SSSChar	A subclass of Token that represents a literal character.
SSSNumber	A subclass of Token that represents an SSS number, or any prefix thereof.
SSSString	A subclass of Token that represents a literal String.
Token	Represents a part of a Sentence that has some sort of syntactic significance.
TokenBuffer	An extensible buffer for Tokens.
Tree	Represents the parse-tree of a Sentence.
Tree.NonTerminal	The subclass of Tree with 'isTerminal==false', which represents zero or more parse-trees all matching the same non-terminal of the grammar.
Tree.Production	Represents something constructed using a Grammar.Production from a sequence of smaller things.
Tree.Terminal	The subclass of Tree with 'isTerminal==true', which represents a single Token.
Validator	A tool for checking that SSS files obey a grammar.

Package org.sc3d.apt.sss.v3 Description

This package provides tools and data structures for manipulating data stored in SSS formats. SSS stands for Semi-Structured Syntax, and is a proposed standard for inventing data formats for storing and sending data that needs to be readable both to humans and to machines. Adhering to the standard permits you to use programs that have already been written, such as syntax colouring modes, compression algorithms, lexers and parsers, thus saving you work and guaranteeing the quality of these tools. It also makes it easier for people to describe, learn and use your data format, especially if they already know the SSS standard.

The SSS specification is not described here. You can download it from the official SSS site.

This package provides:

An implementation of the SSS lexer, bracket matcher and indentation checker.
A parser-generator, which makes it trivial to write a decent parser for any SSS format, given an SSS grammar specification.
A mechanism for collecting error messages and presenting them to the user in a helpful form.
Data structures to represent grammars, parse-trees and the various kinds of lexical token.
A worked example called Calculator.

This package is the reference implementation of the SSS tools, and is written with an emphasis on clarity. If it differs from the SSS specification, the specification is correct.

Overview

The principal class in this package is Parser, the abstract superclass of all parsers of SSS formats. A Parser hides within it the algorithm used to parse sentences, which you therefore do not need to understand. It also invokes the SSS lexer, bracket matcher and indentation checker for you, so you don't need to understand them either unless you want to. Parsers have no state and are therefore immutable.

To construct a Parser you need a Grammar, which represents an SSS grammar specification. You can construct a Grammar explicitly yourself, but the easiest way to construct a Grammar is using a GrammarParser, which is a subclass of Parser dedicated to parsing SSS grammar specifications. For example, you might include an SSS grammar specification as a resource file in your class path, and parse it and store it in a public static final field when your Parser subclass is loaded. Grammars are immutable.
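The resource-file approach described above can be sketched as follows. This is only an illustration of loading the specification text once, when a class is loaded. The class name GrammarResource, the resource name and the choice of UTF-8 are all assumptions made for the example, and the step that hands the text to a GrammarParser is deliberately left out because its exact methods are documented on that class, not here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical holder for a grammar specification read from the class path. */
final class GrammarResource {
  /** Hypothetical resource name, resolved relative to this class's package. */
  static final String RESOURCE_NAME = "my-format.grammar";

  /**
   * Text of the specification, read once when this class is loaded, or null
   * if the resource is missing. A real Parser subclass would pass this text
   * to a GrammarParser and keep the resulting Grammar in a field like this.
   */
  public static final String SPEC_TEXT = loadOrNull(RESOURCE_NAME);

  private GrammarResource() {}

  /** Reads a stream to its end and decodes it as UTF-8 (an assumption). */
  static String readAll(InputStream in) {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[4096];
      int count;
      while ((count = in.read(buffer)) != -1) {
        out.write(buffer, 0, count);
      }
      return new String(out.toByteArray(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Returns the text of the named class-path resource, or null if absent. */
  static String loadOrNull(String name) {
    try (InputStream in = GrammarResource.class.getResourceAsStream(name)) {
      return in == null ? null : readAll(in);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Because the field is static and final, the resource is read at most once per class loader, which fits the immutability of Grammars noted above.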

The input to a Parser is a Sentence, which represents a string of characters and provides a mechanism for reporting errors. Sentences are immutable, except for the ability to add errors to them. The lexer, bracket matcher, indentation checker and parser all report errors using this mechanism. You should also use this mechanism to report errors, so that all the errors are reported together.
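To illustrate why a shared error mechanism is useful, here is a minimal sketch of the idea: text that never changes, plus a list of errors that every stage of processing appends to. The class AnnotatedText and its methods are hypothetical and are not the API of Sentence; consult the Sentence documentation for the real methods.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the idea behind Sentence; not the real class. */
final class AnnotatedText {
  /** One error attached to the half-open character range [start, end). */
  static final class Problem {
    final int start;
    final int end;
    final String message;

    Problem(int start, int end, String message) {
      this.start = start;
      this.end = end;
      this.message = message;
    }
  }

  private final String text;
  private final List<Problem> problems = new ArrayList<>();

  AnnotatedText(String text) {
    this.text = text;
  }

  String text() {
    return text;
  }

  /** Attaches an error to part of the text; the text itself never changes. */
  void addError(int start, int end, String message) {
    if (start < 0 || end < start || end > text.length()) {
      throw new IllegalArgumentException("range outside the text");
    }
    problems.add(new Problem(start, end, message));
  }

  List<Problem> errors() {
    return Collections.unmodifiableList(problems);
  }

  /** Lists every error in order of position, quoting the offending text. */
  String report() {
    List<Problem> sorted = new ArrayList<>(problems);
    sorted.sort((a, b) -> Integer.compare(a.start, b.start));
    StringBuilder out = new StringBuilder();
    for (Problem p : sorted) {
      out.append(p.start).append('-').append(p.end).append(": ")
         .append(p.message).append(" ('")
         .append(text, p.start, p.end).append("')\n");
    }
    return out.toString();
  }
}
```

In this sketch an error added by a later stage (for example your own type checker) still appears in position order alongside errors from earlier stages, which is the benefit of reporting everything through one mechanism.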

The output of a Parser is a Tree, which represents the parse-tree of a Sentence. The structure of a Tree mimics that of a Grammar, and provides methods for quick and convenient navigation around the sentence to syntactically significant places. After parsing a sentence, you can work with the Tree directly, but you will often want to convert the Tree into a data structure of your own devising, for example in order to add type checking. Trees are immutable.
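As an illustration of a data structure of your own devising, the sketch below shows the kind of typed structure a parse-tree might be converted into. It is loosely modelled on the class names listed for Calculator (Expression, Number, Negation, Operation) but is not that source code: the names Expr, Num, Neg and BinOp are hypothetical, and the conversion from a Tree is omitted because it depends on methods documented on Tree itself.

```java
/** Hypothetical typed representation of an arithmetic expression. */
abstract class Expr {
  /** Computes the numeric value of this expression. */
  abstract double evaluate();

  /** An expression that consists only of a number. */
  static final class Num extends Expr {
    final double value;

    Num(double value) {
      this.value = value;
    }

    @Override
    double evaluate() {
      return value;
    }
  }

  /** An expression whose outermost operator is a negation. */
  static final class Neg extends Expr {
    final Expr operand;

    Neg(Expr operand) {
      this.operand = operand;
    }

    @Override
    double evaluate() {
      return -operand.evaluate();
    }
  }

  /** An expression whose outermost operator is one of + - * /. */
  static final class BinOp extends Expr {
    final char operator;
    final Expr left;
    final Expr right;

    BinOp(char operator, Expr left, Expr right) {
      // Rejecting unknown operators here is the kind of extra checking
      // that a hand-written structure can add on top of a parse-tree.
      if ("+-*/".indexOf(operator) < 0) {
        throw new IllegalArgumentException("unknown operator: " + operator);
      }
      this.operator = operator;
      this.left = left;
      this.right = right;
    }

    @Override
    double evaluate() {
      double l = left.evaluate();
      double r = right.evaluate();
      switch (operator) {
        case '+': return l + r;
        case '-': return l - r;
        case '*': return l * r;
        default:  return l / r; // only '/' remains after the constructor check
      }
    }
  }
}
```

For example, `new Expr.BinOp('+', new Expr.Num(1), new Expr.Neg(new Expr.Num(6)))` represents 1 + -6, and calling `evaluate()` on it returns -5.0.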

The leaves of a Tree are lexical tokens, represented by instances of Token or its subclasses. Tokens are the output of the lexer. A Token records a reference to the sentence in which it was found, and provides a convenience method for adding an error to exactly the part of the sentence it represents. Most kinds of lexical token are represented by instances of Token itself, but literal numbers, strings and characters are represented by instances of SSSNumber, SSSString and SSSChar respectively. Also, the bracket matcher represents a region enclosed in brackets as an instance of Bracket, another subclass of Token. Tokens are immutable.

Other classes generally represent internal mechanisms, and are provided only so that you can get underneath the main API when you need to. For example, if you want to perform lexical analysis only (i.e. without subsequent parsing) you can use the class Lex, which is normally invoked for you by a Parser. For another example, if you find your subclass of Parser is using a lot of memory, you can examine the NDFAs it is using, and thereby optimise your grammar specification.

I recommend that you read the documentation of the classes in the order in which they are mentioned above, but that you read the SSS specification first. In parallel, you may want to look at the source code (not just the documentation) for the worked example called 'Calculator'.