The General Purpose Colorizer (GPC) colorizes an input file so that its syntax is easier to read. Here's a simple example. The overall project will be housed at SourceForge, so check http://sourceforge.net/projects/gp-colorizer/ for the latest version. GPC is released under the GNU Library or Lesser General Public License (LGPL).
There are a lot of code colorizers out there. To help you choose, these are the most significant features of GPC:
To give you a better idea how it operates, here is the set of rules that govern the colorizing of css (style sheets):
The colorizer itself is a state machine. These rules govern how transitions occur between the various states, as can be seen in the diagram below. A rule can be applied if its "From State" matches the current state and its pattern (a regular expression) matches the text at the current position.
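To make the matching condition concrete, here is a minimal Python sketch of such a loop. It is not the GPC engine (which is written in C#); the `Rule` class, the `run` function, the four toy rules and the choice to label matched text with the state that was current before the transition are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    from_state: str                  # rule applies only in this state
    pattern: str                     # regular expression tried at the current position
    to_state: Optional[str] = None   # None means: stay in the current state


# Hypothetical rules for a tiny CSS-like input; not GPC's real rule set.
RULES = [
    Rule("selector", r"\{", "property"),
    Rule("property", r":", "value"),
    Rule("value", r";", "property"),
    Rule("property", r"\}", "selector"),
]


def emit(segments, state, piece):
    """Append text to the output, merging with the previous segment if the state is unchanged."""
    if segments and segments[-1][0] == state:
        segments[-1] = (state, segments[-1][1] + piece)
    else:
        segments.append((state, piece))


def run(text, rules, start):
    """Walk the input once and return a list of (state, text) segments."""
    state, pos, segments = start, 0, []
    while pos < len(text):
        for rule in rules:
            if rule.from_state != state:
                continue
            # Pattern.match(text, pos) anchors the match at the current position.
            m = re.compile(rule.pattern).match(text, pos)
            if m and m.end() > pos:
                emit(segments, state, m.group())
                pos = m.end()
                state = rule.to_state or state
                break
        else:
            # No rule applies here: consume one character in the current state.
            emit(segments, state, text[pos])
            pos += 1
    return segments


segments = run("a{b:c;}", RULES, "selector")
```

The first rule whose "from" state and pattern both match wins, so rule order matters in this sketch; whether GPC resolves conflicts the same way is not stated here.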
The name of the current state is also used as a stylesheet class name in the output from the colorizer:
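As a rough illustration of that mapping, the Python sketch below wraps each piece of text in an element whose class is the state name. The use of `<span>` elements and the `to_html` helper are assumptions; the markup that GPC actually writes may differ.

```python
from html import escape


def to_html(segments):
    """Turn (state, text) pairs into spans whose class is the state name."""
    return "".join(
        f'<span class="{escape(state)}">{escape(text)}</span>'
        for state, text in segments
    )


markup = to_html([("selector", "a "), ("value", "1<2")])
```

A stylesheet can then give each state its own color with one CSS rule per class name.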
In the example above the attributes from, pattern, to and transient made an appearance. The complete list is as follows:
| Name | Type | Description | Required? |
|---|---|---|---|
| Pattern | Regular Expression | The pattern that must be found for this rule to execute | Yes |
| From | State | The initial state to which this rule applies | Yes |
| To | State | The final state when this rule completes its transition | No |
| Transient | State | The state that exists during this rule's transition | No |
| Push | State | Push the specified state on the stack | No |
| Pop | Flag | Pop a state from the stack | No |
| Add | State | Add the specified state to the set of current states | No |
| Remove | Flag | Remove this rule's initial state from the set of current states | No |
| Debug | Flag | Invoke the debugger before this rule is executed | No |
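One way to picture the table is as a record type with two required fields and seven optional ones. The Python dataclass below is only a sketch of that shape: the field names (for example `from_state`) and the defaults are assumptions, and the real engine stores its rules in its own format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    # Required attributes
    pattern: str                      # regular expression that must match
    from_state: str                   # state in which the rule applies
    # Optional attributes (state-valued ones default to None, flags to False)
    to_state: Optional[str] = None    # state after the transition
    transient: Optional[str] = None   # state during the transition
    push: Optional[str] = None        # state to push on the stack
    pop: bool = False                 # pop a state from the stack
    add: Optional[str] = None         # state to add to the current set
    remove: bool = False              # remove from_state from the current set
    debug: bool = False               # break into a debugger first


# Hypothetical example: only the required fields plus "to".
open_brace = Rule(pattern=r"\{", from_state="selector", to_state="property")
```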
If you just want to use the colorizer, you can skip this section, but if curiosity gets the better of you, read on ...
The rule engine is capable of managing multiple concurrent states. An additional state is used to handle the highlighting of marked sections of code.
The two rules below operate as a pair. The first adds hilite1 to the set of current states when /*[hilite1]*/ is encountered.
The second removes this state when /*[/hilite1]*/ is encountered. (noemit is a special state that produces no output for the syntax piece.)
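The add/remove pairing can be sketched in a few lines of Python. This is an illustration under assumptions, not the engine's code: the base state name `code`, the tuple layout of `RULES` and the `mark` function are invented here, and the markers are simply dropped from the output to mimic noemit.

```python
import re

# (from_state, pattern, action, state_to_add); layout is hypothetical.
RULES = [
    ("code", re.compile(r"/\*\[hilite1\]\*/"), "add", "hilite1"),
    ("hilite1", re.compile(r"/\*\[/hilite1\]\*/"), "remove", None),
]


def mark(text, base_state="code"):
    """Return (active_states, text) pieces; the markers themselves emit nothing."""
    states = {base_state}
    pieces = []
    start = pos = 0
    while pos < len(text):
        for from_state, pattern, action, added in RULES:
            # A rule is eligible if its from-state is in the set of current states.
            m = pattern.match(text, pos) if from_state in states else None
            if m:
                if pos > start:
                    pieces.append((tuple(sorted(states)), text[start:pos]))
                if action == "add":
                    states.add(added)
                else:
                    states.discard(from_state)   # remove the rule's initial state
                start = pos = m.end()            # skip the marker (noemit)
                break
        else:
            pos += 1
    if pos > start:
        pieces.append((tuple(sorted(states)), text[start:pos]))
    return pieces


pieces = mark("a/*[hilite1]*/b/*[/hilite1]*/c")
```

Because the highlight state sits alongside the base state rather than replacing it, the text between the markers keeps its normal syntax state and gains the highlight as well.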
There are similar sets of rules to cover four different types of highlighting for the three languages, which bulk out the rules a bit (45 rules, 24 of them for highlighting). As these rules are all very similar, a smarter definition could reduce this number in the future.
It is possible to embed two other languages within html, namely: css (style-sheets) and javascript. These are triggered by the tags: <style> and <script> respectively.
When these tags are encountered, html mode must continue in order to colorize any attributes properly. The switch to css or javascript should only take place when the tag is closed. This behaviour is managed by the following pair of rules. The first rule pushes a state corresponding to the embedded language that is about to be encountered. The second rule pops this state and makes it current when the tag is closed.
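The push-then-pop sequence can be sketched in Python as follows. Everything named here is an assumption made for illustration: the `Rule` fields, the `tag` state standing in for html's in-tag state, and the third rule that returns to html at `</style>` (the text above only describes the first two).

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    from_state: str
    pattern: "re.Pattern[str]"
    to_state: Optional[str] = None
    push: Optional[str] = None
    pop: bool = False


RULES = [
    # Seeing <style pushes css but stays in html's tag handling for the attributes.
    Rule("html", re.compile(r"<style\b"), to_state="tag", push="css"),
    # Closing the tag pops the pushed state and makes it current.
    Rule("tag", re.compile(r">"), pop=True),
    # Assumed extra rule: a zero-width lookahead hands </style> back to html.
    Rule("css", re.compile(r"(?=</style>)"), to_state="html"),
]


def emit(segments, state, piece):
    """Append non-empty text, merging with the previous segment if the state matches."""
    if not piece:
        return
    if segments and segments[-1][0] == state:
        segments[-1] = (state, segments[-1][1] + piece)
    else:
        segments.append((state, piece))


def run(text):
    state, stack, pos, segments = "html", [], 0, []
    while pos < len(text):
        for rule in RULES:
            m = rule.pattern.match(text, pos) if rule.from_state == state else None
            if m:
                emit(segments, state, m.group())
                if rule.push:
                    stack.append(rule.push)
                if rule.pop and stack:
                    state = stack.pop()
                elif rule.to_state:
                    state = rule.to_state
                pos = m.end()
                break
        else:
            emit(segments, state, text[pos])
            pos += 1
    return segments


segments = run('<style type="x">p{}</style>')
```

Deferring the switch through the stack is what lets the attributes of the opening tag be colorized as html while the body that follows is colorized as css.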
Currently the rule file supports html, css and javascript (although xml can also be colorized with the html rules). I'll be extending it to address my own needs from time to time. However, if you add an additional language yourself, please email me the rules so I can maintain a master copy. Similarly, please report any bugs or suggested features in the Tracker section at http://sourceforge.net/projects/gp-colorizer/.
The rule execution engine itself is currently C# only. However, it's a fairly concise piece of code (<300 lines excluding comments) that is crying out to be ported to other environments such as Java and JavaScript.
Of course if you just want to use the colorizer as is, that's fine also.