Syntax highlighting#
The standard actions used in the bundled languages can be mapped to colors and other text formatting properties, so that text in any language can be highlighted consistently to improve readability.
This is done by combining a Formatter with a Theme.
A Formatter iterates over the tokens in a selected range and yields
FormatRange tuples, each describing how a certain piece of text should be
formatted. A format range consists of a starting position pos, an ending
position end and a format textformat.
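As a rough illustration of how such format ranges can be consumed, the following sketch uses a hypothetical FormatRange namedtuple (a stand-in, not parce's own class) with plain strings in place of real textformats. It assumes the ranges are sorted and do not overlap, and it fills any gaps between ranges with a default format:

```python
from collections import namedtuple

# Hypothetical stand-in for the tuples a formatter yields.
FormatRange = namedtuple("FormatRange", "pos end textformat")


def apply_ranges(text, ranges, default=None):
    """Yield (piece, textformat) pairs that together cover the whole text.

    Assumes the ranges are sorted by position and do not overlap.
    """
    pos = 0
    for r in ranges:
        if r.pos > pos:
            # text between two ranges gets the default format
            yield text[pos:r.pos], default
        yield text[r.pos:r.end], r.textformat
        pos = r.end
    if pos < len(text):
        yield text[pos:], default


text = '<xml attr="value">'
ranges = [
    FormatRange(1, 4, "tag"),
    FormatRange(5, 9, "attribute"),
    FormatRange(10, 17, "string"),
]
pieces = list(apply_ranges(text, ranges))
```

Here pieces alternates between unformatted delimiters such as `<` and formatted spans such as `xml`, and joining all pieces reproduces the original text.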
The theme provides this textformat by mapping a standard action to a
TextFormat. A factory function, specified when creating the formatter, then
converts the TextFormat into something the formatter can use.
A TextFormat provided by the theme is a simple data object with attributes
that define text properties, such as color, font, decoration etc. The default
Theme implementation reads these properties from a CSS (Cascading Style Sheets)
file. Some CSS themes are provided in the themes directory.
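To make the role of the factory function more concrete, here is a minimal sketch. SketchTextFormat is a hypothetical, much simplified stand-in for the theme's TextFormat (the real object has more attributes, and its attribute names may differ), and css_factory is an illustrative factory that turns it into an inline CSS style string:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchTextFormat:
    """Hypothetical, simplified stand-in for a theme's TextFormat."""
    color: Optional[str] = None
    background_color: Optional[str] = None
    font_weight: Optional[str] = None
    font_style: Optional[str] = None


def css_factory(fmt):
    """Convert the data object to an inline CSS style string."""
    props = [
        ("color", fmt.color),
        ("background-color", fmt.background_color),
        ("font-weight", fmt.font_weight),
        ("font-style", fmt.font_style),
    ]
    # only properties that are actually set end up in the style string
    return " ".join(f"{name}: {value};" for name, value in props if value)


style = css_factory(SketchTextFormat(color="#87cefa", font_weight="bold"))
```

A factory for a different target, for example a terminal or a GUI toolkit, would read the same attributes but return an object suited to that target.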
You can implement any kind of formatting by creating a Formatter with your
own factory function, and then iterating over format_ranges().
Optionally, you can inherit from Formatter to implement other useful methods.
The parce.out package contains some modules with often-used formatting
utilities.
Creating HTML output#
Using the parce.out.html module, it is easy to convert tokenized text to
HTML. Here is an example. Let’s say we’ve got a document containing some XML
text:
>>> import parce
>>> doc = parce.Document(parce.find("xml"), '''<xml attr="value">text</xml>\n''')
We create a cursor that selects all text (a formatter can also just format a selection of the text):
>>> cur = parce.Cursor(doc, 0, None)
We load a theme, using the included CSS theme parce/themes/dark.css:
>>> theme = parce.theme_by_name('dark')
And create an HTML formatter:
>>> from parce.out.html import HtmlFormatter
>>> f = HtmlFormatter(theme)
Now we call the formatter to format the selected part of the document:
>>> print(f.full_html(cur))
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
</head>
<body>
<div class="parce">
<pre style="white-space: pre; background-color: #000000; color: #fffff0; fon
t-family: monospace;">&lt;<span style="color: #87cefa; font-weight: bold;">x
ml</span> <span style="color: #1e90ff;">attr</span>=<span style="color: #cd5
c5c;">"value"</span>&gt;text&lt;/<span style="color: #87cefa; font-weight: b
old;">xml</span>&gt;
</pre>
</div>
</body>
</html>
Creating your own themes#
The easiest way to create your own theme is to copy default.css or
_template.css in the themes/ directory to a new file and start editing that.
CSS properties#
The following subset of CSS properties is supported by the default TextFormat used by the theming engine:
Property: |
Supported values: |
---|---|
|
named CSS color (like |
|
same as |
|
only colors are supported, same as |
|
same as |
|
same as |
|
one or more of |
|
one of |
|
in order a color, line, and style value |
|
one or more generic or quoted font names; generic names are:
|
|
one of |
|
one of |
|
one of |
|
|
|
one of |
|
one of |
|
one of |
|
all of the above |
Note
It is possible that not all formatters support all properties. For example, Qt5’s QTextCharFormat does not support double underline.
CSS classes#
To determine the style properties to use for a token, the token’s action (which must be a standard action) is mapped to one or more CSS classes. This is described in The theme module, under “Mapping actions to CSS classes.” The matching CSS rules are then combined to determine the actual style properties to use for the action.
All rules should have a .parce ancestor class selector, so that the theme
CSS file can be used directly in HTML (where tokens are mapped to class names,
e.g. using the SimpleHtmlFormatter), with little chance that the parce CSS
file clobbers other parts of a web page’s style. For example:
.parce .comment {
    color: dimgray;
    font-family: serif;
    font-style: italic;
}
This maps the Comment standard action to these color and font settings.
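The idea of combining the matching rules can be sketched as follows. This is a deliberately simplified model, not the theming engine's actual algorithm: the rules are plain dictionaries rather than parsed CSS, CSS specificity is ignored, and the special class is a hypothetical second class assumed to belong to some more specific comment action. Later classes simply override earlier ones:

```python
# Hypothetical rules, as they might be read from a theme's CSS file.
rules = {
    "comment": {
        "color": "dimgray",
        "font-family": "serif",
        "font-style": "italic",
    },
    "special": {
        "color": "darkred",
        "font-weight": "bold",
    },
}


def combine(classes, rules):
    """Merge the properties of all matching classes, in order.

    Properties of later classes override those of earlier ones;
    classes without a rule are skipped.
    """
    props = {}
    for cls in classes:
        props.update(rules.get(cls, {}))
    return props


plain = combine(["comment"], rules)
special = combine(["comment", "special"], rules)
```

In this sketch, special keeps the serif italic font from the comment rule but takes its color and weight from the more specific rule.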
General classes#
There are some special classes that define style aspects other than those of individual tokens:
CSS Selector |
defines properties to use for: |
---|---|
|
the text view or block as a whole; e.g. a text editor window, or an HTML
|
|
the line the cursor is in (only background probably makes sense) |
|
text selected by the user (also works in straight HTML in a modern browser) |
|
the current line when the window has focus |
|
selected text when the window has focus |
|
the text editor widget when it is disabled (i.e. the user can’t interact with it). If a text editor supports this at all, probably only the changed colors will be used (via a widget’s palette), not the font. |
|
the current line when the text widget is disabled |
|
selected text when the text widget is disabled |
Editor features#
The following CSS classes are not used by the parce formatter itself, but they can be used to define text editor styles that match the theme (other classes can be designed as well for custom usage of parce themes):
CSS Selector |
defines properties to use for: |
---|---|
|
highlighting leading whitespace, if desired. |
|
highlighting trailing whitespace, if desired. |
|
drawing an “end-of-line” marker, if desired. |
|
the region displaying line numbers. |
|
the region displaying folding markers. |
|
a (book)marked line (at least background should be defined) |
|
search results in the text (at least background should be defined) |
|
a line marked as containing e.g. a compile error in the text (at least background should be defined) |
|
a matching bracket or parenthesis, etc. |
|
a bracket or parenthesis that has no valid match. |
Using multiple themes together#
Suppose you want to highlight tokens from embedded pieces of a different language with a different theme. For example, you have a document containing HTML markup and want to highlight embedded CSS with a different color theme.
To do this, create a formatter and then add other themes for specific languages:
>>> import parce
>>> doc = parce.Document(parce.find("html"), '''
... <html>
... <head>
... <style type="text/css">
... h2 {
... color: green;
... }
... </style>
... </head>
... </html>
... ''')
>>> from parce.out.html import HtmlFormatter
>>> f = HtmlFormatter(parce.theme_by_name('default'))
>>> f.add_theme(parce.find("css").language, parce.theme_by_name('dark'))
>>> print(f.full_html(parce.Cursor(doc, 0, None)))
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
</head>
<body>
<div class="parce">
<pre style="white-space: pre; background-color: #fffff0; color: #000000; font-family: monospace;">
&lt;<span style="color: #00008b; font-weight: bold;">html</span>&gt;
&lt;<span style="color: #00008b; font-weight: bold;">head</span>&gt;
&lt;<span style="color: #00008b; font-weight: bold;">style</span> <span style="color: #1e90ff;">type</span>=<span style="color: #b22222;">"text/css"</span>&gt;
<span style="color: #87cefa; font-weight: bold;">h2</span> <span style="font-weight: bold;">{</span>
<span style="color: #4169e1; font-weight: bold;">color</span>: <span style="color: #2e8b57;">green</span>;
<span style="font-weight: bold;">}</span>
&lt;/<span style="color: #00008b; font-weight: bold;">style</span>&gt;
&lt;/<span style="color: #00008b; font-weight: bold;">head</span>&gt;
&lt;/<span style="color: #00008b; font-weight: bold;">html</span>&gt;
</pre>
</div>
</body>
</html>
We used the default theme as the default theme, and the dark theme for
text that is parsed by the CSS language.
In your browser, the resulting HTML-formatted text looks like this:
HTML
<html> <head> <style type="text/css"> h2 { color: green; } </style> </head> </html>
This example is not particularly beautiful, because the two themes are not
really related; the CSS colors are quite light, because they expect a dark
background. By default, the background of embedded language themes is not used.
To force the formatter to use the default background color of embedded themes,
add them to the formatter with add_baseformat = True:
>>> f.add_theme(parce.find("css").language, parce.theme_by_name('dark'), True)
>>> print(f.full_html(parce.Cursor(doc, 0, None)))
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
</head>
<body>
<div class="parce">
<pre style="white-space: pre; background-color: #fffff0; color: #000000; font-family: monospace;">
&lt;<span style="color: #00008b; font-weight: bold;">html</span>&gt;
&lt;<span style="color: #00008b; font-weight: bold;">head</span>&gt;
&lt;<span style="color: #00008b; font-weight: bold;">style</span> <span style="color: #1e90ff;">type</span>=<span style="color: #b22222;">"text/css"</span>&gt;<span style="background-color: #000000; color: #fffff0; font-family: monospace;">
</span><span style="background-color: #000000; color: #87cefa; font-family: monospace; font-weight: bold;">h2</span><span style="background-color: #000000; color: #fffff0; font-family: monospace;"> </span><span style="background-color: #000000; color: #fffff0; font-family: monospace; font-weight: bold;">{</span><span style="background-color: #000000; color: #fffff0; font-family: monospace;">
</span><span style="background-color: #000000; color: #4169e1; font-family: monospace; font-weight: bold;">color</span><span style="background-color: #000000; color: #fffff0; font-family: monospace;">: </span><span style="background-color: #000000; color: #2e8b57; font-family: monospace;">green</span><span style="background-color: #000000; color: #fffff0; font-family: monospace;">;
</span><span style="background-color: #000000; color: #fffff0; font-family: monospace; font-weight: bold;">}</span>
&lt;/<span style="color: #00008b; font-weight: bold;">style</span>&gt;
&lt;/<span style="color: #00008b; font-weight: bold;">head</span>&gt;
&lt;/<span style="color: #00008b; font-weight: bold;">html</span>&gt;
</pre>
</div>
</body>
</html>
When a theme is added to a formatter with add_baseformat = True, two things
are done: 1) all unparsed text (text between tokens) is formatted using the
baseformat of the embedded theme, and 2) all textformats of the embedded theme
are combined with the theme’s baseformat.
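The second step can be pictured with a small sketch. It models formats as plain dictionaries of CSS properties (an assumption made for illustration; the formatter works with TextFormat objects), and with_base and to_css are hypothetical helper names. The property values are taken from the output above:

```python
def with_base(baseformat, textformat):
    """Combine a token's format with the theme's base format.

    The token's own properties override those of the base format.
    """
    combined = dict(baseformat)
    combined.update(textformat)
    return combined


def to_css(props):
    """Render a property dictionary as an inline style string, sorted by name."""
    return " ".join(f"{name}: {props[name]};" for name in sorted(props))


# base format of the embedded (dark) theme
base = {
    "background-color": "#000000",
    "color": "#fffff0",
    "font-family": "monospace",
}
# format of the CSS selector token "h2" in the dark theme
selector = {"color": "#87cefa", "font-weight": "bold"}

token_css = to_css(with_base(base, selector))
# unparsed text between tokens simply gets the base format
gap_css = to_css(base)
```

The resulting strings correspond to the style attributes of the `h2` span and of the whitespace spans in the output above.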
The output looks like:
HTML
<html> <head> <style type="text/css"> h2 { color: green; } </style> </head> </html>
Of course, the dark and default themes do not look good at all when used
together, but this example shows that, with well-designed themes and
language definitions, you can create sophisticated highlighting and code
formatting with parce.