The query module
Query the tree using the query property of Context and Token.
Normally you do not need to import this module in order to use it. Using the
query property of any Token or (in most cases) Context, you
can query the token tree to find tokens and contexts, based on lexicons and/or
actions and text contents. You can chain calls in an XPath-like fashion.
This module supplements the various find_xxx methods of every Context object. A query is a generator that starts at the query property of a Context or Token object and initially yields just that object.
You can navigate using children, all, first, last, [n], [n:n], [n:n:n], next, previous, right, left, right_siblings, left_siblings, map(), parent and ancestors. Use uniq to remove double occurrences of nodes, which can e.g. happen when navigating to the parent of all nodes.
You can narrow down the search using tokens, contexts, remove_ancestors, remove_descendants, slice() and filter().
You can search for tokens using ('text') or (lexicon), startingwith(), endingwith(), containing(), matching(), action() or in_action(). The special prefix is_not inverts the query, so query.is_not.containing("bla") yields Tokens that do not contain the text "bla".
Find all tokens that are the first child of a Context with bla lexicon:
Find (in Xml) all attributes with name ‘name’ that are in a <bla> tag:
Find all tags containing “hi” in their text nodes:
Find all comments that have TODO in it:
Find all “\version” tokens in the root context, that have a “2” in the version string after it:
(t for t in root.query.children('\\version') if t.query.next.next.containing('2'))
Which could also be written as:
root.query.children('\\version').filter(lambda t: t.query.next.next.containing('2'))
A query is a generator, so you can iterate over the results:
for attrs in q.all.action(Name.Tag)('origin').right:
    for atr in attrs.query.action(Name.Attribute):
        print(atr)
For debugging purposes, there are also the list() construct and the count() method:
root.query.all.action(Name.Tag)("img").count()  # number of "img" tags
list(root.query.all.action(Name.Tag)("img"))    # list of all "img" tag name tokens
Note that a (partial) query can be reused; iterating over it again simply restarts the iteration over the results. The above could also be written as:
q = root.query.all.action(Name.Tag)("img")
q.count()  # number of "img" tags
list(q)    # list of all "img" tag name tokens
A query evaluates to False if it does not yield a single result:
if token.query.ancestors(LilyPond.header):
    do_something()  # the token is a descendant of a LilyPond.header context
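The reuse and truthiness behaviour described above can be modelled without parce. The sketch below is only a stand-in under stated assumptions: MiniQuery is a hypothetical class, not parce's Query, and it simply wraps a callable that produces a fresh iterator each time.

```python
class MiniQuery:
    """Hypothetical stand-in for a reusable query (not parce's Query class)."""

    def __init__(self, make_iter):
        # make_iter: zero-argument callable returning a fresh iterator
        self._make_iter = make_iter

    def __iter__(self):
        # every iteration starts over from the beginning
        return self._make_iter()

    def __bool__(self):
        # True as soon as one result is found, False if there is none
        for _ in self:
            return True
        return False

    def count(self):
        # number of results, computed by iterating once
        return sum(1 for _ in self)


tags = ["img", "p", "img"]
q = MiniQuery(lambda: (t for t in tags if t == "img"))
empty = MiniQuery(lambda: iter(()))

print(q.count(), list(q), bool(q), bool(empty))
```

Because each iteration asks for a fresh iterator, calling count() does not exhaust the results that a later list() would see.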
You can also directly instantiate a Query object for a list of nodes, if you want to query those in one go:
q = Query.from_nodes(nodes)
Summary of the query methods:
Endpoint methods (some are mainly for debugging):
Selecting (filtering) nodes:
These methods filter out current nodes without adding new nodes to the selection:
The is_not operator inverts the meaning of the next query, e.g. query.is_not.containing("bla").
The following query methods can be inverted by prepending is_not:
There is a subtle difference between action and in_action: with action, the token's action must match exactly; with in_action, a token is selected when its action matches exactly or is a descendant of the given action.
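The difference between exact matching and descendant matching can be illustrated with a small model. This is a sketch under an assumption: actions are represented here as tuples of name parts, which is a hypothetical stand-in and not how parce stores actions. The helper names are likewise invented for illustration.

```python
# Hypothetical stand-ins for an action hierarchy: Name > Name.Tag, Name.Attribute
NAME = ("Name",)
NAME_TAG = ("Name", "Tag")
NAME_ATTRIBUTE = ("Name", "Attribute")


def matches_action(token_action, *actions):
    """Exact match only, as described for action()."""
    return token_action in actions


def matches_in_action(token_action, *actions):
    """Exact match or descendant of a given action, as described for in_action()."""
    return any(token_action[:len(a)] == a for a in actions)


print(matches_action(NAME_TAG, NAME))     # exact comparison only
print(matches_in_action(NAME_TAG, NAME))  # NAME_TAG descends from NAME
```

In this model a token with the NAME_TAG action is not selected by an exact match on NAME, but it is selected by the descendant-aware match.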
A Query navigates and filters a node tree.
Create a Query object querying a list of nodes in one go.
Return True if there is at least one result.
List current selection of this Query, for debugging purposes.
Compute the length of the iterable.
Dump all selected nodes to the console (or to file).
Pick the first value, or return the default.
Pick the last value, or return the default.
Return the text range as a tuple (pos, end).
pos is the lowest pos of the nodes in the current set, and end is the highest end of the nodes. If the result set is empty, (-1, -1) is returned.
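The rule for computing the range can be written out in a few lines. This is a minimal sketch, assuming each node is represented by a plain (pos, end) pair; text_range is a hypothetical helper name, not part of the module.

```python
def text_range(nodes):
    """Return (lowest pos, highest end) of (pos, end) pairs, or (-1, -1) if empty."""
    nodes = list(nodes)
    if not nodes:
        return (-1, -1)
    lowest_pos = min(pos for pos, _ in nodes)
    highest_end = max(end for _, end in nodes)
    return (lowest_pos, highest_end)


print(text_range([(5, 8), (2, 4), (10, 12)]))
print(text_range([]))
```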
Delete all selected nodes from their parents.
First applies remove_descendants, so that no unnecessary deletes are done. If a context would become empty, that context itself is deleted instead of all its children (except for the root of course). Returns the number of nodes that were deleted.
If you delete tokens from a tree which belong to a group, the tree cannot reliably be used by a treebuilder for a partial rebuild.
Get the specified item or items of every context node.
Note that the result nodes always form a flat iterable. No IndexError will be raised if an index would be out of range for any node.
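The flat result and the absence of IndexError can be modelled as follows. The sketch assumes contexts are plain Python lists; items is a hypothetical helper name used only for this illustration.

```python
def items(contexts, key):
    """Yield the item(s) at key from every context, flattened into one stream.

    key is an int or a slice; contexts too short for an int key are skipped.
    """
    for ctx in contexts:
        if isinstance(key, slice):
            yield from ctx[key]          # slicing never raises IndexError
        elif -len(ctx) <= key < len(ctx):
            yield ctx[key]               # only index when it is in range


contexts = [["a", "b", "c"], ["d"], []]
print(list(items(contexts, 1)))
print(list(items(contexts, slice(0, 2))))
```

Contexts that are too short for the requested index simply contribute nothing to the result.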
All direct children of the current nodes.
All descendants, contexts and their nodes.
Shortcut for all.tokens.
Shortcut for all.contexts.
Yield the parent of every node.
This can lead to many double occurrences of the same node in the result set; use uniq to fix that.
Yield the ancestor contexts of every node.
Yield the first node of every context node, same as [0].
Yield the last node of every context node, same as [-1].
Yield the next token, if any.
Yield the previous token, if any.
Yield Tokens in forward direction.
Yield Tokens in backward direction.
Yield the right sibling, if any.
Yield the left sibling, if any.
Yield the right siblings, if any.
Yield the left siblings, if any.
Call the function on every node and yield its results, which should be zero or more nodes as well.
Yield nodes for which the predicate returns a value that evaluates to True.
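The two descriptions above correspond to a flat-map and a predicate filter. A minimal sketch, assuming nodes are plain Python objects; qmap and qfilter are hypothetical names chosen to avoid shadowing the builtins.

```python
def qmap(nodes, function):
    """Call function on every node and yield each node it returns."""
    for node in nodes:
        yield from function(node)   # function may return zero or more nodes


def qfilter(nodes, predicate):
    """Yield only the nodes for which predicate(node) is truthy."""
    return (node for node in nodes if predicate(node))


words = ["ab", "", "cde"]
letters = list(qmap(words, lambda w: list(w)))       # "" contributes nothing
long_words = list(qfilter(words, lambda w: len(w) > 2))
print(letters, long_words)
```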
Get only the tokens.
Get only the contexts.
Remove double occurrences of the same node from the result set.
This can happen e.g. when you find the parent of multiple nodes.
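Removing double occurrences while keeping the original order can be sketched like this. The example assumes that "the same node" means the same object (identity), not merely an equal one; that is an assumption of this sketch, and the uniq function below is a stand-in rather than the module's code.

```python
def uniq(nodes):
    """Yield each node object once, in order of first appearance."""
    seen = set()
    for node in nodes:
        if id(node) not in seen:    # compare by identity, not by equality
            seen.add(id(node))
            yield node


parent_a = ["x", "y"]
parent_b = ["x", "y"]               # equal to parent_a, but a different object
parents = [parent_a, parent_a, parent_b]
result = list(uniq(parents))
print(len(result))
```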
Slice the full result set, using
This can help narrowing down the result set. For example:
will continue the query with only the first occurrence of a token “blaat”, and then look for at most three right siblings. If the slice(1) were not there, all the right siblings would become one large result set because you wouldn’t know how many tokens “blaat” were matched.
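The effect of the two slices can be imitated with itertools.islice on plain generators. This is a sketch, not the module's code: the siblings list and the right_siblings helper are hypothetical stand-ins for a context and its navigation.

```python
from itertools import islice

# Hypothetical flat list of sibling token texts
siblings = ["x", "blaat", "a", "b", "c", "d", "blaat", "e"]


def right_siblings(index):
    """Yield everything to the right of the token at index."""
    return iter(siblings[index + 1:])


# With the first slice: keep only the first "blaat", then at most three siblings
matches = (i for i, text in enumerate(siblings) if text == "blaat")
first_only = islice(matches, 1)
after_first = (s for i in first_only for s in right_siblings(i))
result = list(islice(after_first, 3))

# Without the first slice: right siblings of every "blaat" merge into one stream
all_matches = (i for i, text in enumerate(siblings) if text == "blaat")
merged = [s for i in all_matches for s in right_siblings(i)]

print(result, len(merged))
```

Without the first slice, the siblings of both matches end up in a single stream, so a later slice could no longer be tied to one particular match.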
Remove nodes that have ancestors in the current node list.
Remove nodes that have descendants in the current node list.
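The two removals above can be illustrated on a tiny tree. This is a sketch under stated assumptions: Node is a hypothetical class with only a parent link, and the helper names describe the behaviour rather than mirroring the names of the query methods.

```python
class Node:
    """Hypothetical tree node with just a name and a parent link."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def ancestors(self):
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


def without_nodes_having_ancestors(nodes):
    """Drop every node that has an ancestor in the selection."""
    selected = {id(n) for n in nodes}
    return [n for n in nodes
            if not any(id(a) in selected for a in n.ancestors())]


def without_nodes_having_descendants(nodes):
    """Drop every node that has a descendant in the selection."""
    has_descendant = {id(a) for n in nodes for a in n.ancestors()}
    return [n for n in nodes if id(n) not in has_descendant]


root = Node("root")
ctx = Node("ctx", root)
tok = Node("tok", ctx)
other = Node("other", root)
selection = [ctx, tok, other]

outer = [n.name for n in without_nodes_having_ancestors(selection)]
inner = [n.name for n in without_nodes_having_descendants(selection)]
print(outer, inner)
```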
Invert the next query.
Only yield contexts, with min_length, or with length between min and max.
Yield a restricted set: tokens and/or contexts must fall in the range start→end.
Yield token if token has that text, or context if context has that lexicon.
You can even mix the types if you’d need to:
for n in tree.query.all("%", Lang.comment):
    # do something
yields tokens that are a percent sign and contexts that have the Lang.comment lexicon.
Yield tokens that start with text.
Yield tokens that end with text.
Yield tokens that contain the specified text.
Yield tokens matching the regular expression.
re.search() is used, so the expression can match anywhere unless you use ^ or $ characters.
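The consequence of using re.search() can be seen with the standard library alone. The sketch below works on plain strings standing in for token texts; the sample texts are made up for illustration.

```python
import re

# Hypothetical token texts
texts = ["TODO: fix", "a TODO here", "done"]

# Unanchored: matches anywhere in the text
anywhere = [t for t in texts if re.search(r"TODO", t)]

# Anchored with ^: matches only at the start of the text
at_start = [t for t in texts if re.search(r"^TODO", t)]

print(anywhere, at_start)
```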
Yield those tokens whose action is one of the given actions.
Yield those tokens whose action is or inherits from one of the given actions.