The work module
This module defines the Worker class.
A Worker is designed to run a TreeBuilder and a Transformer as soon as source text is updated. It is possible to run those jobs in a background thread.
The whole process is divided into stages, and is performed by fully exhausting the Worker.process() generator.
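The staged design can be illustrated with a minimal, self-contained sketch. ToyWorker and its two stage names are hypothetical stand-ins, not parce's actual class or stages; the sketch only shows the general idea of a run_process() method that drives the work by exhausting a process() generator.

```python
class ToyWorker:
    """Hypothetical stand-in for a worker; not parce's implementation."""

    def __init__(self):
        self.log = []

    def process(self):
        """Generator doing the work in stages, yielding after each stage."""
        self.log.append("build")
        yield "build"
        self.log.append("transform")
        yield "transform"

    def run_process(self):
        """Exhaust the process() generator fully."""
        for _stage in self.process():
            pass


worker = ToyWorker()
worker.run_process()
print(worker.log)
```

Because every stage ends at a yield, the code that drives the generator is free to decide where and when the next stage runs.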
The Worker is intended to be used as the companion of the Document class. It causes the TreeBuilder and (if set) the Transformer to do their jobs in a configurable and flexible manner.
It is possible to wait for the tree or the transform result, or to arrange for a callback to be called when the work is done. As Worker inherits from Observable, you can connect to its events to be notified when a tree or transform is updated.
Inherit from Worker to implement other features, or another way to use a background thread for (parts of) the job.
class Worker(treebuilder, transformer=None)
Bases: parce.util.Observable
Runs the TreeBuilder and the Transformer.
Initialize with a TreeBuilder and optionally a Transformer. It is not possible to change the treebuilder later, but you can set another transformer, or use no transformer at all.
Call update() to re-run the treebuilder on changed or new text, or to use a new root lexicon. Call set_transformer() to set another Transformer, which triggers a re-run of the transformer alone.
You can connect() to the following signals:
"started": emitted when a build process has started
"tree_updated": emitted when a tree (re)build has finished; the handler is called with two arguments, start and end, that denote the updated text range
"tree_finished": emitted when a (re)build has finished; the handler is called without arguments
"transform_finished": emitted when a transform rebuild has finished; the handler is called without arguments
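As a rough illustration of the handler signatures listed above, here is a self-contained sketch. MiniObservable is a hypothetical stand-in written for this example, not parce.util.Observable itself, and the order in which the events are emitted below simply follows the order of the list.

```python
from collections import defaultdict


class MiniObservable:
    """Hypothetical stand-in for an observable base class."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def connect(self, event, func):
        """Register func to be called when event is emitted."""
        self._handlers[event].append(func)

    def emit(self, event, *args):
        """Call every handler connected to event with the given arguments."""
        for func in self._handlers[event]:
            func(*args)


received = []


def record(name):
    """Return a handler that stores the event name and its arguments."""
    def handler(*args):
        received.append((name, *args))
    return handler


worker = MiniObservable()
for name in ("started", "tree_updated", "tree_finished", "transform_finished"):
    worker.connect(name, record(name))

# Simulate one update cycle (assumed order, for illustration only)
worker.emit("started")
worker.emit("tree_updated", 0, 12)   # handler receives start and end
worker.emit("tree_finished")
worker.emit("transform_finished")
```

Only the "tree_updated" handler receives arguments; the other three handlers are called without any.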
set_transformer(transformer)
Set the Transformer to use.
You may use one Transformer for multiple Workers. Use None to remove the current transformer.
Setting a new Transformer updates the transform result. This method should always be called from the main thread.
update(text, root_lexicon=False, start=0, removed=0, added=None)
Start a process to update the tree and the transform.
For the meaning of the arguments, see treebuilder.TreeBuilder.rebuild().
This method should always be called from the main thread.
start()
Start the update process.
Sets the initial state and then calls run_process(). This method should always be called from the main thread.
run_process()
Exhaust the process() generator.
Called by start(); performs the work after the initial state has been set up.
This method should always be called from the main thread, but may be reimplemented to do (parts of) the work in a background thread.
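The reimplementation idea can be sketched as follows. ToyWorker, ToyBackgroundWorker, the wait() helper and the thread name are all hypothetical and written for this example; this is not how parce's BackgroundWorker is implemented, only one plausible way to move the exhausting of a generator into a thread.

```python
import threading


class ToyWorker:
    """Hypothetical worker that does all the work in the calling thread."""

    def __init__(self):
        self.stages = []

    def process(self):
        """Generator recording each stage and the thread it ran in."""
        for stage in ("build", "transform"):
            self.stages.append((stage, threading.current_thread().name))
            yield stage

    def run_process(self):
        """Exhaust the process() generator in the calling thread."""
        for _stage in self.process():
            pass


class ToyBackgroundWorker(ToyWorker):
    """Hypothetical subclass that exhausts the generator in a thread."""

    def __init__(self):
        super().__init__()
        self._thread = None

    def run_process(self):
        """Called from the main thread; returns immediately."""
        self._thread = threading.Thread(target=self._exhaust, name="toy-worker")
        self._thread.start()

    def _exhaust(self):
        """Runs in the background thread."""
        for _stage in self.process():
            pass

    def wait(self):
        """Block until the background job, if any, has finished."""
        if self._thread is not None:
            self._thread.join()


worker = ToyBackgroundWorker()
worker.run_process()
worker.wait()
print(worker.stages)
```

A real implementation would also have to hand results and notifications back to the main thread safely, which this sketch leaves out.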
process()
Generator performing the actual process, exhausted by run_process().
wait_build()
Wait for the build job to be completed.
Immediately returns if there is no build job active.
wait_transform()
Wait for the transform job to be completed.
Immediately returns if there is no transform job active.
get_root(wait=False, callback=None)
Return the root element of the completed tree.
This is simply the builder’s root instance attribute, but this method only returns the tree when it is up-to-date.
If wait is True, this call blocks until tokenizing is done, and the full tree is returned. If wait is False, None is returned if the tree is still being built.
If a callback is given and tokenizing is still busy, that callback is called once when tokenizing is ready, with the Worker as the sole argument.
Note that, for the lifetime of a Worker and a TreeBuilder, the root element is always the same. The root element is also accessible in the builder’s root attribute. But by using this method you can be sure that you are dealing with a complete and fully intact tree.
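The wait and callback behaviour described above can be mimicked with a small, self-contained sketch. ToyResult and its finish() method are hypothetical names made up for this example and say nothing about parce's internals; the sketch only reproduces the documented contract: return None while busy, block when wait is True, and call a registered callback once with the object as the sole argument.

```python
import threading


class ToyResult:
    """Hypothetical holder mimicking the wait/callback contract."""

    def __init__(self):
        self._done = threading.Event()
        self._lock = threading.Lock()
        self._callbacks = []
        self._root = None

    def finish(self, root):
        """Called by the job when the tree is complete."""
        with self._lock:
            self._root = root
            self._done.set()
            callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:
            callback(self)  # called once, with this object as sole argument

    def get_root(self, wait=False, callback=None):
        with self._lock:
            if self._done.is_set():
                return self._root  # result is ready: return it immediately
            if callback is not None:
                self._callbacks.append(callback)
        if wait:
            self._done.wait()  # block until finish() has been called
            return self._root
        return None  # still busy and not waiting


result = ToyResult()
notified = []
job = threading.Thread(target=result.finish, args=("root",))

first = result.get_root(callback=notified.append)  # job not started yet
job.start()
second = result.get_root(wait=True)  # blocks until the job has finished
job.join()
```

Here first is None because the job had not run yet, second is the finished result, and the callback has been called exactly once by the time the job thread is joined.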
get_transform(wait=False, callback=None)
Return the transformed result.
If wait is True, the call blocks until (tokenizing and) transforming is finished. If wait is False, None is returned if the transform is not yet ready.
If a callback is given and transformation is not finished yet, that callback is called once when transforming is ready, with this Worker as the sole argument.
If no Transformer was set, None is always returned.
slot_invalidate(context)
Called when TreeBuilder emits ("invalidate", context).
Clears the node and its parents from the transform cache.
class BackgroundWorker(treebuilder, transformer=None)
Bases: parce.work.Worker
A Worker implementation that does the work in a background thread.
class WorkerDocumentMixin(root_lexicon=None, text='', worker=None, transformer=None)
Bases: object
Adds a Worker to a Document to automatically update the tokenized tree and the transformed result.
Combine this class with a subclass of AbstractDocument (see the document module).
Every time the text is modified, only the modified part is retokenized. If that changes the lexicon in which the last part (after the modified part) starts, that part is also retokenized, until the state (the list of active lexicons) matches the state of existing tokens.
Also the transformed result, if a transformer is set, is updated.
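How such a mixin can cooperate with a document class is sketched below. Every name here (ToyAbstractDocument, ToyWorker, ToyWorkerMixin, the replace() method and the contents_changed() hook) is a hypothetical stand-in written for this example and is not parce's API. The sketch only shows the pattern of a mixin that forwards every text change, with its position and size, to a worker.

```python
class ToyAbstractDocument:
    """Hypothetical document base: stores text and reports each change."""

    def __init__(self, text=""):
        self._text = text

    def text(self):
        return self._text

    def replace(self, start, end, new):
        """Replace text[start:end] with new and report the change."""
        removed = end - start
        self._text = self._text[:start] + new + self._text[end:]
        self.contents_changed(start, removed, len(new))

    def contents_changed(self, start, removed, added):
        """Hook called after every modification; does nothing by default."""


class ToyWorker:
    """Hypothetical worker that only records the update requests."""

    def __init__(self):
        self.updates = []

    def update(self, text, start=0, removed=0, added=None):
        self.updates.append((text, start, removed, added))


class ToyWorkerMixin:
    """Hypothetical mixin forwarding text changes to a worker."""

    def __init__(self, text="", worker=None):
        self._worker = worker or ToyWorker()
        super().__init__(text)
        self._worker.update(text)  # initial full build

    def contents_changed(self, start, removed, added):
        # Only the modified range is passed on, so the worker can limit its work
        self._worker.update(self.text(), start, removed, added)


class ToyDocument(ToyWorkerMixin, ToyAbstractDocument):
    """The mixin is listed first so its hook overrides the base hook."""


doc = ToyDocument("hello world")
doc.replace(6, 11, "there")
print(doc._worker.updates)
```

The mixin comes first in the base class list so that its contents_changed() hook takes precedence over the empty one in the document base.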
set_transformer(transformer)
Set a new Transformer in the worker.
Specify None to remove the current transformer.
set_root_lexicon(root_lexicon)
Set the root lexicon to use to tokenize the text.
Triggers an update of the tokenized tree.
get_root(wait=False, callback=None)
Get the root element of the completed tree.
If wait is True, this call blocks until tokenizing is done, and the full tree is returned. If wait is False, None is returned if the tree is still being built.
If a callback is given and tokenizing is still busy, that callback is called once when tokenizing is ready, with this Document as the sole argument.
open_lexicons()
Return the list of lexicons that were left open at the end of the text.
The root lexicon is not included; if parsing ended in the root lexicon, this list is empty, and the text can be considered “complete.”
modified_range()
Return a two-tuple (start, end) describing the range that was re-tokenized.
get_transform(wait=False, callback=None)
Return the transformed result (if a Transformer is active in the Worker).
If wait is True, the call blocks until (tokenizing and) transforming is finished. If wait is False, None is returned if the transform is not yet ready.
If a callback is given and transformation is not finished yet, that callback is called once when transforming is ready, with this Document as the sole argument.