The document module

Document and Cursor form the basis for handling documents in the parce package.

A Document contains a text string that is mutable via item and slice methods.

If you make modifications while inside a context (using the Python context manager protocol), the modifications are only applied when the context exits for the last time.
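The deferred behaviour can be pictured with a nesting counter. The following is only a minimal sketch and not parce's implementation: the class BufferedText, its replace() method and its private attributes are hypothetical names chosen for illustration. Edits are queued while at least one context is open and applied once the outermost context exits.

```python
class BufferedText:
    """Hypothetical stand-in for a document; not parce's actual class."""

    def __init__(self, text=""):
        self._text = text
        self._depth = 0        # number of contexts currently entered
        self._pending = []     # queued (start, end, text) edits

    def __enter__(self):
        self._depth += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._depth -= 1
        if self._depth == 0:
            self._flush()      # only the outermost exit applies the edits

    def replace(self, start, end, text):
        self._pending.append((start, end, text))
        if self._depth == 0:
            self._flush()      # outside any context: apply immediately

    def _flush(self):
        # Apply from the back so that earlier positions stay valid.
        for start, end, text in sorted(self._pending, reverse=True):
            self._text = self._text[:start] + text + self._text[end:]
        self._pending.clear()

    def text(self):
        return self._text


doc = BufferedText("some string")
with doc:
    with doc:                      # nested entry
        doc.replace(5, 5, "different ")
    inside = doc.text()            # outer context still open: old text
after = doc.text()                 # outermost context has exited
```

Here inside still holds the original text, while after holds the modified text.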

You can use a Cursor to keep track of positions in a document. The position (and selection) of a Cursor is adjusted when the text in the document is changed.

For tokenized documents, parce inherits from this base class.

class AbstractDocument[source]

Bases: object

Base class for a Document.

A Document is like a mutable string. E.g.:

from parce.document import Document

d = Document('some string')
with d:
    d[5:5] = 'different '
d.text()  # --> 'some different string'

You can also make modifications outside a context; they are then applied immediately, which is slower.

You can enter a context multiple times; changes are applied when the outermost context exits.

To make a Document work, you should at least implement:

  • text()

  • _update_contents()

The method text() should simply return the entire text string.

The method _update_contents() should read the (start, end, text) tuples from the list in self._changes, which is already sorted. These changes will never overlap. All start/end positions refer to the original state of the text.
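Because the changes are sorted, do not overlap and all refer to the original text, they can be applied in a single forward pass. The helper below is a hypothetical, standalone sketch of that idea; in a real subclass the same logic would live in _update_contents() and read the tuples from self._changes.

```python
def apply_changes(text, changes):
    """Apply sorted, non-overlapping (start, end, replacement) tuples.

    All positions refer to the original ``text``.
    """
    pieces = []
    last = 0                              # end of the previously handled change
    for start, end, replacement in changes:
        pieces.append(text[last:start])   # untouched text before this change
        pieces.append(replacement)
        last = end
    pieces.append(text[last:])            # untouched tail
    return "".join(pieces)


result = apply_changes("some string", [(0, 4, "any"), (5, 5, "different ")])
```

Collecting the pieces and joining them once avoids rebuilding the whole string for every change.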

For efficiency reasons, you might want to reimplement:

  • set_text()

  • __len__()

  • _get_contents() (called by __getitem__)

text()[source]

Should return the text.

set_text(text)[source]

Set the text.

__str__()[source]

Return the text.

__format__(formatstr)[source]

Format our text.

__setitem__(key, text)[source]
__delitem__(key)[source]

Delete the slice of text.

__getitem__(key)[source]

Get a character or a slice of text.

insert(pos, text)[source]

Insert text at pos.

replace(old, new, start=0, end=None, count=0)[source]

Replace occurrences of old with new in region start->end.

If count > 0, it specifies the maximum number of occurrences to replace; the default of 0 places no limit.
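The region and count semantics can be illustrated on a plain string. This is a sketch under the assumption that the region behaves like the slice text[start:end]; replace_in_region is a hypothetical helper, not part of parce. Note that Python's str.replace() uses -1 rather than 0 for "no limit".

```python
def replace_in_region(text, old, new, start=0, end=None, count=0):
    """Replace ``old`` with ``new`` inside text[start:end] only."""
    if end is None:
        end = len(text)
    region = text[start:end]
    # str.replace() uses -1 for "no limit", so map count <= 0 to -1.
    region = region.replace(old, new, count if count > 0 else -1)
    return text[:start] + region + text[end:]


s = "a-b-c-d"
all_replaced = replace_in_region(s, "-", "+")
limited = replace_in_region(s, "-", "+", count=2)
in_region = replace_in_region(s, "-", "+", start=2)
```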

re_sub(pattern, replacement, start=0, end=None, count=0, flags=0)[source]

Replace regular expression matches of pattern with replacement.

Backreferences are allowed. The region can be set with start and end. If count > 0, it specifies the maximum number of occurrences to replace.

The replacement argument can also be a function, which is then called with the match object and should return the replacement string.
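Both kinds of replacement can be tried out with Python's re module on a plain string. The helper re_sub_region is hypothetical and simply slices the region out of the text, which is an assumption: with slicing, anchors and lookbehinds cannot see text outside the region.

```python
import re


def re_sub_region(text, pattern, replacement, start=0, end=None, count=0, flags=0):
    """Substitute regex matches inside text[start:end] only."""
    if end is None:
        end = len(text)
    region = re.sub(pattern, replacement, text[start:end], count=count, flags=flags)
    return text[:start] + region + text[end:]


s = "x=1, y=22, z=333"
# A replacement string with backreferences swaps name and value.
swapped = re_sub_region(s, r"(\w)=(\d+)", r"\2=\1")
# A replacement function receives the match object; count limits it to two matches.
doubled = re_sub_region(s, r"\d+", lambda m: str(int(m.group()) * 2), count=2)
```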

trim(start=0, end=None)[source]

Remove trailing whitespace in the specified region.
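As an illustration only, and assuming that "trailing whitespace" means spaces and tabs directly before a line end, the effect on a plain string could look like this. The helper trim_region is hypothetical; it treats the end of the region as a line end too.

```python
import re


def trim_region(text, start=0, end=None):
    """Strip spaces and tabs before each line end inside text[start:end]."""
    if end is None:
        end = len(text)
    # With re.MULTILINE, "$" matches before every newline and at the very end.
    region = re.sub(r"[ \t]+$", "", text[start:end], flags=re.MULTILINE)
    return text[:start] + region + text[end:]


trimmed = trim_region("one  \ntwo\t\nthree ")
```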

translate(mapping, start=0, end=None, count=0, whole_words=False)[source]

Replace every occurrence of a key in mapping with its value.

If whole_words is True, only match the keys at word boundaries.
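One way to picture this is a single regular-expression pass over the region, so that replaced text is never translated a second time. The sketch below is an assumption about the technique, not parce's code; translate_region is a hypothetical helper operating on a plain string.

```python
import re


def translate_region(text, mapping, start=0, end=None, count=0, whole_words=False):
    """Replace each key of ``mapping`` found in text[start:end] with its value."""
    if not mapping:
        return text
    if end is None:
        end = len(text)
    # Longest keys first, so that a longer key wins over its own prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = "|".join(re.escape(k) for k in keys)
    if whole_words:
        pattern = r"\b(?:" + pattern + r")\b"
    region = re.sub(pattern, lambda m: mapping[m.group()], text[start:end], count=count)
    return text[:start] + region + text[end:]


table = {"cat": "dog", "dog": "cat"}
swapped = translate_region("cat dog catalog", table)
words_only = translate_region("cat dog catalog", table, whole_words=True)
```

Because all keys are matched in one pass, the two words are swapped rather than both ending up the same.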

contents_changed(position, removed, added)[source]

Called by _apply(). The default implementation does nothing.

class Document(text='')[source]

Bases: parce.document.AbstractDocument, parce.util.Observable

A basic Document with undo and modified status.

This Document implements AbstractDocument by holding the text in a hidden _text attribute. It adds support for undo/redo and has a modified() state.

It also inherits from Observable and emits the following events:

"contents_change" (position, removed, added):

emitted with position, removed, added arguments whenever the text changes

"contents_changed":

emitted directly after the previous event, but without arguments

"modification_changed" (bool):

emitted when the modified() state changes; True means the document was modified

"undo_available" (bool):

emitted when the availability of undo() changes

"redo_available" (bool):

emitted when the availability of redo() changes
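The order of the two contents events can be sketched with a tiny observer. MiniObservable and its connect() and emit() methods are hypothetical stand-ins written for this example; the API of parce.util.Observable may differ. The arguments are assumed here to be a position and the lengths of the removed and added text.

```python
class MiniObservable:
    """Hypothetical stand-in; parce.util.Observable may differ."""

    def __init__(self):
        self._handlers = {}

    def connect(self, event, func):
        self._handlers.setdefault(event, []).append(func)

    def emit(self, event, *args):
        for func in self._handlers.get(event, []):
            func(*args)


log = []
doc = MiniObservable()
doc.connect("contents_change",
            lambda position, removed, added: log.append(("change", position, removed, added)))
doc.connect("contents_changed", lambda: log.append(("changed",)))

# What a document might emit after inserting 10 characters at position 5:
doc.emit("contents_change", 5, 0, 10)
doc.emit("contents_changed")
```

A handler that only needs to know that something changed can connect to the argument-free event and ignore the details.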

undo_redo_enabled = True
modified()[source]

Return whether the text was modified.

set_modified(modified)[source]

Set whether the text is modified. This normally happens automatically.

text()[source]

Return all text.

undo()[source]

Undo the last modification.

redo()[source]

Redo the last undone modification.

clear_undo_redo()[source]

Clear the undo/redo stack.

can_undo()[source]

Return True if undo is possible.

can_redo()[source]

Return True if redo is possible.
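How undo(), redo(), can_undo() and can_redo() relate can be shown with a generic two-stack model. This is a common textbook technique and only a sketch: the class UndoStack is hypothetical and stores whole-text snapshots, which is not necessarily how Document records its history.

```python
class UndoStack:
    """Hypothetical two-stack undo/redo model using whole-text snapshots."""

    def __init__(self, text=""):
        self.text = text
        self._undo = []
        self._redo = []

    def edit(self, new_text):
        self._undo.append(self.text)
        self._redo.clear()         # a fresh edit invalidates the redo history
        self.text = new_text

    def can_undo(self):
        return bool(self._undo)

    def can_redo(self):
        return bool(self._redo)

    def undo(self):
        if self._undo:
            self._redo.append(self.text)
            self.text = self._undo.pop()

    def redo(self):
        if self._redo:
            self._undo.append(self.text)
            self.text = self._redo.pop()

    def clear_undo_redo(self):
        self._undo.clear()
        self._redo.clear()


s = UndoStack("a")
s.edit("ab")
s.edit("abc")
s.undo()
after_undo = (s.text, s.can_undo(), s.can_redo())
s.redo()
after_redo = (s.text, s.can_undo(), s.can_redo())
```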

contents_changed(position, removed, added)[source]

Called by _apply().

This implementation emits the "contents_change" and "contents_changed" events.

class Cursor(document, start=0, end=-1)[source]

Bases: object

Describes a certain range (selection) in a Document.

You may change the start and end attributes yourself. Both must be integers; end may also be None, denoting the end of the document.

As long as you keep a reference to the Cursor, its positions are updated when the document changes. When text is inserted at the start position, the position remains the same. But when text is inserted at the end of a cursor, the end position moves along with the new text. E.g.:

from parce.document import Cursor, Document

d = Document('hi there, folks!')
c = Cursor(d, 8, 8)
with d:
    d[8:8] = 'new text'
c.start, c.end  # --> (8, 16)
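The adjustment rule can be written down as a small pure function. This is a sketch consistent with the behaviour described above, not parce's actual code: adjust_cursor is a hypothetical helper, removed and added are assumed to be lengths, and the handling of a change that overlaps a cursor boundary is an assumption.

```python
def adjust_cursor(start, end, position, removed, added):
    """Return the (start, end) of a cursor after a single change.

    ``position`` is where the change begins; ``removed`` and ``added`` are
    the lengths of the removed and the inserted text.
    """
    change_end = position + removed
    delta = added - removed

    # A change that begins at or after start leaves start where it is.
    if position < start:
        if change_end <= start:
            start += delta         # change lies entirely before start
        else:
            start = position       # start was inside the removed text (assumed)

    # end is None means "end of document" and needs no adjustment.
    if end is not None and position <= end:
        if change_end <= end:
            end += delta           # includes an insertion exactly at end
        else:
            end = position + added # end was inside the removed text (assumed)
    return start, end


example = adjust_cursor(8, 8, position=8, removed=0, added=8)
```

With the values from the example above, the function yields (8, 16).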

You can also use a Cursor as key while editing a document:

c = Cursor(d, 8, 8)
with d:
    d[c] = 'new text'

You cannot alter the document via the Cursor.

start
end
document()[source]
text()[source]

Return the selected text, if any.

select(start, end=-1)[source]

Change start and end in one go. End defaults to start.

select_all()[source]

Set start to 0 and end to None, selecting all text.

select_none()[source]

Set end to start.

has_selection()[source]

Return True if text is selected.

lstrip(chars=None)[source]

Move start to the right, if specified characters can be skipped.

By default whitespace is skipped, like Python’s lstrip() string method.

rstrip(chars=None)[source]

Move end to the left, if specified characters can be skipped.

By default whitespace is skipped, like Python’s rstrip() string method.

strip(chars=None)[source]

Adjust start and end, like Python’s strip() method.
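The three strip methods move the cursor boundaries rather than changing any text. The function below sketches that idea on a plain string and a pair of positions; strip_range is a hypothetical helper, and collapsing an all-whitespace selection to an empty range is an assumption.

```python
def strip_range(text, start, end=None, chars=None):
    """Return (start, end) shrunk past skippable characters on both sides."""
    if end is None:
        end = len(text)
    selected = text[start:end]
    # The number of characters lstrip()/rstrip() would drop gives the shift.
    new_start = start + len(selected) - len(selected.lstrip(chars))
    new_end = end - (len(selected) - len(selected.rstrip(chars)))
    # A selection made only of skippable characters collapses to an empty range.
    return new_start, max(new_start, new_end)


text = "say   hello   now"
start, end = strip_range(text, 3, 14)
word = text[start:end]
```

The text itself is untouched; only the selected range shrinks, here to the word between the runs of spaces.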