Changeset: Minimize changeset docs, add links to code

This reduces the amount of duplicate documentation, and helps keep the
documentation and code in sync.
pull/5242/head
Richard Hansen 2021-10-24 19:42:59 -04:00
parent de3dfb5ce2
commit 4a8c72a38c
1 changed files with 27 additions and 139 deletions

View File

@ -1,156 +1,44 @@
# Changeset Library # Changeset Library
``` The [changeset
"Z:z>1|2=m=b*0|1+1$\n" library](https://github.com/ether/etherpad-lite/blob/develop/src/static/js/Changeset.js)
provides tools to create, read, and apply changesets.
## Changeset
```javascript
const Changeset = require('ep_etherpad-lite/static/js/Changeset');
``` ```
This is a Changeset. It's just a string and it's very difficult to read in this form. But the Changeset Library gives us some tools to read it. A changeset describes the difference between two revisions of a document. When a
user edits a pad, the browser generates and sends a changeset to the server,
which relays it to the other users and saves a copy (so that every past revision
is accessible).
A changeset describes the diff between two revisions of the document. The Browser sends changesets to the server and the server sends them to the clients to update them. These Changesets also get saved into the history of a pad. This allows us to go back to every revision from the past. A transmitted changeset looks like this:
## Changeset.unpack(changeset)
* `changeset` {String}
This function returns an object representation of the changeset, similar to this:
``` ```
{ oldLen: 35, newLen: 36, ops: '|2=m=b*0|1+1', charBank: '\n' } 'Z:z>1|2=m=b*0|1+1$\n'
``` ```
* `oldLen` {Number} the original length of the document. ## Attribute Pool
* `newLen` {Number} the length of the document after the changeset is applied.
* `ops` {String} the actual changes, introduced by this changeset.
* `charBank` {String} All characters that are added by this changeset.
## Changeset.opIterator(ops) ```javascript
const AttributePool = require('ep_etherpad-lite/static/js/AttributePool');
* `ops` {String} The operators, returned by `Changeset.unpack()`
Returns an operator iterator. This iterator allows us to iterate over all operators that are in the changeset.
You can iterate with an opIterator using its `next()` and `hasNext()` methods. Next returns the `next()` operator object and `hasNext()` indicates, whether there are any operators left.
## The Operator object
There are 3 types of operators: `+`,`-` and `=`. These operators describe different changes to the document, beginning with the first character of the document. A `=` operator doesn't change the text, but it may add or remove text attributes. A `-` operator removes text. And a `+` Operator adds text and optionally adds some attributes to it.
* `opcode` {String} the operator type
* `chars` {Number} the length of the text changed by this operator.
* `lines` {Number} the number of lines changed by this operator.
* `attribs` {attribs} attributes set on this text.
### Example
```
{ opcode: '+',
chars: 1,
lines: 1,
attribs: '*0' }
``` ```
## APool Changesets do not include any attribute keyvalue pairs. Instead, they use
numeric identifiers that reference attributes kept in an [attribute
pool](https://github.com/ether/etherpad-lite/blob/develop/src/static/js/AttributePool.js).
This attribute interning reduces the transmission overhead of attributes that
are used many times.
``` There is one attribute pool per pad, and it includes every current and
> var AttributePoolFactory = require("./utils/AttributePoolFactory"); historical attribute used in the pad.
> var apool = AttributePoolFactory.createAttributePool();
> console.log(apool)
{ numToAttrib: {},
attribToNum: {},
nextNum: 0,
putAttrib: [Function],
getAttrib: [Function],
getAttribKey: [Function],
getAttribValue: [Function],
eachAttrib: [Function],
toJsonable: [Function],
fromJsonable: [Function] }
```
This creates an empty apool. An apool saves which attributes were used during the history of a pad. There is one apool for each pad. It only saves the attributes that were really used, it doesn't save unused attributes. Let's fill this apool with some values ## Further Reading
```
> apool.fromJsonable({"numToAttrib":{"0":["author","a.kVnWeomPADAT2pn9"],"1":["bold","true"],"2":["italic","true"]},"nextNum":3});
> console.log(apool)
{ numToAttrib:
{ '0': [ 'author', 'a.kVnWeomPADAT2pn9' ],
'1': [ 'bold', 'true' ],
'2': [ 'italic', 'true' ] },
attribToNum:
{ 'author,a.kVnWeomPADAT2pn9': 0,
'bold,true': 1,
'italic,true': 2 },
nextNum: 3,
putAttrib: [Function],
getAttrib: [Function],
getAttribKey: [Function],
getAttribValue: [Function],
eachAttrib: [Function],
toJsonable: [Function],
fromJsonable: [Function] }
```
We used the fromJsonable function to fill the empty apool with values. the fromJsonable and toJsonable functions are used to serialize and deserialize an apool. You can see that it stores the relation between numbers and attributes. So for example the attribute 1 is the attribute bold and vise versa. An attribute is always a key value pair. For stuff like bold and italic it's just 'italic':'true'. For authors it's author:$AUTHORID. So a character can be bold and italic. But it can't belong to multiple authors
```
> apool.getAttrib(1)
[ 'bold', 'true' ]
```
Simple example of how to get the key value pair for the attribute 1
## AText
```
> var atext = {"text":"bold text\nitalic text\nnormal text\n\n","attribs":"*0*1+9*0|1+1*0*1*2+b|1+1*0+b|2+2"};
> console.log(atext)
{ text: 'bold text\nitalic text\nnormal text\n\n',
attribs: '*0*1+9*0|1+1*0*1*2+b|1+1*0+b|2+2' }
```
This is an atext. An atext has two parts: text and attribs. The text is just the text of the pad as a string. We will look closer at the attribs at the next steps
```
> var opiterator = Changeset.opIterator(atext.attribs)
> console.log(opiterator)
{ next: [Function: next],
hasNext: [Function: hasNext],
lastIndex: [Function: lastIndex] }
> opiterator.next()
{ opcode: '+',
chars: 9,
lines: 0,
attribs: '*0*1' }
> opiterator.next()
{ opcode: '+',
chars: 1,
lines: 1,
attribs: '*0' }
> opiterator.next()
{ opcode: '+',
chars: 11,
lines: 0,
attribs: '*0*1*2' }
> opiterator.next()
{ opcode: '+',
chars: 1,
lines: 1,
attribs: '' }
> opiterator.next()
{ opcode: '+',
chars: 11,
lines: 0,
attribs: '*0' }
> opiterator.next()
{ opcode: '+',
chars: 2,
lines: 2,
attribs: '' }
```
The attribs are again a bunch of operators like .ops in the changeset was. But these operators are only + operators. They describe which part of the text has which attributes
## Resources / further reading
Detailed information about the changesets & Easysync protocol: Detailed information about the changesets & Easysync protocol:
* Easysync Protocol - [/doc/easysync/easysync-notes.pdf](https://github.com/ether/etherpad-lite/blob/develop/doc/easysync/easysync-notes.pdf) * [Easysync Protocol](https://github.com/ether/etherpad-lite/blob/develop/doc/easysync/easysync-notes.pdf)
* Etherpad and EasySync Technical Manual - [/doc/easysync/easysync-full-description.pdf](https://github.com/ether/etherpad-lite/blob/develop/doc/easysync/easysync-full-description.pdf) * [Etherpad and EasySync Technical Manual](https://github.com/ether/etherpad-lite/blob/develop/doc/easysync/easysync-full-description.pdf)