Paging document


#1

Hi.

Long story short, I am trying to implement paging. My schema has page nodes that hold the content. I’ve read the related topic(s) on this forum and decided to take the following approach:

Upon initial load of an existing text file (which is not split into pages), I put it in a single page and then run an algorithm that picks the right places to tr.split(), which creates the subsequent pages and distributes the rest of the content across them. So far that works fine.
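
For illustration, the step of picking split points could look roughly like this. It is only a sketch in plain JavaScript with no ProseMirror dependency: it assumes every top-level block has already been measured, and pickSplitIndices, blockHeights and pageHeight are hypothetical names, not part of any library. It fills a page greedily and starts a new one when the next block would not fit.

```javascript
// Hypothetical helper: given the measured height of each top-level block
// and the usable height of one page, return the block indices at which
// a new page should start.
function pickSplitIndices(blockHeights, pageHeight) {
  const splits = [];
  let used = 0;
  blockHeights.forEach((height, index) => {
    // Start a new page when this block no longer fits, unless the current
    // page is still empty (an oversized block stays alone on its page).
    if (used > 0 && used + height > pageHeight) {
      splits.push(index);
      used = 0;
    }
    used += height;
  });
  return splits;
}

console.log(pickSplitIndices([40, 40, 40, 40, 40], 100)); // [ 2, 4 ]
```

The returned values are block indices, not document positions, so they would still have to be converted to positions before calling tr.split(). Applying the splits from the last index to the first should avoid having to remap the earlier positions.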

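Another aspect is moving content between pages upon text manipulation. I decided to move the page boundary instead of moving the content. Example doc with nodes and positions (ProseMirror positions start at 0, right before the first page):

<doc>0
   0<page id="p1">1
       1<foo>AAA</foo>6
       6<bar>BBB</bar>11
      11<baz>CCC</baz>16
  16</page>17
  17<page id="p2">18
      18<foo>AAA</foo>23
      23<bar>BBB</bar>28
      28<baz>CCC</baz>33
  33</page>34
34</doc>

To slice out the </page><page id="p2"> boundary and insert it between </bar>|HERE|<baz>, I am using:

const state = view.state;
// Open slice covering the closing token of p1 and the opening token of p2.
const boundarySlice = state.doc.slice(16, 18);
const tr = state.tr;
// Removing the boundary joins the two pages into one.
tr.delete(16, 18);
// Re-insert the boundary between </bar> and <baz>, splitting the page there.
// tr.replace takes (from, to, slice), so the slice is the third argument.
tr.replace(11, 11, boundarySlice);
view.dispatch(tr);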

Is that the correct approach?
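
As a sanity check on the position numbers, here is a small sketch that derives the boundary range for a document shaped like the example. It is plain JavaScript on a hand-built tree, not the ProseMirror API: nodeSize and pageBoundary are hypothetical helpers that only mimic ProseMirror's counting rules (position 0 sits right before the first page, each character of text takes one position, and every non-leaf node adds one opening and one closing token).

```javascript
// Size of a node under ProseMirror-style counting: text takes one position
// per character; any other node takes its content plus two tokens.
function nodeSize(node) {
  if (typeof node.text === "string") return node.text.length;
  return 2 + node.content.reduce((sum, child) => sum + nodeSize(child), 0);
}

// Range covering the closing token of page `index` and the opening token of
// the page after it. Assumes pages are direct children of the document and
// that a following page exists.
function pageBoundary(doc, index) {
  let pos = 0; // position 0 sits right before the first page
  for (let i = 0; i < index; i++) pos += nodeSize(doc.content[i]);
  const from = pos + nodeSize(doc.content[index]) - 1; // end of the page's content
  return { from, to: from + 2 };
}

// Hand-built stand-in for the example document.
const block = (type, text) => ({ type, content: [{ text }] });
const page = (id) => ({
  type: "page",
  attrs: { id },
  content: [block("foo", "AAA"), block("bar", "BBB"), block("baz", "CCC")],
});
const doc = { type: "doc", content: [page("p1"), page("p2")] };

console.log(nodeSize(doc.content[0])); // 17
console.log(pageBoundary(doc, 0)); // { from: 16, to: 18 }
```

With these rules the first page spans positions 0 to 17, so the boundary between the two pages is the range 16 to 18.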


ProseMirror pagination
#2

There’s some related discussion in this thread, but this is definitely not something that ProseMirror supports—it tries to be an editor of semantic content, and paging is very much a presentation issue that, in my opinion, should be separated from the content editing process.


#3

The approach described in that thread is about moving content rather than the boundary, although technically the boundary ends up being content too, if you ask me. It depends on how much pagination you need, and how lazy you can make it.

There is no great solution, even with ProseMirror. The browser itself is really not built that well for structure like this. It certainly doesn't scale well once you get to 100 pages and try typing or other real-time manipulation.


#4

I do understand that, yet one of my requirements is a visual page representation that matches what will be printed on A4 paper, so we’ve already picked the approach of having pages in the editor schema.

Basically I see 3 alternatives to move child nodes between different parents upon “content overflow”:

  1. Move the nodes directly
  2. Join the 2 parents, then split at a new position. That approach would also create a new parent if necessary (“content overflow in the second parent, followed by a new split()”).
  3. (Rejected) Move the boundary between parents and place it at a new position.

Do you see any disadvantages in using the split/join method for each pair of pages (1&2, 2&3, …, until the content does not overflow anymore)? An obvious one is that I have to store the attributes and marks of the second page before the join() and restore them after the next split().
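
To make the pairwise join/split idea concrete, here is a rough sketch on a plain data model rather than on a ProseMirror document. Everything in it is an assumption for illustration: pages are objects with attrs and an array of measured blocks, and reflowFrom, overflowIndex and makeAttrs are hypothetical names. It joins an overflowing page with the next one, splits again at the overflow point, keeps the second page's attrs, and creates a new page when there is no next one.

```javascript
// Index of the first block that no longer fits on the page.
// At least one block always stays, so an oversized block does not loop forever.
function overflowIndex(blocks, pageHeight) {
  let used = 0;
  for (let i = 0; i < blocks.length; i++) {
    if (i > 0 && used + blocks[i].height > pageHeight) return i;
    used += blocks[i].height;
  }
  return blocks.length;
}

// Rebalance pages pair by pair, starting at the page that was edited.
// Assumes the pages after `startIndex` fit before the edit, so the cascade
// can stop at the first page that does not overflow.
function reflowFrom(pages, startIndex, pageHeight, makeAttrs) {
  const result = pages.map((p) => ({ attrs: p.attrs, blocks: [...p.blocks] }));
  for (let i = startIndex; i < result.length; i++) {
    const current = result[i];
    const cut = overflowIndex(current.blocks, pageHeight);
    if (cut === current.blocks.length) break; // nothing overflows any more
    const next = result[i + 1];
    // Remember the second page's attrs before the join so the split restores them.
    const nextAttrs = next ? next.attrs : makeAttrs(result.length);
    // "Join" the pair, then "split" again at the overflow point.
    const joined = current.blocks.concat(next ? next.blocks : []);
    current.blocks = joined.slice(0, cut);
    const rest = { attrs: nextAttrs, blocks: joined.slice(cut) };
    if (next) result[i + 1] = rest;
    else result.push(rest);
  }
  return result;
}

const b = (id) => ({ id, height: 40 });
const pages = [
  { attrs: { id: "p1" }, blocks: [b("a"), b("b"), b("c")] },
  { attrs: { id: "p2" }, blocks: [b("d"), b("e")] },
];
const out = reflowFrom(pages, 0, 100, (n) => ({ id: `p${n + 1}` }));
console.log(out.map((p) => `${p.attrs.id}:${p.blocks.map((x) => x.id).join("")}`));
// [ 'p1:ab', 'p2:cd', 'p3:e' ]
```

In ProseMirror terms the two steps would presumably map onto tr.join() and tr.split(), whose typesAfter argument can carry the node type and attrs for the page after the split. Marks on the page node are not covered by this sketch and would need separate handling.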