Paging document

#1

Hi.

Long story short, I am trying to implement paging. I have nodes to represent the page with content. I’ve read the related topic(s) on this forum and decided to take the following approach:

Upon initial load of an existing text file (which is not split into pages), I put it in a single page and then run an algorithm that picks the right places to tr.split(), which creates the following pages and spreads the rest of the content into them. So far that works fine.

Another aspect is moving content between pages upon text manipulation. I decided to move the page boundary instead of moving the content. Example doc with nodes and positions:

<doc>1
   1<page id="p1">2
       2<foo>AAA</foo>7
       7<bar>BBB</bar>12
      12<baz>CCC</baz>17
  17</page>18
  18<page id="p2">19
      19<foo>AAA</foo>24
      24<bar>BBB</bar>29
      29<baz>CCC</baz>34
  34</page>35
35</doc>

To slice the </page><page id="p2"> and insert it between </bar>|HERE|<baz> I am using

const state = view.state;
const boundarySlice = state.doc.slice(17, 19);
const tr = state.tr;
tr.delete(17, 19);
tr.replace(12, boundarySlice);
view.dispatch(tr);

Is that the correct approach?

ProseMirror pagination
#2

There’s some related discussion in this thread, but this is definitely not something that ProseMirror supports—it tries to be an editor of semantic content, and paging is very much a presentation issue that, in my opinion, should be separated from the content editing process.

1 Like
#3

The approach described in that thread is with respect to moving content, not the boundary, which ends up being the content technically if you ask me. it depends on how much pagination you need, and how lazy you can make it.

There is no great solution, even with prosemirror. The browser itself is really not built that well for structure like this. It certainly doesnt scale that well when you get to 100 pages to try typing/realtime manipulation.

1 Like
#4

I do understand that, yet one of the requirements I have is to have a visual page representation, which will be the same as the one to be printed on A4 paper, so we’ve already picked the approach of having pages in the editor schema.

Basically I see 3 alternatives to move child nodes between different parents upon “content overflow”:

  1. Move the nodes directly
  2. Join the 2 parents, then split at a new position. That approach would also create a new parent if necessary (“content overflow in the second parent, followed by a new split()”).
  3. (Rejected) Move the boundary between parents and place it to a new position.

Do you see any disadvantages in using the split/join method for each pair of pages (1&2, 2&3, …, until the content does not overflow anymore)? An obvious one is that I have to store the attributes and marks of the second page before the join() and restore them after the next split().

#5

Sorry for late reply, Feel free to spam me in places I might be reached if you have a question.

We’ve done several iterations on pagination at this point. There are some disadvantages to only splitting and joining and one of those is how technically joining and splitting changes the node the subsequent page’s immutable reference. Thus it causes some things to fire differently in the prosemirror logic. For us, I try to move fragments of text between pages, except when there is overflow or underflow, I usually use join and split then. But, if its overflow and underflow between two pages, I try to move fragments around. Its a bit more efficient, a little more complex to do. But yes, there are far less considerations as you said, if you dont have to deal with the attrs on your second page.

Maybe thats your answer.

1 Like
#6

Thank you for the guidance! Most likely we will also have few more iterations on the paging. Most (if not all) of the calculations can be made based on the line height + characters per line (Monospace font) + node styles (margins and paddings from constants). I do believe we will end up with a similar hybrid solution - moving nodes between pages and using split/join for some rare cases.

#7

I’ve made a few posts on here you can go and look up that talk about this a bit. The pagination we have is a little more complicated since its based on rules for something else.