Paging document

nateriver · March 14, 2019, 1:16pm

Hi.

Long story short, I am trying to implement paging. I have nodes to represent the page with content. I’ve read the related topic(s) on this forum and decided to take the following approach:

Upon initial load of an existing text file (which is not split into pages), I put it in a single page and then run an algorithm that picks the right places to tr.split(), which creates the following pages and spreads the rest of the content into them. So far that works fine.

Another aspect is moving content between pages upon text manipulation. I decided to move the page boundary instead of moving the content. Example doc with nodes and positions:

<doc>1
   1<page id="p1">2
       2<foo>AAA</foo>7
       7<bar>BBB</bar>12
      12<baz>CCC</baz>17
  17</page>18
  18<page id="p2">19
      19<foo>AAA</foo>24
      24<bar>BBB</bar>29
      29<baz>CCC</baz>34
  34</page>35
35</doc>

To slice the </page><page id="p2"> and insert it between </bar>|HERE|<baz> I am using

const state = view.state;
const boundarySlice = state.doc.slice(17, 19);
const tr = state.tr;
tr.delete(17, 19);
tr.replace(12, boundarySlice);
view.dispatch(tr);

Is that the correct approach?

marijn · March 14, 2019, 1:35pm

There’s some related discussion in this thread, but this is definitely not something that ProseMirror supports—it tries to be an editor of semantic content, and paging is very much a presentation issue that, in my opinion, should be separated from the content editing process.

JCHollett · March 14, 2019, 2:05pm

The approach described in that thread is with respect to moving content, not the boundary, which ends up being the content technically if you ask me. it depends on how much pagination you need, and how lazy you can make it.

There is no great solution, even with prosemirror. The browser itself is really not built that well for structure like this. It certainly doesnt scale that well when you get to 100 pages to try typing/realtime manipulation.

nateriver · March 14, 2019, 2:14pm

I do understand that, yet one of the requirements I have is to have a visual page representation, which will be the same as the one to be printed on A4 paper, so we’ve already picked the approach of having pages in the editor schema.

Basically I see 3 alternatives to move child nodes between different parents upon “content overflow”:

Move the nodes directly
Join the 2 parents, then split at a new position. That approach would also create a new parent if necessary (“content overflow in the second parent, followed by a new split()”).
(Rejected) Move the boundary between parents and place it to a new position.

Do you see any disadvantages in using the split/join method for each pair of pages (1&2, 2&3, …, until the content does not overflow anymore)? An obvious one is that I have to store the attributes and marks of the second page before the join() and restore them after the next split().

JCHollett · April 11, 2019, 8:32pm

Sorry for late reply, Feel free to spam me in places I might be reached if you have a question.

We’ve done several iterations on pagination at this point. There are some disadvantages to only splitting and joining and one of those is how technically joining and splitting changes the node the subsequent page’s immutable reference. Thus it causes some things to fire differently in the prosemirror logic. For us, I try to move fragments of text between pages, except when there is overflow or underflow, I usually use join and split then. But, if its overflow and underflow between two pages, I try to move fragments around. Its a bit more efficient, a little more complex to do. But yes, there are far less considerations as you said, if you dont have to deal with the attrs on your second page.

Maybe thats your answer.

nateriver · April 15, 2019, 7:16am

Thank you for the guidance! Most likely we will also have few more iterations on the paging. Most (if not all) of the calculations can be made based on the line height + characters per line (Monospace font) + node styles (margins and paddings from constants). I do believe we will end up with a similar hybrid solution - moving nodes between pages and using split/join for some rare cases.

JCHollett · April 15, 2019, 10:16am

I’ve made a few posts on here you can go and look up that talk about this a bit. The pagination we have is a little more complicated since its based on rules for something else.

tstoev · August 22, 2019, 3:33pm

Hello guys, im aming for a pagination to.Can u guide me a little bit here, if i understood corect you slice the nodes that are in the overflow and move them in the next page node ? But how u know which nodes are are not visible ? One more question how this behave with large docs, over 50 pages for example?What do u think for the approach of Word online :

String · November 6, 2022, 4:19am

Hi, can you share your pagination implementation code?

String · February 14, 2023, 1:13am

https://emr.cymcar.com/

jair · February 15, 2023, 4:17pm

This looks awesome. Is there any repo where we can see how you implemented this? Also, how can I create more than 1 page in your app? https://emr.cymcar.com/. I’ve been adding a bunch of text, but it doesn’t create more than one page.

String · February 16, 2023, 8:53am

Hello, this is a demo video

jair · February 16, 2023, 4:58pm

That’s pretty cool! We were able to achieve something very similar to this. However, our approach is failing miserably when we have table elements in the document. So, two questions :

Does your approach work with tables?
At high level, are you using prosemirror decorations to handle page breaks?

Thanks a bunch!

String · February 6, 2024, 8:19am

This is my implementation https://github.com/Cassielxd/CassieEditor

etclub · February 6, 2024, 8:24am

It is 404.

String · February 6, 2024, 8:34am

It has been resolved

String · February 6, 2024, 8:43am

try again

etclub · February 6, 2024, 12:00pm

OK. Good job.

String · February 23, 2024, 3:10am