Hi, I’m new to ProseMirror, and I have to deal with a big document (~5 megabytes of UTF-8 text) with it. The structure of the document is rather simple: just a flat list of tens of thousands of paragraphs/lines. When it is loaded into a ProseMirror editor, editing becomes slow.
I found that one hot spot is iterDeco(…) in prosemirror-view. In iterDeco() the code seems to iterate over the whole list of children to find matching ones. I was wondering: would it be possible to optimize this with the IntersectionObserver API, so that the iteration range is limited to the leaf nodes visible in the viewport plus the selection?
Thanks for making this awesome editor!
Viewport-based drawing is out of scope for ProseMirror. It’s just too much extra complexity and too many failure modes. (It might be possible to rig something up with an external plugin, but it’s not going to be easy.)
That being said, the expected bottleneck for huge documents is the DOM.
iterDeco being the slow part isn’t expected. Can you tell me a bit more about your document shape and the kind (and quantity) of decorations you have? Maybe even set up a simplified demo of the issue?
Hi Marijn. The document I’m dealing with is an array of paragraphs. One document has about 20k–30k paragraphs, and each paragraph contains about 100–200 words. No plugins or decorations are being used.
Though the DOM is slow, the overall latency seems acceptable when the first several paragraphs are being edited, but for the last paragraphs of the document the latency is very high.
So I tried to split the doc into sections, each section with a fixed number of paragraphs. After this was done, the overall latency became smaller, but a rough profiling run shows that iterDeco becomes the hotspot.
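For what it’s worth, here is a back-of-the-envelope sketch (plain JS, not ProseMirror code — the numbers and function names are mine) of why grouping the paragraphs into sections helped: a top-down position lookup only scans the children of each node on the path, so a two-level tree visits far fewer siblings than one flat list:

```javascript
// Rough count of child slots a top-down lookup scans to reach the
// paragraph at index `target`, for a flat document vs. one split into
// fixed-size sections. (Illustrative model, not ProseMirror's code.)
function flatScan(target) {
  return target + 1 // must walk past every earlier sibling
}

function sectionedScan(target, sectionSize) {
  const section = Math.floor(target / sectionSize)
  // walk to the right section, then to the right paragraph inside it
  return (section + 1) + (target % sectionSize) + 1
}

// Editing the last of 30000 paragraphs:
flatScan(29999)            // 30000 slots scanned
sectionedScan(29999, 200)  // 350 slots scanned
```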
I looked around in the code of prosemirror-view. It seems that iterDeco iterates over all nodes of my document (a flat array of paragraph nodes) to locate the one that should be updated, or the position new content should be inserted at.
In order to alleviate this iteration, I was wondering: would it be possible to optimize by changing the data structure to some kind of balanced tree (a red-black tree, maybe)? Or would it be useful to cache the editing locations?
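To make the idea concrete, here is a minimal sketch (plain JS; the function names are mine, and this is not prosemirror-model’s actual implementation) of the lookup in question: finding the child that contains a given position, by linear scan over child sizes versus binary search over precomputed prefix sums. A balanced tree would additionally keep the sums cheap to update after an edit, which a plain prefix array does not:

```javascript
// Linear scan: walk children, accumulating sizes, until pos falls
// inside the current child. O(n) per lookup.
function findChildLinear(sizes, pos) {
  let offset = 0
  for (let i = 0; i < sizes.length; i++) {
    const end = offset + sizes[i]
    if (pos < end) return {index: i, offset}
    offset = end
  }
  return null
}

// Precompute prefix sums of child sizes: prefix[i] is the start
// offset of child i.
function buildPrefix(sizes) {
  const prefix = [0]
  for (const s of sizes) prefix.push(prefix[prefix.length - 1] + s)
  return prefix
}

// Binary search for the child whose range contains pos. O(log n)
// per lookup once the prefix array exists.
function findChildBinary(prefix, pos) {
  let lo = 0, hi = prefix.length - 2
  while (lo < hi) {
    const mid = (lo + hi) >> 1
    if (pos < prefix[mid + 1]) hi = mid
    else lo = mid + 1
  }
  return {index: lo, offset: prefix[lo]}
}
```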
I’ve designed the interface to fragments so that it would be possible to use a tree data structure for large nodes, though right now they all still use arrays, since I haven’t run into a situation where the array logic is the bottleneck.
This patch fixes a quadratic bit of complexity in node updating, which might be what you were running into (iterDeco wasn’t really the source, but rather updateNextNode, called via the closure passed to iterDeco). Huge flat documents still aren’t fast, but it’s somewhat better, and the remaining slowness seems to be mostly on the browser’s side.
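In case it helps others, the quadratic pattern looks roughly like this (an illustrative sketch with made-up names, not the actual prosemirror-view code): matching each new child by searching the old children from the start costs O(n²) comparisons overall, while resuming from a persistent cursor costs O(n):

```javascript
// Quadratic: for every new node, search the old list from index 0.
function matchRescan(oldNodes, newNodes) {
  let comparisons = 0
  const matches = newNodes.map(n => {
    for (let i = 0; i < oldNodes.length; i++) {
      comparisons++
      if (oldNodes[i] === n) return i
    }
    return -1
  })
  return {matches, comparisons}
}

// Linear: remember where the last match was found and resume there,
// since document children stay in order across an update.
function matchCursor(oldNodes, newNodes) {
  let comparisons = 0, cursor = 0
  const matches = newNodes.map(n => {
    for (let i = cursor; i < oldNodes.length; i++) {
      comparisons++
      if (oldNodes[i] === n) { cursor = i + 1; return i }
    }
    return -1
  })
  return {matches, comparisons}
}
```

With an unchanged list of 100 distinct nodes, the rescan version does 5050 comparisons while the cursor version does 100, and both produce the same matches; the gap grows quadratically with document size.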