Hi, I want to diff two rendered versions of Tiptap JSON documents, with a green highlight for inserted content and a red one for deleted content, like the snapshot-compare extension in Tiptap Pro. This has to work with custom nodes as well. Here are the approaches I've considered:
Traverse both documents - The obvious way is to use a diffing library such as diff-match-patch or json-diff to obtain the text differences. But structural differences, like whole-node insertions or deletions, are tricky to handle, especially in documents with custom nodes. To highlight the changes we also need to wrap insertions and deletions in marks, and that wrapping has to be accurate and must not break the JSON structure (producing invalid Tiptap JSON).
Comparing rendered HTML - This is somewhat unreliable, but it works in some cases. The approach is to render both versions to HTML and diff them with an npm package. The issues are dynamic content that isn't rendered, differences between screen sizes, etc.
Wrap insertions/deletions with marks - This approach intercepts transactions and wraps the changes in marks (green for insertions, red for deletions). Deletions can be captured with invertStep(), but the tricky part is cursor misalignment and ghost nodes. Let's assume we can calculate cursor positions from the deleted node's size and some offsets. The main problem is that during editing I don't want to show deleted content with a red background, since that would be a bad user experience, so let's hide marks carrying the delete tag with display: none. But in some cases, like listItems, the content is actually still there (a mark is essentially just styling). So even though nothing is visible in the editor because of the CSS, the editor sees the content as present in the JSON and won't let me delete the bullet. I end up with an empty bullet, and also some stray text nodes.
@marijn and any other experts here, how should I approach this problem? Another idea is to collect the mappings and content through prosemirror-changeset and map them at the end, but the offsets may differ because the document content has changed. Please share your solutions.
We create a custom text representation from the ProseMirror document, use git to find the differences, and then visualize the result as another ProseMirror document. Works quite well.
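The exact text format isn't described above, so here is only a minimal sketch of what such a representation could look like: one line per node, indented by depth, so that a line-based diff tool such as git reports structural changes as added or removed lines. The `PMNode` type and the `toLines` helper are hypothetical names, and marks are ignored for brevity.

```typescript
// Minimal shape of a Tiptap/ProseMirror JSON node (assumed, not the full spec).
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}

// Hypothetical helper: flattens a document into one line per node.
// Text nodes become JSON-quoted strings so that embedded newlines stay on one line.
function toLines(node: PMNode, depth = 0, out: string[] = []): string[] {
  const indent = "  ".repeat(depth);
  if (node.type === "text") {
    out.push(indent + JSON.stringify(node.text ?? ""));
    return out;
  }
  // Attributes are serialized next to the node name so attribute changes show up in the diff.
  const attrs = node.attrs ? " " + JSON.stringify(node.attrs) : "";
  out.push(`${indent}${node.type}${attrs}`);
  for (const child of node.content ?? []) {
    toLines(child, depth + 1, out);
  }
  return out;
}
```

Writing `toLines(doc).join("\n")` for each version to a file gives two inputs that any line-based diff can compare. Mapping the diff output back into a document is a separate step not covered here.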
Thanks for the response. It is possible through normalization and diffing. I've even tried normalizing my document into Markdown, creating HTML for the two versions, and diffing them with an npm package, and it went well. But the goal is to keep the structure as it is. For simple documents these approaches are fine, but I have custom nodes, and restructuring plain text back into custom nodes doesn't seem practical!
Thanks though!
Hey @madhan! Did you have any luck finding a solution for this? It would help me immensely, since I'm stuck in the same situation and would like to know if you found a solution.
Yeah, if your schema includes basic nodes like heading, paragraph, table, etc., you can render the two documents you want to compare to HTML and then diff them (preferably with node-htmldiff), which wraps the changes in ins and del tags that you can use to highlight them visually.
Hey there, I tried two approaches. The first was position-based, which doesn't work because positions keep drifting. So I chose a hybrid approach: for paragraphs and headings I use normal inline diffing, and for complex nodes I use node-based diffing. For that I align blocks using AST + LCS and apply inline diffs only to safe text blocks (`paragraph`, `heading`). All other nodes render as a node-level replace. But I'm now looking for a new approach, because users aren't satisfied with this: for complex changes we show the entire old node block in red along with the entire new node block in green, which is too much red in some cases. For changes inside a callout, card, accordion, or table, our hybrid approach doesn't work well. If anyone knows a better approach, please share it. I've searched a lot but couldn't find one.
I actually kind of accomplished what I needed, using a similar AST-like approach. I traverse recursively into the content and use diff-match-patch for the marks. Non-text nodes I flag (adding an attr wherever I need one) so that my renderer can make use of it. That way, even for minor changes in components like table, callout, and so on, I can show the diff only on the actual content, instead of marking the whole old node red and the whole new node green. Hope this gives you the overall picture!
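To make the idea more concrete, here is a rough sketch of such a recursive walk. It is not the actual implementation described above. Assumptions: the mark names `insertion` and `deletion` and the `diff` attribute are hypothetical and would have to exist in your schema, children are paired by index, a simple common prefix/suffix comparison stands in for diff-match-patch, and existing marks such as bold are dropped for brevity.

```typescript
interface Mark { type: string }
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
  marks?: Mark[];
}

const textNode = (value: string, mark?: string): PMNode =>
  mark ? { type: "text", text: value, marks: [{ type: mark }] } : { type: "text", text: value };

// Stand-in for a real text diff: keeps the common prefix and suffix and
// treats everything in between as one deletion followed by one insertion.
function diffText(before: string, after: string): PMNode[] {
  let start = 0;
  while (start < before.length && start < after.length && before[start] === after[start]) start++;
  let end = 0;
  while (
    end < before.length - start &&
    end < after.length - start &&
    before[before.length - 1 - end] === after[after.length - 1 - end]
  ) end++;
  const parts: PMNode[] = [];
  if (start > 0) parts.push(textNode(before.slice(0, start)));
  if (before.length - end > start) parts.push(textNode(before.slice(start, before.length - end), "deletion"));
  if (after.length - end > start) parts.push(textNode(after.slice(start, after.length - end), "insertion"));
  if (end > 0) parts.push(textNode(before.slice(before.length - end)));
  return parts;
}

const isTextBlock = (n: PMNode): boolean => (n.content ?? []).every((c) => c.type === "text");
const plainText = (n: PMNode): string => (n.content ?? []).map((c) => c.text ?? "").join("");
// Non-text nodes are flagged through an attribute that the renderer can style.
const flag = (n: PMNode, diff: string): PMNode => ({ ...n, attrs: { ...n.attrs, diff } });

function diffNode(before?: PMNode, after?: PMNode): PMNode[] {
  if (!before) return after ? [flag(after, "inserted")] : [];
  if (!after) return [flag(before, "deleted")];
  if (before.type !== after.type) return [flag(before, "deleted"), flag(after, "inserted")];
  if (isTextBlock(before) && isTextBlock(after)) {
    // Same kind of text block on both sides: diff the text, keep the node itself.
    return [{ ...after, content: diffText(plainText(before), plainText(after)) }];
  }
  // Container node (callout, table, ...): recurse so only the changed children are marked.
  const count = Math.max(before.content?.length ?? 0, after.content?.length ?? 0);
  const content: PMNode[] = [];
  for (let i = 0; i < count; i++) {
    content.push(...diffNode(before.content?.[i], after.content?.[i]));
  }
  return [{ ...after, content }];
}
```

Because the walk recurses into containers, a one-word change inside a callout marks only that word, and the callout itself stays unflagged. Pairing children by index breaks down as soon as a block is inserted in the middle, so a real implementation would align the children first.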
Yeah, thanks man, we got it working with a per-node mapping strategy too.
Same idea overall, but we added LCS pairing so nodes align correctly first.
Then we run inline diffs inside matched blocks (text + callout/card/accordion).
Non-text nodes fall back in a structured way, not as full red/green blocks.
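For readers unfamiliar with the LCS pairing step, here is a minimal sketch of how two block lists could be aligned before running inline diffs. The `alignByLcs` function and the `key` callback are hypothetical. What to match on (node type, a stable id attribute, a content hash) is an assumption left to the caller.

```typescript
// A pair has both sides when the blocks matched, only `before` for a
// removed block, and only `after` for an added block.
type Pair<T> = { before?: T; after?: T };

function alignByLcs<T>(before: T[], after: T[], key: (item: T) => string): Pair<T>[] {
  const rows = before.length;
  const cols = after.length;
  // lcs[i][j] = length of the longest common subsequence of before[i..] and after[j..]
  const lcs: number[][] = Array.from({ length: rows + 1 }, () => new Array<number>(cols + 1).fill(0));
  for (let i = rows - 1; i >= 0; i--) {
    for (let j = cols - 1; j >= 0; j--) {
      lcs[i][j] = key(before[i]) === key(after[j])
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  // Walk the table from the start, emitting matches, removals, and additions in order.
  const pairs: Pair<T>[] = [];
  let i = 0;
  let j = 0;
  while (i < rows && j < cols) {
    if (key(before[i]) === key(after[j])) {
      pairs.push({ before: before[i++], after: after[j++] });
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      pairs.push({ before: before[i++] });
    } else {
      pairs.push({ after: after[j++] });
    }
  }
  while (i < rows) pairs.push({ before: before[i++] });
  while (j < cols) pairs.push({ after: after[j++] });
  return pairs;
}
```

Matched pairs can then be handed to an inline diff, while one-sided pairs are rendered as a pure insertion or deletion. The table costs O(n·m) time and memory, which is usually fine for the top-level blocks of a document.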
Hey madhan, my approach is working well. We've implemented some custom blocks (callout, card, accordion, and columns, like the Notion editor), and my diffing approach works fine for them. The real problem is the title: I can't show the inline diff for it, and that's my main concern. Previously I used an input for the title, which the ProseMirror schema doesn't understand, so I changed it to normal DOM, but I still can't show the difference. If you don't mind, I'd appreciate your help.
If this is going to be a view-only editor, e.g. to preview changes, you can use a text-diffing tool like diff-match-patch to compute the diff, wrap the changes in ins/del HTML tags, and add the corresponding CSS.
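As a small sketch of the wrapping step: assuming the diff is already available as a list of [operation, text] pairs, with -1 for a deletion, 0 for unchanged text, and 1 for an insertion (the shape I understand diff-match-patch to produce), the HTML could be built like this. `renderDiff` and `escapeHtml` are hypothetical helper names.

```typescript
// -1 = deleted, 0 = unchanged, 1 = inserted (assumed operation codes).
type DiffOp = -1 | 0 | 1;
type Diff = readonly [DiffOp, string];

// Escape the text so that content containing "<" or "&" cannot break the markup.
const escapeHtml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

function renderDiff(diffs: readonly Diff[]): string {
  return diffs
    .map((d) => {
      const safe = escapeHtml(d[1]);
      if (d[0] === 1) return `<ins>${safe}</ins>`;
      if (d[0] === -1) return `<del>${safe}</del>`;
      return safe;
    })
    .join("");
}
```

With CSS along the lines of `ins { background: #cfc; }` and `del { background: #fcc; }`, the result gives the green/red highlighting. This only covers plain text, so it fits the view-only case and does not preserve node structure.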