How to preserve hard breaks when pasting HTML into a plain-text schema?

I’m using ProseMirror with a very basic schema:

  • A document node that only accepts inline elements (no paragraphs)
  • Hard breaks (i.e., <br>), with Enter remapped to insert a line break
  • And text nodes, of course

The idea here is to have a plain-text editor augmented with extensions such as mentions (which are also inline elements).

However, I’m having an issue when I paste HTML content (for instance, copying code from VS Code and pasting it into a ProseMirror editor). The content is pasted successfully, but line breaks are ignored and the text ends up on a single line.

What’s the best way to work around this? What’s the best event handler for this? Would love some pointers in the right direction, as I’m a bit clueless.

Thank you.

I’m guessing the pasted HTML contains either newlines or block nodes wrapping the text, instead of <br> nodes? You could try using transformPastedHTML to pre-process it into something your schema’s DOM parser can handle.

That was my first approach, but my problem is figuring out where to insert <br> while processing the HTML. For instance, take this basic implementation:

transformPastedHTML(html) {
    const doc = new DOMParser().parseFromString(html, 'text/html')
    return doc.body.textContent || ''
},

And this input from VS Code:

class ExtendedDispatcher extends Dispatcher<Action> {
    dispatch(payload: Action) {
    }
}

And I get this output:

class ExtendedDispatcher extends Dispatcher<Action> {    dispatch(payload: Action) {    }}

The newlines were lost, and it’s hard to figure out where to insert the <br> elements.

I think I’ve made some progress:

transformPastedHTML(html) {
    const document = new DOMParser().parseFromString(html, 'text/html')

    const textLines: string[] = []

    document.body.firstElementChild?.childNodes.forEach((childNode) => {
        textLines.push(childNode.textContent || '')
    })

    return textLines.join('<br>')
},

It’s still not finished, because it strips spaces (think indented code), which I don’t want. I’ll need to look into this further tomorrow, but I’m open to ideas and improvements.
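A side note on the snippet above: textContent returns decoded text, and whatever transformPastedHTML returns is parsed as HTML again. A line such as `Dispatcher<Action>` would therefore likely be read as containing an unknown `<Action>` tag. Below is a minimal sketch of escaping each line before joining with <br>. The helper names escapeHtml and linesToHtml are hypothetical, not part of ProseMirror, and this does not address the whitespace problem.

```typescript
// Hypothetical helper: escape the characters that would otherwise be
// interpreted as markup when the returned string is parsed as HTML.
function escapeHtml(text: string): string {
    return text
        .replace(/&/g, '&amp;') // must run first so later entities are not double-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
}

// Hypothetical helper: turn an array of plain-text lines into an HTML
// string in which lines are separated by <br> elements.
function linesToHtml(lines: string[]): string {
    return lines.map(escapeHtml).join('<br>')
}
```

In the snippet above, this would mean returning linesToHtml(textLines) instead of textLines.join('<br>').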

@marijn With the solution above, I wasn’t able to preserve spaces in the pasted text for some reason (even though I initialized the editor with preserveWhitespace: true), so I had to find a different solution.

Here’s what I came up with:

// Requires: import { Fragment, Node, Slice } from 'prosemirror-model'
handlePaste(view, event) {
    const { state } = view
    const { tr } = state

    if (!state.schema.nodes.hardBreak) {
        return false
    }

    const clipboardText = event.clipboardData?.getData('text/plain').trim()

    if (!clipboardText) {
        return false
    }

    const textLines = clipboardText.split(/(?:\r\n|\r|\n)/g)

    const nodes = textLines.reduce<Node[]>((nodes, line, index) => {
        if (line.length > 0) {
            nodes.push(state.schema.text(line))
        }

        if (index < textLines.length - 1) {
            nodes.push(state.schema.nodes.hardBreak.create())
        }

        return nodes
    }, [])

    view.dispatch(
        tr.replaceSelection(Slice.maxOpen(Fragment.fromArray(nodes))).scrollIntoView(),
    )

    return true
}

Advantages of this implementation:

  • It works for both pasted text and pasted HTML (the text/plain data from the clipboard contains no HTML and keeps spaces intact).
  • I previously had clipboardTextParser overridden to do the same thing for plain text, and now I have one solution that covers both cases.

I’m open to any suggestions you might have to improve this code further.
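One possible refinement, offered as a sketch rather than a definitive fix: the line-splitting step can be pulled out into a pure function that is easy to test without an editor. The LineToken type and the toLineTokens name are hypothetical. Unlike the handler above, this version assumes you want to keep the indentation of the first line, so it strips only leading newlines and trailing whitespace instead of calling trim().

```typescript
// Hypothetical token type: either a run of text or a hard break.
type LineToken = { kind: 'text'; text: string } | { kind: 'break' }

// Split clipboard text into text and break tokens.
// Assumption: leading blank lines and trailing whitespace are unwanted,
// but leading spaces on the first line (indentation) should survive.
function toLineTokens(clipboardText: string): LineToken[] {
    const trimmed = clipboardText
        .replace(/^[\r\n]+/, '') // drop leading newlines only
        .replace(/\s+$/, '') // drop trailing whitespace and newlines

    if (trimmed.length === 0) {
        return []
    }

    const lines = trimmed.split(/\r\n|\r|\n/)
    const tokens: LineToken[] = []

    lines.forEach((line, index) => {
        // Empty lines produce no text token, only the break that follows them
        if (line.length > 0) {
            tokens.push({ kind: 'text', text: line })
        }
        if (index < lines.length - 1) {
            tokens.push({ kind: 'break' })
        }
    })

    return tokens
}
```

Inside handlePaste, each text token would map to state.schema.text(token.text) and each break token to state.schema.nodes.hardBreak.create(), exactly as the reduce in the handler above does.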