The art of HTML editors


You have probably seen them in many CMS applications: the so-called WYSIWYG editors, where you edit text in a browser window and it is submitted as HTML in a form. I always thought these editors were very challenging pieces of code, so I always resorted to an existing one such as TinyMCE, CKEditor or Summernote.

In reality, most of them rely on a non-standard feature introduced in IE 5.5(!). Because practically every WYSIWYG editor has used it, no browser has removed these non-standard features yet!

contenteditable attribute

To make HTML editable, all you need to do is add the contenteditable attribute to an HTML element. The user can then edit the HTML inside it. You can pick up changes to the HTML by polling at an interval or with a mutation observer.

<div id="edit-html" contenteditable="true" style="background-color: #2b00fe; color: #01ffff;">
  <b><i>You can freely edit me in this blog!!!!</i></b>
</div>
<textarea readonly id="display"></textarea>
<script>
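// Poll once per second and mirror the editable div's markup into the textarea.
setInterval(function () {
    const display = document.querySelector('#display');
    const editHtml = document.querySelector('#edit-html');
    // A textarea exposes its contents through `value`.
    display.value = editHtml.innerHTML;
}, 1000);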
</script>

Yes, it's that simple. Of course, by default you can only select text and type text. The most advanced thing you can do out of the box is make text bold by selecting it and pressing Ctrl+B.
Even though contenteditable was introduced as a non-standard attribute in IE 5.5, it is now supported by all browsers and has become a standard HTML attribute.
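
The polling loop above could also be replaced with a mutation observer. What follows is only a minimal sketch: watchEditable is a name I made up for this post, and its third parameter exists only so the function can be tried outside a browser. It assumes the same #edit-html and #display elements as the example above.

```javascript
// Hypothetical helper (not part of any library): calls onChange with the
// current innerHTML every time something inside `editable` changes.
function watchEditable(editable, onChange, Observer = globalThis.MutationObserver) {
    const observer = new Observer(function () {
        onChange(editable.innerHTML);
    });
    // characterData catches typing inside text nodes; childList and subtree
    // catch elements being added or removed anywhere below the editable element.
    observer.observe(editable, { childList: true, subtree: true, characterData: true });
    return observer;
}

// Browser usage, reusing the ids from the example above.
if (typeof document !== 'undefined') {
    const display = document.querySelector('#display');
    const editHtml = document.querySelector('#edit-html');
    if (display && editHtml) {
        watchEditable(editHtml, function (html) {
            display.value = html;
        });
    }
}
```

An observer only fires when something actually changes, so there is no one-second delay and nothing runs while the user is idle.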

So how do I add something like a paste button? Again we use more non-standard JavaScript, but this time the age of the JavaScript is very much visible:

<button onclick="document.execCommand('paste');">Paste from clipboard</button>
Yes, this magic global method lets you perform a paste action. It expects a magic string and does something when you run it. If your browser does not understand the magic string 'paste', or does not allow the command, the button does nothing: no error is thrown and the call simply returns false. The paste action also only works after you have clicked inside an HTML element with the contenteditable attribute.

So do current HTML editors still use this? The more advanced HTML editors do their own handling to edit HTML, such as emulating the cursor or building the HTML with a library like slate.js, but some functionality is currently only available through document.execCommand. The example above can be changed to use the Clipboard API, but the 'undo' command has no native JavaScript equivalent, and implementing it yourself would require a lot of code. There is also the issue that shortcut keys differ between operating systems.
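
For the paste button, a Clipboard API version could look roughly like this. It is a sketch under a few assumptions: pasteFromClipboard and insertAtSelection are hypothetical helper names, only plain text is pasted, and the browser has to allow navigator.clipboard.readText(), which needs a secure context and usually asks the user for permission.

```javascript
// Hypothetical helper: reads plain text from the clipboard and hands it to
// `insertText`. Resolves to false when the Clipboard API is not available.
async function pasteFromClipboard(insertText, clipboard = globalThis.navigator?.clipboard) {
    if (!clipboard || typeof clipboard.readText !== 'function') {
        return false;
    }
    // This promise rejects when the user or the browser denies access.
    const text = await clipboard.readText();
    insertText(text);
    return true;
}

// Hypothetical helper: inserts text at the current selection, replacing
// whatever is selected. Assumes the selection sits inside the editable element.
function insertAtSelection(text) {
    const selection = window.getSelection();
    if (!selection || selection.rangeCount === 0) {
        return;
    }
    const range = selection.getRangeAt(0);
    range.deleteContents();
    range.insertNode(document.createTextNode(text));
    range.collapse(false); // collapse the range to the end of the inserted text
}

// Browser usage, for example from a click handler:
// button.addEventListener('click', () => pasteFromClipboard(insertAtSelection));
```

Unlike document.execCommand('paste'), this version tells you when it failed: the returned promise resolves to false or rejects, so the button can show a message instead of silently doing nothing.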

Security considerations

Personally I prefer not to have HTML editors, as they are often found to be easy candidates for XSS. Here are a few things you have to consider to avoid XSS:
  • An HTML editor often hides the real form element in a hidden textarea. Always copy the HTML from the editor into the textarea, and never copy the textarea value back into the editor without sanitizing it first. This prevents people from planting specific values in the textarea field to get a working XSS.
  • If you use your HTML editor in a traditional backend form, make sure you run HTML sanitization again before rendering the contents of the HTML editor field when the form displays a validation error and re-renders the value that was entered. This is often forgotten!
  • Using PHP's strip_tags is not enough to protect you from XSS. For example, if I keep image tags, I have to strip attributes too, or I can be hacked with <img src="x" onerror="alert('xss');">. You also do not want to display the contents of script or style tags, which strip_tags leaves in place when it removes the tags themselves.
  • Generating the static HTML from dynamic JavaScript, with your data encoded as a JavaScript string by json_encode, does not protect you from XSS either, because HTML inside the string can make the browser try to repair what it sees as invalid HTML. Most inline scripts do not have CDATA markers. Some browsers have shipped built-in filters that detect this type of XSS, but that is not something to rely on.
  • On MySQL it is possible that a database field has the wrong column type and is too small for the HTML. Older versions of MySQL, or newer ones with strict SQL mode disabled, silently truncate the field data instead of throwing an error when the value is too long. The truncated HTML could also cause an XSS.
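
To make the json_encode point a bit more concrete, here is a small sketch in JavaScript, with JSON.stringify standing in for the encoder. jsonForInlineScript is a hypothetical helper, not part of any library. It escapes the characters the HTML parser cares about, so the encoded string no longer contains anything that looks like markup. PHP's json_encode escapes / by default, so the exact payload differs there, and its JSON_HEX_TAG flag escapes < and > in a similar way to this helper.

```javascript
// Hypothetical helper: JSON-encodes a value so the result can be placed
// inside an inline <script> element without the HTML parser seeing markup.
function jsonForInlineScript(value) {
    return JSON.stringify(value)
        .replace(/</g, '\\u003c')
        .replace(/>/g, '\\u003e')
        .replace(/&/g, '\\u0026');
}

const payload = '</script><script>alert("xss")</script>';

// Plain JSON encoding keeps the closing script tag intact, so the browser
// would end the inline script right there.
const naive = JSON.stringify(payload);
console.log(naive.includes('</script>'));

// The escaped version contains no angle brackets, yet decodes to the same string.
const safe = jsonForInlineScript(payload);
console.log(safe.includes('<'), JSON.parse(safe) === payload);
```

The escaped output is still valid JSON and valid JavaScript, so the script that receives it does not have to change.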

SafeHTML value object

In Apie I created a SafeHTML value object. I can use this value object in any library.

I can just do:

$object = new SafeHtml('<div>test<script>alert("hi");</script></div>');
$object->toNative(); // returns only the div without the script tag

In apie/cms, type hinting something with SafeHTML gives me an HTML editor. It uses a web component that gives my contenteditable contents a popup menu when I select text or type /, like you can do in a chatbot or when typing a chat message in Slack. The SafeHTML value object always sanitizes the input. It does this with symfony/html-sanitizer. For example, it only accepts complete URLs and removes any onclick, onerror or other event attribute. You can use it in any project.

The other methods relate more to how it is used with Apie library components. For example, it builds an index for full-text search and provides its own faker to create fake instances of this object.

Apie will display the HTML on the resource details page with apie/cms with SafeHTMLDisplayProvider or in a form with SafeHTMLComponentProvider.


