Support astral symbols in encodeForHTMLAttribute #8

Closed
mathiasbynens opened this issue Aug 23, 2013 · 8 comments

@mathiasbynens

Testing on http://rawgithub.com/chrisisbeef/jquery-encoder/master/site/index.html shows that invalid/incorrect HTML escape sequences are generated for astral symbols:

$.encoder.encodeForHTMLAttribute('\uD834\uDF06'); // U+1D306 TETRAGRAM FOR CENTRE; GitHub won’t let me use the raw symbol here
// → '&#xD834;&#xDF06;' which is incorrect
// it should be: '&#x1D306;'
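
For illustration, one way to produce a single character reference for U+1D306 is to walk the string by code point instead of by UTF-16 code unit. This is only a minimal sketch assuming an ES2015+ environment: `escapeAttributeValue` is a hypothetical helper, not part of jquery-encoder, and both the alphanumeric-only pass-through rule and the U+FFFD substitution for lone surrogates are assumptions made for the example.

```javascript
// Hypothetical helper, not part of jquery-encoder's API.
// Escapes every non-alphanumeric code point as a hexadecimal
// character reference, treating a surrogate pair as one symbol.
function escapeAttributeValue(input) {
  let out = '';
  // for...of iterates a string by code point, so a valid surrogate
  // pair arrives as a single two-unit string.
  for (const symbol of input) {
    const codePoint = symbol.codePointAt(0);
    if (/^[a-zA-Z0-9]$/.test(symbol)) {
      out += symbol;
    } else if (codePoint >= 0xD800 && codePoint <= 0xDFFF) {
      // Lone surrogate: there is no valid character reference for it,
      // so this sketch substitutes U+FFFD (an assumption, not a rule).
      out += '&#xFFFD;';
    } else {
      out += '&#x' + codePoint.toString(16).toUpperCase() + ';';
    }
  }
  return out;
}

console.log(escapeAttributeValue('\uD834\uDF06')); // '&#x1D306;'
console.log(escapeAttributeValue('a b'));          // 'a&#x20;b'
```

The key design choice is that the loop never sees the two halves of a surrogate pair separately, which is what avoids emitting one reference per half.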

A robust library for escaping/encoding text for use in HTML (or decoding it) is he (https://github.com/mathiasbynens/he). Feel free to use it as a dependency for this project.

@ghost

ghost commented Aug 26, 2013

Interesting. Is there a security concern with astral symbols? That is, can an astral symbol be decoded to a JavaScript control character when not using Unicode?

@mathiasbynens
Author

You mean, when not using UTF-8? Not as far as I know, but it’s safest to escape them anyway.

@stuartf
Contributor

stuartf commented Apr 1, 2014

Note that it's also broken when using the literal astral characters:

$.encoder.encodeForHTMLAttribute('value', '🍔');
// "value="&#xD83C;&#xDF54;""
// should be: &#x1F354;

@mathiasbynens
Author

@stuartf Of course, since '\uD834\uDF06' == '𝌆' in JavaScript. (I only posted the encoded version because GitHub didn’t allow me to use the astral symbol at that time. They only recently fixed that.)
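
The equivalence can be checked directly. A small sketch, assuming an ES2015+ engine for `Array.from` and `codePointAt`:

```javascript
// The escaped form and the literal symbol are the same two-unit string.
const escaped = '\uD834\uDF06';
const literal = '𝌆';

console.log(escaped === literal);                 // true
console.log(literal.length);                      // 2 (UTF-16 code units)
console.log(Array.from(literal).length);          // 1 (code points)
console.log(literal.codePointAt(0).toString(16)); // '1d306'
```

Because both spellings produce identical code units, any encoder that works per code unit sees the same two surrogates either way.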

@stuartf
Contributor

stuartf commented Apr 2, 2014

I get that escaping the characters is safer, but wouldn't it be a workable solution to do something like stuartf@6d0542e?

@mathiasbynens
Author

That doesn’t just ignore astral symbols, but also lone surrogates. But since you can’t encode those in HTML anyhow, I guess that’s fine in this case.
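
To make the trade-off concrete, here is a rough sketch of the "ignore surrogates" approach. It is not the code from the linked commit: `escapeSkippingSurrogates` is a hypothetical name, and the sketch only assumes that any code unit in the surrogate range U+D800 to U+DFFF is passed through unescaped while other non-alphanumeric characters are escaped.

```javascript
// Hypothetical illustration, not the actual patch.
// Works per UTF-16 code unit and leaves every surrogate untouched,
// whether it is part of a valid pair or a lone surrogate.
function escapeSkippingSurrogates(input) {
  let out = '';
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    const unit = input.charCodeAt(i);
    const isSurrogate = unit >= 0xD800 && unit <= 0xDFFF;
    if (/[a-zA-Z0-9]/.test(ch) || isSurrogate) {
      out += ch; // passed through unescaped
    } else {
      out += '&#x' + unit.toString(16) + ';';
    }
  }
  return out;
}

console.log(escapeSkippingSurrogates('"𝌆"')); // '&#x22;𝌆&#x22;'
```

In this sketch a lone surrogate such as `'\uD834'` is also returned unchanged, which is the side effect noted above.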

@chrisisbeef
Owner

If you see value in adding the encoding for astral symbols, I'd be more than happy to take a look at a patch and merge it into master.

@stuartf
Contributor

stuartf commented Apr 2, 2014

@chrisisbeef Does that mean my patch above would or wouldn't work for you? It doesn't encode the astral symbols; it just ignores them and the lone surrogates. If you do want them encoded, would it be OK if the patch depended on the https://github.com/mathiasbynens/he library mentioned above (it is MIT licensed)?
