Wack Mac newline character in UTF-8

So I had a user come to me with an error in the WYSIWYG editor of our online website builder. He would save his content, which he had pasted from OS X’s TextEdit program, then try to apply that content to his website. I have a PHP function which escapes newlines and single quotes so the content can be dropped into a JavaScript string and inserted into a textarea for editing. But I had missed the Unicode “line separator” (0xE2 0x80 0xA8 in UTF-8), which JavaScript treats as a line terminator, so JavaScript was throwing the “Unterminated String Literal” error. Anyway, after much angst, death, pain, hurting, and unfruitful googling, my co-worker found that Unicode has this “line separator” character. He also found other weird newline characters that show up in UTF-8 text. Here is a list of the newline characters:

  • “line separator” (U+2028)
  • “paragraph separator” (U+2029)
  • “next line” (U+0085)

Here is a link to a Wikipedia Article about it.
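
For reference, here is how the three characters in the list above look as UTF-8 bytes in PHP. This is only a minimal sketch: `findUnicodeNewlines` and `$unicodeNewlines` are hypothetical names made up for illustration, and U+2029 is listed under its formal Unicode name, “paragraph separator”.

```php
<?php
// UTF-8 byte sequences for the three characters listed above.
$unicodeNewlines = array(
    'line separator (U+2028)'      => "\xE2\x80\xA8",
    'paragraph separator (U+2029)' => "\xE2\x80\xA9",
    'next line (U+0085)'           => "\xC2\x85",
);

// Hypothetical helper: report which of these characters occur in $text.
function findUnicodeNewlines($text, $unicodeNewlines) {
    $found = array();
    foreach ($unicodeNewlines as $name => $bytes) {
        // strpos() compares raw bytes, which is all we need here.
        if (strpos($text, $bytes) !== false) {
            $found[] = $name;
        }
    }
    return $found;
}

// Simulated paste: two lines joined by a line separator.
$pasted = "first line" . "\xE2\x80\xA8" . "second line";
print_r(findUnicodeNewlines($pasted, $unicodeNewlines));
```

A check like this is handy for confirming that a user's pasted content really contains one of these characters before blaming something else.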

To fix the problem, I updated my JavaScript escape function to include these new characters. You can see the new function below:

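// Escape a string for use inside a single-quoted JavaScript string literal.
function jsesc($escString){
	// "\r\n" comes first so a Windows line ending becomes one \n, not two.
	// The chr() sequences are U+2028, U+2029 and U+0085 encoded as UTF-8.
	$find = array( "'", '’', "\r\n", "\n", "\r", chr(226).chr(128).chr(168), chr(226).chr(128).chr(169), chr(194).chr(133));
	$replace = array( "\\'", "\\'", "\\n", "\\n", "\\n", "\\n", "\\n", "\\n");
	return str_replace($find, $replace, stripslashes($escString));
}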
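
As an aside, a different technique from the hand-written list above is to let PHP's built-in `json_encode()` produce the string literal. By default it writes every non-ASCII character as a `\uXXXX` escape, so the line separator and its friends should be covered without naming them one by one. This is a sketch under assumptions, not a drop-in replacement: `jsStringLiteral` is a hypothetical name, the input is assumed to be valid UTF-8, and the result is a double-quoted literal that includes its own quotes.

```php
<?php
// Hypothetical alternative to a hand-written escape list: json_encode()
// returns a double-quoted literal with non-ASCII characters as \uXXXX.
function jsStringLiteral($text) {
    // The JSON_HEX_* flags also escape quotes, <, > and & as \uXXXX,
    // which helps when the literal is printed inside an inline <script>.
    $flags = JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_TAG | JSON_HEX_AMP;
    $literal = json_encode($text, $flags);
    if ($literal === false) {
        // json_encode() returns false for input that is not valid UTF-8;
        // fall back to an empty string literal in that case.
        return '""';
    }
    return $literal;
}

// Simulated paste containing a single quote, a line separator and a CRLF.
$pasted = "It's line one" . "\xE2\x80\xA8" . "line two\r\n";
echo 'var content = ' . jsStringLiteral($pasted) . ';';
```

Note that this behaves differently from `jsesc`: carriage returns stay as `\r` instead of being folded into `\n`, and the curly apostrophe is kept as a character escape rather than converted to a plain single quote.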
