Thursday, September 26, 2013

Unicode characters break String.length in JavaScript.

Consider:

<!DOCTYPE HTML>
<html>
<head>
   <meta charset="utf-8">
   <title>My Test</title>
</head>
<body>
   <div id="first">this and that</div>
   <div id="second">this & that</div>
   <div id="third">this &amp; that</div>
   <script type="text/javascript" src="jquery-1.8.2.js"></script>
   <script type="text/javascript">
      $(function() {
         var first = $('#first').html();
         var second = $('#second').html();
         var third = $('#third').html();
         alert(first.length);
         alert(second.length);
         alert(third.length);
      });
   </script>   
</body>
</html>


You would think that the three alerts would show a thirteen, then an eleven, and finally another eleven. Instead, fifteens come back in place of the elevens! A coworker mentioned yesterday that Unicode characters break String.length in JavaScript. I'm not sure exactly what he had in mind, but in doing some experimenting, I did find this dirtiness. (Fifteen happens to be exactly the length of "this &amp; that", which suggests that .html() hands back the markup with the ampersand escaped rather than the text as displayed.) However, I found other places where the dirtiness does not rear its ugly head. Below, an eight, a seven, and a seven are returned as expected:

<!DOCTYPE HTML>
<html>
<head>
   <meta charset="utf-8">
   <title>My Test</title>
</head>
<body>
   <div id="first">Jaeschke</div>
   <div id="second">Jäschke</div>
   <div id="third">Jæschke</div>
   <script type="text/javascript" src="jquery-1.8.2.js"></script>
   <script type="text/javascript">
      $(function() {
         var first = $('#first').html();
         var second = $('#second').html();
         var third = $('#third').html();
         alert(first.length);
         alert(second.length);
         alert(third.length);
      });
   </script>   
</body>
</html>


This version, which swaps the literal characters for numeric character references, behaves exactly like the code before it, returning 8, 7, and 7 as hoped:

<!DOCTYPE HTML>
<html>
<head>
   <meta charset="utf-8">
   <title>My Test</title>
</head>
<body>
   <div id="first">Jaeschke</div>
   <div id="second">J&#x00E4;schke</div>
   <div id="third">J&#x00E6;schke</div>
   <script type="text/javascript" src="jquery-1.8.2.js"></script>
   <script type="text/javascript">
      $(function() {
         var first = $('#first').html();
         var second = $('#second').html();
         var third = $('#third').html();
         alert(first.length);
         alert(second.length);
         alert(third.length);
      });
   </script>   
</body>
</html>
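
The results above can be reproduced without a browser. What follows is a minimal sketch, not jQuery's implementation: it assumes that .html() returns the element's serialized innerHTML, and serializeText and codePointLength are hypothetical helpers written only for this illustration. The last few lines show one guess at what "Unicode breaks String.length" might refer to: length counts UTF-16 code units, so decomposed accents and characters outside the Basic Multilingual Plane count for more than one.

```javascript
// Hypothetical helper: mimics how text content is escaped when a browser
// serializes it into innerHTML (assumed to be what .html() returns).
function serializeText(text) {
  return text
    .replace(/&/g, '&amp;')       // must run first so later entities are not re-escaped
    .replace(/\u00A0/g, '&nbsp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: counts Unicode code points instead of UTF-16 code units.
function codePointLength(str) {
  return Array.from(str).length;
}

// The ampersand case: the text is 11 characters, the markup is 15.
const text = 'this & that';
const markup = serializeText(text);
console.log(text.length);    // 11
console.log(markup.length);  // 15, because '&' became '&amp;'

// The umlaut case: a precomposed a-umlaut (U+00E4) is a single code unit.
const precomposed = 'J\u00E4schke';
console.log(precomposed.length);  // 7

// The same name with 'a' followed by a combining diaeresis (U+0308).
const decomposed = 'Ja\u0308schke';
console.log(decomposed.length);                   // 8
console.log(decomposed.normalize('NFC').length);  // 7

// A character outside the Basic Multilingual Plane (U+1F600) is a surrogate pair.
const astral = '\uD83D\uDE00';
console.log(astral.length);           // 2
console.log(codePointLength(astral)); // 1
```

If the goal is the length of what the user actually sees, reading the element with jQuery's .text() instead of .html() should avoid the escaped ampersand, since .text() returns the text content rather than the markup.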
