I'm looking at a custom control that embeds a bit of JavaScript into a page. Part of that JavaScript writes some static string text directly into the page. I've been running this code for a while now as part of an application I'm working on with a customer.

A couple of days ago, though, I ran into problems with this control, and it turns out the cause is that the JavaScript strings embedded into the HTML stream aren't properly encoded. The code used is something like this (grossly simplified):

string markup = "Some Text";

string script = @"
embedHtml("{0}");
function embedHtml(result)
{{
    alert(result)
}}";

this.Page.ClientScript.RegisterStartupScript(typeof(Page), "embedHtml", 
            string.Format(script,markup), true);

The idea is that the code takes some text from the server side and embeds it into the page. The client script then takes the embedded string and displays it when the page loads (the real thing embeds a bunch of HTML into the page in dynamic positions, but it's the same idea).

Can you spot the problem?

Actually, the code above is all fine and dandy as it stands.

But it becomes a problem as soon as the text you're embedding contains special characters. Say the string contains carriage returns, extended characters or, more pertinently, double quotes (which is what blew up my code originally, not surprisingly, since the embedded string contained HTML).

For example take this C# string assignment on the server:

string markup = "Hello \"Rick\"\r\nRock On";

which, when generated into the client side with the code above, results in:

embedHtml("Hello "Rick"
Rock On");

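which is clearly going to cause a JavaScript error when the page loads. The first inner quote ends the string literal early, and a raw line break isn't allowed inside a JavaScript string literal either.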
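
To see the failure without spinning up a page, here's a minimal console sketch. It relies on nothing beyond string.Format; the class name BrokenEmbedDemo and the BuildScript helper are made up for illustration and aren't part of the control.

```csharp
using System;

// Hypothetical demo class, not part of the original control.
public static class BrokenEmbedDemo
{
    // Drops the raw text between hard-coded quotes, the way the template above does.
    public static string BuildScript(string markup)
    {
        return string.Format("embedHtml(\"{0}\");", markup);
    }

    public static void Main()
    {
        // Plain text: produces a valid call.
        Console.WriteLine(BuildScript("Some Text"));

        // The quotes and the line break land inside the literal unescaped.
        Console.WriteLine(BuildScript("Hello \"Rick\"\r\nRock On"));
    }
}
```

The second line of output is the broken script shown above: nothing in string.Format knows that the text is headed into a JavaScript literal.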

The problem is that the argument in

embedHtml("{0}");

or

embedHtml('{0}');

is a string literal, and the text substituted into it has to be escaped properly, or else the code will blow up sporadically whenever the embedded text happens to contain one of those characters.

The fix for this is to encode the string before embedding it. The easiest way to get proper JavaScript string encoding is to JSON encode the string, and you can do that with the following code:

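/// <summary>
/// Encodes a string to be represented as a JavaScript string literal.
/// The format is essentially a JSON string.
/// 
/// The string returned includes outer quotes 
/// Example Output: "Hello \"Rick\"!\r\nRock on"
/// </summary>
/// <param name="s">The raw text to encode; null yields the JavaScript literal null</param>
/// <returns>A quoted, escaped literal that can be written directly into script code</returns>
public static string EncodeJsString(string s)
{
    // Requires: using System.Text; (for StringBuilder)

    // Guard: foreach over a null string would throw
    if (s == null)
        return "null";

    StringBuilder sb = new StringBuilder();
    sb.Append("\"");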
    foreach (char c in s)
    {
        switch (c)
        {
            case '\"':
                sb.Append("\\\"");
                break;
            case '\\':
                sb.Append("\\\\");
                break;
            case '\b':
                sb.Append("\\b");
                break;
            case '\f':
                sb.Append("\\f");
                break;
            case '\n':
                sb.Append("\\n");
                break;
            case '\r':
                sb.Append("\\r");
                break;
            case '\t':
                sb.Append("\\t");
                break;
            default:
                // Control characters and anything outside 7-bit ASCII
                // are written as \uXXXX escapes
                int i = (int)c;
                if (i < 32 || i > 127)
                {
                    sb.AppendFormat("\\u{0:X04}", i);
                }
                else
                {
                    sb.Append(c);
                }
                break;
        }
    }
    sb.Append("\"");

    return sb.ToString();
}
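
One gap to be aware of if the text contains HTML: the routine above leaves < alone, so a string containing </script> would still end an inline script block early, because the HTML parser looks for that tag before the JavaScript engine ever sees the literal. Here's a minimal sketch of one way around it, assuming the script ends up inline in the page. InlineScriptHardening and EscapeLessThan are hypothetical names, not part of wwWebUtils.

```csharp
using System;

// Hypothetical helper, not part of wwWebUtils.
public static class InlineScriptHardening
{
    // Expects an already encoded, quoted JavaScript string literal.
    // Every '<' in such a literal is string content, so rewriting it as
    // the \u003C escape leaves the decoded value unchanged.
    public static string EscapeLessThan(string jsLiteral)
    {
        return jsLiteral.Replace("<", "\\u003C");
    }

    public static void Main()
    {
        // A literal as the encoding routine would return it for: </script><b>hi</b>
        string literal = "\"</script><b>hi</b>\"";
        Console.WriteLine(EscapeLessThan(literal));
    }
}
```

The browser decodes \u003C back to < when it evaluates the literal, so the script receives the same text while the page source no longer contains a closing script tag.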

So now we can change the code to:

string markup = wwWebUtils.EncodeJsString("Hello \"Rick\"\r\nRock On");

string script = @"
embedHtml({0});
function embedHtml(result)
{{
    alert(result)
}}";

this.Page.ClientScript.RegisterStartupScript(typeof(Page), "embedHtml", 
            string.Format(script,markup), true);

And voila - that works correctly. The embedded string in the JavaScript now looks like this:

"Hello \"Rick\"\r\nRock On"

Note that the embedHtml({0}) call no longer has quotes around the format placeholder. EncodeJsString returns the string with the surrounding quotes included, so there's no ambiguity about which string delimiters to use. This can also reduce the complexity of code that requires nested string expressions.
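
As an aside, if you're on .NET 4.0 or later, the framework has an encoder of its own, HttpUtility.JavaScriptStringEncode in System.Web, that can stand in for a hand-rolled routine. This is a minimal sketch, assuming that method is available to your project; BuiltInEncodeDemo and BuildScript are hypothetical names. Its escaping differs in the details from the routine above, so treat the output as equivalent rather than identical.

```csharp
using System;
using System.Web;

// Hypothetical demo class; only HttpUtility.JavaScriptStringEncode is a framework API.
public static class BuiltInEncodeDemo
{
    public static string BuildScript(string markup)
    {
        // Passing true asks the encoder to add the surrounding double quotes,
        // so the template uses a bare {0}, just as with EncodeJsString.
        string literal = HttpUtility.JavaScriptStringEncode(markup, true);
        return string.Format("embedHtml({0});", literal);
    }

    public static void Main()
    {
        Console.WriteLine(BuildScript("Hello \"Rick\"\r\nRock On"));
    }
}
```

Either way the pattern is the same: the encoder owns the quotes, and the script template only marks where the finished literal goes.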

This same logic applies if you use script expressions inside of a page:

alert( <%= wwWebUtils.EncodeJsString("My name is Sam\r\nRoll on") %> );

One place where I've actually used this a lot in the past is for client script localization. If you do something like this for example:

alert( "<%= HttpContext.GetGlobalResourceObject("Resources","WarrantyDetail") %>" );

you can run into the same encoding problems and this code should be changed to:

alert( <%= wwWebUtils.EncodeJsString((string) HttpContext.GetGlobalResourceObject("Resources","WarrantyDetail")) %> );

I know a lot of people truly disdain 'legacy' classic ASP script tags, but in some cases, localization especially, they are the easiest and most readable way to accomplish the task. Of course the same rules can be applied with a Label or Literal control by encoding the text explicitly in code.

I've run into this problem on a few occasions myself and I see it frequently in other people's code. While it may not be all that common to embed string literals into JavaScript, when you do need to do it, it's very important to encode the string.

It's these little details that are easy to miss when working with JavaScript, so I thought I'd pass this one along. Hope this helps somebody out.