Matthias Shapiro

I need fewer hobbies

International UTF-8 Characters in Windows Phone 7 WebBrowser Control

I haven’t blogged for a while, not because I haven’t had anything to say, but because I felt I needed time to triage all the cool stuff I’ve been learning about Windows Phone 7 Silverlight development. However, one thing I’ve learned cannot wait: support for international characters in the WebBrowser control.

Basically, the problem is as follows: we want to show HTML that uses international characters. The most straightforward way to show HTML in a Windows Phone 7 app is to use the WebBrowser control and pass the HTML to its "NavigateToString(string myString)" method.

However, when we pass international text (like Japanese, Arabic, Korean, Russian or Chinese characters) to this method, we get a mess. The following code:

string testString = "<html><body>日本列島の占領の最初の兆候が縄文時代で約14,000のBC、竪穴住居の中石器時代新石器時代</body></html>";
BrowserControl.NavigateToString(testString);

produces the following result:


In case you’re not familiar with Japanese, this is not Japanese. Instead, it looks like what you get when the Japanese characters are decoded with the wrong character encoding. Why does it do this? I’m not sure. But my efforts to show the actual international text in the WebBrowser control were met with tears time and time again.

Until I found this post, unhelpfully titled “Windows Phone 7 Character Testing…”. Here the author gives us this extremely helpful method for producing the string we need to show international characters:

private static string ConvertExtendedASCII(string HTML)
{
    string retVal = "";
    char[] s = HTML.ToCharArray();

    foreach (char c in s)
    {
        // Replace anything outside 7-bit ASCII with a numeric character reference
        if (Convert.ToInt32(c) > 127)
            retVal += "&#" + Convert.ToInt32(c) + ";";
        else
            retVal += c;
    }

    return retVal;
}

With this in place, we can simply run our string through the method to get properly encoded HTML, so that

BrowserControl.NavigateToString(ConvertExtendedASCII(testString));

gives us the Japanese text, rendered correctly.


And we’re happy. Very happy.
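
To make it concrete what the conversion produces, here is a minimal sketch of the same idea. The class and method names (HtmlEntityDemo, EncodeNonAscii) are made up for illustration and are not part of the original post; the logic is the same "anything above 127 becomes a decimal numeric character reference" rule, just written with a StringBuilder.

```csharp
using System;
using System.Text;

public static class HtmlEntityDemo
{
    // Hypothetical helper name (an assumption for this sketch).
    // Every UTF-16 code unit above 127 is written as "&#NNNN;",
    // everything else is copied through unchanged.
    public static string EncodeNonAscii(string html)
    {
        var sb = new StringBuilder(html.Length);
        foreach (char c in html)
        {
            if (c > 127)
                sb.Append("&#").Append((int)c).Append(';');
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

For example, EncodeNonAscii("<b>日本</b>") returns "<b>&#26085;&#26412;</b>", since 日 is U+65E5 (26085) and 本 is U+672C (26412). The result is plain ASCII, so it no longer matters which encoding the control assumes for the string.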

18 Responses to International UTF-8 Characters in Windows Phone 7 WebBrowser Control

  2. Manh Hoang Xuan

    Thanks! It’s very helpful for Vietnamese characters.

  3. Pingback: Edit long content of text in Win Phone 7 | Danilo Ercoli

  4. EngDev

    It didn’t work correctly with Arabic text.

  5. SiONYX

    retVal += c; instantiates a new string on every iteration, which significantly slows down the conversion.

    The following function works about a thousand times faster:

    private static string FastConvertExtendedASCII(string HTML)
    {
        char[] s = HTML.ToCharArray();

        // Count the characters to be converted
        // and calculate the extra space they need
        int n = 0;
        int value;
        foreach (char c in s)
        {
            if ((value = Convert.ToInt32(c)) > 127)
            {
                if (value > 9999)
                    n += 7;
                else if (value > 999)
                    n += 6;
                else
                    n += 5;
            }
        }

        // To avoid instantiating new strings,
        // allocate one buffer for the final string
        char[] res = new char[HTML.Length + n];

        // Conversion
        int i = 0;
        int div;
        const int zero = (int)'0';
        foreach (char c in s)
        {
            if ((value = Convert.ToInt32(c)) > 127)
            {
                res[i++] = '&';
                res[i++] = '#';

                if (value > 9999)
                    div = 10000;
                else if (value > 999)
                    div = 1000;
                else
                    div = 100;

                // Write the decimal digits, most significant first
                while (div > 0)
                {
                    res[i++] = (char)(zero + value / div);
                    value %= div;
                    div /= 10;
                }

                res[i++] = ';';
            }
            else
            {
                res[i++] = c;
            }
        }

        return new string(res);
    }

  6. Pingback: 如何解决WP7中WebBrowser的中文乱码? | Tmango

  7. Thanks, Man. Works well! )

  9. Masoud

    Hey man
    you great
    it works

  10. dalmate

    Thanks. It saved my life. :))

  11. Joel

    I found another way to resolve the problem:

    StreamReader reader = new StreamReader(TitleContainer.OpenStream("731999031.htm"), Encoding.GetEncoding("unicode"));

    It works very well!

  12. Joel

    It works!
    But ConvertExtendedASCII takes a very long time to run. Any idea how to resolve that?

  13. string.Format(@"{0}", content). Add a head at the front.

  14. Ken Sadahiro

    Let’s try again – sorry:

    <head><meta content="text/html; charset=utf-16"/></head>

  15. Ken Sadahiro

    OOPS – the HTML got eaten up – what I said after “and tags?” was this:

    <head><meta content="text/html; charset=utf-16"/></head>

  16. Ken Sadahiro

    Have you tried adding the following in your “testString” variable between the and tags?

    That might help at least with Unicode BMP characters. For characters in the Unicode supplementary planes, you will need to use one of the available System.Text.Encoding classes, or do the trick you did.

  17. Very helpful post! I’ve tuned the code sample:

    private static string ConvertExtendedAscii(string html)
    {
        StringBuilder sb = new StringBuilder();

        foreach (var c in html)
        {
            int charInt = Convert.ToInt32(c);
            if (charInt > 127)
                sb.AppendFormat("&#{0};", charInt);
            else
                sb.Append(c); // keep plain ASCII characters as they are
        }

        return sb.ToString();
    }
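
One caveat follows from Ken Sadahiro's note about characters outside the BMP. All the versions above convert one UTF-16 char at a time, so a character stored as a surrogate pair (most emoji, some rarer CJK ideographs) would be written as two separate references, and HTML does not accept numeric references to surrogate values. The sketch below is one possible way to handle that, not something from the original post: the class and method names are hypothetical, and it assumes char.IsHighSurrogate, char.IsLowSurrogate and char.ConvertToUtf32 are available on your target platform (they are in the full .NET Framework).

```csharp
using System;
using System.Text;

public static class CodePointEncoder
{
    // Hypothetical helper name (an assumption for this sketch).
    // Encodes by Unicode code point instead of by UTF-16 code unit.
    public static string EncodeNonAsciiByCodePoint(string html)
    {
        var sb = new StringBuilder(html.Length);
        for (int i = 0; i < html.Length; i++)
        {
            char c = html[i];
            if (c <= 127)
            {
                // Plain ASCII is copied through unchanged
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c)
                     && i + 1 < html.Length
                     && char.IsLowSurrogate(html[i + 1]))
            {
                // Combine the surrogate pair into one code point
                int codePoint = char.ConvertToUtf32(c, html[i + 1]);
                sb.Append("&#").Append(codePoint).Append(';');
                i++; // skip the low surrogate we just consumed
            }
            else
            {
                // BMP character (or a lone surrogate, written out as-is
                // like the per-char versions above would do)
                sb.Append("&#").Append((int)c).Append(';');
            }
        }
        return sb.ToString();
    }
}
```

With this version, EncodeNonAsciiByCodePoint("a\U0001F600") returns "a&#128512;" (U+1F600 as a single reference), while BMP text such as Japanese or Arabic comes out exactly as it does from the per-char versions.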