International UTF-8 Characters in Windows Phone 7 WebBrowser Control
I haven’t blogged for a while, not because I haven’t had anything to say but because I felt I needed time to triage all the cool stuff I’ve been learning about Windows Phone 7 Silverlight development. However, one thing I’ve learned cannot wait: support for international characters in the WebBrowser control.
Basically, the problem is as follows: we want to show HTML that uses international characters. The most straightforward way to show HTML in a Windows Phone 7 app is to use the WebBrowser control and pass the HTML to its “NavigateToString(string myString)” method.
However, when we pass international text (such as Japanese, Arabic, Korean, Russian or Chinese characters) to this method, we get a mess. The following code:
string testString = "<html><body>日本列島の占領の最初の兆候が縄文時代で約14,000のBC、竪穴住居の中石器時代新石器時代に半定住狩猟採集文化と農業の初歩的なフォームから続いて、30,000年頃旧石器文化と登場しました。</body></html>";
BrowserControl.NavigateToString(testString);
produces the following result:
In case you’re not familiar with Japanese, this is not Japanese. This is, instead, the ASCII version of the Japanese characters we want to see. Why does it do this? I’m not sure. But my effort to show the actual international text in the WebBrowser control was met with tears time and time again.
Until I found this post unhelpfully titled “Windows Phone 7 Character Testing…”. Here the author gives us this extremely helpful method for delivering the string we need to show international characters:
private static string ConvertExtendedASCII(string HTML)
{
    string retVal = "";
    char[] s = HTML.ToCharArray();
    foreach (char c in s)
    {
        if (Convert.ToInt32(c) > 127)
            retVal += "&#" + Convert.ToInt32(c) + ";";
        else
            retVal += c;
    }
    return retVal;
}
With this in place, we can simply run our string through the method to get properly encoded HTML before handing it to the control:
BrowserControl.NavigateToString(ConvertExtendedASCII(testString));
And we’re happy. Very happy.
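To make the transformation concrete, here is a minimal sketch of the same rule applied to a two-character string. The class and method names (EntityDemo, Encode) are made up for illustration and are not part of the original post. Every UTF-16 code unit above 127 is replaced with a decimal numeric character reference, and everything else passes through untouched.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; it applies the same rule as
// ConvertExtendedASCII above, with a StringBuilder instead of string concatenation.
public static class EntityDemo
{
    public static string Encode(string html)
    {
        var sb = new StringBuilder(html.Length);
        foreach (char c in html)
        {
            if (c > 127)
                sb.Append("&#").Append((int)c).Append(';'); // e.g. 日 (U+65E5) becomes &#26085;
            else
                sb.Append(c);                               // ASCII is copied as-is
        }
        return sb.ToString();
    }
}
```

For the input "<p>日本</p>" this returns "<p>&#26085;&#26412;</p>", which is plain ASCII and therefore survives whatever decoding the control applies.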
18 Responses to International UTF-8 Characters in Windows Phone 7 WebBrowser Control
Thanks! It’s very helpful for Vietnamese characters.
Pingback: Edit long content of text in Win Phone 7 | Danilo Ercoli
It didn’t work correctly with Arabic text.
“retVal += c;” instantiates a new string on every iteration, which significantly slows down the conversion.
The following function runs far faster (about a thousand times, in my experience):
private static string FastConvertExtendedASCII(string HTML)
{
    char[] s = HTML.ToCharArray();

    // Count the characters to be converted
    // and calculate the extra space they need
    int n = 0;
    int value;
    foreach (char c in s)
    {
        if ((value = Convert.ToInt32(c)) > 127)
        {
            if (value > 9999)
                n += 7;
            else if (value > 999)
                n += 6;
            else
                n += 5;
        }
    }

    // To avoid instantiating new strings,
    // allocate one buffer for the final string
    char[] res = new char[HTML.Length + n];

    // Conversion
    int i = 0;
    int div;
    const int zero = (int)'0';
    foreach (char c in s)
    {
        if ((value = Convert.ToInt32(c)) > 127)
        {
            res[i++] = '&';
            res[i++] = '#';
            if (value > 9999)
                div = 10000;
            else if (value > 999)
                div = 1000;
            else
                div = 100;
            while (div > 0)
            {
                res[i++] = (char)(zero + value / div);
                value %= div;
                div /= 10;
            }
            res[i++] = ';';
        }
        else
        {
            res[i] = c;
            i++;
        }
    }
    return new string(res);
}
Check these helpful links too; they also explain the WebBrowser control in Windows Phone 7 development very well:
http://www.mindstick.com/Articles/e5957dee-a494-47da-9cd5-61fb870c02e2/?WebBrowser%20Control%20in%20Windows%207%20Phone%20Development
http://dotnet.dzone.com/news/windows-phone-7-development
Pingback: How to fix garbled Chinese text in the WP7 WebBrowser? | Tmango
Thanks, Man. Works well! )
Hey man, you’re great. Thanks, it works.
Thanks. It saved my life. :))
I found another way to resolve the problem:
StreamReader reader = new StreamReader(TitleContainer.OpenStream("731999031.htm"), Encoding.GetEncoding("unicode"));
It works very well!
It works!
But ConvertExtendedASCII takes a very long time to run. Any idea how to resolve that?
string.Format(@"{0}", content). Add a head in front.
Have you tried adding the following to your “testString” variable between the <html> and <body> tags?
<head><meta content="text/html; charset=utf-16"/></head>
That might help at least with Unicode BMP characters. For characters that are in one of the Unicode SMPs, you will need to use one of the available System.Text.Encoding classes (see http://msdn.microsoft.com/en-us/library/system.text.encoding(v=VS.95).aspx ), or do the trick you did.
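For what it’s worth, here is a rough sketch of how the entity trick could be extended to characters outside the BMP. The class and method names (HtmlEntityEncoder, EncodeNonAscii) are hypothetical, and this has not been tested on a Windows Phone 7 device. It assumes the standard char.IsHighSurrogate, char.IsLowSurrogate and char.ConvertToUtf32 helpers are available on the target framework. A surrogate pair is combined into one code point and emitted as a single numeric reference, instead of two references to the individual surrogates.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
public static class HtmlEntityEncoder
{
    public static string EncodeNonAscii(string html)
    {
        var sb = new StringBuilder(html.Length * 2);
        for (int i = 0; i < html.Length; i++)
        {
            char c = html[i];
            if (c <= 127)
            {
                // Plain ASCII is copied unchanged.
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c)
                     && i + 1 < html.Length
                     && char.IsLowSurrogate(html[i + 1]))
            {
                // Combine the surrogate pair into one code point (above U+FFFF).
                int codePoint = char.ConvertToUtf32(c, html[i + 1]);
                sb.Append("&#").Append(codePoint).Append(';');
                i++; // skip the low surrogate that was just consumed
            }
            else
            {
                // BMP character; an unpaired surrogate also lands here and is
                // emitted as-is, matching the behaviour of the original method.
                sb.Append("&#").Append((int)c).Append(';');
            }
        }
        return sb.ToString();
    }
}
```

With this version, a string holding the single character U+1F600 (stored as a surrogate pair) becomes "&#128512;", while BMP text such as "日本" is encoded exactly as before.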
Very helpful post! I’ve tuned the code sample:
private static string ConvertExtendedAscii(string html)
{
    StringBuilder sb = new StringBuilder();
    foreach (var c in html)
    {
        int charInt = Convert.ToInt32(c);
        if (charInt > 127)
            sb.AppendFormat("&#{0};", charInt);
        else
            sb.Append(c);
    }
    return sb.ToString();
}