OK, I feel like an idiot, but I’ve been experimenting with this for an hour now and I cannot for the life of me figure out how to get the encoding right when writing the output of an ASP.NET page to a file. Well, no, I can get it to work by explicitly setting the encoding to Windows-1252, but that’s not really what I want…
Here’s the setup. I’m using the ASP.NET runtime inside of a desktop app. The ProcessRequest method needs to pass a TextWriter to the ASP.NET runtime, which then renders its output into it. This is the same TextWriter that Page.Render writes into, for example.
All works well, except I can’t get the output written to file to look correct. What I want is to get the output written in UTF-8. So I thought I could use:
TextWriter Output;
try
{
    // *** Note you have to write the right 'codepage'. If you use the default UTF-8
    // *** everything will be double encoded.
    Output = new StreamWriter(this.OutputFile, false, Encoding.UTF8);
}
catch (Exception ex)
{
    this.Error = true;
    this.ErrorMessage = ex.Message;
    return false;
}

// *** Reset the Response settings
this.ResponseHeaders = null;
this.Cookies = null;
this.ResponseStatusCode = 200;

wwWorkerRequest Request = new wwWorkerRequest(Page, QueryString, Output);
if (this.Context != null)
    Request.Context = this.Context;

Request.PostData = this.PostData;
Request.PostContentType = this.PostContentType;
Request.RequestHeaders = this.RequestHeaders;
Request.PhysicalPath = this.PhysicalDirectory;

try
{
    HttpRuntime.ProcessRequest(Request);
}
catch (Exception ex)
{
    Output.Close();
    this.ResponseStatusCode = 500;
    this.ErrorMessage = ex.Message;
    this.Error = true;
    return false;
}

Output.Close();

this.ResponseHeaders = Request.ResponseHeaders;
this.ResponseStatusCode = Request.ResponseStatusCode;

// *** Capture the Cookies that were set by the server
this.Cookies = Request.Cookies;

if (Request.Context != null)
    this.Context = Request.Context;

return true;
The ASP.NET application is set up to encode to UTF-8 in Web.config:
<globalization requestEncoding="utf-8" responseEncoding="utf-8" />
So, what happens? Output gets generated, but it ends up double encoded. I have a string like this embedded in the HTML that the ASPX page renders:
¢ª
After running the code above I get (raw output):
¢ª
Which is some funky double encoded wanna-be UTF-8 output of the above characters.
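To make "double encoded" concrete, here is a minimal sketch of one way it can happen. It assumes that somewhere between the runtime and the file the already UTF-8 encoded bytes get decoded with a single-byte code page and are then encoded as UTF-8 a second time by the StreamWriter. The `DoubleEncodingDemo` class and `DoubleEncode` method are made up for illustration, and ISO-8859-1 stands in for Windows-1252 (the two agree on the bytes involved here); whether this is what actually happens inside the worker request is not established.

```csharp
using System;
using System.Text;

public static class DoubleEncodingDemo
{
    // Hypothetical helper: encodes text as UTF-8, misreads those bytes with a
    // single-byte code page, then encodes the misread string as UTF-8 again.
    public static byte[] DoubleEncode(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);

        // Assumption: ISO-8859-1 stands in for Windows-1252. Each byte becomes
        // exactly one character, so every UTF-8 lead byte turns into its own char.
        string misread = Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);

        return Encoding.UTF8.GetBytes(misread);
    }

    public static void Main()
    {
        string text = "\u00A2\u00AA"; // the cent sign and feminine ordinal from the page

        // Correct UTF-8: two bytes per character.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(text)));
        // prints C2-A2-C2-AA

        // Double encoded: four bytes per character.
        Console.WriteLine(BitConverter.ToString(DoubleEncode(text)));
        // prints C3-82-C2-A2-C3-82-C2-AA
    }
}
```

If the file on disk contains the longer byte sequence, the damage was done before the bytes reached the file, not by whatever viewer is displaying it.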
Next, I thought, OK, so we’re double encoding, let’s try Encoding.ASCII on the stream. But that gives me invalid characters (??????), so that’s no good either. Using Encoding.Default produces yet another result:
¢ª
which is just plain garbage.
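The question marks from the ASCII attempt are at least easy to explain: the ASCII encoding has no representation for anything above 127, so by default it substitutes `?` for each such character. A small sketch of that behavior follows; the `AsciiFallbackDemo` class and `RoundTripThroughAscii` method are illustrative names only.

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Hypothetical helper: encodes text as ASCII and decodes it again to show
    // what survives. Non-ASCII characters are replaced with '?' on encoding.
    public static string RoundTripThroughAscii(string text)
    {
        byte[] asciiBytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(asciiBytes);
    }

    public static void Main()
    {
        Console.WriteLine(RoundTripThroughAscii("caf\u00E9 \u00A2\u00AA"));
        // prints caf? ??
    }
}
```

The replacement is lossy, so there is no way to recover the original characters from a file written this way.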
I did manage to get this to work by using Encoding.Default (Windows-1252, basically) and then also setting web.config to use Windows-1252 for its encoding, but this is not really what I want. Hard-coding a specific encoding gets me through, but it’s not a good generic solution. UTF-8 would certainly be a better choice.
I don’t really understand what I should be passing in for a TextWriter here when I need to dump to file. Why is this double encoding occurring when I use Encoding.UTF8 on the stream? It seems what I need is a raw binary stream into which the encoding TextWriter writes. But then I’m still missing the byte order mark too…
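For the raw-stream idea, the byte order mark does not have to come from a StreamWriter: `Encoding.UTF8.GetPreamble()` returns the three BOM bytes (EF BB BF), and they can be written to the stream by hand ahead of the content. The sketch below assumes the page output is already available as UTF-8 encoded bytes, which is not how the code above receives it; the `RawUtf8FileDemo` class, the `WriteWithPreamble` method, and the temp file name are all hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class RawUtf8FileDemo
{
    // Hypothetical helper: writes the UTF-8 preamble followed by bytes that are
    // assumed to be UTF-8 encoded already, without re-encoding them.
    public static void WriteWithPreamble(string path, byte[] utf8Body)
    {
        using (FileStream stream = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            byte[] preamble = Encoding.UTF8.GetPreamble(); // EF BB BF
            stream.Write(preamble, 0, preamble.Length);
            stream.Write(utf8Body, 0, utf8Body.Length);
        }
    }

    public static void Main()
    {
        // Illustrative file name in the temp directory.
        string path = Path.Combine(Path.GetTempPath(), "utf8-demo.html");

        // Stand-in for rendered page output that is already UTF-8.
        byte[] body = Encoding.UTF8.GetBytes("<p>\u00A2\u00AA</p>");
        WriteWithPreamble(path, body);

        // The dump starts with EF-BB-BF, followed by the body bytes unchanged.
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(path)));
    }
}
```

Because the bytes pass through untouched, no second encoding step exists that could mangle them. The open question remains how to get the runtime's output as bytes rather than as text pushed through a TextWriter.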
What am I missing here? How do I set up my stream and TextWriter so that ASP.NET writes my output to file as properly encoded UTF-8, including the UTF-8 preamble and correctly encoded extended (non-ASCII) characters?