Rick Strahl's Web Log

Wind, waves, code and everything in between...

XmlWriter and Schema



In the ResX exporter for my data-driven Resource Provider I use a bit of code that iterates over the database resources and spits out ResX resources from the data, as an option for getting your resources into your Web site. That code uses an XmlWriter to quickly write out the data.

But there are a couple of non-critical problems in that code. Several people have pointed out that the XML generated doesn't exactly match the ResX format that Visual Studio uses (although as far as I can tell there's no problem).

The problem is related to an xml:space="preserve" attribute that Visual Studio sticks onto each resource. It does this so leading and trailing spaces are preserved when the values are read. Visual Studio generates:

  <data name="Today" xml:space="preserve">
    <value>Heute </value>
  </data>

but my code generates:

   <data name="Today" d2p1:space="preserve" xmlns:d2p1="xml">
      <value>Heute </value>
   </data>

So where's that coming from?

The code I use to generate this Xml Fragment is pretty simple:

xWriter.WriteStartElement("data");
xWriter.WriteAttributeString("name", ResourceId);                    
xWriter.WriteAttributeString("space","xml","preserve");                                        
xWriter.WriteElementString("value", Value);
xWriter.WriteEndElement(); // data

As you can see the xml:space attribute is in fact written out directly, but the XmlWriter thinks it knows better and tries to explicitly define the Xml namespace. For the resource editor this causes no harm as far as I can tell, but it is kinda ugly.
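
To reproduce this in isolation, a small sketch along the following lines should be enough. It assumes an in-memory StringWriter instead of the ResX file path, and the class and method names are made up for the example:

```csharp
using System.IO;
using System.Xml;

public static class XmlSpaceRepro
{
    // Writes <root><data ...><value>...</value></data></root> to a string,
    // using the same three-argument WriteAttributeString call as the exporter.
    public static string Generate(string resourceId, string value)
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        writer.Formatting = Formatting.Indented;
        writer.Indentation = 3;
        writer.IndentChar = ' ';

        writer.WriteStartElement("root");
        writer.WriteStartElement("data");
        writer.WriteAttributeString("name", resourceId);
        writer.WriteAttributeString("space", "xml", "preserve"); // same call as in the exporter
        writer.WriteElementString("value", value);
        writer.WriteEndElement(); // data
        writer.WriteEndElement(); // root
        writer.Flush();

        return output.ToString();
    }
}
```

Calling XmlSpaceRepro.Generate("Today", "Heute ") and inspecting the returned string should show the same generated prefix and xmlns declaration on the data element as the output above.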

Now the document does have a schema, but I originally wrote the schema out simply as a raw text string. There's a raw header and schema definition:

        public const string ResXDocumentTemplate =
@"<?xml version=""1.0"" encoding=""utf-8""?>
<root>
  <xsd:schema id=""root"" xmlns="""" xmlns:xsd=""http://www.w3.org/2001/XMLSchema"" xmlns:msdata=""urn:schemas-microsoft-com:xml-msdata"">
    <xsd:import namespace=""http://www.w3.org/XML/1998/namespace"" />
    <xsd:element name=""root"" msdata:IsDataSet=""true"">
      <xsd:complexType>
        <xsd:choice maxOccurs=""unbounded"">
          <xsd:element name=""metadata"">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name=""value"" type=""xsd:string"" minOccurs=""0"" />
              </xsd:sequence>
              <xsd:attribute name=""name"" use=""required"" type=""xsd:string"" />
              <xsd:attribute name=""type"" type=""xsd:string"" />
              <xsd:attribute name=""mimetype"" type=""xsd:string"" />
              <xsd:attribute ref=""xml:space"" />
            </xsd:complexType>
          </xsd:element>
          <xsd:element name=""assembly"">
            <xsd:complexType>
              <xsd:attribute name=""alias"" type=""xsd:string"" />
              <xsd:attribute name=""name"" type=""xsd:string"" />
            </xsd:complexType>
          </xsd:element>
          <xsd:element name=""data"">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name=""value"" type=""xsd:string"" minOccurs=""0"" msdata:Ordinal=""1"" />
                <xsd:element name=""comment"" type=""xsd:string"" minOccurs=""0"" msdata:Ordinal=""2"" />
              </xsd:sequence>
              <xsd:attribute name=""name"" type=""xsd:string"" use=""required"" msdata:Ordinal=""1"" />
              <xsd:attribute name=""type"" type=""xsd:string"" msdata:Ordinal=""3"" />
              <xsd:attribute name=""mimetype"" type=""xsd:string"" msdata:Ordinal=""4"" />
              <xsd:attribute ref=""xml:space"" />
            </xsd:complexType>
          </xsd:element>
          <xsd:element name=""resheader"">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name=""value"" type=""xsd:string"" minOccurs=""0"" msdata:Ordinal=""1"" />
              </xsd:sequence>
              <xsd:attribute name=""name"" type=""xsd:string"" use=""required"" />
            </xsd:complexType>
          </xsd:element>
        </xsd:choice>
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>
  <resheader name=""resmimetype"">
    <value>text/microsoft-resx</value>
  </resheader>
  <resheader name=""version"">
    <value>2.0</value>
  </resheader>
  <resheader name=""reader"">
    <value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
  </resheader>
  <resheader name=""writer"">
    <value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
  </resheader>
";

which is simply written into the XmlWriter as a raw string:

XmlTextWriter Writer = new XmlTextWriter(this.FormatResourceSetPath(ResourceSet, LocalResources) + Loc, Encoding.UTF8);
Writer.Indentation = 3;
Writer.IndentChar = ' ';
Writer.Formatting = Formatting.Indented;
xWriter = Writer as XmlWriter;

xWriter.WriteRaw(ResXDocumentTemplate);

I suppose that's a very lazy way of doing things but honestly I didn't see much value in actually hand-coding the schema into individual XmlWriter commands.

But the above has a couple of problems. First there's the xml:space issue. There's also a formatting problem - although I have document indentation enabled, the formatting doesn't actually work properly and the XmlWriter ends up writing a streaming document despite the settings. The Xml document needs to be treated as an Xml Fragment rather than a document because the writer doesn't know that the doc header was written with WriteRaw. Finally the end document tag needs to be explicitly written out as raw text.

All of this results in quite an ugly looking, albeit functional document.

So I started experimenting a little bit with writing out this content more cleanly. I could write out the PI and root tag and then just inject the schema with WriteRaw, which improved the document parsing somewhat, but there were still schema issues.

But I think a better way is to do something like this instead - load up the 'template' as a full XmlDocument first:

// *** Load the document schema
XmlDocument doc = new XmlDocument();
doc.LoadXml(ResXDocumentTemplate);

and then write out portions of this XmlDocument into the XmlWriter. So when each ResX file is created there's code like this:

XmlTextWriter Writer = new XmlTextWriter(this.FormatResourceSetPath(ResourceSet, LocalResources) + Loc, Encoding.UTF8);
...
xWriter.WriteStartElement("root");

// *** Write out the schema
doc.DocumentElement.ChildNodes[0].WriteTo(xWriter);

// *** Write out the leading resheader elements
XmlNodeList Nodes = doc.DocumentElement.SelectNodes("resheader");
foreach(XmlNode Node in Nodes)
{
    Node.WriteTo(xWriter);
}

This actually writes out each of the components properly into the XmlWriter which in fact ends up producing a clean and properly constructed XmlWriter document.
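
Put together, the approach can be sketched roughly as follows. This is only an illustration under a few assumptions: the template is a hypothetical, cut-down stand-in for ResXDocumentTemplate (with the root element closed so LoadXml accepts it), the output goes to a string rather than a file, and the class and method names are invented:

```csharp
using System.IO;
using System.Xml;

public static class TemplateCopySketch
{
    // Hypothetical stand-in for ResXDocumentTemplate: a schema element
    // followed by a resheader element inside a closed root.
    const string Template =
@"<?xml version=""1.0"" encoding=""utf-8""?>
<root>
  <xsd:schema id=""root"" xmlns:xsd=""http://www.w3.org/2001/XMLSchema"" />
  <resheader name=""version"">
    <value>2.0</value>
  </resheader>
</root>";

    public static string Generate()
    {
        // Load the template into a DOM first
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Template);

        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        writer.Formatting = Formatting.Indented;
        writer.Indentation = 3;

        writer.WriteStartDocument();
        writer.WriteStartElement("root");

        // Copy the schema (first child of root) through the writer
        doc.DocumentElement.ChildNodes[0].WriteTo(writer);

        // Copy the resheader elements through the writer
        foreach (XmlNode node in doc.DocumentElement.SelectNodes("resheader"))
        {
            node.WriteTo(writer);
        }

        // Resource entries follow; one sample entry here
        writer.WriteStartElement("data");
        writer.WriteAttributeString("name", "Today");
        writer.WriteElementString("value", "Heute ");
        writer.WriteEndElement(); // data

        writer.WriteEndElement(); // root
        writer.WriteEndDocument();
        writer.Flush();

        return output.ToString();
    }
}
```

Because every node passes through the writer, the writer tracks the open root element itself, so the closing tag no longer has to be emitted as raw text.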

This solves all the formatting issues, with the exception of the xml:space issue that started me down this path in the first place. It still fails even with the schema properly written through the XmlWriter.

One solution and probably the cleanest at that is to modify the schema to automatically default to the preserve setting:

          <xsd:element name=""data"">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name=""value"" type=""xsd:string"" minOccurs=""0"" msdata:Ordinal=""1"" />
                <xsd:element name=""comment"" type=""xsd:string"" minOccurs=""0"" msdata:Ordinal=""2"" />
              </xsd:sequence>
              <xsd:attribute name=""name"" type=""xsd:string"" use=""required"" msdata:Ordinal=""1"" />
              <xsd:attribute name=""type"" type=""xsd:string"" msdata:Ordinal=""3"" />
              <xsd:attribute name=""mimetype"" type=""xsd:string"" msdata:Ordinal=""4"" />
              <xsd:attribute ref=""xml:space"" default=""preserve""/>
            </xsd:complexType>

and then simply not generate the attribute. I checked this out with a two-way conversion: saving values with spaces and exporting to ResX, checking that the spaces are still in the XML (they are), and then round tripping the data back into the Resource Provider and double checking the values to ensure that the spaces have made it (they do).

This effectively solves the problem above, but I still wonder how I would go about generating xml:space="preserve" explicitly into the XmlWriter. In other situations modifying the schema generically may simply not be an option.

I'm not sure how to make the XmlWriter recognize the schema properly. It does work for the Resx editor and for parsing the Xml file as input using XmlDocument (ie. it passes validation), but now I am really wondering how to make the proper Xml definition work.

So, how do I get the XmlWriter to generate xml:space="preserve"? Do any of you Xml Wizards know how to do this? <s>

Posted in .NET  Localization  XML  

The Voices of Reason


 

Matt Brooks
August 09, 2007

# re: XmlWriter and Schema

Rick,

The XmlTextWriter.XmlSpace Property (http://msdn2.microsoft.com/en-us/library/system.xml.xmltextwriter.xmlspace(vs.80).aspx) documentation suggests using a different overload of the WriteAttributeString() method. Does this help you?

Regards,
Matt

Kevin Dente
August 09, 2007

# re: XmlWriter and Schema

Is there a reason why you don't just use ResXResourceWriter?

Rick Strahl
August 09, 2007

# re: XmlWriter and Schema

@Matt - ah, yes you pointed me in the right direction. Actually the problem isn't the XmlSpace setting itself but the way I called the WriteAttributeString method. The example for XmlSpace shows how they write out the xml:space attribute. I used xml as a namespace, but they are using xml as a *prefix*.

Changing the code to:
xWriter.WriteAttributeString("xml","space",null,"preserve"); 

fixes the issue and properly generates xml:space="preserve".
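
For comparison, here is a minimal self-contained sketch of the corrected call. It writes to an in-memory StringWriter, and the class and method names are assumptions made for the example. In the four-argument overload the parameters are prefix, local name, namespace and value:

```csharp
using System.IO;
using System.Xml;

public static class XmlSpaceFix
{
    // Writes a single data entry inside a root element and returns the XML text.
    public static string Generate(string resourceId, string value)
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        writer.Formatting = Formatting.Indented;
        writer.Indentation = 3;

        writer.WriteStartElement("root");
        writer.WriteStartElement("data");
        writer.WriteAttributeString("name", resourceId);
        // prefix = "xml", localName = "space", ns = null, value = "preserve"
        writer.WriteAttributeString("xml", "space", null, "preserve");
        writer.WriteElementString("value", value);
        writer.WriteEndElement(); // data
        writer.WriteEndElement(); // root
        writer.Flush();

        return output.ToString();
    }
}
```

The three-argument overload used originally takes the local name, the namespace and the value, which is why "xml" ended up being treated as a namespace URI.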

Rick Strahl
August 09, 2007

# re: XmlWriter and Schema

@Kevin - I can't remember the details now - it had something to do with non-string resources. ASP.NET writes out the files explicitly and I think the resource provider just keeps the binary data directly in the XML as a binary string. I wanted to match the ASP.NET behavior. I haven't looked at this in a while though so it's possible that this can be done right, but the above now works and is only marginally more complicated <s>.

Lav G
August 09, 2007

# re: XmlWriter and Schema


Check out how to validate an XML against an XSD Schema.
http://devlav.blogspot.com/2007_07_01_archive.html.

typo in previous comment :)

Rick Strahl
August 09, 2007

# re: XmlWriter and Schema

@Lav - that's useful, but that assumes you're READING the document. When you're writing how exactly do you validate as you go? The doc's not complete at this point. I don't think the writer actually validates. Above my problem was that I used a namespace instead of a namespace prefix accidentally which is what caused the problem.

I think during generation with XmlWriter/XmlTextWriter all the responsibility for creating the correct XML markup is on the developer - I suppose when you're through you can load the result into a Reader or DOM to make sure the generated doc is valid.
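
A rough sketch of that write-then-validate idea follows. It generates a document with XmlWriter and reads it back through a validating XmlReader. The schema is a deliberately tiny hypothetical one (not the ResX schema), and all class and method names are invented for the example:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class GeneratedXmlCheck
{
    // Hypothetical mini schema: root contains data elements, each with a
    // required name attribute and a value child.
    const string Xsd =
@"<xsd:schema xmlns:xsd=""http://www.w3.org/2001/XMLSchema"">
  <xsd:element name=""root"">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name=""data"" maxOccurs=""unbounded"">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name=""value"" type=""xsd:string"" />
            </xsd:sequence>
            <xsd:attribute name=""name"" type=""xsd:string"" use=""required"" />
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>";

    // Generates a small document; omitting the name attribute makes it invalid.
    public static string Generate(bool includeName)
    {
        StringWriter output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output))
        {
            writer.WriteStartElement("root");
            writer.WriteStartElement("data");
            if (includeName)
                writer.WriteAttributeString("name", "Today");
            writer.WriteElementString("value", "Heute ");
            writer.WriteEndElement(); // data
            writer.WriteEndElement(); // root
        }
        return output.ToString();
    }

    // Reads the generated text back with schema validation turned on and
    // returns the validation error messages (empty list when valid).
    public static List<string> Validate(string xml)
    {
        List<string> errors = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationEventHandler += (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // validation happens while reading
        }
        return errors;
    }
}
```

Here Validate(Generate(true)) should return an empty list, while Validate(Generate(false)) should report the missing required attribute. The check only runs after the document is complete, which matches the point that the writer itself does not validate.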

Tom Groeger
August 10, 2007

# re: XmlWriter and Schema

There is also a problem importing Visual-Studio Resource-Files:

I had to modify ImportResourceFile() from

string Value = Node.ChildNodes[0].InnerText;


to:

string Value = Node.ChildNodes[0].InnerText.Trim();
if (Value == "" && Node.ChildNodes.Count > 1)
    Value = Node.ChildNodes[1].InnerText.Trim();


The original code only gave me blank values in my table; Node.ChildNodes[0].InnerText returned '\r\n' and some whitespace when I imported existing Visual Studio-created Resource-Files. This hack might not be the most elegant way but it works <g>

BTW: Great tool, I love it! The Editor may need some more TLC, but what a progress after working with the VS built-in Editor!

Greetings from Husen, Germany
... Tom Groeger

Brett J
January 07, 2008

# re: XmlWriter and Schema

I had the same problem Tom did; I used this which worked for my resx files:

string Value = Node.SelectSingleNode("value").InnerText;
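
One likely explanation, offered as an assumption rather than something confirmed in this thread: when a data element carries xml:space="preserve", XmlDocument keeps the line break and indentation before the value element as a whitespace node, so the first child is no longer the value element. The sketch below uses a hypothetical sample entry and invented class and method names to contrast the two lookups:

```csharp
using System.Xml;

public static class ResxValueLookup
{
    // Hypothetical sample entry in the shape Visual Studio writes.
    const string Sample =
@"<root>
  <data name=""Today"" xml:space=""preserve"">
    <value>Heute </value>
  </data>
</root>";

    public static XmlNode LoadDataNode()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Sample);
        return doc.SelectSingleNode("/root/data");
    }

    // Index-based lookup: the result depends on what the first child node is.
    public static string ByIndex(XmlNode dataNode)
    {
        return dataNode.ChildNodes[0].InnerText;
    }

    // Name-based lookup: unaffected by whitespace nodes around the value element.
    public static string ByName(XmlNode dataNode)
    {
        return dataNode.SelectSingleNode("value").InnerText;
    }
}
```

Looking the element up by name also avoids calling Trim(), which would strip the leading and trailing spaces that xml:space="preserve" is meant to keep.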

West Wind  © Rick Strahl, West Wind Technologies, 2005 - 2020