XML Invalid Data and Byte Order Marker

While working on some InfoPath and Workflow code, I got bitten again by the Byte Order Marker, and I felt like I should document what’s going on. I was getting the exception “The data at the root level is invalid. Line 1, position 1.” Here’s why:

The XML encoding that InfoPath uses is UTF-8, and the file it saves starts with a byte order marker (BOM): the three bytes EF BB BF, which decode to the single character U+FEFF. When you decode the file to a string and hand it to XmlDocument.LoadXml, that character comes first and the parser is confused. It expects the XML declaration or the root element to appear at the very first character of the string you provide. It’s simple to deal with, but frustrating to find. This code creates a new XmlDocument, extracts the file contents from SharePoint, and loads them into the document, skipping the BOM.

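XmlDocument wfDoc = new XmlDocument();
byte[] fileBytes = wfFile.OpenBinary();
// GetString keeps the BOM: it shows up as U+FEFF at index 0
string fileAsString = (new System.Text.UTF8Encoding()).GetString(fileBytes);
// Skip the Byte Order Marker only if it is actually there
if (fileAsString.Length > 0 && fileAsString[0] == '\uFEFF')
    fileAsString = fileAsString.Substring(1);
wfDoc.LoadXml(fileAsString);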
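If you want to see the failure in isolation, here is a small sketch that reproduces it without SharePoint. The class and method names (BomRepro, WithBom, TryLoadXml) are made up for illustration; only the framework calls are real.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Xml;

public static class BomRepro
{
    // Hypothetical helper: UTF-8 bytes for the XML with the BOM (EF BB BF) in front,
    // which stands in for a file saved with a byte order marker.
    public static byte[] WithBom(string xml)
    {
        return Encoding.UTF8.GetPreamble()
            .Concat(Encoding.UTF8.GetBytes(xml))
            .ToArray();
    }

    // Hypothetical helper: true if LoadXml accepts the text,
    // false if it throws an XmlException.
    public static bool TryLoadXml(string text)
    {
        try
        {
            new XmlDocument().LoadXml(text);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

Decoding BomRepro.WithBom("<root/>") with UTF8Encoding.GetString gives a string whose first character is U+FEFF. TryLoadXml fails on that string and succeeds once the first character is dropped.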

A better approach is to hand off the stream to XmlDocument:

XmlDocument wfDoc = new XmlDocument();
using (Stream wfFileStrm = wfFile.OpenBinaryStream())
{
    wfDoc.Load(wfFileStrm);
}

This will load fine without stripping the Byte Order Marker, because XmlDocument.Load detects the encoding from the stream and handles the BOM itself. In my case, though, this isn’t supported in SharePoint Sandbox code because the System.IO.Stream type isn’t allowed.
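If streams are off the table, another option is to skip the BOM at the byte level before decoding. The sketch below assumes the file really is UTF-8, and the names (XmlBytesLoader, LoadSkippingBom) are hypothetical. It only uses byte arrays, Encoding and XmlDocument, the same types as the first snippet, but I haven't verified it inside the Sandbox.

```csharp
using System;
using System.Text;
using System.Xml;

public static class XmlBytesLoader
{
    // Hypothetical helper: decodes UTF-8 bytes, skipping a leading
    // BOM (EF BB BF) if present, and loads the result as XML.
    public static XmlDocument LoadSkippingBom(byte[] fileBytes)
    {
        byte[] bom = Encoding.UTF8.GetPreamble(); // EF BB BF
        int offset = 0;

        if (fileBytes.Length >= bom.Length)
        {
            bool hasBom = true;
            for (int i = 0; i < bom.Length; i++)
            {
                if (fileBytes[i] != bom[i])
                {
                    hasBom = false;
                    break;
                }
            }
            if (hasBom)
            {
                offset = bom.Length;
            }
        }

        // Decode everything after the BOM (or the whole array if there was none).
        string xml = Encoding.UTF8.GetString(fileBytes, offset, fileBytes.Length - offset);

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc;
    }
}
```

With the original code this would be called as XmlBytesLoader.LoadSkippingBom(wfFile.OpenBinary()). Files without a BOM load unchanged, since the offset stays at zero.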

2 Comments

Edward Wilde
April 29, 2010 at 10:11 am
Ah the beloved BOM. We’ve all been there :)

Reminds me of a handy extension method I wrote for detecting and removing ‘the mark’

http://blogs.edwardwilde.com/2009/09/09/xslcompiledtransform-the-utf-8-bom-systemxmlxmlexception-data-at-the-root-level-is-invalid-line-1-position-1/
Kevin Dostalek
November 14, 2011 at 4:40 pm
Thanks… helped
