There have been many online discussions about how Visual Studio messes up the formatting of HTML source code, and I must admit I have been involved in a few of them, on Mikhail Arkhipov’s Weblog for instance. He explains that the responsibility for the code mangling lies with MSHTML.DLL: its tokeniser and grammar analyser recreate the source HTML from the browser output (in Visual Studio’s design view) when you switch back to HTML view.
It appears that MSHTML.DLL is not fully XHTML compliant (not that I am suggesting anyone said it was). Consider the following case:
<UL>
<LI>test1</LI>
<LI>test2</LI>
<LI>test3</LI>
</UL>
Now, I would expect the indentation to be stripped along with the odd CR/LF, only to have extra CR/LFs inserted elsewhere. This is what I normally get, so there would be no surprises there. But the removal of closing tags? The extra whitespace raised an eyebrow too…
The new source as created by VS.NET after switching to design view and back again:
<UL>
<LI>test1
<LI>test2
<LI>test3 </LI></UL>
Does this not break one of the XHTML commandments, namely that every non-empty element must have a closing tag?
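For anyone who wants to check this without a full validator, here is a minimal sketch in Python. It only tests XML well-formedness, which is a prerequisite for valid XHTML but not the whole story. The two strings are the snippets from this post; the helper name `is_well_formed` is my own invention and has nothing to do with Visual Studio or MSHTML.

```python
import xml.etree.ElementTree as ET

# The list as I typed it in HTML view.
ORIGINAL = """<UL>
<LI>test1</LI>
<LI>test2</LI>
<LI>test3</LI>
</UL>"""

# The list after a round trip through design view.
MANGLED = """<UL>
<LI>test1
<LI>test2
<LI>test3 </LI></UL>"""


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Unclosed <LI> elements end up here as a mismatched-tag error.
        return False


print(is_well_formed(ORIGINAL))  # True
print(is_well_formed(MANGLED))   # False
```

Passing this check would not make the markup valid XHTML (a real validator also checks the DTD, and XHTML wants lowercase element names), but failing it is enough to rule validity out.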
So, as a word of warning, don’t expect VS.NET projects to get you a big thumbs up from any XHTML validator. I guess I’ll just have to wait anxiously (but not without shame, I hasten to add) for the arrival of Whidbey (VS 2005), which claims to give you the option to leave your code untouched. Well, almost: at least, the solution has been to update the original HTML only with the changes that MSHTML makes to that code.
But how significant must a “change” be before it’s considered necessary to propagate it back to the user’s code? I would like to assume that Whidbey is XHTML compliant and doesn’t remove closing tags, but then I assumed for some time that Visual Studio 2003 was. From Mikhail’s article it appears that formatting as the user intended will prevail, but apart from that, surely Whidbey will assume command?
Maybe I should crack open a beta or two…
