Hi Everyone,

This is my first post, so I'd like to say how impressed I am with Symphony so far.

I have a quick question for the XSLT experts out there that I'm struggling to solve. I recently switched the output doctype from XHTML 1.0 Strict to XHTML+RDFa 1.0. Interestingly, this change alone has caused the XSLT processor to switch from closing empty tags in this style...

<div></div>

to the XML-only style

<div/>

Putting any other issues aside, does anyone know why simply switching between these 2 doctypes would do this?

To be more precise, I've replaced:

<xsl:output method="xml"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
    omit-xml-declaration="yes"
    encoding="UTF-8"
    indent="yes" />

with this...

<xsl:output method="xml"
    doctype-public="-//W3C//DTD XHTML+RDFa 1.0//EN"
    doctype-system="http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd"
    omit-xml-declaration="yes"
    encoding="UTF-8"
    indent="yes" />

Switching back confirms it's definitely just this change which is causing the difference in closing tag style. What's also odd is that this occurs when both versions have xml as the output method.

Any input greatly appreciated,

deebee

Ooops, should have stuck this one in the XSLT category. Apologies.

I wonder whether this is somehow related to the XHTML 1.0 Compatibility Guidelines. Perhaps the serializer is detecting the XHTML 1.0 (without RDFa) doctype and attempting to produce XHTML which won't choke HTML based parsers? I'd be interested to find out exactly what functionality is causing this to happen.

After much Googling and flicking through the source of libxslt/libxml2, it seems my suspicion was correct. libxml2 looks for one of the standard XHTML doctypes and, if it finds one, produces output which conforms to the compatibility guidelines. Unfortunately, the hard-coded list of doctypes doesn't include the XHTML+RDFa doctype.

Has anyone had any experience in producing output which has the XHTML + RDFa doctype? I'd be interested to hear how you get around this issue.

Note that my comment is based on pretty fuzzy memory. If anyone has the time and is keen, some fact-checking would be appreciated.

From what I can recall, in older versions of LibXSLT, all empty elements were converted to self-closing elements under the XML output method regardless of doctype. The processor was later updated (some time after 2005, I think) to accommodate XHTML and ensure syntactic validity.

It would seem that XHTML+RDFa does not fall under the same formatting rule as the standard XHTML doctypes.

Heh, I had a response written up and ready to post but went out to a meeting. I posted my comment when I got back and realised you've done the legwork.

Hi Allen,

Thanks for taking the time to reply. I've had a bit more of a look through the libxml code (I'm fairly sure libxslt uses libxml to serialize the result tree into actual XML/HTML). The API which libxml provides (and libxslt presumably uses) has options to override the default behaviour and force libxml to output either pure XML or XML that follows the compatibility guidelines, regardless of the doctype. At least that's what the documentation suggests. Whether libxslt allows the user to make use of this capability, I have yet to find out. It might be available as an option to xsltproc or something, but that's obviously no good for Symphony users!

Just about to go and see what's going on in the actual source to get a better idea of what's possible.

After a bit of investigation, it seems there may be no way to change this behaviour. Although libxml does provide an option to forcibly control whether or not to enable the compatibility guidelines, libxslt doesn't make use of it. It just seems to use libxml's default behaviour, which looks at the doctype and enables the compatibility guidelines if the doctype is one of the following hard-coded values...

#define XHTML_STRICT_PUBLIC_ID BAD_CAST \
   "-//W3C//DTD XHTML 1.0 Strict//EN"
#define XHTML_STRICT_SYSTEM_ID BAD_CAST \
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
#define XHTML_FRAME_PUBLIC_ID BAD_CAST \
   "-//W3C//DTD XHTML 1.0 Frameset//EN"
#define XHTML_FRAME_SYSTEM_ID BAD_CAST \
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"
#define XHTML_TRANS_PUBLIC_ID BAD_CAST \
   "-//W3C//DTD XHTML 1.0 Transitional//EN"
#define XHTML_TRANS_SYSTEM_ID BAD_CAST \
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"

Since this list doesn't include XHTML+RDFa, it ends up outputting pure XML, which is no good when you're serving your pages as text/html and therefore asking the browser to use its tag soup parser, which chokes on the likes of <br/>.
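For anyone who wants to see the logic spelled out, here is a rough model of the check described above. It is a sketch in Python, not libxml2's actual code: the function name is made up, and the assumption that a match on either the public or the system identifier is enough is mine. Only the six identifier strings come from the list quoted above.

```python
# Identifier strings copied from the hard-coded list quoted above.
XHTML_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
}
XHTML_SYSTEM_IDS = {
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd",
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
}


def uses_compat_serialization(public_id, system_id):
    """Hypothetical model of the serializer's decision.

    Assumption: matching either identifier against the list is enough to
    switch on the HTML-compatible output (<div></div> instead of <div/>).
    """
    return public_id in XHTML_PUBLIC_IDS or system_id in XHTML_SYSTEM_IDS


# The two doctypes from the original question.
strict = ("-//W3C//DTD XHTML 1.0 Strict//EN",
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd")
rdfa = ("-//W3C//DTD XHTML+RDFa 1.0//EN",
        "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd")
```

Under this model the Strict doctype is on the list and the XHTML+RDFa one is not, which matches the behaviour seen when switching between the two xsl:output declarations.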

Aside from my issue, what about people who want to output XHTML that's going to be served with the proper XML Content-Type rather than text/html? I appreciate that this isn't usually a good idea, but there are some circumstances when it might be appropriate. I suppose the HTML compatible output is still valid XML, so it would still work, but the perfectionist in me would want to output pure XML.

Well, I've sent my question off to the libxslt mailing list. Will be interesting to see what comes back.

Hi deebee, I'm not sure, but a way around this might be to use a post-parsing hack such as the one that is used for an HTML5 doctype. It's kind of ugly and not what we would like to use, but it gets the job done...

Check out the discussion on Symphony and HTML5, e.g. http://symphony-cms.com/discuss/thread/43003/3/#position-57 and onwards. The HTML5 doctype extension by Bauhause should be easily adapted for your doctype needs...
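To illustrate the general idea of such a post-processing step (this is not the extension's code, which runs inside Symphony in PHP), here is a minimal sketch in Python. The function name and the list of elements to leave self-closed are my own assumptions, and the regular expression assumes attribute values never contain ">" or "<". It rewrites the serialized output so that only elements declared EMPTY in XHTML 1.0 Strict stay self-closed.

```python
import re

# Elements that XHTML 1.0 Strict declares EMPTY; these stay self-closed.
# (Assumed list; extend it if you use the Transitional or Frameset DTDs.)
VOID_ELEMENTS = {"area", "base", "br", "col", "hr",
                 "img", "input", "link", "meta", "param"}

# Matches a self-closed tag such as <div/> or <div class="x" />.
# Assumes attribute values contain no "<" or ">" characters.
SELF_CLOSED = re.compile(r"<([A-Za-z][\w:.-]*)((?:\s+[^<>]*?)?)\s*/>")


def expand_empty_tags(markup):
    """Rewrite <div/> as <div></div>, and <br/> as <br />."""
    def fix(match):
        name, attrs = match.group(1), match.group(2)
        if name.lower() in VOID_ELEMENTS:
            # Space before the slash, as the compatibility guidelines suggest.
            return "<%s%s />" % (name, attrs)
        return "<%s%s></%s>" % (name, attrs, name)
    return SELF_CLOSED.sub(fix, markup)
```

Being a plain text substitution, it will also touch anything that merely looks like a self-closed tag inside comments or CDATA sections, so treat it as a starting point rather than a robust serializer.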

Thanks for the suggestion. It was as easy as you claimed it would be :)
