SXML

Filename extension: .sxml, .scm
Type code: TEXT
Type of format: markup language

SXML is an alternative syntax for writing XML data (more precisely, XML Infosets[1]) as S-expressions, to facilitate working with XML data in Lisp and Scheme. An associated suite of tools implements XPath, SAX and XSLT for SXML in Scheme[2][3] and is available in the GNU Guile implementation of that language.

The textual correspondence between XML and SXML for a sample XML snippet is shown below:

XML:

<tag attr1="value1"
     attr2="value2">
  <nested>Text node</nested>
  <empty/>
</tag>

SXML:

(tag (@ (attr1 "value1")
        (attr2 "value2"))
  (nested "Text node")
  (empty))
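
The correspondence above can be reproduced mechanically. The following is a minimal Python sketch, not part of the SXML tool suite: it models an SXML element as a nested Python list whose first item is the tag name, and uses the standard library's xml.etree.ElementTree to parse the XML. The function name to_sxml is hypothetical, and dropping whitespace-only text between elements is an assumption made here so that the result matches the snippet.

```python
import xml.etree.ElementTree as ET

def to_sxml(elem):
    """Convert an ElementTree element into an SXML-like nested list (hypothetical helper)."""
    node = [elem.tag]
    # Attributes become a single child node headed by "@".
    if elem.attrib:
        node.append(["@"] + [[name, value] for name, value in elem.attrib.items()])
    # Assumption: whitespace-only text (indentation) is not significant.
    if elem.text and elem.text.strip():
        node.append(elem.text)
    for child in elem:
        node.append(to_sxml(child))
        if child.tail and child.tail.strip():
            node.append(child.tail)
    return node

xml_text = """<tag attr1="value1"
     attr2="value2">
  <nested>Text node</nested>
  <empty/>
</tag>"""

sxml = to_sxml(ET.fromstring(xml_text))
print(sxml)
```

The printed list has the same shape as the S-expression: the tag name first, then the "@" node holding the attributes, then the child elements.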

Compared to other alternative representations for XML and its associated languages, SXML has the benefit of being directly parsable by existing Scheme implementations. The associated tools and documentation were praised in many respects by David Mertz in his IBM developerWorks column, though he also criticized the preliminary state of the documentation and of the system itself.[4]

Example

Take the following simple XHTML page:

 <html xmlns="http://www.w3.org/1999/xhtml"
         xml:lang="en" lang="en">
    <head>
       <title>An example page</title>
    </head>
    <body>
       <h1 id="greeting">Hi, there!</h1>
       <p>This is just an &gt;&gt;example&lt;&lt; to show XHTML &amp; SXML.</p>
    </body>
 </html>

After translating it to SXML, the same page now looks like this:

 (*TOP* (@ (*NAMESPACES* (x "http://www.w3.org/1999/xhtml")))
  (x:html (@ (xml:lang "en") (lang "en"))
    (x:head
       (x:title "An example page"))
    (x:body
       (x:h1 (@ (id "greeting")) "Hi, there!")
       (x:p  "This is just an >>example<< to show XHTML & SXML."))))

Each element's tag pair is replaced by a set of parentheses. The tag's name is not repeated at the end; it is simply the first symbol in the list. The element's contents follow, and are either elements themselves or strings. XML attributes need no special syntax: in SXML they are represented as just another node, which has the special name @. This cannot clash with an actual "@" tag, because @ is not allowed as a tag name in XML. This is a common pattern in SXML: whenever a tag indicates a special status or something that is not possible in XML, its name is chosen so that it is not a valid XML identifier (for example, *TOP* and *NAMESPACES* above).
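
The structure just described (name first, optional @ node next, contents after that) can be illustrated with a few accessor functions. This is a hedged Python sketch that again models SXML elements as nested lists; the names element_name, attributes and children are hypothetical and do not correspond to functions in any SXML library.

```python
def _has_attr_node(node):
    """True if the second item of the element is an attribute node headed by "@"."""
    return (len(node) > 1 and isinstance(node[1], list)
            and len(node[1]) > 0 and node[1][0] == "@")

def element_name(node):
    """The tag name is simply the first item of the list."""
    return node[0]

def attributes(node):
    """Return the attributes as a dict, or an empty dict if there is no "@" node."""
    if _has_attr_node(node):
        return {name: value for name, value in node[1][1:]}
    return {}

def children(node):
    """Return the element's contents: child elements (lists) and text (strings)."""
    return node[2:] if _has_attr_node(node) else node[1:]

# The h1 element of the example page, written as a nested list.
h1 = ["x:h1", ["@", ["id", "greeting"]], "Hi, there!"]

print(element_name(h1))
print(attributes(h1))
print(children(h1))
```

Because "@" can never be a real element name, checking the head of the second item is enough to tell an attribute node from an ordinary child.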

We can also see that there is no need to escape otherwise meaningful characters such as & and > as the entities &amp; and &gt;. String content is considered pure content with no tags or entities in it, so it is escaped automatically when the SXML is written out as XML. This also makes it much easier to insert autogenerated content, and removes the danger of forgetting to escape user input before displaying it to other users (which could lead to cross-site scripting attacks and similar problems).
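
To make the escaping point concrete, here is a minimal Python sketch of a serializer for the nested-list model of SXML. It is an illustration only, not the serializer of any SXML implementation: the function name to_xml is hypothetical, and escaping is delegated to escape and quoteattr from the standard library module xml.sax.saxutils.

```python
from xml.sax.saxutils import escape, quoteattr

def to_xml(node):
    """Serialize an SXML-like nested list to an XML string (hypothetical helper)."""
    if isinstance(node, str):
        # Text is pure content, so markup characters are escaped here, in one place.
        return escape(node)
    name, rest = node[0], node[1:]
    attrs = ""
    if rest and isinstance(rest[0], list) and rest[0] and rest[0][0] == "@":
        # quoteattr adds the surrounding quotes and escapes the value.
        attrs = "".join(f" {key}={quoteattr(value)}" for key, value in rest[0][1:])
        rest = rest[1:]
    if not rest:
        return f"<{name}{attrs}/>"
    body = "".join(to_xml(child) for child in rest)
    return f"<{name}{attrs}>{body}</{name}>"

p = ["p", "This is just an >>example<< to show XHTML & SXML."]
print(to_xml(p))
# <p>This is just an &gt;&gt;example&lt;&lt; to show XHTML &amp; SXML.</p>

# Untrusted input is just another string, so it cannot introduce markup.
comment = ["p", "<script>alert(1)</script>"]
print(to_xml(comment))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The design choice worth noting is that the author of the tree never escapes anything by hand; the distinction between lists (elements) and strings (text) carries all the information the serializer needs.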

Citations

  1. Kiselyov, Oleg (2002). "SXML Specification". ACM SIGPLAN Notices 37 (6): 52–58. doi:10.1145/571727.571736.
  2. Kiselyov, Oleg; Lisovsky, Kirill (2002). XML, XPath, XSLT Implementations as SXML, SXPath, and SXSLT (PDF). International Lisp Conference.
  3. Kiselyov, Oleg; Krishnamurthi, Shriram (2003). SXSLT: Manipulation Language for XML. Practical Aspects of Declarative Languages. Lecture Notes in Computer Science. pp. 256–272. doi:10.1007/3-540-36388-2_18. ISBN 978-3-540-00389-2.
  4. Mertz, David (23 October 2003). "XML Matters: Investigating SXML and SSAX". IBM developerWorks. Archived from the original on 4 December 2004. Retrieved 10 January 2015.

This article is issued from Wikipedia, version of Saturday, August 29, 2015. The text is available under the Creative Commons Attribution/Share Alike license, but additional terms may apply for the media files.