10:35 am, 23 Dec 03
ocaml + xml
I was bitching to gaal about the usefulness of LogJam's XML output in generating PDFs, when I started Googling for "ocaml xml".
OX uses OCaml's meta-language support to embed XML terms in the language directly (see tutorial for some EBNF¹). They also have some regular-expression-like pattern matching over XML.
CDuce isn't really ocaml-related (besides being written in it) but appears to use DTDs to statically verify / optimize XML processing. They say "a CDuce program can run faster (30% to 60%) than an equivalent XSLT style-sheet" but I don't really understand how valid the comparison is.
I've played with PXP before but it hurt my head. But I think they were doing what seemed to me to be the sensible goal: use the DTD to process the XML directly into a data structure. A straight DOM tree is annoying to work with, because every node must have a list of children, even for the common cases where they only have one string child; using a DTD allows you to specialize your tree to the document format.
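To make the DOM complaint concrete, here is a minimal sketch in plain OCaml. It is not PXP's API; the `node` type, the `entry` record, and the `<entry><subject>..</subject><body>..</body></entry>` layout are all made up for illustration. The idea is that a generic tree forces every element to carry a child list, while a type specialized to the document format stores the single string child directly.

```ocaml
(* Generic DOM-style tree: every element carries a list of children,
   even when it only ever holds one string. *)
type node =
  | Element of string * (string * string) list * node list
  | Text of string

(* Hypothetical type specialized to a made-up entry format:
   <entry><subject>...</subject><body>...</body></entry> *)
type entry = { subject : string; body : string }

(* Find the first child element called [name] whose only child is text. *)
let rec text_child (name : string) : node list -> string option = function
  | [] -> None
  | Element (n, _, [ Text s ]) :: _ when n = name -> Some s
  | _ :: rest -> text_child name rest

(* Convert the generic tree into the specialized record, or fail with None
   if the tree does not have the expected shape. *)
let entry_of_node : node -> entry option = function
  | Element ("entry", _, children) -> begin
      match text_child "subject" children, text_child "body" children with
      | Some subject, Some body -> Some { subject; body }
      | _ -> None
    end
  | _ -> None

(* Example input, written out by hand as a generic tree. *)
let sample =
  Element ("entry", [],
    [ Element ("subject", [], [ Text "ocaml + xml" ]);
      Element ("body", [], [ Text "hello" ]) ])
```

Once you have an `entry`, the rest of the program reads `e.subject` instead of digging through child lists. A DTD-driven tool could presumably generate this kind of conversion for you rather than having you write it by hand.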
(LogJam's XML processing is really painful and makes me feel gross because it's so repetitive. But that's in C and I don't expect much out of C.)
1 I hate it when people introduce language features via EBNF. It's much easier for me to see some examples and infer abstraction than it is for me to see abstraction and generate examples.
The problem I generally have whenever I try to do something interesting with LJ entries serialized to XML is the escaped, tag soup HTML. First you need an XML library to find the HTML content, then an HTML library to parse the tag soup, and finally something which can take the HTML and render it in some useful way such as a PDF.
Three separate processes, one of which has to mimic someone else's undocumented behavior and special cases. Not fun.
but i think you could get by with a pretty simplistic tagsoup->xml processor that handles stuff like <lj user="foo"> specially, in which case you could generate one full xml document and then deal with the pdf stuff (which is independent anyway).
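As a rough sketch of one such special case (an assumption about what "specially" could mean, not how LogJam or LiveJournal actually handle it): the hypothetical function `close_lj_tags` below rewrites every `<lj user="NAME">` into the self-closing `<lj user="NAME"/>`, so that an XML parser would accept it. It only handles that exact spelling of the tag and leaves everything else untouched.

```ocaml
(* Hypothetical helper: turn <lj user="NAME"> into <lj user="NAME"/>.
   Only the exact prefix <lj user=" is recognized; other tag soup is
   copied through unchanged. *)
let close_lj_tags (s : string) : string =
  let prefix = "<lj user=\"" in
  let plen = String.length prefix in
  let n = String.length s in
  let buf = Buffer.create n in
  let rec go i =
    if i >= n then ()
    else if i + plen <= n && String.sub s i plen = prefix then begin
      (* Look for the quote that ends the user name. *)
      match String.index_from_opt s (i + plen) '"' with
      | Some q when q + 1 < n && s.[q + 1] = '>' ->
          (* Copy <lj user="NAME" and close it with /> instead of >. *)
          Buffer.add_string buf (String.sub s i (q + 1 - i));
          Buffer.add_string buf "/>";
          go (q + 2)
      | _ ->
          (* Not the shape we expect; copy one character and move on. *)
          Buffer.add_char buf s.[i];
          go (i + 1)
    end
    else begin
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
  in
  go 0;
  Buffer.contents buf
```

A real tagsoup->xml pass would also have to close unbalanced tags, quote bare attributes and so on; this only covers the one LJ-specific tag mentioned above.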
ocaml + xml
> CDuce isn't really ocaml-related (besides being written in it) but
> appears to use DTDs to statically verify / optimize XML processing.
Well, that will no longer be true, very soon. For the moment it is all undocumented in our CVS, but you can already write a .mli file for a CDuce object file and call the functions you defined in CDuce as if they were in an OCaml module. Hope to have it ready by February.
Cheers
---Beppe--- (Giuseppe.Castagna __at__ ens.fr)