<slides>
<title>Qexo</title>
<slide id="splash">
<caption>Compiling XQuery to Java bytecodes</caption>
<h2>Per Bothner</h2>
<h3><code>&lt;per@bothner.com&gt;</code></h3>
<h5>June 2004</h5>
</slide>

<slide id="Qexo-Kawa">
<caption>Qexo and Kawa</caption>
<ul>
<li>Qexo compiles an XQuery module to Java bytecode.</li>
<li>No <q>plan</q> - eager <q>natural</q> evaluation.</li>
<li>Provides full access to Java libraries.</li>
<li>Qexo is based on and part of Kawa, which was originally (1996)
written to compile the functional language Scheme.</li>
<li>Open-source GNU software with a liberal license.</li>
</ul>
</slide>

<slide id="push-or-pull">
<caption>Lazy or eager evaluation</caption>
<ul>
<li>Many implementations are <i>demand-driven</i>:
a client requests (pulls) successive result items.</li>
<li>An expression is evaluated only when its result is needed.</li>
<li>Good strategy for certain queries, but large interpretative overhead.</li>
<li>Qexo's eager evaluation maps XQuery program structure
more directly into VM state.</li>
<li>Avoids interpretative overhead.</li>
<li>Allows more natural debugging.</li>
</ul>
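<p>A minimal sketch of the contrast, not Qexo's actual code. The names
<tt>ItemSource</tt>, <tt>ItemSink</tt>, <tt>PullSquares</tt> and
<tt>pushSquares</tt> are hypothetical. The pull version must save its
position in a field between requests; the push version keeps it in a
local variable and writes each item as soon as it is computed.</p>
<pre>
```java
// Illustrative only: these names are hypothetical, not part of Qexo or Kawa.
class PushOrPull {
    interface ItemSource { boolean hasNext(); Object next(); }
    interface ItemSink { void write(Object item); }

    // Pull style: the client requests one item at a time, so the
    // producer must keep its position in a field between requests.
    static class PullSquares implements ItemSource {
        private int n = 1;
        private final int limit;
        PullSquares(int limit) { this.limit = limit; }
        public boolean hasNext() { return limit >= n; }
        public Object next() {
            Object result = Integer.valueOf(n * n);
            n++;
            return result;
        }
    }

    // Push style: the position is an ordinary local variable, and each
    // result is written to the sink as soon as it is computed.
    static void pushSquares(int limit, ItemSink sink) {
        for (int n = 1; limit >= n; n++) {
            sink.write(Integer.valueOf(n * n));
        }
    }

    static String pulled(int limit) {
        StringBuilder sb = new StringBuilder();
        ItemSource source = new PullSquares(limit);
        while (source.hasNext()) sb.append(source.next()).append(' ');
        return sb.toString().trim();
    }

    static String pushed(int limit) {
        StringBuilder sb = new StringBuilder();
        pushSquares(limit, item -> sb.append(item).append(' '));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(pulled(3)); // 1 4 9
        System.out.println(pushed(3)); // 1 4 9
    }
}
```
</pre>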
</slide>

<slide id="events">
<caption>Event-driven data</caption>
<ul>
<li>Qexo creates explicit nodes and sequences only when needed.</li>
<li>Otherwise, nodes and values are <q>written</q> using SAX-like <q>events</q> to an abstract <tt>Consumer</tt> interface.</li>
</ul>
<pre>
for $i in (10, 20) return &lt;p&gt;{$i+1}&lt;/p&gt;
</pre>
<p>becomes</p>
<pre>
// Schematic Java for the query above.
void main (Consumer output) {
  temp_1(10, output);
  temp_1(20, output);
}
// Body of the for-expression, called once per item.
void temp_1 (Object i, Consumer output) {
  output.beginElement("p");
  output.writeItem(NumOps.add(i, 1));
  output.endElement("p");
}
</pre>
</slide>

<slide id="Nodes">
<caption>Node representation</caption>
<ul>
<li>A document (fragment) is stored in a <tt>TreeList</tt>.</li>
<li><tt>TreeList</tt> is a compact representation using 2 arrays.</li>
<li>Space-efficient, improves locality and GC times.</li>
<li>A node is an index into a <tt>TreeList</tt> array.</li>
<li>Abstraction layer allows for other representations, such
as database keys.</li>
<li>Wrapper classes provide a subset of <tt>org.w3c.dom</tt> functionality.</li>
</ul>
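<p>To make the idea concrete, here is a rough sketch of a tree kept in
two arrays, with a node being just an index. The class
<tt>CompactTree</tt> and its encoding are invented for illustration;
the real <tt>TreeList</tt> layout is different and more compact.</p>
<pre>
```java
// Hypothetical simplified layout; the real TreeList encoding differs.
class CompactTree {
    static final int BEGIN = 1, END = 2, TEXT = 3;
    private int[] data = new int[16];          // structure: opcodes and operands
    private Object[] objects = new Object[8];  // element names and text values
    private int dataLen = 0, objLen = 0;

    private void emit(int word) {
        if (dataLen == data.length) data = java.util.Arrays.copyOf(data, 2 * dataLen);
        data[dataLen++] = word;
    }

    private int store(Object o) {
        if (objLen == objects.length) objects = java.util.Arrays.copyOf(objects, 2 * objLen);
        objects[objLen] = o;
        return objLen++;
    }

    // A node is simply the index in data where the element starts.
    int beginElement(String name) {
        int node = dataLen;
        emit(BEGIN);
        emit(store(name));
        return node;
    }

    void endElement() { emit(END); }

    void writeText(Object value) { emit(TEXT); emit(store(value)); }

    String nameOf(int node) { return (String) objects[data[node + 1]]; }

    // Concatenated text of the element that starts at index node.
    String stringValue(int node) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        int i = node;
        do {
            switch (data[i]) {
                case BEGIN: depth++; i += 2; break;
                case TEXT:  sb.append(objects[data[i + 1]]); i += 2; break;
                default:    depth--; i += 1; break;   // END
            }
        } while (depth != 0);
        return sb.toString();
    }

    public static void main(String[] args) {
        CompactTree t = new CompactTree();
        int div = t.beginElement("div");
        int p = t.beginElement("p");
        t.writeText("11");
        t.endElement();
        t.writeText("!");
        t.endElement();
        System.out.println(t.nameOf(p) + ": " + t.stringValue(p));     // p: 11
        System.out.println(t.nameOf(div) + ": " + t.stringValue(div)); // div: 11!
    }
}
```
</pre>
<p>No per-node objects are allocated; the whole tree lives in the two
arrays, so a scan touches only contiguous memory.</p>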
</slide>

<slide id="Compiler-overview">
<caption>Compiler overview</caption>
<ol>
<li>Parsing.</li>
<li>QName expansion.</li>
<li>Module import.</li>
<li>Name resolution.</li>
<li>Analysis and optimization.</li>
<li>Code generation.</li>
<li>Output of bytecode to <tt>.class</tt> files or
for immediate execution.</li>
</ol>
</slide>

<slide id="Expressions">
<caption>Expressions</caption>
<ul>
<li>Parsing yields an <tt>Expression</tt> tree.</li>
<li>Representation is language-independent.</li>
<li>XQuery operations are represented by calls to built-in functions.</li>
<li>Control structures like <tt>for</tt> are represented as a
<i>mapping</i> function that applies a compiler-generated function
to each item of a sequence.</li>
<li>Analysis and optimization are performed on <tt>Expression</tt>s.</li>
<li>Hooks for special optimizations and inlining of built-in functions.</li>
</ul>
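<p>As a rough illustration of this representation, the sketch below
builds a tree for <tt>for $i in (10, 20) return $i + 1</tt> as a call
to a mapping built-in. The node classes (<tt>Quote</tt>, <tt>Ref</tt>,
<tt>Lambda</tt>, <tt>Apply</tt>) and the built-in names <tt>map</tt>
and <tt>sequence</tt> are assumptions made for this example, not
Kawa's actual class hierarchy.</p>
<pre>
```java
// Hypothetical node classes for illustration; Kawa's real Expression
// hierarchy is richer than this.
class ExprSketch {
    static abstract class Expression { abstract String show(); }

    static class Quote extends Expression {        // literal value
        final Object value;
        Quote(Object value) { this.value = value; }
        String show() { return String.valueOf(value); }
    }

    static class Ref extends Expression {          // variable reference
        final String name;
        Ref(String name) { this.name = name; }
        String show() { return "$" + name; }
    }

    static class Lambda extends Expression {       // compiler-generated function
        final String param;
        final Expression body;
        Lambda(String param, Expression body) { this.param = param; this.body = body; }
        String show() { return "function($" + param + "){" + body.show() + "}"; }
    }

    static class Apply extends Expression {        // call to a built-in function
        final String function;
        final Expression[] args;
        Apply(String function, Expression... args) { this.function = function; this.args = args; }
        String show() {
            StringBuilder sb = new StringBuilder(function).append('(');
            for (int k = 0; k != args.length; k++) {
                if (k != 0) sb.append(", ");
                sb.append(args[k].show());
            }
            return sb.append(')').toString();
        }
    }

    // Tree for:  for $i in (10, 20) return $i + 1
    static Expression forExample() {
        Expression body = new Apply("+", new Ref("i"), new Quote(1));
        Expression items = new Apply("sequence", new Quote(10), new Quote(20));
        return new Apply("map", new Lambda("i", body), items);
    }

    public static void main(String[] args) {
        // Prints: map(function($i){+($i, 1)}, sequence(10, 20))
        System.out.println(forExample().show());
    }
}
```
</pre>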
</slide>

<slide id="Code-generation">
<caption>Code generation</caption>
<ul>
<li>Generate bytecode by recursively walking <tt>Expression</tt>.</li>
<li>Default strategy: leave expression result on JVM stack.</li>
<li>Outer expression can ask to leave result elsewhere,
perhaps as event calls on a passed-in <tt>Consumer</tt>.</li>
<li>Can generate custom bytecode for built-in functions.</li>
</ul>
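<p>The choice of result location can be sketched with a toy generator.
This is an assumption-laden illustration: it emits readable
pseudo-instructions rather than JVM bytecode, and <tt>Target</tt>,
<tt>Expr</tt> and the instruction names are invented here. It shows
the default (value left on the stack) and a sequence expression that
supplies custom code when asked to write to a consumer.</p>
<pre>
```java
// Hypothetical mini code generator. It emits readable pseudo-instructions,
// not real JVM bytecode, and does not use Kawa's actual classes.
class CodeGenSketch {
    enum Target { STACK, CONSUMER }

    static abstract class Expr {
        // Emit code that leaves this expression's value on the stack.
        abstract void compileToStack(StringBuilder code);

        // Default: compute on the stack, then pass the value on if the
        // enclosing expression asked for consumer output.
        void compile(Target target, StringBuilder code) {
            compileToStack(code);
            if (target == Target.CONSUMER) code.append("write-to-consumer\n");
        }
    }

    static class Const extends Expr {
        final int value;
        Const(int value) { this.value = value; }
        void compileToStack(StringBuilder code) {
            code.append("push ").append(value).append('\n');
        }
    }

    static class Add extends Expr {
        final Expr left, right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
        void compileToStack(StringBuilder code) {
            left.compile(Target.STACK, code);
            right.compile(Target.STACK, code);
            code.append("add\n");
        }
    }

    // Sequence (a, b): custom code for the consumer target lets each item
    // write itself directly, so no temporary sequence object is built.
    static class Sequence extends Expr {
        final Expr[] items;
        Sequence(Expr... items) { this.items = items; }
        void compileToStack(StringBuilder code) {
            code.append("new-sequence\n");
            for (Expr item : items) {
                item.compile(Target.STACK, code);
                code.append("append-to-sequence\n");
            }
        }
        @Override
        void compile(Target target, StringBuilder code) {
            if (target == Target.CONSUMER) {
                for (Expr item : items) item.compile(Target.CONSUMER, code);
            } else {
                compileToStack(code);
            }
        }
    }

    static String generate(Expr e, Target target) {
        StringBuilder code = new StringBuilder();
        e.compile(target, code);
        return code.toString();
    }

    public static void main(String[] args) {
        Expr query = new Sequence(new Const(1), new Add(new Const(2), new Const(3)));
        System.out.print(generate(query, Target.CONSUMER));
    }
}
```
</pre>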
</slide>

<slide id="for-expressions">
<caption>Compiling <tt>for</tt>-expressions</caption>
<ul>
<li><tt>FLWOR</tt> expressions are hardest to optimize.</li>
<li>Treat <tt>return</tt> body as anonymous function,
called as needed from <tt>for</tt> clause.</li>
<li>More general solution: compile <tt>return</tt> body
to new <tt>Consumer</tt> class.</li>
<li>No attempt yet at join optimizations or using indexes.</li>
</ul>
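<p>One possible reading of the <q>new <tt>Consumer</tt> class</q>
approach, sketched under assumptions: the <tt>return</tt> body becomes
a class that receives each item of the <tt>for</tt> sequence as an
event and writes its own result to the outer consumer. The interface
<tt>ItemConsumer</tt> and the class <tt>ReturnBody</tt> are
hypothetical names, not Qexo's generated code.</p>
<pre>
```java
// Hypothetical sketch of:  for $i in (10, 20) return $i + 1
class ForBodyConsumer {
    interface ItemConsumer { void writeItem(Object item); }

    // Compiled form of the return body: each incoming item is bound
    // to $i, and the body's result goes to the outer consumer.
    static class ReturnBody implements ItemConsumer {
        private final ItemConsumer out;
        ReturnBody(ItemConsumer out) { this.out = out; }
        public void writeItem(Object i) {
            out.writeItem(Integer.valueOf(((Integer) i).intValue() + 1));
        }
    }

    // Compiled form of the sequence expression (10, 20): it pushes its
    // items to whatever consumer it is given.
    static void sequence(ItemConsumer out) {
        out.writeItem(Integer.valueOf(10));
        out.writeItem(Integer.valueOf(20));
    }

    static String run() {
        StringBuilder sb = new StringBuilder();
        // The for-expression: feed the sequence into the body consumer.
        sequence(new ReturnBody(item -> sb.append(item).append(' ')));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 11 21
    }
}
```
</pre>
<p>The sequence never exists as an object: each item flows through the
body as soon as it is produced.</p>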
</slide>

<slide id="functions">
<caption>Compiling functions</caption>
<ul>
<li>XQuery function compiled to Java method.</li>
<li>Each XQuery parameter becomes a Java parameter.</li>
<li>Compiler adds a <tt>CallContext</tt> parameter.</li>
<li>Method result is <tt>void</tt>; result passed to caller
using SAX-like event calls on <tt>CallContext</tt>.</li>
<li>Query body treated as a zero-parameter function.</li>
</ul>
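<p>The calling convention above might look roughly like the following.
The class <tt>Ctx</tt> is a hypothetical stand-in for
<tt>CallContext</tt> that only collects written items, and
<tt>local:inc</tt> is an invented example function.</p>
<pre>
```java
// Hypothetical sketch; Ctx stands in for the real CallContext.
class FunctionSketch {
    static class Ctx {
        final StringBuilder out = new StringBuilder();
        void writeItem(Object item) { out.append(item).append(' '); }
    }

    // declare function local:inc($x) { $x + 1 }
    // One Java parameter per XQuery parameter, plus the added context.
    // The method is void: the result is written to the context.
    static void inc(Object x, Ctx ctx) {
        ctx.writeItem(Integer.valueOf(((Integer) x).intValue() + 1));
    }

    // Query body (local:inc(10), local:inc(20)) as a zero-parameter function.
    static void body(Ctx ctx) {
        inc(Integer.valueOf(10), ctx);
        inc(Integer.valueOf(20), ctx);
    }

    static String run() {
        Ctx ctx = new Ctx();
        body(ctx);
        return ctx.out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 11 21
    }
}
```
</pre>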
</slide>

<slide id="tail-calls">
<caption>Tail calls</caption>
<ul>
<li>A <dfn>tail-call</dfn> is a function call that is the last
expression evaluated in a function.</li>
<li>It is desirable that tail-calls not grow the stack.</li>
<li>This is tricky on the JVM, which has no tail-call instruction.</li>
<li>Solution is to split function call into 3 parts:
<ol>
<li>Evaluate the arguments; save them and the called function in a well-known location.</li>
<li>Return from method compiled from current function.</li>
<li>Call the function using an outer driver loop.</li>
</ol>
</li>
</ul>
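<p>The three steps above can be sketched as a small trampoline. This is
an illustration of the technique, not Kawa's implementation:
<tt>Proc</tt>, <tt>Ctx</tt> and the example function <tt>SUM</tt> are
hypothetical, and arguments are simplified to two <tt>long</tt>
fields.</p>
<pre>
```java
// Hypothetical trampoline sketch; names and fields are illustrative.
class TailCallSketch {
    interface Proc { void apply(Ctx ctx); }

    // The well-known location for a pending call and its arguments.
    static class Ctx {
        Proc next;        // function to call next, or null when done
        long arg1, arg2;  // saved arguments
        long result;
    }

    // sum(n, acc) = if n == 0 then acc else sum(n - 1, acc + n)
    static final Proc SUM = new Proc() {
        public void apply(Ctx ctx) {
            long n = ctx.arg1, acc = ctx.arg2;
            if (n == 0) {
                ctx.result = acc;
                return;
            }
            // Step 1: save the arguments and the called function.
            ctx.arg1 = n - 1;
            ctx.arg2 = acc + n;
            ctx.next = this;
            // Step 2: return, which pops this method's stack frame.
        }
    };

    // Step 3: the driver loop performs each pending call in turn.
    static long run(Proc f, long a1, long a2) {
        Ctx ctx = new Ctx();
        ctx.next = f;
        ctx.arg1 = a1;
        ctx.arg2 = a2;
        while (ctx.next != null) {
            Proc p = ctx.next;
            ctx.next = null;
            p.apply(ctx);
        }
        return ctx.result;
    }

    public static void main(String[] args) {
        // A million tail-calls without growing the Java stack.
        System.out.println(run(SUM, 1000000L, 0L)); // 500000500000
    }
}
```
</pre>
<p>A direct recursive Java method with the same depth would likely
overflow the stack; here the stack depth stays constant.</p>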
</slide>

<slide id="Constructors">
<caption>Node constructors</caption>
<ul>
<li>Node constructors are compiled to SAX-like output method calls.</li>
<li>Nodes written to passed-in <tt>Consumer</tt> object.</li>
<li>When nodes need to be materialized, the target is a <tt>TreeList</tt>.</li>
<li>Outermost <tt>Consumer</tt> is initially XML serializer.</li>
<li>Often constructors just cause output directly to serializer.</li>
<li>No global analysis needed.</li>
</ul>
</slide>

<slide id="Extensions">
<caption>Extensions</caption>
<ul>
<li>A Java <tt>Object</tt> without special XQuery meaning is
an atomic item.</li>
<li>Convenient namespace-based syntax for calling Java methods.</li>
<li>Interactive command (read-eval-print) console.</li>
<li>A query can be directly compiled to a servlet.</li>
</ul>
</slide>

<slide id="Web-application">
<caption>Web applications</caption>
<ul>
<li>Web server can execute XQuery programs on client request.</li>
<li>Trivial deployment.  Query is automatically compiled for
high performance.</li>
<li>Expression's value is serialized (by default as XHTML);
becomes server's response.</li>
<li>Query can set response parameters by prepending special values.</li>
<li>Utility functions/syntax to access query parameters.</li>
</ul>
</slide>

<slide id="Status">
<caption>Status</caption>
<ul>
<li>A useful and growing subset of the November '03 draft works.</li>
<li>I use Qexo to generate linked web pages for my personal photo album.</li>
<li>Performance decent but hard to measure as startup-time dominates.</li>
<li>Available now from <a href="http://www.gnu.org/software/qexo">http://www.gnu.org/software/qexo</a>.</li>
</ul>
</slide>

</slides>
