Description of bidirectional UnQL+/UnCAL demonstration

BiG team

Aug. 2009

1 Overview

This document describes how to use our demonstration page of bidirectionalized UnQL+/UnCAL[5,4,3]. UnCAL is a graph algebra that is described in [1].

2 Quick Tour

  1. On the demonstration page, you can select the source DB (and the image of it) at the beginning. Please leave this box intact at the moment.
  2. About the middle of the page, you can select UnQL+ or UnCAL queries by the list box. Please also leave this box intact.

  3. Push ``Select Source DB and Query Expression'' button.

  4. The image of the selected input DB as well as a text representation of the query chosen appears with the buttons ``Submit'' and ``Reset''. There are several options that can be specified by several check boxes above the buttons, but leave these intact as well and push the ``Submit'' button to issue the query displayed.

  5. On the ``UnQL+/UnCAL expression evaluation result'' page, you will see sections ``Selected source DB'', ``Executed command'', ``Message by processor:'', and ``Result graph''. The ``Result graph'' section shows the result of the forward evaluation. You can also see an image ``Sample graph after editing'', which is the sample presented by us as a modified result. This default image shows the modification of the edge labeled ``d'' to ``ModD'' from node n3 to n2, as you can see that label displayed in red.

  6. Press ``Submit'' button to initiate backward transformation. You can optionally perform your own editing. Please refer to section 8 for detail.

  7. On the ``UnCAL expression backward evaluation result'' page, you will see ``Updated source DB'' image at the bottom, which reflect the modified environment by the backward evaluator. If you compare this image with the image of the source DB that is also displayed in the same page, you can see the ``d'' edge from node n3 to n0 turned into ``ModD'', which reflect the sample editing described above.

3 About Graph Representation

In this demonstration, graphs are represented by the output of DOT and DOTTY system1, graph layout products developed by AT&T.

Nodes consist of three components, two of which are missing in normal nodes. Uppermost part shows input markers surrounded by curly braces ({}) if any. The middle part, which is always present, shows id. The bottom part shows output markers rendered in the same way as input markers, if any. Figure 1 shows an example. It includes a cycle which consists of the edge labeled with ``c'' from node n1 to itself. A default input marker &   is shown at the root node n5 explicitly.

Figure 1: A simple graph
\includegraphics[width=4.0cm]{bd_db.uncal_elliptic.eps}

Readers who had read an old version of this document may have noticed the difference in the node format. Previously, nodes are rendered in rectangles, in which markers and id's are separated by horizontal lines as shown in Figure 2. This format is still supported and can be activated by setting global node ``shape'' attribute to ``record'' in dot file.

Figure 2: A simple graph (record node mode)
\includegraphics[width=4.0cm]{bd_db.uncal.eps}

4 How Backward Transformation Works

The bidirectional evaluator enables propagation of modification of the result of forward transformation. This section explains the mechanism.

From a theoretical point of view, the forward evaluator produces a result graph under a set of variable bindings, which consists of pairs of variable name and bound (graph) value. The pairs are collectively called environment. At the beginning of the evaluation, we assume only one2 global variable $db, which is bound to an input graph, resides in the environment.

During the backward evaluation, the evaluator produces, using updated result graph, original environment and input expression, a new environment in which the bound values (graphs) reflect the modification by users on the result graph. We can extract the modified input graph by extracting the value bound to $db.

As for limitation, you can edit only the portion that comes from the input graph. For example, assume the forward transformation is {result: $db}3, which prepend an edge ``result'' on top of the input graph (bound to $db). Then, if you edit the edge ``result'', there is no way to reflect the effect back to input graph, because that edge is not produced by the input graph. System captures this situation by raising an exception (error message).

Figure 3: Disallowed modification
\includegraphics[width=6.0cm]{failcase.eps}

Putting it more precisely, this kind of modification violates one of the required round-trip properties called PutGet [2]: another forward computation using the modified environment should reproduce the same (modified) graph. In this case, the forward evaluation would produce the original graph (g instead of g' in Figure 3).

The most simple case in which a backward modification is guaranteed to succeed, is the case when forward transformation expression is $db itself. In this case (variable reference case in general), the backward evaluator simply replace the bound value of the variable $db with the updated value. By looking up the binding of $db, which obtain the updated value, you can obtain the new source graph.

One of the crucial parts of the backward evaluation is to determine to which operands should we propagate the updating effect, given an operator. In case of the singleton (or edge constructor) as in {result:$db} above, we decompose modified graph g' into top-level edge and the rest of the graph. Backward evaluation is applied recursively on these components. As another example, consider the append ( \ensuremath{\mathbin{
\mathchoice
{\mbox @} {\mbox{\small @}} {\mbox{\scriptsize @}} {\mbox{\scriptsize {\tiny @}}} }}) case. It glues the output nodes in the left hand side with input nodes on the right hand side who has matching markers.

Figure 4: Decomposing the result of Append operator
\includegraphics[width=4.0cm]{bd_append.uncal.eps}

Figure 4 shows the result of forward computation of expression

  (&z1 := {a : {src : &}},
   &z2 := {b : {src : &}}) @ $db
Had the graph bound to $db been actually glued with the result of evaluating (&z1:={a:{src:&}}, &z2:={b:{src:&}}), it would have been much difficult to decompose since we should split glued (I/O) nodes. We instead deliberately leave $ \varepsilon$ edges between these nodes. These are shown in Figure 4 as two edges labeled with ``!'' (bang) between node n1015 and n5 as well as n1012 and n5.

The most interesting case is that of rec operator. For the sake of decomposition, we use bulk semantics for bidirectionalised evaluator. This semantics generates a lot of $ \varepsilon$ edges as well as unreachable parts. The latter also participates in backward evaluation, whereas the former also plays an important role in decomposition of the modified graph. Please refer to section 4 of [3] for details.

5 Queries you can Choose

You can currently choose the following queries. ``(in UnQL)'' and ``(in UnQL+)'' indicates that the queries are written in UnQL and UnQL+, respectively, which are translated to UnCAL before evaluation. See also our technical report [3] in which Examples 1 through 10 appear.

Example 1 (in UnQL)
returns subgraphs that are pointed by ``b'' from the root of $db.
select $G where {b : $G} in $db

Example 2 (in UnQL)
returns all subgraphs that are pointed by either ``b'' or ``a'' from the root of $db.
select ($G1 U $G2) where {b : $G1} in $db,
                         {a : $G2} in $db

Example 3 (in UnQL)
extracts all subgraphs that are pointed by any path from the root of $db ended with an edge labeled ``a'' or ``b'', keeps those subgraphs that do not contain an edge of ``a'' from their root, and glues that results with new edges labeled 'result'.
select {result : $G1} where {_*.(a|b): $G1} in $db,
                            {$l : $G2 } in $G1,
                            not ($l = a)

Example 4(1) in UnQL
replace ``a'' by ``d'' and remove ``c''
select letrec
  sfun a2d_xc({$l : $g}) = 
        if $l = a then {d :  (a2d_xc($g))}
   else if $l = c then        a2d_xc($g)
   else                {$l : (a2d_xc($g))}
  in a2d_xc($db)

Example 4(2) (in UnQL)
removes edges until ``b'' is encountered, below which ``a'' is replaced by ``e''.
select
  letrec
       sfun h({b  : $g}) = {b : (a2e($g))}
          | h({$l : $g}) =         h($g)
  and  sfun a2e {a :  $g} = {e  : (a2e($g))}
          | a2e {$l : $g} = {$l : (a2e($g))}
  in h($db)

Example 5
represents source DB of simple bidirectional transformation examples
cycle((
  &  := {a:{a:&z1},b:{a:&z1},c:&z2},
  &z1:= {d:{}},
  &z2:= {c:&z2}
))

Example 6
replaces ``a'' to ``d'' while removing ``c''
rec(\ ($l,$g).
       if $l = a then {d: &}
  else if $l = c then & 
  else                {$l: &})($db)

Example 7 in UnQL
uses the same query as in Example 4(1) ( replace ``a'' by ``d'' and remove ``c'') with different evaluation option: check ``Apply Skolem terms'' and uncheck ``Glue safe epsilons in the view''. This example shows the bare output of the bulk semantics, where each edge produces a subgraph that is produced by evaluating the body of rec. Skolem terms FrE(InT(1016),(5,a,4),1021) shows a node that is produced by an expression whose ID is 1016 ({_:_}) in this case) from an edge (5, a, 4) in the input graph of the expression rec whose ID is 1021. Note that the Skolem term nests as a result of expression nesting. Careful reader may notice the excess branch for each of the entry point of the subgraph produced by the body of rec. They are the result of $\ldots \ensuremath{\cup}\xspace \ensuremath{\{\}}$ associated with union sequence $\ensuremath{\{g_1,g_2, \ldots g_n\}}$, just like nil at the end of list in functional programming -- $\ensuremath{\{a:\ensuremath{\{\}}\}}$ and $\ensuremath{\{a:\ensuremath{\{\}}\}}\ensuremath{\cup}\xspace \ensuremath{\{\}}$ are (bisimulation) equivalent.

Example 8 (Edge Renaming)
renames ``a'' to ``b''
rec(\ ($l,$G).& := if $l = a then { b:  & }
                   else           { $l: & } )($db)

Example 9
extracts subgraph pointed by path ``a.b''.
rec(\ ($l,$g).
   if $l = a 
   then rec(\($l',$g').
          if $l' = b
          then $g' 
          else {})($g) 
    else {})($db)

Example 10 (Customer2Order)
transforms customer-centric data to order-centric data
select 
  {order: 
    {date: $date,
     no: $no,
     customer_name: $name,
     addr: $a}
  }
where 
  {customer.order: $o} in $db,
  {order_of:$c, date:$date, no:$no} in $o,
  {add:$a, name:$name} in $c,
  {code:$code, info:$info, type:shipping} in $a

Replace (in UnQL+)
replaces ``dest'' pointed by the path ``Association'' form the root with ``tgt''.
replace $G by {tgt:$G1}
where {Association: $G} in $db,
      {dest: $G1} in $G

bd_sng.uncal
prepends an edge ``result'' to the root of the input graph (shown in Figure 2) bound to variable $db.

{result:$db}

bd_let.uncal
also prepends an edge ``result'' to the root of the input database bound to variable $db via variable binding of $v.

let $v = $db in {result:$v}
This example shows the backward propagation of modification via variable binding constructs. The sample modification provided renames the edge from n0 to itself (cycle) to ``ModC''. You can see the reflection of this modification back to source after backward evaluation.

bd_rec.uncal
replaces labels ``a'' at the highest level with ``xa''.

&z1@rec(\ ($L,$T).
     if $L=a then (&z1:= {xa:&z2},&z2:={$L:&z2})
     else         (&z1:= {$L:&z1},&z2:={$L:&z2})
   )($db)
Considering that the input data includes a cycle, this example shows our framework, as with the case of the original UnCAL, works in the presence of cycles. In the result graph of the forward evaluation, there are a lot of $ \varepsilon$ edges (denoted by edges marked with ``!'') as well as unreachable parts. These parts should not be visible to end users since they are not allowed to edit. We plan to hide these parts in the graph presented to the end users. But here these are exposed to demonstrate how it works. The sample modification provided renames the label of edge between n1009 and n1008 to ``ModD''. You can see after backward evaluation, the modification is reflected to the edge from n2 to n1 in the input graph.

bd_append.uncal
prepends a graphs shown in Figure 5 to the source graph.

(&z1 := {a : {src : &}},
 &z2 := {b : {src : &}}) @ $db

Figure 5: A graph prepended to the input graph
\includegraphics[width=3.0cm]{g2.eps}

Since this forward transformation expression includes input and output marker as well as disjoint union (the expression (e1, e2) is a syntactic sugar of e1 $ \oplus$ e2), this example demonstrates the treatment of these constructors as well.

bd_cycle.uncal
forms a (self) cycle with edge labeled ``back'', while the graph shown in Figure 5 is connected to the node with the (self) cycle via edge labeled ``src''.

cycle((&z1 := {src: &z2, back: &z1},
       &z2 := $db))

While \ensuremath{\mathbin{
\mathchoice
{\mbox @} {\mbox{\small @}} {\mbox{\scriptsize @}} {\mbox{\scriptsize {\tiny @}}} }} connects matching I/O nodes between two graphs, cycle do the same thing within a graph. The operand of cycle here has two components: one with I/O markers ( \ensuremath{\{\ensuremath{\mbox{\rm\tt\&}\hspace{-0.07em}{z_1}}\}}, \ensuremath{\{\ensuremath{\mbox{\rm\tt\&}\hspace{-0.07em}{z_1}},\ensuremath{\mbox{\rm\tt\&}\hspace{-0.07em}{z_2}}\}}) and the other with I/O markers ( \ensuremath{\{\ensuremath{\mbox{\rm\tt\&}\hspace{-0.07em}{z_2}}\}},{}). Matching I/O nodes marked &   z1 in the first component forms cycle, while another matching pair marked &   z2 forms inter-component connection from first component to the second, which is further connected to graph bound to $db.


6 Editing the Sample Transformations

If you are dissatisfied with the given transformation example, you can edit it in-place at the textbox in the ``Edit UnQL or UnCAL query selected'' page before pressing ``submit'' button.

Whether the expression you edit is interpreted as UnQL+/UnQL or not depends on whether you have selected UnQL+/UnQL sample or not. If you want to submit your own UnCAL query, you should select an UnCAL example at the first page.

You can see the ``Query or input has been edited'' checkbox below the textarea in which you can edit the query. This is checked automatically if the presented query is edited and informs the system of the fact so that the ready-made modified target example is invalidated because the result produced by the custom query and the ready-made example may no longer be consistent.

If you submit large or complex queries, system may time out (current setting is 5 minutes) to prevent server overloading.

7 Checking Options

In the ``Options'' section of the ``Edit UnQL or UnCAL query selected'' page, you can issue selected query in various configurations that are described below.

7.1 Appearance of the result

[Show unreachable parts] shows nodes and edges that are not reachable by traversing from the root node. Cycle, \ensuremath{\mathbin{
\mathchoice
{\mbox @} {\mbox{\small @}} {\mbox{\scriptsize @}} {\mbox{\scriptsize {\tiny @}}} }} and rec operators may introduce unreachable parts, because these operators glue subgraphs generated by subexpressions by producing $ \varepsilon$ edges that connects matching input and output nodes of the subgraphs. Input nodes that has no counterpart will leave the node and reachable parts thereof unreachable. Since unreachable parts as well as reachable parts participates in backward evaluation, checking this option would help understanding of not only forward semantics but also backward semantics.

[Glue safe $ \varepsilon$ in the view] glues source node and target node of each epsilon edges and remove that edge if it is safe to do so. In general, if source of $ \varepsilon$ edge is an input node or has other outgoing edges, and target of the $ \varepsilon$ edge has other incoming edges, gluing would introduce spurious path that is inexistent before the gluing, so gluing does not take place. Another source of danger is the ambiguity of correspondence between graphs before and after gluing. If the two edges end up with sharing their source and target as a result of gluing, and users edit either of both of the edge, it is difficult to determine to which edge before gluing the editing effect should be propagated. In this situation, check is ignored and gluing does not take place.

Checking this option produces simple graph and easy to edit, while unchecking helps understanding of the forward semantics of the constructors.

Since rendering large graph is time-consuming (it may take several minutes), if a query that is expected to produce large graph if gluing would not have been applied is selected, this checkbox is checked and disabled.

7.2 Evaluation option

[Apply Skolem terms] uses constructors that enables rendezvous between connected subgraphs. For example, bulk semantics uses Hub(nodeid,marker,exprid) 4 where expid identifies subexpression, and nodeid itself is a node ID. Bulk semantics generates nodes and edges independently from each input node and edge. Connecting these components are possible thanks to the Skolem terms. While edges (n1, l1, n2) and (n3, l2, n4) can be connected via node IDs n5 if n2, n3 and n5 coincide, the Skolem terms generated by rec and other constructors enable this coincident.

[Apply rewriting] applies UnCAL-level rewriting rules. This includes fusion rules of two successive applications of rec in [1].

[Escape 1st operand of \ensuremath{\mathbin{
\mathchoice
{\mbox @} {\mbox{\small @}} {\mbox{\scriptsize @}} {\mbox{\scriptsize {\tiny @}}} }} ] applies Skolem term to each node of the graph generated by the first operand of \ensuremath{\mathbin{
\mathchoice
{\mbox @} {\mbox{\small @}} {\mbox{\scriptsize @}} {\mbox{\scriptsize {\tiny @}}} }}. Users usually do not have to check this option.

This option makes difference when you submit the following query.

let $v = {a: &} in
($v @ {b}) U ($v @ {c})

You can see in the result graph that the result of $ \cup$ produced only one $ \varepsilon$ edge. This is because the first operands of both of the \ensuremath{\mathbin{
\mathchoice
{\mbox @} {\mbox{\small @}} {\mbox{\scriptsize @}} {\mbox{\scriptsize {\tiny @}}} }} operators are shared because they are produced by the same variable reference $v. This makes backward evaluation fail. Checking this option will prevent this failure by splitting the two results of references of $v using expression id assigned differently for the two \ensuremath{\mathbin{
\mathchoice
{\mbox @} {\mbox{\small @}} {\mbox{\scriptsize @}} {\mbox{\scriptsize {\tiny @}}} }}.


8 How to Edit the Result

If you are dissatisfied with the given modification example, you can download the result of forward evaluation and submit the modified graph of your own.

Bdotty, our small extension of DOTTY editor, should be used in this editing. The tool should be installed easily if you already have DOTTY installed, which can be compiled on most of the operating systems. Please refer to Bdotty User's guide for installation instructions. Here we assume you have Bdotty command installed.

At the ``UnQL+/UnCAL expression evaluation result'' page, switch the radio button from ``Use edited sample dot file'' to ``Edit the sample edited dot file and upload my own:'' and follow the link ``download'' to obtain the result graph in DOT format. In Figure 6, the node n0 and the associated edge labeled ``c'' is about to be deleted.

Figure 6: DOTTY screen shot while editing the result of forward evaluation
\includegraphics[width=8.0cm]{bdottyscreen.eps}

You can also edit the label of an edge by selecting ``set edge label'' on the context menu that appears when it is called while pointing on the circle on the edge. On the text box in the dialog window, type in ``XXX'' to set the label to value ``XXX''. Select ``save graph'' on the same context menu (or type keyboard shortcut 's') and upload the DOT file by specifying the file in the file upload box on the demo page. Press ``submit'' to see the result of the backward evaluation.

9 Using the Result of Backward Transformation

At the final page ``UnCAL expression backward evaluation result'', you will find an image that represents updated source DB at the ``Updated source DB'' section. Modification propagated by the backward transformation is indicated by the color of edge or nodes: deleted parts are displayed in lightpink, added parts are displayed in purple, and modified labels are displayed in red. Unreachable parts (if any) are displayed in gray.

For Further Editing Cycle

``download'' link at the bottom will let you save the updated source DB in UnCAL format so that you can conduct further bidirectional transformation by uploading this file in ``Select source DB'' section at the initial page in this demo.

Bibliography

1
Peter Buneman, Mary F. Fernandez, and Dan Suciu.
UnQL: a query language and algebra for semistructured data based on structural recursion.
VLDB Journal: Very Large Data Bases, 9(1):76-110, 2000.

2
J. Nathan Foster, Michael B. Greenwald, Jonathan T. Moore, Benjamin C. Pierce, and Alan Schmitt.
Combinators for bi-directional tree transformations: a linguistic approach to the view update problem.
In POPL '05: ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 233-246, 2005.

3
Soichiro Hidaka, Zhenjiang Hu, Kazuhiro Inaba, Hiroyuki Kato, Kazutaka Matsuda, and Keisuke Nakano.
Bidirectionalizing structural recursion on graphs.
Technical Report GRACE-TR09-03, GRACE Center, National Institute of Informatics, August 2009.

4
Soichiro Hidaka, Zhenjiang Hu, Hiroyuki Kato, and Keisuke Nakano.
An algebraic approach to bidirectional model transformations.
Technical Report GRACE-TR08-02, GRACE Center, National Institute of Informatics, September 2008.

5
Soichiro Hidaka, Zhenjiang Hu, Hiroyuki Kato, and Keisuke Nakano.
A compositional approach to bidirectional model transformation.
In ICSE Companion, pages 235-238. IEEE, 2009.



Footnotes

... system1
http://hoagland.org/Dot.html
... one2
Bidirectional interpreter itself can treat multiple global variables. This limitation only apply to this demonstration environment.
...3
You can test this in bd_sng example described below.
...Hub(nodeid,marker,exprid) 4
Original bulk semantics in [1] uses Skolem term S1 instead of Hub.
Big Team
2009-08-31