The Chandler Query System

Overview

The Chandler Repository provides a mechanism for performing queries over the contents of the repository. These queries are declarative: you specify a set of conditions that you want Items to satisfy, and the query processor takes care of retrieving the relevant items. Queries are specified using strings, and the syntax of these strings is inspired by XQuery.

An unusual feature of the query system is the ability to have a query subscribe to changes in the repository. When items are changed, the repository is able to notify the query of that change. The query processor determines whether that change would cause the item to become part of the result of an existing query, or whether the change would cause the item to no longer be part of a query result set. This allows queries to keep their result set up to date automatically.

If you haven't already read "The Busy Developer's Guide to the Chandler repository", you should do so before continuing. Throughout this document we'll use the following terms:

query
A general term denoting a set of boolean conditions which in turn specify a set of Items. Also a specific Python class that provides a programmer with an API for manipulating a query.
query string
A Python string that contains a textual representation of the query
query result set
A set of Chandler Items that satisfy the boolean conditions for a query

Python API

The first thing that you'll need to know when working with queries is how to create them. If you are working within the Chandler Parcel system, you will work with queries via parcel.xml when you deal with ItemCollections. In that case you only need to know the string syntax for queries, which is described in the next section. Here's an example of parcel.xml usage, the ItemCollection for Chandler's task list:
  <contentModel:ItemCollection itsName="taskItemCollection">
    <displayName>TaskList</displayName>
    <_rule value="for i in '//parcels/osaf/contentmodel/tasks/TaskMixin' where True"/>
  </contentModel:ItemCollection>
You just supply a query string as the value of the _rule attribute.

To use a query from Python, your code will look something like this:

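import repository.query.Query as Query

# rep is assumed to be an already opened repository
p = rep.findPath('//Queries')            # parent item for the new query
k = rep.findPath('//Schema/Core/Query')  # the Kind for Query items
q = Query.Query("for i in '//Schema/Core/Kind' where True", p, k)

# iterating over the query yields the items in its result set
for i in q:
    print i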
Queries are Chandler Items, which means that they are stored in the repository themselves. The main query class is in repository.query.Query. To instantiate this class, you supply a query string, a path to the query's parent in the repository, and the Kind for a Query (these last two arguments are for the Item constructor). The query in the example specifies all items of type //Schema/Core/Kind.

Once you have an instance of Query, you can ask for the result set of the query. You have two choices for how to do this. You can ask for the 'resultSet' attribute of the Query item, which will return a reference collection containing the items in the result set. You can also call the __iter__ method on the Query Item by iterating over the Item (as shown in the example). This gives you a Python generator for the query results, which can be more efficient than asking for the resultSet attribute.
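The difference between the two access styles can be illustrated without the repository. The sketch below is plain Python and does not use the Chandler API; items, matches and iter_results are hypothetical names. It only shows why a generator can do less work when the caller stops early.

```python
# Stand-alone illustration; none of these names come from Chandler.
items = ["Kind", "Item", "Attribute", "Query", "Parcel"]
examined = []  # records which items the condition was evaluated on

def matches(name):
    """Hypothetical where-clause: keep names containing 'e'."""
    examined.append(name)
    return "e" in name

def iter_results(source):
    """Generator: evaluates the condition only as results are requested."""
    for name in source:
        if matches(name):
            yield name

# Materialized, like reading a complete result collection:
# every item is examined before the caller sees anything.
full = [name for name in items if matches(name)]
examined_for_full = len(examined)

# Lazy, like iterating over the query: stop after the first hit.
examined.clear()
first = next(iter_results(items))
examined_for_first = len(examined)
```

In this sketch the materialized form evaluates the condition on all five items, while the lazy form stops after two.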

You can update the value of the query string by changing the value of the attribute queryString:

q.queryString = "for i in '//parcels/osaf/contentmodel/contacts/Contact' where True"
The query string may be the empty string, "". In this case, the result set of the query is empty.

Parameters allow you to use values from your Python code in a query. Parameter names are strings that begin with "$". To use parameters, set the query's args attribute to a dict containing the arguments. This code shows how to pass a reference collection as a parameter to a query. The parameter name is $name. The value of $name is a tuple containing the UUID of the Chandler item that has a reference collection attribute and the name of the reference collection attribute (a string).

q.args = {}
q.args["$name"] = (item.itsUUID, attributeName)
This code shows how to pass an ordinary value, which lets you make comparisons against data in your Python code (held in the variable name in this case). The name of the parameter is $myname1 and its value is a list containing the value of the variable name from your Python code.
q.args["$myname1"] = [ name ]

You can view the entire API for the Query class online.

Query String Syntax

There are several kinds of queries. Simple queries are queries over sets of items. You can write a simple query using the for statement. Compound queries are queries composed from other queries. Keywords and required tokens in a query are shown in bold. The portions of the query that you supply are written in italics.

for queries

The most basic query is a for query. The syntax of a for query is:

for var in|inevery set where boolean-condition

The result set of a for query is the set of items in set which satisfy the boolean-condition. Here's what you need to supply for the various portions of a for query:

var is the iteration variable for the query. For now, you must use 'i'. This constraint will be removed in the future.

If your for query is iterating over Kinds, and you want to include items of a Kind's subkind, you should use the keyword inevery instead of in.

set specifies a set of Items to apply the boolean-condition to. This allows you to issue a query over a subset of the repository. At the moment you can supply one of three possibilities for set:

The boolean-condition is an expression which can refer to the iteration variable and parameter values.
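To make the semantics of a for query concrete, here is a small stand-alone model in plain Python. It is only a sketch: Kind, Item and for_query are hypothetical stand-ins rather than the Chandler classes, and the query is assumed to iterate over the items of a single Kind. It shows a for query as a filter over a set, and how inevery widens the set to include items of subkinds.

```python
# Stand-alone model; Kind, Item and for_query are hypothetical.
class Kind:
    def __init__(self, name, superkind=None):
        self.name = name
        self.superkind = superkind

    def is_kind_of(self, other):
        """True if this kind is `other` or one of its subkinds."""
        kind = self
        while kind is not None:
            if kind is other:
                return True
            kind = kind.superkind
        return False

class Item:
    def __init__(self, name, kind):
        self.itsName = name
        self.kind = kind

def for_query(items, kind, condition, every=False):
    """Model of `for i in|inevery kind where condition`."""
    for i in items:
        in_set = i.kind.is_kind_of(kind) if every else i.kind is kind
        if in_set and condition(i):
            yield i

note = Kind("Note")
task = Kind("Task", superkind=note)   # Task is a subkind of Note
repo = [Item("memo", note), Item("todo", task), Item("x", note)]

def long_name(i):
    return len(i.itsName) > 1         # the boolean-condition

with_in = [i.itsName for i in for_query(repo, note, long_name)]
with_inevery = [i.itsName for i in for_query(repo, note, long_name, every=True)]
```

Here with_in contains only "memo", while with_inevery also picks up "todo" because its Kind is a subkind of Note.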

Here are the elements that you can use in the boolean condition (A BNF grammar for query strings appears at the end of this document).

The iteration variable for the query

At the moment, just the variable i. In the future more variable names will be allowed.

The names of Chandler item attributes

You may use the attributes of an Item. For example, most Items have an itsName attribute, so
i.itsName
will give the value of itsName for the current value of i. You can also call methods on Items (since method names are attributes).

Literal values

numeric literals
You can use any integer
string literals
String literals must be enclosed in either single (') or double (") quotes.
boolean literals
The Python boolean literals True and False

Parameters

This query shows that you can use parameters (like $0) in the where clause as well, allowing you to use run time values from your Python program inside a query.
q.queryString="""for i in "//Schema/Core/Kind" 
    where contains(i.itsName,$0)"""
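One way to picture parameter binding is as a lookup performed while the where clause is evaluated. The following is a plain-Python sketch, not the query processor's code: resolve is a hypothetical helper, and unwrapping a one-element list is an assumption based on the way ordinary values are passed in args.

```python
# Stand-alone model; resolve() and the args layout are illustrative assumptions.
def resolve(value, args):
    """Replace a '$'-prefixed parameter name with its bound value."""
    if isinstance(value, str) and value.startswith("$"):
        # Assumption: ordinary values arrive wrapped in a one-element list.
        return args[value][0]
    return value

def contains(string, substring):
    """Model of the query function contains(string, substring)."""
    return substring in string

args = {"$0": ["ttr"]}
names = ["Kind", "Attribute", "Item"]

# Equivalent in spirit to: where contains(i.itsName, $0)
hits = [n for n in names if contains(n, resolve("$0", args))]
```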

Unary (prefix) operators

+ expr
Make numeric expression expr positive
- expr
Make numeric expression expr negative
not expr
Negate boolean expression expr

Boolean operators

expr1 and expr2
Perform the logical "AND" of expr1 and expr2
expr1 or expr2
Perform the logical "OR" of expr1 and expr2
not expr
Negate expr

Relational operators

expr1 >= expr2
Return true if the numeric/date expression expr1 is greater than or equal to the numeric/date expression expr2
expr1 <= expr2
Return true if the numeric/date expression expr1 is less than or equal to the numeric/date expression expr2
expr1 > expr2
Return true if the numeric/date expression expr1 is greater than the numeric/date expression expr2
expr1 < expr2
Return true if the numeric/date expression expr1 is less than the numeric/date expression expr2
expr1 == expr2
Return true if expr1 and expr2 are equal according to the equality rules for their Kinds
expr1 != expr2
Return true if expr1 and expr2 are not equal according to the equality rules for their Kinds

Arithmetic operators

expr1 + expr2
Add the numeric expressions expr1 and expr2
expr1 - expr2
Subtract the numeric expression expr2 from the numeric expression expr1
expr1 * expr2
Multiply the numeric expression expr1 by the numeric expression expr2
expr1 / expr2
Divide the numeric expression expr1 by the numeric expression expr2
expr1 div expr2
Produce the result of integer dividing the numeric expression expr1 by the numeric expression expr2
expr1 mod expr2
Produce the remainder of dividing the numeric expression expr1 by the numeric expression expr2
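If you want to check what a div or mod expression should produce, the closest Python equivalents are floor division and the remainder operator. The table below is a sketch under that assumption; ARITHMETIC and apply_op are hypothetical names, and the behavior chosen for "/" in particular is an assumption, since it may differ for integer operands.

```python
import operator

# Assumed mapping from query operator tokens to Python functions.
ARITHMETIC = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,     # assumption: may differ for integer operands
    "div": operator.floordiv,  # integer division
    "mod": operator.mod,       # remainder
}

def apply_op(token, left, right):
    """Apply the arithmetic operator named by token to two numbers."""
    return ARITHMETIC[token](left, right)

quotient = apply_op("div", 17, 5)
remainder = apply_op("mod", 17, 5)
ratio = apply_op("/", 17, 5)
```

For positive integers, div and mod are related in the usual way: quotient * 5 + remainder gives back 17.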

Dates

You can supply dates and times in eGenix mxDateTime ISO format like this:
date(ISO-date-string)
Construct a date instance that represents ISO-date-string

This example shows how to use dates in a query. Note the use of the date constructor to create a date literal which is then compared to the startTime attribute of i (i will be a CalendarEvent).

q.queryString="""for i in '//parcels/osaf/contentmodel/calendar/CalendarEvent' 
    where i.startTime > date('2004-09-30 12:34:56')"""
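The comparison performed by a date query can be tried out in plain Python. This sketch uses the standard library's datetime module as a stand-in for mxDateTime, so the date helper below is a hypothetical equivalent of the query language's date constructor, and the events are made-up data.

```python
from datetime import datetime

def date(iso_string):
    """Stand-in for the query language's date(): parse 'YYYY-MM-DD HH:MM:SS'."""
    return datetime.strptime(iso_string, "%Y-%m-%d %H:%M:%S")

# Hypothetical events as (name, startTime) pairs.
events = [
    ("standup", date("2004-09-29 09:00:00")),
    ("review", date("2004-10-01 14:30:00")),
]

# Equivalent in spirit to: where i.startTime > date('2004-09-30 12:34:56')
cutoff = date("2004-09-30 12:34:56")
later = [name for name, start in events if start > cutoff]
```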

Functions


At the moment there are only three functions that you can call from queries. We will be expanding this set of functions as the system develops.
len(expr)
Return the length of expr. expr can be a string or a list Kind
contains(string, substring)
Return true if string contains substring
hasAttribute(string)
A method on Chandler Items that returns True if the Item has an attribute named string
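The three functions behave much like their everyday Python counterparts. Here is a stand-alone sketch: the Item class and the contains helper are hypothetical stand-ins, not the Chandler implementations, and the where clause in the comment is only an illustration.

```python
class Item:
    """Hypothetical stand-in for a Chandler Item."""
    def __init__(self, **attributes):
        self.__dict__.update(attributes)

    def hasAttribute(self, name):
        return hasattr(self, name)

def contains(string, substring):
    """Model of the query function contains(string, substring)."""
    return substring in string

items = [Item(itsName="Kind"), Item(itsName="Note", body="call Bob")]

# Models: where i.hasAttribute('body') and len(i.body) > 3
#               and contains(i.body, 'Bob')
result = [i.itsName for i in items
          if i.hasAttribute("body")
          and len(i.body) > 3
          and contains(i.body, "Bob")]
```

Testing hasAttribute first keeps the later parts of the condition from touching an attribute that an item does not have.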

Union Queries

union(query1,query2, ... , queryn)
Compute the union of query1..queryn and return that as the result. Any item that appears in the result set of any of the queries will appear in the result set of the union.
This query is composed of three for queries that show the same pattern. They all use a Kind path as the source set, and use True as the where clause, to indicate all items of a particular kind. The union operator simply creates the union of the three for queries.
q.queryString="""union(for i in "//parcels/osaf/contentmodel/calendar/CalendarEvent" where True,
      for i in "//parcels/osaf/contentmodel/Note" where True, 
      for i in "//parcels/osaf/contentmodel/contacts/Contact" where True)"""

Intersection Queries

intersect(query1,query2)
Compute the intersection of query1 and query2 and return that as the result. Items in the result set of the intersection must appear in the result set of both query1 and query2.
This query computes the intersection of those Kind items whose name contains 'o' and those Kind items whose name contains 't'.
q.queryString="""intersect(for i in '//Schema/Core/Kind' where contains(i.itsName,'o'),
          for i in '//Schema/Core/Kind' where contains(i.itsName,'t'))"""

Difference Queries

difference(query1,query2)
Compute the difference of query1 and query2 and return that as the result. Items in the result set of the difference consist of any Item that is in the result set of query1 which is not in the result set of query2. You can think of this as starting with the result set of query1 and removing any Item which appears in the result set of query2.
The result of this query is those Kind items whose names contain 'o' and do not contain 't'.
q.queryString="""difference(for i in '//Schema/Core/Kind' where contains(i.itsName,'o'),
           for i in '//Schema/Core/Kind' where contains(i.itsName,'t'))"""
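The union, intersect and difference queries correspond directly to Python's set operators. The sketch below uses made-up names in place of real Kind items, so the specific results are illustrative only.

```python
# Hypothetical item names standing in for the result sets of two for queries.
names = {"Kind", "Item", "Cloud", "Monitor"}

with_o = {n for n in names if "o" in n}   # for ... where contains(i.itsName,'o')
with_t = {n for n in names if "t" in n}   # for ... where contains(i.itsName,'t')

union_result = with_o | with_t        # in either result set
intersect_result = with_o & with_t    # in both result sets
difference_result = with_o - with_t   # in the first but not the second
```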

Query Notification

A key feature of Chandler queries is change notification. A Chandler query defines a set of items. The notification mechanism makes sure that the result set of the query always contains the correct Items. If you change an item so that it satisfies the conditions of some query, the notification mechanism will add that item to the result set of the query. If you change an item so that it no longer satisfies the conditions of a query, then the notification mechanism will remove that item from the result set of that query. The query notification mechanism does not indicate that any attribute of any item in a query's result set has changed. It just keeps the right items in the result set.

Clients of a query can request that they be notified when the query notification mechanism notices items that enter or exit the query result set. Your client code supplies a Chandler Item which has a callback method. In the Chandler application, this item will usually be an ItemCollection. The callback method will be passed a tuple containing two lists: a list of the UUIDs of any items added to the query result and a list of the UUIDs of any items removed from the query result.

Your code makes a request for notification by calling the subscribe method on the Query Item. This method has two mandatory parameters and two optional parameters. The first parameter is an Item that has the required callback method, and the second parameter is the name of that callback method. The two optional parameters are a little more difficult to explain. The repository's concurrency model gives each thread a separate view of the items in the repository. You can select when you would like to be notified of changes. The options are:

  1. be notified of changes that happen in your view (the same view that the query is being run in) - instantaneously
  2. be notified of changes that happen on views outside your own - when your view commits
  3. be notified of both kinds of changes

The default is to be notified of both kinds of changes. The optional parameters are Booleans that you can set to False if you don't want to be notified of changes in your view (the first optional parameter) or of changes in other views (the second optional parameter).

No matter how you set the view notification parameters, you will only be notified of changes that would add or remove items from the query result set.

q.subscribe(item, methodName, inSameView, inOtherViews)
Call methodName on item when changed items enter or leave the query result set. If inSameView is true, the callback will be called as soon as the item is changed, provided the item is in the same repository view as the Query. If inOtherViews is true, the callback will be called when commit is called.
At the appropriate moment, the query system will call all subscribed methods. These methods might look like this:
def handle(self, changes):
    added, removed = changes
    print added, removed # simple action
The changes argument to the callback method is a tuple containing two lists. The first is a list of all the items which were added to the query result set. The second is a list of all the items which were removed from the query result set.
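The overall notification flow can be modeled in a few lines of plain Python. This is only a sketch of the idea, not the Chandler implementation: MiniQuery, item_changed and Listener are hypothetical names, the keys are made-up identifiers, and repository views and commits are left out entirely.

```python
class MiniQuery:
    """Hypothetical model of a self-updating query; not the Chandler Query class."""
    def __init__(self, condition):
        self.condition = condition
        self.result = set()
        self.subscribers = []   # (item, method name) pairs

    def subscribe(self, item, method_name):
        self.subscribers.append((item, method_name))

    def item_changed(self, key, value):
        """Re-test one changed item and report result-set membership changes."""
        added, removed = [], []
        satisfied = self.condition(value)
        if satisfied and key not in self.result:
            self.result.add(key)
            added.append(key)
        elif not satisfied and key in self.result:
            self.result.discard(key)
            removed.append(key)
        if added or removed:    # stay silent when membership is unchanged
            for item, method_name in self.subscribers:
                getattr(item, method_name)((added, removed))

class Listener:
    def __init__(self):
        self.log = []

    def handle(self, changes):
        added, removed = changes
        self.log.append((added, removed))

listener = Listener()
q = MiniQuery(lambda name: "o" in name)
q.subscribe(listener, "handle")
q.item_changed("uuid-1", "Note")    # enters the result set
q.item_changed("uuid-1", "Notes")   # still matches: no callback
q.item_changed("uuid-1", "Task")    # leaves the result set
```

The second change does not trigger the callback, which mirrors the rule that you are only notified of changes that add or remove items from the result set.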

Future plans

We are planning a number of improvements to the query system:
Performance enhancements
There are a number of ways to improve the performance of queries in Chandler. This work will be ongoing over the next several releases.
Debugging tools
In a future version of Chandler, we will provide an interactive means for testing queries. This is not to be confused with a general end user query facility, which is also planned for a future version of Chandler.

We want to update and improve this document.

Please send any comments to dev@osafoundation.org.

Appendix 1: Grammar for Queries

Non-terminals are shown in plain text. Terminals are shown in bold.
NUM: '[0-9]+'
STRING: '"([^\\"]+|\\\\.)*"|\'([^\']+|\\\\.)*\''
PARAM: '\$[0-9]+'
UNOP: '(not|\+|-)'
MULOP: '(\*|/|div|mod)'
ADDOP: '(\+|-)'
RELOP: '(==|!=|>=|<=|>|<)'
BOOLOP: '(and|or)'
ID: '[a-zA-Z]+'
END: '$'

stmt: union_stmt END
    | intersection_stmt END
    | difference_stmt END
    | for_stmt END

stmt_list: stmt (, stmt)*

union_stmt: union (stmt_list)

intersection_stmt: intersect (stmt , stmt)

difference_stmt: difference (stmt , stmt)

for_stmt: for ID in | inevery (name_expr where and_or_expr END 
                   | STRING where and_or_expr END 
                   | stmt where and_or_expr ) END

and_or_expr: rel_expr [ BOOLOP rel_expr ]

rel_expr: add_expr [ RELOP add_expr ]

add_expr: mul_expr [ ADDOP mul_expr ]

mul_expr: unary_expr [ MULOP unary_expr ]

unary_expr:  [ UNOP ] value_expr

value_expr: constant
    | PARAM
    | ID [ ( [ arg_list ] )
         | (. ID )+  [ ( [ arg_list ] ) ]
         ]

constant: STRING | NUM

arg_list:  and_or_expr ( , and_or_expr )*

str_list: STRING ( , STRING )*

name_expr: ID | PARAM
    | ftcontains ( str_list ) 
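The token definitions above map naturally onto a regular-expression tokenizer. Here is a minimal sketch using Python's re module. It is not the parser that Chandler uses: the STRING pattern is simplified (no escape handling), the PUNCT and SKIP classes are additions made for the example, and keywords such as for, where and and come out as plain ID tokens that a parser would have to recognize afterwards.

```python
import re

# Token patterns adapted from the grammar; STRING is simplified (no escapes).
TOKEN_SPEC = [
    ("STRING", r'"[^"]*"|\'[^\']*\''),
    ("PARAM",  r"\$[0-9]+"),
    ("NUM",    r"[0-9]+"),
    ("RELOP",  r"==|!=|>=|<=|>|<"),
    ("ID",     r"[a-zA-Z]+"),
    ("PUNCT",  r"[().,+*/-]"),   # added for this sketch
    ("SKIP",   r"\s+"),          # whitespace, dropped from the output
]
TOKEN_RE = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))

def tokenize(query_string):
    """Return (token type, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(query_string):
        match = TOKEN_RE.match(query_string, pos)
        if match is None:
            raise ValueError("unexpected character at position %d" % pos)
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("for i in '//Schema/Core/Kind' where contains(i.itsName,$0)")
```

Note that PARAM as defined in the grammar only matches a "$" followed by digits, so a name such as $0 is tokenized as a parameter while alphabetic parameter names are not covered by this pattern.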

$Revision: 1.4 $
$Date: 2005/03/15 23:00:05 $
$Author: twl $
$Log: chandler-query-system.html,v $
Revision 1.4 2005/03/15 23:00:05 twl Update query docs for 0.5
Revision 1.3 2004/10/21 22:30:39 twl Commit branched docs to trunk
Revision 1.1.2.2 2004/10/19 20:03:02 twl Incorporate Ducky's feedback
Revision 1.1.2.1 2004/10/18 21:27:03 twl Committing doc changes to branch
Revision 1.2 2004/10/15 18:31:04 twl Bugs 2112, 2113 (Doc bugs) Incorporate review feedback
Revision 1.1 2004/10/12 20:08:20 twl Fix bug 2112 - 0.4 Query documentation update