rdfDB Query Language

rdfDB uses a high-level, SQL-like query language. The data is modelled as a directed labelled graph (RDF). Nodes in the graph can be:
  1. Resources : Every Resource is identified by a URI (e.g., foo, http://dmoz.org/#Top). Resources are written as URIs enclosed in angle brackets. The Resource whose URI is mailto:guha@guha.com is referred to (in the query language) as <mailto:guha@guha.com>.
  2. Integers. Integers are written as such (e.g., 42, 9, 18).
  3. Strings : Strings are UTF-8, enclosed in single or double quotes (' or "), e.g., 'foo', 'foo bar'. To insert a single quote itself into a string, escape it with a backslash (\), e.g., 'Paul O\'Brien'. To insert a newline into a string, use \n, for example: "Text\nThis is on the next line".
Other datatypes such as floats and dates are coming soon.
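
The literal syntax above can be sketched as a small Python helper that renders values in query-language form. This is only an illustration and not part of rdfDB: the Resource class and the format_term function are hypothetical names, and the sketch always emits single-quoted strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical wrapper that marks a value as a resource URI."""
    uri: str


def format_term(value):
    """Render a Python value using the literal syntax described above."""
    if isinstance(value, Resource):
        # Resources are URIs enclosed in angle brackets.
        return "<" + value.uri + ">"
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("booleans are not an rdfDB datatype")
    if isinstance(value, int):
        # Integers are written as such.
        return str(value)
    if isinstance(value, str):
        # Escape single quotes and newlines. How a literal backslash should
        # be escaped is not described in the text, so it is left untouched.
        escaped = value.replace("'", "\\'").replace("\n", "\\n")
        return "'" + escaped + "'"
    raise TypeError("unsupported term type: " + type(value).__name__)
```

For example, format_term(Resource("mailto:guha@guha.com")) yields the bracketed form, and format_term(42) yields the bare digits.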

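All operations revolve around the concept of a "triple". A triple is intended to model the concept of an object with a property value. It consists of a source (subject), an arc label (predicate), and a target (object). The source and arc label are resources; the target may be a resource, an integer, or a string.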

The triple is written using the predicate logic syntax : (<subject> <predicate> <object>).

A collection of triples forms a database. There are no constraints on the set of triples that constitutes the database. (Some other RDF implementations refer to a database as a "model".)

Database Operations are divided into the following categories:

Database Creation

Result Codes :
  1. 0 : success
  2. -10 : database could not be deleted. The most likely cause is that the file permissions are wrong. Make sure that rdfDB is allowed to write into the directory RDFDB_DIR. There is no return value.

Loading Files

rdfDB is designed to act as a cache for RDF, RSS, edge-labelled XML and other data out on the network. To facilitate this, it can load the contents of a URL (that points to an RDF, RSS ... file) into the database.

Result Codes:
  1. 0 : success
  2. -2 : syntax error
  3. -5 : database does not exist
  4. -6 : could not access the url
  5. -9 : unknown file format

Namespace Commands

RDF vocabularies may come from different namespaces. When parsing XML files, rdfDB creates URIs by concatenating the namespace URI (of an element's namespace) with the character '#' and the element name. So, if we have
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dmoz="http://dmoz.org/rdf">

<rdf:Description rdf:about="http://dmoz.org/Auto/CarSeats">
  <dmoz:relatedTo rdf:resource="http://dmoz.org/Children/Safety"/>
</rdf:Description>

</rdf:RDF>
The triple that is added to the database is
(http://dmoz.org/rdf#relatedTo http://dmoz.org/Auto/CarSeats http://dmoz.org/Children/Safety)
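
The concatenation rule can be illustrated with a short Python sketch that uses the standard xml.etree.ElementTree module. This is not rdfDB's parser: element_uri and triples_from_rdf are hypothetical names, and the sketch handles only rdf:Description elements whose properties carry an rdf:resource attribute, as in the sample above.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dmoz="http://dmoz.org/rdf">
<rdf:Description rdf:about="http://dmoz.org/Auto/CarSeats">
  <dmoz:relatedTo rdf:resource="http://dmoz.org/Children/Safety"/>
</rdf:Description>
</rdf:RDF>"""


def element_uri(namespace_uri, element_name):
    """Namespace URI + '#' + element name, as described in the text."""
    return namespace_uri + "#" + element_name


def triples_from_rdf(xml_text):
    """Return (arc, source, target) tuples for resource-valued properties."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall("{%s}Description" % RDF_NS):
        source = description.get("{%s}about" % RDF_NS)
        for prop in description:
            # ElementTree spells qualified names as "{namespace}localname".
            namespace_uri, _, name = prop.tag[1:].partition("}")
            target = prop.get("{%s}resource" % RDF_NS)
            triples.append((element_uri(namespace_uri, name), source, target))
    return triples
```

Calling triples_from_rdf(SAMPLE) produces a single tuple in the same order as the triple shown above.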
In order to simplify the statement of queries, one can set a namespace prefix to correspond to a namespace URI.

Result Codes:
  1. 0 : success
  2. -2 : syntax error

Inserts & Deletes

The following two commands are used to add and remove triples.
  1. insert into {database_name} (source1 arc1 target1) [, (source2 arc2 target2) ... ]
    e.g., insert into dmoz (<http://dmoz.org/Top> <narrow> <FlyingPizzas>) </>

  2. delete from {database_name} (source1 arc1 target1) [ , (source2 arc2 target2) ... ]
    e.g., delete from foo (<http://dmoz.org/Top> <narrow> <FlyingPizzas>) </>

    Here arcs, sources and targets must be explicitly specified values. If you want to delete from the database all triples that satisfy some condition, use the next form of delete.

  3. delete from {database_name} where (source1 arc1 target1) [, (source2 arc2 target2) ... ]

    Here you can use variables in the same way as in the select queries described below. All triples that match the where (...) conditions will be deleted.

    With delete ... where ... statements you can also add the optional output [format] clause, exactly as in the select queries, to print the matching variable values as the triples are deleted.
Result Codes :
  1. 0 : success
  2. -2 : general syntax error
  3. -3 : malformed literal
  4. -5 : database does not exist
  5. -10 : wrong file permissions (could not open DB)
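
As a minimal sketch, the insert and delete forms above could be assembled from Python tuples as follows. The function names are hypothetical, terms are assumed to be already written in query-language syntax (for example with angle brackets around resources), and the trailing </> is copied from the examples above.

```python
def format_triple(source, arc, target):
    """Write one triple in the (source arc target) form used by insert/delete."""
    return "(" + " ".join((source, arc, target)) + ")"


def _triple_list(triples):
    # Triples are separated by commas, as in the syntax lines above.
    if not triples:
        raise ValueError("at least one triple is required")
    return ", ".join(format_triple(*triple) for triple in triples)


def insert_statement(database, triples):
    """Build an 'insert into' statement for explicitly specified triples."""
    return "insert into %s %s </>" % (database, _triple_list(triples))


def delete_statement(database, triples, where=False):
    """Build a 'delete from' statement; where=True allows ?variables."""
    keyword = "where " if where else ""
    return "delete from %s %s%s </>" % (database, keyword, _triple_list(triples))
```

For instance, insert_statement("dmoz", [("<http://dmoz.org/Top>", "<narrow>", "<FlyingPizzas>")]) reproduces the insert example given above.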

Query

There is one query command, which has the syntax : select variable1 [variable2 ... ] from {database} where constraint1 [constraint2 ...] [and condition1 [and condition2 ...]] [ output {output-format} ] </> It returns a set of variable bindings for the selected variables such that the triples in {database} satisfy constraint1, constraint2 ... under those variable substitutions.

Variables are syntactically designated by symbols starting with the character '?'. e.g., ?name, ?foo.

A constraint is of the form (arc-label source target) where any one or more of arc-label, source or target can be a variable or resource and in the case of the target, also an integer or string. The same variable can appear in multiple constraints.

e.g., select ?x ?y from dmoz where (title ?x ?y), (createdBy ?x RichSkrenta), (type ?x Topic)
This lists the IDs and titles of all objects of type Topic created by RichSkrenta.
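
To make the binding semantics concrete, here is a small in-memory Python sketch of how constraints with shared variables narrow down the answers. It is a naive illustration and not rdfDB's query engine: the function names and the sample data are made up, triples are stored in the same (arc-label source target) order as the constraints, and the and-conditions and output clause are not modelled.

```python
def is_variable(term):
    """Variables are symbols starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match_constraint(constraint, triple, bindings):
    """Return bindings extended to make constraint equal triple, or None."""
    extended = dict(bindings)
    for pattern, value in zip(constraint, triple):
        if is_variable(pattern):
            if pattern in extended and extended[pattern] != value:
                return None  # conflicts with an earlier binding
            extended[pattern] = value
        elif pattern != value:
            return None  # a constant term does not match
    return extended


def select(variables, triples, constraints):
    """Return one tuple of values per binding set that satisfies every constraint."""
    constrained = {t for c in constraints for t in c if is_variable(t)}
    for variable in variables:
        if variable not in constrained:
            raise ValueError("unconstrained variable: " + variable)
    solutions = [{}]
    for constraint in constraints:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            for extended in [match_constraint(constraint, triple, bindings)]
            if extended is not None
        ]
    return [tuple(b[v] for v in variables) for b in solutions]


# Hypothetical sample data in (arc-label, source, target) order.
TRIPLES = [
    ("title", "Top/Pizza", "Flying Pizzas"),
    ("createdBy", "Top/Pizza", "RichSkrenta"),
    ("type", "Top/Pizza", "Topic"),
    ("title", "Top/Cars", "Cars"),
    ("createdBy", "Top/Cars", "SomeoneElse"),
    ("type", "Top/Cars", "Topic"),
]
```

With this data, the three constraints of the example query keep only the Top/Pizza entry, because ?x must satisfy all of them at once.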

The query can optionally specify the output format by adding output {output-format} to the query. The supported output formats are "tab-limited" and "variable-list". I hope to add "javascript" and "rdf-xml" as supported output formats. The default is "tab-limited".

Result structure: The result contains zero or more answer lines followed by the result code line. The syntax of an answer line depends on the chosen output format. In the case of the "variable-list" format, there is one line per variable binding set, with the syntax
variable1=value1TABvariable2=value2...
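
A client could split such an answer line into a dictionary with a few lines of Python. This is a sketch under assumptions: the function name is hypothetical, values are assumed not to contain tab characters, and a trailing tab (if the server emits one) is simply ignored.

```python
def parse_variable_list_line(line):
    """Turn 'variable1=value1<TAB>variable2=value2' into a dict."""
    bindings = {}
    for field in line.rstrip("\n").split("\t"):
        if not field:
            continue  # tolerate a trailing tab
        # Split on the first '=' only, so values may themselves contain '='.
        name, _, value = field.partition("=")
        bindings[name] = value
    return bindings
```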

In the case of "tab-limited", the first line contains the tab-separated variable names, and each following line contains one set of variable values, with the syntax
value1TABvalue2...
where the order of the values is in the order of the variables in the query. Please note that EACH value is followed by a tab (that includes the last value).

The Win32 port allows optional triples. These are specified by preceding a constraint triple with a question mark, like this (note the question mark in front of the second triple):

select ?x ?y from test1 where (?x <worksFor> <W3C>) ?(?x <name> ?y) </>
With this query, rdfDB returns all resources whose worksFor property has the value W3C, together with their names if they have one (if they don't, rdfDB outputs no value, although it still outputs a tab). A possible output would be:
?x[TAB]?y[TAB]
<DanB>[TAB]"Dan Brickley"[TAB]
<DanC>[TAB][TAB]
#0
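
The sample output above can be read back with a short Python sketch. The names are hypothetical, and the sketch assumes what the sample suggests: every value is followed by a tab, an unbound optional variable appears as an empty field, and the last line holds the result code after a '#'.

```python
def split_row(line):
    """Split one tab-limited line, dropping the empty field after the last tab."""
    fields = line.split("\t")
    if fields and fields[-1] == "":
        fields.pop()
    return fields


def parse_tab_limited(output):
    """Return (result_code, variable_names, rows) from tab-limited output."""
    lines = output.splitlines()
    if not lines or not lines[-1].startswith("#"):
        raise ValueError("missing result code line")
    result_code = int(lines[-1][1:])
    rows = [split_row(line) for line in lines[:-1]]
    if not rows:
        return result_code, [], []
    header, *values = rows
    return result_code, header, values


# The sample output shown above, with [TAB] written as a real tab character.
SAMPLE_OUTPUT = '?x\t?y\t\n<DanB>\t"Dan Brickley"\t\n<DanC>\t\t\n#0\n'
```

Parsing SAMPLE_OUTPUT gives result code 0, the header ?x and ?y, and two rows, the second of which has an empty string for the missing name.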

Result Codes:

  1. 0 : success
  2. -2 : syntax error
  3. -3 : malformed literal error
  4. -4 : general error
  5. -5 : database does not exist
  6. -6 : could not access data
  7. -8 : unconstrained variable

Sample Session

See this file for an example session.

Result Codes

Here is the complete list of result codes.
  1. 0 : success
  2. -1 : unknown query type
  3. -2 : general syntax error
  4. -3 : malformed literal error
  5. -4 : misc
  6. -5 : database does not exist
  7. -6 : could not access data
  8. -7 : unauthorized access
  9. -8 : unconstrained variable
  10. -9 : unknown file format
  11. -10 : wrong file permissions
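
A client might keep this list as a lookup table so that a non-zero code can be turned into a readable error. The following Python sketch uses hypothetical names (RESULT_CODES, RdfDBError, check_result); only the codes and their descriptions come from the list above.

```python
RESULT_CODES = {
    0: "success",
    -1: "unknown query type",
    -2: "general syntax error",
    -3: "malformed literal error",
    -4: "misc",
    -5: "database does not exist",
    -6: "could not access data",
    -7: "unauthorized access",
    -8: "unconstrained variable",
    -9: "unknown file format",
    -10: "wrong file permissions",
}


class RdfDBError(Exception):
    """Hypothetical exception carrying an rdfDB result code."""

    def __init__(self, code):
        self.code = code
        self.description = RESULT_CODES.get(code, "unrecognized result code")
        super().__init__("%d: %s" % (code, self.description))


def check_result(code):
    """Raise RdfDBError for any result code other than 0 (success)."""
    if code != 0:
        raise RdfDBError(code)
```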