XQuery

XQuery is a query language that not only retrieves information from an XML document or a collection of XML documents, but can also perform complex computations on the extracted information and construct new documents or XML fragments.

XQuery is a W3C specification. Its final version 1.0 was released in January 2007, after nearly eight years of work. XQuery was developed in conjunction with XSLT 2.0, a major revision of the XML transformation language XSLT, with which it shares XPath 2.0 as a common subset.

XQuery plays a role with respect to XML data similar to that of SQL with respect to relational data, and analogies can be found between the two languages.

Syntaxes

There are two separate syntaxes for XQuery:

  • The non-XML “natural” syntax, also known as FLWOR (pronounced “flower”), whose name comes from the five principal clauses that compose it (for, let, where, order by and return);
  • The XQueryX syntax (for “XML Syntax for XQuery”), in which a query is itself an XML document. It is therefore much more verbose and less readable than the previous one, and is intended for formal manipulation by programs (possibly themselves written in XQuery).

Example

Consider the following XML document, named example.xml and located at the URL http://www.example.com/:

 <employees>
   <employee>
     <name>Durand</name>
     <firstname>Albert</firstname>
     <birth_date>23/09/1958</birth_date>
   </employee>
   <employee>
     <name>Dupont</name>
     <firstname>Alphonse</firstname>
     <birth_date>23/12/1975</birth_date>
   </employee>
   <employee>
     <name>Dupont</name>
     <firstname>Isabelle</firstname>
     <birth_date>03/12/1967</birth_date>
   </employee>
   ...
 </employees>

The following FLWOR query:

 for $b in doc("http://www.example.com/example.xml")//employee
 where $b/name = "Dupont"
 return
   <dupont>{
     $b/firstname,
     $b/birth_date
   }</dupont>

returns the following result:

 <dupont>
   <firstname>Alphonse</firstname>
   <birth_date>23/12/1975</birth_date>
 </dupont>
 <dupont>
   <firstname>Isabelle</firstname>
   <birth_date>03/12/1967</birth_date>
 </dupont>
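
As a further illustration, the sketch below uses all five FLWOR clauses in one query. It is not part of the original example: the data is bound inline to a variable so that the query does not depend on any URL being reachable, and the names $employees, $full and <person> are chosen here purely for illustration.

```xquery
(: Hypothetical inline copy of the sample data, so the query is self-contained. :)
let $employees :=
  <employees>
    <employee><name>Durand</name><firstname>Albert</firstname></employee>
    <employee><name>Dupont</name><firstname>Alphonse</firstname></employee>
    <employee><name>Dupont</name><firstname>Isabelle</firstname></employee>
  </employees>
(: for: iterate over every employee element :)
for $e in $employees/employee
(: let: bind an intermediate value for each iteration :)
let $full := concat($e/firstname, " ", $e/name)
(: where: keep only the employees named Dupont :)
where $e/name = "Dupont"
(: order by: sort the remaining employees by first name, Z to A :)
order by $e/firstname descending
(: return: build one new element per retained employee :)
return <person>{ $full }</person>
```

Under these assumptions the query should return <person>Isabelle Dupont</person> followed by <person>Alphonse Dupont</person>, because the results are sorted by first name in descending order.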

Language Components

XQuery is specified in a modular way: the core of the language can be extended by optional modules.

  • The minimal language is based on the XPath 2.0 standard (which specifies the XML query language proper), augmented by the following main features:
    • The FLWOR expression, a powerful iteration construct with many options, quite similar to the SQL SELECT statement. Its where clause makes it possible to write inner or outer joins. XQuery 1.1 (the working draft later published as XQuery 3.0) adds group by and “windowing” (the ability to split the input sequence according to boolean conditions). Other constructs, such as if and typeswitch, can be combined with FLWOR.
    • Constructors, expressions that build XML fragments, with a syntax very close to that of XML itself (a well-formed XML fragment is in fact a valid XQuery expression). This makes it possible to write templates containing dynamically evaluated expressions, in the manner of the many web page generation languages (for example PHP).
    • User-defined functions.
    • A set of predefined functions and operators common to XPath 2.0, XQuery and XSLT 2.0.
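
The features listed above can be combined in a single query. The following sketch uses a where clause to write an inner join, an element constructor with embedded expressions, a user-defined function, and the predefined functions concat and upper-case. The employee and department data, the dept and id attributes, and the helper function local:label are all hypothetical names introduced for the example.

```xquery
(: Hypothetical helper: formats a name as "LASTNAME, Firstname". :)
declare function local:label($first as xs:string, $last as xs:string) as xs:string {
  concat(upper-case($last), ", ", $first)
};

(: Hypothetical sample data, bound inline so the query is self-contained. :)
let $employees := (
  <employee dept="D1"><name>Durand</name><firstname>Albert</firstname></employee>,
  <employee dept="D2"><name>Dupont</name><firstname>Isabelle</firstname></employee>
)
let $departments := (
  <department id="D1">Sales</department>,
  <department id="D2">Research</department>
)
return
  <staff>{
    (: Inner join: pair each employee with the department whose id matches. :)
    for $e in $employees, $d in $departments
    where $e/@dept = $d/@id
    return
      (: Constructor: braces mark expressions evaluated at run time. :)
      <member department="{ $d }">{ local:label($e/firstname, $e/name) }</member>
  }</staff>
```

With this data the query should produce a <staff> element containing <member department="Sales">DURAND, Albert</member> and <member department="Research">DUPONT, Isabelle</member>.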

Optional modules:

  • The optional Full Axis module makes the ancestor, ancestor-or-self, following, following-sibling, preceding and preceding-sibling “axes” available in XPath expressions.
  • XQuery modules (the optional Module feature) make it possible to import libraries of XQuery functions or variables into an XQuery program.
  • The optional Schema Import module makes it possible to declare the XML schemas to which the manipulated data conforms, so that the types of certain expressions can be inferred and queries possibly optimized.
  • The optional Schema Validation module makes it possible to validate XML fragments against schemas.
  • The optional “Static Typing” module supports type checking before execution.
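
To illustrate the Full Axis module, the following sketch uses the preceding-sibling axis to number the items of a small, made-up list. It assumes a processor that supports this optional module; the <list> and <item> elements and the position attribute are invented for the example.

```xquery
(: Hypothetical sample data. :)
let $list := <list><item>a</item><item>b</item><item>c</item></list>
for $i in $list/item
return
  (: Count the item elements that precede $i among its siblings, then add 1. :)
  <item position="{ count($i/preceding-sibling::item) + 1 }">{ string($i) }</item>
```

The three resulting elements should carry the positions 1, 2 and 3 for the contents a, b and c respectively.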

Extensions:

  • XQuery Update Facility is a standard under development (draft standard in 2010) that extends XQuery with instructions for modifying XML nodes: insert node, delete node, rename node, replace node, replace value, copy/modify.
  • XQuery Full-Text is an extension under development (preliminary standard in 2008) that specifies full-text search integrated into XQuery. It supports contextual searches for words and phrases, that is, searches restricted, for example, to the content of a particular XML element.
  • XQuery Scripting, under development (Working Draft of April 2010), changes the programming model by adding imperative statements (sequential blocks, while, exit) that guarantee the order of execution.
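
As a rough illustration of the Update Facility, the sketch below uses the copy/modify form, which applies the modifications to a copy and leaves the original value untouched. It assumes a processor that implements XQuery Update Facility; the sample element and the names given_name and active are invented for the example. Note that the “replace value” instruction is written replace value of node.

```xquery
(: Hypothetical sample element. :)
let $employee :=
  <employee><name>Dupont</name><firstname>Alphonse</firstname></employee>
return
  (: copy: work on a copy of the element bound to $e :)
  copy $e := $employee
  (: modify: the listed updates are applied together to the copy :)
  modify (
    rename node $e/firstname as "given_name",
    replace value of node $e/name with "Durand",
    insert node <active>true</active> as last into $e
  )
  (: return: the modified copy is the value of the whole expression :)
  return $e
```

On a conforming processor this should yield an employee element whose name is Durand, whose firstname element has been renamed given_name, and which ends with a new active element.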

Language Characteristics

  • XQuery is a functional language (every construct returns a value) and therefore has no side effects: it does not directly modify the data on which it operates. XQuery Scripting is an exception to this principle.
  • Unlike most functional languages, XQuery has no higher-order functions (functions that can be passed as arguments to other functions). This changes in XQuery 1.1.
  • XQuery can optionally be strongly typed (in the sense of XML Schema), both at compile time and at run time.
  • These aspects are shared with XSLT 2.0, a language close to XQuery in its functionality.
  • Programming in XQuery usually follows a more “imperative” style than in XSLT, in that it requires prior knowledge of the structure of the XML data being processed. XSLT, by contrast, declaratively specifies the processing to apply to each kind of XML node, independently of the others. Programming in XQuery is therefore more natural, but somewhat less powerful and modular than in XSLT.
  • The XQuery Scripting extension offers a more traditional programming model.
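
The functional and typed character of the language can be seen in a short sketch: if and typeswitch are expressions that return values, and function signatures carry types. The function local:describe and the sample values are hypothetical, chosen only for illustration.

```xquery
(: Hypothetical helper: typeswitch selects a branch from the dynamic type of its operand. :)
declare function local:describe($item as item()) as xs:string {
  typeswitch ($item)
    case xs:integer return "integer"
    case xs:string return "string"
    case $e as element() return concat("element ", name($e))
    default return "other"
};

for $v in (42, "text", <a/>, 2.5)
(: if is an expression: its value is bound directly to $kind. :)
let $kind := if ($v instance of node()) then "node" else "atomic value"
return concat(local:describe($v), " (", $kind, ")")
```

The query should return four strings: "integer (atomic value)", "string (atomic value)", "element a (node)" and "other (atomic value)", the last because 2.5 is a decimal, which none of the cases matches.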

Data Model

  • All values manipulated by XQuery (as well as by XPath 2.0 and XSLT 2.0) are sequences (or lists) of items. There are no nested sequences: a sequence of sequences is always “flattened”.
  • The items are divided into two main groups:
    • XML nodes, of which the data model defines seven kinds: document, element, attribute, text, namespace, comment and processing instruction.
    • Values of the basic types (borrowed from XML Schema), of which there are 48, including numeric types (integer, decimal, floating-point), strings and their derived types, dates, times and durations.
  • The basic types can be extended by importing schemas (an optional feature).

Examples:

  • The expression 1 to 5 returns the sequence of integer items: 1 2 3 4 5.
  • The expression for $i in 1 to 5 return $i * $i returns the sequence of integer items: 1 4 9 16 25.
  • The expression for $i in 1 to 3 return <X>{$i}</X> returns the sequence of element nodes: <X>1</X> <X>2</X> <X>3</X>.
  • A sequence is not necessarily of a homogeneous type. For example, the expression (1, 2), 2.5, (true(), "text") returns the sequence of items 1 2 2.5 true() "text", whose types are integer (twice), decimal, boolean and character string.
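
The flattening rule and the item types given above can be checked with a short query. This is only a sketch; the variable name $s is arbitrary.

```xquery
(: The nested parentheses disappear: $s is a flat sequence of five items. :)
let $s := ((1, 2), 2.5, (true(), "text"))
return (
  count($s),                               (: number of items after flattening :)
  $s[1] instance of xs:integer,
  $s[3] instance of xs:decimal,
  $s[4] instance of xs:boolean,
  $s[5] instance of xs:string,
  deep-equal(((1, 2), (3)), (1, 2, 3))     (: nested and flat forms are the same value :)
)
```

The query should return the integer 5 followed by five true values.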