Chapter 7. Supporting Tools

In support of the service API, primarily the ASN module, which provides the programmatic interface to the Z39.50 APDUs, YAZ contains a collection of tools that support the development of applications.
Since the type-1 (RPN) query structure has no direct, useful string
representation, every origin application needs to provide some form of
mapping from a local query notation or representation to a
Z_RPNQuery structure. Some programmers will prefer to
construct the query manually, perhaps using
Since RPN or reverse polish notation is really just a fancy way of describing a suffix notation format (operator follows operands), it would seem that the confusion is total when we now introduce a prefix notation for RPN. The reason is one of simple laziness: it's somewhat simpler to interpret a prefix format, and this utility was designed for maximum simplicity, to provide a baseline representation for use in simple test applications and scripting environments (like Tcl). The demonstration client included with YAZ uses the PQF.

Note: The PQF has been adopted by other parties developing Z39.50 software. It is often referred to as Prefix Query Notation (PQN).

The PQF is defined by the pquery module in the YAZ library. There are two sets of functions that have similar behavior. The first set operates on a PQF parser handle; the second set doesn't. The first set of functions is more flexible than the second. The second set is obsolete and is only provided to ensure backwards compatibility.

The first set of functions all operate on a PQF parser handle:
#include <yaz/pquery.h>
YAZ_PQF_Parser yaz_pqf_create (void);
void yaz_pqf_destroy (YAZ_PQF_Parser p);
Z_RPNQuery *yaz_pqf_parse (YAZ_PQF_Parser p, ODR o, const char *qbuf);
Z_AttributesPlusTerm *yaz_pqf_scan (YAZ_PQF_Parser p, ODR o,
Odr_oid **attributeSetId, const char *qbuf);
int yaz_pqf_error (YAZ_PQF_Parser p, const char **msg, size_t *off);
A PQF parser is created and destructed by functions
The second set of functions are declared as follows:
#include <yaz/pquery.h>
Z_RPNQuery *p_query_rpn (ODR o, oid_proto proto, const char *qbuf);
Z_AttributesPlusTerm *p_query_scan (ODR o, oid_proto proto,
Odr_oid **attributeSetP, const char *qbuf);
int p_query_attset (const char *arg);
The function
If the parse went well,
The grammar of the PQF is as follows:
You will note that the syntax above is a fairly faithful representation of RPN, except for the Attribute, which has been moved a step away from the term, allowing you to associate one or more attributes with an entire query structure. The parser will automatically apply the given attributes to each term as required.
The @attr operator is followed by an attribute specification
(
Version 3 of the Z39.50 specification defines various encodings of terms.
Use

Note: This is an advanced topic, describing how to construct queries that make very specific requirements on the relative location of their operands. You may wish to skip this section and go straight to the example PQF queries.

Warning: Most Z39.50 servers do not support proximity searching, or support only a small subset of the full functionality that can be expressed using the PQF proximity operator. Be aware that the ability to express a query in PQF is no guarantee that any given server will be able to execute it.

The proximity operator

In PQF, the proximity operation is represented by a sequence of the form
@prox
in which the meanings of the parameters are as described in the standard, and they can take the following values:
(The numeric values of the relation and well-known unit-code parameters are taken straight from the ASN.1 of the proximity structure in the standard.)

Example 7.2. PQF boolean operators
@or "dylan" "zimmerman"
@and @or dylan zimmerman when
@and when @or dylan zimmerman
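To make the prefix form of the boolean examples above easier to read, the following is a minimal sketch that rewrites such a query as a parenthesised infix string. It is not the YAZ pquery parser: the helper names (`next_token`, `append`, `pqf_bool_to_infix`) are hypothetical, and it only understands `@and`, `@or`, `@not` and terms without embedded blanks.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of YAZ): copy the next blank-separated
   token from *s into tok, keeping at most max-1 characters.
   Returns 0 when no token is left. */
static int next_token(const char **s, char *tok, size_t max)
{
    size_t n = 0;
    while (**s == ' ')
        (*s)++;
    if (**s == '\0')
        return 0;
    while (**s != '\0' && **s != ' ') {
        if (n + 1 < max)
            tok[n++] = **s;
        (*s)++;
    }
    tok[n] = '\0';
    return 1;
}

/* Append src to the NUL-terminated buffer out; returns 0 if it does not fit. */
static int append(char *out, size_t max, const char *src)
{
    size_t used = strlen(out), len = strlen(src);
    if (used + len + 1 > max)
        return 0;
    memcpy(out + used, src, len + 1);
    return 1;
}

/* Recursive descent over: query ::= operator query query | term */
static int to_infix(const char **s, char *out, size_t max)
{
    char tok[64];
    if (!next_token(s, tok, sizeof tok))
        return 0;                       /* an operand is missing */
    if (tok[0] != '@')
        return append(out, max, tok);   /* plain term */
    if (strcmp(tok, "@and") && strcmp(tok, "@or") && strcmp(tok, "@not"))
        return 0;                       /* operator not handled by this sketch */
    return append(out, max, "(") && to_infix(s, out, max)
        && append(out, max, " ") && append(out, max, tok + 1)
        && append(out, max, " ") && to_infix(s, out, max)
        && append(out, max, ")");
}

/* Returns 1 and fills out on success, 0 on a malformed or oversized query. */
int pqf_bool_to_infix(const char *pqf, char *out, size_t max)
{
    char extra[2];
    assert(pqf && out && max > 0);
    out[0] = '\0';
    if (!to_infix(&pqf, out, max))
        return 0;
    return !next_token(&pqf, extra, sizeof extra);  /* reject trailing tokens */
}
```

Called with `@and @or dylan zimmerman when`, the sketch yields `((dylan or zimmerman) and when)`, which shows how each operator consumes the two sub-queries that follow it.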
Example 7.4. Attributes for terms
@attr 1=4 computer
@attr 1=4 @attr 4=1 "self portrait"
@attrset exp1 @attr 1=1 CategoryList
@attr gils 1=2008 Copenhagen
@attr 1=/book/title computer
Example 7.5. PQF Proximity queries
@prox 0 3 1 2 k 2 dylan zimmerman
Note: Here the parameters 0, 3, 1, 2, k and 2 represent exclusion, distance, ordered, relation, which-code and unit-code, in that order. So:
So the whole proximity query means that the words
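The parameter order given in the note for Example 7.5 can be illustrated with a small sketch that splits the six values into named fields. The struct and function names are hypothetical and this is not how YAZ parses `@prox`; it assumes the purely numeric, single-letter form used in the example and does not validate the values.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the six @prox parameters, in PQF order. */
struct prox_params {
    int exclusion;
    int distance;
    int ordered;
    int relation;
    char which_code;   /* single letter, 'k' in Example 7.5 */
    int unit_code;
};

/* Returns 1 if all six parameters were read, 0 otherwise.
   The operands that follow the parameters are left unparsed. */
int parse_prox_params(const char *s, struct prox_params *p)
{
    int n;
    assert(s && p);
    n = sscanf(s, "@prox %d %d %d %d %c %d",
               &p->exclusion, &p->distance, &p->ordered,
               &p->relation, &p->which_code, &p->unit_code);
    return n == 6;
}
```

For `@prox 0 3 1 2 k 2 dylan zimmerman` this fills exclusion=0, distance=3, ordered=1, relation=2, which-code=k and unit-code=2.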
Example 7.7. PQF mixed queries
@or @and bob dylan @set Result-1
@attr 4=1 @and @attr 1=1 "bob dylan" @attr 1=4 "slow train coming"
@and @attr 2=4 @attr gils 1=2038 -114 @attr 2=2 @attr gils 1=2039 -109
Note: The last of these examples is a spatial search: in the GILS attribute set, access point 2038 indicates West Bounding Coordinate and 2039 indicates East Bounding Coordinate, so the query is for areas extending from -114 degrees to no more than -109 degrees.
Not all users enjoy typing in prefix query structures and numerical attribute values, even in a minimalistic test client. In the library world, the more intuitive Common Command Language - CCL (ISO 8777) has enjoyed some popularity - especially before the widespread availability of graphical interfaces. It is still useful in applications where you for some reason or other need to provide a symbolic language for expressing boolean query structures.
The CCL parser obeys the following grammar for the FIND argument.
The syntax is annotated by the lines prefixed by --.
CCL-Find ::= CCL-Find Op Elements
| Elements.
Op ::= "and" | "or" | "not"
-- The above means that Elements are separated by boolean operators.
Elements ::= '(' CCL-Find ')'
| Set
| Terms
| Qualifiers Relation Terms
| Qualifiers Relation '(' CCL-Find ')'
| Qualifiers '=' string '-' string
-- Elements is either a recursive definition, a result set reference, a
-- list of terms, qualifiers followed by terms, qualifiers followed
-- by a recursive definition or qualifiers in a range (lower - upper).
Set ::= 'set' = string
-- Reference to a result set
Terms ::= Terms Prox Term
| Term
-- Proximity of terms.
Term ::= Term string
| string
-- This basically means that a term may include a blank
Qualifiers ::= Qualifiers ',' string
| string
-- Qualifiers is a list of strings separated by comma
Relation ::= '=' | '>=' | '<=' | '<>' | '>' | '<'
-- Relational operators. This really doesn't follow the ISO8777
-- standard.
Prox ::= '%' | '!'
-- Proximity operator
Example 7.8. CCL queries

The following queries are all valid:
dylan
"bob dylan"
dylan or zimmerman
set=1
(dylan and bob) or set=1
Assuming that the qualifiers
ti=self portrait
au=(bob dylan and slow train coming)
date>1980 and (ti=((self portrait)))
Qualifiers are used to direct the search to a particular searchable index, such as title (ti) and author indexes (au). The CCL standard itself doesn't specify a particular set of qualifiers, but it does suggest a few short-hand notations. You can customize the CCL parser to support a particular set of qualifiers to reflect the current target profile. Traditionally, a qualifier would map to a particular use-attribute within the BIB-1 attribute set. It is also possible to set other attributes, such as the structure attribute.
A CCL profile is a set of predefined CCL qualifiers that may be
read from a file or set in the CCL API.
The YAZ client reads its CCL qualifiers from a file named
A qualifier specification is of the form:
where
Table 7.1. Common Bib-1 attributes
Refer to the complete list of Bib-1 attributes.

It is also possible to specify non-numeric attribute values, which are used in combination with certain types. The special combinations are:

Table 7.2. Special attribute combos

Example 7.9. CCL profile

Consider the following definition:
ti u=4 s=1
au u=1 s=1
term s=105
ranked r=102
date u=30 r=o
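As a rough illustration of what a profile like this contributes to a query, the sketch below looks up a comma-separated qualifier list in a hard-coded copy of the five definitions and concatenates their attribute specifications. The table and function names are hypothetical; the real CCL parser in YAZ builds an RPN tree rather than a string, so this only shows which attributes end up attached.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical in-memory copy of the profile from Example 7.9. */
struct ccl_qual {
    const char *name;
    const char *attrs;
};

static const struct ccl_qual profile[] = {
    { "ti",     "u=4 s=1"  },
    { "au",     "u=1 s=1"  },
    { "term",   "s=105"    },
    { "ranked", "r=102"    },
    { "date",   "u=30 r=o" },
};

/* Resolve a list such as "ti,ranked" into the combined attribute
   specification. Returns 1 on success, 0 for an unknown qualifier,
   an empty list, or a buffer that is too small. */
int ccl_combine_qualifiers(const char *quals, char *out, size_t max)
{
    assert(quals && out && max > 0);
    out[0] = '\0';
    while (*quals) {
        size_t len = strcspn(quals, ",");   /* length of the next name */
        size_t i, used;
        const char *attrs = NULL;
        for (i = 0; i < sizeof profile / sizeof profile[0]; i++)
            if (strlen(profile[i].name) == len
                && strncmp(profile[i].name, quals, len) == 0)
                attrs = profile[i].attrs;
        if (!attrs)
            return 0;                       /* qualifier not in the profile */
        used = strlen(out);
        if (used + (used ? 1 : 0) + strlen(attrs) + 1 > max)
            return 0;
        if (used)
            out[used++] = ' ';              /* separate attribute groups */
        strcpy(out + used, attrs);
        quals += len;
        if (*quals == ',')
            quals++;
    }
    return out[0] != '\0';
}
```

With this table, `ti,ranked` resolves to `u=4 s=1 r=102`, that is use=title, structure=phrase and relation=ranked.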
You can combine attributes. To search for "ranked title" you can do
ti,ranked=knuth computer
which will set relation=ranked, use=title, structure=phrase.

The query date > 1980 is valid, but ti > 1980 is invalid.

A qualifier alias is of the form:
which declares
Directive specifications take the form
Table 7.3. CCL directives
All public definitions can be found in the header file
To read a file containing qualifier definitions the function
To parse a simple string with a FIND query use the function
struct ccl_rpn_node *ccl_find_str (CCL_bibset bibset, const char *str,
int *error, int *pos);
which takes the CCL profile (
An English representation of the error may be obtained by calling
the
To convert the CCL RPN tree (type
A CCL profile may be destroyed by calling the
The token names for the CCL operators may be changed by setting the
globals (all type

CQL - Common Query Language - was defined for the SRU protocol. In many ways CQL has a similar syntax to CCL. The objective of CQL is different. Where CCL aims to be an end-user language, CQL is the protocol query language for SRU.

Tip: If you are new to CQL, read the Gentle Introduction.

The CQL parser in YAZ provides the following:
A CQL parser is represented by the
#include <yaz/cql.h>
typedef struct cql_parser *CQL_parser;
CQL_parser cql_parser_create(void);
void cql_parser_destroy(CQL_parser cp);
A parser is created by
To parse a CQL query string, the following function is provided:
int cql_parser_string(CQL_parser cp, const char *str);
A CQL query is parsed by the
int cql_parser_stream(CQL_parser cp,
int (*getbyte)(void *client_data),
void (*ungetbyte)(int b, void *client_data),
void *client_data);
int cql_parser_stdio(CQL_parser cp, FILE *f);
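The callback pair taken by cql_parser_stream can be sketched for an in-memory buffer as follows. The struct and function names are hypothetical, and returning 0 at end of input is an assumption of this sketch rather than something stated here; only the two callback signatures are taken from the prototype above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical client_data: a NUL-terminated query and a read offset. */
struct mem_stream {
    const char *buf;
    size_t off;
};

/* Shape of int (*getbyte)(void *client_data).
   Assumption: 0 signals that the input is exhausted. */
int mem_getbyte(void *client_data)
{
    struct mem_stream *ms = client_data;
    assert(ms && ms->buf);
    if (ms->buf[ms->off] == '\0')
        return 0;
    return (unsigned char) ms->buf[ms->off++];
}

/* Shape of void (*ungetbyte)(int b, void *client_data):
   step back one position so the byte is returned again. */
void mem_ungetbyte(int b, void *client_data)
{
    struct mem_stream *ms = client_data;
    assert(ms);
    if (b != 0 && ms->off > 0)
        ms->off--;
}
```

With YAZ linked in, such a pair would be handed over as `cql_parser_stream(cp, mem_getbyte, mem_ungetbyte, &ms)`, where `ms` is a `struct mem_stream` initialised with the query text and offset 0.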
If the query string is valid, the CQL parser generates a tree representing the structure of the CQL query.
struct cql_node *cql_parser_result(CQL_parser cp);
Each node in a CQL tree is represented by a
#define CQL_NODE_ST 1
#define CQL_NODE_BOOL 2
struct cql_node {
int which;
union {
struct {
char *index;
char *index_uri;
char *term;
char *relation;
char *relation_uri;
struct cql_node *modifiers;
} st;
struct {
char *value;
struct cql_node *left;
struct cql_node *right;
struct cql_node *modifiers;
} boolean;
} u;
};
There are two node types: search term (ST) and boolean (BOOL). A modifier is treated as a search term too. The search term node has five members:
The boolean node represents both
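The two node types can be walked recursively. The sketch below repeats the struct definition shown above so that it compiles without the YAZ headers, and renders a tree as a parenthesised string. The function names are hypothetical, modifiers are ignored, and a missing string is printed as `?`.

```c
#include <assert.h>
#include <string.h>

#define CQL_NODE_ST 1
#define CQL_NODE_BOOL 2

/* Same layout as the definition in the text above. */
struct cql_node {
    int which;
    union {
        struct {
            char *index;
            char *index_uri;
            char *term;
            char *relation;
            char *relation_uri;
            struct cql_node *modifiers;
        } st;
        struct {
            char *value;
            struct cql_node *left;
            struct cql_node *right;
            struct cql_node *modifiers;
        } boolean;
    } u;
};

/* Append s (or "?" when s is NULL) to out; returns 0 if it does not fit. */
static int append(char *out, size_t max, const char *s)
{
    size_t used = strlen(out), len;
    if (!s)
        s = "?";
    len = strlen(s);
    if (used + len + 1 > max)
        return 0;
    memcpy(out + used, s, len + 1);
    return 1;
}

static int render(const struct cql_node *cn, char *out, size_t max)
{
    if (!cn)
        return 0;
    switch (cn->which) {
    case CQL_NODE_ST:      /* index relation "term" */
        return append(out, max, cn->u.st.index) && append(out, max, " ")
            && append(out, max, cn->u.st.relation) && append(out, max, " \"")
            && append(out, max, cn->u.st.term) && append(out, max, "\"");
    case CQL_NODE_BOOL:    /* (left operator right) */
        return append(out, max, "(") && render(cn->u.boolean.left, out, max)
            && append(out, max, " ") && append(out, max, cn->u.boolean.value)
            && append(out, max, " ") && render(cn->u.boolean.right, out, max)
            && append(out, max, ")");
    default:
        return 0;          /* unknown node type */
    }
}

/* Returns 1 and fills out on success, 0 on error or truncation. */
int cql_node_to_text(const struct cql_node *cn, char *out, size_t max)
{
    assert(out && max > 0);
    out[0] = '\0';
    return render(cn, out, max);
}
```

In a program linked against YAZ, the tree passed in would come from cql_parser_result and the struct would come from yaz/cql.h instead of being redefined.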
Conversion to PQF (and Z39.50 RPN) is complicated by the fact that the resulting RPN depends on the Z39.50 target capabilities (combinations of supported attributes). In addition, CQL and SRU operate on index prefixes (URI or strings), whereas the RPN uses Object Identifiers for attribute sets.
The CQL library of YAZ defines a
cql_transform_t cql_transform_open_FILE (FILE *f);
cql_transform_t cql_transform_open_fname(const char *fname);
void cql_transform_close(cql_transform_t ct);
The first two functions create a transformation handle from either an already open FILE or from a filename respectively.
The handle is destroyed by
When a
int cql_transform_buf(cql_transform_t ct,
struct cql_node *cn, char *out, int max);
This function converts the CQL tree
If conversion fails, more information can be obtained by calling
int cql_transform_error(cql_transform_t ct, char **addinfop);
This function returns the most recently returned numeric
error-code and sets the string-pointer at
The SRU error-codes may be translated into brief human-readable error messages using
const char *cql_strerror(int code);
If you wish to be able to produce a PQF result in a different way, there are two alternatives.
void cql_transform_pr(cql_transform_t ct,
struct cql_node *cn,
void (*pr)(const char *buf, void *client_data),
void *client_data);
int cql_transform_FILE(cql_transform_t ct,
struct cql_node *cn, FILE *f);
The former function produces output to a user-defined
output stream. The latter writes the result to an already
open
The file supplied to functions
Each line is of the form
An RPN pattern is a simple attribute list. Each attribute pair takes the form:
The attribute
The character
The following CQL patterns are recognized:

Example 7.10. CQL to RPN mapping file

This simple file defines two context sets, three indexes and three relations, a position pattern and a default structure.
set.cql = http://www.loc.gov/zing/cql/context-sets/cql/v1.1/
set.dc = http://www.loc.gov/zing/cql/dc-indexes/v1.0/
index.cql.serverChoice = 1=1016
index.dc.title = 1=4
index.dc.subject = 1=21
relation.< = 2=1
relation.eq = 2=3
relation.scr = 2=3
position.any = 3=3 6=1
structure.* = 4=1
With the mappings above, the CQL query
computer
is converted to the PQF:
@attr 1=1016 @attr 2=3 @attr 4=1 @attr 3=3 @attr 6=1 "computer"
by rules
CQL query
computer^
is rejected, since
CQL query
>my = "http://www.loc.gov/zing/cql/dc-indexes/v1.0/" my.title = x
is converted to
@attr 1=4 @attr 2=3 @attr 4=1 @attr 3=3 @attr 6=1 "x"
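The way the RPN patterns of Example 7.10 end up in the PQF output can be sketched as simple string assembly: each attribute pair in each matching pattern becomes one `@attr`, followed by the quoted term. The function names are hypothetical, the caller is assumed to have already selected the matching patterns, and the index, relation, structure, position order simply mirrors the output shown in this example rather than describing YAZ internals.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append "@attr a=b " for every blank-separated pair in list.
   Returns 0 if out (capacity max) is too small. */
static int append_attrs(char *out, size_t max, const char *list)
{
    while (*list) {
        size_t len, used;
        while (*list == ' ')
            list++;
        len = strcspn(list, " ");
        if (len == 0)
            break;                          /* only trailing blanks left */
        used = strlen(out);
        if (used + 6 + len + 2 > max)       /* "@attr " + pair + ' ' + NUL */
            return 0;
        memcpy(out + used, "@attr ", 6);
        memcpy(out + used + 6, list, len);
        out[used + 6 + len] = ' ';
        out[used + 6 + len + 1] = '\0';
        list += len;
    }
    return 1;
}

/* Build a PQF string from the RPN patterns chosen for one term.
   Returns 1 on success, 0 if the result does not fit. */
int rpn_patterns_to_pqf(const char *term, const char *const *patterns,
                        size_t npatterns, char *out, size_t max)
{
    size_t i, used;
    int n;
    assert(term && patterns && out && max > 0);
    out[0] = '\0';
    for (i = 0; i < npatterns; i++)
        if (!append_attrs(out, max, patterns[i]))
            return 0;
    used = strlen(out);
    n = snprintf(out + used, max - used, "\"%s\"", term);
    return n >= 0 && (size_t) n < max - used;
}
```

Passing the patterns `1=1016`, `2=3`, `4=1` and `3=3 6=1` with the term `computer` reproduces the first PQF string of the example.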
Example 7.11. CQL to RPN string attributes

In this example we allow any index to be passed to RPN as a use attribute.
# Identifiers for prefixes used in this file. (index.*)
set.cql = info:srw/cql-context-set/1/cql-v1.1
set.rpn = http://bogus/rpn
set = http://bogus/rpn
# The default index when none is specified by the query
index.cql.serverChoice = 1=any
index.rpn.* = 1=*
relation.eq = 2=3
structure.* = 4=1
position.any = 3=3
The
title = a
which is converted to
@attr 2=3 @attr 4=1 @attr 3=3 @attr 1=title "a"
Example 7.12. CQL to RPN using Bath Profile
The file
Conversion from CQL to XCQL is trivial and does not require a mapping to be defined. There are three functions to choose from depending on the way you wish to store the resulting output (XML buffer containing XCQL).
int cql_to_xml_buf(struct cql_node *cn, char *out, int max);
void cql_to_xml(struct cql_node *cn,
void (*pr)(const char *buf, void *client_data),
void *client_data);
void cql_to_xml_stdio(struct cql_node *cn, FILE *f);
Function
Copyright Index Data ApS 2008