zebrasrv

Name
zebrasrv - Zebra Server

Synopsis
DESCRIPTION
Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It reads structured records in a variety of input formats (e.g. email, XML, MARC) and allows access to them through exact boolean search expressions and relevance-ranked free-text queries.

zebrasrv is the Z39.50 and SRU frontend server for the Zebra search engine and indexer.

On Unix you can run the zebrasrv server from the command line and put it in the background. It may also operate under the inet daemon. On WIN32 you can run the server as a console application or as a WIN32 Service.

OPTIONS
The options for zebrasrv are the same as those for YAZ' yaz-ztest.
Option
A
For TCP, an address has the form
hostname | IP-number [: portnumber]
The port number defaults to 210 (standard Z39.50 port) for privileged users (root), and 9999 for normal users. The special hostname "@" is mapped to the address INADDR_ANY, which causes the server to listen on any local interface.
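The address rules above can be sketched in a few lines of Python. This is only an illustration of the documented form, not zebrasrv's own parser: the function name `parse_tcp_address` and the `privileged` flag are hypothetical, and the sketch handles only the bare `hostname | IP-number [: portnumber]` form (no `tcp:` or `ssl:` prefix, no IPv6 literals).

```python
def parse_tcp_address(address, privileged=False):
    """Split 'hostname | IP-number [: portnumber]' into (host, port).

    Hypothetical helper for illustration; not part of Zebra or YAZ.
    """
    host, sep, port_text = address.partition(":")
    if sep:
        port = int(port_text)
    else:
        # Defaults described in the text: 210 for root, 9999 otherwise.
        port = 210 if privileged else 9999
    # "@" stands for INADDR_ANY: listen on any local interface.
    if host == "@":
        host = "0.0.0.0"
    return host, port


print(parse_tcp_address("@"))
print(parse_tcp_address("@", privileged=True))
print(parse_tcp_address("some.server.name.org:1234"))
```

The point of the sketch is that the port is optional and that its default depends on who starts the server, which is why the same `@` address can end up on two different ports.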
The default behavior for
zebrasrv @
zebrasrv tcp:some.server.name.org:1234
zebrasrv ssl:@:3000
To start the server listening on the registered port for Z39.50, or on a filesystem socket, and to drop root privileges once the ports are bound, execute the server like this from a root shell:
zebrasrv -u daemon @
zebrasrv -u daemon tcp:@:210
zebrasrv -u daemon unix:/some/file/system/socket
Here

Z39.50 Protocol Support and Behavior

Z39.50 Initialization
During initialization, the server will negotiate to version 3 of the Z39.50 protocol, and the option bits for Search, Present, Scan, NamedResultSets, and concurrentOperations will be set, if requested by the client. The maximum PDU size is negotiated down to a maximum of 1 MB by default.

Z39.50 Search
The supported query types are 1 and 101. All operators are currently supported, with the restriction that only proximity units of type "word" are supported for the proximity operator. Queries can be arbitrarily complex. Named result sets are supported, and result sets can be used as operands without limitations. Searches may span multiple databases.

The server has full support for piggy-backed retrieval (see also the following section).

Z39.50 Present
The present facility is supported in a standard fashion. The requested record syntax is matched against the ones supported by the profile of each record retrieved. If no record syntax is given, SUTRS is the default. The requested element set name, again, is matched against any provided by the relevant record profiles.

Z39.50 Scan
The attribute combinations provided with the termListAndStartPoint are processed in the same way as operands in a query (see above). Currently, only the term and the globalOccurrences are returned with the termInfo structure.

Z39.50 Sort
Z39.50 specifies three different types of sort criteria. Of these, Zebra supports the attribute specification type, in which case the use attribute specifies the "Sort register". Sort registers are created for those fields that are of type "sort" in the default.idx file. The corresponding character mapping file in default.idx specifies the ordinal of each character used in the actual sort.

Z39.50 allows the client to specify sorting on one or more input result sets and one output result set. Zebra supports sorting on one result set only, which may or may not be the same as the output result set.
Z39.50 Close
If a Close PDU is received, the server will respond with a Close PDU with reason=FINISHED, no matter which protocol version was negotiated during initialization. If the protocol version is 3 or more, the server will generate a Close PDU under certain circumstances, including a session timeout (60 minutes by default), and certain kinds of protocol errors. Once a Close PDU has been sent, the protocol association is considered broken, and the transport connection will be closed immediately upon receipt of further data, or following a short timeout.

Z39.50 Explain
Zebra maintains a "classic" Z39.50 Explain database on the side.
This database is called
The records in the explain database are of type
Note
Zebra must be able to locate
The SRU Server
In addition to Z39.50, Zebra supports the more recent and
web-friendly IR protocol SRU.
SRU can be carried over SOAP or a REST-like protocol
that uses HTTP GET or POST to request search responses. The request
itself is made of parameters such as
Zebra supports Z39.50, SRU GET, SRU POST and SRU SOAP (SRW) on the same port, recognising which protocol is used by each incoming request and handling it accordingly. This is achieved through the use of Deep Magic; civilians are warned not to stand too close.

Running zebrasrv as an SRU Server
Because Zebra supports all protocols on one port, it would seem to follow that the SRU server is run in the same way as the Z39.50 server, as described above. This is true, but only in an uninterestingly vacuous way: a Zebra server run in this manner will indeed recognise and accept SRU requests; but since it doesn't know how to handle the CQL queries that these protocols use, all it can do is send failure responses.

Note
It is possible to cheat, by having SRU search Zebra with
a PQF query instead of CQL, using the
http://localhost:9999/Default?version=1.1
&operation=searchRetrieve
&x-pquery=mineral
&startRecord=1
&maximumRecords=1
This will display the XML-formatted SRU response that includes the
first record in the result-set found by the query
In order to turn on Zebra's support for CQL queries, it's necessary
to have the YAZ generic front-end (which Zebra uses) translate them
into the Z39.50 Type-1 query format that is used internally. And
to do this, the generic front-end's own configuration file must be
used. See the section called “YAZ server virtual hosts”;
the salient point for SRU support is that
zebrasrv
must be started with the
A minimal front-end configuration file that does this would read as follows:
<yazgfs>
  <server>
    <config>zebra.cfg</config>
    <cql2rpn>../../tab/pqf.properties</cql2rpn>
  </server>
</yazgfs>
The
A Zebra server running with such a configuration can then be queried using proper, conformant SRU URLs with CQL queries:
http://localhost:9999/Default?version=1.1
&operation=searchRetrieve
&query=title=utah and description=epicent*
&startRecord=1
&maximumRecords=1
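The URL above is shown unencoded and split over several lines for readability. When issuing the request from a program, the CQL query has to be percent-encoded. The following is a minimal sketch using only the Python standard library; the helper name `sru_search_url` is hypothetical, and the base URL is simply the one used in the example above.

```python
from urllib.parse import quote, urlencode


def sru_search_url(base, query, start_record=1, maximum_records=1):
    """Build an SRU 1.1 searchRetrieve URL (illustrative helper)."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return base + "?" + urlencode(params, quote_via=quote)


url = sru_search_url(
    "http://localhost:9999/Default",
    "title=utah and description=epicent*",
)
print(url)
```

Building the query string from a dictionary keeps the CQL text readable in the source while leaving the escaping of spaces, `=` and `*` to the library.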
SRU Protocol Support and Behavior
Zebra running as an SRU server supports SRU version 1.1, including CQL version 1.1. In particular, it provides support for the following elements of the protocol.

SRU Search and Retrieval
Zebra supports the SRU searchRetrieve operation.
One of the great strengths of SRU is that it mandates a standard
query language, CQL, and that all conforming implementations can
therefore be trusted to correctly interpret the same queries. It
is with some shame, then, that we admit that Zebra also supports
an additional query language, our own Prefix Query Format (PQF).
A PQF query is submitted by using the extension parameter
SRU Scan
Zebra supports the SRU scan operation.
Scanning using CQL syntax is the default, where the
standard
In addition, a
mutant form of SRU scan is supported, using
the non-standard

SRU Explain
Zebra supports SRU explain.
The ZeeRex record explaining a database may be requested either
with a fully fledged SRU request (with
Unfortunately, the data found in the CQL-to-PQF text file must be added by hand to the explain section of the YAZ Frontend Server configuration file in order to provide a suitable explain record. Too bad, but this is all very new alpha-quality stuff, and a lot of work has yet to be done.

There is no linkage whatsoever between the Z39.50 explain model and the SRU explain response (at least, none is implemented in Zebra). Zebra does not provide a means of obtaining the ZeeRex record using Z39.50.

Other SRU operations
In the Z39.50 protocol, Initialization, Present, Sort and Close are separate operations. In SRU, however, these operations do not exist.
It can be seen, then, that while Zebra operating as an SRU server does not provide the same set of operations as when operating as a Z39.50 server, it does provide equivalent functionality.

SRU Examples
Surf into
http://localhost:9999/?version=1.1&operation=explain
See number of hits for a query
http://localhost:9999/?version=1.1&operation=searchRetrieve
&query=text=(plant%20and%20soil)
Fetch records 5 and 6 in Dublin Core format
http://localhost:9999/?version=1.1&operation=searchRetrieve
&query=text=(plant%20and%20soil)
&startRecord=5&maximumRecords=2&recordSchema=dc
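In SRU, startRecord is the 1-based position of the first record wanted and maximumRecords is a count, not an end position. The arithmetic is easy to get wrong by one, so here is a small sketch; both function names are hypothetical and exist only for illustration.

```python
def record_window(first, last):
    """Return (startRecord, maximumRecords) covering records first..last inclusive."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return first, last - first + 1


def records_returned(start_record, maximum_records):
    """List the record positions a request can return at most."""
    return list(range(start_record, start_record + maximum_records))


print(record_window(5, 7))      # parameters needed for records 5 to 7
print(records_returned(5, 2))   # positions covered by startRecord=5, maximumRecords=2
```

The server may return fewer records than maximumRecords if the result set is smaller, so the second function gives an upper bound only.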
Even search using PQF queries using the
extended naughty
parameter
http://localhost:9999/?version=1.1&operation=searchRetrieve
&x-pquery=@attr%201=text%20@and%20plant%20soil
Or scan indexes using the
extended extremely naughty
parameter
http://localhost:9999/?version=1.1&operation=scan
&x-pScanClause=@attr%201=text%20something
Don't do this in production code! But it's a great fast debugging aid.

YAZ server virtual hosts
The virtual hosts mechanism allows a YAZ frontend server to support multiple backends. A backend is selected on the basis of the TCP/IP binding (port and listening address) and/or the virtual host.

A backend can be configured to execute in a particular working directory. The YAZ frontend may also perform CQL to RPN conversion, thus allowing traditional Z39.50 backends to be offered as an SRU service. SRU Explain information for a particular backend may also be specified.

For the HTTP protocol, the virtual host is specified in the Host header. For the Z39.50 protocol, the virtual host is specified in the Initialize Request in the OtherInfo, OID 1.2.840.10003.10.1000.81.1.

Note
Not all Z39.50 clients allow the VHOST information to be set. For those, the selection of the backend must rely on the TCP/IP information alone (port and address).
The YAZ frontend server uses XML to describe the backend
configurations. Command-line option
The configuration uses the root element
The
Note
We expect more information to be added for the listen section in a future version, such as a CERT file for SSL servers.
The
The XML below configures a server that accepts connections from
two ports, TCP/IP port 9900 and a local UNIX file socket.
We name the TCP/IP server
<yazgfs>
  <listen id="public">tcp:@:9900</listen>
  <listen id="internal">unix:/var/tmp/socket</listen>
  <server id="server1">
    <host>server1.mydomain</host>
    <directory>/var/www/s1</directory>
    <config>config.cfg</config>
  </server>
  <server id="server2">
    <host>server2.mydomain</host>
    <directory>/var/www/s2</directory>
    <config>config.cfg</config>
    <cql2rpn>../etc/pqf.properties</cql2rpn>
    <explain xmlns="http://explain.z3950.org/dtd/2.0/">
      <serverInfo>
        <host>server2.mydomain</host>
        <port>9900</port>
        <database>a</database>
      </serverInfo>
    </explain>
  </server>
  <server id="server3" listenref="internal">
    <directory>/var/www/s3</directory>
    <config>config.cfg</config>
  </server>
</yazgfs>
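To check what such a file declares before starting the server, the configuration can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a condensed copy of the example (the explain element is left out to keep it short). The `summarize` function is hypothetical and only lists what is declared; it does not reproduce how the YAZ frontend actually selects a backend.

```python
import xml.etree.ElementTree as ET

# Condensed copy of the example configuration (explain section omitted).
CONFIG = """<yazgfs>
  <listen id="public">tcp:@:9900</listen>
  <listen id="internal">unix:/var/tmp/socket</listen>
  <server id="server1">
    <host>server1.mydomain</host>
    <directory>/var/www/s1</directory>
    <config>config.cfg</config>
  </server>
  <server id="server2">
    <host>server2.mydomain</host>
    <directory>/var/www/s2</directory>
    <config>config.cfg</config>
    <cql2rpn>../etc/pqf.properties</cql2rpn>
  </server>
  <server id="server3" listenref="internal">
    <directory>/var/www/s3</directory>
    <config>config.cfg</config>
  </server>
</yazgfs>"""


def summarize(config_xml):
    """Return (listeners, servers) as declared in a yazgfs document."""
    root = ET.fromstring(config_xml)
    listeners = {el.get("id"): el.text.strip() for el in root.findall("listen")}
    servers = []
    for el in root.findall("server"):
        servers.append({
            "id": el.get("id"),
            "host": el.findtext("host"),          # None if no <host> child
            "listenref": el.get("listenref"),     # None if not restricted
            "cql": el.find("cql2rpn") is not None,
        })
    return listeners, servers


listeners, servers = summarize(CONFIG)
for server in servers:
    print(server)
```

A server entry with `cql` set to true is one for which the frontend has a CQL to RPN mapping configured, which the text above describes as the prerequisite for answering CQL queries over SRU.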
There are three configured backend servers. The first two
servers,
For
The third server,
Copyright Index Data ApS 2008