We play well with others and don’t lock you in. Standards are essential to this philosophy and form the core of our company. Because our software implements formal standards, it interoperates out of the box with the library and research software ecosystems. An approach based on clean API conventions makes it convenient to integrate your selection of components from our stack and provides ample opportunities for novel applications.
Bibliographic Data
Searching: Z39.50, SRU/SRW
If you ever want to search library collections with your software, it is likely that we have some tools that will help. Most collections are accessible through a protocol designed for the task: Z39.50. The YAZ toolkit serves as the basis for most language bindings that follow the ZOOM specification, an API model for Z39.50 that is far more convenient to use than coding against the aging wire protocol. As a bonus, our ZOOM implementation allows you to support Z39.50, the more recent SRU protocol, and the Apache Solr web service API in the same application.
We also offer IRSpy, a catalog of Z39.50 and SRU targets, and we host several targets that expose various types of open content.
Z39.50 is a standard for searching bibliographic metadata (e.g., title, author, publisher). Library catalogs were an early application of digital computing; the protocol's origins date to the 1970s, well before the Internet was in wide use. The ANSI/NISO and ISO standard dates back to 1988 and is maintained by the Library of Congress. It remains prominent despite its advanced age and despite efforts to modernize it, such as SRU/SRW.
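As a rough illustration of what an SRU search looks like on the wire, the sketch below builds a searchRetrieve request URL using only the Python standard library. The parameter names (`version`, `operation`, `query`, `startRecord`, `maximumRecords`) come from the SRU specification; the base URL and the CQL index name `dc.title` are assumptions for the example, since each target advertises its own address and supported indexes.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, cql_query, start_record=1, maximum_records=10,
                  version="1.2"):
    """Return a searchRetrieve URL for an SRU target.

    base_url is hypothetical here; a real one would come from a target
    catalog such as IRSpy.
    """
    params = {
        "version": version,
        "operation": "searchRetrieve",
        "query": cql_query,              # CQL query string
        "startRecord": start_record,     # SRU record positions are 1-based
        "maximumRecords": maximum_records,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical target and index name, for illustration only.
url = build_sru_url("https://sru.example.org/catalog",
                    'dc.title = "open source"')
print(url)
```

The response is an XML document containing the matching records, so the same request works from any HTTP client. A ZOOM binding hides these details and also covers Z39.50 targets.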
A voting NISO member, Index Data has contributed to the development and maintenance of key metadata standards for nearly two decades.
Records: MARC, Dublin Core, XML
Our software has extensive support for popular data formats, including legacy versions of MARC and any schema based on XML. Our middleware normalizes incoming data into Unicode and XML to enable powerful applications based on international, heterogeneous data sources.
Harvesting: OAI–PMH
To complement the local index feature of the MasterKey suite, we have a Harvester component that indexes data providers and periodically checks for updated information. Rather than fetch everything each time or crawl through looking for time stamps where available, many resources offer a map to updated content via OAI–PMH (Open Archives Initiative Protocol for Metadata Harvesting). It’s commonly used in library/research communities, and you can find a list of providers at the OAI site. Other services that allow for harvesting of their data have also adopted this protocol, for example, Wikipedia.
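To make the incremental harvesting idea concrete, here is a minimal sketch of an OAI–PMH client step in Python. It is not our Harvester's implementation. The `ListRecords` verb and the `metadataPrefix`, `from`, and `resumptionToken` arguments are defined by the OAI–PMH 2.0 specification; the repository URL, the identifiers, and the sample response are made up for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     resumption_token=None):
    """Build a ListRecords request, optionally limited to recent changes."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: only the verb goes with it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # only records changed since this date
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Made-up response standing in for what a repository would return.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://repo.example.org/oai", from_date="2024-01-01")
ids, token = parse_page(SAMPLE)
next_url = list_records_url("https://repo.example.org/oai", resumption_token=token)
print(first_url, ids, next_url, sep="\n")
```

A harvester would repeat the second request until the repository returns an empty or missing resumption token, then store the date of the run to use as `from` next time.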
OpenURL
We build outgoing OpenURLs into our search interfaces so they integrate smoothly with third-party resolvers. In addition, our federated search engine can be used as a free-standing resolver that maps incoming OpenURLs to queries against relevant subscription resources, as an alternative to conventional, maintenance-heavy resolver applications.
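For readers unfamiliar with the format, the sketch below shows what an outgoing OpenURL can look like, and how the citation fields can be read back out of an incoming one. It uses the key/encoded-value form of OpenURL 1.0 (Z39.88-2004) with the journal metadata format. The resolver address and the citation are invented, and the helper names are hypothetical rather than part of our software.

```python
from urllib.parse import parse_qsl, urlencode, urlparse


def build_openurl(resolver_url, **citation):
    """Build a journal-article OpenURL; citation keys become rft.* fields."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    for key, value in citation.items():
        params[f"rft.{key}"] = value
    return f"{resolver_url}?{urlencode(params)}"


def referent_fields(openurl):
    """Extract the rft.* citation fields from an incoming OpenURL."""
    pairs = parse_qsl(urlparse(openurl).query)
    return {k[len("rft."):]: v for k, v in pairs if k.startswith("rft.")}


# Invented resolver and citation, for illustration only.
link = build_openurl(
    "https://resolver.example.org/openurl",
    genre="article",
    atitle="An Example Article",
    jtitle="Journal of Examples",
    issn="1234-5678",
    volume="12",
    spage="34",
)
print(link)
print(referent_fields(link))
```

A resolver built this way would take the extracted fields (ISSN, volume, start page, and so on) and turn them into a query against each candidate resource.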
Our APIs follow convention
JSON / XML over HTTP
Rather than inventing config formats and wire protocols, we aim to use XML and JSON for almost everything. XML gives versatility with its expressive structure and vast collection of tools. One of our favorites is XSLT which we use to good effect with MARCXML and the myriad of other document types that we come across in enterprise computing.
We’ve come to embrace JSON. Along with many of the software cognoscenti, we believe the simplicity and performance it brings are valuable and, in many circumstances, more than sufficient. Even so, XML is often the right tool for the job and isn’t jealous: it’s an understanding markup standard.
Metasearch
Our open source metasearcher, Pazpar2, and its enterprise counterpart, MasterKey, share a common web service API for exchanging queries and results with a front-end as well as communicating the current status of the search. This abstraction greatly eases implementation compared with dealing with the underlying protocols directly, and even for a single target it may be the most straightforward way to integrate a library catalog search into your application. The technology shines, however, when you need to bring many disparate resources together in one place; functionality includes merging, relevance ranking/sorting, and facet analysis.
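The typical request sequence against this web service can be sketched as follows. The command names (`init`, `search`, `show`, `termlist`) and the `session`, `query`, `start`, `num`, `sort`, and `name` parameters follow the Pazpar2 protocol documentation as we understand it; the host, port, sample session number, and facet name are assumptions, so check them against your own deployment.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def command_url(base_url, command, **params):
    """Build a Pazpar2 web service request for the given command."""
    return f"{base_url}?{urlencode({'command': command, **params})}"


def session_id(init_response_xml):
    """Pull the session identifier out of an init response."""
    return ET.fromstring(init_response_xml).findtext("session")


# Assumed local instance and a canned init response, for illustration only.
BASE = "http://localhost:8004/search.pz2"
INIT_SAMPLE = "<init><status>OK</status><session>2044502273</session></init>"

sid = session_id(INIT_SAMPLE)
steps = [
    command_url(BASE, "init"),                                    # open a session
    command_url(BASE, "search", session=sid, query="computer science"),
    command_url(BASE, "show", session=sid, start=0, num=20,
                sort="relevance"),                                # merged, ranked hits
    command_url(BASE, "termlist", session=sid, name="subject"),   # facet values
]
for step in steps:
    print(step)
```

Because searching is asynchronous across targets, a front-end would normally repeat the `show` and `termlist` requests while results are still arriving.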
Screen scraping / web automation
Our flexible screen scraping framework, the Connector Platform, combines a captive browser engine with a visual development environment and an expressive, action-oriented language to allow fast and efficient development of Connectors: gateways to different resources, whether proprietary APIs or outright HTML scraping. Applications of the technology include metasearching, authentication, library patron functions, and semantically rich web harvesting. In a nutshell, our Connector Platform allows us to create APIs for nearly any purpose and map those APIs against widely different native interfaces.
Industry standard platforms
Our search platform, Pazpar2/MasterKey, exposes its functionality through a simple but powerful XML/HTTP API that can be used directly from an Ajax-based dynamic interface. To simplify application development, however, we have also developed or supported a variety of plug-ins and extensions for popular interface development environments, including the Drupal and TYPO3 CMSes and the JavaServer Faces (JSF) framework.
The Apache Solr enterprise search platform is one of the most popular choices for highly scalable indexing and search applications. Pazpar2 can access Solr-based indexes directly, allowing for near-instant search results for locally indexed resources. Our YAZ ZOOM API implementation and our Metaproxy can also be used to build interfaces to Solr-based indexes, enabling numerous options for integration across remote information retrieval targets and local indexes.