The XML interface


Table of Contents

The <results> tag - Query results contents
The <result> tag - A single query result
The <id> tag - Contains the document ID
The <URL> tag - Contains the document URL
The <title> tag - Contains the document title
The <mimetype> tag - Contains the document MIME type
The <coretextid> tag - Contains the document core text id
The <contextlist> tag - Wraps the document context list
The <context> tag - Contains a document context
The <archives> tag - Wraps the document archives list
The <archive> tag - Contains the document archive ID
The <timestamp> tag - Contains the document timestamp
The <servertime> tag - Contains the document server time
The <score> tag - Contains the document score
The <metadatalist> tag - Wraps the document metadata list
The <metadata> tag - Contains a document metadata
The <templatelist> tag - Wraps the document template list
The <template> tag - Contains a document template

The XML interface is a fast and direct way to integrate searchbox into proprietary applications. It consists of HTTP GET requests that return the query results as an XML file and let you retrieve the documents stored by searchbox.
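As a rough illustration of consuming the XML output, the following Python sketch parses a result document with the standard library. The sample XML is invented for this example: it uses only tag names from the table of contents above, and the nesting of <id>, <URL> and <title> inside <result> is an assumption, not a transcript of real searchbox output.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real searchbox output. The nesting is assumed
# from the tag names listed in the table of contents.
SAMPLE = """<results>
  <result>
    <id>17</id>
    <URL>http://example.com/a.html</URL>
    <title>First document</title>
  </result>
  <result>
    <id>23</id>
    <URL>http://example.com/b.html</URL>
    <title>Second document</title>
  </result>
</results>"""


def parse_results(xml_text):
    """Return one dict per <result> element found in the XML text."""
    root = ET.fromstring(xml_text)
    hits = []
    for result in root.iter("result"):
        hits.append({
            # findtext returns None when a child tag is absent,
            # for example when info=none omits the URL and title.
            "id": result.findtext("id"),
            "url": result.findtext("URL"),
            "title": result.findtext("title"),
        })
    return hits


hits = parse_results(SAMPLE)
```

Which child tags are actually present depends on the info parameter described below, so missing fields are left as None rather than treated as errors.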

The URL on which you will operate can be one of these:

Every request must use HTTP authentication, so with some HTTP clients the ENDPOINT will look something like admin:password@localhost:2200.
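For clients that do not accept credentials embedded in the URL, the same information can be sent as a request header. The sketch below assumes that searchbox accepts HTTP Basic authentication (the text only says "HTTP authentication") and reuses the admin, password and localhost:2200 values from the example above. The helper name build_authenticated_request is made up for this illustration.

```python
import base64
import urllib.request


def build_authenticated_request(url, user, password):
    """Return a GET request carrying an HTTP Basic Authorization header.

    Assumption: the server accepts the Basic scheme.
    """
    credentials = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Example endpoint and credentials taken from the text above.
request = build_authenticated_request("http://localhost:2200/", "admin", "password")
# urllib.request.urlopen(request) would send it; it is not called here.
```

Building the request separately from sending it keeps the credentials out of the URL, which also keeps them out of most logs.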

The parameters that follow the ? obey the HTTP GET parameter encoding rules and can be one or more of the following:

alg=X

X is the query string. This parameter is required, but the query string may be empty.

mintime=X

X is the oldest timestamp (expressed as the number of seconds elapsed since 01/01/1970 00:00:00 GMT) allowed for query results; that is, only documents newer than X will be returned.

maxtime=X

X is the newest timestamp (expressed as the number of seconds elapsed since 01/01/1970 00:00:00 GMT) allowed for query results; that is, only documents older than X will be returned.

minscore=X

X is the minimum score that a document must have to be returned by the query.

start=X

X is the position of the first document to be returned, useful for paginating the results.

num=X

X is the number of documents to be returned (default is 10), useful for paginating the results.

info=X

X can be one of the following:

none

means that only the document ID is returned for each query result.

url

means that the document URL is also returned for each query result.

title

means that the document title is also returned for each query result.

context

means that the query context is also returned for each query result.

templatemeta

means that the template metadata is also returned for each query result.

allmeta

means that all metadata is also returned for each query result.

view=X

X can be one of the following:

published

means that all the documents are eligible for searching.

corechanged

means that only the documents that changed in the core text are eligible for searching.

sort=X

X can be one of the following:

standard

means that standard sorting will be used.

relevance

means that relevance sorting will be used.

docscore

means that document score sorting will be used.

newer

means that time sorting will be used (newer documents first).

older

means that time sorting will be used (older documents first).

wS=X

S is a slice id (see slice and dictionary ids for possible values) and X is the integer weight to use for that slice during this query.
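The parameters above can be assembled into a query string with any URL-encoding library. The following Python sketch is one way to do it. The query URL http://localhost:2200/search and the slice id 3 are hypothetical placeholders chosen for this example; substitute the real query URL and a valid slice id. The helper names to_epoch_seconds and build_query_url are likewise made up.

```python
import calendar
from urllib.parse import urlencode


def to_epoch_seconds(year, month, day):
    """Seconds since 01/01/1970 00:00:00 GMT at midnight GMT of the date."""
    return calendar.timegm((year, month, day, 0, 0, 0))


def build_query_url(query_url, query, slice_weights=None, **params):
    """Append encoded GET parameters to query_url.

    query         -> the required alg parameter (may be an empty string)
    slice_weights -> {slice id: integer weight}, emitted as wS=X
    params        -> any of mintime, maxtime, minscore, start, num,
                     info, view, sort
    """
    pairs = [("alg", query)]
    pairs.extend(sorted(params.items()))
    for slice_id, weight in sorted((slice_weights or {}).items()):
        pairs.append((f"w{slice_id}", int(weight)))
    return f"{query_url}?{urlencode(pairs)}"


# Hypothetical query URL and slice id; see the lead-in.
url = build_query_url(
    "http://localhost:2200/search",
    "open source",
    mintime=to_epoch_seconds(2001, 1, 1),
    start=0,
    num=20,
    info="title",
    sort="newer",
    slice_weights={3: 5},
)
```

Sorting the optional parameters only makes the resulting URL deterministic; nothing in the text suggests that searchbox cares about parameter order.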

When accessing the http://ENDPOINT/doc/DOCID URL you can pass an HTTP Accept header in your GET request, specifying the type application/vnd.focuseek-fff to get the document translated into FFF form[6] or text/html to get an HTML approximation of the document. If the request contains no Accept header, the document is returned in its original format. In every case searchbox sets the Content-Type header of the reply to the MIME type of the document it sends back.
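The three variants of the document request differ only in the Accept header, as the following Python sketch shows. The endpoint localhost:2200 and the document ID 42 are placeholder values, and build_document_request is a name invented for this example. Authentication, which is required as described earlier, is left out to keep the sketch short.

```python
import urllib.request

FFF_MIME_TYPE = "application/vnd.focuseek-fff"
HTML_MIME_TYPE = "text/html"


def build_document_request(endpoint, doc_id, accept=None):
    """Build a GET request for http://ENDPOINT/doc/DOCID.

    accept=None sends no Accept header, which asks for the document
    in its original format.
    """
    url = f"http://{endpoint}/doc/{doc_id}"
    headers = {"Accept": accept} if accept else {}
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder endpoint and document ID.
original = build_document_request("localhost:2200", 42)
as_fff = build_document_request("localhost:2200", 42, accept=FFF_MIME_TYPE)
as_html = build_document_request("localhost:2200", 42, accept=HTML_MIME_TYPE)
# urllib.request.urlopen(as_html) would send the request; not called here.
```

After sending a request with urllib.request.urlopen, response.headers.get_content_type() returns the MIME type from the Content-Type header, which tells the caller which form of the document came back.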