Pyramid Elasticsearch Integration

Scott Torborg - Cart Logic

pyramid_es is a pattern and set of utilities for integrating the elasticsearch search engine with a Pyramid web app. It is intended to make it easy to index a set of persisted objects and search those documents inside Pyramid views.

Example Usage

client = get_client(request)
result = client.query(Movie).\
    filter_term('year', 1987).\
    order_by('rating').\
    execute()

Quick Start

Install

Install with pip:

$ pip install pyramid_es

Integrate with a Pyramid App

Include pyramid_es by calling config.include('pyramid_es') or by adding pyramid_es to the pyramid.includes setting.

Configure the following settings:

  • elastic.servers
  • elastic.timeout
  • elastic.index
  • elastic.disable_indexing
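
As a rough illustration of how prefixed settings like these can be gathered, the sketch below collects every key that starts with elastic. from a Pyramid-style settings dict. The helper name extract_prefixed, the example values, and their formats (such as the server address and the index name) are assumptions for illustration only; pyramid_es reads these settings itself through client_from_config().

```python
def extract_prefixed(settings, prefix='elastic.'):
    """Return the settings that start with ``prefix``, with the prefix removed."""
    return {
        key[len(prefix):]: value
        for key, value in settings.items()
        if key.startswith(prefix)
    }


# Hypothetical values: the exact formats pyramid_es expects are not shown here.
settings = {
    'pyramid.includes': 'pyramid_es',
    'elastic.servers': 'localhost:9200',
    'elastic.timeout': '1.0',
    'elastic.index': 'myapp',
    'elastic.disable_indexing': 'false',
}

elastic_options = extract_prefixed(settings)
print(sorted(elastic_options))
```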

Add the Mixin Class to a Model

Add ElasticMixin to a model class. For example:

from pyramid_es.mixin import ElasticMixin

class Article(Base, ElasticMixin):
    ...

Then implement the elastic_mapping() class method:

from pyramid_es.mixin import ElasticMixin, ESMapping, ESString, ESField

class Article(Base, ElasticMixin):
    ...

    @classmethod
    def elastic_mapping(cls):
        return ESMapping(
            analyzer='content',
            properties=ESMapping(
                ESString('title', boost=5.0),
                ESString('body'),
                ESField('pubdate')))

You can customize the exact behavior of the mapping and document creation by adjusting the elastic_mapping(cls) class method and the elastic_document(self) instance method.

Access the Client

To interact with the elasticsearch server, use the client instance maintained by pyramid_es. Retrieve it by passing either the current request or the application registry to get_client():

from pyramid_es import get_client

client = get_client(registry)

All operations (index maintenance, diagnostics, indexing, and querying) are performed via methods on this instance.

Index a Document

After the model class is prepared, index a document with:

client.index_object(article)

This call will create or update the elasticsearch backend state for this model object, so you can simply call it any time the object is created or updated. If the object is deleted, call:

client.delete_object(article)
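
One way to keep these two calls in one place is a small helper that the application invokes whenever a model object is saved or removed. The sketch below is an assumption about how an application might organize this and is not part of pyramid_es; RecordingClient is a stand-in that only records calls, so the example runs without a server.

```python
class RecordingClient:
    """Stand-in for the pyramid_es client; records calls instead of contacting ES."""

    def __init__(self):
        self.calls = []

    def index_object(self, obj):
        self.calls.append(('index', obj))

    def delete_object(self, obj):
        self.calls.append(('delete', obj))


def sync_search_index(client, obj, deleted=False):
    """Mirror a create/update or a delete of ``obj`` into the search index."""
    if deleted:
        client.delete_object(obj)
    else:
        client.index_object(obj)


client = RecordingClient()
article = {'title': 'Introduction'}               # placeholder for a model instance
sync_search_index(client, article)                # object was created or updated
sync_search_index(client, article, deleted=True)  # object was deleted
print(client.calls)
```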

Execute a Search Query

Search queries are built generatively, much like SQLAlchemy queries. For example:

q = client.query(Article)
q = q.filter_term('title', 'Introduction')
q = q.order_by('pubdate', desc=True)
results = q.execute()

for result in results:
    print(result.title, result.pubdate)

To make a keyword search, add the q argument to client.query():

q = client.query(Article, q='kittens')

Calling a query method like .filter_term() or .order_by() returns a new query instance and leaves the original unmodified.
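
Because each call yields a separate query, a base query can be reused safely for several variations. The sketch below imitates that behaviour with a tiny stand-in class (TinyQuery is hypothetical and not part of pyramid_es), so the effect can be seen without a running server.

```python
import copy


class TinyQuery:
    """Hypothetical stand-in that mimics the copy-on-call behaviour of a query."""

    def __init__(self):
        self.filters = []
        self.sorts = []

    def filter_term(self, field, value):
        new = copy.deepcopy(self)      # never mutate the receiver
        new.filters.append({'term': {field: value}})
        return new

    def order_by(self, key, desc=False):
        new = copy.deepcopy(self)
        new.sorts.append({key: 'desc' if desc else 'asc'})
        return new


base = TinyQuery().filter_term('year', 1987)
by_rating = base.order_by('rating')
newest_first = base.order_by('pubdate', desc=True)

print(base.sorts)          # [] : the base query is untouched
print(by_rating.sorts)     # [{'rating': 'asc'}]
```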

You can use query methods to:

  • Add filters on specific fields, range filters, or anything else supported by elasticsearch
  • Sort by fields
  • Add search facets

The Result Object

Calling .execute() on a query issues the query to the backend and returns a special result object. This object behaves similarly to a dict, but supports iteration and a few special properties.

API Reference

Client

pyramid_es.client_from_config(settings, prefix='elastic.')

Instantiate and configure an Elasticsearch client from settings.

pyramid_es.get_client(request)

Get the registered Elasticsearch client. The supplied argument can be either a Request instance or a Registry.

class pyramid_es.client.ElasticClient(servers, index, timeout=1.0, disable_indexing=False)

A handle for interacting with the Elasticsearch backend.

analyze(text, analyzer)

Preview the result of analyzing a block of text using a given analyzer.

delete_index()

Delete the index on the ES server.

delete_mapping(cls)

Delete the mapping corresponding to cls on the server. Does not delete subclass mappings.

ensure_all_mappings(base_class, recreate=False)

Initialize explicit mappings for all subclasses of the specified SQLAlchemy declarative base class.

ensure_index(recreate=False)

Ensure that the index exists on the ES server, and has up-to-date settings.

ensure_mapping(cls, recreate=False)

Put an explicit mapping for the given class if it doesn’t already exist.

get(obj, routing=None)

Retrieve the ES source document for a given object or (document type, id) pair.

get_mappings(cls=None)

Return the object mappings currently used by ES.

index_document(id, doc_type, doc, parent=None)

Add or update the indexed document from a raw document source (not an object).

index_object(obj)

Add or update the indexed document for an object.

index_objects(objects)

Add multiple objects to the index.

query(*classes, **kw)

Return an ElasticQuery against the specified class or classes.

refresh()

Refresh the ES index.

search(body, classes=None, fields=None, **query_params)

Run ES search using default indexes.

subtype_names(cls)

Return a list of document types to query given an object class.
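
To make the idea of querying a class together with its subclasses concrete, here is a minimal sketch that walks a class hierarchy and collects names. It assumes that document type names equal class names, which may not match what pyramid_es does internally; collect_subtype_names and the example classes are hypothetical.

```python
def collect_subtype_names(cls):
    """Return the name of ``cls`` followed by the names of all its subclasses."""
    names = [cls.__name__]
    for sub in cls.__subclasses__():
        names.extend(collect_subtype_names(sub))
    return names


class Media:
    pass


class Movie(Media):
    pass


class Documentary(Movie):
    pass


class Album(Media):
    pass


print(collect_subtype_names(Media))
print(collect_subtype_names(Movie))
```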

Model Mixin

Utility classes intended to make it easier to specify Elastic Search mappings for model objects.

class pyramid_es.mixin.ESField(name, filter=None, attr=None, **kwargs)

A leaf property that doesn’t emit a mapping definition.

This behavior is useful if you want to allow Elastic Search to automatically construct an appropriate mapping while indexing.

class pyramid_es.mixin.ESMapping(*args, **kwargs)

ESMapping defines a tree-like DSL for building Elastic Search mappings.

Calling dict(es_mapping_object) produces an Elastic Search mapping definition appropriate for pyes.

Applying an ESMapping to another object returns an Elastic Search document.
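
The idea of applying a mapping to an object can be pictured as reading one attribute per property and collecting the values into a dict. The sketch below is a simplified stand-in, not the ESMapping implementation: Prop and build_document are hypothetical, and the meaning given to filter (a callable applied to the value) and attr (the attribute to read instead of name) is an assumption based on the constructor signatures.

```python
from datetime import date


class Prop:
    """Hypothetical leaf property: reads ``attr`` from an object, then filters it."""

    def __init__(self, name, filter=None, attr=None):
        self.name = name
        self.filter = filter
        self.attr = attr or name

    def __call__(self, obj):
        value = getattr(obj, self.attr)
        return self.filter(value) if self.filter else value


def build_document(props, obj):
    """Apply every property to ``obj`` and collect the results into a dict."""
    return {prop.name: prop(obj) for prop in props}


class Article:
    def __init__(self, title, body, published):
        self.title = title
        self.body = body
        self.published = published


props = [
    Prop('title'),
    Prop('body', filter=str.strip),
    Prop('pubdate', attr='published', filter=lambda d: d.isoformat()),
]
doc = build_document(props, Article('Intro', '  Hello  ', date(2014, 1, 2)))
print(doc)
```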

properties

Return the dictionary {name: property, ...} describing the top-level properties in this mapping, or None if this mapping is a leaf.

update(m)

Return a copy of the current mapping merged with the properties of another mapping. update merges just one level of hierarchy and uses simple assignment below that.
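
The one-level merge can be illustrated with plain dicts. This is a sketch of one reading of the description (top-level names are combined, and a name present in both mappings takes the value from the other mapping wholesale); merge_one_level and the sample field definitions are hypothetical.

```python
def merge_one_level(base, other):
    """Return a new dict: top-level keys are unioned, values are assigned as-is."""
    merged = dict(base)      # shallow copy, so ``base`` is not modified
    merged.update(other)     # keys in ``other`` replace those in ``base`` wholesale
    return merged


base = {
    'title': {'type': 'string', 'boost': 5.0},
    'author': {'properties': {'name': {'type': 'string'},
                              'email': {'type': 'string'}}},
}
other = {
    'author': {'properties': {'name': {'type': 'string',
                                       'index': 'not_analyzed'}}},
    'pubdate': {'type': 'date'},
}

merged = merge_one_level(base, other)
print(sorted(merged))                          # ['author', 'pubdate', 'title']
print(sorted(merged['author']['properties']))  # ['name'] : 'email' was not merged in
```

A deep merge would have kept author.email; the one-level merge replaces the whole author entry instead.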

class pyramid_es.mixin.ESProp(name, filter=None, attr=None, **kwargs)

A leaf property.

class pyramid_es.mixin.ESString(name, **kwargs)

A string property.

class pyramid_es.mixin.ElasticMixin

Mixin for SQLAlchemy classes that use ESMapping.

elastic_document()

Apply the class ES mapping to the current instance.

classmethod elastic_mapping()

Return an ES mapping for the current class. Implementations typically return ESMapping(...).

class pyramid_es.mixin.ElasticParent

Descriptor to return the parent document type of a class or the parent document ID of an instance.

The child class should specify a property:

__elastic_parent__ = ('ParentDocType', 'parent_id_attr')
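
A descriptor can return different values depending on whether it is read from the class or from an instance, which is the behaviour described above. The following is a minimal sketch of that technique, not the ElasticParent source; ParentInfo, Comment, and the parent attribute name are hypothetical.

```python
class ParentInfo:
    """Hypothetical descriptor: doc type on the class, parent ID on an instance."""

    def __get__(self, instance, owner):
        doc_type, id_attr = owner.__elastic_parent__
        if instance is None:
            return doc_type                    # accessed on the class
        return getattr(instance, id_attr)      # accessed on an instance


class Comment:
    __elastic_parent__ = ('Article', 'article_id')
    parent = ParentInfo()

    def __init__(self, article_id):
        self.article_id = article_id


print(Comment.parent)        # Article
print(Comment(42).parent)    # 42
```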

Queries

class pyramid_es.query.ElasticQuery(client, classes=None, q=None)

Represents a query to be issued against the ES backend.

add_facet(*args, **kwargs)

Add a query facet, to return data used for the implementation of faceted search (e.g. returning result counts for given possible sub-queries).

The facet should be supplied as a dict in the format that ES uses for representation.

It is recommended to use the helper methods add_term_facet() or add_range_facet() where possible.

add_range_facet(name, field, ranges)

Add a range facet.

ES will return data about document counts for the top sub-queries (by document count) in which the results are filtered by a given numerical range.

add_term_facet(name, size, field)

Add a term facet.

ES will return data about document counts for the top sub-queries (by document count) in which the results are filtered by a given term.
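
For reference, the dict format that Elasticsearch uses for term and range facets looks roughly like the output of the helpers below. The helper functions are hypothetical and only build plain dicts; whether pyramid_es constructs its facets in exactly this way is an assumption, and the facet names, fields, and ranges are made up for illustration.

```python
def term_facet(name, size, field):
    """Build a terms facet in the Elasticsearch facet JSON format."""
    return {name: {'terms': {'field': field, 'size': size}}}


def range_facet(name, field, ranges):
    """Build a range facet; ``ranges`` is a list of {'from': ..., 'to': ...} dicts."""
    return {name: {'range': {'field': field, 'ranges': ranges}}}


facets = {}
facets.update(term_facet('by_year', 10, 'year'))
facets.update(range_facet('by_rating', 'rating', [
    {'to': 5},                # open lower bound
    {'from': 5, 'to': 8},
    {'from': 8},              # open upper bound
]))
print(facets)
```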

count()

Execute this query to determine the number of documents that would be returned, but do not actually fetch documents. Returns an int.

execute(start=None, size=None, fields=None)

Execute this query and return a result set.

filter_term(*args, **kwargs)

Filter for documents where the field term matches value.

filter_terms(*args, **kwargs)

Filter for documents where the field term matches one of the elements in value (which should be a sequence).

filter_value_lower(*args, **kwargs)

Filter for documents where term is numerically more than lower.

filter_value_upper(*args, **kwargs)

Filter for documents where term is numerically less than upper.
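
In Elasticsearch's JSON filter format, the four filters above correspond roughly to the dicts built below. This is a sketch: the helper names are hypothetical, and the choice of strict bounds (gt and lt rather than gte and lte) is an assumption taken from the wording "more than" and "less than".

```python
def term_filter(field, value):
    """Match documents whose ``field`` contains exactly ``value``."""
    return {'term': {field: value}}


def terms_filter(field, values):
    """Match documents whose ``field`` contains any of ``values``."""
    return {'terms': {field: list(values)}}


def lower_bound_filter(field, lower):
    """Match documents whose ``field`` is greater than ``lower``."""
    return {'range': {field: {'gt': lower}}}


def upper_bound_filter(field, upper):
    """Match documents whose ``field`` is less than ``upper``."""
    return {'range': {field: {'lt': upper}}}


example_filters = [
    term_filter('year', 1987),
    terms_filter('genre', ('comedy', 'drama')),
    lower_bound_filter('rating', 6),
    upper_bound_filter('runtime', 120),
]
for f in example_filters:
    print(f)
```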

limit(*args, **kwargs)

When returning results, stop at document n.

static match_all_query()

Static method to return a filter dict which will match everything. Can be overridden in a subclass to customize behavior.

offset(*args, **kwargs)

When returning results, start at document n.

order_by(*args, **kwargs)

Sort results by the field key. Defaults to ascending order, unless desc is True.

size(*args, **kwargs)

When returning results, stop at document n.

start(*args, **kwargs)

When returning results, start at document n.
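
A common use of a start and size pair is page-based navigation. The sketch below converts a page number into such a pair; it assumes that start is a zero-based offset and that size is a number of documents, neither of which is spelled out above. The helper names are hypothetical.

```python
import math


def page_window(page, per_page):
    """Translate a 1-based page number into (start, size) for a query."""
    if page < 1 or per_page < 1:
        raise ValueError('page and per_page must be positive')
    return (page - 1) * per_page, per_page


def page_count(total, per_page):
    """Number of pages needed to show ``total`` matching documents."""
    return math.ceil(total / per_page)


start, size = page_window(page=3, per_page=20)
print(start, size)             # 40 20
print(page_count(95, 20))      # 5
```

Under those assumptions, the values could be passed along as q.execute(start=start, size=size), with the total property of the result supplying the input to page_count().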

static text_query(phrase, operator='and')

Static method to return a filter dict to match a text search. Can be overridden in a subclass to customize behavior.

pyramid_es.query.filters(f)

A convenience decorator to wrap query methods that are adding filters. To use, simply make a method that returns a filter dict in elasticsearch’s JSON object format.

Apply it inside @generative, that is, list it after @generative in the decorator stack.

pyramid_es.query.generative(f)

A decorator to wrap query methods to make them automatically generative.
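
The interplay of the two decorators can be sketched in a few lines. The code below is a simplified re-implementation for illustration and is not the pyramid_es source: it assumes that the generative wrapper copies the query and returns the copy, and that the filters wrapper appends the returned dict to a list on the query. SketchQuery and filter_prefix are hypothetical.

```python
import copy
import functools


def generative(f):
    """Run ``f`` on a copy of the query and return that copy."""
    @functools.wraps(f)
    def wrapped(self, *args, **kwargs):
        new = copy.deepcopy(self)
        f(new, *args, **kwargs)
        return new
    return wrapped


def filters(f):
    """Append the filter dict returned by ``f`` to the query's filter list."""
    @functools.wraps(f)
    def wrapped(self, *args, **kwargs):
        self.filters.append(f(self, *args, **kwargs))
    return wrapped


class SketchQuery:
    def __init__(self):
        self.filters = []

    @generative          # outermost: copies the query first
    @filters             # innermost: records the returned filter on the copy
    def filter_prefix(self, field, prefix):
        return {'prefix': {field: prefix}}


q1 = SketchQuery()
q2 = q1.filter_prefix('title', 'Intro')
print(q1.filters)   # []
print(q2.filters)   # [{'prefix': {'title': 'Intro'}}]
```

With the order reversed, the filter would be appended to the original query before any copy is made, which is why the decorator order matters.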

class pyramid_es.result.ElasticResult(raw)

Wrapper for an Elasticsearch result set. Provides access to the documents, result aggregate data (like total count), and facets.

facets

Return the facets returned by this search query.

total

Return the total number of docs which would have been matched by this query. Note that this is not necessarily the same as the number of document result records associated with this object, because the query may have a start / size applied.
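
The difference between the total and the number of records actually held can be seen with a small wrapper around a raw search response. This is a sketch, not the ElasticResult implementation: SketchResult is hypothetical, and the raw response shape (an integer under hits.total, records under hits.hits) is assumed from Elasticsearch versions that report the total as a plain integer.

```python
class SketchResult:
    """Hypothetical wrapper around a raw Elasticsearch search response."""

    def __init__(self, raw):
        self.raw = raw

    @property
    def total(self):
        # Number of documents matched by the query, regardless of paging.
        return self.raw['hits']['total']

    @property
    def facets(self):
        return self.raw.get('facets', {})

    def __iter__(self):
        return iter(self.raw['hits']['hits'])

    def __len__(self):
        # Number of records actually included in this response.
        return len(self.raw['hits']['hits'])


raw = {
    'hits': {
        'total': 57,
        'hits': [
            {'_id': '1', '_score': 1.0, '_source': {'title': 'A'}},
            {'_id': '2', '_score': 0.5, '_source': {'title': 'B'}},
        ],
    },
    'facets': {},
}
result = SketchResult(raw)
print(result.total, len(result))   # 57 2
```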

class pyramid_es.result.ElasticResultRecord(raw)

Wrapper for an Elasticsearch result record. Provides access to the indexed document, ES result data (like score), and the mapped object.

Contributing

Patches and suggestions are strongly encouraged! GitHub pull requests are preferred, but other mechanisms of feedback are welcome.

pyramid_es has a comprehensive test suite with 100% line and branch coverage, as reported by the excellent coverage module. To run the tests, run tox from the top level of the repo:

$ tox

This will also ensure that the Sphinx documentation builds correctly, and that there are no PEP8 or Pyflakes warnings in the codebase.

Pull requests should preserve all of these properties.
