Overview

For mapping we need to define:

  • Which Entities should be mapped to an index
  • For each entity type, what part of the object graph is included in the document
  • Which string properties are really codes and should not be analysed
  • Which string properties need both analysed and 'raw' fields, for search and sorting
  • Any extra ElasticSearch-specific mapping

@DocStore - Entities to map

We add the @DocStore annotation on each entity we want to map into an ElasticSearch index.

// Store contact in ElasticSearch
@DocStore
@Entity
public class Contact {

By default @DocStore means:

  • @OneToMany and @ManyToMany are NOT included
  • @ManyToOne and @OneToOne include the associated @Id property only
  • All other persistent properties are included in the document
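The three default rules above can be sketched as a small classification function. This is only an illustration of the rules as stated here: DefaultDocRules, Kind and Inclusion are hypothetical names and are not part of the Ebean API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the default @DocStore inclusion rules (not Ebean's internal API).
final class DefaultDocRules {

  /** The kind of persistent property, by its mapping annotation. */
  enum Kind { SCALAR, MANY_TO_ONE, ONE_TO_ONE, ONE_TO_MANY, MANY_TO_MANY }

  /** How much of the property ends up in the document by default. */
  enum Inclusion { FULL, ID_ONLY, EXCLUDED }

  static Inclusion inclusionFor(Kind kind) {
    switch (kind) {
      case ONE_TO_MANY:
      case MANY_TO_MANY:
        return Inclusion.EXCLUDED; // collections are not included by default
      case MANY_TO_ONE:
      case ONE_TO_ONE:
        return Inclusion.ID_ONLY;  // only the associated @Id property
      default:
        return Inclusion.FULL;     // all other persistent properties
    }
  }

  public static void main(String[] args) {
    // Assumed example properties for a Contact bean
    Map<String, Kind> contact = new LinkedHashMap<>();
    contact.put("firstName", Kind.SCALAR);
    contact.put("customer", Kind.MANY_TO_ONE);
    contact.put("notes", Kind.ONE_TO_MANY);
    contact.forEach((name, kind) -> System.out.println(name + " -> " + inclusionFor(kind)));
  }
}
```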

You can reduce the properties that are included in an index by specifying which ones to include via the doc attribute.

Example: Only index some properties
@DocStore(doc="firstName, lastName, email")
@Entity
public class Contact {
...

I expect that reducing the properties to index like this will be relatively rare. TODO: Add @DocIgnore and @DocProperty(ignore=true) support.

@DocEmbedded - Embedded documents

On @ManyToOne and @OneToMany properties you can use @DocEmbedded to specify which properties of the associated bean are included in the indexed document.

Example: Embedded ManyToOne

Embed the customer id and name into the contact index.

@DocStore
@Entity
public class Contact {

  ...
  // denormalise including the customer id and name
  // into the 'contact' document
  @DocEmbedded(doc="id,name")
  @ManyToOne(optional=false)
  Customer customer;

Example: Embedded OneToMany

Embed some customer details (customer id and name). Embed the order details as an ElasticSearch "nested" property, as it is a @OneToMany.

@DocStore
@Entity
public class Order {

  ...
  @DocEmbedded(doc="id,name")
  @ManyToOne(optional=false)
  Customer customer;

  @DocEmbedded
  @OneToMany(cascade = CascadeType.ALL, mappedBy = "order")
  List<OrderDetail> details;

Example: Embed with nesting

Embed more customer details including the nested billingAddress and billingAddress.country. Embed more order details including the nested product with id, sku and name.

@DocStore
@Entity
public class Order {

  ...
  // embed some customer details including the billingAddress
  @DocEmbedded(doc = "id,name,status,billingAddress(*,country(*))")
  @ManyToOne(optional=false)
  Customer customer;

  @DocEmbedded(doc = "*,product(id,sku,name)")
  @OneToMany(cascade = CascadeType.ALL, mappedBy = "order")
  List<OrderDetail> details;
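To make the nested doc syntax easier to read, here is a minimal sketch of how an expression such as "id,name,status,billingAddress(*,country(*))" could be parsed into a tree of property names. DocPathParser and Node are hypothetical names for illustration; this is not how Ebean itself parses the expression.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: parses a doc path expression into a tree of property names.
final class DocPathParser {

  /** A property; children holds its nested properties (empty for a leaf or "*"). */
  static final class Node {
    final Map<String, Node> children = new LinkedHashMap<>();

    @Override
    public String toString() {
      return children.toString();
    }
  }

  private final String source;
  private int pos;

  private DocPathParser(String source) {
    this.source = source;
  }

  static Node parse(String expression) {
    DocPathParser parser = new DocPathParser(expression.replace(" ", ""));
    Node root = new Node();
    parser.parseList(root);
    if (parser.pos != parser.source.length()) {
      throw new IllegalArgumentException("Unexpected ')' at " + parser.pos);
    }
    return root;
  }

  /** Parses a comma-separated list of properties, recursing into "(...)" groups. */
  private void parseList(Node parent) {
    while (pos < source.length()) {
      int start = pos;
      while (pos < source.length() && ",()".indexOf(source.charAt(pos)) < 0) {
        pos++;
      }
      String name = source.substring(start, pos);
      if (name.isEmpty()) {
        throw new IllegalArgumentException("Missing property name at " + pos);
      }
      Node child = new Node();
      parent.children.put(name, child);
      if (pos < source.length() && source.charAt(pos) == '(') {
        pos++; // consume '('
        parseList(child);
        if (pos >= source.length() || source.charAt(pos) != ')') {
          throw new IllegalArgumentException("Missing ')' at " + pos);
        }
        pos++; // consume ')'
      }
      if (pos < source.length() && source.charAt(pos) == ',') {
        pos++; // consume ',' and continue with the next sibling
      } else {
        return; // end of input, or a ')' that closes this list
      }
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("id,name,status,billingAddress(*,country(*))"));
  }
}
```

Each level of parentheses becomes one level of nesting in the tree, which mirrors how the embedded properties nest in the resulting document.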

@DocCode - Strings that are "codes"

We annotate string properties with @DocCode so that the property values are not analysed and are instead treated as literal values / codes (not lower-cased, stemmed etc. by analysers).

Ebean will automatically treat UUID, Enum and any string @Id properties as "codes" and you do not need to annotate these with @DocCode.

Note that if you put @DocCode on product sku, then it is also deemed a code wherever it is embedded. So if product sku is embedded in the order index it will also be considered a @DocCode property there.

Example

We want to treat product sku values as literal codes (not analysed).

@DocStore
@Entity
public class Product {

  // treat sku as a "code" (not analysed)
  @DocCode
  String sku;

  @DocSortable
  String name;

Mapping

@DocCode properties are mapped as not analyzed.

"properties" : {
  "sku": { "type": "string", "index": "not_analyzed" },
  ...

@DocSortable - Analysed and not analysed

Annotating a string property with @DocSortable provides both an analysed field and a non-analysed 'raw' field for the property. We can use the analysed field for text search and the raw field for sorting (and ElasticSearch aggregation features).

Note that if you put @DocSortable on customer name, then it is also deemed sortable wherever it is embedded. So if customer name is embedded in the order index it will also be considered @DocSortable there.

Example: Customer

We want to be able to sort on customer name.

@DocStore
@Entity
public class Customer {

  ...
  @DocSortable
  String name;

Example: Product

We want to be able to sort on product name (and we can sort on product sku as it's a code).

@DocStore
@Entity
public class Product {

  @DocCode
  String sku;

  @DocSortable
  String name;

Mapping

@DocSortable properties are mapped with an additional "raw" field that is not analyzed.

"properties" : {
  "name": { "type": "string", "fields": {
            "raw": { "type": "string", "index": "not_analyzed" } } },
  ...
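The three string mappings described so far (plain analysed, @DocCode and @DocSortable) can be summarised in a small sketch that emits the matching mapping fragment for each. StringMappingSketch and StringKind are hypothetical names; this only reproduces the fragments shown in this section and is not Ebean's mapping generator.

```java
// Hypothetical sketch of the string mapping fragments for each kind of string property.
final class StringMappingSketch {

  enum StringKind { ANALYSED, CODE, SORTABLE }

  private static final String NOT_ANALYZED =
      "{ \"type\": \"string\", \"index\": \"not_analyzed\" }";

  static String mappingFor(String name, StringKind kind) {
    switch (kind) {
      case CODE:
        // @DocCode: a single field that is not analysed
        return "\"" + name + "\": " + NOT_ANALYZED;
      case SORTABLE:
        // @DocSortable: analysed field plus a not analysed "raw" sub-field
        return "\"" + name + "\": { \"type\": \"string\", \"fields\": { \"raw\": "
            + NOT_ANALYZED + " } }";
      default:
        // plain string: analysed only
        return "\"" + name + "\": { \"type\": \"string\" }";
    }
  }

  public static void main(String[] args) {
    System.out.println(mappingFor("sku", StringKind.CODE));
    System.out.println(mappingFor("name", StringKind.SORTABLE));
    System.out.println(mappingFor("notes", StringKind.ANALYSED));
  }
}
```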

Query use - order by

When you write an Ebean query and specify an order by clause, Ebean will automatically translate the order by clause to use the associated 'raw' field if one is defined.

List<Product> products = server.find(Product.class)
  .setUseDocStore(true)
  .order().asc("name")
  .findList();

Elastic query:
// name.raw used automatically for sort order
{"sort":[{"name.raw":{"order":"asc"}}],"query":{"match_all":{}}}

Query use - Term expressions

"Equal to" translates to an Elastic "term" query and for the case of a @DocSortable property the term expression will use the associated "raw" field.

Similarly "Greater than", "Less than", "Greater than or equal to" and "Less than or equal to" translate into range queries that use the "raw" field when available.

List<Product> products = server.find(Product.class)
  .setUseDocStore(true)
  .where().eq("name","Chair")
  .findList();

Elastic query:
// name.raw used automatically for 'term' expression
{"query":{"filtered":{"filter":{"term":{"name.raw":"Chair"}}}}}
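The field selection used in the sort and term queries above can be sketched as follows. RawFieldResolver is a hypothetical helper, not part of Ebean, and it assumes we already know which properties are @DocSortable. A property without a raw field (for example a @DocCode property such as sku) is used under its own name.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical helper (not part of Ebean): picks the field used in sort and term expressions.
final class RawFieldResolver {

  private final Set<String> sortableProperties;

  RawFieldResolver(Set<String> sortableProperties) {
    this.sortableProperties = sortableProperties;
  }

  /** Returns property + ".raw" when the property has a raw field, else the property itself. */
  String fieldFor(String property) {
    return sortableProperties.contains(property) ? property + ".raw" : property;
  }

  /** Builds a match_all query sorted on the resolved field. */
  String sortQuery(String property, boolean ascending) {
    String order = ascending ? "asc" : "desc";
    return "{\"sort\":[{\"" + fieldFor(property) + "\":{\"order\":\"" + order + "\"}}],"
        + "\"query\":{\"match_all\":{}}}";
  }

  /** Builds a term filter on the resolved field (the value is not JSON-escaped in this sketch). */
  String termQuery(String property, String value) {
    return "{\"query\":{\"filtered\":{\"filter\":{\"term\":{\""
        + fieldFor(property) + "\":\"" + value + "\"}}}}}";
  }

  public static void main(String[] args) {
    RawFieldResolver resolver = new RawFieldResolver(Collections.singleton("name"));
    System.out.println(resolver.sortQuery("name", true));
    System.out.println(resolver.termQuery("name", "Chair"));
    System.out.println(resolver.termQuery("sku", "C001")); // sku has no raw field
  }
}
```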

@DocProperty

@DocProperty provides all the extra mapping options including:

  • store default false
  • boost default 1
  • includeInAll default true
  • enabled default true
  • norms default true
  • docValues default true
  • nullValue
  • analyzer
  • searchAnalyzer
  • copyTo
  • index options - DOCS, FREQS, POSITIONS, OFFSETS

It also provides flags to set code and sortable as an alternative to @DocCode and @DocSortable.

@DocProperty can be put on a property, or supplied via the mapping attribute of @DocStore. Mappings specified there override any existing property mappings.

@DocStore(mapping = {
  @DocMapping(name = "description",
    options = @DocProperty(enabled = false)),
  @DocMapping(name = "notes",
    options = @DocProperty(boost = 1.5f, store = true))
})
@Entity
public class Content {

Mapping generation

To use ElasticSearch effectively with the relatively structured ORM documents, we need to create the ElasticSearch indexes with the appropriate property mappings (types, codes, sortable etc). In a way this is similar to DDL for SQL databases.

ebean.docstore.generateMapping=true

With ebean.docstore.generateMapping=true Ebean will generate a mapping file for each bean type that is mapped (with @DocStore). By default these mapping files go into the elastic-mapping directory under src/main/resources. This is configurable via DocStoreConfig pathToResources and mappingPath.

This is expected to be used during development/testing.

ebean.docstore.dropCreate=true

With ebean.docstore.dropCreate=true Ebean at startup will drop all the mapped indexes and re-create them using the generated mapping.

This is expected to be used during development/testing.

ebean.docstore.create=true

With ebean.docstore.create=true Ebean at startup will check which indexes exist and create any missing ones using the generated mapping.

This is expected to be used during development/testing.

Note that create=true should only be used when dropCreate is false.
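The difference between dropCreate and create at startup can be sketched as a simple plan of index actions. IndexStartupPlan is a hypothetical illustration, not Ebean code, and it assumes that dropCreate takes precedence when both flags are set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the startup behaviour described above (not Ebean code).
final class IndexStartupPlan {

  /**
   * Returns the actions to run at startup for the mapped indexes.
   * Assumption: dropCreate takes precedence over create when both are true.
   */
  static List<String> plan(boolean dropCreate, boolean create,
                           List<String> mappedIndexes, Set<String> existingIndexes) {
    List<String> actions = new ArrayList<>();
    for (String index : mappedIndexes) {
      if (dropCreate) {
        // drop every mapped index and re-create it from the generated mapping
        actions.add("drop " + index);
        actions.add("create " + index);
      } else if (create && !existingIndexes.contains(index)) {
        // only create the indexes that are missing
        actions.add("create " + index);
      }
    }
    return actions;
  }

  public static void main(String[] args) {
    List<String> mapped = Arrays.asList("product", "customer");
    Set<String> existing = Collections.singleton("product");
    System.out.println(plan(false, true, mapped, existing));
    System.out.println(plan(true, false, mapped, existing));
  }
}
```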

Example mappings

Examples of the mappings generated for the example application can be found at src/main/resources/elastic-mapping.

Example: product_v1.mapping.json
{
  "mappings" : {
    "product" : {
      "properties" : {
        "sku": { "type": "string", "index": "not_analyzed" },
        "name": { "type": "string", "fields": { "raw": { "type": "string", "index": "not_analyzed" } } },
        "whenCreated": { "type": "date" },
        "whenModified": { "type": "date" },
        "version": { "type": "long" }
      }
    }
  }
}

Example: customer_v1.mapping.json
{
  "mappings" : {
    "customer" : {
      "properties" : {
        "status": { "type": "string", "index": "not_analyzed" },
        "name": { "type": "string" },
        "smallNote": { "type": "string" },
        "anniversary": { "type": "date" },
        "billingAddress" : {
          "properties" : {
            "id": { "type": "long" },
            "line1": { "type": "string" },
            "line2": { "type": "string" },
            "city": { "type": "string" },
            "country" : {
              "properties" : {
                "code": { "type": "string", "index": "not_analyzed" },
                "name": { "type": "string" }
              }
            },
            "whenCreated": { "type": "date" },
            "whenModified": { "type": "date" },
            "version": { "type": "long" }
          }
        },
        "shippingAddress" : {
          "properties" : {
            "id": { "type": "long" },
            "line1": { "type": "string" },
            "line2": { "type": "string" },
            "city": { "type": "string" },
            "country" : {
              "properties" : {
                "code": { "type": "string", "index": "not_analyzed" },
                "name": { "type": "string" }
              }
            },
            "whenCreated": { "type": "date" },
            "whenModified": { "type": "date" },
            "version": { "type": "long" }
          }
        },
        "whenCreated": { "type": "date" },
        "whenModified": { "type": "date" },
        "version": { "type": "long" }
      }
    }
  }
}