Documentation / Query / Background / Query Overview
Not a SQL replacement
Ebean ORM queries are good for typical OLTP query use cases but there are many queries that can be written in SQL that ORM is not well suited to, especially in the reporting and OLAP area. Recursive SQL queries are also not currently well supported.
Ebean ORM queries are focused to OLTP. Don't forget about using SQL where appropriate.
For Ebean the focus is to use ORM queries for OLTP use cases. The plan is not to expand the ORM
query language but to instead have good integration with SQL via RawSql
, SqlQuery
and upcoming/desired integration into JOOQ
.
Object graph construction
It can be useful to think of ORM queries as Object graph construction. That is, when you are creating an ORM Query it is a definition on what part of the object graph to load and what filters/predicates need to be applied.
Loading
For Ebean the select()
and fetch()
control what part of the object graph to
load. The select()
is for the root level of the object graph and fetch() is for the
object graph leaves. In more advanced cases FetchConfig
can be used to control what
parts of the graph are loaded eagerly/lazily and if that part of the graph is loaded from the
database or L2 cache or L3 cache/document store.
Some parts of the object graph can be loaded from L2 cache or L3/document store
Loading - Query tuning
Tuning an ORM query for optimal performance comes down to:
- What properties to load (only load what you need for the given use case)
- What parts of the object graph to load Eagerly or Lazily
- What parts of the object graph to load from Database or L2 cache or L3 document store
- Not executing the query at all and using L2 Query cache
- Not executing the query at all and using L3 document store
Loading - AutoTune
Ebean's AutoTune feature has the ability to profile what parts of the object graph are used by the
application and then provide automatic tuning of the "what to load" part of an ORM query. That is,
AutoTune can suggest what the select()
and fetch()
part of the ORM query should
be in order to fetch only what is needed from the database (reducing network and databases costs of the query)
and minimise lazy loading.
Ebean can automatically tune "What to load"
Predicates
The where()
clause is where you define predicates for the query. This is
effectively 'application logic' and does not change for 'tuning' purposes etc.
Predicates - Type safety
The Type Safe query extension for Ebean is designed to provide a type safe way of building query predicates. As such it does not have a focus on providing type safety on "What to fetch" via the select() and fetch() clauses (which are instead expected to be largely defined by AutoTune).
Automatic joins
Ebean's query language is designed such that for a query you do not explicitly specify joins but instead define "What to load" and "Predicates". Ebean works out what joins are required to support both the load and predicate aspects of the query. Some of the SQL joins can be shared by both loading and predicates and some joins can not be, outer joins are 'inherited' down a given object path etc. Ebean works the joins and join types to add based on the cardinality and optionality of the relationships/paths involved in both loading and predicates.
The downside of this approach is that predicates using "self joins" can not be easily specified currently but this issue will be addressed and there are some very significant benefits to this approach where Ebean determines the joins automatically.
Avoiding Cartesian product
Ebean will never generate a SQL cartesian product. No matter how complex or big you ORM query gets Ebean will not generate a SQL cartesian product but instead break the query up into multiple SQL queries.
FirstRows MaxRows
FirstRows
and MaxRows
will always work in your ORM query. That is, firstRows/maxRows work on relational
rows (and not objects) and Ebean will automatically break the ORM query up into multiple SQL queries as needed in
order for firstRows/maxRows to work as intended.
Arbitrarily complex graphs
No matter how complex or nested your object graph is for your ORM query Ebean can build it in a pretty efficient manor. More specifically, a single SQL query can contain at most one OneToMany or ManyToMany relationship "path" (without producing a SQL cartesian product which is what we really want to avoid). Ebean will break up the ORM query into multiple SQL queries with each SQL query having a single OneToMany/ManyToMany relationship in its fetch (each SQL query is as eager/big as it can be without becoming a cartesian product).
Ebean processes all the fetch paths determining which paths contain a OneToMany or ManyToMany relationship.
The query 'breakup' may be manually controlled by using FetchConfig
but if this has not been done
or has not covered all the required relationships then Ebean will automatically break up the query based
on these 'paths containing a many relationship'.
TERMINOLOGY
Load Context
The "Load Context" is an internal Ebean feature with a very important job. Is supports executing secondary queries and provides:
- Batch lazy loading
- Query joins (Eager secondary queries)
- Propagation of key query state from the origin query to any/all secondary queries.
In essence the Load Context provides the mechanism for batch loading complex graphs.
Origin queries
"Origin query" is an Ebean term to refer to the original SQL query executed when building an object graph. Further SQL "secondary queries" can execute and relate back to the "origin query".
Summary logging
In summary logging there is an origin
attribute and this is a key that can be used to link
origin and secondary queries.
txn[1012] FindBean type[Order] origin[B0dP9E.DWeHD4.5WUnB] ...
Secondary queries
"Secondary queries" is an Ebean term for SQL queries that relate back to a previously executed "origin query". These secondary queries can be executed due to lazy loading or because a single ORM query is broken up manually or automatically into multiple SQL queries (in order to avoid cartesian product and support firstRows/maxRows).
Summary logging
In summary logging the mode
and origin
attributes can be used to identify which sql
queries are "secondary queries" and the origin which they relate back to. This logging can be used to identify
excessive lazy loading (N + 1) for example.
mode
: +lazy for lazy loading query and +query for a query join/eager loadingorigin
: this key links the secondary query back to the origin
txn[1012] FindBean type[Order] origin[B0dP9E.DWeHD4.5WUnB] ...
...
txn[1016] FindMany mode[+lazy] type[Customer] origin[B0dP9E.DWeHD4.5WUnB] ...
...
txn[1017] FindMany mode[+lazy] type[Address] origin[B0dP9E.DWeHD4.5WUnB] ...