Aggregation
SQL has the aggregation functions of SUM
, MIN
, MAX
, COUNT
and AVG
.
We can use these functions in Ebean as dynamic formula or as properties via @Aggregation and @Sum.
Dynamic formula
Single attribute
LocalDate maxDate =
new QOrder()
.select("max(orderDate)")
.customer.name.equalTo("Rob")
.findSingleAttribute();
select max(t0.order_date)
from orders t0
join customer t1 on t1.id = t0.customer_id
where t1.name = ? ; --bind(Rob)
With aggregation queries that select only a single attribute the returned type
matches the property type. In the above case the orderDate is a LocalDate
and so that is the type returned.
Multiple attributes
When we specify non-aggregate properties (status in the example below) then
a GROUP BY
clause is generated that includes all the non-aggregate properties.
List<Order> orders =
new QOrder()
.select("status, max(orderDate)") // status is non-aggregate
.customer.name.equalTo("Rob")
.findList();
select t0.status, max(t0.order_date)
from orders t0
join customer t1 on t1.id = t0.customer_id
where t1.name = ?
group by t0.status -- group by non-aggregate properties
The beans returned by aggregation queries are partially populated. They don't contain the @Id property and thus do not support lazy loading or persisting.
The order beans in the example above will only have their status and orderDate properties loaded.
Fetch
Aggregation queries can include fetch
or fetchQuery
to load related beans.
List<Order> orders
= new QOrder()
.select("status, max(orderDate)")
.customer.fetch("name") // (1) fetch
.status.notEqualTo(Order.Status.NEW)
.findList();
select t0.status, max(t0.order_date), t1.id, t1.name -- (1) has customer id and name
from orders t0
join customer t1 on t1.id = t0.customer_id
where t0.status <> ?
group by t0.status, t1.id, t1.name -- (1) has customer id and name
When we use fetchQuery
the ORM query is executed as 2 sql queries.
List<Order> orders
= new QOrder()
.select("status, max(orderDate)")
.customer.fetchQuery("name") // (2) fetchQuery ...
.status.notEqualTo(Order.Status.NEW)
.findList();
-- Primary query
select t0.status, max(t0.order_date), t0.customer_id -- (2) has customer id only
from orders t0
where t0.status <> ?
group by t0.status, t0.customer_id
-- Secondary query
select t0.id, t0.name -- (2) customer id and name
from customer t0
where t0.id in (?, ?, ?, ?, ? )
Using fetchQuery() can be good when there are many properties on customer that we want to load (as these all go into the secondary sql query rather than the primary query and therefore are not part of the GROUP BY.
@Aggregation
We can model aggregations with properties annotated with @Aggregation
.
This is an alternative to using dynamic formula.
Properties with @Aggregation must be explicitly included in a query (they are considered Transient).
Example:
@Entity
@Table(name = "orders")
public class Order extends BaseModel {
...
LocalDate orderDate;
@Aggregation("max(orderDate)") // aggregation property
LocalDate maxOrderDate;
@Aggregation("count(*)") // aggregation property
Long totalCount;
Having added the maxOrderDate and totalCount properties we can use
them in select
and having
clauses in our queries.
QOrder o = QOrder.alias();
List<Order> orders = new QOrder()
.select(o.status, o.maxOrderDate, o.totalCount)
.findList();
select t0.status, max(t0.order_date), count(*)
from orders t0
group by t0.status
Having
Predicates on aggregation properties should be added to the Having
clause.
List<Order> orders = new QOrder()
.select(o.status, o.maxOrderDate, o.totalCount)
.status.notEqualTo(Order.Status.COMPLETE) // (1) where clause - non aggregate properties
.having()
.totalCount.greaterThan(1) // (2) having clause - aggregate properties
.findList();
select t0.status, max(t0.order_date), count(*)
from orders t0
where t0.status <> ? // (1)
group by t0.status
having count(*) > ? // (2)
Aggregation beans
If we only want 1 or 2 aggregation properties we feel ok adding these to the existing entity beans. However, if instead we have many properties we want to aggregate this approach is messing with our model and it starts to look ugly.
We can create aggregation beans to model the properties - model the sum/group by/aggregation view.
- Use @View rather than @Table
- Does not extend Model as we don't save() or delete() these
- Does not have @Id or @Version properties
- Has properties we want to Aggregate on (sum, max etc)
- Has properties we want to group by on (date, status, ManyToOne)
Example
Say we have an entity bean called MachineUse with:
@Entity
@Table(name = "machine_use")
public class MachineUse extends Model {
@Id
private long id;
@ManyToOne(optional = false)
private DMachine machine;
private LocalDate date;
private long distanceKms; // we want to sum() this ...
private long timeSecs; // we want to sum() this ...
private BigDecimal fuel; // we want to sum() this ...
@Version
private long version;
...
We can create the aggregate bean for this as:
@Entity
@View(name = "machine_use")
public class MachineUseAggregate {
@ManyToOne(optional = false)
private DMachine machine; // group by, fetch, fetchQuery on ...
private LocalDate date; // group by on ...
@Aggregation("sum(distanceKms)")
private Long distanceKms;
@Aggregation("sum(timeSecs)")
private Long timeSecs;
@Aggregation("sum(fuel)")
private BigDecimal fuel;
@Aggregation("count(*)")
private Long count;
...
@Sum
When we create aggregation beans we very often see the pattern where we sum a property like:
@Aggregation("sum(<property name>)")
Type <property name>;
For example, we see beans like:
@Aggregation("sum(distanceKms)")
BigDecimal distanceKms;
@Aggregation("sum(useSeconds)")
Long useSeconds;
@Aggregation("sum(cost)")
BigDecimal cost;
@Sum
is syntatic sugar for this pattern and the above becomes:
@Sum
BigDecimal distanceKms;
@Sum
Long useSeconds;
@Sum
BigDecimal cost;