Wednesday 14 October 2009

Hibernate: Getting a paged list of distinct results in join… a series of unfortunate events

Tables hierarchy:
Parent, ChildFirstLevel (Parent's child), and ChildSecondLevelSet (ChildFirstLevel's child)

In the system I’m working on, there are two very similar queries:
-Query 1: filter Parents (filter criteria: some fields of Parent) and display results in a paging list
-Query 2: same filtering of Parents, but it’s required to retrieve parents and their children in all levels. No paging.

Problem #0: Hibernate and returning distinct results from joins

See: https://www.hibernate.org/117.html#A3: Hibernate does not return distinct results for a query with outer join fetching enabled for a collection (even if I use the distinct keyword)?
Query 2 joins parents with children, so to overcome this I used

setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY)

to remove join duplicates, which was no issue since Query 2 does not require paging.

Problem #1: Paging and Criteria APIs

We then got a change request to add some ChildFirstLevel fields to the filter criteria. This required editing Query 1 to join Parent and ChildFirstLevel tables, which caused duplicates to return in the result set. The solutions in https://www.hibernate.org/117.html#A3 to remove duplicates didn’t work because they all remove duplicates in-memory (after executing the SQL on the database and retrieving the results), which messed up the paging.

Solution:
To keep using Criteria APIs, the solution was to use Projections.distinct(), which translates to select distinct in SQL.
Projections.distinct() will cause the query to only return the properties in the projection, since it constructs the select clause (with the distinct keyword). So we will need to use AliasToBeanResultTransformer to construct Parent objects from the result set, which will have the individual fields of Parent.

See: https://forum.hibernate.org/viewtopic.php?t=941669
Post: one of the solutions.

Problem #2: Projections.distinct() and listing all properties in the query

Parent table is huge. We don’t want to list all the fields. You want to generate their properties names.

Solution:
This can be done using getClassMetadata(Parent.class).getPropertyNames().

Problem #3: Projections.distinct() and component beans

Parent class has many component beans, so we will need recursion in generating the property names (to generate the properties inside the components). Otherwise, the query will contain component names that SQL does not recognize. Here’s the messy code (based on http://www.agitar.com/openquality/hibernate2-2.1.8/net/sf/hibernate/expression/Example.java.html):

private ProjectionList addPropNamesToPrjList(ProjectionList prjList, String componentName, String[] propNames, Type[] propTypes) {
for (int i = 0; i <>
if (propTypes[i].isComponentType()) {
String[] subNames = ((AbstractComponentType)propTypes[i]).getPropertyNames();
Type[] subTypes = ((AbstractComponentType)propTypes[i]).getSubtypes();
String newComponentName = "".equals(componentName) ? propNames[i] : (componentName + "." + propNames[i]);
addPropNamesToPrjList(prjList, newComponentName, subNames, subTypes);
} else {
String propName = "".equals(componentName) ? propNames[i] : (componentName + "." + propNames[i]);
prjList.add(Projections.property(propName), propName);
}
}

return prjList;
}

/** This medhod constructs a ProjectionaList containing the given class' properties
* Supports comoponent properties using recursion
*/
protected ProjectionList constructParentPropsProjList(Session session, Class classObj) {
String[] propNames = session.getSessionFactory().getClassMetadata(classObj).getPropertyNames();
Type[] propTypes = session.getSessionFactory().getClassMetadata(classObj).getPropertyTypes();
ProjectionList prjList = Projections.projectionList();

prjList = addPropNamesToPrjList(prjList, "", propNames, propTypes);
return prjList;
}

The above code does construct correct running SQL, but then you get a Hibernate error when Hibernate attempts to construct Parent objects from the result:
Could not set property component.x on Parent.
No solution was found to this. Apparently, Hibernate is unable to construct the component beans from the properties we had such a hard injecting into the query.

A tweak to the above code: Not using an alias
prjList.add(Projections.property(propName));
Returns empty objects.

Solution:
Use HQL to be able to insert the distinct keyword directly in a query that would be both humane and working.

Problem #4: HQL and lazy initialization

The HQL query is something like:
select distinct prnt -- or select count(distinct prnt.id) if you want
from Parent prnt
left join prnt.childFirstLevelSet firstLevel
left join firstLevel.childSecondLevelSet
where … (conditions on Parent and ChildFirstLevel)

This does generate correct running SQL. But unfortunately, it does not work when the conditions are on ChildFirstLevel. It gives:
Org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: Parent.childFirstLevelSet, no session or session was closed

This seems to be related to the following
https://forum.hibernate.org/viewtopic.php?f=1&t=960269&start=0
that quotes Hibernate in Action (page 261):
Hibernate currently limits you to fetching just one collection eagerly. This is a reasonable restriction, since fetching more than one collection in a single
query would be a Cartesian product result.
Attempts to use with and fetch all keywords came to nothing.

Solution:
Luckily, throughout the system ChildFirstLevel is always retrieved with its children. So changing its children’s mapping by adding lazy="false" solved the problem. No need for the join statements (except the first one) that were causing wrong results. So the query would be:
select distinct prnt
from Parent prnt
left join prnt.childFirstLevelSet firstLevel
where … (conditions on Parent and ChildFirstLevel)

Problem #5: fetch and distinct results

For Query 2, we need to fetch the children so the query should be
select distinct prnt
from Parent prnt
left join fetch prnt.childFirstLevelSet firstLevel
where … (conditions on Parent and ChildFirstLevel)

Adding fetch caused the results to duplicate (again).

Solution:
That was easy, since Query 2 does not require paging. We can just throw the results into a HashSet.

Problem #6: Retrieving filtered children only (and not other children in same parent)

Query 2: We now have a working query that returns distinct results filtered by conditions on both the parent and child tables. The problem is that since we filter by ChildFirstLevel fields, only objects that meet the filter criteria will be retrieved. If the parent has two children and only one meets the filtering criteria, only one child will be retrieved.

Solution:
This required a tweak to the query:

select distinct prnt
from Parent prnt
left join fetch prnt.childFirstLevelSet firstLevel
where … (conditions on Parent and ChildFirstLevel)
-- repeat query to get all parent’s children
or prnt.id = firstLevel.parent.id
and prnt.id in
(select prnt.id from Parent prnt
left join prnt.childFirstLevelSet firstLevel
where … (conditions on Parent and ChildFirstLevel))

1 comment:

M@hdeTo said...

Wonderful post! Kudos on smashing all these walls! only a true code warrior would be able to!