A Tale of Optimization (part 2)


Warning: Illegal string offset 'filter' in /var/sites/t/theproactiveprogrammer.com/public_html/wp-includes/taxonomy.php on line 1409

Click here for Part 1

Chapter 3: The Select n + 1 Problem

The Select n + 1 problem describes a situation where you are hitting the database many times unnecessarily, specifically once for each object in a collection of objects. As I mentioned in part 1, I had been struggling to diagnose a performance problem with an operation which involved retrieving several thousand objects from the database. My NHibernate log file showed me that this single NHibernate request was resulting in thousands of database queries being executed, one for each object in my collection. So why was this happening?

Below is a simplification of my class structure, starting with my root class, Equipment.

Each instance of Equipment has a collection of Assets, each of which has a Location, which in turn has a collection of PositionPoints. The problem this structure presents to NHibernate is that the root class has a collection of objects in a one-to-many relationship, each of which has another collection of objects in another one-to-many relationship. My mapping classes had been set up to explicity turn off lazy loading for Assets, Locations and PositionPoints, therefore NHibernate was obliged to find a way to fetch all this data, and it chose to do this by first retrieving the data for Equipments, Assets and Locations, and then executing a single query for each Location to retrieve all of its PositionPoints.

I couldn’t remember why exactly I had turned lazy loading off for these relationships (perhaps I should have commented my mapping file with an explanation). Therefore I modified the mapping file to turn lazy loading back on. As expected this solved the Select n + 1 problem, as NHibernate was no longer obliged to fully populate Locations and PositionPoints. However, this change caused an exception to be thrown in the business layer, a LazyInitializationException. This was caused by logic in the business layer which was attempting to read the PositionPoints property of a Location after the session which originally obtained the root objects had been closed. Indeed this exception may well have been the reason I had previously decided to turn lazy loading off for these objects. So the idea of using lazy loading was not a viable solution, at least not without some other changes being made. A little research around the lazy initialization problem led me to the idea of injecting my NHibernate session into the data access layer from the business layer, allowing me to use the same session for the lazy loading, but I really didn’t want my business layer to know anything about database sessions.

I reverted my code to switch lazy loading back off and continued to investigate my original problem. I tried instructing NHibernate to eagerly load my objects using a HQL query to eagerly fetch associated data, but this resulted in a cartesian product issue, where the returned collection contained duplicate objects.

Then I discovered a page on ayende.com on NHibernate Futures.

Chapter 4: Futures

NHibernate Futures are a variation on the MultiCriteria feature, which allow us to combine queries to eagerly load one-to-many associations in exactly the way I needed. I would have to define a Future query to retrieve all of my Equipments, then another to retrieve my Assets, and another to retrieve my PositionPoints. These queries would then be combined my NHibernate to retrieve and combine all the required data in a single roundtrip. Finally it seemed like I had found a solution to my problem. I modified my code to use Future queries and tested it.

But it didn’t work!

Stepping through the code and checking my log file revealed that each Future query was causing a trip to the database. Future queries are not supposed to result in an immediate trip to the database, execution should be delayed until the last possible moment.

Again I had hit a brick wall. So again I started googling for answers.

After some time I very fortunately stumbled upon an explanation – NHibernate Future queries do not work with an Oracle database. This was disappointing.

Chapter 5: Getting Futures to Work with Oracle

So now I had reached a point where I had discovered an NHibernate feature which would seemingly allow me to eagerly populate my collection of objects in an efficient way. But it wasn’t supported with Oracle. I did however discover a method of getting Futures to work with Oracle on stackoverflow.

I would need to extend NHibernate’s OracleDataClientDriver and BasicResultSetsCommand classes. I followed the instructions, and updated my NHibernate config file to use the new driver. I reran my code using Futures, and it worked! All of my data was returned in a single trip to the database! But it wasn’t quick. In fact it was very slow. The whole point of this was to try to optimize my code. The Select n + 1 problem seemed to be an obvious reason for its slowness. I had solved that problem. But my code was still slow. Why? The reason was that the solution I had found on stackoverflow to get NHibernate Futures to work with Oracle used cursors. And cursors are slow. The built-in Futures feature results in SQL which does not use cursors. I had found a workaround but it wasn’t a good solution for my problem. Yet again I felt like I was back to square one.

Chapter 6: Rethinking My Approach

Having gone down various rabbit holes and tried a number of potential solutions, it was time now to take a step back from the problem.

What had I learnt?

I needed to obtain a large collection of Equipments, and their associated Assets. The operation was too slow because of a Select n + 1 problem. I needed to read PositionPoint data in the business layer.  I couldn’t lazy load this data because of a LazyInitializationException. I couldn’t use NHibernate Futures because the result was still too slow (with the Oracle workaround at least).

But what exactly did I need to use my PositionPoints for? I reviewed my business layer code and then it hit me. Of the several thousand Equipments and Assets I was retrieving, I only actually needed to access the PositionPoints of a small number of them! Less than ten in fact. Therefore if I turned lazy loading back on, which would result in a fast select query to obtain my root objects, I could identify in my business layer which objects I actually needed to access PositionPoint data for, and hit the database again (using a new session), just for those particular objects.

A few minutes of coding later and at last, I had my solution. The operation which had previously been taking around one and a half minutes was now taking around 30 seconds – an acceptable level of performance.

Conclusions

Looking back on this journey, I must admit I feel a little stupid. I had been looking at the problem in the wrong way. I had assumed that my approach was correct, and that I needed to optimize my use of memory or NHibernate, when in fact it was my algorithm which was inefficient. This is the main lesson I will try to take from this experience. When faced with a database performance issue, first review your algorithm and consider whether you are retrieving data you don’t actually need, particularly when using an ORM framework. There are also a few other things I will take away from this. Perfview is a great tool which I am sure will use again. NHibernate logging is an equally valuable tool for understanding what is going on under the hood. And it remains a mystery how anyone ever coded before the existence of the Internet!

Share Button