Xmas Break

Ronnie Mukherjee

Just a short post to say I will be having two weeks off from blogging over the Christmas period.

I hope you all have a wonderful Christmas and come back in the new year feeling refreshed and ready to code!


A Proposed Approach to Estimating

Ronnie Mukherjee

“Estimation (or estimating) is the process of finding an estimate, or approximation, which is a value that is usable for some purpose even if input data may be incomplete, uncertain, or unstable.” (Wikipedia)

Estimating how long a task or project will take is a critical part of the software development process, and unfortunately it is extremely difficult. Why? Because it is essentially a guessing game, based on assumptions and information we don’t yet have. Yet the consequences of poor estimation can be quite significant. On a personal level, it can lead to embarrassment, stress and a damaged reputation, as you fail to meet deadlines which were essentially set by yourself. At the team or organisation level, poor estimation may ultimately be the difference between success and failure in a project. If you under-estimate a task, you will have to sacrifice quality or cost to meet your deadline, or alternatively you will deliver late – bad news either way.

Your client or business obviously wants software to be delivered as quickly as possible, without compromising on quality. As professionals, therefore, we should endeavour to provide estimates which are as low as possible without being unrealistic. In other words, if we honestly believe that a project will take six months, we shouldn’t say it will take a year “just to be on the safe side”. I have seen some developers recommend the practice of taking a guess and then tripling it. A better approach, I believe, is to take your time when producing an estimate, and do your best to predict the task in detail, however unpredictable it may seem. The fact that estimation is difficult should not be an excuse for not trying to do it well.

I will outline an approach to estimation here which should at least provide food for thought. It is based primarily on one principle: when it comes to producing accurate estimates, we need to take our time and be thorough. In practice, this principle is rarely followed.

The approach assumes that you have been asked to produce an estimate only – not a specification, a design, or any other kind of official document. Typically the request will have come from someone who is trying to weigh up the value of pushing ahead with a task or project. I have been in this position many times, and have often made the mistake of spending approximately 10 seconds thinking before coming back with an answer. What I am trying to do here is propose a better and more systematic approach.

A second assumption I am making is that you are being asked about a task or project which will take a day or more. Anything smaller than that and it is probably not worth trying to be too clever with your approach.

The process consists of 6 steps.

Step 1 – Isolate Yourself

When you are put on the spot it is difficult to think clearly. It is also difficult to think about a complex problem in your head, without drawing, brainstorming, scribbling and making lists. Furthermore, a reliable ‘gut feeling’ tends to come only from adequate experience, which is rare given the inherent uniqueness of each software project.

Therefore your first task when asked to produce an estimate is to isolate yourself from the person making the request. This doesn’t mean you have to lock yourself away in a room by yourself. It just means that you need to tell that person that you will get back to them. Don’t give them an answer immediately just because that is what they expect. You won’t actually know how long you will need to produce your estimate until you have completed steps 1 to 4, but this can be done in a few minutes, so at worst you will be able to get back to them very soon and tell them how long you need to produce an estimate. There may of course be a problem here that comes from pressure to provide an answer quickly (“just give me a ballpark figure”). If they won’t take no for an answer then go ahead and triple your gut feeling, but explain that if you are allowed some time to think about this more carefully you will be able to provide a better estimate.

Step 2 – Choose an Order of Magnitude

Based on your gut feeling, pick one of the following four categories for how long you think the task will take:

  • 1 to 5 days
  • 1 to 4 weeks
  • 1 to 6 months
  • 6+ months

Step 3 – Choose an Importance Factor

Make a call on how important you feel it is that your estimate is accurate. In some cases, there is not much riding on the accuracy of your answer. You may know the person well, and know that they will not hold you to your answer and that they understand you are just giving them a rough idea. On the other hand, you might suspect that you will be held to whatever answer you give, and that it is very important that you provide the best possible estimate.

  • 1 = Low importance
  • 2 = Medium importance
  • 4 = High importance

Notice that the ‘High importance’ factor is assigned a value of 4, rather than 3. This is to reflect the fact that an estimate deemed as highly important is probably twice as important as one that is regarded as of medium importance.

Step 4 – Calculate How Long to Spend

Based on your ‘order of magnitude’ estimate, assign a base time for how long you need to spend on your estimate, as follows:

  • 1 to 5 days = 30 minutes
  • 1 to 4 weeks = 1 hour
  • 1 to 6 months = 2 hours
  • 6+ months = 1 day

Now take that base time and multiply it by the importance factor value from step 3, and you have a value for how long you should spend on the estimate. For example, if you feel a task will take between 1 and 4 weeks, and you assign the estimation task a medium importance factor, then you should spend 2 hours on your estimate.
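
The calculation in Steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration: the function and dictionary names are my own, and it assumes that "1 day" means an 8-hour working day, which the text above does not specify.

```python
# Base time (in minutes) to spend on the estimate, keyed by the Step 2 category.
# Assumption: "1 day" is taken to mean an 8-hour working day.
BASE_MINUTES = {
    "1 to 5 days": 30,
    "1 to 4 weeks": 60,
    "1 to 6 months": 120,
    "6+ months": 8 * 60,
}

# Importance factors from Step 3.
IMPORTANCE_FACTOR = {"low": 1, "medium": 2, "high": 4}


def minutes_to_spend(magnitude: str, importance: str) -> int:
    """Return how many minutes to spend producing the estimate."""
    return BASE_MINUTES[magnitude] * IMPORTANCE_FACTOR[importance]


if __name__ == "__main__":
    # The worked example from the text: 1 to 4 weeks, medium importance.
    print(minutes_to_spend("1 to 4 weeks", "medium"))  # 120 minutes, i.e. 2 hours
```

Keeping the two tables as plain dictionaries makes it easy to adjust the base times or factors if they turn out not to suit your own circumstances.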

How long you spend on an estimate should be a reflection of how complex the task or project is, combined with how important it is that your estimate is accurate. The longer you spend on an estimate, the more likely it is to be accurate. In this respect I don’t believe an estimate is ever really ‘done’, and a sensible approach is to decide up front how much of your time and energy you are going to devote to trying to get your estimate right.

Step 5 – Estimate

Now you know how long you want to spend on your estimate. I would recommend breaking up this time into the following steps, to ensure you make the most of this time. These steps are based loosely on the vertical planning process described by David Allen in his book Getting Things Done.

1. Describe the project or task in a sentence

This will help you to take a step back and briefly consider the big picture, and what it is you are trying to achieve.

2. Describe the end result

How will you know when the project is done? Write up your answer to this question. Depending on how much time you have available and your preferred approach, this may consist of a short paragraph of text, or a long list of informal acceptance tests. By thinking ahead to completion, you are more likely to identify challenges which you will encounter along the way.

3. Brainstorm

Using a list or mind map, get down on paper (or on your machine) everything you can think of relating to the project. What could go wrong? What tasks are involved? What is relevant? What are your general thoughts?

4. Technical Design

This is the most difficult part and again, the level of detail you go into will depend on how much time you want to spend. You may want to divide your task into horizontal tiers, thinking in terms of the database, data access layers, middleware and user interface. In each of these layers, what will your solution look like? If you only have a few minutes you may just sketch a couple of diagrams or make some lists. If you have longer, you might produce some more comprehensive UML diagrams, or do whatever else you would typically do to design a system.

5. Identify Sub-Tasks and Estimate

As you produce your technical design, you should naturally begin to identify sub-tasks, which you may break down further still. As you identify these lower level tasks, you can assign estimates to each of these, and add up all your low level estimates to produce your overall estimate. In terms of estimating these low level tasks, you may choose to use a technique such as PERT estimating, or you may even get other people involved and play planning poker. Even if you decide to use your gut feeling, as I generally do at this stage, you are more likely to produce a better overall answer than if you try to use gut feeling at a higher level.
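
If you do go down the PERT route, the arithmetic might look something like the sketch below. It uses the standard three-point formula, (optimistic + 4 × most likely + pessimistic) / 6, for each sub-task and then adds the results up. The sub-task names and durations are made up purely for illustration.

```python
def pert_estimate(optimistic: float, most_likely: float, pessimistic: float) -> float:
    """Three-point (PERT) expected duration: (O + 4M + P) / 6."""
    return (optimistic + 4 * most_likely + pessimistic) / 6


# Hypothetical sub-tasks, each with (optimistic, most likely, pessimistic)
# durations in days. These figures are invented for the example.
SUB_TASKS = {
    "database schema": (1, 2, 4),
    "data access layer": (2, 3, 7),
    "user interface": (3, 5, 10),
}


def overall_estimate(sub_tasks: dict) -> float:
    """Add up the PERT estimate of every sub-task."""
    return sum(pert_estimate(*points) for points in sub_tasks.values())


if __name__ == "__main__":
    print(f"Overall estimate: {overall_estimate(SUB_TASKS):.1f} days")
```

The point of the weighting is that the pessimistic figure pulls the estimate upwards a little, which is a more disciplined version of padding a gut-feeling number.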

Step 6 – Negotiate

You have been through the process and you have an estimate. If you are lucky, when you provide this estimate to the person who requested it, they will accept it and your estimation task is over. If you are less fortunate, they will pull a face and question your estimate. Obviously, they were hoping the project would not take so long. If this happens, then you should have some evidence to hand to justify your estimate. Show them your technical design, your sub-tasks and their corresponding estimates. If they are still unsatisfied with the estimate, then it may be appropriate to negotiate based on the Project Management Triangle. If your estimate is still not accepted, then at least you know that you have done your best to provide an honest and accurate answer.

Final Thoughts

Estimation is hard, and as I always maintain in every aspect of software development, there is no silver bullet. There is no single approach or technique that is going to work for everyone. What I have done here is to try to come up with an approach which I believe will help me become a better estimator, and therefore a better professional. Obviously, software projects overrun due to unanticipated obstacles. We need to take our time to try and anticipate the obstacles which we will inevitably face, and not rush the estimation process. When you are asked for an estimate, replying immediately with a number is generally a bad idea. Our brains aren’t powerful enough to process all the relevant information that quickly. So accept the fallibility of your brain, and take your time when estimating.


Does Inline CSS Make Me A Terrible Person?

Ronnie Mukherjee

Inline CSS Confession

There is a lot of dogma in programming. I don’t know if there are similar levels of dogma in other forms of engineering, but the adamance with which some programmers subscribe to certain practices and approaches can be quite disturbing. One of my favourite Stack Overflow threads is a post entitled “What’s your most controversial programming opinion?”. A number of the answers allude to the problem of dogma in programming, for example: the only best practice you should be using all the time is ‘use your brain’.

One ‘best practice’ which I have been thinking about recently is the idea that you should never write inline CSS. I have always written a lot of inline CSS. I must admit, I do so largely out of habit, as I have never really worked on a team where this practice was considered unacceptable. From time to time however, I come across a view that strongly condemns inline CSS. The reasons given tend to be one of the following.


There are times when you want a certain style to be reused many times. An obvious example is a button. Typically you will want all of your buttons to look the same, therefore it is of course common sense to not repeat yourself (remember the DRY principle). I absolutely agree with this justification, and in cases like these I tend to advocate the use of an external CSS file.

In reality however, there are a lot of cases where an element needs to be styled in a unique way. In these cases, I prefer not to use external styling. The reason for this is that I don’t see the point. It may only take a few seconds to flick between your HTML and CSS files, but every second counts when you are trying to maintain a sense of flow.


Generally I think inline CSS only affects readability when there is a lot of it. There have certainly been times when I have overdone inline CSS, and this is one area where I do feel I could improve. What I plan to start doing is using <style> tags more, to separate CSS from HTML without having to flick between CSS and HTML files. In terms of general code prettiness, I actually find the idea of a CSS file containing thousands of classes, many of which only contain a single attribute, rather ugly. I also feel that adding CSS to the <style> tags on a page maintains a sense of cohesion, with the elements and their styles being stored in the same file.

Browsers will apply <style> tags placed outside of the <head> tag, although the current HTML specification only permits them inside it. This makes it possible to add <style> tags to partial views which contain no <head> tag. This is a feature I have yet to take advantage of, but I plan to start doing so.


The question of performance is an interesting one. External CSS files are cached by the browser, which results in pages loading faster as the user navigates around a site. The trade-off here is that the first page load requires an additional HTTP request to retrieve the external file. So the first page load might be slower, with subsequent pages being faster. For this reason a common approach is to use inline CSS (and JavaScript) on homepages where there is typically only one page view per session.

In other words, you need to consider your individual circumstances and requirements. I tend to lean away from premature optimization, favouring readability, speed of development, and good design as guiding principles. If performance is an issue, it needs to be defined as such, with specific requirements for how quickly a page needs to load, and under what circumstances. If a site is failing to meet performance requirements, then I would happily look at the effects of inlining versus external files.


I am keen not to sound like I am saying that inline CSS is good, and external CSS is bad. My point is that under certain circumstances, inline CSS is acceptable, in my opinion. What is more important than my opinion however, at least as far as inline CSS is concerned, is the opinion of my employers and team-mates. Consistency is good, therefore if I were to find myself in a team where inline CSS is prohibited, I would happily adjust my approach. But I have never been on a team like this.

I do plan to start using <style> tags more, as this seems like something of a compromise. It will probably make HTML files more readable, without the ugliness of an enormous external CSS file. An additional benefit of this, I guess, is that if I ever did decide, for some reason (e.g. performance), to move my CSS into an external file, it would be much easier to do so from <style> tags.

There is something to be said for reading the thoughts of more intelligent and better programmers, and basically doing what they do, particularly early on in our careers. However I feel it is more important that we critically evaluate what we read, and decide for ourselves the best approach. The rule of thumb I follow is to be generally wary of rules of thumb.

The Frustrated Programmer

Ronnie Mukherjee

As a programmer you are likely to experience frustration from time to time. In the last few months I am sure you can recall at least a couple of occasions when you have sat at your desk, faced with a problem of some description, and felt at least a small degree of frustration. Or maybe you experience this unpleasant feeling on a daily basis. Maybe it has become part and parcel of your working life. If this is indeed the case, then something needs to change. Too much frustration is likely to lead to stress, which in turn can of course lead to a whole range of physical and mental problems. So how can programmers keep a lid on frustration in the workplace? What can we do to minimize its effects? And why do we ever feel frustrated by our work in the first place?

The Wikipedia page on frustration, in its first two sentences, provides a nice definition:

In psychology, frustration is a common emotional response to opposition. Related to anger and disappointment, it arises from the perceived resistance to the fulfillment of individual will. The greater the obstruction, and the greater the will, the more the frustration is likely to be.

In simpler terms, frustration happens like this: we want something (will), but we can’t get it (obstruction). And the greater the obstruction and will, the more intensely frustration is felt.

If we were to look at frustrated programmers at work, we would find that the thing they want, their individual will, is probably the satisfactory completion of whatever they are working on. If you are trying to fix a bug, you want it to be fixed. If you are working on a new feature, you want to finish its implementation. And if you are working towards a deadline, you want to achieve completion on time.

But something gets in the way. The unanticipated obstruction. You thought you could achieve completion by performing tasks a, b and c. However, halfway through task b, you realise that there are actually a couple of other tasks that need doing – tasks which you hadn’t planned for. Suddenly, the thing that you want – satisfactory completion – has slipped further out of reach. Worse still, your original estimate has become unrealistic.

The result? Frustration.

If all this wasn’t bad enough, you may even find yourself stuck in a vicious cycle of ever-increasing frustration, which looks something like this:


You have a problem, you try a solution, it doesn’t work.

Or maybe this:


You have a problem, you try a solution, and it works, but you find one or more new problems along the way.

What we need to do as programmers is figure out ways to avoid getting stuck in these cycles of frustration. Because the more frustrated we feel, the less sharp our thinking will become. And the less sharply we can think, the less likely we are to come up with good solutions to the problems we face.

How can we do this? How can we fight off the perils of frustration?

To a certain degree, I don’t think we can. We are only human, and we are never going to be clever enough to fully predict how a system is going to behave. Software systems are just too complex for that. In Code Complete, Steve McConnell states that managing complexity is the most important technical topic in software development. In other words, the whole purpose of programming languages, operating systems, design principles and patterns, is to help us flawed humans better understand what is going on inside a computer. Without these things, you would be working directly with bits and bytes, and you clearly wouldn’t get very far in trying to develop a useful application. Our brains are not capable of perfectly predicting how exactly a line of code is going to impact every other part of a system. There will always be times when we are caught up in that cycle of unpredictability, when a 3 day task becomes a 3 week task.

What we can do is improve, reduce the likelihood of frustration, even if we can’t eliminate it completely. The best way to do this is to continuously acknowledge our limitations, and the fact that we are not capable of perfectly predicting a complex system’s behaviour. It is almost a kind of humility. If we can learn to accept that we aren’t as clever as we think we are, then we will begin to take measures to compensate for our limitations. The opposite of this would be to be so confident in our ability that we would just ruthlessly plough into our code, refusing to admit that we’re not actually sure we’re doing the right thing.

The measures you take to compensate for your limitations will depend on your individual circumstances. Test-driven development is interesting as it fundamentally changes the low-level dynamic of how we code. Rather than waiting for a problem to show itself, the developer creates a problem (code which doesn’t compile or a failing test) and then fixes it. However, TDD is no silver bullet. You can still find yourself underestimating the complexity of a task, which will of course lead to frustration. Improving our estimating skills can also reduce the likelihood of frustration, along with finding the courage to provide an estimate which may not be well received. My own practice of religiously taking regular breaks stems partly from an acknowledgement that we are limited in how long we are able to concentrate on a task in an effective manner. Careful planning of a project or task is also a reflection of our inability to ‘wing it’ and still achieve good results. Defining what ‘done done’ looks like ahead of time may also help us to gain control.

It sometimes takes a crisis to force us to accept that we are vulnerable and imperfect. We need to choose not to wait for that wake-up call, and instead make the right low-level decisions, minute by minute, in acknowledgement of our limitations. A lot has been said about positive visualisation and its merits, however we should also consider the potential value of a little negative visualisation from time to time. What if things don’t go according to plan? How likely is it that we will sail through this task without entering a cycle of unpredictable system behaviour, and how can we mitigate that risk?

A Beginner’s Guide to Garbage Collection in .NET

Ronnie Mukherjee


Garbage collection is often seen as a kind of black magic in the world of .NET, particularly by junior programmers. There is no real need to understand how it works in order to build most applications, therefore it remains a mystery until a time comes when you think you need to know how it works, or else you decide that it might be interesting to learn about. I remember early in my career being part of a team faced with a performance problem which was due to a memory leak. We decided to respond by littering the code with GC.Collect calls to see if that helped. Recalling this makes me wince. It is a classic example of “hit and hope”, when really we should have taken more time to try and understand and diagnose the problem. At the very least, having a basic understanding of garbage collection may prevent you from attempting to solve a performance problem in this way, without understanding what that call to GC.Collect is actually doing.

What is garbage collection in .NET?

Garbage collection is the automatic process of freeing up (or deallocating) memory which is being taken up by objects your application no longer needs. Most developers know this much at least.

Every time you instantiate an object in .NET some memory is allocated to store that object. But at some point that object may no longer be needed by your application. Only by deallocating its memory does it become available again for your application to reuse. If memory is not deallocated, then sooner or later your application will run out of memory and stop working.

So garbage collection is a must?

No, garbage collection is just one approach to the problem of how to deallocate memory. In some languages, such as C and C++, it is the responsibility of the programmer to keep track of which objects are no longer needed, and to explicitly deallocate memory as required.

How does garbage collection work?

Inside the CLR lives a ‘garbage collector’. Unsurprisingly it is the responsibility of the garbage collector to manage garbage collection. It does this by periodically performing ‘garbage collections’. Every time a garbage collection is performed, the garbage collector basically looks at the memory (i.e. the managed heap for your application) and frees up memory which is no longer needed, that is, memory which is occupied by ‘dead objects’.

How does it know when an object is ‘dead’?

An object is dead if it is unreachable by your code. The obvious example is an object referenced only by a local variable inside a method. There is no way for your code to access that variable once the method has returned, so the object becomes ‘dead’.
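
To make the idea of reachability a little more concrete, here is a toy sketch in Python (not C#, and certainly not how the CLR is actually implemented). It treats the heap as a simple graph of named objects, starts from a set of 'roots' that the code can still access, and marks everything it can reach. Anything left unmarked is dead. The object names and the shape of the heap are invented for the example.

```python
def reachable(roots, references):
    """Return the set of objects reachable from the roots.

    `references` maps each object name to the names of the objects it refers to.
    """
    live = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in live:
            continue  # already visited, which also guards against cycles
        live.add(obj)
        stack.extend(references.get(obj, []))
    return live


# Hypothetical heap: 'order' refers to 'customer'. 'temp' was only ever
# referenced by a local variable in a method which has since returned.
HEAP = {
    "order": ["customer"],
    "customer": [],
    "temp": ["temp_child"],
    "temp_child": [],
}
ROOTS = ["order"]

live = reachable(ROOTS, HEAP)
dead = set(HEAP) - live

if __name__ == "__main__":
    print("live:", sorted(live))
    print("dead:", sorted(dead))
```

Notice that 'temp_child' is dead too, even though something still refers to it, because the only thing referring to it is itself unreachable.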

How often does the garbage collector perform a garbage collection?

There are three ways a garbage collection can be triggered.

Firstly, if your system has low physical memory, this can trigger a garbage collection.

Secondly, a threshold is defined which indicates an acceptable level of memory on the heap which can be used by allocated objects. If this threshold is surpassed, then a garbage collection is triggered.

Finally, a garbage collection can explicitly be triggered by calling the GC.Collect method. Only under very rare circumstances is this ever required.

So ignoring the case of calling GC.Collect, a garbage collection is basically triggered when the garbage collector figures that it might be useful to free up some memory?

Yes, essentially.


So whenever a garbage collection is triggered, it frees up all memory on the heap that is occupied by dead objects?

No, because scanning the entire managed heap for dead objects might take a long time, and thus affect performance. Every time a garbage collection is triggered, execution on all other threads is paused until it completes. So the garbage collector tries to be efficient in the way it looks for and deallocates dead objects. It is selective.

How is it selective?

Basically, every object on the heap is categorized into one of three ‘generations’. These are called ‘Generation 0’, ‘Generation 1’ and ‘Generation 2’. The generation of an object indicates how ‘old’ it is, that is, how many garbage collections it has survived since it was created. Broadly speaking, Generation 0 is for younger objects, Generation 1 is for middle-aged objects, and Generation 2 is for older objects.

When a garbage collection occurs, it does so on a specific generation, and deallocates any dead objects on that generation and on all younger generations. So a collection on Generation 1 will deallocate dead objects in Generations 0 and 1. The only time that all dead objects on the heap are deallocated is if a collection is performed on Generation 2.

Garbage collection is generally performed on Generation 0, that is, on short-lived objects. This is based on the reasonable assumption that the objects that are most likely to be dead are the ones which have most recently been allocated.

If an object ‘survives’ a garbage collection, it is ‘promoted’ to the next generation. So Generation 0 objects which are still alive at the time of a garbage collection are promoted to Generation 1. The assumption here is that if it is still alive after this collection, there is a good chance that it will still be alive after the next one, so we will move it out of our ‘top priority’ Generation 0 collection.

Presumably new objects are allocated into Generation 0 then?

When a new object is allocated, it goes into Generation 0, with one exception. Large objects (85,000 bytes or larger) are allocated on a separate ‘large object heap’, which is treated as part of Generation 2. This decision is based on an assumption that large objects are more likely to be long-lived.
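
The rules described so far (new objects start in Generation 0, a collection on a generation also covers all younger generations, survivors are promoted, and large objects are treated as Generation 2) can be played with in a small toy model. This is a sketch in Python rather than C#, the class and method names are invented, and it says nothing about how the CLR really lays out memory. I have also assumed that an object of exactly 85,000 bytes counts as large.

```python
# Assumption: objects of this size and above are treated as 'large'.
LARGE_OBJECT_BYTES = 85_000


class ToyHeap:
    """A toy model of generational collection; not how the CLR is implemented."""

    def __init__(self):
        self.generation = {}  # object name -> generation (0, 1 or 2)
        self.alive = set()    # names the application can still reach

    def allocate(self, name, size):
        # New objects start in Generation 0; large ones are treated as Generation 2.
        self.generation[name] = 2 if size >= LARGE_OBJECT_BYTES else 0
        self.alive.add(name)

    def release(self, name):
        # Model the application dropping its last reference to the object.
        self.alive.discard(name)

    def collect(self, max_generation):
        """Collect generations 0..max_generation and promote the survivors."""
        for name, gen in list(self.generation.items()):
            if gen > max_generation:
                continue  # older generations are not examined
            if name not in self.alive:
                del self.generation[name]  # dead: memory is reclaimed
            else:
                self.generation[name] = min(gen + 1, 2)  # survivor: promote


heap = ToyHeap()
heap.allocate("request", 200)
heap.allocate("cache", 500)
heap.allocate("image_buffer", 100_000)

heap.release("request")
heap.collect(0)  # reclaims 'request', promotes 'cache' to Generation 1

heap.release("image_buffer")
heap.collect(1)  # 'image_buffer' is dead but lives in Generation 2, so it stays
heap.collect(2)  # only now is 'image_buffer' reclaimed
```

The last three lines show why a dead object in Generation 2 can hang around for a while: it is only reclaimed when a collection is performed on Generation 2.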

…and that’s pretty much it as far as the basics of garbage collection go.

As we can see, the garbage collector makes a few assumptions about your application to help it decide how to behave. Only when these assumptions turn out to be inappropriate for your application do you need to consider the possibility of manipulating it. For example, you can configure the ‘intrusiveness’ of the garbage collector (how often it is triggered), or explicitly trigger a collection on a specific generation.

The fact that many developers never feel the need to understand how garbage collection works is perhaps an indication that it does its job quite well. The garbage collector does the work so you don’t have to.

My 3 Favourite Productivity Tools

Ronnie Mukherjee

Over the years I have been something of a productivity tool junkie. Like many programmers I love trying out new productivity tools and utilities, but 90% of them don’t stand the test of time. I will generally try something out, fail to be impressed, and cast it aside to make way for something more interesting. However, there are three tools in particular which have stood the test of time, and which I have been using on an almost daily basis for years. These are ‘secondary’ tools – they are not essential to my performing my work, but they really make my work easier and help me to get things done more quickly. Each of them is lightweight, minimalist in terms of functionality and free of charge. I have found that paid-for tools are often overloaded with complexities which aren’t needed, perhaps in an effort to justify the price tag. There are so many great free utilities available now that I very rarely find the need to pay for any productivity tool.

Without further ado, here are my three favourite software productivity tools, in no particular order.

1. SlickRun

I hate having to use the mouse. It’s a lot easier to achieve a state of flow by using the keyboard exclusively, and it can really break your stride when you have to move your hand over to the mouse and move the cursor or click. Maybe it utilises a different part of the brain. In any case, any tool which helps me to avoid using the mouse is likely to win my approval, and SlickRun does just that.

SlickRun allows you to define commands for common tasks, in particular for opening applications. For example, I have defined a command ‘sql’, which opens up Oracle SQL Developer (as I am working with Oracle databases at present). A small command prompt floats on your desktop, by default just above the date and time in your Windows taskbar.



It is small enough to be barely noticeable, but large enough to let you see what you are typing. To focus your cursor in this command prompt you just hit Win+Q.

To manage your shortcuts you simply type ‘setup’.


In addition to this primary function, allowing you to quickly and easily open applications and run commands, SlickRun includes a couple of other very useful features which I use regularly.

When your cursor is not focused on the command prompt, SlickRun displays the percentage of RAM on your machine which is currently available.


I know that when this figure falls below 10% my machine will start to run slowly, and I need to close some applications down or end some processes. If my system is grinding to a halt, I will often see that I have only 3% or 4% of RAM available.

The final feature of SlickRun which I find immensely useful is something called SlickJot. Hitting Win+J opens up a ‘JOT’ where you can store notes and other useful bits of random text. Whatever you type here is automatically saved, and you can quickly close the ‘JOT’ by hitting Escape. It is roughly equivalent to having a Notepad text file which you can open and close with keyboard shortcuts, and which saves automatically.


2. Ditto

An even simpler utility is Ditto, which does one thing but does it well.

Ditto is an extension to the Windows clipboard, which allows you not only to paste the last thing you copied, but also to paste anything you have copied prior to that, looking back as far as you want. This is a particularly useful utility for programmers, which you may not realise until you try it. Again it requires no mouse interaction. You cut and copy stuff as normal, but when you then want to paste something which you have previously copied, you hit Ctrl+` which opens a list of things which you have placed on the clipboard, with the most recent first. This list is quickly filtered as you type, and unless you clear it, will store items that were placed on your clipboard days or weeks ago.


3. Workflowy

A far more popular tool than SlickRun and Ditto, but equally simple and effective, is Workflowy. It is a web-based application which allows you to write and store notes in a hierarchical structure. We all make lists containing items and sub-items, and Workflowy is the most usable system for doing so that I have come across. The beauty of Workflowy is the speed with which you can use it. Its keyboard shortcuts allow you to instantly expand, collapse, traverse, delete or move items. If you have a mind which can move quite quickly like mine, you will love it. Mindmaps are undoubtedly an effective tool for brainstorming and organizing your ideas, but there is no mindmapping tool which allows you to capture, process and organize your thoughts anywhere near as quickly as Workflowy. I still sometimes use mindmaps with a pen and paper, but once you’ve drawn a branch on paper, you obviously can’t instantly move, edit or delete it.


Like my other favourite tools, Workflowy has a very minimalist feel about it, favouring simplicity over bells and whistles. You can’t write text in different colors or fonts, which I think is great. I use it on a daily basis, whether I am outlining a blog, writing a to-do list, or brainstorming. Workflowy is free for up to 500 list items per month, but you can gain additional items by referring friends. I currently have 1500 items available per month, free of charge, and this is more than enough for my needs.

I sporadically use the Getting Things Done methodology for organising my tasks and ideas, and I find Workflowy is a great tool for helping me to do this.

Final Words

Using the right software tools and utilities can have a huge impact on your productivity. As a general rule, anything which allows you to use keyboard shortcuts to perform common tasks is probably going to help you a lot. The three tools I have described here do just that. They are all free to use, lightweight and are either web-based or can be downloaded and installed in seconds. If you haven’t tried them already, I urge you to do so. I would also love to hear of any lightweight and free tools and utilities which you find invaluable in your work.

B-tree Indexes

Ronnie Mukherjee 0 Comments


You probably know that database indexes are a means for improving database performance. But surprisingly few people understand the different types of indexes there are, how they work, and how to choose an appropriate index for a specific performance problem – or indeed whether an index will help at all. This article may help you get started. It will focus on B-tree indexes, the most commonly used index type in both Oracle and SQL Server databases. Some terminology, such as ‘branch blocks’ and ‘leaf blocks’, is specific to Oracle databases, as these are my primary interest at present.

Why Database Indexing?

In short, indexes help our database to find rows quickly.

Without an index, our database must perform a full table scan every time it needs to find a row or a collection of rows. A full table scan describes a process where the database just looks through a table one row at a time, looking for the desired row or rows. In large tables this can be slow. An index provides shortcuts which the database can use to help find what it is looking for more quickly.

One analogy is a library. Imagine how long it would take you to find a particular book if the books in your local library were arranged in no particular order. You would have to examine one book at a time until you found what you were looking for. Thankfully, libraries are organized in such a way as to provide clues on where to look. Books are typically arranged by genre and by author. In most libraries there are computer terminals dotted around which you can use to access a catalog. You type in a title or author, and the catalog tells you where exactly to look for your book. In this way the catalog is much like a database index.

The key of an index is the column or group of columns to which the index is applied. An index can only help speed up a query if its key includes one or more of the columns used in the query’s predicate (i.e. in its WHERE clause). In a library, if all you know is a book’s title, then having all books arranged solely in order of the author’s surname isn’t going to help you.
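The point about an index key needing to match the predicate can be checked with a small experiment. The sketch below is an assumption-laden stand-in: it uses SQLite (whose indexes are also B-trees) through Python rather than Oracle, and a hypothetical Books table with title and author_surname columns. It asks the database how it would run two queries, one filtering on the indexed column and one not.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable part of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the textual description of the step.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical table and index names, chosen to match the library analogy.
conn.execute(
    "CREATE TABLE Books (id INTEGER PRIMARY KEY, title TEXT, author_surname TEXT)"
)
conn.execute("CREATE INDEX idx_books_author ON Books (author_surname)")

# The predicate uses the index key, so the index can be used.
plan_by_author = query_plan(
    conn, "SELECT * FROM Books WHERE author_surname = ?", ("Osman",)
)
# The predicate uses a column outside the index key, so the table is scanned.
plan_by_title = query_plan(
    conn, "SELECT * FROM Books WHERE title = ?", ("Some Title",)
)

print(plan_by_author)
print(plan_by_title)
```

The exact wording of the plan differs between SQLite versions, and Oracle reports its plans differently again, but the first plan should mention idx_books_author while the second should describe a scan of the whole table.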

B-tree Indexes

B-tree indexes are a particular type of database index with a specific way of helping the database to locate records. B-tree stands for ‘balanced tree’ (not ‘binary tree’ as I once thought). A B-tree index orders rows according to their key values (remember the key is the column or columns you are interested in), and divides this ordered list into ranges. These ranges are organized in a tree structure, with the root node defining a broad set of ranges, and each level below defining a progressively narrower range.

Consider again our library example, and imagine that the books in the library are arranged purely with regards to the author’s surname. Now let’s say you are looking for a book by an author named Osman.

On entering the library you see that the ground floor contains books by authors named A-G, the first floor is H-N, the second floor O-U and the third floor V-Z. So you go directly to the second floor to search for books by Osman. On the second floor you see a sign which describes how the books are arranged on this floor. There is one aisle for O-R, and another aisle for S-U, so you progress to the O-R aisle. In this aisle there is a shelf for each letter, so you find the shelf for O, and here the books are ordered alphabetically so you can quickly find your book by Osman.

Now let us consider the equivalent database search. Imagine all the books in the library were contained within a Books database table, with a B-tree index on the author_surname column. To find your book by Osman, you might perform the following query:

SELECT * FROM Books WHERE author_surname = 'Osman'

First the database would examine the root node of the B-tree index, which might define four ranges corresponding to our four floors in the library (A-G, H-N, O-U, V-Z). It would then progress to the node in the next level of the tree representing the O-U range, corresponding to progressing to the second floor of our library. Within this node would be a further set of ranges (O-R, S-U), corresponding to the aisles on the second floor, and so on.

Each node in the B-tree is called a block. Blocks on every level of the tree except the last are called branch blocks, and those on the last level are called leaf blocks. Entries in branch blocks consist of a range and a pointer to a block on the next level of the tree. Each entry in a leaf block represents a row and consists of a key value (e.g. ‘Osman’) and a rowid which the database uses to obtain all other data contained within the relevant row.
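The descent from root to leaf can be sketched in a few lines of Python. This is only an illustration of the idea, not how Oracle lays out its blocks: the Branch and Leaf classes, the separator keys and the rowid strings are all made up for the example, and a real index holds far more entries per block.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Leaf:
    entries: list  # sorted (key, rowid) pairs


@dataclass
class Branch:
    separators: list  # sorted keys; separators[i] is the lowest key of children[i + 1]
    children: list    # always one more child than there are separators


def lookup(root, key):
    """Walk from the root down to a leaf and return (matching rowids, levels descended)."""
    node, depth = root, 0
    while isinstance(node, Branch):
        # Pick the child whose range contains the key.
        node = node.children[bisect_right(node.separators, key)]
        depth += 1
    return [rowid for k, rowid in node.entries if k == key], depth


# Floors of the library at the root, aisles one level down, shelves as leaves.
root = Branch(
    ["H", "O", "V"],
    [
        Branch(["D"], [Leaf([("Austen", "rowid-1")]),
                       Leaf([("Dickens", "rowid-2"), ("Eliot", "rowid-3")])]),
        Branch(["L"], [Leaf([("Hardy", "rowid-4")]),
                       Leaf([("Lee", "rowid-5")])]),
        Branch(["S"], [Leaf([("Orwell", "rowid-6"), ("Osman", "rowid-7"),
                             ("Osman", "rowid-8"), ("Plath", "rowid-9")]),
                       Leaf([("Shelley", "rowid-10")])]),
        Branch(["X"], [Leaf([("Verne", "rowid-11")]),
                       Leaf([("Zola", "rowid-12")])]),
    ],
)

osman_rowids, levels = lookup(root, "Osman")
print(osman_rowids, levels)
```

Every leaf sits at the same depth, which is the sense in which the tree is balanced: any lookup costs the same number of steps, however many books the library holds on each floor.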

Traversing Across Leaf Nodes

In addition to traversing up and down the B-tree, the database also has the ability to traverse horizontally between leaf nodes. This can be useful, for example, when processing a query containing an ORDER BY clause on the indexed column. Consider the following query:

SELECT * FROM Books ORDER BY author_surname

With our B-tree index, the database can simply start at the first leaf node in the tree, and traverse horizontally across all leaf nodes in sequence to obtain the requested ordered collection. Without this index, sorting the rows by surname would obviously be a much more complicated and slower operation.

Traversing from leaf node to leaf node is also useful when executing queries which return a range of rows based on a predicate. For example, consider the following query:

SELECT * FROM Books WHERE author_surname > 'O'

Using our B-tree index, the database will traverse down the tree from the root until it reaches the leaf node containing rows where author_surname begins with ‘O’, and then simply traverse to the right across all remaining leaf nodes to obtain books for ‘P’, ‘Q’ and so on.
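Both uses of horizontal traversal can be sketched with leaves that point to their right-hand neighbour. As before this is a simplified illustration in Python with invented names and data; the descent that finds the starting leaf is left out, so the starting leaf is passed in directly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    entries: list                        # sorted (key, rowid) pairs
    next_leaf: Optional["Leaf"] = None   # neighbour to the right, if any


def link(leaves):
    """Chain the leaves left to right and return the leftmost one."""
    for left, right in zip(leaves, leaves[1:]):
        left.next_leaf = right
    return leaves[0]


def scan_from(leaf, lower_bound=None):
    """Yield (key, rowid) pairs in key order, walking right from the given leaf."""
    while leaf is not None:
        for key, rowid in leaf.entries:
            if lower_bound is None or key > lower_bound:
                yield key, rowid
        leaf = leaf.next_leaf


leaves = [
    Leaf([("Austen", "rowid-1"), ("Eliot", "rowid-2")]),
    Leaf([("Lee", "rowid-3"), ("Orwell", "rowid-4")]),
    Leaf([("Osman", "rowid-5"), ("Plath", "rowid-6")]),
    Leaf([("Zola", "rowid-7")]),
]
first_leaf = link(leaves)

# ORDER BY author_surname: start at the first leaf and keep walking right.
ordered = [key for key, _ in scan_from(first_leaf)]

# WHERE author_surname > 'O': start at the leaf where 'O' would sit
# (leaves[1] here, which a real index finds by descending from the root).
after_o = [key for key, _ in scan_from(leaves[1], lower_bound="O")]

print(ordered)
print(after_o)
```

No sorting happens anywhere in the sketch; the order comes for free from the way the leaves are arranged.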

Final Words

There are a number of ways you can configure a B-tree index to meet your specific needs. In Oracle you may want to look into index-organized tables, reverse key indexes, descending indexes and B-tree cluster indexes. In addition, you may want to consider Bitmap indexes, which are an alternative to B-tree indexes. These topics may be covered in future blog posts, but hopefully this article has given you a quick introduction to B-tree indexes.

A Tale of Optimization (part 2)

Ronnie Mukherjee 0 Comments

Click here for Part 1

Chapter 3: The Select n + 1 Problem

The Select n + 1 problem describes a situation where you are hitting the database many times unnecessarily, specifically once for each object in a collection of objects. As I mentioned in part 1, I had been struggling to diagnose a performance problem with an operation which involved retrieving several thousand objects from the database. My NHibernate log file showed me that this single NHibernate request was resulting in thousands of database queries being executed, one for each object in my collection. So why was this happening?

Below is a simplification of my class structure, starting with my root class, Equipment.

using System.Collections.Generic;

// Root class: each Equipment has many Assets.
public class Equipment
{
   public virtual IList<Asset> Assets { get; set; }
}

public class Asset
{
   public virtual Location Location { get; set; }
}

// Each Location has many PositionPoints.
public class Location
{
   public virtual IList<PositionPoint> PositionPoints { get; set; }
}

public class PositionPoint
{
   public virtual double X { get; set; }
   public virtual double Y { get; set; }
}

Each instance of Equipment has a collection of Assets, each of which has a Location, which in turn has a collection of PositionPoints. The problem this structure presents to NHibernate is that the root class has a collection of objects in a one-to-many relationship, each of which has another collection of objects in another one-to-many relationship. My mapping classes had been set up to explicitly turn off lazy loading for Assets, Locations and PositionPoints. NHibernate was therefore obliged to find a way to fetch all this data, and it chose to do this by first retrieving the data for Equipments, Assets and Locations, and then executing a single query for each Location to retrieve all of its PositionPoints.
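The access pattern itself is easy to reproduce outside NHibernate. The sketch below is not my project code: it uses Python with an in-memory SQLite database, a hypothetical CountingDb wrapper to count round trips, and two cut-down tables named after the Location and PositionPoint classes. It loads the same data twice, once with a query per Location and once with a single join.

```python
import sqlite3


class CountingDb:
    """Hypothetical wrapper that counts how many SELECTs reach the database."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def select(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Location (id INTEGER PRIMARY KEY);
    CREATE TABLE PositionPoint (
        id INTEGER PRIMARY KEY, location_id INTEGER, x REAL, y REAL
    );
""")
for loc_id in range(1, 6):  # five Locations, two PositionPoints each
    conn.execute("INSERT INTO Location (id) VALUES (?)", (loc_id,))
    conn.executemany(
        "INSERT INTO PositionPoint (location_id, x, y) VALUES (?, ?, ?)",
        [(loc_id, 0.0, 0.0), (loc_id, 1.0, 1.0)],
    )

# Select n + 1: one query for the Locations, then one more per Location.
db = CountingDb(conn)
points_n_plus_1 = {}
for (loc_id,) in db.select("SELECT id FROM Location ORDER BY id"):
    points_n_plus_1[loc_id] = db.select(
        "SELECT x, y FROM PositionPoint WHERE location_id = ? ORDER BY id",
        (loc_id,),
    )
n_plus_1_queries = db.queries

# Same data from a single joined query.
db = CountingDb(conn)
points_joined = {}
for loc_id, x, y in db.select(
    "SELECT l.id, p.x, p.y FROM Location l "
    "JOIN PositionPoint p ON p.location_id = l.id ORDER BY l.id, p.id"
):
    points_joined.setdefault(loc_id, []).append((x, y))
join_queries = db.queries

print(n_plus_1_queries, join_queries)
```

With five Locations the first approach costs six queries and the second costs one. With several thousand Locations, and a network between the application and the database, that difference becomes the whole running time.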

I couldn’t remember why exactly I had turned lazy loading off for these relationships (perhaps I should have commented my mapping file with an explanation). Therefore I modified the mapping file to turn lazy loading back on. As expected this solved the Select n + 1 problem, as NHibernate was no longer obliged to fully populate Locations and PositionPoints. However, this change caused an exception to be thrown in the business layer, a LazyInitializationException. This was caused by logic in the business layer which was attempting to read the PositionPoints property of a Location after the session which originally obtained the root objects had been closed. Indeed this exception may well have been the reason I had previously decided to turn lazy loading off for these objects. So the idea of using lazy loading was not a viable solution, at least not without some other changes being made. A little research around the lazy initialization problem led me to the idea of injecting my NHibernate session into the data access layer from the business layer, allowing me to use the same session for the lazy loading, but I really didn’t want my business layer to know anything about database sessions.

I reverted my code to switch lazy loading back off and continued to investigate my original problem. I tried instructing NHibernate to eagerly load my objects using a HQL query to eagerly fetch associated data, but this resulted in a cartesian product issue, where the returned collection contained duplicate objects.

Then I discovered a page on ayende.com on NHibernate Futures.

Chapter 4: Futures

NHibernate Futures are a variation on the MultiCriteria feature, which allow us to combine queries to eagerly load one-to-many associations in exactly the way I needed. I would have to define a Future query to retrieve all of my Equipments, then another to retrieve my Assets, and another to retrieve my PositionPoints. These queries would then be combined by NHibernate to retrieve and combine all the required data in a single roundtrip. Finally it seemed like I had found a solution to my problem. I modified my code to use Future queries and tested it.

But it didn’t work!

Stepping through the code and checking my log file revealed that each Future query was causing a trip to the database. Future queries are not supposed to result in an immediate trip to the database; execution should be delayed until the last possible moment.

Again I had hit a brick wall. So again I started googling for answers.

After some time I very fortunately stumbled upon an explanation – NHibernate Future queries do not work with an Oracle database. This was disappointing.

Chapter 5: Getting Futures to Work with Oracle

So now I had reached a point where I had discovered an NHibernate feature which would seemingly allow me to eagerly populate my collection of objects in an efficient way. But it wasn’t supported with Oracle. I did however discover a method of getting Futures to work with Oracle on Stack Overflow.

I would need to extend NHibernate’s OracleDataClientDriver and BasicResultSetsCommand classes. I followed the instructions, and updated my NHibernate config file to use the new driver. I reran my code using Futures, and it worked! All of my data was returned in a single trip to the database! But it wasn’t quick. In fact it was very slow. The whole point of this was to try to optimize my code. The Select n + 1 problem seemed to be an obvious reason for its slowness. I had solved that problem. But my code was still slow. Why? The reason was that the solution I had found on Stack Overflow to get NHibernate Futures to work with Oracle used cursors. And cursors are slow. The built-in Futures feature results in SQL which does not use cursors. I had found a workaround but it wasn’t a good solution for my problem. Yet again I felt like I was back to square one.

Chapter 6: Rethinking My Approach

Having gone down various rabbit holes and tried a number of potential solutions, it was time now to take a step back from the problem.

What had I learnt?

I needed to obtain a large collection of Equipments, and their associated Assets. The operation was too slow because of a Select n + 1 problem. I needed to read PositionPoint data in the business layer. I couldn’t lazy load this data because of a LazyInitializationException. I couldn’t use NHibernate Futures because the result was still too slow (with the Oracle workaround at least).

But what exactly did I need to use my PositionPoints for? I reviewed my business layer code and then it hit me. Of the several thousand Equipments and Assets I was retrieving, I only actually needed to access the PositionPoints of a small number of them! Fewer than ten in fact. Therefore if I turned lazy loading back on, which would result in a fast select query to obtain my root objects, I could identify in my business layer which objects I actually needed to access PositionPoint data for, and hit the database again (using a new session), just for those particular objects.
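The shape of that solution can be sketched as follows. Again this is Python against an in-memory SQLite database rather than my NHibernate code, the tables are cut down to the bare minimum, and the rule that picks which Locations matter is a made-up placeholder for the real business logic. A real application would open a new session for the second query, where the sketch simply reuses one connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Location (id INTEGER PRIMARY KEY);
    CREATE TABLE PositionPoint (
        id INTEGER PRIMARY KEY, location_id INTEGER, x REAL, y REAL
    );
""")
conn.executemany(
    "INSERT INTO Location (id) VALUES (?)", [(i,) for i in range(1, 1001)]
)
conn.executemany(
    "INSERT INTO PositionPoint (location_id, x, y) VALUES (?, ?, ?)",
    [(i, float(i), y) for i in range(1, 1001) for y in (0.0, 1.0)],
)

# Step 1: one fast query for the root objects, without any PositionPoints.
location_ids = [row[0] for row in conn.execute("SELECT id FROM Location ORDER BY id")]

# Step 2: the business layer decides which few objects need their points.
# (Placeholder rule for illustration only.)
needed = [loc_id for loc_id in location_ids if loc_id % 250 == 0]

# Step 3: one further query, restricted to just those objects.
placeholders = ", ".join("?" for _ in needed)
rows = conn.execute(
    "SELECT location_id, x, y FROM PositionPoint "
    f"WHERE location_id IN ({placeholders}) ORDER BY location_id, id",
    needed,
).fetchall()

points_by_location = {}
for loc_id, x, y in rows:
    points_by_location.setdefault(loc_id, []).append((x, y))

print(len(location_ids), needed, len(rows))
```

Two queries are issued in total, however many root objects there are, and the second one only touches the handful of rows that are actually needed.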

A few minutes of coding later and at last, I had my solution. The operation which had previously been taking around one and a half minutes was now taking around 30 seconds – an acceptable level of performance.


Looking back on this journey, I must admit I feel a little stupid. I had been looking at the problem in the wrong way. I had assumed that my approach was correct, and that I needed to optimize my use of memory or NHibernate, when in fact it was my algorithm which was inefficient. This is the main lesson I will try to take from this experience. When faced with a database performance issue, first review your algorithm and consider whether you are retrieving data you don’t actually need, particularly when using an ORM framework. There are also a few other things I will take away from this. PerfView is a great tool which I am sure I will use again. NHibernate logging is an equally valuable tool for understanding what is going on under the hood. And it remains a mystery how anyone ever coded before the existence of the Internet!

A Tale of Optimization (part 1)

Ronnie Mukherjee 0 Comments


I intended for this article to be contained within a single post, but it turned out to be too long for that. Click here for part 2.


Over the past couple of days I have been on quite a journey in trying to optimize a method in my current project. I’ve known this operation was slow for the past few months, but it has never really been a priority to address the issue.

On my daily commute I often listen to podcasts, my favourites being .NET Rocks, Radiolab and The Guardian Football Weekly. Earlier this week I listened to a .NET Rocks episode titled Making .NET Perform with Ben Watson. Watson is the author of Writing High Performance .NET Code and during the show he and the guys discussed the topic of performance optimization from a number of different angles. The show inspired me to finally look into this annoying performance issue which had been on my radar for months. As I was listening I contemplated the problem, and in particular took an interest in a discussion on a favourite tool of Watson’s: PerfView. I had never heard of PerfView, but it sounded like the perfect application to help me understand my performance issue, and if nothing else offered the chance to try out a new tool. Apparently it was free and lightweight – two characteristics I always love in a development tool.

Chapter 1: Adventures in PerfView

Later that day, sitting at my desk, I downloaded PerfView and read through its tutorial. What a great tool! I had previously used Red Gate’s ANTS Performance Profiler, admittedly to a limited extent, but PerfView seemed easier to use, just as capable and a great deal more lightweight. Essentially PerfView helps you to explore two aspects of your application – CPU usage and memory allocation. My problem operation involved retrieving several thousand rows of data from an Oracle database, to automatically populate a collection of complex C# objects, using NHibernate. I had a hunch that it was the complexity of each object, with several layers of inheritance and multiple associations, that was the problem. I was perhaps slightly biased having just heard Watson emphasise the importance of memory allocation in .NET applications and how slowness was often a result of memory issues. Indeed, the PerfView tutorial states:

If your app does use 50 Meg or 100 Meg of memory, then it probably is having an important performance impact and you need to take more time to optimize its memory usage.

So I loaded my application in Visual Studio, paused execution with a breakpoint and used PerfView to take a snapshot of the heap, when I knew my large collection of objects would be in memory. I discovered that although IIS Express was indeed using over 100MB of memory, only a fraction of this (around 10%) was being used by the heap. So maybe memory allocation wasn’t the problem at all?

Next I decided to use PerfView to analyse the CPU usage during my long operation. In total the operation was taking around one and a half minutes. I ran an analysis and was not surprised to find that the bulk of this time was taken up in the data layer, retrieving data from the database and implicitly populating my collection of several thousand objects. This was just as I had feared. Would I have to redesign my database and class structure to remove some layers of complexity? This would be a huge task. However, on closer inspection, I realised that although over 80% of the CPU usage during this operation was taken up inside my data retrieval method, the total CPU usage time was in fact only 15 seconds or so. Surely this could mean only one thing – it must have been the database query which was taking so long, which surprised me as several thousand rows is of course not much to ask of a database.

Chapter 2: NHibernate Logging

This project is the first time I have used NHibernate. While I think I can see its benefits, or rather the benefits of using an ORM tool in general, I am not totally convinced. I come from a traditional background of writing and calling my own stored procedures, and miss that complete control of how and when the database is called. There have been a few times when I have wrestled with NHibernate to achieve the desired results, but perhaps this is just a part of getting to grips with it and learning how to use it to my advantage. In any case, having concluded that the problem involved my database query, I wanted to know exactly what query was being executed. After some googling I found that I could use NHibernate logging to obtain this query, by adding a couple of lines to my web.config file.

Using breakpoints to isolate the data access method, I was able to examine my log file to obtain the database query in question. It was indeed a very complex query, with many joins corresponding to the multiple layers of inheritance defined in my class and database structure. However, I noticed that stepping from my data retrieval line of code to the next line was in fact pretty quick, taking less than 5 seconds. Copying and pasting the cumbersome SQL query into Oracle SQL Developer and executing it from there confirmed that the query itself was indeed very fast, despite its complexity. So my assumption was proved wrong again. It was not memory allocation that was the problem, it was not my data retrieval query, and it was not CPU usage that was taking up so much time. So what was it? I hit F5 to continue execution from my breakpoint, let the operation complete, and then reexamined my NHibernate log file. To my surprise I discovered that the database had been hit several thousand times, running a relatively simple query each time to populate a collection property on one of my classes. It seemed that, without my knowledge, I had fallen victim to the Select n + 1 problem.

Click here for part 2.

Advice to My Younger Self

Ronnie Mukherjee 0 Comments


In professional terms, at the age of 33 I am still relatively young. I have more of my career ahead of me than behind me. Nevertheless, when I look back, I can see that my perspective has changed considerably. With this in mind, I thought I would consider the question: if I could give some professional advice to myself as a fresh-faced graduate entering a career in programming, what would that be? This post is my answer to that question.

At the start of a project, things always look rosy

One of the most satisfying parts of life in software is beginning new projects. You have a blank sheet of paper upon which you can create the greatest system ever developed. The possibilities are endless. You will get everything done on time, under budget, and you will be a hero. It is difficult to avoid this kind of wishful thinking at the start of a project; in fact, such optimism is a good thing in some respects. All great achievements start with a lofty vision. However, without being too pessimistic or miserable, I believe it is important to temper that early enthusiasm with a dose of realism. There will be difficulties, disagreements and unexpected obstacles. People will over-promise and under-deliver. Some things will take much longer to complete than expected. This is just how projects unfold. If we acknowledge this reality from the outset, we are more prepared for difficulties, even if only on a subconscious level. This is something I have learnt from experience. Things never run completely smoothly. One common mistake that people make is to believe that by following a particular methodology or project management method, difficulties can be largely eliminated. This is simply not the case. There will be difficulties, and a degree of stoicism is required to handle and overcome these difficulties.

Time and experience are the best teachers

Early on in my career, I was desperate to make progress. I could see that the people around me were better than me and I wanted to bridge that gap as quickly as possible. I studied and tried hard but just couldn’t move forward at the speed I wanted to. Now I see that the reason those around me were ahead of me was simply that they were more experienced. There is a reason that job descriptions tend to require a certain number of years of experience. Reading a book or completing a tutorial is incomparable to real experience. In addition, personal and professional development take time – not only the hours spent at work, but the years spent developing an overall picture of work, people and life. As you encounter and overcome obstacles, your brain forms new connections which make sense of those problems and prepare you for their recurrence in the future. There is no other way to form these connections than to gain experience and wait. Success in my opinion is essentially about learning to solve problems, whether the problem is a bug in your code or a difficult colleague. You can read about bug-fixing or relationships, and this can help to some extent, but to really develop you need to face and overcome these problems.

Satisfaction at work is down to the people around you

As a junior programmer, I greatly underestimated how important people are to your level of satisfaction at work. From your peers to your boss to your customers, the people around you are the biggest influence on your day-to-day levels of job satisfaction. You can be faced with a failing project or a seemingly insurmountable snag list, but if the people around you are intelligent, positive and understanding, you will be able to cope and learn. Equally, you can have access to all the latest tools and technologies, and use them to design and deliver a brilliant system, but if you are working with difficult people, you won’t enjoy the experience. It is easy to think of life in programming consisting simply of a programmer and his machine, joining forces to conquer the world (or at least his project). Indeed, programmers are perceived stereotypically as geeks, because they are seen as lacking in social skills. This stereotype comes from the fact that introverts are often attracted to computers as a possible alternative to having to deal with actual people. I see myself as something of an introvert and this is possibly what drew me to computers initially, but there is simply no escaping the fact that you can’t get very far alone. The good news is that human relationships offer rewards far greater than anything offered by a machine, and the real value of a successful project comes from sharing satisfaction with your colleagues and your customers.

Don’t just code

As I have progressed in my career, I have learnt that actually writing code is a small part of being a good programmer. A much more important skill is the ability to think well. That is, the type of thinking required to take a vague task or problem and turn it into a plan of action. Sometimes we need to step back and perform a ‘brain dump’ of everything on our minds. We need to learn to capture, organize and prioritise ideas, and also to let go of some of our desires. For example, upon experiencing a desire for our code to be supported by unit tests, a seemingly reasonable next step would be to start writing unit tests. But we need to learn to view that desire relative to the bigger picture, to make sacrifices and realise that we can’t achieve everything. As much as we would like our code to be faster, is performance optimization the best way we could be spending our time right now? The only way to effectively consider such questions, I have found, is to stop coding for a while and start thinking. Make a list on paper, in notepad, or on a whiteboard. Draw a mind map, be messy, write down your thoughts. Think about how much time you have, make some estimates, make some sacrifices and decide upon the best course of action. Then go ahead and start coding in the right direction.

I hope you have found something interesting or useful in this post, particularly if you are new to programming. I have no doubt that in ten years’ time, my view will be quite different. As we progress through our careers and our lives, our experiences will inevitably reshape our views. It would be nice to know what my future self would advise me right now, but I guess I’ll just have to wait and see.