Performance-tuning an enterprise application



#1 usr.c


Posted 28 May 2009 - 08:42 PM

The situation
So you’ve been working on a software system for some time and now it’s time to refactor it. There’s always a fuzzy feeling up for grabs whenever you decide to stop with the change requests and defects and go off the beaten track to refactor-ville. Sometimes the decision to refactor is dictated by management and takes the form of a quality-attribute (non-functional) change request; other times it’s just for the sake of the aforementioned fuzzy feeling, which is not to be underrated, by the way.

A few days ago, we finished working on a refactoring project whose aim was to improve our J2EE system’s performance. Improvement in this particular context was defined as
  1. lower response times, and
  2. lower memory utilization
Both issues had caused a number of embarrassing episodes in our production environment and so it was our job to determine how best to mitigate or eradicate their root causes.

Even when one is faced with a solved issue, it is important to take the idiosyncrasies of the system at hand into consideration. That is, we wouldn’t be the first developers in the world to try our hands at performance tuning a live system, but we definitely would be the only developers in the world to performance-tune our particular system.

Compiling a list of system-specific issues
A good way to start is to go to developers, business analysts, users and other stakeholders and elicit information about their experiences, be they general or specific, framed in terms of modules and components or in terms of business processes. Input from stakeholders who are better versed in the codebase and system is of great value, since they can give you an idea of which use-cases you should focus on the most.

Generating concrete issues from existing patterns
There is a sizable subset of issues affecting quality attributes, including performance, that are technology- and environment-agnostic. It is therefore easy to apply heuristics and best practices derived from those patterns of issues to a software system and reap their benefits. Though such patterns are usually found in the literature, we decided to take a third-party tool (FindBugs) out for a spin. It not only gave us a list of said patterns, but also showed how well our system rated against each one by indicating how many times we were breaking it. We ran FindBugs on our Web and EJB projects and within minutes it gave us a list of all bug patterns and how we fared in each. We were interested in the following patterns:
  • The ones under “Performance”
  • A few of the ones under “Dodgy” (Dead local store, redundant null-checks, duplicate branches, unsatisfied obligation to cleanup streams or resources)
  • A few of the ones under “Correctness” (Dubious method invocation, Useless self-operation)
  • A few of the ones under “Bad Practice” (Database resource not closed on all paths, Stream not closed on all paths)
The important thing here is to realize that when you run a tool on your codebase, taking the results as gospel and fixing every one of the reported issues isn’t necessarily going to automagically give you a better system (however “better” is defined). We learned the hard way that, had resources been scarce, micro-benchmarking would have been an utter waste of man-hours. No one cares if you can shave one second a day off your response time by removing redundant null checks, or save 1 MB of heap by getting rid of dead code. Usually, it’s much more cost-effective to just throw hardware at such problems, which is what we ended up doing.

Prioritizing the final set of issues
Having a set of concrete issues isn’t enough. The most important step is prioritizing them. For us, our criteria for prioritization were
  1. noticeable effect on user-experience, and
  2. impact on code
That is, the things that came highest on our list were the ones that would disturb as little of the codebase as possible for the most performance gain. As it turned out, the first thing on our list was closing streams and resources that were being left open in numerous Data Access Objects. We don’t use any object-relational mapping libraries like Hibernate, so the onus of making sure that things aren’t done sloppily lies on the developer.
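To illustrate the kind of fix this involved, here is a minimal sketch rather than our actual DAO code. `TrackedResource` and `CustomerDao` are hypothetical names, and the resource class merely stands in for a JDBC connection or statement so that the example runs without a database. It uses try-with-resources, which requires Java 7 or later; on older JDKs the equivalent is to close each resource in a `finally` block.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JDBC Connection or Statement.
// It records its name in a shared log when it is closed.
class TrackedResource implements Closeable {
    private final String name;
    private final List<String> closeLog;

    TrackedResource(String name, List<String> closeLog) {
        this.name = name;
        this.closeLog = closeLog;
    }

    @Override
    public void close() {
        closeLog.add(name);
    }
}

// Hypothetical DAO showing the leaky pattern and one possible fix.
class CustomerDao {
    // Leaky: if the "query" throws, neither resource is ever closed.
    static String findNameLeaky(List<String> closeLog, boolean fail) {
        TrackedResource conn = new TrackedResource("connection", closeLog);
        TrackedResource stmt = new TrackedResource("statement", closeLog);
        if (fail) {
            throw new IllegalStateException("query failed");
        }
        stmt.close();
        conn.close();
        return "Alice";
    }

    // Safe: both resources are closed on every path, in reverse order
    // of acquisition, whether or not the "query" throws.
    static String findNameSafe(List<String> closeLog, boolean fail) {
        try (TrackedResource conn = new TrackedResource("connection", closeLog);
             TrackedResource stmt = new TrackedResource("statement", closeLog)) {
            if (fail) {
                throw new IllegalStateException("query failed");
            }
            return "Alice";
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        try {
            findNameSafe(log, true);
        } catch (IllegalStateException e) {
            // The failure still propagates; the point is that cleanup happened.
        }
        System.out.println("Closed after failure: " + log);
    }
}
```

The leaky variant is the shape that tools tend to flag as "not closed on all paths": the close calls exist, but only on the happy path.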

Prioritizing ensures that the issues that are least cost-effective to fix end up at the bottom; unless you prefer to have glistening, squeaky-clean code staring at you every day, such issues should really be ignored. Fixes that make the code cleaner and leaner but shave only a few seconds off your collective response times are really not worth the time and effort.
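One way to picture the two criteria is as a sort order. The sketch below is only an illustration: the `Issue` and `IssuePrioritizer` classes and the 1-to-5 scores are assumptions made for the example, not something we formalized. Issues with the most noticeable effect on users come first, and ties are broken in favour of the smaller code change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical record of one candidate fix.
class Issue {
    final String description;
    final int userImpact; // assumed scale: 1 (barely noticeable) to 5 (very noticeable)
    final int codeImpact; // assumed scale: 1 (a few lines) to 5 (many modules touched)

    Issue(String description, int userImpact, int codeImpact) {
        this.description = description;
        this.userImpact = userImpact;
        this.codeImpact = codeImpact;
    }
}

class IssuePrioritizer {
    // Returns a new list: highest user impact first, then lowest code impact.
    static List<Issue> prioritize(List<Issue> issues) {
        List<Issue> sorted = new ArrayList<>(issues);
        sorted.sort(Comparator.comparingInt((Issue i) -> i.userImpact)
                .reversed()
                .thenComparingInt(i -> i.codeImpact));
        return sorted;
    }

    public static void main(String[] args) {
        List<Issue> issues = Arrays.asList(
                new Issue("Remove redundant null checks", 1, 2),
                new Issue("Close streams left open in DAOs", 5, 2),
                new Issue("Rework caching layer", 5, 4));
        for (Issue issue : prioritize(issues)) {
            System.out.println(issue.description);
        }
    }
}
```

Anything that lands near the bottom of such a list is a candidate for being left alone.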

Measuring success
Without quantitative data, it is difficult to accurately gauge the success of this entire exercise. In our case, success was defined as a noticeable improvement in the system’s responsiveness to user actions. The Out-Of-Memory issues had been solved with hardware upgrades, so we didn’t worry too much about those. To make our results as reflective of reality as possible, we ran tests using Rational’s Performance Tester on a sample set of code paths, chosen by identifying the use-cases that were affected by the largest number of code changes. We ran them on two workspaces: one included none of the changes and the other had all of them. Using two workspaces was meant to reduce the possibility of network latency, caused by calls to interfaces in external systems, polluting our results. Averaging the results over a large number of iterations per use-case, such as 100, helped with that too.
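The averaging idea can be sketched in a few lines. This is not how Rational’s Performance Tester works internally; it is just a hand-rolled illustration, and `ResponseTimer`, `averageMillis` and `percentImprovement` are hypothetical names. Each `Runnable` stands in for one use-case executed against one of the two workspaces.

```java
class ResponseTimer {
    // Runs the use-case the given number of times and returns the mean
    // wall-clock time per run in milliseconds.
    static double averageMillis(Runnable useCase, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            useCase.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / iterations;
    }

    // Relative improvement of the changed workspace over the baseline, in percent.
    // A negative value means the changed workspace was slower.
    static double percentImprovement(double baselineMillis, double changedMillis) {
        return (baselineMillis - changedMillis) / baselineMillis * 100.0;
    }

    public static void main(String[] args) {
        // Placeholder workloads; real ones would drive the same use-case
        // against the unchanged and the changed workspace.
        Runnable baseline = () -> busyWork(200_000);
        Runnable changed = () -> busyWork(100_000);

        double before = averageMillis(baseline, 100);
        double after = averageMillis(changed, 100);
        System.out.printf("baseline %.3f ms, changed %.3f ms, improvement %.1f%%%n",
                before, after, percentImprovement(before, after));
    }

    // Arbitrary CPU-bound filler so the example has something to time.
    private static void busyWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        if (sum < 0) {
            System.out.println(sum); // never true; keeps the loop from being optimized away
        }
    }
}
```

A single run can be skewed by a slow network hop or a JVM warm-up pause; averaging over many iterations dampens such outliers, which is why we settled on a number like 100 per use-case.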

Take-aways
  1. It is essential to have a well-structured plan when refactoring a codebase, both to manage it well and to be able to measure its success
  2. Some problems are better solved by throwing hardware at them
  3. When using a third-party tool, it is important not to think strictly in terms of false positives and false negatives; a pragmatic view of which issues are likely to have noticeable impact and which are likely to have negligible impact should be adopted
  4. Whenever code is changed, checked-in and then deployed, particularly by mere mortals, there is always the possibility of it negatively impacting other parts of the system, so limiting modifications to only those whose effects are likely to be significant is more cost-effective
  5. To avoid the same issues creeping into the codebase again, it is essential to add a step to your deployment process that ensures that all files scheduled for deployment do not contain any of the identified issues; in the absence of automation, a builder, senior developer or second developer could simply code-review the files
Ali Almossawi
May 21, 2009






