Does your Java application slow down or outright crash and you have no idea why? The prospect of searching through numerous logs gives you the heebie-jeebies? Fear no more! Async-profiler is like a private eye that follows your applications and takes note of everything they do.
On December 4, guest lecturer Krzysztof Ślusarski visited our premises and taught us a great deal about monitoring and profiling Java applications. He talked about async-profiler, a free and open-source profiling tool, and how Generali, an insurance company, used it for solving one of its performance issues.
Async-profiler is a sampling profiler for Java applications that is light enough to be left running continuously. Once switched on, it samples what the app is doing (for example CPU time, memory allocations, and lock contention) and saves the results to a file. The data can be conveniently viewed in the form of a flame graph (Fig. 1). If anything goes wrong, you can open the graph and identify the source of an issue quickly.
Finding out why your app isn’t performing as expected can be a hassle. Most modern applications aren’t monoliths but consist of multiple components that communicate with each other in a myriad of ways. With so many moving parts, troubleshooting is rarely fast or easy. That’s where profiling tools, such as async-profiler, come in handy. They keep sampling what’s happening in your app and present it in a digestible, browsable way (see Fig. 1). So if you want to track down a performance problem in minutes rather than days, async-profiler is an investment you won’t regret (also because it doesn’t cost anything).
Let’s look at a real-life example to better appreciate the capabilities of the tool.
Generali is an international insurance company. It runs an internal knowledge base on Confluence, a web-based corporate wiki by Atlassian. At any time, an employee can log in and browse the wiki. In fact, the wiki registered as many as 18 thousand sessions every day, each handled by EasySSO, a single-sign-on plugin. There was only one problem. For about 20% of those sessions, the process of logging in took between one and five minutes.
Taken individually, that doesn’t seem like a lot of time. But if you multiply the minutes by the number of affected daily sessions, it turns out that Generali’s employees were “on standby” for a combined minimum of 60 hours a day!
18,000 logins / day × 20% × min. 1 minute
= min. 3,600 minutes (60 hours) of idle time a day
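If you’d like to play with the numbers yourself, the estimate above can be sketched in a few lines of Java. This is only a back-of-the-envelope helper: the class and method names are made up for illustration, and the upper bound simply plugs in the five-minute worst case mentioned earlier.

```java
class IdleTime {
    /** Minutes lost per day when slowPercent of logins each take minutesPerSlowLogin. */
    static long idleMinutesPerDay(long loginsPerDay, int slowPercent, long minutesPerSlowLogin) {
        long slowLogins = loginsPerDay * slowPercent / 100;
        return slowLogins * minutesPerSlowLogin;
    }

    public static void main(String[] args) {
        long lower = idleMinutesPerDay(18_000, 20, 1); // every slow login takes 1 minute
        long upper = idleMinutesPerDay(18_000, 20, 5); // every slow login takes 5 minutes

        System.out.println("lower bound: " + lower + " min = " + lower / 60 + " h");
        System.out.println("upper bound: " + upper + " min = " + upper / 60 + " h");
        // prints:
        // lower bound: 3600 min = 60 h
        // upper bound: 18000 min = 300 h
    }
}
```

The real figure sat somewhere between the two bounds, since the delays ranged from one to five minutes.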
What was happening? After inspecting the flame graph, Generali found out that the problem started when Confluence called the Connections class (Fig. 2). Once it was called, a lot of users experienced delays because they were queuing on a monitor (ObjectMonitor in the graph) held by a single thread that kept trying and failing to connect to an external service. There could be many reasons for the communication failure, for example a network error. But why did the thread keep trying to connect for such a long time? The answer: the connection attempt had no timeout.
Once the timeout was implemented, login times dropped from 1–5 minutes to a few seconds. What’s more, the entire process of identifying and eliminating the problem took between 5 and 15 minutes.
Let’s say that again for emphasis: async-profiler allowed a large insurance company to cut its idle time by at least sixty man-hours a day with less than fifteen minutes of work. This fact alone speaks volumes about how powerful the free tool can be for optimizing performance and troubleshooting errors in Java applications.
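We don’t know what the actual fix looked like in code, so treat the following as a minimal sketch of the general idea rather than Generali’s or EasySSO’s implementation. It assumes a shared lock that every login must take and a hypothetical callExternalService() that hangs (simulated here with a sleep). The only point is that the thread holding the lock gives up after a bounded wait instead of blocking everyone else indefinitely.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedLogin {
    // Stands in for the monitor that every login had to acquire (an assumption).
    private final Object monitor = new Object();

    // Daemon threads, so a hung call cannot keep the JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(runnable -> {
        Thread thread = new Thread(runnable, "external-call");
        thread.setDaemon(true);
        return thread;
    });

    /** Hypothetical external call that does not answer in time (simulated). */
    static String callExternalService() throws InterruptedException {
        Thread.sleep(60_000);
        return "connected";
    }

    /** Runs the call while holding the monitor, but waits at most timeoutMillis. */
    String login(Callable<String> externalCall, long timeoutMillis) {
        synchronized (monitor) {
            Future<String> pending = pool.submit(externalCall);
            try {
                return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                pending.cancel(true); // interrupt the stuck call and release the monitor
                return "timed out";
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                pending.cancel(true);
                return "interrupted";
            } catch (ExecutionException e) {
                return "failed: " + e.getCause();
            }
        }
    }

    public static void main(String[] args) {
        BoundedLogin service = new BoundedLogin();
        long start = System.nanoTime();
        String result = service.login(BoundedLogin::callExternalService, 200);
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(result + " after about " + elapsedMillis + " ms");
    }
}
```

In real networking code you would more likely set the timeout on the connection itself (for instance, java.net.Socket has a connect method that accepts a timeout in milliseconds), but the effect on the waiting threads is the same: the monitor is released after a known, short delay.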
We’re really grateful for the chance to familiarize ourselves with async-profiler and look forward to using it in our company. The tool is like an X-ray that examines your application and shows you where it’s hurting. You can download and use it for free from this GitHub repository.
If you liked this article, don’t forget to like and share. We’ll appreciate the love :).