
Joy of Programming: Understanding Concurrency Bugs

Ganesh Samarthyam


Concurrency has come of age with the wide use of multi-core processors. In this article, let us explore the importance of writing correct concurrent code.

Multi-core processors have really become mainstream these days. It is common to see mobile phone processors with dual cores, with some new models even having quad cores. Almost all computers (laptops, servers, etc.) have multiple cores. With the wide use of multi-core processors, it has become more important than ever before to write concurrent code that exploits the power of these processors.

In the past, a lot of multi-threaded code was written, but it ran on single-core processors. Concurrent code was written mainly for running tasks in the background, providing responsive user interfaces, and so on. But when we run these applications on systems with multiple cores, they become truly concurrent, and concurrency bugs start showing up.

Writing correct concurrent code is not easy. With everything else being equal, concurrent code can be expected to have more problems than sequential (deterministic) code. Why? Sequential programs are influenced by input, the system's environment and user interaction. In addition to these factors, concurrent programs are influenced by the ordering of events (such as scheduling, which is non-deterministic). Testing concurrent programs is also difficult, for two main reasons: limited observability and limited controllability. The tester cannot observe important details of program execution, like the interleaving of threads. The tester also cannot easily reproduce problems, which limits controllability. Experts Herb Sutter and James Larus put it succinctly: "...humans are quickly overwhelmed by concurrency and find it much more difficult to reason about concurrent than sequential code. Even careful people miss possible interleavings..."

When I wrote concurrent programs, I got exposed to different kinds of concurrency problems. I always wondered why no one told me about the fundamental kinds of concurrency problems that one ought to be aware of. So, I created a quick and simple classification of concurrency bugs, which has only three categories of problems that you need to remember: determinism-related, safety-related, and liveness-related. Well-known definitions of these three properties are:
- Determinism: Ensure that, for a given set of inputs, the output values of a program are the same for any execution schedule.
- Safety: Ensure that nothing bad happens.
- Liveness: Ensure that something good eventually happens.

Determinism-related bugs

Data races (also known as race conditions) are perhaps the best known bugs related to determinism.

Typically, when we talk about a data race, we mean a low-level data race: two or more concurrent threads access a shared variable, at least one access is a write, and the threads use no explicit mechanism (such as a mutex) to prevent the accesses from being simultaneous. However, a data race can also be high-level, when a set of shared variables needs to be accessed or modified together atomically.
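To make the low-level case concrete, here is a minimal Java sketch (the class and variable names are illustrative, not from the article): two threads increment a shared counter without any mutual exclusion, so increments can be lost.

```java
// A minimal sketch of a low-level data race: 'count++' is a
// read-modify-write sequence, so increments from the two threads can
// interleave and be lost; the final value is often less than 200000.
public class DataRaceDemo {
    static int count = 0; // shared variable, no synchronization

    public static void main(String[] args) throws InterruptedException {
        Runnable increment = () -> {
            for (int i = 0; i < 100000; i++) {
                count++; // unsynchronized write: the data race
            }
        };
        Thread t1 = new Thread(increment);
        Thread t2 = new Thread(increment);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("Final count: " + count);
    }
}
```

Marking the increment with a synchronized block (or using java.util.concurrent.atomic.AtomicInteger) would make the final count reliably 200000.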

There are many other kinds of determinism bugs as well. For instance, when the code depends on thread scheduling, it can cause subtle bugs. I remember cases in which programmers used sleep calls instead of a mutex or the wait/notify pattern for safe access to shared variables. In such cases, the application may work fine on the programmer's machine, but the bug may get exposed in a testing or production environment, as in the following real-world incident.

On August 14, 2003, millions of people lost electric power in the northeastern USA and Canada. There were several factors contributing to the blackout, and the official report indicated a problem in C++ alarm-monitoring software. There was a data race caused by artificially introduced delays in the code. Because of this race condition, the alarm event handler went into an infinite loop and failed to raise an alarm. This eventually led to the power blackout.
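For thread coordination, the wait/notify pattern mentioned earlier is the safe alternative to sleep-based guessing. The following is a minimal sketch (names are illustrative, not from the article): the consumer blocks on a shared monitor until the producer signals that data is ready, instead of polling with an arbitrary delay.

```java
// Sketch of the wait/notify pattern: the consumer parks in wait()
// until the producer sets the flag and notifies, so correctness does
// not depend on timing or scheduling, unlike sleep-based polling.
public class WaitNotifyDemo {
    static final Object lock = new Object();
    static boolean ready = false;
    static String data;

    public static void main(String[] args) throws InterruptedException {
        Thread consumer = new Thread(() -> {
            synchronized (lock) {
                while (!ready) {   // loop guards against spurious wakeups
                    try { lock.wait(); } catch (InterruptedException e) { return; }
                }
                System.out.println("Consumed: " + data);
            }
        });
        consumer.start();
        synchronized (lock) {      // producer side
            data = "payload";
            ready = true;
            lock.notifyAll();      // wake any waiting consumer
        }
        consumer.join();
    }
}
```

Note that the while-loop around wait() also makes the code correct even if the producer runs before the consumer reaches wait().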

Safety-related bugs

A well-known safety-related concurrency bug is ‘missing locks’, i.e., not using mutexes for a section of code that must be protected from concurrent execution.

Another well-known problem is ‘open call’, i.e., making a call to a method that is not thread-safe, from code that is part of a critical section.

There are other, less common bugs in this category as well. For example, the ‘two stage access’ problem occurs when a sequence of operations needs to be protected as a whole, but each operation is protected only separately.
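A small hypothetical sketch of the two stage access problem, using java.util.Vector (whose individual methods are thread-safe): each operation is protected separately, but the check-then-act sequence as a whole is not.

```java
import java.util.Vector;

// Sketch of 'two stage access': Vector.isEmpty() and Vector.get() are
// each internally synchronized, but nothing protects the sequence, so
// another thread can clear the vector between the two calls.
public class TwoStageAccess {
    final Vector<String> items = new Vector<>();

    // Broken: get() can throw ArrayIndexOutOfBoundsException if
    // another thread empties the vector after the isEmpty() check.
    String lastUnsafe() {
        if (!items.isEmpty()) {                  // stage 1: check
            return items.get(items.size() - 1);  // stage 2: act
        }
        return null;
    }

    // Fixed: hold the Vector's own lock across the whole sequence.
    String lastSafe() {
        synchronized (items) {
            return items.isEmpty() ? null : items.get(items.size() - 1);
        }
    }
}
```

Synchronizing on the collection itself works here because Vector uses its own monitor for its internal locking.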

Liveness-related bugs

Deadlocks and livelocks are perhaps the best-known concurrency problems in this category.

A deadlock happens when there is a cycle in the resources acquired by different threads, and each holds on to its resources forever (much as often happens at the traffic signals in India).
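The classic lock-ordering cycle, and one common way out of it, can be sketched as follows (names are illustrative; the timeout-based back-off shown is a standard avoidance technique, not something prescribed by the article):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlockDemo {
    static final ReentrantLock lockA = new ReentrantLock();
    static final ReentrantLock lockB = new ReentrantLock();

    // The deadlock-prone pattern (do NOT run these concurrently as-is):
    //   thread 1: lockA.lock(); lockB.lock();   // A then B
    //   thread 2: lockB.lock(); lockA.lock();   // B then A
    // If each thread gets its first lock, neither can get its second:
    // a cycle of threads waiting on each other's resources.

    // Avoidance: acquire with a timeout and back off on failure, which
    // breaks the 'hold and wait' condition needed for deadlock.
    static boolean withBothLocks(ReentrantLock first, ReentrantLock second,
                                 Runnable criticalSection) throws InterruptedException {
        if (first.tryLock(50, TimeUnit.MILLISECONDS)) {
            try {
                if (second.tryLock(50, TimeUnit.MILLISECONDS)) {
                    try {
                        criticalSection.run();
                        return true;
                    } finally { second.unlock(); }
                }
            } finally { first.unlock(); }
        }
        return false; // could not get both locks; caller may retry later
    }
}
```

The simpler cure, where feasible, is to make every thread acquire the locks in the same global order.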

Livelocks happen when two or more threads continuously change their state in response to changes in the other threads, without doing any useful work. For instance, one thread may create a file and another deletes that file, and they keep watching for these events and stay busy undoing each other's actions!
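The undo-each-other scenario above can be mimicked with a small, bounded sketch (illustrative names, not from the article): each diner politely hands over the shared spoon whenever the other is hungry, so neither ever eats, yet both threads stay busy. The loops are bounded so the demo terminates; a real livelock would spin indefinitely.

```java
// Bounded livelock sketch: a diner eats only if the other is not
// hungry, and otherwise passes the spoon. Since both start hungry,
// both keep deferring forever and neither makes progress.
public class LivelockDemo {
    static volatile int spoonOwner = 1;   // which diner holds the spoon
    static volatile boolean hungry1 = true, hungry2 = true;

    public static void main(String[] args) throws InterruptedException {
        Thread d1 = new Thread(() -> {
            for (int i = 0; i < 1000 && hungry1; i++) {
                if (spoonOwner != 1) continue;
                if (hungry2) { spoonOwner = 2; continue; } // defer to diner 2
                hungry1 = false;                           // eat (never reached)
            }
        });
        Thread d2 = new Thread(() -> {
            for (int i = 0; i < 1000 && hungry2; i++) {
                if (spoonOwner != 2) continue;
                if (hungry1) { spoonOwner = 1; continue; } // defer to diner 1
                hungry2 = false;                           // eat (never reached)
            }
        });
        d1.start(); d2.start();
        d1.join(); d2.join();
        System.out.println("hungry1=" + hungry1 + ", hungry2=" + hungry2);
    }
}
```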

When high-priority threads keep using the CPU without letting lower-priority threads do their tasks, we have the problem of starvation: some work will never get done, and that will affect the program.

Sometimes, liveness problems happen because a thread ‘waits forever’! For instance, I have seen an application hang because a thread that acquired a critical section never returned and the program was waiting for that thread to complete.

Concurrency bugs often take many hours to debug, and so it is better to be prepared and safe, than sorry. So, if you write concurrent programs, keep the three kinds of bugs mentioned in this article in mind and avoid them. You'll be happier for having done that.

By: Ganesh Samarthyam

The author is a freelance corporate trainer and consultant based in Bengaluru. You can reach him at ganesh.samarthyam at gmail dot com.



Analyse Java Memory Dump with Eclipse

This is the third and final part of the series of articles on Java heap and thread dump tools. This article covers the Eclipse Memory Analysis Tool and the Thread Dump Analysis Tool from IBM.

The default command line tools that come with Oracle JDK are useful to a limited extent but have a steep learning curve. Another problem with the default command line tools is that these are not portable across different JVMs—IBM and Oracle. In situations where the dump needs to be taken from an IBM JVM, the Oracle tools are not very helpful.

Eclipse Memory Analysis Tool

Eclipse Memory Analysis Tool (MAT) is a free tool that can be used to acquire and analyse memory dumps from both IBM and Oracle JVMs; it can be downloaded from www.eclipse.org/mat. As discussed earlier, Oracle JVM dumps are in the HPROF format, while IBM dumps are usually in the PHD format. The default installation of MAT can only acquire and analyse HPROF dumps; support for IBM dumps needs to be installed separately. Instructions to install the plugin that enables PHD format support are available at http://www.ibm.com/developerworks/java/jdk/tools/iema/. Once this plugin is installed, MAT can be used to acquire and analyse both HPROF and PHD dumps.

Note: 1) MAT is used mainly for analysing memory dumps. Though it can open IBM's Java core file and list the threads, this information is not of much use when debugging threading issues; the heap dump files (HPROF and PHD) do not contain the threading information required to troubleshoot them. 2) Neither PHD nor HPROF dumps can be used to analyse native memory; they are most useful for analysing objects on the heap.

Acquiring heap dump

MAT can obtain heap dumps of a locally running JVM process. To acquire a heap dump, go to File→Acquire Heap Dump. This option lists the various heap dump providers (Sun and IBM), along with the options to pass while dumping memory; details on these options can be obtained from the MAT Help menu. In case the dump cannot be acquired from MAT for any reason, the methods described in the first article in this series can be used to obtain the dump, and the dumps obtained using command line utilities can be opened in MAT. The techniques listed in that first article are especially useful in the case of IBM JVMs, where the dump provider may not be able to detect a locally running JVM. MAT can also be used to analyse the heap dumps that are automatically written when the JVM exits due to an OutOfMemoryError.

Figure 1: MAT wizard

Figure 2: MAT overview

Analysing dumps

Choose File→Open Heap Dump to open a heap dump file. This brings up a dialogue box that gives options to analyse the heap dump. The commonly used option is Leak Suspects Report.

Choosing the leak suspects brings up the Overview page that gives the overall heap information such as the heap size and the number of classes. It also reports the biggest objects in terms of memory consumption. These objects are usually a good place to start the analysis of any memory issues.

The left pane of MAT, by default, shows the object inspector view. This view has two sections and is constantly available in most of the views associated with viewing objects. The top part of this view gives the object metadata, such as:
- Memory address of the object
- Class and package names
- Parent object
- Memory addresses of the class object and class loader object
- GC roots of the object

The bottom part gives object values, such as:
- Static members of the object
- Instance variables
- Class hierarchy
- Object value (not applicable for most objects, except objects such as char arrays)

In order to see the biggest objects, the ‘dominator tree’ view is very useful. A dominator tree is a concept from graph theory: a node ‘x’ dominates a node ‘y’ if every path from the root to ‘y’ goes through ‘x’. The same concept applies to heap analysis, since the objects on the heap form a graph through the references from one object to another. The dominator tree view gives information about the ‘containing’ objects of a particular object, or the ‘child’ objects of a particular object. This helps in identifying the biggest memory-consuming objects reachable from a GC root.

In order to understand the concept of the dominator tree better, consider the following code segment:

ArrayList<MyBigObject> list = new ArrayList<MyBigObject>();
for (int i = 0; i < 99999; i++) {
    MyBigObject obj = new MyBigObject();
    list.add(obj);
}

In the above code, the ArrayList object ‘list’ is the dominator object of all the MyBigObject instances created. So, in MAT, all the instances of MyBigObject can be viewed by expanding the ‘list’ object. The retained size of ArrayList will include the retained size of all instances of MyBigObject. On the other hand, consider the following code segment:

ArrayList<MyBigObject> list = new ArrayList<MyBigObject>();
Map<Integer, MyBigObject> map = new HashMap<Integer, MyBigObject>();
for (int i = 0; i < 99999; i++) {
    MyBigObject obj = new MyBigObject();
    list.add(obj);
    map.put(i, obj);
}

In the above code, ‘list’ is no longer the dominating object for all the instances of MyBigObject, because these instances are now accessible through another path, ‘map’, as well. Hence, in the dominator tree, the retained size of ‘list’ does not include the retained sizes of the instances of MyBigObject; it only contains the memory occupied by the references to these instances. This is a very important aspect to consider while debugging memory issues. If only one collection object is expected to contain (hold references to) instances of a particular object, but in the dominator view that collection object's retained size does not include the retained size of these instances, then it can be concluded that another object is also holding a reference to those objects. That other object could be preventing garbage collection of these objects, thus resulting in a memory leak.

Figure 3: MAT object inspector

The dominator tree can be opened by clicking on the Dominator Tree link in the bottom half of the Overview page.

Figure 4: MAT opening dominator tree

The objects present on the heap, traced from their GC roots, are listed in the dominator tree. In the dump being considered, the largest retained heap size originates from the main thread. This thread has a reference to the instance of com.perf.memory.MemoryHogger. This object, in turn, has a reference to an instance of java.util.ArrayList that is backed by an object array. This array is the object that consumes the maximum heap space. Drilling down into this object array reveals that it holds a large number of objects of type MyBigObject; each of these contains another object of type TestObject, and so on. The shallow and retained sizes of each object are shown in the dominator tree view.

While the dominator tree view gives a graph view of objects by their retained size from the root, the histogram view lists the objects based on instance count, shallow size and retained size. By default, neither the dominator tree view nor the histogram view differentiates between classes loaded by the application and classes loaded by the bootstrap class loader. Due to this, it is very difficult to tell the memory-consuming objects created by the application apart from the objects created by the JVM itself. MAT provides a feature to group the objects by class loader, which is very useful in analysing the objects created by the program. Though this feature is available in all the major views, it is less useful in the dominator tree view, because the dominating objects of the application program could have been created by the system class loader. For instance, in the above example, the GC root of the object consuming maximum memory (the ‘list’ object) is the main thread of type java.lang.Thread. Since the thread class is loaded by the system class loader, the dominator tree view shows the ‘list’ object in the system class loader group.

Figure 5: MAT dominator tree view

Figure 6: MAT histogram view

The grouping feature is much more helpful in the histogram view, where the objects loaded by the application can be viewed readily by looking at the application class loader’s objects.

This feature of grouping by class loader is even more useful when analysing memory dumps from an application server such as WebSphere. In most typical configurations, the class loader isolation policy for Web and enterprise applications is set and, hence, each application gets its own class loader. Even though the code in different applications could be in different packages, classes for supporting functions such as logging could be common across all applications. In such situations, it is very important to find out to which application a particular object belongs, and grouping by class loader comes in very handy when troubleshooting memory issues under these conditions.

An interesting observation from the histogram in Figure 6 is that the objects do not show their parent or child references. To get those references, right-click an object and choose Merge Shortest Paths to GC Roots. This action shows which GC root is holding a reference (direct, or via another object) to the object. It is essentially the same as going back to the dominator tree view, but for the selected object alone.

Note: MAT reports memory sizes after aligning object sizes on the 8-byte boundary. This is in contrast to JVisualVM, which reports just the object size without aligning it along the 8-byte boundary. Due to this, JVisualVM may report lower-than-actual memory usage by objects.

A couple of utilities are available in MAT to show the memory wastage in the application. The Java collections utilities show the wastage in collection objects and the hash collisions in Map objects. This is useful in finding out whether searches can be optimised by choosing a better hash code algorithm for the keys or by setting a different load factor for these maps.

Prior to Java 1.7, the substring() implementation in java.lang.String returned a string object that was still backed by the original char array, but with a different offset and length. This was done to make the substring() method faster. But it could result in a memory leak: even though a large original string was eligible for garbage collection, its huge underlying char array could not be garbage collected as long as even one small substring object remained. This wastage can be viewed using the ‘Waste in char arrays’ feature.

Note: Since Java 7, the substring() method of String returns a new string that does not point to the original string's char array. This is a trade-off in favour of decreased memory usage over increased time to create substrings.
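One defensive idiom from the pre-Java 7 days (an illustrative sketch, not from the article) was to copy a substring explicitly, detaching it from the large backing char array; on Java 7 and later the copy is unnecessary but harmless.

```java
// Sketch of the defensive-copy idiom: 'new String(token)' produces a
// string with its own backing array, so holding on to the token does
// not pin the entire original input in memory on pre-Java 7 JVMs.
public class SubstringCopy {
    static String firstToken(String hugeInput) {
        String token = hugeInput.substring(0, 4);
        return new String(token); // defensive copy (pre-Java 7 idiom)
    }
}
```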

In addition to the tools described above, MAT also has an OQL console, where specific queries can be entered and executed. The OQL console can be launched by clicking the OQL button. The exact syntax of the OQL commands can be obtained by hitting F1 in the OQL console.

IBM Thread Analyser

IBM Thread and Monitor Dump Analyser for Java can be downloaded from the IBM developerWorks site: https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=2245aa39-fa5c-4475-b891-14c205f7333c

The tool is distributed as an executable jar file, and can be used to analyse thread dumps created by IBM JVMs. As mentioned in the introduction, thread and heap dumps for an IBM JVM can be triggered by sending a Control+Break signal to the JVM. Once the dump is generated, the core file, usually named javacore.<timestamp>.txt, can be opened in the IBM Thread Analyser. Like JVisualVM's thread analyser, the IBM Thread Analyser also detects any deadlocks in the running application and reports them when the dump is loaded.

When a dump file is loaded, the Java system properties and environment variables are listed along with the thread details. In addition, a heap usage summary at the point where the dump was taken and information on previous garbage collection cycles is given. Most of the useful features are available in the ‘Analysis’ menu option.

The basic analysis that can be done is Thread Status Analysis. This gives a graphical view of the threads, based on their states. The details of those threads are displayed in the bottom panel. For each thread, the name, state, native thread ID, the Java method and the stack depth are displayed. Individual threads can be selected in this view, and details about them are displayed on the right-hand side: the Java stack trace of the thread, along with the monitors it owns and the monitor it is waiting on. This is useful in identifying the piece of code that has caused a deadlock (if any) and the monitor that is needed to break the deadlock.

Figure 7: MAT Java collections

Figure 8: MAT Java basics arrays

The tool also distinguishes between threads that are waiting on a monitor and threads that are waiting on a condition. A thread that is waiting to acquire a lock is in the BLOCKED (‘waiting on lock’) state. A thread that is waiting to be notified of some condition, usually via notify() or notifyAll(), so that execution can proceed, will be in the WAITING (‘wait on condition’) state. In general, threads in the BLOCKED state are a cause for concern, as they indicate lock contention. Threads in the WAITING state may or may not be a cause for concern, depending on which monitor's condition they are waiting on. There can be overlaps in thread states; for instance, a blocked thread could also be deadlocked, in which case the status is displayed as Deadlock/Blocked. The percentages of threads in each state, displayed in Thread Status Analysis, can add up to more than 100 per cent because threads in such dual states are reported under both.
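These two states can also be observed programmatically (a small illustrative sketch, not from the article): a thread parked in Object.wait() reports WAITING, while a thread stuck trying to enter a synchronized block held by another thread would report BLOCKED.

```java
// Sketch: sample a thread's state while it is parked in wait().
// Thread.getState() returns WAITING for a thread inside Object.wait();
// a thread contending for the 'lock' monitor would instead be BLOCKED.
public class ThreadStates {
    static final Object lock = new Object();
    static volatile boolean released = false;
    static Thread.State observed; // state sampled while parked in wait()

    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                while (!released) {   // guard against spurious wakeups
                    try { lock.wait(); } catch (InterruptedException e) { return; }
                }
            }
        });
        waiter.start();
        // Poll until the thread has actually parked inside wait().
        while (waiter.getState() != Thread.State.WAITING) {
            Thread.sleep(5);
        }
        observed = waiter.getState(); // WAITING: 'wait on condition'
        synchronized (lock) {
            released = true;
            lock.notifyAll();         // release the waiter
        }
        waiter.join();
        System.out.println("Observed state: " + observed);
    }
}
```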

The Method Analysis view organises the threads based on their status and the Java method they are executing. No Java stack can be reported for threads executing native code.

A very useful feature in analysing the performance of applications is the ability to compare two thread dumps. By comparing thread dumps taken at different points in time, the state changes each thread has undergone can be understood and, possibly, the reason behind a deadlocked or poorly responding thread can be understood better. To compare thread dumps, load multiple thread dumps, choose the dumps to compare, right-click and choose Compare Threads. Similarly, the monitors owned by threads in different dumps can be visualised by choosing Compare Monitors.

Figure 9: IBMTA deadlock detection

Figure 10: IBMTA analysis

The native memory analysis gives information on the various memory areas, threads, JIT compiler details and class libraries in a single view. This is useful to analyse the overall memory profile of the JVM, i.e., analyse the memory utilisation of each of these memory areas. The Monitor Detail view is the most useful tool in quickly identifying the monitor dependencies among the threads that have led to a deadlock. The cyclic dependency amongst threads and the monitors that caused these dependencies can be visualised by drilling down the threads. Clicking on a thread gives information on the monitor owned by the thread and the monitor it is waiting on. It can be seen that this monitor is being held by another thread.

A variety of free tools is available to profile and analyse Java applications in order to find memory issues and performance bottlenecks. Most of the tools provide the basic information required to perform the analysis. Advanced features such as the Object Query Language (OQL) can be used by experts to perform more detailed analysis. The choice of tool depends on familiarity with the tool and the kind of JVM being monitored (IBM or Oracle). Expertise in JVM memory and threading concepts matters more for analysing performance issues than the choice of tool.

Figure 11: IBMTA thread status


By: Murali Suraparaju

The author holds an M Tech in Computer Science. He has worked extensively on building enterprise applications using JEE technologies, and has developed solutions for enterprise and embedded products. He is a member of the Performance Engineering practice in the Financial Services division of Infosys Limited.
