Talk: Application is slowed down? The culprit is Log4j!

Talk: Application is slowed down? The culprit is Log4j!

[[338229]]

Some time ago, we discovered that a SaaS application would intermittently freeze and slow down. Because the cause was not located for a long time, the only solution was to restart it.

This phenomenon is different from the previous program lag we have encountered, because we found that this application slowed down not only during high traffic periods, but sometimes also during low traffic periods. So this was very strange to everyone.

The slowdown of this type of application may last for a while after restarting, but it may reappear after a while.

troubleshooting

When we were ready to investigate this issue, we tried to capture a thread dump of the application when the application was slow. There are many ways to capture a thread dump, and we chose the "jstack" tool to obtain it.

Getting a thread dump when the problem occurs is critical!

We then uploaded the captured thread dump to an online thread dump analysis tool (https://fastthread.io/). The tool immediately generated a report for us.

The report immediately identified the root cause of the problem. The analysis tool showed that the "http-nio-8080-exec-121" thread was blocking more than 100 threads. Here is the transitive dependency graph showing the blocked threads:


From the figure, we can see that more than 100 threads are blocked by the "http-nio-8080-exec-121" thread. When we click on the "http-nio-8080-exec-121" hyperlink in the figure, it prints out the stack trace of the thread:

Looking closely at the framed part of the diagram, you can see that the thread has acquired the lock of org.apache.log4j.Logger and is performing other operations.

Next, we randomly find a thread blocked by "http-nio-8080-exec-121" and look at its stack information:

Looking at the boxed portion of the stack trace above, we can see that "http-nio-8080-exec-56" is currently in a BLOCKED state, and the reason for the block is that it is waiting to acquire the lock of org.apache.log4j.Logger.

As we have just analyzed, "http-nio-8080-exec-121" has obtained the lock of org.apache.log4j.Logger and is performing other operations, but the lock has not been released, so other threads that want to obtain the lock can only be blocked.

All the remaining blocked threads are also waiting to acquire the lock of org.apache.log4j.Logger. Therefore, whenever any application thread tries to log, it will enter a blocked state because it cannot acquire the lock.

At first we didn't have much idea, so we tried to use the power of Google. We searched for keywords like "org.apache.log4j.Logger blocking thread" on Google.

We stumbled upon this interesting bug in the Apache Log4j bug database, and it was discovered back in 2015 (https://bz.apache.org/bugzilla/show_bug.cgi?id=57714).

This is one of the known bugs in the Log4J framework and one of the main reasons for developing the new Log4j2 framework.

Due to this bug, any thread trying to print logs went into a blocked state. It caused the entire application to grind to a halt. Once the application was migrated from Log4j to Log4j2 framework, the issue was resolved.

in conclusion

Log4j has been out of maintenance since August 2015. If your application is still using the Log4J framework, it is strongly recommended to upgrade to the Log4j2 framework. Log4j2 is not just the next version of the Log4j framework, it is a new framework written from scratch with many performance improvements.

Finally, if your website is experiencing slowdowns, this factor can also be considered.

About the author: Manhua Programming is a public account that explains boring programming knowledge in the form of comics + audio. It is committed to making programming more fun.

This article is reprinted from the WeChat public account "Manhua Programming", which can be followed through the following QR code. To reprint this article, please contact the Manhua Programming public account.

<<:  What kind of sparks will be created when 5G meets the power grid?

>>:  5G applications with over 100 million users have crossed the starting line

Recommend

What are LACP and PAGP? What are the differences?

In order to create maximum availability and stabi...

HTTP protocol interview challenges

I am an atypical interviewer. For the first quest...

Some Discussions on the Transmission Network in DCI

Preface The exploration of DCI technology has bee...

COVID-19 impacts industries, 5G and broadband will become a top priority

Biden is hoping to finalize an infrastructure bil...

IPv6 global penetration rate reaches 27%, 6G will be deployed in 2030

Recently, Latif Ladid, chairman of the National I...

Knowledge points of wireless network coverage system

1. What is AP? Answer: AP - Wireless Access Point...

Super detailed! Introduction to Ethernet switch security features

Today I will talk to you about the security funct...

Key Roles of Artificial Intelligence in Mobile App Development

[[431728]] 【51CTO.com Quick Translation】 Today, t...