Hang bugs, in which software gets stuck but doesn’t crash, can frustrate both users and programmers, and companies can take weeks to identify and fix them. Now researchers from North Carolina State University have developed software that can spot and fix these bugs in seconds.
“Many of us have experience with hang bugs – think of a time when you were on a website and the wheel just kept spinning and spinning,” says Xiaohui (Helen) Gu, co-author of a paper on the work and a professor of computer science at NC State. “Because these bugs don’t crash the program, they’re hard to detect. But they can frustrate or drive away customers and hurt a company’s bottom line.”
With that in mind, Gu and her collaborators developed an automated program, called HangFix, that can detect hang bugs, diagnose the relevant problem, and apply a patch that corrects the root cause of the error.
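To make the idea of a hang bug concrete, here is a minimal, hypothetical sketch in Python. It is not taken from HangFix or from any of the applications tested, and the function names are invented for illustration. The first function keeps looping once the input runs out, because it never notices that it has stopped making progress; the second shows one plausible kind of repair, which checks for that condition and fails fast instead of spinning.

```python
import io


def skip_bytes_buggy(stream, count, max_iterations=1000):
    """Try to skip `count` bytes of `stream`.

    Hypothetical hang bug: at end of stream, read() returns b"" and
    `remaining` never shrinks, so the loop would spin forever. The
    `max_iterations` cap exists only so this demonstration terminates.
    """
    remaining = count
    iterations = 0
    while remaining > 0:
        chunk = stream.read(remaining)
        remaining -= len(chunk)  # no progress once the stream is exhausted
        iterations += 1
        if iterations >= max_iterations:
            return "hang"  # stands in for "stuck until someone kills it"
    return "ok"


def skip_bytes_patched(stream, count):
    """Same task, with an explicit check for the no-progress case."""
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # Surface the problem instead of looping silently.
            raise EOFError(f"stream ended with {remaining} bytes left to skip")
        remaining -= len(chunk)
    return "ok"


if __name__ == "__main__":
    print(skip_bytes_buggy(io.BytesIO(b"abc"), 10))
    try:
        skip_bytes_patched(io.BytesIO(b"abc"), 10)
    except EOFError as err:
        print("patched version reported:", err)
```

The buggy version never crashes and never reports an error, which is why problems of this kind are hard to spot in production.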
The researchers tested a prototype of HangFix against 42 real-world hang bugs in 10 commonly used cloud server applications. The bugs were drawn from a database of hang bugs that programmers discovered affecting various websites. HangFix fixed 40 of the bugs in seconds.
“The remaining two bugs were identified and partially fixed, but required additional input from programmers who had relevant domain knowledge of the application,” Gu says.
For comparison, it took weeks or months to detect, diagnose and fix those hang bugs when they were first discovered.
“We’re optimistic that this tool will make hang bugs less common – and websites less frustrating for many users,” Gu says. “We are working to integrate HangFix into InsightFinder.” InsightFinder is the AI-based IT operations and analytics startup founded by Gu.
The paper, “HangFix: Automatically Fixing Software Hang Bugs for Production Cloud Systems,” is being presented at the ACM Symposium on Cloud Computing (SoCC’20), which is being held online October 19-21. The paper was co-authored by Jingzhu He, a Ph.D. student at NC State who is nearing graduation; Ting Dai, a Ph.D. graduate of NC State who is now at IBM Research; and Guoliang Jin, an assistant professor of computer science at NC State.
HangFix is the latest in a long line of tools Gu’s team has developed to address cloud computing challenges. Her 2011 paper, “CloudScale: Elastic Resource Scaling for Multi-tenant Cloud Systems,” was selected as the winner of the 2020 SoCC 10-Year Award at this year’s conference.