

When recurring 10–15 second outages made the database behind #LinkedIn’s user feed briefly unavailable, leaving no useful logs, engineers turned to off-CPU profiling with #eBPF to find the root cause.

The failures were ephemeral, patternless, and had no clear external trigger.

Here is how they solved it ⇨ https://bit.ly/3PS1RtN

#InfoQ #SoftwareArchitecture #Profilers #Monitoring #Database


How LinkedIn Identified a Kernel Lock Contention Issue Causing Recurring System Freezes

When LinkedIn engineers encountered short-lived, recurring outages where the database powering their user feed became unavailable and then recovered without leaving helpful traces, they had to devise a novel approach to uncover the root cause using off-CPU profiling with eBPF.
