Server status

Check the status of our services at a glance

ID:
34
Title:
Kernel memory leak on http8
Status:
completed
Started date:
05/21/2012 3:50 p.m.
End date:
09/01/2012 12:37 a.m.
Involved servers:
http8

Upgrades

05/21/2012
16:41

This server has been leaking kernel memory since the end of April. The leak is located in the NFS code and consumes all available memory after ~10 days. Even the latest kernel version (3.4), released today, has this issue.

We’ll work with the kernel developers to try to isolate the culprit and have the bug fixed. In the meantime, we’ll monitor this server closely and update this operation when we have more details.

05/30/2012
15:04

We’re now also seeing leaks of xfs_inode objects. We’ll have to roll back to an older kernel version and see whether the leak stops.

05/30/2012
19:25

We’ve made a few changes to the kernel settings (while it was still running): it now seems to flush its cache continuously, which slows down the server due to high iowait, but the leak is gone. That’s obviously not a solution, so we’ll try other things.

05/30/2012
21:23

We’ve managed to stop the continuous flush, which means the server is now operating normally. We’ve tweaked a few parameters again and are now waiting to see whether the leak is under control (xfs_inode is still increasing, but nothing abnormal at this point).
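For readers who want to watch a slab cache such as xfs_inode in the same way, here is a minimal sketch. It assumes the standard /proc/slabinfo 2.1 column layout (name, active_objs, num_objs, objsize, ...). The function names and the sample figures are hypothetical and are not the tooling or the numbers from http8.

```python
def parse_slabinfo(text):
    """Return {cache_name: (active_objs, num_objs, objsize_bytes)}.

    Assumes the /proc/slabinfo 2.1 layout, where the first four
    whitespace-separated fields are name, active_objs, num_objs, objsize.
    """
    caches = {}
    for line in text.splitlines():
        # Skip blank lines, the version banner and the column header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        active_objs, num_objs, objsize = (int(f) for f in fields[1:4])
        caches[fields[0]] = (active_objs, num_objs, objsize)
    return caches


def cache_growth(before, after, name):
    """Change in allocated bytes for one cache between two snapshots."""
    _, num_before, size_before = before[name]
    _, num_after, size_after = after[name]
    return num_after * size_after - num_before * size_before


# Hypothetical snapshots (made-up numbers, for illustration only).
HEADER = "slabinfo - version: 2.1\n# name <active_objs> <num_objs> <objsize> ...\n"
SAMPLE_BEFORE = HEADER + "xfs_inode 1000 1200 1024 4 1 : tunables 54 27 8 : slabdata 300 300 0\n"
SAMPLE_AFTER = HEADER + "xfs_inode 5000 5200 1024 4 1 : tunables 54 27 8 : slabdata 1300 1300 0\n"

growth = cache_growth(parse_slabinfo(SAMPLE_BEFORE),
                      parse_slabinfo(SAMPLE_AFTER),
                      "xfs_inode")
print(f"xfs_inode grew by {growth} bytes")
```

On a live system the two snapshots would come from reading /proc/slabinfo (usually as root) at two points in time. A cache whose size keeps growing between snapshots under a steady workload is a candidate for a leak.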

05/31/2012
0:28

Everything seems stable and under control.

05/31/2012
12:23

We’ve finally found the root cause of the XFS leak: an rsync run on a particular account with more than 7 million files.

The NFS leak is still being investigated.

08/31/2012
13:28

A fix has been released. We’ll deploy a new kernel version on all HTTP servers very soon.

09/01/2012
0:38

The http4, http8 and http9 servers have been upgraded.