Status

Check our services status at a glance

Title Kernel memory leak on http8
ID Operation #34
State completed
Beginning date 05/21/2012 3:50 p.m.
End date 09/01/2012 12:37 a.m.
Affected servers
  • http8

Messages

05/21/2012 4:41 p.m.

This server has been leaking kernel memory since the end of April. The leak is located in the NFS code and will eat up all the memory after ~10 days. Even the latest kernel version (3.4), released today, has this issue.

We’ll work with the kernel developers to try to isolate the culprit and have the bug fixed. In the meantime, we’ll monitor this server closely and update this operation when we have more details.
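As a rough illustration of how a figure like "~10 days" can be estimated, here is a minimal sketch that extrapolates linearly from memory usage samples. The function name, the sample format and the numbers in the usage note are assumptions made for this example; they are not the monitoring we actually run.

```python
SECONDS_PER_DAY = 86400


def days_until_exhaustion(samples, total_bytes):
    """Estimate the days left before memory runs out.

    samples: list of (timestamp_in_seconds, used_bytes), oldest first
             (a hypothetical format chosen for this sketch).
    total_bytes: total memory available on the server.

    Returns None when usage is flat or shrinking (no exhaustion expected).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be ordered by increasing timestamp")

    # Average growth rate in bytes per second between first and last sample.
    rate = (used1 - used0) / (t1 - t0)
    if rate <= 0:
        return None

    # Remaining headroom divided by the growth rate, converted to days.
    return (total_bytes - used1) / rate / SECONDS_PER_DAY
```

For example, with made-up values, a server with 10 GiB of memory that leaks 1 GiB over one day would have about 9 days left according to this estimate. A linear fit is only a first approximation: a real leak may grow with load rather than with time.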

05/30/2012 3:04 p.m.

We’re now seeing leaks of xfs_inode. We’ll have to roll back to an older kernel version to see whether the leak stops.
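Growth of a kernel slab cache such as xfs_inode can be observed in /proc/slabinfo on Linux. The following is a minimal sketch of how one might read it, not the tooling we actually use; the function names `parse_slabinfo` and `read_cache` are hypothetical, and the byte count is an approximation that ignores per-slab overhead.

```python
def parse_slabinfo(text):
    """Parse the contents of /proc/slabinfo (format version 2.x).

    Returns a dict mapping cache name to its object counts and an
    approximate size in bytes (num_objs * objsize).
    """
    caches = {}
    for line in text.splitlines():
        # Skip the version line, the column header and blank lines.
        if not line.strip() or line.startswith("slabinfo") or line.startswith("#"):
            continue
        fields = line.split()
        # Columns: name <active_objs> <num_objs> <objsize> ...
        name = fields[0]
        active_objs, num_objs, objsize = (int(value) for value in fields[1:4])
        caches[name] = {
            "active_objs": active_objs,
            "num_objs": num_objs,
            "objsize": objsize,
            "approx_bytes": num_objs * objsize,
        }
    return caches


def read_cache(name="xfs_inode", path="/proc/slabinfo"):
    """Return the stats for one cache, or None if it is not present.

    Reading /proc/slabinfo usually requires root privileges.
    """
    with open(path) as handle:
        return parse_slabinfo(handle.read()).get(name)
```

Calling `read_cache()` at regular intervals and comparing `active_objs` between runs shows whether the cache keeps growing. A cache that grows under load and shrinks again under memory pressure is normal; one that only ever grows is a candidate leak.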

05/30/2012 7:25 p.m.

We’ve made a few changes to the running kernel: it now seems to flush its cache continuously, which slows down the server due to high iowait, but the leak is gone. That’s obviously not a solution, so we’ll try other things.

05/30/2012 9:23 p.m.

We’ve managed to stop the continuous flush, so the server is now operating normally. We’ve tweaked a few parameters again and are now waiting to see whether the leak is under control (xfs_inode is still increasing, but nothing abnormal at this point).

05/31/2012 12:28 a.m.

Everything seems stable and under control.

05/31/2012 12:23 p.m.

We’ve finally found the root cause of the XFS leak (rsync on a particular account with more than 7 million files).

The NFS leak is still being investigated.

08/31/2012 1:28 p.m.

A fix has been released. We’ll deploy a new kernel version on all HTTP servers very soon.

09/01/2012 12:38 a.m.

The http4, http8 and http9 servers have been upgraded.