Thursday, June 02, 2005

"Too many open files in system"

We got this message on one of our servers recently. I went to our sysadmin, who "fixed" it by raising the limit in /proc/sys/fs/file-max. So far, so good. Then we started looking into why we were running out, since that server had been fine for quite a while.
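(For reference, the current limit can be read straight from that same /proc entry. Here's a minimal Python sketch; the commented-out write shows how raising it works in general, with an illustrative number rather than whatever value we actually set.)

    # read the system-wide limit on open file handles
    with open("/proc/sys/fs/file-max") as f:
        print("file-max:", f.read().strip())

    # raising it (as root) is just writing a new number back;
    # 200000 is an illustrative value, not the one we used
    # with open("/proc/sys/fs/file-max", "w") as f:
    #     f.write("200000\n")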

/proc/sys/fs/file-nr reports the number of file handles the kernel has allocated. We had over 200,000 allocated. /usr/bin/lsof lists open files per process, but showed fewer than 4,000 in use.
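In case it's useful to anyone else, here's roughly how the two counts can be compared. A quick Python sketch, assuming file-nr holds its usual three fields (allocated handles, allocated-but-unused handles, and the max) and that counting lsof's output lines is close enough for a sanity check:

    import subprocess

    # /proc/sys/fs/file-nr: allocated, allocated-but-unused, max
    with open("/proc/sys/fs/file-nr") as f:
        allocated, unused, maximum = (int(x) for x in f.read().split())
    print("kernel: allocated=%d unused=%d max=%d" % (allocated, unused, maximum))

    # rough count from lsof (minus the header line); this over-counts a bit,
    # since descriptors shared across processes show up once per process
    out = subprocess.run(["lsof"], capture_output=True, text=True).stdout
    print("lsof lines:", len(out.splitlines()) - 1)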

I also found an article that explains this file in more detail. It mentioned that there will usually be a difference between file-nr's count and lsof's, but implied that lsof's should be higher, since lsof also shows network connections, pipes, memory maps, and so on.

I started shutting down processes, in hopes that one was holding leaked descriptors, but nothing helped. We finally rebooted, and file-nr's allocated count came back nice and small, less than lsof's count, as expected.

If anyone has an explanation for this, I'd love to hear it.

2 Comments:

Anonymous said...

I have the same problem right now. Has anyone found a solution yet?

7:08 PM  
Dave said...

I'd forgotten about this. Apart from upgrading to newer kernels, I don't think we did anything that "fixed" this, although we haven't had this problem crop up since about a month after the original post.

2:39 PM  
