Wednesday, August 31, 2005

More on File Handles

In response to my "Too many open files in system" problem, I got the following email (posted here with permission, provided it's anonymous):

On our 2.6.9 appliance box we have been suffering the same problem running our application software. I tracked the issue down to the use of ps(1) in a system where /proc is changing quite often. It appears that repeated use of ps can cause the /proc/sys/fs/file-nr count to go up quite rapidly. In our case it was hitting its 200 k limit in just a few days!

The sort of command we were executing often was: count=`ps aux | grep "service" | grep -v grep | grep -v defunct | wc -l`

Things seem to improve when the ps aux output goes to a file rather than a pipe. The problem seems worse the more output ps produces.
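A minimal sketch of that write-to-a-file variant, assuming a modern Python with the subprocess module. The function names and the temporary-file approach are illustrative assumptions, not the emailer's actual code, and nothing here is known to avoid the kernel-side leak. The counting step is kept separate so it can be checked without running ps.

```python
import subprocess
import tempfile


def count_service_lines(ps_output, needle="service"):
    """Count ps lines mentioning `needle`, skipping grep itself and defunct entries."""
    count = 0
    for line in ps_output.splitlines():
        # Mirrors: grep "service" | grep -v grep | grep -v defunct | wc -l
        if needle in line and "grep" not in line and "defunct" not in line:
            count += 1
    return count


def count_running(needle="service"):
    """Run `ps aux` with its stdout sent to a temporary file rather than a pipe."""
    with tempfile.TemporaryFile(mode="w+") as out:
        subprocess.run(["ps", "aux"], stdout=out, check=True)
        out.seek(0)  # rewind before reading back what ps wrote
        return count_service_lines(out.read(), needle)
```

Calling count_running() once per check would replace the shell pipeline; how often it runs is still worth keeping to a minimum.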

Note that the kernel doesn't increment file-nr too often - there is latency for performance reasons - so it tends to jump up in multiples of 25. When experimenting you have to be patient and wait a while to see if running a command repeatedly is causing a leak.
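Since the count only moves in jumps, sampling it over time is easier than eyeballing it. Here is a rough sketch of a watcher, assuming the usual three whitespace-separated fields in /proc/sys/fs/file-nr (allocated handles, allocated-but-unused handles, system maximum). The function names, sample count, and interval are made up for illustration.

```python
import time


def parse_file_nr(text):
    """Split the contents of /proc/sys/fs/file-nr into its three integer fields."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def watch_file_nr(samples=5, interval=60.0, path="/proc/sys/fs/file-nr"):
    """Print the allocated-handle count every `interval` seconds (Linux only)."""
    for _ in range(samples):
        with open(path) as f:
            allocated, _unused, maximum = parse_file_nr(f.read())
        print(f"{time.strftime('%H:%M:%S')} allocated={allocated} max={maximum}")
        time.sleep(interval)
```

Leaving watch_file_nr() running in one terminal while repeating the suspect command in another should show whether the allocated figure keeps climbing.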

The problem is that ps itself doesn't appear to be leaking fds directly - it looks like it's a "virtual" leak in the kernel, perhaps caused by ps's interaction with /proc in a system that has processes being created and killed very often.

We have worked round this issue by reducing our use of ps to a minimum. Now our application software only leaks a few hundred handles a day, rather than thousands an hour!

I have spent many hours on the net and haven't found any kernel or ps patches for this. We will be upgrading to a later kernel at some point - hopefully that will fix the issue.

Thursday, August 25, 2005

Executing Python Files

This is a pesky little thing I just ran across. I was setting up some Python code on my Linux box, pulling it out of CVS on SourceForge, and did the obligatory chmod to make one of my tests runnable. But no matter what I tried, I couldn't run the darn thing.

Turns out, it had DOS-style line endings (i.e. CRLF, instead of the bare LF that Unix/Linux likes). Once I used dos2unix to switch that, it worked fine. Apparently the stray CR at the end of the first-line hash-bang, which says where to find the interpreter, had bash all confused and unable to find Python.
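For the curious, here is a small sketch of how to spot the problem without guessing, working on the raw bytes of the script. The function names are my own for illustration, and the conversion is only a rough stand-in for what dos2unix does.

```python
def shebang_has_cr(data):
    """Return True if the first line is a #! line that ends in a carriage return."""
    first_line = data.split(b"\n", 1)[0]
    return first_line.startswith(b"#!") and first_line.endswith(b"\r")


def to_unix_newlines(data):
    """Convert CRLF line endings to LF, roughly what dos2unix does."""
    return data.replace(b"\r\n", b"\n")
```

Reading the file with open(path, "rb") and passing the bytes to shebang_has_cr() tells you whether the interpreter name has a CR glued to its end.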

Wednesday, August 24, 2005

Denying the Worms

I run Apache on my home box. As a result, I get the regular set of script kiddies coming in to see if they can break in, checking for awstats.pl, phpBB, and the like.

But along with those, I've been getting regular requests for '/' from people on my subnet. It looks like a milder version of the logs we used to get when Code Red was at its peak. The distinguishing characteristic in the logs is that the user agent (browser) is blank.

Well, my root doc is 60k+, and having it hit up to a couple dozen times a day by worms is just a nuisance. So I started poking around in Apache to figure out how to refuse them. It wound up being a tiny bit trickier than I expected; here's what I did:


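<IfModule mod_setenvif.c>
...
# A blank User-Agent shows up as "-" in the logs, but is matched as an empty string
BrowserMatch "^$" agent-deny
...
</IfModule>

...

<Directory /whatever>
<IfModule mod_access.c>
# Allow everyone, then deny any request flagged with agent-deny above
Order Allow,Deny
Allow from all
Deny from env=agent-deny
</IfModule>
</Directory>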


Getting to that point held a pitfall, though. In the logs, the user agent was displayed as "-". So my first shot at a BrowserMatch string was "-", but that matched anything with a dash, i.e. everything. I then tried "^-$" and "^-", but then nothing matched. On Freenode's #apache channel, bare-foot suggested that since the dash is displayed in place of an empty string, perhaps "^$" would work, and it did.
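The four attempts are easy to replay outside Apache. This is only a sketch using Python's re module, which is not the regex engine Apache uses but agrees with it on patterns this simple. The sample browser string is made up for illustration.

```python
import re


def matches(pattern, user_agent):
    """True if `pattern` is found anywhere in `user_agent`, like BrowserMatch."""
    return re.search(pattern, user_agent) is not None


browser = "Mozilla/5.0 (X11; U; Linux i686; en-US)"  # illustrative, contains a dash
blank = ""  # what the worms actually send; the log merely prints it as "-"

# For each pattern tried: (matches a real browser, matches the blank agent)
results = {p: (matches(p, browser), matches(p, blank)) for p in ["-", "^-$", "^-", "^$"]}
for pattern, (hits_browser, hits_blank) in results.items():
    print(f"{pattern!r:6} browser={hits_browser} blank={hits_blank}")
```

Only "^$" flags the blank agent while leaving the real browser alone; "-" catches the browser instead, and the two anchored dash patterns catch neither.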

Prior to getting on IRC, I was trying to figure out how to display environment variables, and did manage to change my LogFormat to include "%{agent-deny}e", which indicated that my variable wasn't being set. I then changed the string to match my browser, and saw that it was being set, narrowing my problem down to the regex.

Anyway, now all the bots get is a 295-byte 403 response, and my world is a little bit nicer.

Monday, August 01, 2005

Java Generics

Well, I got all of us in the office moved up to Eclipse 3.1, and have been fairly happy with it so far: a bit faster to start up, filters on the properties pages, etc. And support for JDK 5.0.

I've been interested in playing with generics since I heard Java was getting them. I had messed around with C++ templates (its take on generics), and liked them there, so once I got Eclipse running, I went to work correcting the type warnings that popped up in all of the Iterators and HashMaps we're using.

So I fixed everything in three of our smaller projects, and noticed that the other two still had 1350+ warnings. Hoo boy, what to do. I mentioned it to Boss, who leaned toward letting our junior guy, Junior, fix them at his leisure. So far, so good.

Then I looked at the code again. It was ugly. < and > everywhere, hideously long names, and after wandering through a few dozen source files, I had found nothing that made me think "hey, glad I put that there, or I wouldn't have found *THAT* bug". So I started thinking about ways to make it cleaner.

I Googled, and found a mention of subclassing, e.g. from HashMap<String,String>, which we use a *LOT*. That worked fine: I called the new class HashMapSS, which is a pretty easy change from HashMap, and the code became a lot less ugly and further changes a lot easier to make. I played with that a bit, then took another look.

I got down to about 1300 warnings, and asked myself: What does it buy us to fix all of these? Initially, I was telling Junior and two other coworkers about generics, and just got blank stares. "Increase coding time and make the code ugly? For what?" With these new classes, we're just down to a minor increase in coding time, but again, for what?

In our case, we're not running into problems that adding these checks would have caught earlier: ClassCastExceptions and the like. As I'm the office code Nazi, I'm the one who would be most eager to code this way, and I'm sitting on the fence. Everyone else has turned off the warnings and continued merrily along.

I'll probably continue to code with the warnings on, and see how it goes, just because, but I'm not as excited about generics as I had expected to be. Perhaps the Python I'm doing in my spare time is rubbing off. Throw a few test cases at it, and all this static typing is more of a burden than an assist.