Tuesday, April 25, 2006

Keeping track of versions of internal libraries

We've got several applications, each in several versions. We've got several internal libraries, with their own versions. Because it was starting to get a bit harder to keep track of what version of an application requires what version of each library, we now have a solution.

Each library now has a class called Version. This class has one static void method called version_n_n_n() that does nothing, with the 'n's reflecting the version of the library. In the applications, a method early in the call chain (e.g. the constructor of an MVC controller) calls the version method of each library the application depends on.

When the library version gets bumped, we create a new branch in Subversion, and the method gets renamed. Any code that depends on a newer version of the library also gets updated to call the new method.

The big benefit here is that anyone who hasn't pulled down the new library versions, but gets the latest app code, gets immediate and obvious compile errors, and knows to get the updated libraries.
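
Here's a minimal sketch of the idea, not our actual code. The version number (1.2.0) and the AppController class are made up for illustration; only the Version class name and the version_n_n_n() naming pattern come from the description above.

```java
// Library side: ships inside the library jar.
// The version number 1.2.0 is a made-up example.
final class Version {
    private Version() {
        // no instances; the class exists only for its marker method
    }

    // Renamed on every version bump, e.g. to version_1_3_0().
    static void version_1_2_0() {
        // intentionally empty
    }
}

// Application side: a hypothetical class early in the call chain.
class AppController {
    AppController() {
        // Compiles only if the library on the classpath is version 1.2.0.
        // Against any other version, this method name does not exist,
        // and the build fails right here.
        Version.version_1_2_0();
    }
}
```

If the library moved to 1.3.0 and renamed the method, the call in AppController would stop compiling until someone updated it, which is exactly the nudge we want.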

Friday, April 07, 2006

Setting up log4j

I just got told by Boss that we need to make our logging more official. I can buy that. I suggest log4j as a de facto standard. He can buy that, and I'm off and running.

So I wander over to the log4j site, use SuSE's yast to install it, and pull up a source file with a few println() calls, to try it out on. I install the docs and the API javadocs, and pull up the docs front page. The first link is "short manual", which sounds promising.

Here's where the story bogs down, and I slip into rant mode.

I'll start with the punchline. I needed to add to my app's startup code:


BasicConfigurator.configure( new ConsoleAppender( new PatternLayout( "%d{ISO8601} %c%n%p: %m%n" ) ) );


In each source file, I needed to add the following member variable:


private static Logger logger = Logger.getLogger( MyClass.class );


And in place of each println() call, I needed to add the following:


logger.debug( "your message here" );


Actually, the call to logger.debug() could have been a call to logger.info(), logger.warn(), etc., depending on the nature of the message.

How long did it take you to read that? 30 seconds? Given that code, you can now go into the API docs and look up PatternLayout to see what that string means, or look up Logger to see what methods you call for what log levels. And while I still haven't mentioned much about what an Appender is, or how you control what log levels get output (I'm not sure myself yet), you now have working logging, using log4j. Darn close to being as easy as those println() calls you're replacing.

We now return to the rant: How long did it take me to code this? About 20 minutes. The so-called "short manual", in my small Firefox font, takes up more than 22 full-screen pages. I glanced at it, saw bits and pieces of what I needed, poked around the API docs for a while, realized I needed to read short-man a bit more carefully, and gradually figured out what those three simple lines of code would be, to give log4j a whirl.

Why on earth did the so-called "short manual" not start off with three lines of code?! Why did somebody (not necessarily the short manual author, but rather the person who put the link to it at the top of the doc page) assume that I'd want to read 22+ small-fonted pages just to get started with log4j?

There was a moment, let me assure you, when, had I not suggested its use, and had I not been tasked with improving our logging, I would have said "println()'s are good enough", and stopped reading.

The key here is that I was in evaluation mode, having not yet bought into its use. 3 lines of code stuck in my app did convince me, once I divined what they were; handing them to me early would have been greatly appreciated, and sold me much quicker. One would assume others might glance around, see 22+ pages, and say without further ado, "println()'s are good enough", which would be a shame.

Later, I'm pretty sure I will want to know more about log4j, and the short manual will be quite useful. But at this moment, I just wanted to see what it'll do, and what I found was a bad way to show me that.

Okay, I've calmed down a bit.

Wednesday, April 05, 2006

Changing Gentoo's CHOST from i386 to i686

I'm running Gentoo, kept pretty much up-to-date. I wanted to run distcc, to let my faster box help my slower box out on long compiles. This led me to want to change my CHOST variable on the slower box from i386-pc-linux-gnu to i686-pc-linux-gnu, to match my faster box. I got the impression at some point that I could have installed gcc as a cross-compiler, but changing CHOST sounded "easier" (or would at least let me end up with a less complicated config).

So I searched the forums, and found people talking about how to make such a change. Opinions fell into three camps: a) DON'T DO IT!, b) you can, but it's hard and not for the faint of heart, and c) I did it, no sweat. Having finished the process, apparently successfully, I'm writing up what it took. If you don't want to read the whole (quite long) thing, the summary is that I think I fall into (b). It wasn't that hard, but it was time consuming, and took a bit of troubleshooting. My full experience follows:

To start off with, I did what I would heartily suggest to you: BACK UP YOUR WORLD! Assume that your machine will end up fragged, and act accordingly. Actually, it probably won't mess with your data, but then you should have a backup plan already in place anyway. (There are two kinds of people: those who back up, and those who have never lost a hard drive.) There. I said it.

Before you start, there is a good thread to read on the forums. In particular, the three-post series in the middle of the page by amne, danielrendell and Bob P gives a pretty clear idea of what it should take, and it shaped what I decided to do. To spell it out, my plan was to combine their suggestions, and do this:


change CHOST
emerge glibc binutils gcc
emerge glibc binutils gcc
emerge -e system
emerge -e system
emerge -e world
emerge -e world


So off I went, to /etc/make.conf, and changed CHOST from i386 to i686. I kicked off my first emerge of the build tools. This ran fine. I kicked off my second tool emerge, and got an error running emerge, saying that Python couldn't find libstdc++.so.6. I "fixed" this by creating a symlink:

ln -s /usr/lib/gcc/i686-pc-linux-gnu/3.4.5/libstdc++.so.6 /usr/lib

Rerunning the 2nd tool emerge gave me a configure error on the glibc emerge, which referred me to config.log, which I found in /var/tmp/portage/glibc-2.3.5-r2/work/build-default-i686-pc-linux-gnu-linuxthreads. It said 'gcc-config error: Could not run/locate "gcc"'. So off I go, into debug mode.

The first thing I found was that PATH still had /usr/i386/gcc-bin/3.4.5 in it. (Keep in mind that when I say i386 or i686 from here on out, I probably mean i386-pc-linux-gnu or its i686 version). Looking in /etc, found i386 in profile.env, which tells you that it is generated by env-update. I ran env-update, but i386 was still there.

I grepped, and found i386 in several files in /etc/env.d/gcc, which led me to search for some way to reconfigure things, which led me to run:


gcc-config -P i686-pc-linux-gnu-3.4.5
source /etc/profile


This fixed PATH quite nicely, and the 2nd tools emerge worked.

On to the first system emerge. Started fine, but stopped at groff, with a message of:

'gcc-config error: Could not run/locate "i386-pc-linux-gnu-gcc"'

I went through a couple of cycles of trial-and-error: I removed /var/tmp/portage/groff* and reran; still had error. I checked /etc, no references to i386 anywhere.

I checked my environment, and found a couple of i386 mentions, in BASH_VERSINFO, HOSTTYPE and MACHTYPE. A bit of digging (and Googling) turned up the fact that MACHTYPE is a bash reserved variable. Did a solo emerge on bash, which fixed MACHTYPE, but a retry on groff still failed, even after removing /var/tmp/portage/groff* again.

I dug deeper, and found the Makefile in /var/tmp/portage/groff* with the i386 in it. I saw that imake was creating this, and used equery to find out where imake was coming from (assuming it was part of make, and that rebuilding make might fix things). No dice, 'equery belongs imake' reported x11-base/xorg-x11.

So now I'm confused. I also did an equery on groff, and found that it depends on X as well. Ah, but doing an 'equery uses groff' says that groff has an X use flag. Now we're getting somewhere. I created /etc/portage/package.use, and added this line to it: 'sys-apps/groff -X'. This tells emerge to remove the 'X' use flag just for groff. Ran emerge just on groff, and this time it worked.

Reran 1st system emerge, and it worked. Ran the 2nd system emerge, and it worked, too.

On to the first world emerge. Started fine, but gave me an md5 error on one of the docbook zip files. Removed that zipfile and emerged just docbook, but noticed that it was emerging a different version. Ah, the problem was on an old slot. Unmerged the entire old version, and started the 1st world emerge again.

This time, it stopped again, but past the docbook problem of the last try. After a good bit of poking around, I couldn't see any messages indicating why. I had been writing messages out to one output file, and had seen some strangeness with the order that some messages would come out in, so I decided to split up stdout and stderr. Did so, restarted 1st world emerge, and it worked just fine.

Given that the 1st one worked, I decided to set up distcc for the 2nd. Started 2nd world emerge, which worked fine until I got down to xephem, which gave an odd ld message: 'cannot find -llilxml'. I couldn't track it down, so I just tried emerging xephem by itself. That worked fine. Reran 2nd world emerge, which worked fine.

Took out the line in /etc/portage/package.use for groff (by removing the whole file, since that was the only line in it on my box), and re-emerged groff, successfully this time.

Whew! The world emerges took quite some time (days, on my 400 MHz slow-box), but I ended up with a fully rebuilt system, and one running distcc, with an i686 CHOST.

All in all, it wasn't too much work, and I got much more familiar with slots, USE flags and equery, all of which are way cool and quite helpful to know when maintaining a Gentoo box. If I had to do it again, I'd be willing to. But be aware that it took up a good bit of my spare time for a week or two, that I've been running Gentoo for a few years, and that I am a developer myself, which makes poking through these problems less intimidating. YMMV.

Enjoy.