Friday, May 20, 2005

Getting the diff between two Postgres databases

These days, I'm pretty much the de facto DBA at our company. We're using PostgreSQL, which I've also been enjoying.

I keep a diagram (in Dia) with our ERD, and generate the DDL using a utility called tedia2sql.pl that we've modifed a slight bit for our own evil purposes. But from time to time, I've needed to get in quick and alter the database, and wanted a way to make sure that the changes I made (that Boss makes, and who am I to tell him not to) make it back into the ERD.

I went searching for a diff utility, and didn't find one, so I rolled my own. Pretty simple, in fact. What I did was use the generated DDL to create an empty database on my own box, then use pg_dump to get PG to give me the DDL out of that empty one, and out of the development database. From there, diff tells me what I need to know.

I even automated it:


dropdb -h localhost dbfoo
createdb -h localhost dbfoo
psql -h localhost -f generated-erd.sql dbfoo
pg_dump -s -h localhost dbfoo > localhost-schema.sql
pg_dump -s -h devbox dbfoo > devbox-schema.sql
diff devbox-schema.sql localhost-schema.sql > diff.txt


Simple, fairly quick, and keeps me in sync. I can then keep the .dia and .sql files in CVS, and get diffs between versions that way as well. The mnemonic I use in reading diff.txt is that 'c' indicates a change (obviously) between the two schemas, 'a' means add to devbox, and 'd' means add to the ERD.

Enjoy.