Monday, July 17, 2006

Postgres Speed, and Table and Index Sizes

I've gotta start paying attention to my indexes in PostgreSQL.

I went to update one field across all rows in a big table I have. I did a test update on a local copy of the db, and it ran fine; I then ran it on our shared db, and it still wasn't finished after 3 hours. And other selects were waiting for it to finish, and so on.

After looking around, I noticed that the difference was that I didn't have any indexes on my local copy. And after reading a bit, I found that updating a row makes a copy of that row, freeing up the old space for more data (although not returning it to the O/S). Same general thing happens with index entries as well; the old entry becomes unused, and a new entry gets written.

As part of the conversation on IRC, I was told about contrib/dbsize, and installed it. I then found that my tables indexes took up 571MB from all the updating. I reindex, and it went down to 160MB. The table data took up 511MB; a vacuum full took that down to 190MB, but the index size went up to 338MB from the index activity. Another reindex, and the table/index sizes are 190MB/160MB, respectively.

I guess the main punchline to all this is to remove/replace the indexes on a file-wide update. I'm not doing this in production, just during table modifications, trying to bring a new column up-to-date, but it's something to be aware of.