Wed Nov 2 16:18:09 EDT 2011

Table size

Reading 1/16 of the data set, I get the following numbers (total rows,
then distinct values per field):

rows:   219121

ip:       5901
date:   193871
req:     69166
status:     14
ref:      2228
client:   2466
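
For reference, a minimal sketch of how counts like these can be gathered.
It assumes each log line has already been parsed into a dict keyed by the
field names above; the parsing step and the function name count_distinct
are my own placeholders, not something from the actual script.

```python
from collections import defaultdict

# Field names as used in the table above.
FIELDS = ("ip", "date", "req", "status", "ref", "client")


def count_distinct(records):
    """Return (row_count, {field: number of distinct values seen})."""
    seen = defaultdict(set)  # field name -> set of values seen so far
    rows = 0
    for rec in records:
        rows += 1
        for field in FIELDS:
            seen[field].add(rec[field])
    return rows, {field: len(seen[field]) for field in FIELDS}
```

This keeps every distinct value in memory, which should be fine at this
size but may not be for the full data set.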

So apart from date, it makes sense to collect the values of the fields
in separate tables, as there are indeed quite a few duplicates.
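
A rough sketch of what "separate tables" could look like in memory: each
distinct value gets a small integer ID the first time it is seen, and the
main rows only store those IDs. The names LookupTable and normalize are
made up for illustration, and records are again assumed to be dicts keyed
by field name.

```python
class LookupTable:
    """Maps each distinct value to a small integer ID, assigned on first use."""

    def __init__(self):
        self.ids = {}

    def intern(self, value):
        # IDs start at 1 and grow by one for every new value.
        if value not in self.ids:
            self.ids[value] = len(self.ids) + 1
        return self.ids[value]


def normalize(records, fields=("ip", "req", "status", "ref", "client")):
    """Replace the given fields by IDs; date stays inline as it rarely repeats."""
    tables = {field: LookupTable() for field in fields}
    rows = []
    for rec in records:
        row = {"date": rec["date"]}
        for field in fields:
            row[field + "_id"] = tables[field].intern(rec[field])
        rows.append(row)
    return tables, rows
```

With 14 distinct status values against 219121 rows, the status table stays
tiny while every row shrinks to an integer reference.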

I wonder why req is so high, though.

I also wonder if it wouldn't be easier to just pipe this straight to MySQL
as SQL statements and be done with it.  Let it do the ID generation[1].

[1] http://dev.mysql.com/doc/refman/5.0/en/example-auto-increment.html
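
A sketch of the pipe-to-MySQL idea, generating SQL text that could be fed
to the mysql client. Everything about the schema here is an assumption:
the table name hits, the column types and lengths, and the approach of
INSERT IGNORE on a UNIQUE column followed by a subselect for the ID. The
quoting helper is deliberately simplistic and no substitute for a real
driver's escaping.

```python
# Fields that get their own lookup table; date stays in the main table.
LOOKUP_FIELDS = ("ip", "req", "status", "ref", "client")


def sql_quote(value):
    """Quote a string as a MySQL literal (handles backslash and single quote only)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def schema_sql():
    """CREATE TABLE statements: one lookup table per field, plus the main table."""
    stmts = []
    for field in LOOKUP_FIELDS:
        # AUTO_INCREMENT lets MySQL hand out the IDs; UNIQUE rejects duplicates.
        stmts.append(
            "CREATE TABLE `%s` (id INT NOT NULL AUTO_INCREMENT, "
            "value VARCHAR(255) NOT NULL, PRIMARY KEY (id), UNIQUE (value));" % field
        )
    cols = ", ".join("`%s_id` INT NOT NULL" % field for field in LOOKUP_FIELDS)
    stmts.append("CREATE TABLE `hits` (`date` VARCHAR(64) NOT NULL, %s);" % cols)
    return stmts


def record_sql(rec):
    """SQL statements that store one parsed log record (a dict keyed by field)."""
    # Add each value to its lookup table; duplicates are silently skipped.
    stmts = [
        "INSERT IGNORE INTO `%s` (value) VALUES (%s);" % (field, sql_quote(rec[field]))
        for field in LOOKUP_FIELDS
    ]
    # Look the IDs back up by value when inserting the main row.
    selects = ", ".join(
        "(SELECT id FROM `%s` WHERE value = %s)" % (field, sql_quote(rec[field]))
        for field in LOOKUP_FIELDS
    )
    col_names = ", ".join("`%s_id`" % field for field in LOOKUP_FIELDS)
    stmts.append(
        "INSERT INTO `hits` (`date`, %s) VALUES (%s, %s);"
        % (col_names, sql_quote(rec["date"]), selects)
    )
    return stmts
```

Printing the statements from schema_sql() once and from record_sql() per
record gives a stream that can be piped to the mysql client. Two caveats
with this guess at a schema: VARCHAR(255) would not hold longer requests,
and with a case-insensitive collation, values differing only in case
would end up sharing one ID.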