PDA

View Full Version : is this the right data model?


peterbaker
08-28-02, 07:50
i am designing a database to store various types of logfile lines, which record proxy-server activity on firewall machines.

Because the lines are not related, it is hard to come up with a model that groups them into categories. In addition to this, it seems more efficient to create tables for fields which contain long strings, like the path to a resource on a domain, and use integer IDs to index them, then post these as foreign keys in a logfile table.

I am designing this db for Oracle8i

here is my data model
http://www.geocities.com/pjam001/index.html

r937
08-28-02, 20:46
ERwin, right?

your data model looks fine, except that you don't have to split things out unless it will represent a real savings somewhere

for example, take username

if username is some humungous 64-character string, and let's say there's only a few different ones, and let's say there's hundreds of thousands of line records, then by substituting userid as a FK to a separate user table, you've saved a boatload of space in the line table

otherwise, i don't see the point

i mean, if you were after a specific user's lines, you could just as easily say

select *
from line
where username='LKJASTOUHERVUHEOIRHGAOIDSHVOAHERG'

as

select line.*
from line, user
where user.userid = line.userid
and username='LKJASTOUHERVUHEOIRHGAOIDSHVOAHERG'

um, what was the question?

:-)

rudy
http://rudy.ca/

philshort
09-01-02, 10:36
The design looks fine, and I am presuming you need to optimise the design for speed of data retreival.

Coming from a multivalue background I would implement the design as is, with each of the index files keyed on id and containing a multivalued list of the lineid's. My personal preference would be to assign id's using a seuqntially assign base 64 key value - 6 digits will give you 68 billion separate ids, which should be enough! I'd add in inversions of these IDs for ease of lookup, i.e. add a user_id keyed on username which holds userID as the only data value - one-to-one reference table.

I am no Oracle expert, but my understanding is that Oracle have abandoned any pretence at being true relational database and implemented multivalued fields in 8i. Maybe you could use those to store the index records.

In multivalue, you could then do queries like:

BSELECT mailfrom_id "JOE BLOGGS" emailID
BSELECT mailfrom emailID lineID
LIST line time day month path file bytes ....

where the first command does one database read to get the relevant emailID for JOE BLOGGS, the second reads the index record for that id and returns a list of line ids which are passed to the third command, which in this example just lists the data values for the ids returned. Lets say JOE BLOGGS has 30 line record out of a database of 1.5 million lines. How many database reads would you perform in this model? One for the id lookup, one for the index read, then 30 dtail lines. Total 32.