  1. #1
    Join Date
    Mar 2004
    Posts
    4

    optimal design for large database?

    Hi,

    I need some help designing a database containing information from 70GB of proxy logs.
    To keep it basic, there are mainly 4 fields I'm interested in:

    Userid IP Address Time URL

    I have a perl script to insert the data parsed from the logs. I'm wondering, though: can I just have one table with the 4 column headings above, or do I need a cleverer design than that? As I fill this table up, I'm wary that queries such as SELECT url FROM logs WHERE userid = 'ed' are going to take longer and longer to run. What would be the optimal design for something like this? Also, which database product would be better: MySQL or PostgreSQL?
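
    The single-table design described above can be sketched as follows. This uses SQLite purely for illustration (MySQL and PostgreSQL syntax would be nearly identical); the table name `proxy_log` and the column types are assumptions, not anything specified in the thread.

```python
import sqlite3

# In-memory database for illustration; a real 70GB deployment would use
# MySQL or PostgreSQL with on-disk storage.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# One flat table with the four fields of interest.
cur.execute("""
    CREATE TABLE proxy_log (
        userid  TEXT,
        ip      TEXT,
        logtime TEXT,   -- timestamp stored as ISO-8601 text here
        url     TEXT
    )
""")

# A couple of sample rows, as the perl parser might insert them.
rows = [
    ("ed",    "10.0.0.5", "2004-03-01 09:15:00", "http://www.hotmail.com"),
    ("alice", "10.0.0.7", "2004-03-01 09:16:00", "http://www.abc.org/news"),
]
cur.executemany("INSERT INTO proxy_log VALUES (?, ?, ?, ?)", rows)

# The query from the post, in standard SQL form.
urls = [r[0] for r in cur.execute(
    "SELECT url FROM proxy_log WHERE userid = ?", ("ed",))]
print(urls)  # -> ['http://www.hotmail.com']
```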

    Thanks for any help,

    Ed.

  2. #2
    Join Date
    Dec 2003
    Location
    Canada
    Posts
    710
    Tell more about your 4 column table.

  3. #3
    Join Date
    Mar 2004
    Posts
    4

    optimal design for large database?

    Quote Originally Posted by certus
    Tell more about your 4 column table.
    The URL column? This is just the website, e.g.

    http://www.hotmail.com
    http://www.abc.org/news
    https://www.paypal.com

    I thought about storing the full URL up to the file type, but that might be overkill.
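
    The "just the website" normalization described above can be sketched like this; the helper name `site_only` is an assumption for illustration.

```python
from urllib.parse import urlsplit

def site_only(url):
    """Keep only the scheme and host, dropping the path.

    This is one possible normalization for the URL column; storing the
    full URL instead just means skipping this step at insert time.
    """
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"

print(site_only("http://www.abc.org/news"))   # -> http://www.abc.org
print(site_only("https://www.paypal.com"))    # -> https://www.paypal.com
```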

    Thanks for helping out

    Ed.

  4. #4
    Join Date
    Dec 2003
    Location
    Canada
    Posts
    710
    I don't see any problem with a large single table.

    You will want to use indexes of course.

  5. #5
    Join Date
    Mar 2004
    Posts
    4

    optimal design for large database?

    Quote Originally Posted by certus
    I don't see any problem with a large single table.

    You will want to use indexes of course.
    Sorry if this sounds like a stupid question (I don't have much experience with databases as you probably gathered) but what sort of index should I use for this? Could you give me an example please?

    Thanks,

    Ed.

  6. #6
    Join Date
    Dec 2003
    Location
    Canada
    Posts
    710
    You would create your table and then create your indexes:

    create table tablename (column1 datatype1, column2 datatype2, ...);

    create index indexname on tablename (column1, ...);

    Then you can forget about the indexes and concentrate on population and querying. The indexes will work in the background.

    You create the indexes according to how you query. If you always query on column1 and column2 together, create a single composite index on those two columns. Otherwise, create an index on each column separately; separate indexes ensure that an index is available for whichever column a query filters on.
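
    Applying this to the proxy-log table from the thread, both strategies look like the sketch below. SQLite is used purely for illustration, and the index and table names are assumptions; EXPLAIN QUERY PLAN is SQLite's way of checking that a query actually uses an index (MySQL and PostgreSQL use EXPLAIN).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE proxy_log (userid TEXT, ip TEXT, logtime TEXT, url TEXT)")

# Separate single-column indexes: good when queries filter on one
# column at a time (userid in one query, url in another).
cur.execute("CREATE INDEX idx_userid ON proxy_log (userid)")
cur.execute("CREATE INDEX idx_url    ON proxy_log (url)")

# Composite index: good when queries always filter on userid and
# logtime together.
cur.execute("CREATE INDEX idx_user_time ON proxy_log (userid, logtime)")

cur.execute("INSERT INTO proxy_log VALUES "
            "('ed', '10.0.0.5', '2004-03-01 09:15:00', "
            "'http://www.hotmail.com')")

# Check the plan for the query from post #1: with an index on userid
# in place, the plan reports an index search rather than a full scan.
plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT url FROM proxy_log WHERE userid = 'ed'"
).fetchall()
plan_text = " ".join(str(row) for row in plan)
print(plan_text)
```

    Once the indexes exist, the perl insert script and the SELECT queries need no changes; as the post says, the indexes work in the background.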

    This is standard Data Definition Language in SQL.
    Last edited by certus; 06-06-04 at 15:27.
