Results 1 to 2 of 2
  1. #1
    Join Date
    May 2002
    Posts
    1

    Post Unanswered: Need a strong DB Structure

    I need help!


    Objective: Build a Mysql database structure flexible enough to handle from 10
    to 100 million members, and more if needed.

    Description: Online dating portal with enhanced matchmaking and
    browsing capabilities. The site allows members to create a profile with
    detailed info. Additionally, each member can post their profile in several
    clubs (adult, dating, alternative, etc...)

    Hardware: RAID 5 and RAID 0 DUAL XEON 1.6Ghz 2GIG of RAM servers.
    Structure: One main update database server, 3 replication select database
    servers.

    Situation: Unlike hotmail or other high capacity database sites. A dating site
    should provide an extensive search tool, which would go through all its
    members. The challenge is to create an database that will be
    scalable and fast enough. The problem is how can we search over 100 million
    rows database?

    Database structure: Please find attached in this email the current database
    structure. This structure needs to be altered and scaled to accomodate up to
    100 million members, and more if needed.
    For a live demonstration of the site, please visit http://demo.iwantu.com!

    Details: Our objective is to design a DB structure that will be able to handle
    up to, and over, 10-100 million users. Each user will have his main user
    information stored in "iwantu_user", his main profile information in
    "iwantu_profile" and for each club profile in "iwantu_clubX" where X
    designated which club it is (1=dating, 2=adult etc...)

    The solution we have now consists of splitting each of these tables into
    multiple smaller tables, ie:
    iwantu_user_1
    iwantu_user_2
    iwantu_user_3

    iwantu_profile_1
    iwantu_profile_2
    iwantu_profile_3

    iwantu_club1_1
    iwantu_club1_2
    iwantu_club1_3

    iwantu_club2_1
    iwantu_club2_2
    iwantu_club2_3

    Always keeping the most recent information in table numbered "1". The
    problem arises when we are searching for users whose profiles contain
    specific criteria. How do we go about getting the information from all
    the needed tables when we don't know in which table the user is.

    If we use a centralized table to store in which tables each user has his
    information, we come across problems where we get 100 results from the
    club tables, then from those results we narrow them further with the
    profile tables...we might end up with 2 results, when we need to display
    a certain number of results per page in the search.

    Consulting: Please provide us with an estimate of the number of hours and
    total amount required to come up with a plan.

    Time frame: One-Two weeks from now.
    Attached Files Attached Files

  2. #2
    Join Date
    May 2002
    Posts
    1
    You are using the wrong tool for this job. While it is *possible* to split data vertically as your schema suggests, the subsequent complexity introduced to the application code and the severe performance hit is, IMHO, unacceptable. I think other knowledgeable DBA's or database developers would agree that this situation should be avoided at all costs. A similar approach (chronological or segmented table naming schemes) sometimes makes sense for archival data storage, but never for live data. Your schema also contains a few other relational and normalization pitfalls, but that's another story...

    If you really must support 100 million users you have no choice but to step up to Oracle, Informix, DB2 or a similar heavy-duty RDBMS. Postgres might do the trick - it can theoretically handle your requirements but I have no direct or anecdotal evidence that it can (or cannot) perform acceptably with huge tables.

    Considering your deadline, a realistic compromise might be to abstract your database calls throughout the application so that you can migrate your backend later on (I'm assuming that you don't need to support 100 million users right away) as your user base grows. This would allow you start off with a relationally correct schema under MySQL, would streamiline your application code and would dramatically increase the system's response time. When you begin to run up against MySQL's capacity / performance limitations, you can move to a new database with minimal downtime (maybe a few minutes) if it's planned correctly.

    Unfortunately, I'm booked solid at the moment so I can't really offer my services for this project but I am always interested in talking to potential future clients. I'm a former senior DBA / Developer for a leading national newspaper and now a freelance web application / database / ecommerce developer. I *might* be able to design a proper schema for your system within your time contraints. Email me at ryan@pixeltweaker.com if you're interested.

    btw, you're hardware setup sounds great. I'm not sure if you can fully exploit its advantages with MySQL, but it will definitely yield a dramatic performance gain, to say the least.

    Thanks and good luck!
    Ryan

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •