Results 1 to 5 of 5
  1. #1
    Join Date
    Oct 2007
    Posts
    6

    Unanswered: Problem with querying many-to-many tables; possible subquery issue

    I know the title is probably not especially descriptive, but i'm not exactly sure how to say it. I hope it becomes clear in the course of my post.

    I'm a relative newcomer to MySQL, and i've set up a very small database with a many-to-many relationship that i would like to query. The database is for free software, available on the internet. I have 3 tables right now: a programs table that has details about each program; a main categories table that lists 20 different categories of software; and a table linking the two, to create a many-to-many relationship, so that each program can be in more than one category.

    Here are the table setups:
    Code:
    CREATE TABLE `programs` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(50) NOT NULL,
      `url` varchar(100) NOT NULL,
      `description` text NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT;
    
    
    CREATE TABLE `main_cats` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `category` varchar(50) NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT;
    
    
    CREATE TABLE `programs_cats` (
      `programs_fk` int(11) NOT NULL,
      `category_fk` int(11) NOT NULL,
      PRIMARY KEY (`programs_fk`,`category_fk`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
    If i want to find out, for example, which Programs are in Main Category 1 (audio), i can easily get the result with the following query:
    Code:
    SELECT name, url, description
    FROM programs
    LEFT JOIN programs_cats ON ( programs.id = programs_cats.programs_fk )
    WHERE category_fk =1;
    The problem arises if i want to return results for programs that are in both Category 1 (Audio) and Category 20 (Codecs).

    The following query:
    Code:
    SELECT distinct name, url, description
    FROM programs
    LEFT JOIN programs_cats ON ( programs.id = programs_cats.programs_fk )
    WHERE category_fk=1 or category_fk=20;
    gives me all programs that are either Cat. 1 OR Cat. 20, but i can't work out a way to restrict the query so that i ONLY get results for programs that are BOTH Cat. 1 AND Cat. 20.

    I'm pretty sure i need a subquery of some sort here, or a more complicated join procedure, but i'm pretty new at this, and i really don't know how to go about it. Any help would be most appreciated.

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    your use of LEFT OUTER JOIN is incorrect in these examples, you want to use INNER JOIN instead

    so for the first query,
    Code:
    SELECT programs.name
         , programs.url
         , programs.description
      FROM programs_cats
    INNER
      JOIN programs
        ON programs.id = programs_cats.programs_fk 
     WHERE programs_cats.category_fk = 1
    for the second query, you'll need a subquery to tell you which program_fks qualify...
    Code:
    SELECT programs.name
         , programs.url
         , programs.description
      FROM ( SELECT programs_fk 
               FROM programs_cats
              WHERE category_fk IN ( 1 , 20 )
             GROUP
                 BY programs_fk
             HAVING COUNT(*) = 2
           ) AS bofadem
    INNER
      JOIN programs
        ON programs.id = bofadem.programs_fk
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Oct 2007
    Posts
    6
    Thankyou very much. I tried that, and it worked. But then, you knew it would.

    I think i can see what your second query did, although i'll have to go back to my instruction books and do a bit more reading about subqueries and the IN, AS, and HAVING commands.

    A couple of questions, though, if you don't mind humoring me.

    First, my LEFT OUTER JOIN produced the same result as your INNER JOIN when i ran it on the database. Was that just luck? Is it the case that the OUTER JOIN will work in some situations but fail in others?

    Also, and perhaps more importantly for the overall desgin of my database, i was wondering whether you think my whole setup is the right approach. I was reading the post you made in another thread, where you seem to take a position against the need for an intermediate table, and recommend a two-table-only structure for cases like mine.

    I ask because, in addition to the main_cats table, i'd also like to implement a keywords table that will also be in a many-to-many relationship with the programs table. Most of what i've read before suggests that an intermediate, or linking table with numeric values only is the way to go for many-to-many relationships, but i also found your explanation of the two-table system rather convincing.

    One of the hard things about database design for us newcomers is that there seems to be a fair bit of contradictory advice out there, and it can be hard to tell whose to follow.

    Thanks for your help so far; i appreciate it. And i don't expect you to be my personal tutor; don't answer anything that takes too much of your time and effort.

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    regarding the LEFT OUTER JOIN issue...

    LEFT OUTER JOIN retrieves all rows from the left table, together with matching rows from the right table, if any

    so if there is a row from the left which has no match from the right, then the columns in the result row that would ahve come from the right are all set to NULL

    with me so far?

    okay, then the WHERE clause comes along, and says that the column from the right table has to be equal to a certain value

    well, NULL is not equal to anything, so the unmatched rows all get thrown out

    in effect, a condition on the WHERE clause for a column from the right table effectively turns the query into an inner join, because all the unmatched rows from the LEFT OUTER JOIN are thrown out

    make sense?

    okay, about your design....

    your many-to-many table for categories is fine

    still, it is almost exactly like the situation with a many-to-many table for keywords

    the main points are:

    1. numeric ids are ~not~ required (and may not even be advisable) in order to make a many-to-many relationship work

    2. the keywords or categories table is only really needed if
    2a. you want to control which values get used, or
    2b. you're storing additional data for the keywords or categories

    regarding 2a, you would not try to control values, for example, in a scheme where users can tag entities freely

    regarding 2b, you might want to have a category hierarchy (e.g. see Categories and Subcategories)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    Oct 2007
    Posts
    6
    Thankyou. You explanations certainly helped, although it's clear that i need to do more reading (and re-reading) before i understand all this fully. Thanks also for the link to the page on Categories and Subcategories; it has given me a possible solution for another issue that i had been thinking about.

    I really appreciate your help. Cheers.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •