Results 1 to 3 of 3
  1. #1
    Join Date
    Jan 2013
    Posts
    354
    Provided Answers: 1

    Unanswered: Picking a Representative Subset

    This is an old problem, but Ia m looking for new solutions for it. The problem is to find a subset of rows such that each value in each of two columns (animals and food brands in this example) appears in at least one row. The purpose is to produce a set of samples from a large table. The table has a animal_name column and an food_brand column; I want a set of samples that contains at least one of each animal_name and at least one of each food_brand, but no more than necessary.

    CREATE TABLE Feeding_Options
    (license_nbr INTEGER NOT NULL PRIMARY KEY,
    animal_name VARCHAR (10) NOT NULL,
    food_brand VARCHAR(15) NOT NULL);

    INSERT INTO Feeding_Options
    VALUES
    (6401715, 'rat', 'IAMS'),
    (1058337, 'rat', 'IAMS'),
    (459443, 'dog', 'Sci Diet'),
    (4018210, 'cat', 'Purina'),
    (2430656, 'cat', 'Purina'),
    (6802081, 'fish', 'Alpo'),
    (4236511, 'fish', 'Alpo'),
    (2162104, 'cat', 'Purina'),
    (2073679, 'fish', 'Alpo'),
    (8148891, 'cat', 'Purina'),
    (1868445, 'cat', 'Purina'),
    (6749213, 'cat', 'Purina'),
    (8363621, 'fox', 'Alpo'),
    (9999, 'fox', 'IAMS');

    To help frame the problem better, her are the values in each column:
    animals = {'rat', 'dog', 'cat', 'fish', 'fox'}
    food = {'IAMS', 'Sci Diet', 'Purina', 'Alpo'}

    In this data, (6401715, 'rat', 'IAMS') and (1058337, 'rat', 'IAMS') are interchangeable, as are some of the other rows. There can be more than one minimal solution whcihmight be the whole set. But I would be happy to simply find a near-minimal solution.

  2. #2
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Code:
    ;with License_List as
    	(select	MIN(license_nbr) as license_nbr
    	from	Feeding_Options
    	group by animal_name
    	union
    	select	MIN(license_nbr) as license_nbr
    	from	Feeding_Options
    	group by food_brand)
    select	Feeding_Options.*
    from	Feeding_Options
    	inner join License_List on Feeding_Options.license_nbr = License_List.license_nbr
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  3. #3
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Used same strategy as blindman, by using another functions.

    Code:
    /*
    ** Not tested on Microsoft SQL Server.
    */
    SELECT license_nbr , animal_name , food_brand
     FROM  (SELECT Feeding_Options.*
                 , MIN(license_nbr)
                      OVER( PARTITION BY animal_name ) AS min_by_an
                 , MIN(license_nbr) 
                      OVER( PARTITION BY food_brand  ) AS min_by_fb
             FROM  Feeding_Options
           ) AS fo
     WHERE license_nbr IN ( min_by_an , min_by_fb )
    ;

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •