Results 1 to 6 of 6
  1. #1
    Join Date
    Jan 2013
    Posts
    354
    Provided Answers: 1

    Exclamation Unanswered: Celko: Best fit query or T-Join

    Dr. Codd's T-Join

    In the Second Version of the relational model in 1990, Dr. E. F. Codd introduced a set of new theta operators, called T-operators, which were based on the idea of a best-fit or approximate equality (Codd 1990). The algorithm for the operators is easier to understand with an example modified from Dr. Codd.

    The problem is to assign the classes to the available classrooms. We want (class_size < or <= room_size) to be true after the assignments are made. The first < will allow us a few empty seats in each room for late students; the <= can have exact matches.

    Th naive approaches Codd gave were: (1) sort the tables in ascending order by classroom size and then matched the number of students in a class. (2) sort the tables in descending order by classroom size and then matched the number of students in a class.

    We start with the following tables:

    CREATE TABLE Rooms
    (room_nbr CHAR(2) NOT NULL PRIMARY KEY,
    room_size INTEGER NOT NULL
    CHECK (room_size > 0));

    INSERT INTO Rooms
    VALUES
    ('r1', 70),
    ('r2', 40),
    ('r3', 50),
    ('r4', 85),
    ('r5', 30),
    ('r6', 65),
    ('r7', 55);

    CREATE TABLE Classes
    (class_nbr CHAR(2) NOT NULL PRIMARY KEY,
    class_size INTEGER NOT NULL
    CHECK (class_size > 0));

    INSERT INTO Classes (class_nbr, class_size)
    VALUES
    ('c1', 80),
    ('c2', 70),
    ('c3', 65),
    ('c4', 55),
    ('c5', 50),
    ('c6', 40);

    The “best fit” problem is to assign classes to rooms with the fewest empty chairs in the room and nobody standing. The starting point is to find all the rooms that can hold a particular class.

    SELECT room_nbr, class_nbr, (room_size - class_size) AS fit
    FROM Classes, Rooms
    WHERE class_size <= room_size;

    Now, let's try to find the best fits:

    WITH RC --- all legal pairs
    AS
    (SELECT *, (room_size - class_size) AS fit
    FROM Classes, Rooms
    WHERE class_size <= room_size)

    SELECT * -- best fit
    FROM RC
    WHERE fit = (SELECT MIN(fit)
    FROM RC AS RCa
    WHERE RC.class_nbr = RCa.class_nbr);

    Results
    ==============
    c1 80 r4 85 5
    c2 70 r1 70 0
    c3 65 r6 65 0
    c4 55 r7 55 0
    c5 50 r3 50 0
    c6 40 r2 40 0

    This is very good, but it depends on the data! Slight change in the legal pairs CTE or the data, to allow for spare seating, will invalidate the results.

    WITH RC --- all legal pairs with spare room
    AS
    (SELECT *, (room_size - class_size) AS fit
    FROM Classes, Rooms
    WHERE class_size < room_size)

    SELECT * -- best fit
    FROM RC
    WHERE fit = (SELECT MIN(fit)
    FROM RC AS RCa
    WHERE RC.class_nbr = RCa.class_nbr);

    Results
    =================
    c1 80 r4 85 5 ◄ Opps!
    c2 70 r4 85 15 ◄ Opps! Duplicate room number!
    c3 65 r1 70 5
    c4 55 r6 65 10
    c5 50 r7 55 5
    c6 40 r3 50 10

    Anyone want to try for general solution?

  2. #2
    Join Date
    Mar 2003
    Posts
    280
    Quote Originally Posted by Celko View Post
    Dr. Codd's T-Join

    In the Second Version of the relational model in 1990, Dr. E. F. Codd introduced a set of new theta operators, called T-operators, which were based on the idea of a best-fit or approximate equality (Codd 1990). The algorithm for the operators is easier to understand with an example modified from Dr. Codd.

    The problem is to assign the classes to the available classrooms. We want (class_size < or <= room_size) to be true after the assignments are made. The first < will allow us a few empty seats in each room for late students; the <= can have exact matches.

    Th naive approaches Codd gave were: (1) sort the tables in ascending order by classroom size and then matched the number of students in a class. (2) sort the tables in descending order by classroom size and then matched the number of students in a class.

    We start with the following tables:

    CREATE TABLE Rooms
    (room_nbr CHAR(2) NOT NULL PRIMARY KEY,
    room_size INTEGER NOT NULL
    CHECK (room_size > 0));

    INSERT INTO Rooms
    VALUES
    ('r1', 70),
    ('r2', 40),
    ('r3', 50),
    ('r4', 85),
    ('r5', 30),
    ('r6', 65),
    ('r7', 55);

    CREATE TABLE Classes
    (class_nbr CHAR(2) NOT NULL PRIMARY KEY,
    class_size INTEGER NOT NULL
    CHECK (class_size > 0));

    INSERT INTO Classes (class_nbr, class_size)
    VALUES
    ('c1', 80),
    ('c2', 70),
    ('c3', 65),
    ('c4', 55),
    ('c5', 50),
    ('c6', 40);

    The “best fit” problem is to assign classes to rooms with the fewest empty chairs in the room and nobody standing. The starting point is to find all the rooms that can hold a particular class.

    SELECT room_nbr, class_nbr, (room_size - class_size) AS fit
    FROM Classes, Rooms
    WHERE class_size <= room_size;

    Now, let's try to find the best fits:

    WITH RC --- all legal pairs
    AS
    (SELECT *, (room_size - class_size) AS fit
    FROM Classes, Rooms
    WHERE class_size <= room_size)

    SELECT * -- best fit
    FROM RC
    WHERE fit = (SELECT MIN(fit)
    FROM RC AS RCa
    WHERE RC.class_nbr = RCa.class_nbr);

    Results
    ==============
    c1 80 r4 85 5
    c2 70 r1 70 0
    c3 65 r6 65 0
    c4 55 r7 55 0
    c5 50 r3 50 0
    c6 40 r2 40 0

    This is very good, but it depends on the data! Slight change in the legal pairs CTE or the data, to allow for spare seating, will invalidate the results.

    WITH RC --- all legal pairs with spare room
    AS
    (SELECT *, (room_size - class_size) AS fit
    FROM Classes, Rooms
    WHERE class_size < room_size)

    SELECT * -- best fit
    FROM RC
    WHERE fit = (SELECT MIN(fit)
    FROM RC AS RCa
    WHERE RC.class_nbr = RCa.class_nbr);

    Results
    =================
    c1 80 r4 85 5 ◄ Opps!
    c2 70 r4 85 15 ◄ Opps! Duplicate room number!
    c3 65 r1 70 5
    c4 55 r6 65 10
    c5 50 r7 55 5
    c6 40 r3 50 10

    Anyone want to try for general solution?
    For the trivial case (distinct room and class sizes) I believe it is sufficient to discalify rows where there exists a better fit:

    Code:
    SELECT c.*, r.*, (room_size - class_size) AS fit
    FROM Classes c, Rooms r 
    WHERE class_size <= room_size 
    and not exists ( 
        select 1 from classes c2 
        where c2.class_nbr <> c.class_nbr 
           and c2.class_size between c.class_size and r.room_size 
    )
    If classes can have equal and size, dito for rooms we need to pick one over another as a best fit. Seems reasonable to enumerate each set with:

    Code:
    row_number () over (partition over size order by <p.k.>)
    and use that in a predicate.
    Last edited by lelle12; 01-21-13 at 07:38. Reason: wrong order of class_size, room_size in exists clause
    --
    Lennart

  3. #3
    Join Date
    Jan 2013
    Posts
    354
    Provided Answers: 1
    I tried ordering the rooms and the classes by size with ROW_NUMBER(), then match the two lists in sequence at the first legal matchingposition. It does not give an optimal solution. Try this data set:

    To see how this will work, let's add another column to the table of legal (class_nbr, room_nbr) pairs to see how well they fit.

    WITH Pairs (room_nbr, class_nbr, fit)
    AS
    (SELECT R.room_nbr, C.class_nbr,
    (R.room_size - C.class_size)
    FROM Rooms AS R, Classes AS C
    WHERE C.class_size <= R.room_size)

    SELECT P1.room_nbr, P1.class_nbr, fit
    FROM Pairs AS P1
    WHERE P1.fit
    = (SELECT MIN(P2.fit)
    FROM Pairs AS P2
    WHERE P1.room_nbr = P2.room_nbr);

    r01 c04 2
    r02 c04 1
    r03 c06 5
    r04 c06 4
    r05 c09 2
    r06 c12 5
    r07 c13 5
    r08 c13 4
    r09 c15 5
    r10 c15 4
    r11 c18 5

    This time we did not have an exact fit, so let's look for (fit = 1) into a working table and remove 'r02' and 'c04' from their tables. The second step is

    r02 c04 1

    We can now remove the best fit rooms and classes, and look the next set of remaining best fits.

    WITH Pairs (room_nbr, class_nbr, fit)
    AS
    (SELECT R.room_nbr, C.class_nbr,
    (R.room_size - C.class_size)
    FROM Rooms AS R, Classes AS C
    WHERE C.class_size <= R.room_size)

    (SELECT P1.room_nbr, P1.class_nbr, fit
    FROM Pairs AS P1
    WHERE P1.fit = 2);

    r05 c09 2

    Here is the step for a fit of 3, 4, 5 and 6

    r01 c05 3

    r04 06 4
    r08 c13 4
    r10 c15 4

    r06 c12 5
    r11 c18 5

    r03 c07 6
    r07 c14 6
    r09 c16 6

    At this point, the original tables of rooms and classes cannot be paired. Notice that as we removed rooms and classes, the fit numbers changed. This is a characteristic of greedy algorithms; taking the “big bites” can leave sub-optimal leftovers on the plate.

    If you had put the two sorted lists together and matched them to the first fit, then you get ('r01', 'c04') which has a fit of 2. But when we match ('r02', 'c04') we have a fit of 1.

  4. #4
    Join Date
    Mar 2003
    Posts
    280
    Quote Originally Posted by Celko View Post
    I tried ordering the rooms and the classes by size with ROW_NUMBER(), then match the two lists in sequence at the first legal matchingposition. It does not give an optimal solution. Try this data set:
    I don't see a set here so I assume you missed pasting it. However reading the rest of your post I get the feeling that you would like to find an optimal solution for packing classes into rooms. I might have misunderstood, but AFAIK know what you describe is the knapsack problem ( Knapsack problem - Wikipedia, the free encyclopedia ). I wish I had a polynomial time solution for that, but I don't ;-) Perhaps there is a constraint I miss that makes the problem easier than KSP, is there?
    --
    Lennart

  5. #5
    Join Date
    Jan 2013
    Posts
    354
    Provided Answers: 1
    INSERT INTO Classes (class_nbr, class_size)
    VALUES
    ('c01', 106), ('c02', 105), ('c03', 104), ('c04', 100), ('c05', 99),
    ('c06', 90), ('c07', 89), ('c08', 88), ('c09', 83), ('c10', 82),
    ('c11', 81), ('c12', 65), ('c13', 50), ('c14', 49), ('c15', 30),
    ('c16', 29), ('c17', 28), ('c18', 20), ('c19', 19);

    INSERT INTO Rooms (room_nbr, room_size)
    VALUES
    ('r01', 102), ('r02', 101), ('r03', 95), ('r04', 94), ('r05', 85),
    ('r06', 70), ('r07', 55), ('r08', 54), ('r09', 35), ('r10', 34),
    ('r11', 25), ('r12', 18);

    To see how this will work, let's add another column to the table of legal (class_nbr, room_nbr) pairs to see how well they fit.

    WITH Pairs (room_nbr, class_nbr, fit)
    AS
    (SELECT R.room_nbr, C.class_nbr,
    (R.room_size - C.class_size)
    FROM Rooms AS R, Classes AS C
    WHERE C.class_size <= R.room_size)

    SELECT P1.room_nbr, P1.class_nbr, fit
    FROM Pairs AS P1
    WHERE P1.fit
    = (SELECT MIN(P2.fit)
    FROM Pairs AS P2
    WHERE P1.room_nbr = P2.room_nbr);

    r01 c04 2
    r02 c04 1
    r03 c06 5
    r04 c06 4
    r05 c09 2
    r06 c12 5
    r07 c13 5
    r08 c13 4
    r09 c15 5
    r10 c15 4
    r11 c18 5

    This time we did not have an exact fit, so let's look for (fit = 1) into a working table and remove 'r02' and 'c04' from their tables. The second step is

    r02 c04 1

    We can now remove the best fit rooms and classes, and look the next set of remaining best fits.

    WITH Pairs (room_nbr, class_nbr, fit)
    AS
    (SELECT R.room_nbr, C.class_nbr,
    (R.room_size - C.class_size)
    FROM Rooms AS R, Classes AS C
    WHERE C.class_size <= R.room_size)

    (SELECT P1.room_nbr, P1.class_nbr, fit
    FROM Pairs AS P1
    WHERE P1.fit = 2);

    r05 c09 2

    Here is the step for a fit of 3, 4, 5 and 6

    r01 c05 3

    r04 c06 4
    r08 c13 4
    r10 c15 4

    r06 c12 5
    r11 c18 5

    r03 c07 6
    r07 c14 6
    r09 c16 6

    At this point, the original tables of rooms and classes cannot be paired. Notice that as we removed rooms and classes, the fit numbers changed. This is a characteristic of greedy algorithms; taking the “big bites” can leave sub-optimal leftovers on the plate.

  6. #6
    Join Date
    Mar 2003
    Posts
    280
    Quote Originally Posted by Celko View Post
    INSERT INTO Classes (class_nbr, class_size)
    VALUES
    ('c01', 106), ('c02', 105), ('c03', 104), ('c04', 100), ('c05', 99),
    ('c06', 90), ('c07', 89), ('c08', 88), ('c09', 83), ('c10', 82),
    ('c11', 81), ('c12', 65), ('c13', 50), ('c14', 49), ('c15', 30),
    ('c16', 29), ('c17', 28), ('c18', 20), ('c19', 19);

    INSERT INTO Rooms (room_nbr, room_size)
    VALUES
    ('r01', 102), ('r02', 101), ('r03', 95), ('r04', 94), ('r05', 85),
    ('r06', 70), ('r07', 55), ('r08', 54), ('r09', 35), ('r10', 34),
    ('r11', 25), ('r12', 18);

    To see how this will work, let's add another column to the table of legal (class_nbr, room_nbr) pairs to see how well they fit.

    WITH Pairs (room_nbr, class_nbr, fit)
    AS
    (SELECT R.room_nbr, C.class_nbr,
    (R.room_size - C.class_size)
    FROM Rooms AS R, Classes AS C
    WHERE C.class_size <= R.room_size)

    SELECT P1.room_nbr, P1.class_nbr, fit
    FROM Pairs AS P1
    WHERE P1.fit
    = (SELECT MIN(P2.fit)
    FROM Pairs AS P2
    WHERE P1.room_nbr = P2.room_nbr);

    r01 c04 2
    r02 c04 1
    r03 c06 5
    r04 c06 4
    r05 c09 2
    r06 c12 5
    r07 c13 5
    r08 c13 4
    r09 c15 5
    r10 c15 4
    r11 c18 5

    This time we did not have an exact fit, so let's look for (fit = 1) into a working table and remove 'r02' and 'c04' from their tables. The second step is

    r02 c04 1

    We can now remove the best fit rooms and classes, and look the next set of remaining best fits.

    WITH Pairs (room_nbr, class_nbr, fit)
    AS
    (SELECT R.room_nbr, C.class_nbr,
    (R.room_size - C.class_size)
    FROM Rooms AS R, Classes AS C
    WHERE C.class_size <= R.room_size)

    (SELECT P1.room_nbr, P1.class_nbr, fit
    FROM Pairs AS P1
    WHERE P1.fit = 2);

    r05 c09 2

    Here is the step for a fit of 3, 4, 5 and 6

    r01 c05 3

    r04 c06 4
    r08 c13 4
    r10 c15 4

    r06 c12 5
    r11 c18 5

    r03 c07 6
    r07 c14 6
    r09 c16 6

    At this point, the original tables of rooms and classes cannot be paired. Notice that as we removed rooms and classes, the fit numbers changed. This is a characteristic of greedy algorithms; taking the “big bites” can leave sub-optimal leftovers on the plate.
    Ok, I can see where you are heading. Given this algorithm we would need a cursor, or we could probably use a module from 9.7 to create a recursive function. I don't think it will be possible to use a CTE to solve it. I'll do some more thinking and see if I can come up with some additional solution
    --
    Lennart

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •