Results 1 to 6 of 6
Thread: Celko: Best fit query or TJoin

011913, 20:30 #1Registered User
 Join Date
 Jan 2013
 Posts
 359
Provided Answers: 1Unanswered: Celko: Best fit query or TJoin
Dr. Codd's TJoin
In the Second Version of the relational model in 1990, Dr. E. F. Codd introduced a set of new theta operators, called Toperators, which were based on the idea of a bestfit or approximate equality (Codd 1990). The algorithm for the operators is easier to understand with an example modified from Dr. Codd.
The problem is to assign the classes to the available classrooms. We want (class_size < or <= room_size) to be true after the assignments are made. The first < will allow us a few empty seats in each room for late students; the <= can have exact matches.
Th naive approaches Codd gave were: (1) sort the tables in ascending order by classroom size and then matched the number of students in a class. (2) sort the tables in descending order by classroom size and then matched the number of students in a class.
We start with the following tables:
CREATE TABLE Rooms
(room_nbr CHAR(2) NOT NULL PRIMARY KEY,
room_size INTEGER NOT NULL
CHECK (room_size > 0));
INSERT INTO Rooms
VALUES
('r1', 70),
('r2', 40),
('r3', 50),
('r4', 85),
('r5', 30),
('r6', 65),
('r7', 55);
CREATE TABLE Classes
(class_nbr CHAR(2) NOT NULL PRIMARY KEY,
class_size INTEGER NOT NULL
CHECK (class_size > 0));
INSERT INTO Classes (class_nbr, class_size)
VALUES
('c1', 80),
('c2', 70),
('c3', 65),
('c4', 55),
('c5', 50),
('c6', 40);
The “best fit” problem is to assign classes to rooms with the fewest empty chairs in the room and nobody standing. The starting point is to find all the rooms that can hold a particular class.
SELECT room_nbr, class_nbr, (room_size  class_size) AS fit
FROM Classes, Rooms
WHERE class_size <= room_size;
Now, let's try to find the best fits:
WITH RC  all legal pairs
AS
(SELECT *, (room_size  class_size) AS fit
FROM Classes, Rooms
WHERE class_size <= room_size)
SELECT *  best fit
FROM RC
WHERE fit = (SELECT MIN(fit)
FROM RC AS RCa
WHERE RC.class_nbr = RCa.class_nbr);
Results
==============
c1 80 r4 85 5
c2 70 r1 70 0
c3 65 r6 65 0
c4 55 r7 55 0
c5 50 r3 50 0
c6 40 r2 40 0
This is very good, but it depends on the data! Slight change in the legal pairs CTE or the data, to allow for spare seating, will invalidate the results.
WITH RC  all legal pairs with spare room
AS
(SELECT *, (room_size  class_size) AS fit
FROM Classes, Rooms
WHERE class_size < room_size)
SELECT *  best fit
FROM RC
WHERE fit = (SELECT MIN(fit)
FROM RC AS RCa
WHERE RC.class_nbr = RCa.class_nbr);
Results
=================
c1 80 r4 85 5 ◄ Opps!
c2 70 r4 85 15 ◄ Opps! Duplicate room number!
c3 65 r1 70 5
c4 55 r6 65 10
c5 50 r7 55 5
c6 40 r3 50 10
Anyone want to try for general solution?

012113, 04:26 #2Registered User
 Join Date
 Mar 2003
 Posts
 280
For the trivial case (distinct room and class sizes) I believe it is sufficient to discalify rows where there exists a better fit:
Code:SELECT c.*, r.*, (room_size  class_size) AS fit FROM Classes c, Rooms r WHERE class_size <= room_size and not exists ( select 1 from classes c2 where c2.class_nbr <> c.class_nbr and c2.class_size between c.class_size and r.room_size )
Code:row_number () over (partition over size order by <p.k.>)
Last edited by lelle12; 012113 at 07:38. Reason: wrong order of class_size, room_size in exists clause

Lennart

012113, 12:47 #3Registered User
 Join Date
 Jan 2013
 Posts
 359
Provided Answers: 1I tried ordering the rooms and the classes by size with ROW_NUMBER(), then match the two lists in sequence at the first legal matchingposition. It does not give an optimal solution. Try this data set:
To see how this will work, let's add another column to the table of legal (class_nbr, room_nbr) pairs to see how well they fit.
WITH Pairs (room_nbr, class_nbr, fit)
AS
(SELECT R.room_nbr, C.class_nbr,
(R.room_size  C.class_size)
FROM Rooms AS R, Classes AS C
WHERE C.class_size <= R.room_size)
SELECT P1.room_nbr, P1.class_nbr, fit
FROM Pairs AS P1
WHERE P1.fit
= (SELECT MIN(P2.fit)
FROM Pairs AS P2
WHERE P1.room_nbr = P2.room_nbr);
r01 c04 2
r02 c04 1
r03 c06 5
r04 c06 4
r05 c09 2
r06 c12 5
r07 c13 5
r08 c13 4
r09 c15 5
r10 c15 4
r11 c18 5
This time we did not have an exact fit, so let's look for (fit = 1) into a working table and remove 'r02' and 'c04' from their tables. The second step is
r02 c04 1
We can now remove the best fit rooms and classes, and look the next set of remaining best fits.
WITH Pairs (room_nbr, class_nbr, fit)
AS
(SELECT R.room_nbr, C.class_nbr,
(R.room_size  C.class_size)
FROM Rooms AS R, Classes AS C
WHERE C.class_size <= R.room_size)
(SELECT P1.room_nbr, P1.class_nbr, fit
FROM Pairs AS P1
WHERE P1.fit = 2);
r05 c09 2
Here is the step for a fit of 3, 4, 5 and 6
r01 c05 3
r04 06 4
r08 c13 4
r10 c15 4
r06 c12 5
r11 c18 5
r03 c07 6
r07 c14 6
r09 c16 6
At this point, the original tables of rooms and classes cannot be paired. Notice that as we removed rooms and classes, the fit numbers changed. This is a characteristic of greedy algorithms; taking the “big bites” can leave suboptimal leftovers on the plate.
If you had put the two sorted lists together and matched them to the first fit, then you get ('r01', 'c04') which has a fit of 2. But when we match ('r02', 'c04') we have a fit of 1.

012113, 14:26 #4Registered User
 Join Date
 Mar 2003
 Posts
 280
I don't see a set here so I assume you missed pasting it. However reading the rest of your post I get the feeling that you would like to find an optimal solution for packing classes into rooms. I might have misunderstood, but AFAIK know what you describe is the knapsack problem ( Knapsack problem  Wikipedia, the free encyclopedia ). I wish I had a polynomial time solution for that, but I don't ;) Perhaps there is a constraint I miss that makes the problem easier than KSP, is there?

Lennart

012113, 15:03 #5Registered User
 Join Date
 Jan 2013
 Posts
 359
Provided Answers: 1INSERT INTO Classes (class_nbr, class_size)
VALUES
('c01', 106), ('c02', 105), ('c03', 104), ('c04', 100), ('c05', 99),
('c06', 90), ('c07', 89), ('c08', 88), ('c09', 83), ('c10', 82),
('c11', 81), ('c12', 65), ('c13', 50), ('c14', 49), ('c15', 30),
('c16', 29), ('c17', 28), ('c18', 20), ('c19', 19);
INSERT INTO Rooms (room_nbr, room_size)
VALUES
('r01', 102), ('r02', 101), ('r03', 95), ('r04', 94), ('r05', 85),
('r06', 70), ('r07', 55), ('r08', 54), ('r09', 35), ('r10', 34),
('r11', 25), ('r12', 18);
To see how this will work, let's add another column to the table of legal (class_nbr, room_nbr) pairs to see how well they fit.
WITH Pairs (room_nbr, class_nbr, fit)
AS
(SELECT R.room_nbr, C.class_nbr,
(R.room_size  C.class_size)
FROM Rooms AS R, Classes AS C
WHERE C.class_size <= R.room_size)
SELECT P1.room_nbr, P1.class_nbr, fit
FROM Pairs AS P1
WHERE P1.fit
= (SELECT MIN(P2.fit)
FROM Pairs AS P2
WHERE P1.room_nbr = P2.room_nbr);
r01 c04 2
r02 c04 1
r03 c06 5
r04 c06 4
r05 c09 2
r06 c12 5
r07 c13 5
r08 c13 4
r09 c15 5
r10 c15 4
r11 c18 5
This time we did not have an exact fit, so let's look for (fit = 1) into a working table and remove 'r02' and 'c04' from their tables. The second step is
r02 c04 1
We can now remove the best fit rooms and classes, and look the next set of remaining best fits.
WITH Pairs (room_nbr, class_nbr, fit)
AS
(SELECT R.room_nbr, C.class_nbr,
(R.room_size  C.class_size)
FROM Rooms AS R, Classes AS C
WHERE C.class_size <= R.room_size)
(SELECT P1.room_nbr, P1.class_nbr, fit
FROM Pairs AS P1
WHERE P1.fit = 2);
r05 c09 2
Here is the step for a fit of 3, 4, 5 and 6
r01 c05 3
r04 c06 4
r08 c13 4
r10 c15 4
r06 c12 5
r11 c18 5
r03 c07 6
r07 c14 6
r09 c16 6
At this point, the original tables of rooms and classes cannot be paired. Notice that as we removed rooms and classes, the fit numbers changed. This is a characteristic of greedy algorithms; taking the “big bites” can leave suboptimal leftovers on the plate.

012113, 17:51 #6Registered User
 Join Date
 Mar 2003
 Posts
 280
Ok, I can see where you are heading. Given this algorithm we would need a cursor, or we could probably use a module from 9.7 to create a recursive function. I don't think it will be possible to use a CTE to solve it. I'll do some more thinking and see if I can come up with some additional solution

Lennart