| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

04-29-08, 18:25
|
|
Registered User
|
|
Join Date: Jul 2007
Posts: 10
|
|
|
Searching for logical combination of tags
|
|
I have a database where video is tagged with tags. It has three tables: a Videos table with id and other information, a Tags table with id and tag, and I have a join table Videos_Tags with video_id and tag_id which connects videos to tags and vice versa.
I want to offer the users of a web interface the possibility to search for complex combinations of tags, for example:
'romantic'
--AND
'short'
--OR
'action'
--AND
'long'
or any other combination.
So also (('romantic' OR 'short') AND 'action') should be possible.
I need to dynamically build the sql query which returns the right videos.
for the first case, I suppose the query has to be like this:
select * from videos
JOIN Videos_Tags a on videos.id = a.video_id
JOIN Videos_Tags b on videos.id = b.video_id
JOIN Tags x on a.tag_id = x.id
JOIN Tags y on b.tag_id = y.id
WHERE
x.tag = 'romantic' AND y.tag = 'short'
OR
x.tag = 'action' AND y.tag = 'long'
but for the second query:
select * from videos
JOIN Videos_Tags a on videos.id = a.video_id
JOIN Videos_Tags b on videos.id = b.video_id
JOIN Tags x on a.tag_id = x.id
OIN Tags y on b.tag_id = y.id
WHERE
(x.tag = 'romantic' OR x.tag = 'short')
AND
y.tag = 'action'
I'm having a hard time building these queries dynamically. If I have three tags combined with AND, I have to join three times, but it gets complicated when the logical tree gets more complicated.
Are there some general rules here? I suppose there are but I didn't really find something.
I thought it would be a common problem but it seems like most 'solutions' just use a simple 'IN' or even store their tags as a comma-separated string in their main table...
Thanks!
--edit
I think I got it!
'romantic'
--AND
'short'
--OR
'action'
--AND
'long'
results in a tree like this:
_________________OR
_______________/_____\
____________AND_____AND
_________/______\____/___\
_______rom._____sh._act.__long
I have to start at the root of the tree. If it's an OR I give every child the same letter, e.g. 'A'.
If it's an AND I give every child a different letter, e.g 'A', 'B' and 'C'
This has to be done recursive, and every time the newly passed letter has to be added by the letter the child already had. This means that in the example above the AND and AND each get letter 'A', 'romantic' gets 'AA', 'short' gets 'AB', 'action' gets 'AA' and 'long' gets 'AB'.
Any comments?
|
Last edited by CyberMonkey; 04-29-08 at 18:50.
|

04-29-08, 19:10
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
the following solutions assume that you cannot assign the same tag to the same video more than once -- the easiest way to do this is to declare the 2 FKs in the Videos_Tags table as the PK
( 'romantic' AND 'short' ) OR ( 'action' AND 'long' ) --
Code:
SELECT videos.id
, videos.otherstuff
FROM Tags
INNER
JOIN Videos_Tags
ON Videos_Tags.tag_id = Tags.id
INNER
JOIN videos
ON videos.id = Videos_Tags.video_id
WHERE Tags.tag IN ( 'romantic', 'short'
, 'action', 'long' )
GROUP
BY videos.id
HAVING SUM(CASE WHEN Tags.tag IN ( 'romantic', 'short' )
THEN 1 ELSE 0 END ) = 2
OR SUM(CASE WHEN Tags.tag IN ( 'action', 'long' )
THEN 1 ELSE 0 END ) = 2
i'm sure you could come up with other combinations, and they all have similar, simple solutions
note: in mysql, you are allowed to have other columns in the SELECT that aren't mentioned in the GROUP BY
in other database systems, you would have to include them in the GROUP BY (which might be problematic if they include a BLOB column, but you can simply push everything down one subquery level)
|
|

04-29-08, 19:33
|
|
Registered User
|
|
Join Date: Jul 2007
Posts: 10
|
|
|
|
That's pretty nice, I would never have thought of that. Thanks a lot!!
It's indeed a suitable and performant solution for searching for tags. However, I wonder what I have to do if I also want to include where clauses for columns of Videos?
Let's say I want a video with
(tag='romantic' AND tag='short') or (videos.title LIKE 'The%' AND tag='kurosawa')
Is this possible with this solution with GROUP?
Thanks!
|
|

04-29-08, 19:38
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Code:
...
WHERE Tags.tag IN ( 'romantic', 'short' , 'kurosawa' )
...
HAVING SUM(CASE WHEN Tags.tag IN ( 'romantic', 'short' )
THEN 1 ELSE 0 END ) = 2
OR SUM(CASE WHEN Tags.tag = 'kurosawa'
OR videos.title LIKE 'The%'
THEN 1 ELSE 0 END ) = 2
|
|

04-29-08, 19:41
|
|
Registered User
|
|
Join Date: Jul 2007
Posts: 10
|
|
Fantastic! Thanks 
|
|

04-30-08, 05:06
|
|
Registered User
|
|
Join Date: Jul 2007
Posts: 10
|
|
I just tried your last solution. Is it possible that when I have a video with id = 2 and title 'The seven samurai' and with 5 different tags, the solution is not correct?
This is because the join creates five lines with videos.id = 2, each with a different tag but all with the same title.
However, if I change the OR to AND and the outcome of the sum to 1, it works.
Code:
...
WHERE Tags.tag IN ( 'romantic', 'short' , 'kurosawa' )
...
HAVING SUM(CASE WHEN Tags.tag IN ( 'romantic', 'short' )
THEN 1 ELSE 0 END ) = 2
OR SUM(CASE WHEN Tags.tag = 'kurosawa'
AND videos.title LIKE 'The%'
THEN 1 ELSE 0 END ) = 1
This kind of complicates the building of the query I suppose. I have to count the number of tags in a SUM to know what the sum has to be equal to.
It seems to me my initial solution with joins is more elegant as it is easier to implement the logical hierararchy, although the performance will be much worse.
What do you think?
Thanks again 
|
|

04-30-08, 06:37
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by CyberMonkey
I just tried your last solution. Is it possible that when I have a video with id = 2 and title 'The seven samurai' and with 5 different tags, the solution is not correct?
|
it's possible -- which 5 tags does it have?
Quote:
|
Originally Posted by CyberMonkey
I have to count the number of tags in a SUM to know what the sum has to be equal to.
|
and the difficulty with a correlated subquery in the HAVING clause is... ?
Quote:
|
Originally Posted by CyberMonkey
It seems to me my initial solution with joins is more elegant as it is easier to implement the logical hierararchy, although the performance will be much worse.
|
okay, go ahead, enjoy yourself, join as many times as you want

|
|

04-30-08, 07:15
|
|
Registered User
|
|
Join Date: Jul 2007
Posts: 10
|
|
Quote:
|
Originally Posted by r937
and the difficulty with a correlated subquery in the HAVING clause is... ?
|
That's not difficult indeed, the problem is the code which makes it happen.
Not only the query has to work, but the query has to be build-able with code which is robust and elegant and I'm afraid that won't be the case with the group-by.
|
|

04-30-08, 07:28
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
the GROUP BY query is simple and elegant, and involves only counting in the HAVING clause
the join approach has a variable number of joins based on the problem, and if you are more comfortable coding that, then there you have your answer!!
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|