Results 1 to 5 of 5
  1. #1
    Join Date
    Oct 2002
    Posts
    37

    Question Multivalued field seach problems (con't)

    This post is related to one my previous posts in this forum.

    Here is a set of master, detailed tables:

    PROFILE
    userid
    group
    age
    religion

    PROFILE_ETHNICITY
    userid (foreign key of profile table)
    ethnicity

    PREFERENCE
    userid
    min_age
    max_age

    PREFERENCE_ETHNICITY
    userid (foreign key of PREFERENCEe table)
    ethnicity

    PREFERENCE_RELIGION
    userid (foreign key of PREFERENCE table)
    religion

    If I want to find anyone who is from group 101 (which I'm not in) and between 38 and 54 years old and either Christian or Catholic, I can have the following query statement:

    select p.userid, p.age
    from PROFILE p
    where p.group='101' and
    (p.age between '38' and '54') and
    (p.religion in ('Christian', 'Catholic' ))

    Now, if I want to have ethnicity as the selection facter as well, how I can modify the previous statement? I can't add the following up as a subquery.

    ((select ethnicity from profile_ethnicity where userid=p.userid) and
    ethnicity in ('black', 'white'))


    If I want to find anyone I also meet the preference, I need to have the following two query statements:

    select age, religion from profile where userid='Vince'

    The result of this query is my_age and my_religion respectively.

    select p.userid, p.age
    from PROFILE p
    where p.group='101' and
    (p.age between '38' and '54') and
    (p.religion in ('Christian', 'Catholic' )) and
    exists ( select 1 from preference f where f.userid=p.userid and (myage between f.minage and f.maxage) )

    I am quite confused on how to have religion and ethnicity in the previous query.

    Intersect can't be applied here since the search from two different tables.


    I have been working on a solution for a few of days and can't find one. Any helps will be gracefully appreciated.

    Thanks,

    v.

  2. #2
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171

    Re: Multivalued field seach problems (con't)

    Sounds like you want to use EXISTS:

    select p.userid, p.age
    from PROFILE p
    where p.group='101' and
    (p.age between '38' and '54') and
    (p.religion in ('Christian', 'Catholic' ))
    and EXISTS
    ( select 1 from profile_ethnicity pe
    where pe.userid=p.userid
    and pe.ethnicity in ('black', 'white')
    )

  3. #3
    Join Date
    Oct 2002
    Posts
    37
    The statement works. Thanks Tony for your help again.

    I can have a solution with the similar query For the second part of questions as:

    select p.userid from profile p
    where exists (select 1 from profile_ethnicity pe
    where pe.userid = p.userid and
    ethnicity in ('asian')) and
    exists (select 1 from preference_ethnicity pfe
    where pfe.userid=p.userid and
    pfe.ethnicity in (select pe1.ethnicity
    from profile_ethnicity pe1
    where pe1.userid='vince' ));

    However, it comes with a very high performance cost. For a very small set of data, the analyze query shows the following:

    Seq Scan on profile p (cost=0.00..9035020.00 rows=250 width=55) (actual time=2.00..3.00 rows=1 loops=1)

    Can the query be optimised?



    v.

  4. #4
    Join Date
    Sep 2002
    Location
    UK
    Posts
    5,171
    Don't know if these will help, but:

    1) You can change the IN subquery to a join:

    select p.userid from profile p
    where exists
    ( select 1
    from profile_ethnicity pe
    where pe.userid = p.userid
    and ethnicity in ('asian')
    )
    and exists
    ( select 1
    from preference_ethnicity pfe,
    profile_ethnicity pe1
    where pfe.userid=p.userid
    and pfe.ethnicity = pe1.ethnicity
    and pe1.userid='vince'
    );

    2) This gives the same result:

    select pe.userid
    from profile_ethnicity pe
    where ethnicity in ('asian')
    INTERSECT
    select pfe.userid
    from preference_ethnicity pfe,
    profile_ethnicity pe1
    where pfe.ethnicity = pe1.ethnicity
    and pe1.userid='vince' ;

    (There is no need for a SELECT from profile here).

  5. #5
    Join Date
    Oct 2002
    Posts
    37
    Using IN improves the performance a lot based on the plan, thought the actual time increases (that is fine for a very small test data). The following is the totoal output:

    Seq Scan on profile p (cost=0.00..70162.66 rows=250 width=55) (actual time=3.00 ..3.00 rows=1 loops=1)
    SubPlan
    -> Seq Scan on profile_ethnicity pe (cost=0.00..25.00 rows=1 width=0) (actual time=0.00..0.00 rows=1 loops=4)
    -> Merge Join (cost=45.12..45.14 rows=1 width=14) (actual time=1.50..1.50 rows=1 loops=2)
    -> Sort (cost=22.56..22.56 rows=5 width=7) (actual time=1.00..1.00 rows=4 loops=2)
    -> Seq Scan on preference_ethnicity pfe (cost=0.00..22.50 rows=5 width=7) (actual time=0.00..0.50 rows=4 loops=2)
    -> Sort (cost=22.56..22.56 rows=5 width=7) (actual time=0.50..0.50 rows=1 loops=2)
    -> Seq Scan on profile_ethnicity pe1 (cost=0.00..22.50 rows=5 width=7) (actual time=0.00..1.00 rows=1 loops=1)
    Total runtime: 4.00 msec

    I do need to select some other fields of profile table. So I modify the second query to

    select userid
    from profile
    where userid in
    (select pe.userid
    from profile_ethnicity pe
    where ethnicity in ('a')
    INTERSECT
    select pfe.userid
    from preference_ethnicity pfe,profile_ethnicity pe1
    where pfe.ethnicity = pe1.ethnicity and
    pe1.userid='John') ;

    NOTICE: QUERY PLAN:

    Seq Scan on profile (cost=0.00..118545.50 rows=500 width=55) (actual time=3.00..3.00 rows=1 loops=1)
    SubPlan
    -> Materialize (cost=118.52..118.52 rows=3 width=69) (actual time=0.75..0.75 rows=1 loops=4)
    -> SetOp Intersect (cost=118.45..118.52 rows=3 width=69) (actual time=3.00..3.00 rows=1 loops=1)
    -> Sort (cost=118.45..118.45 rows=30 width=69) (actual time=2.00..2.00 rows=4 loops=1)
    -> Append (cost=0.00..117.71 rows=30 width=69) (actual time=0.00..2.00 rows=4 loops=1)
    -> Subquery Scan *SELECT* 1 (cost=0.00..22.50 rows=5 width=55) (actual time=0.00..0.00 rows=2 loops=1)
    -> Seq Scan on profile_ethnicity pe (cost=0.
    00..22.50 rows=5 width=55) (actual time=0.00..0.00 rows=2 loops=1)
    -> Subquery Scan *SELECT* 2 (cost=92.39..95.21 rows=25 width=69) (actual time=2.00..2.00 rows=2 loops=1)
    -> Merge Join (cost=92.39..95.21 rows=25 wid
    th=69) (actual time=2.00..2.00 rows=2 loops=1)
    -> Sort (cost=69.83..69.83 rows=1000 w
    idth=62) (actual time=1.00..1.00 rows=13 loops=1)
    -> Seq Scan on preference_ethnici
    ty pfe (cost=0.00..20.00 rows=1000 width=62) (actual time=0.00..0.00 rows=13 loops=1)
    -> Sort (cost=22.56..22.56 rows=5 widt
    h=7) (actual time=1.00..1.00 rows=1 loops=1)
    -> Seq Scan on profile_ethnicity
    pe1 (cost=0.00..22.50 rows=5 width=7) (actual time=0.00..0.00 rows=1 loops=1)
    Total runtime: 4.00 msec

    This query performance is better the original one, though worser than the first one.

    Thanks very much for your helps.

    v.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •