Results 1 to 1 of 1

082508, 10:23 #1Registered User
 Join Date
 Apr 2004
 Location
 Europe>Sweden>Stockholm
 Posts
 71
Unanswered: Peculiar (slow) strategy in LEFT OUTER JOIN
Hi,
I have encountered a peculiar problem related to a graphlike data structure in which we want to find the different paths to the root of a certain maximum length.
We have a table – call it relation – with a small number of fields. The two of interest here are called "subject" and "object". The column called object contains the ID of the parent of the subject. If it helps, imagine that what we call "subject" is called "category_id" and what we call "object" is called "parent_category_id" instead. The relation table simply describes a treelike structure.
Query examples
So, to find the different paths of length 3 to the root from the leaf "x", I issue the following query (I select both "subject" and "object" for clarity):
Code:select r1.subject AS s1, r1.object AS o1, r2.subject AS s2, r2.object AS o2, r3.subject AS s3, r3.object AS o3 from relation r1 left outer join relation r2 on r2.subject = r1.object left outer join relation r3 on r3.subject = r2.object where r1.subject = 'x';
Code:s1  o1  s2  o2  s3  o3 +++++ x  x2  x2  x3  x3  x4 x  x1  x1  x9  x9  x6 x  x1  x1  x2  x2  x3 x  x1  x1  x6  x6  x2 x  x5  x5  x9  x9  x6 x  x5  x5  x2  x2  x3 x  x5  x5  x8  x8  x2 x  x5  x5  x2  x2  x3 x  x5  x5  x6  x6  x2 x  x6  x6  x2  x2  x3 x  x7  x7  x2  x2  x3 x  x2  x2  x3  x3  x4 x  x8  x8  x2  x2  x3 (13 rows) Time: 20.218 ms
Code:select r1.subject AS s1, r1.object AS o1, r2.subject AS s2, r2.object AS o2, r3.subject AS s3, r3.object AS o3, r4.subject AS s4, r4.object AS o4 from relation r1 left outer join relation r2 on r2.subject = r1.object left outer join relation r3 on r3.subject = r2.object left outer join relation r4 on r4.subject = r3.object where r1.subject = 'x';
Code:s1  o1  s2  o2  s3  o3  s4  o4 +++++++ x  x1  x1  x2  x2  x3  x3  x4 x  x5  x5  x2  x2  x3  x3  x4 x  x5  x5  x2  x2  x3  x3  x4 x  x6  x6  x2  x2  x3  x3  x4 x  x7  x7  x2  x2  x3  x3  x4 x  x8  x8  x2  x2  x3  x3  x4 x  x1  x1  x6  x6  x2  x2  x3 x  x5  x5  x8  x8  x2  x2  x3 x  x5  x5  x6  x6  x2  x2  x3 x  x1  x1  x9  x9  x6  x6  x2 x  x5  x5  x9  x9  x6  x6  x2 x  x2  x2  x3  x3  x4   x  x2  x2  x3  x3  x4   (13 rows) Time: 7496.166 ms <<< !!!
Fast query (three joins)
Code:Nested Loop Left Join (cost=4.08..7283.56 rows=1637 width=111) > Nested Loop Left Join (cost=2.04..621.08 rows=140 width=74) > Index Scan using relation_subject_index on relation r1 (cost=0.00..50.01 rows=12 width=37) Index Cond: ((subject)::text = 'x'::text) > Bitmap Heap Scan on relation r2 (cost=2.04..47.44 rows=12 width=37) Recheck Cond: ((r2.subject)::text = ("outer"."object")::text) > Bitmap Index Scan on relation_subject_index (cost=0.00..2.04 rows=12 width=0) Index Cond: ((r2.subject)::text = ("outer"."object")::text) > Bitmap Heap Scan on relation r3 (cost=2.04..47.44 rows=12 width=37) Recheck Cond: ((r3.subject)::text = ("outer"."object")::text) > Bitmap Index Scan on relation_subject_index (cost=0.00..2.04 rows=12 width=0) Index Cond: ((r3.subject)::text = ("outer"."object")::text) (12 rows)
Code:Merge Left Join (cost=44853.73..46103.24 rows=19138 width=148) Merge Cond: ("outer"."?column7?" = "inner"."?column3?") > Sort (cost=7370.95..7375.04 rows=1637 width=111) Sort Key: (r3."object")::text > Nested Loop Left Join (cost=4.08..7283.56 rows=1637 width=111) > Nested Loop Left Join (cost=2.04..621.08 rows=140 width=74) > Index Scan using relation_subject_index on relation r1 (cost=0.00..50.01 rows=12 width=37) Index Cond: ((subject)::text = 'x'::text) > Bitmap Heap Scan on relation r2 (cost=2.04..47.44 rows=12 width=37) Recheck Cond: ((r2.subject)::text = ("outer"."object")::text) > Bitmap Index Scan on relation_subject_index (cost=0.00..2.04 rows=12 width=0) Index Cond: ((r2.subject)::text = ("outer"."object")::text) > Bitmap Heap Scan on relation r3 (cost=2.04..47.44 rows=12 width=37) Recheck Cond: ((r3.subject)::text = ("outer"."object")::text) > Bitmap Index Scan on relation_subject_index (cost=0.00..2.04 rows=12 width=0) Index Cond: ((r3.subject)::text = ("outer"."object")::text) > Sort (cost=37482.78..38008.96 rows=210471 width=37) Sort Key: (r4.subject)::text > Seq Scan on relation r4 (cost=0.00..4393.71 rows=210471 width=37) (19 rows)
# \d+ relation
Code:Table "public.relation" Column  Type  Modifiers  Description +++ id  bigint  not null default nextval('relation_id_seq'::regclass)  type  bigint  not null  object  character varying(255)  not null  subject  character varying(255)  not null  Indexes: "relation_pkey" PRIMARY KEY, btree (id) "relation_type_key" UNIQUE, btree ("type", "object", subject) "relation_object_index" btree ("object") "relation_subject_index" btree (subject)
Any help would be greately appreciated!
Per