newsgroups-index (beta)

Current group: pgsql.performance

IN() Optimization issue in 8.0rc5

IN() Optimization issue in 8.0rc5  
Josh Berkus
 Re: IN() Optimization issue in 8.0rc5  
Tom Lane
 Re: IN() Optimization issue in 8.0rc5  
Josh Berkus
From:Josh Berkus
Subject:IN() Optimization issue in 8.0rc5
Date:Sat, 15 Jan 2005 12:23:10 -0800
Tom,

Hmmm ... I'm seeing an issue with IN() optimization -- or rather the lack of
it -- in 8.0rc5. It seems to me that this worked better in 7.4, although
I've not been able to load this particular database and test

dm=# explain
dm-# SELECT personid FROM mr.person_attributes_old
dm-# WHERE personid NOT IN (SELECT
personid FROM mr.person_attributes);
QUERY PLAN
-----------------------------------------------------------------------------------
Seq Scan on person_attributes_old (cost=0.00..3226144059.85 rows=235732
width=4)
Filter: (NOT (subplan))
SubPlan
-> Seq Scan on person_attributes (cost=0.00..12671.07 rows=405807
width=4)
(4 rows)

dm=# explain select pao.personid from mr.person_attributes_old pao
dm-# left outer join mr.person_attributes p on pao.personid = p.personid
dm-# where p.personid is null;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Merge Left Join (cost=0.00..34281.83 rows=471464 width=4)
Merge Cond: ("outer".personid = "inner".personid)
Filter: ("inner".personid IS NULL)
-> Index Scan using idx_opa_person on person_attributes_old pao
(cost=0.00..13789.29 rows=471464 width=4)
-> Index Scan using idx_pa_person on person_attributes p
(cost=0.00..14968.25 rows=405807 width=4)
(5 rows)

It seems like the planner ought to recognize that the first form of the query
is optimizable into the 2nd form, and that I've seen it do so in 7.4.
However, *no* amount of manipulation of query parameters I did on the 1st
form of the query were successful in getting the planner to recognize that it
could use indexes for the IN() form of the query.

Thoughts?

--
Josh Berkus
Aglio Database Solutions
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
From:Tom Lane
Subject:Re: IN() Optimization issue in 8.0rc5
Date:Sat, 15 Jan 2005 15:53:37 -0500
Josh Berkus writes:
> dm=# explain
> dm-# SELECT personid FROM mr.person_attributes_old
> dm-# WHERE personid NOT IN (SELECT
> personid FROM mr.person_attributes);
> QUERY PLAN
> -----------------------------------------------------------------------------------
> Seq Scan on person_attributes_old (cost=0.00..3226144059.85 rows=235732
> width=4)
> Filter: (NOT (subplan))
> SubPlan
> -> Seq Scan on person_attributes (cost=0.00..12671.07 rows=405807
> width=4)
> (4 rows)

Hmm. What you want for a NOT IN is for it to say
Filter: (NOT (hashed subplan))
which you are not getting. What's the datatypes of the two personid
columns? Is the 400k-row estimate for person_attributes reasonable?
Maybe you need to increase work_mem (nee sort_mem) to allow a
400k-row hash table?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org
From:Josh Berkus
Subject:Re: IN() Optimization issue in 8.0rc5
Date:Sat, 15 Jan 2005 13:32:27 -0800
Tom,

> Hmm. What you want for a NOT IN is for it to say
> Filter: (NOT (hashed subplan))
> which you are not getting. What's the datatypes of the two personid
> columns?

INT

> Is the 400k-row estimate for person_attributes reasonable?

Yes, the estimates are completely accurate.

> Maybe you need to increase work_mem (nee sort_mem) to allow a
> 400k-row hash table?

Aha, that's it. I thought I'd already set that, but apparently it was a
different session. Fixed. Thanks!

--
Josh Berkus
Aglio Database Solutions
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
   

Copyright © 2006 newsgroups-index   -   All rights reserved   -   Impressum