If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > Can Anybody speed this query up?

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 04-09-09, 08:33
DazlerD DazlerD is offline
Registered User
 
Join Date: Apr 2009
Posts: 9
Can Anybody speed this query up?

Hi

I have the following table structure:

table: Customer
column: customer_id (pk)
column: regTime (int field. time user registered)

table: Order_sum
column: customer_id (fk)
column: status

A customer can have multiple orders all with different statuses. 1 to 10.

I need to return a list of customer_ids where the customer has registered over 30 days ago AND who hasn't got any orders of status 2, 3 or 10. So if they have an order of status 1 we want to include them. If the customer has two orders with statuses 1 and 2, we don't it in the results.

The SQL i have that works is:

Code:
SELECT DISTINCT customer.customer_id
FROM customer
WHERE customer .regTime < ( UNIX_TIMESTAMP( ) - ( 60 *60 *24 *30 ) )
AND NOT
EXISTS (
   SELECT 1
   FROM order_sum
   WHERE order_sum.customer_id = customer.customer_id
   AND order_sum.status IN ( 2, 3, 10 )
)
but it takes a long time to run. about 30 seconds.

My customer table has 1717 rows. The order_sum table has 1952 rows.

Can anybody speed my query up please ?

Thanks

Darren
Reply With Quote
  #2 (permalink)  
Old 04-09-09, 09:17
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
Code:
SELECT customer.customer_id
  FROM customer
LEFT OUTER
  JOIN order_sum
    ON order_sum.customer_id = customer.customer_id
   AND order_sum.status IN ( 2, 3, 10 )
 WHERE customer.regTime < UNIX_TIMESTAMP(CURRENT_DATE - INTERVAL 30 DAY)
   AND order_sum.customer_id IS NULL
make sure you have an index declared on customer.regTime and another on order_sum.customer_id

it might be even faster if the index were a compound index declared on (customer_id,status)

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 04-11-09, 13:54
slapo slapo is offline
Registered User
 
Join Date: Nov 2008
Posts: 7
Try this:
Code:
SELECT 
	DISTINCT cr.customer_id
FROM 
	customer cr
	JOIN
	order_sum osm
	ON(osm.customer_id = cr.customer_id)
WHERE 
	datediff(now(), cr.regTime) > 30
	AND osm.status NOT IN (2,3,10);
and

Code:
SELECT 
	DISTINCT cr.customer_id
FROM 
	customer cr
	JOIN
	order_sum osm
	ON(osm.customer_id = cr.customer_id AND datediff(now(), cr.regTime) > 30)
WHERE 
	osm.status NOT IN (2,3,10);
I'm not sure whether it really is faster, so you'll have to try that, but simple joins are fairly fast in MySQL and this way of writing them comes more natural to me.
Make sure to check on you indexes as r937 suggested. You should also have an index on customer.customer_id, particularly if that's the unique row identifier.

Last edited by slapo; 04-11-09 at 13:58.
Reply With Quote
  #4 (permalink)  
Old 04-16-09, 05:03
DazlerD DazlerD is offline
Registered User
 
Join Date: Apr 2009
Posts: 9
Thanks for your help guys.

I havent got round to testing whose is the fastest query just yet as when i put an index on order_sum.customer_id my query worked within 2 seconds!!

Thanks

Darren
Reply With Quote
  #5 (permalink)  
Old 04-16-09, 08:12
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
regarding performance...

in general, when you place a function on a table column, the optimizer cannot utilize an index on that column

compare my condition --
Code:
customer.regTime < UNIX_TIMESTAMP(CURRENT_DATE - INTERVAL 30 DAY)
with this --
Code:
datediff(now(), cr.regTime) > 30
this second condition will fail to work efficiently

in fact, it will actually fail to work correctly, as well, because it does not take into consideration that regTime is actually an integer, and not a date which can be used in the DATEDIFF function

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #6 (permalink)  
Old 04-16-09, 12:52
DazlerD DazlerD is offline
Registered User
 
Join Date: Apr 2009
Posts: 9
Quote:
Originally Posted by r937
Code:
SELECT customer.customer_id
  FROM customer
LEFT OUTER
  JOIN order_sum
    ON order_sum.customer_id = customer.customer_id
   AND order_sum.status IN ( 2, 3, 10 )
 WHERE customer.regTime < UNIX_TIMESTAMP(CURRENT_DATE - INTERVAL 30 DAY)
   AND order_sum.customer_id IS NULL
make sure you have an index declared on customer.regTime and another on order_sum.customer_id

it might be even faster if the index were a compound index declared on (customer_id,status)

Hi

How does the above take into account we DONT want to return records if they have a status 2,3,10? we only want to return the customer id if it doesnt have an order of 2,3,10. so orders with status 1,2 would not be return the customer id.

Thanks
Reply With Quote
  #7 (permalink)  
Old 04-16-09, 12:55
DazlerD DazlerD is offline
Registered User
 
Join Date: Apr 2009
Posts: 9
Ah got it.

It uses a left outer join so may or may not include any orders that have status 2,3,10.

then we just say we want the ones that have no ordersum.customerid.

Nice work
Reply With Quote
  #8 (permalink)  
Old 04-17-09, 07:14
DazlerD DazlerD is offline
Registered User
 
Join Date: Apr 2009
Posts: 9
Hi

Thanks for the help so far r937, it works a treat.

How would i change the sql to say:

Return those customers who have made ONE order of status (2,3,10) and that order was over 30 days ago?

So they might have 3 orders, two are status 1, one is status 2. This needs to be included.
If they have 3 orders, one of status 1 and two of status 2. This is NOT included.
This is on order time now not customer reg time.
AND if they have made an order (2,3,10) within 30 days the customer is not added.

Im finding it complicated to explain it, no wonder I cant write the SQL !!

Thanks for any help
Reply With Quote
  #9 (permalink)  
Old 04-17-09, 08:41
slapo slapo is offline
Registered User
 
Join Date: Nov 2008
Posts: 7
Quote:
Originally Posted by r937
regarding performance...

in general, when you place a function on a table column, the optimizer cannot utilize an index on that column

compare my condition --
Code:
customer.regTime < UNIX_TIMESTAMP(CURRENT_DATE - INTERVAL 30 DAY)
with this --
Code:
datediff(now(), cr.regTime) > 30
this second condition will fail to work efficiently

in fact, it will actually fail to work correctly, as well, because it does not take into consideration that regTime is actually an integer, and not a date which can be used in the DATEDIFF function

That's right, I failed to notice the datatype of regTime and its possible values :P
The point about the performance should be correct as well.

Sorry DazlerD and thanks for the correction, r937
Reply With Quote
  #10 (permalink)  
Old 04-17-09, 08:48
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
Quote:
Originally Posted by DazlerD
Return those customers who have made ONE order of status (2,3,10) and that order was over 30 days ago?
Code:
SELECT customer.customer_id
  FROM customer
INNER
  JOIN order_sum
    ON order_sum.customer_id = customer.customer_id
   AND order_sum.status IN ( 2, 3, 10 )
   AND order_sum.time < UNIX_TIMESTAMP(CURRENT_DATE - INTERVAL 30 DAY)
GROUP
    BY customer.customer_id
HAVING COUNT(*) = 1
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #11 (permalink)  
Old 04-17-09, 12:05
DazlerD DazlerD is offline
Registered User
 
Join Date: Apr 2009
Posts: 9
Brilliant r937, thanks a lot.

Ive been looking at this for the last hour, starting from the beginning then adding bits and doing some testing all the way along.

I did work out the inner join so happy with that !!

Thanks again to both of you for your advice.

Have a good weekend.

D
Reply With Quote
  #12 (permalink)  
Old 04-17-09, 12:33
DazlerD DazlerD is offline
Registered User
 
Join Date: Apr 2009
Posts: 9
Hi

I copied the results into a table and the wrote some sql to check the results:

SELECT sum.cart_order_id, sum.customer_id, sum.status, from_unixtime(sum.time)
FROM order_sum sum, PUR1_01 pur
WHERE sum.customer_id = pur.customer_id
and sum.status in (2,3,10)

return 889 results

SELECT distinct(sum.customer_id)
FROM order_sum sum, PUR1_01 pur
WHERE sum.customer_id = pur.customer_id
and sum.status in (2,3,10)

returned 885 records.

The sql is returning where the customer has ONE order with a status 2,3,10 more than 30 days ago. BUT the ones i found have another order within 30 days.

I only want those customers who have ONE order of 2,3,10 and that order is > 30 days ago.

Then i need to do another query where the cust has made TWO orders of 2,3,10 and BOTH those orders are > 90 days.

See I told you i found it hard to explain !!!

Thanks

D
Reply With Quote
  #13 (permalink)  
Old 07-13-09, 07:40
DazlerD DazlerD is offline
Registered User
 
Join Date: Apr 2009
Posts: 9
My results return

cust_id order date
628 2008-12-06 14:53:55
628 2009-07-03 17:09:44

How can this query

Code:
SELECT customer.customer_id
  FROM customer
INNER
  JOIN order_sum
    ON order_sum.customer_id = customer.customer_id
   AND order_sum.status IN ( 2, 3, 10 )
   AND order_sum.time < UNIX_TIMESTAMP(CURRENT_DATE - INTERVAL 30 DAY)
GROUP
    BY customer.customer_id
HAVING COUNT(*) = 1
be changed to not return any results that have an order in the last 30 days?

Thanks
Reply With Quote
  #14 (permalink)  
Old 07-13-09, 07:57
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
Code:
SELECT customer.customer_id
  FROM customer
INNER
  JOIN order_sum
    ON order_sum.customer_id = customer.customer_id
GROUP
    BY customer.customer_id
HAVING 1 =
       COUNT(CASE WHEN order_sum.status IN ( 2, 3, 10 )
                   AND order_sum.time < UNIX_TIMESTAMP(CURRENT_DATE - INTERVAL 30 DAY)
                  THEN 'older'
                  ELSE NULL END )
   AND 0 =
       COUNT(CASE WHEN order_sum.status IN ( 2, 3, 10 )
                   AND order_sum.time >= UNIX_TIMESTAMP(CURRENT_DATE - INTERVAL 30 DAY)
                  THEN 'newer'
                  ELSE NULL END )
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #15 (permalink)  
Old 07-15-09, 06:44
DazlerD DazlerD is offline
Registered User
 
Join Date: Apr 2009
Posts: 9
Hi r937

I've implemented and tested your sql statement.

Sweet man, it works a treat. Ive just got to understand it now !!!

Thanks ever so much. Much appreciated.

D
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On