| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

08-10-06, 14:37
|
|
Registered User
|
|
Join Date: Apr 2004
Posts: 33
|
|
|
list all employees that didn't take courses in 2005 (was "SQL statment")
|
|
I need help on the SQL statment.
This SQL statment is extreamly slow and i don't know how to make it run faster. I have a tblTrainingStudents table where it keeps all the employees trainings and tblSHPEmployee is the employee roster and tblBasicSchool is the table where we track all the cadets who graduate from the basic school (from cadets to troopers) in each year. What i want the result to be is list all employees that didn't take courses in say year 2005. New troopers don't have to take any extra courses for year 2005 since they took these course when they were in basic school. I really need someone can help me speed up my SQL statment ASAP.
Thanks.
************************************************** **
SELECT *
FROM
(SELECT *
FROM tblTrainingCourses C, tblSHPEmployee
WHERE year = '#currentYear#'
AND (reg_no LIKE '1%'
OR reg_no LIKE '2%'
OR reg_no LIKE '3%')
AND troop = '#troop#'
AND NOT EXISTS
(SELECT *
FROM tblTrainingStudents S
WHERE C.trainingType = S.trainingType
AND C.courseID = S.courseID
AND year = '#currentYear#'
AND reg_no = regNo)) as employee
WHERE employee.reg_no NOT IN
(SELECT regNo
FROM tblBasicschool
WHERE employee.reg_no = regNo
AND left(enddate,4) = '#currentYear#'
AND (rank='TRP' OR left(regNo, 1) = '3'))
************************************************** *********
ORDER BY troop, lname, fname, minit, trainingType, courseID
|
|

08-10-06, 15:43
|
|
Registered User
|
|
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
|
|
I don(t see any need for the "SELECT FROM SELECT", so what about :
Code:
SELECT *
FROM tblTrainingCourses as C, tblSHPEmployee
WHERE year = '#currentYear#'
AND (reg_no LIKE '1%' OR reg_no LIKE '2%' OR reg_no LIKE '3%')
AND troop = '#troop#'
AND NOT EXISTS
(SELECT 1
FROM tblTrainingStudents
WHERE C.trainingType = trainingType
AND C.courseID = courseID
AND year = '#currentYear#'
AND reg_no = regNo)
AND reg_no NOT IN
(SELECT regNo
FROM tblBasicschool
WHERE left(enddate,4) = '#currentYear#'
AND (rank='TRP' OR regNo LIKE '3%')
)
ORDER BY ...
Clearly the "NOT EXISTS" and "NOT IN" will potentially slow down the query (since no index can be used for resolving these conditions).
Similarly the use of left(enddate,4) may slow down things; you should (if possible) replace it with someting like "enddate LIKE '2006%'".
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
|
Last edited by Peter.Vanroose; 08-10-06 at 15:48.
|

08-17-06, 08:00
|
|
Registered User
|
|
Join Date: Apr 2004
Posts: 33
|
|
|
|
Peter, thanks for your post. It is still so slow. It takes about 3 mins to run the query. Is there any other way to speed this up a lot more? The tblTrainingStudents is kind of big. I really appreciate your help.
|
|

08-17-06, 10:24
|
|
Registered User
|
|
Join Date: Aug 2006
Location: The Netherlands
Posts: 248
|
|
It looks like your main query produces a cartesian product for I don't see a join-clause between 'tblTrainingCourses' and 'tblSHPEmployee'... or did I miss something???
Grts
|
|

08-17-06, 16:29
|
|
Registered User
|
|
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
|
|
What about the following query?
I've removed tblTrainingCourses, and also removed the redundant condition reg_no = regNo in the second subquery.
You will maybe have to re-insert your conditions on year and troop -- I've no idea in which tables they belong.
Also, for the time being, I replaced the "in current year" condition by ">= '2006-01-01'"; adapt this if necessary.
The conditions on reg_no (with "BETWEEN" and ">= '3'") are more performant than a LIKE; but they assume reg_no to be all digits, and moreover assume an ASCII ordering.
Again, adapt if necessary.
Code:
SELECT *
FROM tblSHPEmployee AS e
WHERE reg_no BETWEEN '1' AND '39999999'
AND NOT EXISTS
( SELECT regNo
FROM tblTrainingStudents AS s INNER JOIN tblTrainingCourses AS c
ON s.CourseID = c.CourseID
WHERE regNo = e.reg_No
AND year = '2006'
)
AND reg_no NOT IN
( SELECT regNo
FROM tblBasicschool
WHERE endDate >= '2006-01-01'
AND (rank = 'TRP' OR regNo >= '3')
)
You may replace the "NOT EXISTS" by a "NOT IN" and vice versa; that could give performance differences (depending on the size of the tables), so try out all four possibilities. Don't forget to remove the condition "regNo = e.reg_No" in the "NOT IN" and to add it in the "NOT EXISTS".
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
|
Last edited by Peter.Vanroose; 08-18-06 at 02:05.
|

08-18-06, 07:57
|
|
Registered User
|
|
Join Date: Dec 2005
Location: Texas
Posts: 100
|
|
'LIKE' keywords also slow down queries since they force a table scan. Also you're selecting * when you potentially could not be.
|
|

08-20-06, 10:52
|
|
Registered User
|
|
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
|
|
Quote:
|
Originally Posted by disruptivehair
'LIKE' keywords also slow down queries since they force a table scan.
|
Not necessarily: most RDBMS' will use a matching index scan for "LIKE '3%'". But you're right in the case of a "%" at the beginning of the string.
Quote:
|
Originally Posted by disruptivehair
Also you're selecting * when you potentially could not be.
|
This is less of a performance problem, it's at most a data transmission bottleneck (over a slow communication line).
Except in the case where you would only SELECT a column which happens to be an indexed column (or column combination), since in that case the query *could* be performed with an index-only scan (but that also depends on the WHERE condition of course).
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
|
|

08-21-06, 06:08
|
|
Registered User
|
|
Join Date: Dec 2005
Location: Texas
Posts: 100
|
|
Quote:
|
Originally Posted by Peter.Vanroose
Not necessarily: most RDBMS' will use a matching index scan for "LIKE '3%'". But you're right in the case of a "%" at the beginning of the string.
This is less of a performance problem, it's at most a data transmission bottleneck (over a slow communication line).
Except in the case where you would only SELECT a column which happens to be an indexed column (or column combination), since in that case the query *could* be performed with an index-only scan (but that also depends on the WHERE condition of course).
|
I defer to your superior knowledge Peter!
Our DBAs here refuse to allow us to use LIKE in code because most of the time it does a table scan. 'SELECT *' is also banned since not only is it wasteful, it has a tendency to stop working if the table schema changes. 
|
|

08-21-06, 10:04
|
|
Registered User
|
|
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
|
|
Quote:
|
Originally Posted by disruptivehair
'SELECT *' is also banned since not only is it wasteful, it has a tendency to stop working if the table schema changes.
|
I fully agree with that: it's *always* safer to list all required columns explicitly, even it turns out to be all table columns.
So indeed "SELECT *" should be banned!
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
|
|

08-22-06, 10:02
|
|
Registered User
|
|
Join Date: Dec 2005
Location: Texas
Posts: 100
|
|
Quote:
|
Originally Posted by Peter.Vanroose
I fully agree with that: it's *always* safer to list all required columns explicitly, even it turns out to be all table columns.
So indeed "SELECT *" should be banned!
|
ITA Peter...we've still got some 'select *' stuff hanging around but it is slowly being re-written to explicitly list all the columns.
|
|

08-24-06, 10:46
|
|
Registered User
|
|
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
|
|
Or if you want to list all courses together with all employees who did not take that course:
Code:
SELECT c.CourseName, e.EmployeeName
FROM
tblSHPEmployee AS e
INNER JOIN
tblTrainingCourses AS c
ON NOT EXISTS
( SELECT 1
FROM tblTrainingStudents AS s
WHERE s.CourseID = c.CourseID
AND e.reg_No = s.regNo
)
WHERE
reg_no BETWEEN '1' AND '39999999'
ORDER BY 1, 2
Mind you that this may again run a lot slower!
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
|
|

08-24-06, 11:20
|
|
King of Understatement
|
|
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
|
|
[Non-ANSI Comment]Assuming this is SQL Server then SELECT * within an exists statement is the one time SELECT * is recommended - the optimiser then selects optimal index.[/Non-ANSI Comment]
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
|
|
|

08-24-06, 11:41
|
|
Registered User
|
|
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
|
|
Quote:
|
Originally Posted by pootle flump
SELECT * within an exists statement is recommended - the optimiser then selects optimal index.
|
Are you sure that SQLServer would not do that with "SELECT 1" ?
I know DB2 will typically do an index scan without data lookup when using "SELECT 1".
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
|
|

08-25-06, 09:54
|
|
King of Understatement
|
|
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
|
|
You know - I have lost my original MS link but coincidently today I've read something elsewhere (by the bloomin head of SQL Server Optimisation no less) that contradicts what I posted. So disregard 
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
|
|
|

08-25-06, 11:42
|
|
Registered User
|
|
Join Date: Sep 2004
Location: Belgium
Posts: 1,079
|
|
Anyway, in an EXISTS, "SELECT 1" is better than "SELECT *" since the former does not require data access (when an index is available on the column(s) specified in the WHERE condition) while the latter may need data access (unless the optimizer is clever enough to see that the two are equivalent :-)
__________________
--_Peter Vanroose,
__IBM Certified Database Administrator, DB2 9 for z/OS
__IBM Certified Application Developer
__ABIS Training and Consulting
__http://www.abis.be/
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|