Results 1 to 3 of 3
  1. #1
    Join Date
    Oct 2011
    Posts
    4

    Unanswered: Comparing date ranges to find earliest start date

    Hi. I've done some searching on this and I've tried a few pen-and-pencil operations, but I keep getting stuck in my "worst case" scenario.

    My scenario:

    I am working on an Oracle 11g database with query access.
    I have a table of job records indicating the start and end date of each job and which employee ID held that job.
    I have a table of employee records that shows
    • original hire date
    • current hire date (most recent employment)
    • adjusted date


    The adjusted date is what is supposed to be the earliest date of consecutive employment. Note that "consecutive" in this case is defined as having no more than a 1-year gap of employment.

    Of course, this field has not always been used correctly and I am trying to find the aberrant records.

    Let us assume that an employee had 4 jobs with the company:

    • 1: 01/01/00 to 01/01/01
    • 2: 06/01/02 to 01/01/04
    • 3: 03/01/02 to 06/30/02
    • 4: 02/01/04 to [null]


    The correct Adjusted Date for this employee would be 03/01/02 because there is only a one month gap between jobs 4 and 2 and there is an overlap between jobs 2 and 3.

    So far, I have created a subquery for the current job record, currentJob, with all of the key elements. My original instinct was to create a second table with all of the other jobs (or all of the jobs, for that matter) and look for the MIN(START_DATE) where oldJobs.END_DATE >= (currentJob.START_DATE - 365), but of course that won't catch a situation like the above.

    I'm starting to think that perhaps I could write a recursive query starting with the current job then iterating through the other jobs, but I've tried the START WITH...CONNECT BY PRIOR syntax to no avail.

    For example:

    Code:
    emp.ADJ_SERVICE_DATE <
                  (SELECT MIN (jobs.JOB_BEGIN)
                     FROM (SELECT DISTINCT
                                   PIDM,
                                   JOB_BEGIN,
                                   JOB_END,
                                   POSN,
                                   STATUS,
                                   BCAT,
                                  JOB_TYPE
                             FROM IDENTITY_TABLE sp
                                  INNER JOIN {several tables}
                             ORDER BY JOB_END DESC) jobs
                   START WITH jobs.JOB_END = NULL
                   CONNECT BY jobs.JOB_BEGIN <= PRIOR jobs.JOB_END + 365 )
    ...returns 0 records.

    I have verified that my inner jobs query (abbreviated here) works to return 58 potential issues (adjusted date < current hire date), and 2 or 3 legitimate issues (adjusted date < earliest consecutive job start date).

    If anyone has any pointers or insight, I'd be glad to hear it. I've never tried recursion in Oracle and I don't quite know if I understand the syntax even after reading several message boards and the Oracle documentation.

    Am I doing this right? Will it properly iterate the way I have it set up? Is there a better way to compare sets of dates as described above?

    Thanks,

    ~Mike

  2. #2
    Join Date
    Mar 2007
    Posts
    623
    There are two issues in the query you posted:

    > START WITH jobs.JOB_END = NULL
    This condition is never TRUE, it is NULL. For treating with NULLs, you should use IS NULL/IS NOT NULL conditions. So, correctly it would be
    > START WITH jobs.JOB_END IS NULL

    > CONNECT BY jobs.JOB_BEGIN <= PRIOR jobs.JOB_END + 365
    Here, you search hierarchy rows based on JOB_END, which is NULL in the first row. Maybe, just PRIOR keyword is misplaced; based on your text description, I would write it as
    > CONNECT BY nocycle months_between( PRIOR jobs.JOB_BEGIN, jobs.JOB_END ) <= 12
    which means, between given JOB_BEGIN and searched JOB_END is no longer than 12 months (which is not 365 days in some years). As other found row most probably will satisfy it, you should get rid of them explicitly; NOCYCLE seems to be the easiest way.

    If this is not what you want and you want further help, you should post a testcase - CREATE TABLE + INSERT scripts for the ability of re-creation your tables and data (no need for original data, meta table holding result of that subquery would be enough).

  3. #3
    Join Date
    Oct 2011
    Posts
    4
    Quote Originally Posted by flyboy View Post
    There are two issues in the query you posted:

    > START WITH jobs.JOB_END = NULL
    This condition is never TRUE, it is NULL. For treating with NULLs, you should use IS NULL/IS NOT NULL conditions. So, correctly it would be
    > START WITH jobs.JOB_END IS NULL
    Yeah, that was a harebrained mistake on my part.

    > CONNECT BY jobs.JOB_BEGIN <= PRIOR jobs.JOB_END + 365
    Here, you search hierarchy rows based on JOB_END, which is NULL in the first row. Maybe, just PRIOR keyword is misplaced; based on your text description, I would write it as
    > CONNECT BY nocycle months_between( PRIOR jobs.JOB_BEGIN, jobs.JOB_END ) <= 12
    which means, between given JOB_BEGIN and searched JOB_END is no longer than 12 months (which is not 365 days in some years). As other found row most probably will satisfy it, you should get rid of them explicitly; NOCYCLE seems to be the easiest way.
    This seems like a more elegant way of calculating a year. I was pondering the best way to do that and I was considering using addMonth(12) there.

    If this is not what you want and you want further help, you should post a testcase - CREATE TABLE + INSERT scripts for the ability of re-creation your tables and data (no need for original data, meta table holding result of that subquery would be enough).
    Unfortunately, I don't have WRITE access to an Oracle database, so while I could write an example CREATE TABLE/INSERT statement with sample data, I can't guarantee that said statement would work. That being said, if you'd like me to give it a whirl or just copy/paste some sample data from my subquery, I'd be glad to do so.

    At the moment I am trying the query with the modifications you suggested. Unfortunately, it has been running for over 20 minutes so far, and I'm only looking up one employee's jobs. Something tells me that either the query needs tweaking or there might be a better method for resolving this.

    Oh, and another gem I just discovered: the end date of the most recent job isn't necessarily null. It can be null, future-dated, or past-dated. Perhaps I could use a CASE statement to check for NULL, ELSE use the MAX(job_end)?

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •