Results 1 to 10 of 10

Thread: SQL question

  1. #1
    Jane Vohden Guest

    Question Answered: SQL question

    I want to select the first record of each occurrence of a different id within a table after it is sorted. I thought it would be fairly straight forward but I can't get it to work. Here is a copy of my sql in it's most basic form. It returns all the records in the table. I've tried "group by" and "first" and a subquery with no luck.

    Any ideas? What am I missing? Thanks for you help.
    Code:
    select unique id, calc_end, calc_beg 
      from hr_emppay 
      order by id asc, calc_end desc
    __________________
    Jane Vohden
    Database Administrator Fairbanks North Star Borough

  2. Best Answer
    Posted by Gerard Lapidario

    "Hi Jane,

    The SQL you've written will return the unique combination of "id, calc_end, calc_beg." If you want the first ID to be listed then there are several ways of doing it. One is through cursor, fetch the first record then close the cursor. But if you want it on SQL, then a approach will be needed.

    If through SQL:
    Code:
    SELECT a.id, a.calc_end, a.calc_beg 
      FROM hr_emppay a 
      WHERE a.rowid = (SELECT MAX(rowid) 
                        FROM hr_emppay b 
                        WHERE b.id = a.id)
    This will retrieve the latest record for the given id. Placing MIN would get the very first record created for the given id. Please note that this query will tend to be slow, so if this is in a program, id use a cursor with your given SQL statement below and just fetch one record. Hope this help "


  3. #2
    Andrew Hamm Guest
    Nahh - can't be done directly. Since you imply there will be multiple id's in the table, then it follows that it is not the primary key. If you want to get one row where there are many, you need to invent an artificial filter. I've done this before (but I'm not proud of it)
    Code:
    select min(rowid) 
     from hr_emppay 
     group by id
    [Note that you don't need to select the actual ID here for the group by to work. Please read on...]

    Now you'll only get one row per unique id from the select, although it has been arbitrarily chosen. You may select the min of the primary key if you want to be politically correct, but that could be tricky if there are multiple fields.

    If you then use this select as a sub-query you may get your result:
    Code:
    select id, calc_end, calc_beg 
      from hr_emppay 
      where rowid in (select min(rowid) 
                       from hr_emppay 
                       group by id) 
      order by id, calc_end desc
    However, wanting to do this is possibly defective. I'm chosing one id through the very arbitrary rule of min(rowid). Perhaps the way you are going to use this might make it better to do something like
    select PRIMARY_KEY_FIELD, id, min(calc_end) ......
    or at least use min(calc_end) or max(calc_end) or something more intelligent than rowid?

    The need for this query suggests to me (with complete ignorance of your data and application) that the table is not normalised properly. If (ID, calc_end, calc_begin) are not fully dependent on "the key, the whole key and nothing but the key" then it sounds like this one table should be at least two tables. When the tables get into the proper form, then your queries will magically simplify and you won't need to execute really ugly selects like the one I've just suggested.

  4. #3
    Jonathan Leffler Guest
    This sort of query is tough.
    Code:
    select unique t1.id, 
           (select max(calc_end) 
             from hr_emppay t2 
             where b1.id = t1.id) as calc_end 
      from hr_emppay t1 
      order by id asc;
    
    This seems to work on my sample data, which was pretty trivial.
    Yours, Jonathan Leffler
    __________________
    #include <disclaimer.h>
    Guardian of DBD::Informix v1.00.PC1 -- http://www.perl.com/CPAN
    "I don't suffer from insanity; I enjoy every minute of it!"

  5. #4
    Gerard Lapidario Guest
    Hi Jane,

    The SQL you've written will return the unique combination of "id, calc_end, calc_beg." If you want the first ID to be listed then there are several ways of doing it. One is through cursor, fetch the first record then close the cursor. But if you want it on SQL, then a approach will be needed.

    If through SQL:
    Code:
    SELECT a.id, a.calc_end, a.calc_beg 
      FROM hr_emppay a 
      WHERE a.rowid = (SELECT MAX(rowid) 
                        FROM hr_emppay b 
                        WHERE b.id = a.id)
    This will retrieve the latest record for the given id. Placing MIN would get the very first record created for the given id. Please note that this query will tend to be slow, so if this is in a program, id use a cursor with your given SQL statement below and just fetch one record. Hope this help

  6. #5
    Andrew Hamm Guest
    Gerard Lapidario wrote in message <3ACCB34D.B3A57D78@netzero.net>...
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]>SELECT a.id, a.calc_end, a.calc_beg FROM hr_emppay a WHERE a.rowid =[/color]
    [color={usenetquotecolor}]>(SELECT MAX(rowid) FROM hr_emppay b WHERE b.id = a.id)[/color]
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]>This will retrieve the latest record for the given id. Placing MIN would[/color]
    get
    [color={usenetquotecolor}]>the very first record created for the given id.[/color]

    Just a small correction - this will find the physically first or last row,
    but not necessarily the first or last in terms of time. Various factors
    may "naturally disorganise" rows by rowid.

    If you delete a row, another row will later take it's place

    (re)clustering by an index will almost certainly render useless any
    perceived relationship between rowid and the chronological sequence of the
    insertions.

    Probably other reasons exist too %^)

    Anyways, since your choice is arbitrary, an arbitrary result will be
    good enough.

  7. #6
    Austin Power Guest
    This is an interesting topic Andrew. Is there a way for us to determine
    the latest record created on a table if the table doesn't contain a
    DATETIME field?

    Hit me back dude. Thanks.

    Andrew Hamm wrote:

    [color={usenetquotecolor}]> Gerard Lapidario wrote in message <3ACCB34D.B3A57D78@netzero.net>...[/color]
    [color={usenetquotecolor2}]> >[/color]
    [color={usenetquotecolor2}]> >SELECT a.id, a.calc_end, a.calc_beg FROM hr_emppay a WHERE a.rowid =[/color]
    [color={usenetquotecolor2}]> >(SELECT MAX(rowid) FROM hr_emppay b WHERE b.id = a.id)[/color]
    [color={usenetquotecolor2}]> >[/color]
    [color={usenetquotecolor2}]> >This will retrieve the latest record for the given id. Placing[/color]
    [color={usenetquotecolor2}]> >MIN would[/color]
    [color={usenetquotecolor}]> get[/color]
    [color={usenetquotecolor2}]> >the very first record created for the given id.[/color]
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]> Just a small correction - this will find the physically first or last[/color]
    [color={usenetquotecolor}]> row, but not necessarily the first or last in terms of time. Various[/color]
    [color={usenetquotecolor}]> factors may "naturally disorganise" rows by rowid.[/color]
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]> If you delete a row, another row will later take it's place[/color]
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]> (re)clustering by an index will almost certainly render useless any[/color]
    [color={usenetquotecolor}]> perceived relationship between rowid and the chronological sequence of[/color]
    [color={usenetquotecolor}]> the insertions.[/color]
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]> Probably other reasons exist too %^)[/color]
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]> Anyways, since your choice is arbitrary, an arbitrary result will be[/color]
    [color={usenetquotecolor}]> good enough.[/color]

  8. #7
    Andrew Hamm Guest
    Austin Power wrote in message <3AD1C9C1.55A55E38@techie.com>...
    [color={usenetquotecolor}]>This is an interesting topic Andrew. Is there a way for us to determine[/color]
    the
    [color={usenetquotecolor}]>latest record created on a table if the table doesn't contain a DATETIME[/color]
    field?
    [color={usenetquotecolor}]>[/color]
    No - it would be a waste of space and time for the engine to record the
    stamps with every row. You must record your own date & time if you require
    logging. I suggest you don't be misled into thinking that anything more
    accurate than YEAR TO SECOND is relevant, for various reasons.

    I prefer that all records in one transaction get stamped with the exact
    same datetime, because that would make it more believable when you attempt
    to correlate any records that are spilled on the floor. In 4GL, this
    implies writing something like:

    begin work let tran_time = current .... use tran_time to mark all rows
    commit work

    I'm 99% sure that the question which started this thread would be solved
    with stronger normalisation. The question smacks of redundancy.

  9. #8
    Andrew Hamm Guest
    Andrew Hamm wrote in message <9ath00$6l9bi$1@ID-79573.news.dfncis.de>...
    [color={usenetquotecolor}]>>[/color]
    [color={usenetquotecolor}]>No - it would be a waste of space and time for the engine to record the[/color]
    [color={usenetquotecolor}]>stamps with every row. You must record your own date & time if you[/color]
    [color={usenetquotecolor}]>require logging.[/color]

    ^^^^^^^^

    That's a REALLY bad choice of word. I haven't had my morning coffee yet.
    Please read:

    "... if you require TIMESTAMP recording"

    or something like that.

  10. #9
    Austin Power Guest
    Yeah, if there's no internal datetime stamp for the record, I guess the
    next best thing is to create a datetime field or a decimal field (acting
    as serial) to capture the running number for each record creation.

    Thanks Andrew.

    Andrew Hamm wrote:

    [color={usenetquotecolor}]> Austin Power wrote in message <3AD1C9C1.55A55E38@techie.com>...[/color]
    [color={usenetquotecolor2}]> >This is an interesting topic Andrew. Is there a way for us to determine[/color]
    [color={usenetquotecolor}]> the[/color]
    [color={usenetquotecolor2}]> >latest record created on a table if the table doesn't contain a[/color]
    [color={usenetquotecolor2}]> >DATETIME[/color]
    [color={usenetquotecolor}]> field?[/color]
    [color={usenetquotecolor2}]> >[/color]
    [color={usenetquotecolor}]> No - it would be a waste of space and time for the engine to record the[/color]
    [color={usenetquotecolor}]> stamps with every row. You must record your own date & time if you[/color]
    [color={usenetquotecolor}]> require logging. I suggest you don't be misled into thinking that[/color]
    [color={usenetquotecolor}]> anything more accurate than YEAR TO SECOND is relevant, for various[/color]
    [color={usenetquotecolor}]> reasons.[/color]
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]> I prefer that all records in one transaction get stamped with the exact[/color]
    [color={usenetquotecolor}]> same datetime, because that would make it more believable when you[/color]
    [color={usenetquotecolor}]> attempt to correlate any records that are spilled on the floor. In 4GL,[/color]
    [color={usenetquotecolor}]> this implies writing something like:[/color]
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]> begin work let tran_time = current .... use tran_time to mark all rows[/color]
    [color={usenetquotecolor}]> commit work[/color]
    [color={usenetquotecolor}]>[/color]
    [color={usenetquotecolor}]> I'm 99% sure that the question which started this thread would be solved[/color]
    [color={usenetquotecolor}]> with stronger normalisation. The question smacks of redundancy.[/color]

  11. #10
    Andrew Hamm Guest
    Austin Power wrote in message <3AD2F9CE.BBAA193B@techie.com>...
    [color={usenetquotecolor}]>Yeah, if there's no internal datetime stamp for the record, I guess the[/color]
    next
    [color={usenetquotecolor}]>best thing is to create a datetime field or a decimal field (acting as[/color]
    serial)
    [color={usenetquotecolor}]>to capture the running number for each record creation.[/color]
    [color={usenetquotecolor}]>[/color]
    If you are designing a new table, then you can make the timestamps highly
    automated:

    create table thingo ( ins_time datetime year to second default
    current, ..... );

    Note that the default will be used if the insert statement doesn't
    list an entry for the field - it doesn't just kick in if a null is
    assigned. If that's an issue, then you can define an insert trigger on
    the table instead.

    If you also want a mod_time then you can also setup update triggers
    which automagically reassign the mod_time, and a mod_user if you like.
    Insert and update triggers can also be used if you want a more detailed
    audit trail.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •