Results 1 to 3 of 3
  1. #1
    Join Date
    Jul 2008
    Posts
    2

    Unanswered: Query help on duplicates

    Hi, SQL newbie here requesting some help.
    So, I'm currently working in a web-based database (not made by me) that is limited to only accepting SELECT statments. The database contains many thousands of emails and their attachments.
    Basically, after poking around, I've been able to construct a basic idea of how the database is connected that is relevant to the query. Here's how it looks ---

    Table:
    tItems

    Columns:
    ThreadID (only email messages have these)
    ItemID (given to all emails and attachments, and where ThreadID = ItemID for emails)
    Subject (only given to email subjects)
    Date
    FileType ('Email' and 'File')
    Filename

    Also, I'm limited by the fact that the application only allows SQL queries that return an ID. So the query must start out as:
    Code:
    SELECT DISTINCT(tItems.ItemID) AS ID 
    FROM tblItem
    WHERE
    ...
    What I want to do is to write a query that will check for all similar subject headings and return only the most recent version of the email.
    So for example, if the table were to have the following data:


    Code:
    ItemID    ThreadID     Subject         Filename        FileType         Date
    -------   ---------    -----------    -----------      ------------     ------- 
    1         1            Hello          Hello.msg        EMAIL            1/1/08
    2         2            Re: Hello      Re Hello.msg     EMAIL            1/3/08
    3         -            -              Hello.ppt        FILE             1/3/08
    4         4            Fwd: Hello     Fwd Hello.msg    EMAIL            1/4/08
    5         -            -              Hello.xls        FILE             1/5/08
    6         6            Hi             Hi.msg           EMAIL            1/7/08
    The query should return:
    Code:
    ItemID
    ------- 
    4
    6
    So far everything I've been trying have given me no ItemID results and is also taking forever to run...is a query like this possible with the restrictions I have? I'm pretty sure this would be alot easier if I wasn't only confined to writing SELECT statements...but there's nothing I can do about it, which is really frustrating.

    I appreciate any comments/help that you all have.

  2. #2
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    Accuracy of a result, no matter what you come up with, is gonna be very fuzzy, because based on what you've given us, you can only go by a subject. So if I send 2 or more different emails with the same subject, you'll be picking up only the last one as "the most recent". This leads to a gross ambiguity of email identification, and will ultimately result in enourmous inaccuracy of your data.
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  3. #3
    Join Date
    Jul 2008
    Posts
    2
    Yeah, I figured that would be a problem. Oh well, thanks anyways.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •