Results 1 to 9 of 9
  1. #1
    Join Date
    Jun 2009
    Posts
    4

    Unanswered: Multiple records to validate data

    Hi,

    I have a situation where there are containers being delivered and returned at different locations. At the delivery location they are recorded in a delivery sheet and at the return location they are recorded in a return sheet. I decided to design a simple database to track this. Below are the tables I came up with and the results of the SQL query.

    The problem I have now is to fine tune the query. If you take container AAA111 it was delivered on the 1st of January, and the 1st of May. Only two of these records are correct both indicating days out at 31. The other two are incorrect. How do I correct this. There errors associated with the other two containers as well. How do I correct this to show the containers delivered and returned correctly and to show containers that are delivered and not returned as still outstanding. Thanks for your help




    Code:
    TBL_Container
    Container	Owner
    AAA111	       	Alice
    BBB222	       	Bob
    CCC333	       	Charles
    EEE444	       	Estate
    
    
    
    
    
    TBL_DateDelivered
    
    ID	Container	Farmer	DateDelivered
    1	AAA111	   	Alice	01/01/2009
    2	BBB222	   	Bob	01/05/2009
    3	CCC333	   	Charles	01/10/2009
    4	EEE444	   	Alice	03/01/2009
    5	BBB222	   	Bob	03/05/2009
    6	CCC333	   	Charles	03/10/2009
    7	AAA111	   	Alice	05/01/2009
    8	BBB222	   	Bob	05/05/2009
    
    
    TBL_DateReturned
    
    ID	Container   	Farmer	    DateReturned
    1	AAA111		Alice	    02/01/2009
    2	BBB222	   	Bob	    02/05/2009
    3	CCC333	   	Charles	    02/10/2009
    5	BBB222	   	Bob	    04/05/2009
    6	CCC333	   	Charles	    04/10/2009
    7	AAA111	   	Alice	    06/01/2009
    9	CCC333	   	Charles	    06/02/2009
    
    
    Results of SQL Query
    
       		TBL_			TBL_		
    		Date			Date
    		Delivered		Returned
    Container Owner	.Farmer	DateDeliverd	.Farmer	DateReturned	DaysOut
    
    AAA111	Alice	Alice	05/01/2009	Alice	02/01/2009	-89
    AAA111	Alice	Alice	01/01/2009	Alice	06/01/2009	151
    AAA111	Alice	Alice	01/01/2009	Alice	02/01/2009	31
    AAA111	Alice	Alice	05/01/2009	Alice	06/01/2009	31
    BBB222	Bob	Bob	01/05/2009	Bob	02/05/2009	31
    BBB222	Bob	Bob	05/05/2009	Bob	04/05/2009	-30
    BBB222	Bob	Bob	05/05/2009	Bob	02/05/2009	-89
    BBB222	Bob	Bob	03/05/2009	Bob	04/05/2009	31
    BBB222	Bob	Bob	01/05/2009	Bob	04/05/2009	90
    BBB222	Bob	Bob	03/05/2009	Bob	02/05/2009	-28
    CCC333	Charles	Charles	03/10/2009	Charles	06/02/2009	84
    CCC333	Charles	Charles	03/10/2009	Charles	04/10/2009	31
    CCC333	Charles	Charles	03/10/2009	Charles	02/10/2009	-28
    CCC333	Charles	Charles	01/10/2009	Charles	06/02/2009	143
    CCC333	Charles	Charles	01/10/2009	Charles	04/10/2009	90
    CCC333	Charles	Charles	01/10/2009	Charles	02/10/2009	31
    EEE444	Estate	Alice	03/01/2009


    SELECT TBL_Container.Container, TBL_Container.Owner, TBL_DateDelivered.Farmer, TBL_DateDelivered.DateDelivered, TBL_DateReturned.Farmer, TBL_DateReturned.DateReturned, DateDiff('d',TBL_DateDelivered!DateDelivered,TBL_D ateReturned!DateReturned) AS DaysOut
    FROM (TBL_Container LEFT JOIN TBL_DateDelivered ON TBL_Container.Container=TBL_DateDelivered.Containe r) LEFT JOIN TBL_DateReturned ON TBL_Container.Container=TBL_DateReturned.Container ;

  2. #2
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    splitreverse, The problem is that the Delivered and Return table both have 2 rows with Container AAA111. When you join the tables on the non-unique column you get 4 rows (2 * 2). You need something to remove this duplication. I can think of two ways to accomplish this (there may be more).

    1) Add a 'trip-id' to each table. This would indicate that the Delivered / Returned are the same 'round-trip'.

    Delivered
    AAA111 1 2009-01-01
    AAA111 2 2009-05-01

    Returned
    AAA111 1 2009-02-01
    AAA111 2 2009-06-01

    Then you could join on:

    Delivered D
    LEFT OUTER JOIN
    Returned R
    ON D.Container = R.Container
    AND D.Trip_ID = R.Trip_ID

    (By the way, I noticed that you join Container to Delivered and Container to Returned. It might make more sense to join Container to Delivered to Returned. I don't think you can have a Returned Container unless it has been Delivered first).

    2) Change your table design and just have 2 tables. Container and Tracking. Tracking would be the Delivered and Returned tables combined. These tables have identical data except one has a Delivered date and one has a Returned date.

    Tracking

    ID
    Container
    Farmer
    Deliver_Date
    Returned_Date

    Then when you join Container to Tracking it would just be:

    Container C
    Left Outer Join
    Tracking T
    On C.Container = T.Container

    You can generate either Delivered Reports or Returned reports easily enough if you need them separated.

  3. #3
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    I'm a big fan of idea #2 - this normalises your data too (which is related to the bit about "these tables have identical data").
    Testimonial:
    pootle flump
    ur codings are working excelent.

  4. #4
    Join Date
    Jun 2009
    Posts
    4
    Stealth,

    thanks for your response but here is my difficulty. The delivery sheets have all been populated for years and the return sheets likewise. Now there is an effort to track these containers from past records. Moving forward I believe your approach would be the best approach but with the mess I am faced with I need to find someway to reconcile this. I can't manually go and put transaction fields in delivery records and return records. The query results are close but I need to prune it. Note the data might not be clean either. There might be returned dates without deliveries. A farmer brings his container in and it was somehow lost on the estate and never delivered. He would therefore have a claim. To track this now we have to manually look at the return sheets and see if there is a corresponding delivery sheet with a date greater than the returned date. Based on the Query I just need to clean it and prepare a new results removing the erroneous results.

    If we just look at the results for Alice. I would like some way to remove the row with -89 days out based on the fact that there is a record with the same Datedelivered and the -ve numbers in this case would mean the record is erroneous. If however there was no other record with the same date it would simply mean that the container wasn't delivered and the container should be on the estate. In order to do that I believe you would need to compare all records with the same date delivered and select a record based on the logic I have described. Is that possible at all?

    Code:
       		TBL_			TBL_		
    		Date			Date
    		Delivered		Returned
    Container Owner	.Farmer	DateDeliverd	.Farmer	DateReturned	DaysOut
    
    AAA111	Alice	Alice	05/01/2009	Alice	02/01/2009	-89
    AAA111	Alice	Alice	01/01/2009	Alice	06/01/2009	151
    AAA111	Alice	Alice	01/01/2009	Alice	02/01/2009	31
    AAA111	Alice	Alice	05/01/2009	Alice	06/01/2009	31

  5. #5
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    splitreverse, I can't ensure there are no syntax errors but I think this will work:
    Code:
    SELECT C.Container
         , C.Owner
         , D.Farmer
         , D.DateDelivered
         , R.Farmer
         , R.DateReturned
         , DateDiff('d',D.DateDelivered,R.DateReturned) AS DaysOut
    FROM TBL_Container C
           LEFT JOIN 
           TBL_DateDelivered D
             ON C.Container = D.Container
             LEFT JOIN  
               TBL_DateReturned R
                 ON D.Container = R.Container
       , (SELECT D.Container
               , D.DateDelivered
               , MIN(DateDiff('d',D.DateDelivered,R.DateReturned)) AS MinDaysOut
          FROM TBL_DateDelivered D
                 LEFT JOIN  
                   TBL_DateReturned R
                     ON D.Container = R.Container
          GROUP BY D.Container, D.DateDelivered
         ) as MinDays
    WHERE C.Container = MinDays.Container
      AND D.DateDelivered = MinDays.DateDelivered
      AND DateDiff('d',D.DateDelivered,R.DateReturned) = MinDays.MinDaysOut
    Basically it works on the theory that for any given Container, DateDelivered combination, the Minimum amount of Days Out is the correct row.

    I am using a Nested Table (might be called Derived Table in Microsoft's world) to calculate the Minimum date and using that to Join back to your original query using the Minimum date to filter out the correct rows.

    You might be able to use this to covert you historical data into a new design. I am not sure you would want to run this all of the time (performance may be be very good).

  6. #6
    Join Date
    Jun 2009
    Posts
    4
    Stealth,

    I see where you're going with this but unfortunately it wouldn't work. Let us look at just the Alice data for example. The MinDaysOUt column would return the 05/01/2009 date delivered with the 02/01/2009 cause the difference is -89 instead of the 05/01/2009 with the 06/01/2009 data. There are cases when I would want to return -ve MinDaysOut especially if there were no return dates greater than the last delivery date. That situation would indicate the container is still with the farmer. Thanks for your help thus far.

  7. #7
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    In that case add the line in red to the nested table:
    Code:
    (SELECT D.Container
               , D.DateDelivered
               , MIN(DateDiff('d',D.DateDelivered,R.DateReturned)) AS MinDaysOut
          FROM TBL_DateDelivered D
                 LEFT JOIN  
                   TBL_DateReturned R
                     ON     D.Container = R.Container
                        AND D.DeliveredDate <= R.DateReturned
          WHERE
          GROUP BY D.Container, D.DateDelivered
         ) as MinDays
    This will filter out all the 'impossible' DateReturned before DateDelivered.

  8. #8
    Join Date
    Jun 2009
    Posts
    4
    Stealth

    Thanks again but this doesn't work. The syntax creates a problem but I figured you were suggesting that I use subqueries. In MS Access, I first created a table of the data called trans with a primary key as only one record could be returned from the subquery and that was ordered according to the primary key. I believe the table Trans would simplify things. I then ran a query on trans using the syntax of subquery to return previous rows daysout. If you look closely at the results, I just need someway to filter the data based on Days out and prior value. Maybe this might be easier accomplished with VB programming. What do you think?

    Code:
    Container	Owner		Lender	DateDelivered	Returnee	DateReturned	DaysOUt	PriorValue
    AAA111	           Alice	Alice	01/01/2009	Alice	       02/01/2009	31	
    AAA111	           Alice	Alice	01/01/2009	Alice	       06/01/2009	151	31
    AAA111	           Alice	Alice	05/01/2009	Alice	       02/01/2009	-89	
    AAA111	           Alice	Alice	05/01/2009	Alice	       06/01/2009	31	-89
    BBB222	             Bob	Bob	01/05/2009	Bob	       02/05/2009	31	
    BBB222	             Bob	Bob	01/05/2009	Bob	       04/05/2009	90	31
    BBB222	             Bob	Bob	03/05/2009	Bob	       02/05/2009	-28	
    BBB222	             Bob	Bob	03/05/2009	Bob	       04/05/2009	31	-28
    BBB222	             Bob	Bob	05/05/2009	Bob	       02/05/2009	-89	
    BBB222	             Bob	Bob	05/05/2009	Bob	       04/05/2009	-30	-89
    CCC333	          Charles	Charles	01/10/2009	Charles	       02/10/2009	31	
    CCC333	          Charles	Charles	01/10/2009	Charles	       04/10/2009	90	31
    CCC333	          Charles	Charles	01/10/2009	Charles	       06/02/2009	143	31
    CCC333	          Charles	Charles	03/10/2009	Charles	       02/10/2009	-28	
    CCC333	          Charles	Charles	03/10/2009	Charles	       04/10/2009	31	-28
    CCC333	          Charles	Charles	03/10/2009	Charles	       06/02/2009	84	-28
    EEE444	          Estate	Alice	03/01/2009

    SELECT Trans.Container, Trans.Owner, Trans.Lender, Trans.DateDelivered, Trans.Returnee, Trans.DateReturned, Trans.DaysOUt, (Select Top 1 Dupe.DaysOut
    From Trans AS Dupe
    Where
    Trans.Container = Dupe.Container AND
    Trans.Owner = Dupe.Owner AND
    Trans.Lender = Dupe.Lender AND
    Trans.DateDelivered = Dupe.DateDelivered AND
    Trans.DaysOut > Dupe.DaysOut

    Order by Dupe.DateDelivered ASC, Dupe.ID ) AS PriorValue
    FROM Trans;

  9. #9
    Join Date
    May 2009
    Posts
    508
    Provided Answers: 1
    It might be easier in VBA but I think this will still work. In case there was any confusion, my last example was just to show were the new line was to be added to Nested Table (or Derived Table in Microsoft terminology). I just showed Nested Table to make the post smaller. In case that was the problem, here is the entire query with the new line added:
    Code:
    SELECT C.Container
         , C.Owner
         , D.Farmer
         , D.DateDelivered
         , R.Farmer
         , R.DateReturned
         , DateDiff('d',D.DateDelivered,R.DateReturned) AS DaysOut
    FROM TBL_Container C
           LEFT JOIN 
           TBL_DateDelivered D
             ON C.Container = D.Container
             LEFT JOIN  
               TBL_DateReturned R
                 ON D.Container = R.Container
       , (SELECT D.Container
               , D.DateDelivered
               , MIN(DateDiff('d',D.DateDelivered,R.DateReturned)) AS MinDaysOut
          FROM TBL_DateDelivered D
                 LEFT JOIN  
                   TBL_DateReturned R
                     ON     D.Container = R.Container
                        AND D.DeliveredDate <= R.DateReturned
          GROUP BY D.Container, D.DateDelivered
         ) as MinDays
    WHERE C.Container = MinDays.Container
      AND D.DateDelivered = MinDays.DateDelivered
      AND DateDiff('d',D.DateDelivered,R.DateReturned) = MinDays.MinDaysOut
    There may be a syntax error as Access SQL is not something I use very much. I still think this should work but without data to test with, I can't be positive.

    However, I have tried a little different approach below. I stripped out all the nonessential columns leaving just CONTAINER, DATEDELIVERED, and DATERETURNED. If this works, you can make it a Nested (derived) table and join it with the CONTAINER table to get the other columns.
    Code:
    SELECT T1.CONTAINER
         , T1.DATEDELIVERED
         , T2.DATERETURNED
         , DATEDIFF('d',T1.DATEDELIVERED, 'T1.DATERETURNED) AS DaysOut
    FROM (SELECT D.CONTAINER
               , D.DATEDELIVERED
               , R.DATERETURNED
          FROM TBL_DATEDELIVERED D
                 LEFT OUTER JOIN
                 TBL_DATERETURNED  R
                   ON D.CONTAINER = R.CONTAINER
         ) AS T1
    WHERE T1.DATEDELIVERED > T1.DATERETURNED
      AND T1.DATERETURNED  = (SELECT MIN(T2.DATERETURNED
                              FROM (SELECT D1.CONTAINER
                                         , D1.DATEDELIVERED
                                         , R1.DATERETURNED
                                    FROM TBL_DATEDELIVERED D1
                                      LEFT OUTER JOIN
                                      TBL_DATERETURNED  R1
                                        ON D1.CONTAINER = R1.CONTAINER
                                   ) AS T2
                              WHERE T2.DATEDELIVERED > T2.DATERETURNED
                                    T1.DATEDELIVERED = T2.DATEDELIVERED
                                AND T1.CONTAINER     = T2.CONTAINER
                             )
    What this should do is create a result set that is equivalent to Delivered and Returned combined (like the suggested design change).

    The nested table T1 does what your original query did (with out table Container). It uses a Left Outer Join to combine DateDeliverd and DateReturned resulting in the extraneous rows. This same SQL is used in the Subquery to get the MIN DATERETURNED value.
    Container DateDeliverd DateReturned

    AAA111 05/01/2009 02/01/2009
    AAA111 01/01/2009 06/01/2009
    AAA111 01/01/2009 02/01/2009
    AAA111 05/01/2009 06/01/2009
    From this Nested table T1, CONTAINER, DATEDELIVERED and DATERETURNED are selected.

    The Where clause filters out rows where a Container was Returned before it was Delivered (impossible). This eliminates the first row of the example data.

    Then a Correlated Subquery is used to find the MIN DATERETURNED for each distinct CONTAINER/DATEDELIVERD combination.
    For the Second and Third rows (AAA111 / 01/01/2009) this would return 02/01/2009.
    For the Fourth row (AAA111 / 05/01/2009) this would return 06/01/2009. There is only one of these rows as the other row (the First row) was already filtered out by the 'impossible' condition. This should leave you with the results (using your original data sample) of:
    Container DateDeliverd DateReturned

    AAA111 01/01/2009 02/01/2009
    AAA111 05/01/2009 06/01/2009

    BBB222 01/05/2009 02/05/2009
    BBB222 03/05/2009 04/05/2009

    CCC333 03/10/2009 04/10/2009
    CCC333 01/10/2009 02/10/2009

    EEE444 03/01/2009
    The only one I am not sure about is EEE444 as it would have a NULL for DateReturned due to the Left Outer Join.

    Try this and see what happens. I hope it works. If it does, you should be able to join the results to CONTAINER. Or if it would be easier, you can write the results to a table and make the join SQL less complicated.

    PS If it does work, you could use it to convert all you all 2 table data to a single table and make this a lot easier.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •