Results 1 to 8 of 8
  1. #1
    Join Date
    Jan 2008
    Location
    Poland
    Posts
    31

    Unanswered: Need help with transfer many records between tables

    Hi!
    I have 2 tables (both have the same structure):
    ID -> bigint (identity, not for replication, primary key)
    Url -> nvarchar(1000)
    MainUrl -> nvarchar(1000)

    Tbl1 cantains about 0,5 mln records, and tbl2 - 1 mln.
    What I need, is to copy records from tbl2 to tbl1. But records in tbl1 are unique, and it can't change. (Unique must be only "Url"; (and ID, but it's automatic)). How can I do this in fast way? Now I'm using SELECT for each record in tbl2 to see if it exist in tbl1. But it's a bit slow... Is there any faster method? (One thing: I'm beginner in databeses, so I'm wrote VB application to transfer records. How can I do it using only Microsoft Sql server?)
    --------------
    I'm forgot to write, I'm using MsSql 2005.
    Last edited by Ivenesco; 01-27-08 at 11:54.

  2. #2
    Join Date
    Nov 2004
    Location
    on the wrong server
    Posts
    8,835
    Provided Answers: 6
    well instead of doing a one record at a time move you might want to use a set based operation to do it all at once. you can use a SELECT with an insert statement like so...

    INSERT MyTable(myCol,COL2)
    SELECT MyCol,COL3
    FROM MyOtherTable
    WHERE ....

    In your case you are going to want to also look UP IDENTITY INSERT in SQL Server Books Online because that will be useful inserting those nasty identity values you say you have and you will also probably have to add a where condition to prevent duplicates.
    “If one brings so much courage to this world the world has to kill them or break them, so of course it kills them. The world breaks every one and afterward many are strong at the broken places. But those that will not break it kills. It kills the very good and the very gentle and the very brave impartially. If you are none of these you can be sure it will kill you too but there will be no special hurry.” Earnest Hemingway, A Farewell To Arms.

  3. #3
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I take it that you're doing some tricky data cleansing, and that many flavors of duplicates can exist in Tbl2. I also assume that you want consistantly repeatable runs more than you want fresh data, since any given Url ought to only transfer one time. Based on those assumptions, I'd suggest something like:
    Code:
    INSERT INTO [Tbl1] (
       [ID], [Url], [MainUrl]
       ) SELECT
          b.[ID], b.[URL], b.[MainUrl]
          FROM [Tbl2] AS b
          WHERE
             NOT EXISTS (SELECT c.*			-- Ignore any rows with this Url
                FROM [Tbl1] AS c			--    that are already in Tbl1
                WHERE  c.[Url] = b.[Url])
             AND b.[ID] = (SELECT Min(d.[ID])	-- Take only the lowest ID row
                FROM [Tbl2] AS d			--    for this Url from Tbl2
                WHERE  d.[Url] = b.[Url])
    -PatP

  4. #4
    Join Date
    Jan 2008
    Location
    Poland
    Posts
    31
    Thanks for fast reply! This code looks like what I need. I'll test it soon

  5. #5
    Join Date
    Jan 2008
    Location
    Poland
    Posts
    31
    Ok, done It takes more than 1h, but it's nice time for more than 1 mln records. But there is new problem: in Tbl2 some of records (Url's) aren't unique, and finally, in Tbl1 not all records are unique now. How Can delete repeated records?

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    The query that I provided you will only produce unique rows in Tbl1, it will never produce rows with duplicate Url values, ever.

    This is "air code", there is no way for me to test it. Make a backup before you run it!
    Code:
    --  ptp  20080126  See http://www.dbforums.com/showthread.php?p=6318379#post6318379
    
    CREATE INDEX XIE01Tbl1 ON [Tbl1] (Url)		-- Index to speed up queries
    GO
    
    CREATE INDEX XIE01Tbl2 ON [Tbl2] (Url)		-- Index to speed up queries
    GO
    
    CREATE TABLE Tbl1DupeRows(
       [ID]			BIGINT		NOT NULL
    ,  [Url]		NVARCHAR(1000)	NOT NULL
    ,  [MainUrl]		NVARCHAR(1000)	NOT NULL
       )
    
    INSERT INTO Tbl1DupeRows (
       [ID], [Url], [MainUrl]
       ) SELECT
          a.[ID], a.[Url], a.[MainUrl]
          FROM [Tbl1] AS a
          WHERE 1 < (SELECT Count(*)		-- Copy all Tbl1 rows with
             FROM [Tbl1] AS b
             WHERE  b.[Url] = a.[Url])		-- duplicate Urls
    
    DELETE FROM [Tbl1]
       WHERE EXISTS (SELECT *			-- Delete Tbl1 rows with
          FROM [Tbl1] AS c
          WHERE  c.[Url] = [Tbl1].[Url]		--  duplicate Url
             AND c.[ID] < [Tbl1].[ID])		--  and higher ID value
    
    DELETE FROM [Tbl1]				-- Delete Tbl1 rows with
       WHERE 1 < (SELECT Count(*)
          FROM [Tbl1] AS d
          WHERE  d.[Url] = [Tbl1].[Url]		-- duplicate Url
             AND d.[ID] = [Tbl1].[ID])		-- and duplicate ID
    
    INSERT INTO [Tbl1] (				-- Re-run original query
       [ID], [url], [MainUrl]			-- to repopulate, but
       ) SELECT DISTINCT				-- screen out full duplicates
          b.[ID], b.[url], b.[MainUrl]
          FROM [Tbl2] AS b
          WHERE
             NOT EXISTS (SELECT c.*			-- Ignore any rows with this Url
                FROM [Tbl1] AS c			--    that are already in Tbl1
                WHERE  c.[url] = b.[url])
             AND b.[ID] = (SELECT Min(d.[ID])	-- Take only the lowest ID row
                FROM [Tbl2] AS d			--    for this Url from Tbl2
                WHERE  d.[url] = b.[url])
    -PatP

  7. #7
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    One minor retraction from that last post of mine... The original query might produce duplicate rows for the case of duplicate rows having the same ID and Url values (which will never happen naturally with an IDENTITY value, but are possible with manual intervention).

    -PatP

  8. #8
    Join Date
    Jan 2008
    Location
    Poland
    Posts
    31
    Quote Originally Posted by Pat Phelan
    The query that I provided you will only produce unique rows in Tbl1, it will never produce rows with duplicate Url values, ever.
    It will If you have duplicates in tbl2 it'll copy it to tbl1. I suspect; it's beceuse it search for duplicate only in original tbl1. It doesn't care of added records (most probably it add whole rows at the end of query, after all processing).

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •