Results 1 to 12 of 12
  1. #1
    Join Date
    Apr 2003
    Posts
    176

    Unanswered: Duplicate strings

    I have a column called professional_degrees in my table
    and it has the following data

    professional_degrees

    bs,ms,mba
    bs,mba,
    bs,ms,bs,ms
    mba,mba
    bs,mba

    In the above data u can see some of the degrees are repeated.So how can i find them and delete them from the table.

    Thanks.

  2. #2
    Join Date
    Jul 2002
    Location
    Village, MD
    Posts
    621
    You have to normalize you data if you can - otherwise use charindex and replace functions for updating. I do not envy you in last case.

  3. #3
    Join Date
    Apr 2003
    Posts
    176
    I was talking abt the duplicates in the same string,not the duplicate recs in database.

  4. #4
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    Do you know if they all are comma delimited?

    In any event, you'll have to do something like

    Parse out every row, 1 by 1, into n rows into a temp table
    Then SELECT Distinct
    Then INSERT into a new table
    The UPDATE the original

    probably need a function for the parsing...

    just a guess...
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  5. #5
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Ok, borrowing from a previous post, you can get the parser. You could then create a "glue distinct" function something like:
    Code:
    CREATE FUNCTION dbo.fGlue(@original VARCHAR(8000), @splitter VARCHAR(8000))
    RETURNS VARCHAR(8000) AS BEGIN
    
    DECLARE @c		VARCHAR(8000)
    
    SELECT @c = Coalesce(@c + @splitter + item, item)
       FROM dbo.fSplit(@original, @splitter)
       GROUP BY item
    
    RETURN @c
    END
    GO
    
    SELECT dbo.fGlue('bs,ms,phd,ms,ms,ms,MCSE', ',')
    -PatP

  6. #6
    Join Date
    Jul 2002
    Location
    Village, MD
    Posts
    621
    Quote Originally Posted by bruce_Reid
    I was talking abt the duplicates in the same string,not the duplicate recs in database.
    A database is (technically) normalized if it has no repeating groups - all data in fields are atomic.

  7. #7
    Join Date
    Apr 2003
    Posts
    176
    SELECT @c = Coalesce(@c + @splitter + item, item)
    FROM dbo.fSplit(@original, @splitter)
    GROUP BY item


    I couldn't understand what this query does,partularly with the following line

    FROM dbo.fSplit(@original, @splitter)

    WHy did u put @original,@splitter in the braces after the table name.What does ti do.
    Thanks.

  8. #8
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    dbo.fSplit is the table-valued function that was from the previous posting the I referenced. You need them both (dbo.fSplit and dbo.fGlue) to make this work.

    -PatP

  9. #9
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    I answered something similar over here

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=35121
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  10. #10
    Join Date
    Apr 2003
    Posts
    176
    SELECT dbo.fGlue('bs,ms,phd,ms,ms,ms,MCSE', ',')

    WHEn i run the above query it returned the value 'bs,MCSE,ms,phd'
    But I want that to be as 'bs,ms,phd,mcse'

    Thanks.

  11. #11
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Can you give me every possible notation you might use, in the order that you want to see them presented?

    -PatP

  12. #12
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Better yet, here's a solution that will alllow you to specify which degrees you want presented, and the order that you want them presented! Just put them into the tGlue table, and set the seq column values based on the order that you want the degrees to present. Any degrees not listed in tGlue will be listed in alphabetical order after the degrees that are listed in tGlue.
    Code:
    DROP FUNCTION dbo.fSplit
    GO
    --  ptp  20040507  Split a string on a separator, like VB
    
    CREATE FUNCTION dbo.fSplit(
       @pcString	VARCHAR(8000)
    ,  @pcSep	VARCHAR(8000) = ','
       ) RETURNS @r TABLE (
       item		VARCHAR(8000)
       ) AS
    BEGIN
    
       DECLARE @i		INT
    
       SET @i = CharIndex(@pcSep, @pcString)
    
       WHILE 0 <> @i
          BEGIN
             INSERT INTO @r (item) SELECT Left(@pcString, @i - 1)
             SET @pcString = SubString(@pcString, @i + Len(@pcSep), 8000)
             SET @i = CharIndex(@pcSep, @pcString)
          END
    
       INSERT INTO @r (item) SELECT @pcString
    
       RETURN
    END
    GO
    DROP TABLE tGlue
    GO
    
    CREATE TABLE tGlue(
       seq			INT
       CONSTRAINT XPKtGlue PRIMARY KEY (seq)
    ,  abbrev		VARCHAR(900)
       CONSTRAINT XAK01tGlue UNIQUE (abbrev)
    )
    
    INSERT INTO tGlue (seq, abbrev)		-- you need to decide what
       SELECT              1000, 'aa'	-- abbreviations you want
          UNION ALL SELECT 1100, 'bs'	-- and what sequence you
          UNION ALL SELECT 1200, 'ba'	-- want them presented
          UNION ALL SELECT 1300, 'ms'
          UNION ALL SELECT 1400, 'ma'
          UNION ALL SELECT 1500, 'mba'
          UNION ALL SELECT 1600, 'phd'
    GO
    DROP FUNCTION dbo.fGlue
    GO
    CREATE FUNCTION dbo.fGlue(@original VARCHAR(8000), @splitter VARCHAR(8000))
    RETURNS VARCHAR(8000) AS BEGIN
    
    DECLARE @c		VARCHAR(8000)
    
    SELECT @c = Coalesce(@c + @splitter + item, item)
       FROM (SELECT TOP 100 PERCENT item, Coalesce(seq, 2147483647) AS s2
          FROM dbo.fSplit(@original, @splitter)
          LEFT OUTER JOIN tGlue
             ON (abbrev = item)
          GROUP BY item, seq
          ORDER BY 2, 1) AS a
    
    RETURN @c
    END
    GO
    
    SELECT dbo.fGlue('bs,ms,phd,ms,ms,ms,MCSE', ',')
    -PatP

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •