Results 1 to 9 of 9
  1. #1
    Join Date
    Jul 2004
    Posts
    6

    Unanswered: Duplicate Counts per column

    I'm have a problem with trying to generate a view that has a count of duplicates values per column in a table.

    example I have a table with the following structure:

    Code:
    CREATE TABLE [dbo].[TestDpln](
    	[CountryCode] [smallint] NOT NULL DEFAULT ((0)),
    	[NPA] [smallint] NOT NULL DEFAULT ((0)),
    	[NXX] [smallint] NOT NULL DEFAULT ((0)),
    	[XXXX] [smallint] NOT NULL DEFAULT ((0)),
    	[3-Digit] [nvarchar](3) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    	[4-Digit] [nvarchar](4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    	[SiteIdx] [int] NULL DEFAULT ((0)),
     CONSTRAINT [TestDpln$PrimaryKey] PRIMARY KEY NONCLUSTERED 
    (
    	[CountryCode] ASC,
    	[NPA] ASC,
    	[NXX] ASC,
    	[XXXX] ASC
    )WITH (PAD_INDEX  = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
    ) ON [PRIMARY]
    the data loaded looks like this

    Code:
    INSERT INTO dbo.TestDpln VALUES('1','312','555','1212','212','1212','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','555','1213','213','1213','1')
    INSERT INTO dbo.TestDpln VALUES('1','525','555','2212','212','2212','2')
    INSERT INTO dbo.TestDpln VALUES('1','525','555','2213','213','2213','2')
    the results I need are

    Code:
    SiteIdx      3 Digit Count      4 Digit Count
    -------     -------------     -------------
       1              2                      0
       2              2                      0
    I've tried various queries the closest being:

    Code:
    SELECT DISTINCT SiteIdx, COUNT(*) as "3-Digit Count"
    	FROM dbo.TestDpln
    	GROUP BY SiteIdx, "3-Digit"
    	HAVING COUNT(*) > 1
    but it only shows one column and one site I'm not sure how to get the '4 Digit Count' column to show up and the rest of the sites. below are the results I get so far.

    Code:
    SiteIdx      3 Digit Count 
    -------     -------------
       1              2
    any help would be great.
    Thanks
    Mike

  2. #2
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Try:
    COUNT(DISTINCT [YourColumn])
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  3. #3
    Join Date
    Jul 2004
    Posts
    6
    That did not work I was thinking it would need to be some sort of select statement run on each column then grouped by the siteidx then a count done by site. I'm just not sure how to write it.

  4. #4
    Join Date
    May 2005
    Location
    South Africa
    Posts
    1,365
    Provided Answers: 1
    Your data and required output doesn't agree
    Code:
                                                           3d    4d     idx
    INSERT INTO dbo.TestDpln VALUES('1','312','555','1212','212','1212','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','555','1213','213','1213','1')
    No duplicates here but you want                        2     0       1
    
    INSERT INTO dbo.TestDpln VALUES('1','525','555','2212','212','2212','2')
    INSERT INTO dbo.TestDpln VALUES('1','525','555','2213','213','2213','2')
    and no duplicates here but you want                    2     0       2
    And then you also ask how to display the rest of the sites
    So do you want all zeros returned if a siteidx has no duplicates?

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    How do you expect to get 4-Digit counts of zero from the data you supplied, which obviously contains multiple 4-Digit count values?
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  6. #6
    Join Date
    Jul 2004
    Posts
    6
    Sorry for being so unclear it was a long day and not enough coffee

    First the piece that I forgot to put in is that there is an input string that we pass in and are searching for such as '51212' this string is then broken down in to 3 digits from the right giving the search string for the 3 digit column of 212 and then four digits from the right giving the search string for the four digit column of 1212

    so with this information and some updated data this is what I'm looking for

    Code:
    Input string 51212
                                                            3d    4d     Site
    INSERT INTO dbo.TestDpln VALUES('1','312','555','1212','212','1212','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','555','1213','213','1213','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','533','1212','212','1212','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','533','1213','213','1213','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','577','2212','212','2212','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','577','2213','213','2213','1')
    INSERT INTO dbo.TestDpln VALUES('1','525','555','2212','212','2212','2')
    INSERT INTO dbo.TestDpln VALUES('1','525','555','2213','213','2213','2')
    3 digit 212 duplicates would be                         4
    4 digit 1212 duplicates would be                               2
    
    But the final output should look like 
    
    SiteIdx       3digit    4digit
    -------      ------   ------
    1               4         2

  7. #7
    Join Date
    Jul 2004
    Posts
    6
    I hope this example is better

  8. #8
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    You sure pick some bad column names.
    Code:
    set nocount on
    CREATE TABLE [dbo].[TestDpln](
    	[CountryCode] [smallint] NOT NULL DEFAULT ((0)),
    	[NPA] [smallint] NOT NULL DEFAULT ((0)),
    	[NXX] [smallint] NOT NULL DEFAULT ((0)),
    	[XXXX] [smallint] NOT NULL DEFAULT ((0)),
    	[3-Digit] [nvarchar](3) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    	[4-Digit] [nvarchar](4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    	[SiteIdx] [int] NULL DEFAULT ((0)),
     CONSTRAINT [TestDpln$PrimaryKey] PRIMARY KEY NONCLUSTERED 
    (
    	[CountryCode] ASC,
    	[NPA] ASC,
    	[NXX] ASC,
    	[XXXX] ASC
    ))
    
    
    INSERT INTO dbo.TestDpln VALUES('1','312','555','1212','212','1212','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','555','1213','213','1213','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','533','1212','212','1212','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','533','1213','213','1213','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','577','2212','212','2212','1')
    INSERT INTO dbo.TestDpln VALUES('1','312','577','2213','213','2213','1')
    INSERT INTO dbo.TestDpln VALUES('1','525','555','2212','212','2212','2')
    INSERT INTO dbo.TestDpln VALUES('1','525','555','2213','213','2213','2')
    
    declare	@InputString char(5)
    set	@InputString = '51212'
    
    select	SiteIDx,
    	sum(case when [3-Digit] = right(@InputString, 3) then 1 else 0 end) as '3digit',
    	sum(case when [4-Digit] = right(@InputString, 4) then 1 else 0 end) as '4digit'
    from	[dbo].[TestDpln]
    group by SiteIDx
    
    drop table [dbo].[TestDpln]
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  9. #9
    Join Date
    Jul 2004
    Posts
    6
    That worked great a million thanks to you I never would have thought about using sum for this.
    P.S.
    I know the column names are bad I inherited this from someone who was trying to use MSACCESS for this data and now I need to really make it work on MSSQL we are redoing the whole schema as we build this.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •