I bcp'd data into a database with the SQL_Latin1_General_CP1_CI_AS collation. The input data has an embedded ® character (character code 174). I did not specify a code page with the -C parameter, and the character was converted to « (character code 171). I then reran bcp with -C1252 and with -CRAW, and both preserved the correct character. -C437 and -COEM change the character to «.
Why did this happen? I thought the data would be converted correctly without any code page specification.
Different code pages map binary values to glyphs (the graphic symbols that humans know and love) differently. One binary value can map to many different glyphs using different code pages.
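To make that concrete, here is a small Python sketch (Python is only used for illustration; it is not part of bcp) that decodes the byte from the question under the two code pages discussed in this thread. It assumes Python's built-in `cp1252` and `cp437` codecs match the Windows code pages of the same numbers.

```python
# The byte value from the question: 174 decimal, 0xAE hex.
raw = bytes([174])

# Decode the same byte under each code page and show the resulting glyph.
for code_page in ("cp1252", "cp437"):
    glyph = raw.decode(code_page)
    print(f"{code_page}: byte {raw[0]} -> {glyph!r} (U+{ord(glyph):04X})")
```

The byte never changes; only the table used to interpret it does.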
If bcp doesn't know which code page to use for translation, you get "pot luck", especially for characters that aren't well defined. Typically, you want the code page that created the data. Occasionally, you want the code page that was intended (or at least used) to view the data. Because of the potpourri of mappings supported by the different code pages, the business of getting data from point A to point B has grown yet another potentially "interesting" twist to amuse those of us who do the moving!
From Books Online topic bcp: "OEM Default code page used by the client. This is the default code page used by bcp if -C is not specified."
From that, I suppose that SQL Server interpreted your file as OEM code page 437. mojza
I guess I still don't understand why the character was changed during the bcp. I can view it correctly from my workstation, which is 437, but if I bcp using -C437 or without -C (which uses the default OEM code page), it gets converted. I think I'm missing something.
Then, in my opinion, your file was created in ANSI code page 1252 (Notepad shows it OK) and bcp interprets your file as cp 437 (the default client OEM code page). That leads to the loss of some extended characters that do not sit at the same positions in the two code pages, unless you tell SQL Server to interpret the file as 1252 or to load it without any translation (RAW). Check out this Microsoft article. There is a good explanation and excellent examples. mojza
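The translation described above can be modeled in a few lines of Python. This is only a sketch of the observed behavior, not how bcp is actually implemented: `simulate_load` is a hypothetical helper, and it assumes the target column uses code page 1252 (the code page behind SQL_Latin1_General_CP1_CI_AS) and that RAW means the bytes pass through unchanged.

```python
from typing import Optional


def simulate_load(raw: bytes,
                  file_code_page: Optional[str],
                  db_code_page: str = "cp1252") -> bytes:
    """Hypothetical model of a load: interpret the file bytes with one
    code page, then store them using the database code page.
    Passing None models RAW, i.e. no translation at all."""
    if file_code_page is None:
        return raw
    return raw.decode(file_code_page).encode(db_code_page)


source = bytes([174])  # the (R) sign as written by a cp1252 editor

for label, code_page in (("-C437 / -COEM", "cp437"),
                         ("-C1252", "cp1252"),
                         ("-CRAW", None)):
    stored = simulate_load(source, code_page)
    print(f"{label}: stored byte {stored[0]} -> {stored.decode('cp1252')!r}")
```

Read as cp 437, byte 174 is «, and « is stored in code page 1252 as byte 171, which matches the change reported in the question.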
Thanks for your help. That article definitely helped explain things. I also looked at the nls files for 437 and 1252, and character 174 (offset x0178) maps to ® in the 1252 file and « in the 437 file.