  1. #1
    Join Date
    Dec 2003
    Location
    India
    Posts
    33

    Unanswered: Changing character set

    Hi,

    As a new requirement, the database needs to support the Spanish language in addition to English. I am using Oracle 10g (64-bit) as the database on a Solaris 9 server.

    The NLS parameters are as shown below.

    SQL> select * from v$NLS_PARAMETERS;

    PARAMETER VALUE
    ---------------------------------------------------------------- -----------------------------
    NLS_LANGUAGE AMERICAN
    NLS_TERRITORY AMERICA
    NLS_CURRENCY $
    NLS_ISO_CURRENCY AMERICA
    NLS_NUMERIC_CHARACTERS .,
    NLS_CALENDAR GREGORIAN
    NLS_DATE_FORMAT MM/DD/YYYY
    NLS_DATE_LANGUAGE AMERICAN
    NLS_CHARACTERSET AL32UTF8
    NLS_SORT BINARY
    NLS_TIME_FORMAT HH.MI.SSXFF AM
    NLS_TIMESTAMP_FORMAT YYYY-MM-DD-HH24.MI.SS.FF
    NLS_TIME_TZ_FORMAT HH.MI.SSXFF AM TZR
    NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
    NLS_DUAL_CURRENCY $
    NLS_NCHAR_CHARACTERSET AL16UTF16
    NLS_COMP BINARY
    NLS_LENGTH_SEMANTICS BYTE
    NLS_NCHAR_CONV_EXCP FALSE

    Please tell me which character set I should change to.

    Thanks in advance.
    Papps...

  2. #2
    Join Date
    Jan 2004
    Location
    Croatia, Europe
    Posts
    4,094
    Provided Answers: 4
    NLS_CHARACTERSET shows the database character set (AL32UTF8 in your DB), which is used for CHAR and VARCHAR2 columns, PL/SQL variables, identifiers and so on, while NLS_NCHAR_CHARACTERSET shows the national character set (AL16UTF16), which is used for NCHAR and NVARCHAR2 columns. Both are Unicode character sets: AL32UTF8 is the variable-length UTF-8 encoding, while AL16UTF16 is the fixed-width (16-bit) UTF-16 encoding.
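
    For instance, you can confirm both settings from the data dictionary (a quick sketch; NLS_DATABASE_PARAMETERS is the standard dictionary view for this):

    -- database character set (CHAR/VARCHAR2) and national character set (NCHAR/NVARCHAR2)
    select parameter, value
    from nls_database_parameters
    where parameter in ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');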

    I can't tell for sure, but doesn't your character set already support Spanish characters? Have you tried storing those characters in table columns and/or variables?

    Check this page to view character set names etc.; perhaps there you'll find what you need.

  3. #3
    Join Date
    Dec 2003
    Location
    India
    Posts
    33

    Unhappy

    Thanks for the reply....

    I have created one table and inserted rows as follows...

    SQL> create table nls_temp(col1 varchar2(1));

    Table created.

    Elapsed: 00:00:00.12
    SQL> insert into nls_temp values(chr(65));

    1 row created.

    Elapsed: 00:00:00.00
    SQL> insert into nls_temp values(chr(209));

    1 row created.

    Elapsed: 00:00:00.00
    SQL> insert into nls_temp values(chr(210));

    1 row created.

    Elapsed: 00:00:00.01
    SQL> insert into nls_temp values(chr(66));

    1 row created.

    Elapsed: 00:00:00.01
    SQL> commit;

    Commit complete.

    Now, I tried to query "nls_temp"

    SQL> select * from nls_temp;

    C
    -
    A


    B

    Elapsed: 00:00:00.00

    Two of the rows came back blank, but I expected Ñ for chr(209) and Ò for chr(210).
    Papps...

  4. #4
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    I don't think the problem is the data being written incorrectly to the DB, but rather the data not being read correctly by the client. What is your NLS_LANG environment variable (at OS level) set to on the client machine? If the character set on the client is not the one on the server, Oracle translates data from the DB character set to the client character set, and I suspect that here you have something like US7ASCII as the client character set (the default when NLS_LANG is not set). If you use PuTTY, change the character set to UTF-8 under "Window -> Translation", then set your NLS_LANG to something like this:

    NLS_LANG=AMERICAN_AMERICA.UTF8

    You're not obliged to use UTF-8 on the client, but be sure that the charset you use is compatible with the letters you want to display, and that your client terminal's charset and NLS_LANG's charset are the same.
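
    To rule out the client display entirely, you can also look at the raw bytes that were actually stored, for instance (a sketch against the nls_temp table from the earlier post; DUMP with format 1016 prints the stored bytes in hex together with the character set name):

    -- shows what is really stored in the column, independent of what the terminal can display
    select col1, dump(col1, 1016) from nls_temp;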

    HTH & Regards,

    RBARAER

  5. #5
    Join Date
    Dec 2003
    Location
    India
    Posts
    33
    I set the NLS_LANG environment variable, but I still see the same result from the query. I created a similar table in another database whose character set is WE8ISO8859P1, and it shows the correct results as-is.

    SQL> select ascii(col1) from nls_temp;

    ASCII(COL1)
    -----------
    65
    209
    210
    66

    Elapsed: 00:00:00.00

    So does this mean that the character set AL32UTF8 does not support the required language, but WE8ISO8859P1 does?

    If so, do I need to change NLS_CHARACTERSET, and change NLS_NCHAR_CHARACTERSET correspondingly?

    What precautions should I take for a character set conversion?

    Thanks in advance...
    Papps...

  6. #6
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    No, with UTF-8 you can encode ANY character (including all European languages, Japanese, Korean... everything). So if you plan to have a multi-language database, AL32UTF8 is excellent.

    ... I think I see: UTF-8 is multi-byte, and chr(n) in a UTF-8 database will certainly NOT correspond to the character with single-byte code n. Characters whose code is below 128 are represented in UTF-8 as one byte with the same value, but all other characters are encoded on 2 to 4 bytes whose values are completely different from their single-byte (e.g. ISO-8859-1) codes. In fact, I think your test is flawed.

    Here is part of the CHR() function documentation:
    For single-byte character sets, if n > 256, then Oracle returns the binary equivalent of n mod 256.
    For multibyte character sets, n must resolve to one entire codepoint. Invalid codepoints are not validated, and the result of specifying invalid codepoints is indeterminate.
    Conclusion: don't test with chr(), and rest assured that UTF-8 is very good for supporting multi-language databases. Keep it, and if you want to be sure, just insert Spanish text (copied from the web, for example), configure NLS_LANG correctly, and read the data back from your client.
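
    For example, here is a sketch that avoids chr() completely (the table name spanish_test is made up for illustration; UNISTR builds characters from Unicode code points, so it does not depend on the client character set, and CHAR length semantics leave room for multi-byte letters):

    -- hypothetical test table; CHAR semantics so 20 characters fit even when multi-byte
    create table spanish_test (txt varchar2(20 char));
    -- \00F1 is ñ (U+00F1); UNISTR bypasses the client character set entirely
    insert into spanish_test values (unistr('Espa\00F1ol'));
    -- DUMP shows the stored bytes in hex plus the character set name
    select txt, dump(txt, 1016) from spanish_test;

    In AL32UTF8 the ñ should appear in the DUMP output as the two bytes c3,b1 (its UTF-8 encoding), whereas the single byte produced by chr(209) is not a valid UTF-8 sequence, which is most likely why your test showed blanks.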

    Furthermore, the character set conversion you were considering would have been complicated, because UTF-8 is a superset of the other character sets, and converting to a more restrictive set risks losing data.

    Regards,

    RBARAER

  7. #7
    Join Date
    Dec 2003
    Location
    India
    Posts
    33

    Unhappy

    Thanks for the reply. I too think a UTF-8 character set best suits the requirement (to include the Spanish language), so I have decided to keep the character set as AL32UTF8. But I couldn't give the developers any suggestions on how to insert and fetch the Spanish characters from the table; they are facing the same problem.

    Also, I have set the env variable NLS_LANG to AMERICAN_AMERICA.UTF8

    Please advise me.
    Papps...

  8. #8
    Join Date
    Aug 2004
    Location
    France
    Posts
    754
    Sorry, I may have misled you about the NLS_LANG value, because UTF8 and AL32UTF8 are different character sets in Oracle. Try:

    NLS_LANG=AMERICAN_AMERICA.AL32UTF8

    Remember that this variable must be set on the client (the machine on which you run your client, which may be the same machine the Oracle server runs on), and your client terminal must support UTF-8, or it won't display the data correctly. If your client terminal can't display UTF-8, then use ISO-8859-1 and set NLS_LANG to AMERICAN_AMERICA.WE8ISO8859P1. You just have to be consistent on the client (always use the character set that is in NLS_LANG); on the server the character set never changes, and Oracle will do the right character set translations for you.
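
    For example, on a Unix client with a Bourne-type shell (sh/ksh), a minimal sketch would be to set the variable before starting SQL*Plus (the connect string is only a placeholder; csh users would use setenv instead):

    NLS_LANG=AMERICAN_AMERICA.AL32UTF8
    export NLS_LANG
    sqlplus <user>/<password>@<your_db>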

    These character set translations are a bit hard to understand at first; having a UTF-8 DB myself, I spent some time figuring out exactly how they work, sometimes wondering what magical spells were hiding behind those translations. But once you get it, it always works the same way and becomes easy enough.

    HTH & Regards,

    RBARAER

  9. #9
    Join Date
    Dec 2003
    Location
    India
    Posts
    33

    Talking

    I have got the result as per the requirement. I changed the client settings to UTF-8, and now it is showing the correct characters.

    Thanks,
    Papps...
