Results 1 to 4 of 4
  1. #1
    Join Date
    Jun 2012
    Posts
    2

    Unanswered: How to query and filter normal text datatype and xml datatype

    I have a table [T1] which stored the text data type. Unfortunately, the management decided to replace the following input data by xml data without change the field's datatype. So the field is mix with text string and xml data. Is it possible to use sql to query both datatype?

    -----------------------
    | T1 |
    -----------------------
    King

    Queen

    <Food> Apple </Food>

    Teacher

    Student

    <Vegetable> Tomato </Vegetable>

    ---------------------------

    The table field is mix with text data and xml data.

    If i query using extraxt value from the field, some issue will happened as some record are not xml. If i query out as normal string, im not able to extraxt some xml node.


    Is it possible to convert all the data into xml and retrieve it as xml ? If possible, what is the example code will looks like ?

  2. #2
    Join Date
    Oct 2002
    Location
    Cape Town, South Africa
    Posts
    253
    I would not try to do some magic SELECT statement that converts "text" to "xml" on the fly while selecting. I think you should rather run a one time UPDATE that will convert the "text" values to "xml".
    Determining which items are not "xml" can be more difficult than you think, but given the sample data you have shown, I think this select statement will identify them:
    Code:
    select * from t1
    where not regexp_like(trim(text_col), '^<[a-zA-Z0-9-_. ]+>.*</[a-zA-Z0-9-_. ]+>$')
    You will likely need to tweak the regular expression to fit your data accurately. Once you have a reliable selection, then use that criteria for an update statemet, someting like
    Code:
    update t1 set text_col = '<Food>' || text_col || '</Food>'
    where not regexp_like(trim(text_col), '^<[a-zA-Z0-9-_. ]+>.*</[a-zA-Z0-9-_. ]+>$')
    Once you have the data transformed, the next thing I would do is convert to column to an XMLType column. This will act as the constraint that will only allow XML values in the column in future, and you know you will be safe using extract on all values.

  3. #3
    Join Date
    Jun 2012
    Posts
    2
    Quote Originally Posted by dayneo View Post
    I would not try to do some magic SELECT statement that converts "text" to "xml" on the fly while selecting. I think you should rather run a one time UPDATE that will convert the "text" values to "xml".
    Determining which items are not "xml" can be more difficult than you think, but given the sample data you have shown, I think this select statement will identify them:
    Code:
    select * from t1
    where not regexp_like(trim(text_col), '^<[a-zA-Z0-9-_. ]+>.*</[a-zA-Z0-9-_. ]+>$')
    You will likely need to tweak the regular expression to fit your data accurately. Once you have a reliable selection, then use that criteria for an update statemet, someting like
    Code:
    update t1 set text_col = '<Food>' || text_col || '</Food>'
    where not regexp_like(trim(text_col), '^<[a-zA-Z0-9-_. ]+>.*</[a-zA-Z0-9-_. ]+>$')
    Once you have the data transformed, the next thing I would do is convert to column to an XMLType column. This will act as the constraint that will only allow XML values in the column in future, and you know you will be safe using extract on all values.


    Hi, thanks for your reply. After i done the first statement, it seem that some empty column will deleted or not selected. Some of the column was empty. Is it possible to keep the empty column ?

    And also i don't have the permission to update the database. What solution u prefer for this kind of solution ? thanks a lot
    Last edited by ocs2me; 06-19-12 at 06:39.

  4. #4
    Join Date
    Feb 2006
    Posts
    173
    Provided Answers: 1
    You could use a regular expression, Try this regular expression on your table:
    Code:
    Example:
    select 'Teacher' as input1,
           nvl( regexp_substr( 'Teacher', '(<.+>)?(.+)(<\/.+>)', 1, 1, 'i', 2 ),
                regexp_substr( 'Teacher', '.+' ) ) as output1,
           '<Food> Apple </Food>' as input2,
           nvl( regexp_substr( '<Food> Apple </Food>', '(<.+>)?(.+)(<\/.+>)', 1, 1, 'i', 2 ),
                regexp_substr( '<Food> Apple </Food>', '.+' ) ) as output2
     from  dual;
    
    input1  output1  input2                output2
    Teacher Teacher  <Food> Apple </Food>  Apple

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •