Results 1 to 4 of 4
  1. #1
    Join Date
    Apr 2012
    Posts
    2

    Unanswered: Search and REPLACE fails on MEDIUMTEXT (BINARY), can find solution?

    For a blog or website served by PHP with a MySQL database, many people have the same problem as myself but no solution, after researching Google over many hours, in that we have a TEXT field which stores its data as BINARY (am I correct?), e.g.,

    "PHAgY2xhc3M9c3ViaGVhZ...."

    instead of plain text,

    "The quick brown fox ..."

    This means that search & replace doesn't work by SQL, for example to update a company address on all pages of a blog or website served from a database, using the usual SQL command:

    UPDATE `table_pages` SET `pagebody`=REPLACE(`pagebody`, '123 Main Street', '456 Lincoln Avenue');

    It fails because it never finds '123 Main Street' in the binary data, though it would if the column were still in plain text.

    (This is a simple example, updating a company address, but I and others have many applications of this search and replace need, e.g., misspelled words, updated or dead links cited, etc.)

    To solve this, one method would be to convert all the column data from binary back to readable text, and then run the above UPDATE command.

    (Or maybe there's a better way?)

    It appears to me (and I may be mistaken) that MySQL automatically converts text to binary for MEDIUMTEXT on up. At least, all the TEXT fields are in plain text when browsing in phpmyadmin, and only the MEDIUMTEXT field is binary.

    In my case, my predecessor set all the sites to use MEDIUMTEXT (up to 1 megabyte per page) for the one column pagebody, whereas TEXT would have sufficed since none of our pages exceed or even approach 65535 bytes.

    I've tried converting to TEXT, latin1, ASCII, and some other character sets, which changes the field type but does not "convert" the data. It just stays in binary.

    Can anybody think of a solution?

  2. #2
    Join Date
    Sep 2009
    Location
    San Sebastian, Spain
    Posts
    880
    What blog engine are you using? I have never experienced that. Here is an extract of my wordpress blogging engine:

    Code:
    |  1 |           1 | 2010-05-06 15:49:23 | 2010-05-06 15:49:23 | Welcome to WordPress. This is your first post. Edit or delete it, then start blogging!                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | Hello world!                  |              | publish     | open           | open        |               | hello-world         |         |        | 2010-05-06 15:49:23 | 2010-05-06 15:49:23 |                       |           0 | http://server/wordpress/?p=1                                      |          0 | post         |                |             0 |
    The definition of this table is:

    Code:
    mysql> desc wp_posts;
    +-----------------------+---------------------+------+-----+---------------------+----------------+
    | Field                 | Type                | Null | Key | Default             | Extra          |
    +-----------------------+---------------------+------+-----+---------------------+----------------+
    | ID                    | bigint(20) unsigned | NO   | PRI | NULL                | auto_increment | 
    | post_author           | bigint(20) unsigned | NO   | MUL | 0                   |                | 
    | post_date             | datetime            | NO   |     | 0000-00-00 00:00:00 |                | 
    | post_date_gmt         | datetime            | NO   |     | 0000-00-00 00:00:00 |                | 
    | post_content          | longtext            | NO   |     |                     |                | 
    | post_title            | text                | NO   |     |                     |                | 
    | post_excerpt          | text                | NO   |     |                     |                | 
    | post_status           | varchar(20)         | NO   |     | publish             |                | 
    | comment_status        | varchar(20)         | NO   |     | open                |                | 
    | ping_status           | varchar(20)         | NO   |     | open                |                | 
    | post_password         | varchar(20)         | NO   |     |                     |                | 
    | post_name             | varchar(200)        | NO   | MUL |                     |                | 
    | to_ping               | text                | NO   |     |                     |                | 
    | pinged                | text                | NO   |     |                     |                | 
    | post_modified         | datetime            | NO   |     | 0000-00-00 00:00:00 |                | 
    | post_modified_gmt     | datetime            | NO   |     | 0000-00-00 00:00:00 |                | 
    | post_content_filtered | text                | NO   |     |                     |                | 
    | post_parent           | bigint(20) unsigned | NO   | MUL | 0                   |                | 
    | guid                  | varchar(255)        | NO   |     |                     |                | 
    | menu_order            | int(11)             | NO   |     | 0                   |                | 
    | post_type             | varchar(20)         | NO   | MUL | post                |                | 
    | post_mime_type        | varchar(100)        | NO   |     |                     |                | 
    | comment_count         | bigint(20)          | NO   |     | 0                   |                | 
    +-----------------------+---------------------+------+-----+---------------------+----------------+
    23 rows in set (0.01 sec)
    Could it be that your blogging engine is encrypting the data before storing it into the database?
    Ronan Cashell
    Certified Oracle DBA/Certified MySQL Expert (DBA & Cluster DBA)
    http://www.it-iss.com
    Follow me on Twitter

  3. #3
    Join Date
    Apr 2012
    Posts
    2

    Search a base64_encode field

    Quote Originally Posted by it-iss.com View Post
    Could it be that your blogging engine is encrypting the data before storing it into the database?
    Yes, Ronan, you're right, thank you. My particular PHP script is encoding it, specifically in base64_encode , something I should have figured out before I posted my original message. Why my predecessor decided to encode only that particular field, I don't know. Maybe base64_encode fixed some problem in storing and retrieving some page code into the MySQL database, but if others are saving their HTML and other code into text fields in a MySQL database without problems, and securely, that would be interesting and very useful to know.

    A disadvantage of base64_encode is that you cannot do a simple text search using phpmyadmin, much less a search & replace, which is what is needed.

    It appears that I will need to write a PHP script to fetch every record, base64_decode the field, perform string search & replace on it, and then base64_encode and update each record ...

    ... unless somebody knows a way to do this in phpmyadmin, for example,

    UPDATE `table_pages` SET `pagebody`=REPLACE(`pagebody`, '123 Main Street', '456 Lincoln Avenue');

    but somehow get base64_decode/encode processing into there, since it will never find 123 Main Street in the base64_encoded fields.

  4. #4
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Thread bump. Last post got caught in moderation queue!
    George
    Home | Blog

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •