  1. #1
    Join Date
    Jul 2004
    Posts
    494

    Unanswered: preg_match_all tr/td/th problem

    I want to extract all the <tr>, <td> and <th> elements from this table.
    PHP Code:
    <?php
    $text = '
    <table>
    <tr>
    <th>Unit Type</th>
    <th>Availability</th>
    <th>Rates</th>
    </tr>

    <tr>
    <td>One Bedroom</td>
    <td>Call for Availability</td>
    <td>hello</td>
    </tr>

    <tr>
    <td>One Living Room</td>
    <td>Call for Availability</td>
    <td>hello</td>
    </tr>
    </table>';
    $extract_th="#<th.*>(.+)</th#Ui";
    $extract_tr="/<tr>(.*)<\/tr>/isU";
    $extract_td="/<td.*>(.*)<\/td>/Ui";
    echo $text."<br />\n";
    preg_match_all($extract_tr, $text, $match_tr, PREG_SET_ORDER);
    //print_r($match_tr[1][1]);
    for($i=0; $i<count($match_tr); $i++){
        for($td=0; $td<count($match_tr[$i]); $td++){
            preg_match_all($extract_td, $match_tr[$i][$td], $match_td, PREG_SET_ORDER);
            print_r($match_td[$i]);
        }
    }
    ?>
    I'm getting:
    Code:
    Array
    (
        [0] => <td>Call for Availability</td>
        [1] => Call for Availability
    )
    Array
    (
        [0] => <td>Call for Availability</td>
        [1] => Call for Availability
    )
    Array
    (
        [0] => <td>hello</td>
        [1] => hello
    )
    Array
    (
        [0] => <td>hello</td>
        [1] => hello
    )

  2. #2
    Join Date
    Jul 2004
    Posts
    494
    PHP Code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    <title></title>
    </head>

    <body>
    <?php
    $text = '<table><tr><th>Unit Type</th><th>Availability</th><th>Rates</th></tr><tr><td>One Bedroom</td><td>Call for Availability</td><td>hello</td></tr><tr><td>One Living Room</td><td>Call not for Availability</td><td>hello</td></tr></table>';
    $extract_th="#<th.*>(.+)</th#Ui";
    $extract_tr="/<tr>(.*)<\/tr>/isU";
    $extract_td="/<td.*>(.*)<\/td>/Ui";


    echo $text."<br />\n";
    preg_match_all($extract_tr, $text, $match_tr, PREG_SET_ORDER);
    //print_r($match_tr[1][1]);
    //var_dump($match_tr);
    //echo count($match_tr);

    //print_r($match_tr);
    //print_r($match_tr[0][1]);
    //print_r($match_tr[1][1]);
    //print_r($match_tr[2][1]);

    preg_match($extract_td, $match_tr[0][1], $match_th);
    print_r($match_th);

    for($td=1; $td<count($match_tr); $td++){
        //preg_match_all($extract_td, $match_tr[$td][1], $match_td, PREG_SET_ORDER);
        echo "[".$match_tr[$td][1]."]<br />\n";
        preg_match($extract_td, $match_tr[$td][1], $match_td);
        print_r($match_td[$td]);
    }
    ?> 
    </body>
    </html>
    The output:
    Code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    <title></title>
    </head>
    
    <body>
    <table><tr><th>Unit Type</th><th>Availability</th><th>Rates</th></tr><tr><td>One Bedroom</td><td>Call for Availability</td><td>hello</td></tr><tr><td>One Living Room</td><td>Call not for Availability</td><td>hello</td></tr></table><br />
    
    Array
    (
    )
    [<td>One Bedroom</td><td>Call for Availability</td><td>hello</td>]<br />
    One Bedroom[<td>One Living Room</td><td>Call not for Availability</td><td>hello</td>]<br />
     
    </body>
    </html>
    It's not working.

  3. #3
    Join Date
    Jul 2004
    Posts
    494
    PHP Code:
    <?php
    $text = '<table><tr><th>Unit Type</th><th>Availability</th><th>Rates</th></tr><tr><td>One Bedroom</td><td>Call for Availability</td><td>hello</td></tr><tr><td>One Living Room</td><td>Call not for Availability</td><td>hello</td></tr></table>';
    $extract_th="#<th.*>(.+)</th#Ui";
    //$extract_tr="/<tr>(.*)<\/tr>/isU";
    $extract_tr='@<tr>(.*)<\/tr>@siU';
    $extract_td="/<td.*>(.*)<\/td>/Ui";

    echo $text."<br />\n";
    preg_match_all($extract_tr, $text, $match_tr, PREG_SET_ORDER);
    //print_r($match_tr[1][1]);
    //var_dump($match_tr);
    //echo count($match_tr);

    print_r($match_tr);
    echo "[".$match_tr[0][1]."]<br />\n";
    echo "[".$match_tr[1][1]."]<br />\n";
    echo "[".$match_tr[2][1]."]<br />\n";

    //preg_match($extract_td, $match_tr[0][1], $match_th);
    //print_r($match_th);

    for($td=1; $td<count($match_tr); $td++){
        //preg_match_all($extract_td, $match_tr[$td][1], $match_td, PREG_SET_ORDER);
        //echo "[".$match_tr[$td][1]."]<br />\n";
        preg_match($extract_td, $match_tr[$td][1], $match_td);
        print_r($match_td);
    }
    ?>
    Why is it that the
    PHP Code:
    preg_match($extract_td, $match_tr[$td][1], $match_td);
    print_r($match_td);
    in the for loop prints
    Array ( [0] => One Bedroom [1] => One Bedroom ) Array ( [0] => One Living Room [1] => One Living Room )
    and not the whole thing? That's my question.

  4. #4
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    It would be far simpler if you just show what the input is and what you want the output to be. It might be obvious to you but it's not to me. Supplying your current code is a fine idea but it's hard work for people to read through this in order to find out what you want.

  5. #5
    Join Date
    Jul 2004
    Posts
    494
    Quote Originally Posted by mike_bike_kite View Post
    It would be far simpler if you just show what the input is and what you want the output to be. It might be obvious to you but it's not to me. Supplying your current code is a fine idea but it's hard work for people to read through this in order to find out what you want.
    I need to know if:
    PHP Code:
    preg_match($extract_td, $match_tr[$td][1], $match_td);
    is written correctly in the for loop.
    I want:
    Array(
    [0] => One Bedroom
    [1] => Call for Availability
    [2] => hello
    )
    Array(
    [0] => One Living Room
    [1] => Call not for Availability
    [2] => hello
    )
    I'm getting:
    Array ( [0] => One Bedroom [1] => One Bedroom ) Array ( [0] => One Living Room [1] => One Living Room )

  6. #6
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    The preg_match functions put the whole matching string into $match[0] and then put each successive bracketed match into $match[1], $match[2] etc. So when you do:

    Code:
    preg_match( "/<td.*>(.*)<\/td>/Ui", 
           '<tr><td>One Bedroom</td><td>Call for Availability</td><td>hello</td></tr>',
           $match );
    
    print_r($match);
    It will find the first "<td>One Bedroom</td>" and put it into $match[0], then put the text within the tags into $match[1]. preg_match stops after that first match, which seems to be what's happening here; to get every cell in the row you need preg_match_all. I also suggest you drop the brackets and use strip_tags to remove the tags instead.
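As a rough sketch of that difference (the names $row, $first, $all and $cells are made up for illustration and are not from the code earlier in the thread): preg_match returns only the first cell, while preg_match_all collects every cell, and strip_tags then removes the tags from each whole match.

```php
<?php
// One row of the table, as captured by the <tr> pattern.
$row = '<td>One Bedroom</td><td>Call for Availability</td><td>hello</td>';

// preg_match() stops at the first match:
// [0] is the whole match, [1] is the bracketed part.
preg_match('/<td.*>(.*)<\/td>/Ui', $row, $first);

// preg_match_all() keeps going. With no brackets in the pattern,
// $all[0] holds every whole match, one entry per cell.
preg_match_all('/<td.*>.*<\/td>/Uis', $row, $all);

// strip_tags() removes the <td> and </td> from each match.
$cells = array_map('strip_tags', $all[0]);

print_r($first);
print_r($cells);
```

Here $first should hold the tagged cell and its text, and $cells should hold the three plain strings "One Bedroom", "Call for Availability" and "hello". One more thing to keep in mind: if you look at print_r output in a browser, the <td> tags are rendered as markup and disappear, which may be why [0] and [1] looked identical. Wrapping the output in htmlspecialchars(), or viewing the page source, shows the real strings.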

    Of course, your code will still only work with perfectly formed tables. I can promise you that most of the HTML out there is a real mess: you'll need to cope with other HTML tags, missing tags, multiple tables or embedded tables. It would probably be easier to find a class that deals with all this stuff directly and simply use that.
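One such class is PHP's built-in DOMDocument. The following is only a minimal sketch, assuming the DOM extension is enabled on your install; the variable names ($html, $doc, $rows, $cells) are illustrative, and it has not been tried against genuinely messy pages.

```php
<?php
$html = '<table><tr><th>Unit Type</th><th>Availability</th><th>Rates</th></tr>'
      . '<tr><td>One Bedroom</td><td>Call for Availability</td><td>hello</td></tr>'
      . '<tr><td>One Living Room</td><td>Call not for Availability</td><td>hello</td></tr></table>';

$doc = new DOMDocument();
// Real-world HTML is often malformed; collect parser warnings quietly
// instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$rows = [];
foreach ($doc->getElementsByTagName('tr') as $tr) {
    $cells = [];
    // childNodes can include text nodes, so keep only the
    // <td> and <th> elements that sit directly inside this row.
    foreach ($tr->childNodes as $node) {
        if ($node instanceof DOMElement
            && in_array($node->nodeName, ['td', 'th'], true)) {
            $cells[] = trim($node->textContent);
        }
    }
    $rows[] = $cells;
}

print_r($rows);
```

For the sample table this should give one array per row, with the header row first. Reading only the direct children of each <tr> keeps a cell with its own row, although getElementsByTagName will still return the rows of any embedded table as separate entries.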
