Get the table part from HTML using PHP regex
Hi friends, happy new year.
I have some HTML code, and I want to use a PHP regular expression to get all the table parts. The tables look like <table class="name">...</table> or <table id="share">...</table>. I used preg_match_all('/<table*?[^>]*?</table>/', $source, $match); but it returns nothing. How do I write it correctly? Thanks.
13 Answers
If it has to be regular expressions, you might try reading up on them first.
Your regex only matches something like <table foo bar</table>, because [^>]*? can never get past the > that closes the opening tag. The / in </table> also collides with the / delimiters. Plus it's still backtracking like hell. Try this: ~<table\b[^>]*>(.*?)</table>~is
Posted: goreSplatter
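For anyone who wants to try the regex route, here is a minimal sketch of a non-greedy pattern run through preg_match_all. The sample $source and the exact pattern are assumptions made for illustration, not the asker's real data, and the sketch assumes the tables are not nested.

```php
<?php
// Hypothetical sample input; the asker's real $source is not shown.
$source = <<<HTML
<p>intro</p>
<table class="name">
  <tr><td>a</td></tr>
</table>
<div>between</div>
<TABLE id="share"><tr><td>b</td></tr></TABLE>
HTML;

// "~" is the delimiter, so the "/" in </table> needs no escaping.
// \b     : keeps "<table" from matching a longer tag name
// [^>]*  : the attributes of the opening tag
// (.*?)  : lazy body, so each match stops at the nearest </table>
// s flag : "." also matches newlines; i flag: tag case is ignored
$pattern = '~<table\b[^>]*>(.*?)</table>~is';

$count = preg_match_all($pattern, $source, $matches);

// $matches[0] holds the full <table>...</table> strings,
// $matches[1] holds only the markup between the tags.
foreach ($matches[0] as $i => $tableHtml) {
    echo "Table ", $i + 1, ":\n", $tableHtml, "\n\n";
}
```

Because the body is matched lazily, a table nested inside another table would cut the outer match short at the inner </table>. That is the kind of case where a parser is the safer choice.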
There are far too many variations, for one, and more importantly, regex isn't very good with the hierarchical nature of HTML. It's best to use an XML parser or, better yet, an HTML-specific parser.
Whenever I need to scrape HTML, I tend to use the Simple HTML DOM Parser library, which takes an HTML tree and parses it into a traversable PHP object that you can query much like jQuery.

    require 'simplehtmldom/simple_html_dom.php';

    $sHtml = <<<EOS
    <table border="1" >
      <tbody style="" >
        <tr style="" >
          <td style="color:blue;"> data0 </td>
          <td style="font-size:15px;"> data1 </td>
          <td style="font-size:15px;"> data2 </td>
          <td style="color:blue;"> data3 </td>
          <td style="color:blue;"> data4 </td>
        </tr>
        <tr style="" >
          <td style="color:blue;"> data00 </td>
          <td style="font-size:15px;"> data11 </td>
          <td style="font-size:15px;"> data22 </td>
          <td style="color:blue;"> data33 </td>
          <td style="color:blue;"> data44 </td>
        </tr>
        <tr style="color:black" >
          <td style="color:blue;"> data000 </td>
          <td style="font-size:15px;"> data111 </td>
          <td style="font-size:15px;"> data222 </td>
          <td style="color:blue;"> data333 </td>
          <td style="color:blue;"> data444 </td>
        </tr>
      </tbody>
    </table>
    EOS;

    // Parse the string into a queryable object
    $oHTML = str_get_html($sHtml);

    // Collect the trimmed text of every cell, row by row
    $oTRs = $oHTML->find('table tr');
    $aData = array();
    foreach ($oTRs as $oTR) {
        $aRow = array();
        $oTDs = $oTR->find('td');
        foreach ($oTDs as $oTD) {
            $aRow[] = trim($oTD->plaintext);
        }
        $aData[] = $aRow;
    }

    var_dump($aData);

And the output:

    array
      0 =>
        array
          0 => string 'data0' (length=5)
          1 => string 'data1' (length=5)
          2 => string 'data2' (length=5)
          3 => string 'data3' (length=5)
          4 => string 'data4' (length=5)
      1 =>
        array
          0 => string 'data00' (length=6)
          1 => string 'data11' (length=6)
          2 => string 'data22' (length=6)
          3 => string 'data33' (length=6)
          4 => string 'data44' (length=6)
      2 =>
        array
          0 => string 'data000' (length=7)
          1 => string 'data111' (length=7)
          2 => string 'data222' (length=7)
          3 => string 'data333' (length=7)
          4 => string 'data444' (length=7)

Posted: Go
I prefer using one of the native XML extensions. If you prefer a 3rd-party lib, I'd suggest not using SimpleHtmlDom, but a lib that actually uses DOM/libxml underneath instead of string parsing.
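As one possible illustration of the native route, here is a minimal sketch using PHP's DOM extension (DOMDocument), which is built on libxml. The sample $source is a made-up stand-in for the asker's HTML, and choosing DOMDocument is an assumption made for this example, not necessarily the extension the answer had in mind.

```php
<?php
// Hypothetical sample input standing in for the asker's $source.
$source = <<<HTML
<html><body>
<table class="name"><tr><td>a</td></tr></table>
<p>between</p>
<table id="share"><tr><td>b</td></tr></table>
</body></html>
HTML;

$dom = new DOMDocument();

// Real-world HTML is rarely valid, so collect libxml warnings
// internally instead of letting them be emitted as PHP warnings.
$previous = libxml_use_internal_errors(true);
$dom->loadHTML($source);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$tables = [];
foreach ($dom->getElementsByTagName('table') as $table) {
    // saveHTML($node) serializes just that node, tags included.
    $tables[] = $dom->saveHTML($table);
}

print_r($tables);
```

Each entry in $tables is the markup of one table as the parser re-serialized it, so whitespace or quoting may differ slightly from the input. getElementsByTagName also returns tables nested inside other tables, as separate entries.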
Posted: MacOS
Thanks, I will learn it.
© Advanced Web Core. All rights reserved