How can I match this HTML with regex?
I need to convert
    $text = "i'm here <i>and</i> <a href='http://example.com'>this is my site</a>";

to

    $text = "i'm here and this is my site (http://example.com)";

There could be multiple links in the text. All HTML tags are to be removed, and the href value from each <a> tag needs to be appended as shown above. What would be an efficient way to solve this with regex? Any code snippet would be great.
3 Answers
It's also very easy to do with 'simplehtmldom'
    include('simple_html_dom.php');

    // parse the fragment, rewrite every link, then strip the remaining tags
    $html = str_get_html("i'm here <i>and</i> <a href='http://example.com'>this is my site</a>");
    foreach ($html->find('a') as $a) {
        $a->outertext = "{$a->innertext} ({$a->href})";
    }
    echo strip_tags($html);

And that produces the output you want in your test case.

Posted: MacOS
The DOM solution:
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//a[@href]') as $node) {
        $textNode = new DOMText(sprintf('%s (%s)', $node->nodeValue, $node->getAttribute('href')));
        $node->parentNode->replaceChild($textNode, $node);
    }
    echo strip_tags($dom->saveHTML());

and the same without XPath:

    $dom = new DOMDocument;
    $dom->loadHTML($html);
    // getElementsByTagName() returns a live list, so copy it before replacing nodes
    foreach (iterator_to_array($dom->getElementsByTagName('a')) as $node) {
        if ($node->hasAttribute('href')) {
            $textNode = new DOMText(sprintf('%s (%s)', $node->nodeValue, $node->getAttribute('href')));
            $node->parentNode->replaceChild($textNode, $node);
        }
    }
    echo strip_tags($dom->saveHTML());

All it does is load the HTML into a DOMDocument instance. In the first case it uses an XPath expression, which is kind of like SQL for XML, to get all links with an href attribute. It then creates a text node from the link's text (its nodeValue) and the href attribute, and replaces the link with that text node. The second version just uses the DOM API and no XPath. Yes, it's a few lines more than regex, but it is clean and easy to understand, and it won't give you any headaches when you need to add additional logic.

Posted: codeberg
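For the "multiple links" case from the question, the XPath variant can be wrapped in a small function. This is only a sketch under a few assumptions: the function name linksToText is made up for illustration, the input is a non-empty HTML fragment, and non-ASCII input may need an encoding hint that is not shown here. It reads the text back through textContent instead of strip_tags($dom->saveHTML()), so the serialized wrapper markup and entity encoding never enter the result.

```php
<?php
// Hypothetical helper name, used only for this illustration.
function linksToText(string $html): string
{
    $dom = new DOMDocument();

    // Collect libxml warnings about fragments or loose markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // The XPath result is a static list, so replacing nodes inside the loop is safe.
    foreach ($xpath->query('//a[@href]') as $node) {
        $label = sprintf('%s (%s)', $node->textContent, $node->getAttribute('href'));
        $node->parentNode->replaceChild($dom->createTextNode($label), $node);
    }

    // textContent concatenates all remaining text nodes, which drops every tag.
    return trim($dom->documentElement->textContent);
}

echo linksToText("i'm here <i>and</i> <a href='http://example.com'>this is my site</a>"), "\n";
// prints: i'm here and this is my site (http://example.com)
```

Because every matching link is replaced in the loop, a string with several <a> elements gets one "(URL)" suffix per link.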
Regex does make your life easy here. Just match the URL.
    // capture the href value (single or double quoted) and the link text of every <a>,
    // rewrite each link as "text (URL)", then strip the remaining tags
    $text = preg_replace('/<a\s[^>]*href=(["\'])(.*?)\1[^>]*>(.*?)<\/a>/is', '$3 ($2)', $text);
    $text = strip_tags($text);

Posted: xtremex