DOMDocument

PHP XPath and DOMDocument class: Encode URLs in a piece of HTML markup so that they conform to the HTML standard

Submitted by Peter Majmesku on Mon, 08/29/2016 - 20:48

The following code example shows how it works:

<?php
// $source holds the HTML markup to process.
$dom = new \DOMDocument('1.0', 'UTF-8');

// PHP will output warnings about non-standard HTML. Suppress them with "@".
@$dom->loadHTML($source);

// Iterate over all link elements.
foreach ($dom->getElementsByTagName('link') as $node) {
  // Keep a handle to the element to be able to replace it.
  $updated_element = $node;
  $href_value = $updated_element->getAttribute('href');

  // Check whether the value contains a standard-violating character.
  if (is_int(strpos($href_value, ']=within'))) {
    // Encode the URL to a valid href value.
    $href_value = drupal_urlencode($href_value);
    $updated_element->setAttribute('href', $href_value);

    // Replace the wrong HTML markup.
    $node->parentNode->replaceChild($updated_element, $node);
  }
}

// Get the HTML markup.
$html_markup_with_wrappers = $dom->saveHTML();

// Remove the unnecessary wrappers.
$my_html_markup = preg_replace('~<(?:!DOCTYPE|/?(?:html|head|body))[^>]*>\s*~i', '', $html_markup_with_wrappers);
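
For readers who want to try the idea outside of Drupal, here is a minimal sketch of the same steps in plain PHP. It is not the code from above: the function names encode_href_value() and encode_hrefs() are hypothetical, drupal_urlencode() is swapped for a small regex plus rawurlencode() that only touches characters outside the usual URL character set, and libxml_use_internal_errors() is used instead of "@" to silence parser warnings. It also sets the attribute directly on each element rather than calling replaceChild(). Treat it as an illustration under those assumptions.

```php
<?php

/**
 * Hypothetical helper: percent-encodes every character that is not in the
 * common URL character set (for example "[" and "]"). Existing "%" signs
 * are left alone so already encoded values are not encoded twice.
 */
function encode_href_value(string $href): string {
  return preg_replace_callback(
    '~[^A-Za-z0-9\-._\~:/?#@!$&\'()*+,;=%]~',
    function (array $match): string {
      return rawurlencode($match[0]);
    },
    $href
  );
}

/**
 * Hypothetical helper: encodes the href attribute of all elements with the
 * given tag name and returns the markup without the doctype/html/head/body
 * wrappers that DOMDocument adds. $source must not be empty.
 */
function encode_hrefs(string $source, string $tag_name = 'link'): string {
  $dom = new DOMDocument('1.0', 'UTF-8');

  // Collect parser warnings internally instead of suppressing them with "@".
  $previous = libxml_use_internal_errors(true);
  $dom->loadHTML($source);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  foreach ($dom->getElementsByTagName($tag_name) as $node) {
    if ($node->hasAttribute('href')) {
      // The element is modified in place; no replacement is needed.
      $node->setAttribute('href', encode_href_value($node->getAttribute('href')));
    }
  }

  // Same wrapper-stripping pattern as in the example above.
  return preg_replace('~<(?:!DOCTYPE|/?(?:html|head|body))[^>]*>\s*~i', '', $dom->saveHTML());
}
```

Calling encode_hrefs($source) with the markup to clean returns the processed fragment as a string; pass 'a' as the second argument to handle anchor elements instead of link elements.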