(PHP5) Extracting a title tag and RSS feed address from HTML using PHP DOM or Regex
I want to get the title tag and the RSS feed address (if any) from a given URL, but the methods I have used so far have not worked. I have been able to get the title tag using preg_match and a regular expression, but I cannot get anywhere with the RSS feed address.
($webcontent contains the HTML.)
I have copied my code below for reference ...
` // Get the title tag
preg_match('@<title>(.*)</title>@i', $webcontent, $titleTagArray);
// If the title tag has been found, assign it to a variable
if ($titleTagArray && $titleTagArray[3])
    $webTitle = $titleTagArray[3];
// Get the RSS or Atom feed address
preg_match('@<link(.*)rel="alternate"(.*)href="(.*)"(.*)type="application/rss+xml"\s/>@i', $webcontent, $feedAddrArray);
// If the feed address has been found, assign it to a variable
if ($feedAddrArray && $feedAddrArray[2])
    $webFeedAddr = $feedAddrArray[2]; `
I have been reading here that regular expressions are not the best way to do this. Hopefully someone can give me a hand with this :-)
Thanks
One approach:
$dom = new DOMDocument;             // init new DOMDocument
$dom->loadHTML($html);              // load HTML into it
$xpath = new DOMXPath($dom);        // create a new XPath
$nodes = $xpath->query('//title');  // find all title elements in the document
foreach ($nodes as $node) {         // iterate over the found elements
    echo $node->nodeValue;          // output the title text
}
To obtain the href attribute of all link tags whose type is "application/rss+xml", you would use this XPath:
$xpath->query('//link[@type="application/rss+xml"]/@href');
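Putting the two queries together, here is a minimal sketch of how the title and the feed addresses could be collected in one pass. The function name extractTitleAndFeeds, the returned array keys, and the sample HTML are made up for illustration. Matching "application/atom+xml" as well is an assumption based on the question mentioning Atom feeds. The libxml_use_internal_errors() calls only keep warnings about malformed real-world HTML from being printed.

```php
<?php
// Hypothetical helper (not from the original answer): returns the page
// title and every RSS/Atom feed URL advertised via <link> elements.
function extractTitleAndFeeds($html)
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting them,
    // then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Text of the first <title> element, or null if there is none.
    $titleNodes = $xpath->query('//title');
    $title = $titleNodes->length > 0
        ? trim($titleNodes->item(0)->nodeValue)
        : null;

    // href attributes of <link> elements typed as RSS or Atom feeds.
    $feeds = array();
    $query = '//link[@type="application/rss+xml" or @type="application/atom+xml"]/@href';
    foreach ($xpath->query($query) as $attr) {
        $feeds[] = $attr->nodeValue;
    }

    return array('title' => $title, 'feeds' => $feeds);
}

// Usage with a small sample document (the URL is a placeholder).
$html = '<html><head><title>Example</title>'
      . '<link rel="alternate" type="application/rss+xml" href="http://example.com/feed.xml" />'
      . '</head><body></body></html>';

$result = extractTitleAndFeeds($html);
echo $result['title'], "\n";
foreach ($result['feeds'] as $feed) {
    echo $feed, "\n";
}
```

Unlike the regular expression, the XPath query does not depend on the order of the rel, href and type attributes or on how the tag is closed, which is probably why the regex approach failed on some pages.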