(PHP5) Extracting a title tag and RSS feed address from HTML using PHP DOM or Regex

- March 15, 2015

I want to get the title tag and the RSS feed address (if any) from a given URL, but the method I have used has not worked fully so far. I have been able to get the title tag using preg_match and a regular expression, but I cannot get the RSS feed address out with it.

($webcontent contains the HTML of the page.)

I have copied my code below for reference ...

    // Get the title tag
    preg_match('@<title>(.*)</title>@i', $webcontent, $titleTagArray);

    // If the title tag has been found, assign it to a variable
    if ($titleTagArray && $titleTagArray[3])
        $webTitle = $titleTagArray[3];

    // Get the RSS or Atom feed address
    preg_match('@<link(.*)rel="alternate"(.*)href="(.*)"(.*)type="application/rss+xml"\s/>@i', $webcontent, $feedAddrArray);

    // If the feed address has been found, assign it to a variable
    if ($feedAddrArray && $feedAddrArray[2])
        $webFeedAddr = $feedAddrArray[2];

I have read here that regular expressions are not the best way to do this. Hopefully someone can give me a hand with this :-)

Thanks

One approach is to parse the HTML with DOM and query it with XPath:

    $dom = new DOMDocument;             // init new DOMDocument
    $dom->loadHTML($html);              // load the HTML into it
    $xpath = new DOMXPath($dom);        // create a new XPath
    $nodes = $xpath->query('//title');  // find all title elements in the document
    foreach ($nodes as $node) {         // iterate over the found elements
        echo $node->nodeValue;          // output the title text
    }

To get the href attribute of all link tags whose type is "application/rss+xml", you would use this XPath:

    $xpath->query('//link[@type="application/rss+xml"]/@href');
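The two queries above can be combined into one helper. The following is only a minimal sketch: the function name `extractTitleAndFeeds` and the returned array layout are hypothetical and not part of the original answer. Also matching `application/atom+xml` is an assumption, added because the question mentions Atom feeds. It uses `libxml_use_internal_errors()` so that the malformed markup common on real pages does not flood the output with warnings.

```php
<?php
// Hypothetical helper (not from the original post): returns the page title
// and any RSS/Atom feed URLs advertised in an HTML string.
function extractTitleAndFeeds($html)
{
    $result = array('title' => null, 'feeds' => array());
    if ($html === '' || $html === null) {
        return $result; // nothing to parse
    }

    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Text of the first <title> element, if there is one.
    $titleNodes = $xpath->query('//title');
    if ($titleNodes->length > 0) {
        $result['title'] = trim($titleNodes->item(0)->nodeValue);
    }

    // href attributes of <link> elements that declare an RSS or Atom feed.
    // Matching the Atom type is an assumption added for this sketch.
    $query = '//link[@type="application/rss+xml" or @type="application/atom+xml"]/@href';
    foreach ($xpath->query($query) as $href) {
        $result['feeds'][] = $href->nodeValue;
    }

    return $result;
}
```

In the question's terms, you would call `extractTitleAndFeeds($webcontent)` and read `$result['title']` and `$result['feeds']`. The href values come back exactly as written in the page, so relative feed addresses would still need to be resolved against the page URL.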
