Extracting specific data from HTML using the HtmlAgilityPack in C#
I've been trying to extract information from a website that gives me its HTML as a string. I did some research and figured out that I have to use the HtmlAgilityPack; however, I can't figure out how to apply the examples to my case.
I've done several different tests, but none of them seem to work.
An example page is http://www.trivago.com/?adaterange[arr]=2014-11-02&adaterange[dep]=2014-11-03&iroomtype=7&ipathid=34741&igeodistanceitem=0&iviewtype=0&bisseopage=false&bissitemap=false&
For each element in the list I need to extract the contact data: address, telephone, official homepage link, and the title.
I tried walking through the source with Firebug, and the class structure is as follows:
    class="no-touch"
    class="web10152"
    class="page_wrapper"
    class="main_content"
    class="main"
    class="centercol content"
    class="content"
    class="container_itemlist itemlist_simplified"
    class="itemlist hotellist grouping component"   // has the list of each item

    // item (child node of "itemlist hotellist grouping component")
    class="hotel item bookmarkable historisable"    // item main class

    // path to the title
    class="cf item_wrapper"
    class="item_prices"
    <h3 title="item title"></h3>

    // path to the contact info
    class="slideout_wrapper component expand"
    class="slideout_content_container"
    class="slideout_content info item_info js_trivago_info active"
    class="item_info_block contact"                 // contains the info
    <em> address info </em>
    <em> telephone info </em>
    class="partnerhomepagelink link"                // contains the link info
I don't know how to express this to the HtmlAgilityPack. Here is the last thing I tried:
    HtmlAgilityPack.HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(page);
    try
    {
        var table = doc.DocumentNode.SelectSingleNode("//h3[@class='jsheadline js_slideout_trigger js_trackable']/title");
        var table1 = doc.DocumentNode.SelectSingleNode("//div[@class='item_info_block contact']");
        var ele = table1.Elements("em");
    }
    catch
    {
        Program.ChangeColor(Program.TextColors.Program_Error);
        Console.WriteLine("\nError Report: Failed to parse page!");
    }
How can I accomplish this?
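For reference, here is a minimal sketch of how the class structure above might map onto HtmlAgilityPack calls. It is not a confirmed solution. The embedded HTML is a made-up stand-in that only mirrors the class names listed earlier; the tag names (div, a), the href attribute on the homepage link, and the sample values are assumptions, so the real page may need different queries. One thing the sketch does differently from the attempt above: title is an attribute of the h3, so it is read with GetAttributeValue instead of being appended to the XPath as "/title" (which would look for a child element named title).

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class HotelListSketch
{
    static void Main()
    {
        // Hypothetical stand-in for the downloaded page string; only the
        // class names come from the structure described in the question.
        string page = @"
<div class='itemlist hotellist grouping component'>
  <div class='hotel item bookmarkable historisable'>
    <div class='item_prices'><h3 title='Example Hotel'>Example Hotel</h3></div>
    <div class='item_info_block contact'>
      <em>1 Example Street</em>
      <em>+00 000 0000</em>
      <a class='partnerhomepagelink link' href='http://example.com'>Homepage</a>
    </div>
  </div>
</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(page);

        // contains(@class, ...) does a substring match, so it still matches
        // when the element carries additional class names.
        var items = doc.DocumentNode.SelectNodes("//*[contains(@class,'hotel item')]");
        if (items == null)
        {
            // SelectNodes returns null (not an empty list) when nothing matches.
            Console.WriteLine("No items found.");
            return;
        }

        foreach (var item in items)
        {
            // ".//" keeps each query relative to the current item.
            var h3 = item.SelectSingleNode(".//h3[@title]");
            string title = h3 != null ? h3.GetAttributeValue("title", "") : "";

            var contact = item.SelectSingleNode(".//*[contains(@class,'item_info_block')]");
            string[] details = contact != null
                ? contact.Descendants("em").Select(e => e.InnerText.Trim()).ToArray()
                : new string[0];

            // Assumption: the homepage link exposes its target in an href attribute.
            var link = item.SelectSingleNode(".//*[contains(@class,'partnerhomepagelink')]");
            string url = link != null ? link.GetAttributeValue("href", "") : "";

            Console.WriteLine("Title:    " + title);
            Console.WriteLine("Details:  " + string.Join(" | ", details));
            Console.WriteLine("Homepage: " + url);
        }
    }
}
```

The null checks are there because SelectNodes and SelectSingleNode return null when nothing matches, which would otherwise throw inside the try block and surface only as the generic "failed to parse" message. If the contact block is filled in by a script after the page loads, it will not be present in the downloaded string at all, and no XPath query will find it.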
c# html parsing webclient html-agility-pack