Breedlove: Extracting specific data from HTML using the HtmlAgilityPack in C#

Wednesday, 15 August 2012

I've been trying to extract information from a website, given its HTML as a string. I did some research and figured out that I have to use the HtmlAgilityPack; however, I can't figure out how to apply the examples to my case.

I've run several different tests, but none of them seem to work.

An example webpage is http://www.trivago.com/?adaterange[arr]=2014-11-02&adaterange[dep]=2014-11-03&iroomtype=7&ipathid=34741&igeodistanceitem=0&iviewtype=0&bisseopage=false&bissitemap=false&

I need to extract the contact data (address, telephone, and official homepage link) and the title of each element in the list.

I walked through the page source with Firebug, and the class structure is as follows:

class="no-touch"
    class="web10152"
        class="page_wrapper"
            class="main_content"
                class="main"
                    class="centercol content"
                        class="content"
                            class="container_itemlist itemlist_simplified"
                                class="itemlist hotellist  grouping component"    // holds the list of items

// item (child node of "itemlist hotellist  grouping component")
class="hotel item bookmarkable historisable"    // item main class

    // path to the title
    class="cf item_wrapper"
        class="item_prices"
            <h3 title="item title"></h3>

    // path to the contact info
    class="slideout_wrapper component expand"
        class="slideout_content_container"
            class="slideout_content info item_info js_trivago_info active"
                class="item_info_block contact"    // contains the info
                    <em> address  info </em>
                    <em> telephone info </em>
                    class="partnerhomepagelink link"    // contains the link info

I don't know how to express this structure to the HtmlAgilityPack. Here is the last thing I tried:

    HtmlAgilityPack.HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(page);

    try
    {
        // attempt to reach the title of the item
        var table = doc.DocumentNode.SelectSingleNode("//h3[@class='jsheadline js_slideout_trigger js_trackable']/title");

        // attempt to reach the contact block and its <em> children
        var table1 = doc.DocumentNode.SelectSingleNode("//div[@class='item_info_block contact']");
        var ele = table1.Elements("em");
    }
    catch
    {
        Program.ChangeColor(Program.TextColors.Program_Error);
        Console.WriteLine("\nError report: failed to parse page!");
    }

How can I accomplish this?
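
One possible direction, given only as a rough sketch and assuming the class structure listed above is accurate: in the attempt above, "/title" asks for a child element named title, whereas title is an attribute of the h3, and @class='...' only matches when the whole attribute string is identical. The sketch below matches single class tokens instead and reads the attribute with GetAttributeValue. The names HotelScraper, HasClass and PrintHotels are hypothetical, and the assumption that the homepage link is an <a href> at or below the "partnerhomepagelink" element is a guess, since the structure above does not show it. Note also that Firebug shows the live DOM, so if the list is filled in by JavaScript, the HTML string downloaded by WebClient may not contain these nodes at all.

```csharp
using System;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

public static class HotelScraper
{
    // Hypothetical helper: XPath test that matches one class token inside a
    // multi-valued class attribute such as "hotel item bookmarkable".
    private static string HasClass(string name)
    {
        return "contains(concat(' ', normalize-space(@class), ' '), ' " + name + " ')";
    }

    // Reads the inner text of a node, or returns "" when the node is missing.
    private static string TextOf(HtmlNode node)
    {
        return node == null ? "" : HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    public static void PrintHotels(string page)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(page);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection items = doc.DocumentNode.SelectNodes(
            "//*[" + HasClass("hotel") + " and " + HasClass("item") + "]");
        if (items == null)
        {
            Console.WriteLine("No hotel items found in the supplied HTML.");
            return;
        }

        foreach (HtmlNode item in items)
        {
            // The title is an attribute of <h3>, so it is read from the node.
            HtmlNode h3 = item.SelectSingleNode(".//h3[@title]");
            string title = h3 == null ? "" : h3.GetAttributeValue("title", "");

            HtmlNode contact = item.SelectSingleNode(
                ".//*[" + HasClass("item_info_block") + " and " + HasClass("contact") + "]");
            HtmlNodeCollection ems = contact == null ? null : contact.SelectNodes(".//em");

            // Assumption: the first <em> is the address and the second the telephone.
            string address = ems != null && ems.Count > 0 ? TextOf(ems[0]) : "";
            string phone = ems != null && ems.Count > 1 ? TextOf(ems[1]) : "";

            // Assumption: the link is an <a href> at or below "partnerhomepagelink".
            HtmlNode link = contact == null ? null : contact.SelectSingleNode(
                ".//*[" + HasClass("partnerhomepagelink") + "]/descendant-or-self::a[@href]");
            string homepage = link == null ? "" : link.GetAttributeValue("href", "");

            Console.WriteLine("{0} | {1} | {2} | {3}", title, address, phone, homepage);
        }
    }
}
```

Calling HotelScraper.PrintHotels(page) with the same page string as in the attempt above would print one line per matched item. Checking each result for null, rather than wrapping everything in a bare try/catch, makes it easier to see which selector failed.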

c# html parsing webclient html-agility-pack
