Sunday 15 August 2010

XPath expression to find elements that do NOT have a matching ancestor


I'm trying to use XPath to extract HTML5 microdata from a page. I'm trying to find "all nested nodes with an itemprop=name attribute that are not nested within another itemscope element (at any depth)". Given the following example, I'm trying to find the name of the product (shoes), but I don't want the brand name (nike).

<div itemscope itemtype="http://schema.org/product">
    <div itemscope itemtype="http://schema.org/brand">
        <div itemprop="name">nike</div> <!-- don't want -->
    </div>
    <div itemprop="name">shoes</div> <!-- want -->
</div>

I can find every itemprop=name element using //*[@itemprop='name'], but that will also pull in the brand name. By the way, the elements shown in the example may be nested within other tags, so I can't do a simple "whose immediate parent does not have an itemscope attribute" test. I believe there may be something relating to ancestors that I can use, but I don't know enough XPath. Any ideas?

A single expression to find all itemprop="name" elements with at most one itemscope ancestor would be

//*[@itemprop = 'name'][not(ancestor::*[@itemscope][2])]
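
To see what that predicate selects, here is a minimal sketch that mirrors it in plain Python. It deliberately uses only the standard library: xml.etree.ElementTree has no ancestor axis, so the ancestors are counted by walking a parent map instead of evaluating the XPath itself. The helper names (itemscope_ancestor_count, top_level_names) are made up for illustration, and the sample writes itemscope="" so that it parses as XML, which is an assumption; real HTML5 pages would need an HTML parser.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question. itemscope="" is an assumption made so
# the snippet is well-formed XML; HTML5 allows the bare attribute.
SAMPLE = """
<div itemscope="" itemtype="http://schema.org/product">
  <div itemscope="" itemtype="http://schema.org/brand">
    <div itemprop="name">nike</div>
  </div>
  <div itemprop="name">shoes</div>
</div>
"""


def itemscope_ancestor_count(node, parents):
    """Count the ancestors of node that carry an itemscope attribute."""
    count = 0
    node = parents.get(node)
    while node is not None:
        if "itemscope" in node.attrib:
            count += 1
        node = parents.get(node)
    return count


def top_level_names(root):
    """Mirror //*[@itemprop='name'][not(ancestor::*[@itemscope][2])]."""
    # ElementTree elements do not know their parent, so build a lookup.
    parents = {child: parent for parent in root.iter() for child in parent}
    return [
        el.text
        for el in root.iter()
        if el.get("itemprop") == "name"
        and itemscope_ancestor_count(el, parents) < 2
    ]


root = ET.fromstring(SAMPLE)
print(top_level_names(root))  # ['shoes']
```

The brand name has two itemscope ancestors (brand and product), so it is dropped; the product name has only one and is kept. With a library that implements full XPath 1.0, the expression above could be evaluated directly instead.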

If you wanted to start from one specific itemscope node and find all the names nested in it (and not in a nested scope), that's not something you can do in one XPath 1.0 expression. You'd have to first extract all the descendant names

.//*[@itemprop='name']

and then, for each of those, find its nearest itemscope ancestor

ancestor::*[@itemscope][1]

and check (on the Python side) whether or not that node is the same node as the one you started from. In XPath 2.0 you could do it in one with

for $me in . return (.//*[@itemprop='name'][ancestor::*[@itemscope][1] is $me])

but 1.0 doesn't have the for $x in Y return Z construction for binding variables, or the is operator to compare node identity.
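
The two-step XPath 1.0 procedure described above might look roughly like this on the Python side. This is a sketch under stated assumptions, not a definitive implementation: it uses the standard library's xml.etree.ElementTree, whose XPath subset supports .//*[@itemprop='name'] but not the ancestor axis, so ancestor::*[@itemscope][1] is emulated with a parent map. The function names are hypothetical, and the sample again writes itemscope="" so that it parses as XML.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, made well-formed XML (an assumption).
SAMPLE = """
<div itemscope="" itemtype="http://schema.org/product">
  <div itemscope="" itemtype="http://schema.org/brand">
    <div itemprop="name">nike</div>
  </div>
  <div itemprop="name">shoes</div>
</div>
"""


def nearest_itemscope(node, parents):
    """Emulate ancestor::*[@itemscope][1] by walking up the parent map."""
    node = parents.get(node)
    while node is not None and "itemscope" not in node.attrib:
        node = parents.get(node)
    return node


def names_owned_by(scope, parents):
    """Names under scope whose nearest itemscope ancestor is scope itself."""
    return [
        el.text
        # Step 1: every descendant name of the starting node.
        for el in scope.findall(".//*[@itemprop='name']")
        # Step 2: keep it only if its nearest scope is the very same node
        # (Python's `is` plays the role of the XPath 2.0 `is` operator).
        if nearest_itemscope(el, parents) is scope
    ]


root = ET.fromstring(SAMPLE)
parents = {child: parent for parent in root.iter() for child in parent}
brand = root.find("./div[@itemscope]")

print(names_owned_by(root, parents))   # ['shoes']
print(names_owned_by(brand, parents))  # ['nike']
```

Unlike the single expression earlier, this works from any starting itemscope node, not just the outermost one, because each name is attributed to exactly one scope.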

xpath xpath-1.0
