How to get an element by index in a Spark RDD (Java)
I know the method rdd.first() gives me the first element in an RDD.
There is also the method rdd.take(num), which gives me the first "num" elements.
But isn't there a way to get an element by index?
Thanks.
This should be possible by first indexing the RDD. The transformation 'zipWithIndex' provides a stable indexing, numbering each element in its original order.
Given: rdd = (a,b,c)
val withIndex = rdd.zipWithIndex // ((a,0),(b,1),(c,2))
To look up an element by index, this form is not useful. First we need to use the index as the key:
val indexKey = withIndex.map{case (k,v) => (v,k)} // ((0,a),(1,b),(2,c))
Now it's possible to use the 'lookup' action in PairRDD to find an element by key:
val b = indexKey.lookup(1) // Seq(b)
If you're expecting to use lookup often on the same RDD, I'd recommend caching the indexKey RDD to improve performance.
How to do this using the Java API is an exercise left to the reader.
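For anyone who wants a starting point, here is a minimal sketch of the same three steps (zipWithIndex, swap to make the index the key, lookup) in the Java API. The class name RddIndexLookup, the helper indexByPosition, and the local master setting are assumptions made for illustration, not part of the original answer, and it assumes Java 8+ with spark-core on the classpath.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class RddIndexLookup {

    // Hypothetical helper: pairs each element with its position,
    // then swaps the pair so the position becomes the key.
    static <T> JavaPairRDD<Long, T> indexByPosition(JavaRDD<T> rdd) {
        return rdd.zipWithIndex()                                 // (element, index)
                  .mapToPair(t -> new Tuple2<>(t._2(), t._1()));  // (index, element)
    }

    public static void main(String[] args) {
        // Assumption: run locally; replace the master for a real cluster.
        SparkConf conf = new SparkConf()
                .setAppName("RddIndexLookup")
                .setMaster("local[*]");

        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> rdd = sc.parallelize(Arrays.asList("a", "b", "c"));

            // Cache the keyed RDD because it may be looked up repeatedly.
            JavaPairRDD<Long, String> indexKey = indexByPosition(rdd).cache();

            // lookup returns every value stored under the key; indices are
            // unique here, so the list has at most one element.
            List<String> b = indexKey.lookup(1L);
            System.out.println(b);
        }
    }
}
```

Note that the index is a Long in the Java API, so the key passed to lookup has to be written as 1L. Looking up an index that does not exist gives an empty list rather than an exception.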
java apache-spark