Tuesday 15 May 2012

Custom Hadoop Distribution support for Spark components in Talend

I have a working cluster with a custom Hadoop 2.4 build, and I am trying to use the Talend Spark components. In the tSparkConnection component, I have set the relevant SparkHost and SparkHome values.

For the distribution, two options are available: Cloudera and Custom (Unsupported). When the Custom (Unsupported) distribution is selected, there is a provision to choose a custom Hadoop version and include the relevant libraries. The options available there are: Cloudera, HortonWorks, MapR, Apache, Amazon EMR, and PivotalHD. In my case, choosing Cloudera pulls in Hadoop 2.3, and I am assuming essential libraries are missing, hence the "NoClassDefFoundError" that prevents the job from loading a file into Spark via the Spark connection. By the way, the Spark version I have is 1.0.0.

I want to know how to fix this, and the right way to get this version of Spark running against Hadoop 2.4.

The error is copied and pasted below:

[statistics] connecting to socket on port 3637
[statistics] connected
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/api/java/JavaSparkContext
    at sparktest.sparktest_0_1.SparkTest.tSparkConnection_2Process(SparkTest.java:491)
    at sparktest.sparktest_0_1.SparkTest.runJobInTOS(SparkTest.java:1643)
    at sparktest.sparktest_0_1.SparkTest.main(SparkTest.java:1502)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.api.java.JavaSparkContext
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 3 more
[statistics] disconnected
Job SparkTest ended at 13:19 21/10/2014. [Exit code=1]

Thanks!

Yes, CDH 5.0.0 contains Hadoop 2.3. Hadoop 2.4.0 is on the roadmap, and it sounds like it will be available in a later CDH 5.x release.
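In the meantime, a common workaround is to build Spark 1.0.0 yourself against your Hadoop 2.4 version and make sure the resulting assembly jar is on the Talend job's classpath. The NoClassDefFoundError for JavaSparkContext suggests the Spark jars are not on the job classpath at all, not just a version mismatch. A sketch, assuming a Spark 1.0.0 source checkout; the jar path shown is illustrative and will differ on your machine:

```shell
# Build Spark 1.0.0 against a specific Hadoop version (here 2.4.0).
# The hadoop-2.4 profile and the hadoop.version property come from the
# Spark 1.0 "Building Spark with Maven" documentation; adjust the
# version string to match your cluster's custom Hadoop build.
mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

# Then put the resulting assembly jar (path is a placeholder) on the
# classpath of the exported Talend job so JavaSparkContext can be loaded:
export CLASSPATH="$CLASSPATH:/opt/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop2.4.0.jar"
```

This is a build/configuration fragment rather than a runnable script; the key point is that the Hadoop version Spark was compiled against, and the jars visible to the Talend job, must both match the cluster.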

Best.

hadoop apache-spark talend
