Tuesday 15 July 2014

hadoop-1.0.3 SequenceFile.Writer overwrites instead of appending images into a SequenceFile




I am using Hadoop 1.0.3 (I can't upgrade right now, that's for later). I have around 100 images in HDFS and am trying to combine them into a single SequenceFile (defaults, no compression etc.).

Here's the code:

FSDataInputStream in = null;
BytesWritable value = new BytesWritable();
Text key = new Text();
Path inpath = new Path(fs.getHomeDirectory(), "/user/hduser/input");
Path seq_path = new Path(fs.getHomeDirectory(), "/user/hduser/output/file.seq");
FileStatus[] files = fs.listStatus(inpath);
SequenceFile.Writer writer = null;

for (FileStatus fileStatus : files) {
    inpath = fileStatus.getPath();
    try {
        in = fs.open(inpath);
        byte buffer[] = new byte[in.available()];
        in.read(buffer);
        writer = SequenceFile.createWriter(fs, conf, seq_path, key.getClass(), value.getClass());
        writer.append(new Text(inpath.getName()), new BytesWritable(buffer));
    } catch (Exception e) {
        System.out.println("Exception messages = " + e.getMessage());
        e.printStackTrace();
    }
}

This goes through the files in input/ and appends them one by one. However, it overwrites the sequence file instead of appending, and I only see the last image in the SequenceFile.

Note that I am not closing the writer before the loop ends. Can anyone help me, please? I am not sure how I can append the images.

Your main issue is with the following line:

writer = SequenceFile.createWriter(fs, conf, seq_path, key.getClass(), value.getClass());

which is inside the for loop, so you create a new writer on each pass. Each new writer replaces the previous file at path seq_path, which is why only the last image remains available.

Pull it out of the loop, and the problem should vanish.
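
For reference, here is a minimal sketch of the corrected flow, assuming the same fs, conf, inpath and seq_path variables as in the question: the writer is created once before the loop, each image is appended inside it, and the writer is closed once at the end.

// Create the writer once, before the loop, so every image is appended
// to the same SequenceFile instead of replacing it.
SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, seq_path,
        Text.class, BytesWritable.class);

try {
    for (FileStatus fileStatus : fs.listStatus(inpath)) {
        Path imagePath = fileStatus.getPath();
        FSDataInputStream in = fs.open(imagePath);
        try {
            // Same available()-based read as in the question.
            byte[] buffer = new byte[in.available()];
            in.readFully(buffer);
            // Key = file name, value = raw image bytes.
            writer.append(new Text(imagePath.getName()), new BytesWritable(buffer));
        } finally {
            in.close();
        }
    }
} finally {
    // Close once, after all images have been appended.
    writer.close();
}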

hadoop image-processing filesystems hdfs sequence
