Monday 15 August 2011

eTag algorithm for multipart S3 uploads in Java?

I understand, in theory, the algorithm for generating an S3 multipart upload ETag, but I'm not getting the expected results. Can anyone help?

The theory of multipart upload ETags (at least as I understand it):

Take the MD5 of each uploaded part and concatenate them. Then take the MD5 of the concatenated MD5s. Finally, append a "-" and the number of parts uploaded.

Note: the example below uses made-up MD5 values. The resulting MD5 is not the actual MD5 of the part MD5s.

e.g.

283771245d05b26c35768d1f182fbac0 - file part 1's MD5
673c3f1ad03d60ea0f64315095ad7131 - file part 2's MD5
11c68be603cbe39357a0f93be6ab9e2c - file part 3's MD5

Concatenated MD5s: 283771245d05b26c35768d1f182fbac0673c3f1ad03d60ea0f64315095ad713111c68be603cbe39357a0f93be6ab9e2c

The MD5 of the concatenated string above, plus a dash and the number of file parts: 115671880dfdfe8860d6aabd09139708-3
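The combination step described above can be sketched in Java. Note one assumption that differs from the illustrative hex example: the real S3 algorithm concatenates the raw 16-byte binary digests of the parts, not their hex string representations, before taking the final MD5.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class MultipartEtag {
    // Sketch: combine per-part MD5 digests into an S3-style multipart ETag.
    // Concatenates the raw 16-byte digests (not hex strings), MD5s the
    // concatenation, then appends "-" plus the part count.
    public static String combine(byte[][] partDigests) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        for (byte[] digest : partDigests) {
            md.update(digest);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex + "-" + partDigests.length;
    }
}
```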

To do this in Java I've tried two methods, neither of which returns the right ETag value:

```java
int mb = 1048576;
int bufferSize = 5 * mb;
byte[] buffer = new byte[bufferSize];

try { // string method
    FileInputStream fis = new FileInputStream(new File(filename));
    int bytesRead;
    String md5s = "";
    do {
        bytesRead = fis.read(buffer);
        String md5 = org.apache.commons.codec.digest.DigestUtils.md5Hex(new String(buffer));
        md5s += md5;
    } while (bytesRead == bufferSize);
    System.out.println(org.apache.commons.codec.digest.DigestUtils.md5Hex(md5s));
    fis.close();
} catch (Exception e) {
    System.out.println(e);
}

try { // byte array method
    FileInputStream fis = new FileInputStream(new File(filename));
    int bytesRead;
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    do {
        bytesRead = fis.read(buffer);
        byteArrayOutputStream.write(org.apache.commons.codec.digest.DigestUtils.md5(buffer));
    } while (bytesRead == bufferSize);
    System.out.println(org.apache.commons.codec.digest.DigestUtils.md5Hex(byteArrayOutputStream.toByteArray()));
    fis.close();
} catch (Exception e) {
    System.out.println(e);
}
```

Can anyone spot why neither algorithm is working?

You should use the byte-oriented method.

But it fails because of:

```java
} while ( bytesRead == bufferSize );
```

which breaks when the file consists of exactly x full parts: the last successful read fills the buffer, so the loop runs one extra time, `read` returns -1, and the unchanged buffer is hashed again as a bogus extra part.

It also fails at:

```java
bytearrayoutputstream.write( org.apache.commons.codec.digest.DigestUtils.md5( buffer ) );
```

which hashes the whole buffer even when the final read only partially fills it, i.e. when the file does not consist of exactly x full parts.

In other words, it fails either way.
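A corrected byte-oriented loop, as a sketch under the assumptions above: it hashes only the bytes actually read and stops at end of stream, handling both a partial final part and files that are an exact multiple of the part size. (The result matches S3's ETag only when the upload used the same part size.)

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class MultipartEtagFixed {
    public static String etagFor(String filename, int partSize)
            throws IOException, NoSuchAlgorithmException {
        byte[] buffer = new byte[partSize];
        ByteArrayOutputStream digests = new ByteArrayOutputStream();
        int parts = 0;
        try (FileInputStream fis = new FileInputStream(filename)) {
            int bytesRead;
            // Stop when read() signals end of stream, and hash only the
            // bytes actually read, so a short final part is digested correctly
            // and no extra bogus part is appended.
            while ((bytesRead = fis.read(buffer)) > 0) {
                MessageDigest md = MessageDigest.getInstance("MD5");
                md.update(buffer, 0, bytesRead);
                digests.write(md.digest());
                parts++;
            }
        }
        // MD5 of the concatenated raw part digests, plus "-" and the part count.
        MessageDigest md = MessageDigest.getInstance("MD5");
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(digests.toByteArray())) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex + "-" + parts;
    }
}
```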

java algorithm amazon-web-services hash amazon-s3
