Thursday, 15 January 2015

Different results for raw MD5 base64 encoded string between PHP and Clojure (Java) code for some characters



I have a server that creates a hash using the following code:

base64_encode(md5("some value", true))

I have to produce the same hash value in Clojure (using Java interop). I created the following Clojure functions:

(defn md5-raw [s]
  (let [algorithm (java.security.MessageDigest/getInstance "MD5")
        size (* 2 (.getDigestLength algorithm))]
    (.digest algorithm (.getBytes s))))

(defn bytes-to-base64-string [bytes]
  (String. (b64/encode bytes) "UTF-8"))

Then I use the code this way:

(bytes-to-base64-string (md5-raw "some value"))

Now, this works fine for normal strings. However, after processing many different examples, I found that the following character causes issues:

’

This is Unicode character #8217 (U+2019, the right single quotation mark).

If I run the following PHP code:

base64_encode(md5("’", true))

what is returned is:

yoy9/y97p/gfapvelvqaha==

If I run the following Clojure code:

(bytes-to-base64-string (md5-raw "’"))

I get the following value:

af1zconzutegrn2yxakpoq==

Why is that? I suspect a character encoding issue, but as far as I can see everything is handled as UTF-8.

Not everything in your example is guaranteed to be UTF-8. The following call depends on the platform's default charset:

(.getBytes s)

You should (well, depending on your use case) use:

(.getBytes s "UTF-8")
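
For comparison, the same idea can be sketched in plain Java, which is what the Clojure interop calls resolve to. This is only a minimal sketch: the class name Md5Base64 and the helper md5Base64 are hypothetical, and java.util.Base64 (Java 8+) is assumed to be available in place of the b64 library used in the question. Passing the charset explicitly removes the dependency on the platform default, so the result should line up with PHP's base64_encode(md5($s, true)) when the PHP string is UTF-8 encoded.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class Md5Base64 {
    // Hypothetical helper: Base64 of the raw (binary) MD5 digest of s,
    // with s converted to bytes using an explicitly chosen charset.
    public static String md5Base64(String s, Charset charset) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(charset));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm name, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u2019 is the right single quotation mark from the question;
        // the escape avoids depending on the source file's encoding.
        String quote = "\u2019";
        System.out.println(md5Base64(quote, StandardCharsets.UTF_8));
        System.out.println(md5Base64(quote, Charset.forName("windows-1250")));
    }
}
```

The two printed values differ because the digest is computed over different byte sequences, not because MD5 or Base64 behave differently between the two platforms.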

Demonstration:

(defn md5-with-charset [s charset]
  (let [algorithm (java.security.MessageDigest/getInstance "MD5")]
    (.digest algorithm (.getBytes s charset))))

(b64 (md5-with-charset "’" "UTF-8"))  ;; => "yoy9/y97p/gfapvelvqaha=="
(b64 (md5-with-charset "’" "ASCII"))  ;; => "0uv7csp7mjomcrja7z6rxq=="
(b64 (md5-with-charset "’" "UTF-16")) ;; => "3clvthylt2kkrocdupxipg=="
(b64 (md5-with-charset "’" "UTF-32")) ;; => "ihbmmmzkwtbpu+n8gchitq=="

(where b64 is the Base64 encoding step)

And here is the charset that reproduces your Clojure result:

(b64 (md5-with-charset "’" "windows-1250")) ;; => "af1zconzutegrn2yxakpoq=="
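
To see why the digests differ, it can help to look at the bytes that each charset produces for this one character, since MD5 only ever sees bytes. The following Java sketch uses a hypothetical class QuoteBytes with a hypothetical hex helper. It also prints the JVM's default charset, which is what a bare (.getBytes s) falls back to. It assumes the windows-1250 charset is available, as it is in standard full JDK builds.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuoteBytes {
    // Hypothetical helper: lowercase hex dump of the bytes s encodes to.
    public static String hex(String s, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(charset)) {
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String quote = "\u2019"; // right single quotation mark

        // The charset that getBytes() without arguments would use.
        System.out.println("default: " + Charset.defaultCharset());

        System.out.println(hex(quote, StandardCharsets.UTF_8));          // e28099
        System.out.println(hex(quote, Charset.forName("windows-1250"))); // 92
        System.out.println(hex(quote, StandardCharsets.US_ASCII));       // 3f
    }
}
```

UTF-8 needs three bytes for the character, windows-1250 maps it to the single byte 0x92, and US-ASCII cannot represent it at all and substitutes a question mark. Plain ASCII strings such as "some value" encode to the same bytes in all three, which is why only certain characters exposed the problem.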

java php encoding utf-8 clojure
