Different results for raw MD5 base64 encoded string between PHP and Clojure (Java) code for some characters
I have a server that creates a hash using the following code:
base64_encode(md5("some value", true))
I have to produce the same hash value in Clojure (using Java interop). I created the following Clojure functions:
(defn md5-raw [s]
  (let [algorithm (java.security.MessageDigest/getInstance "MD5")
        size (* 2 (.getDigestLength algorithm))]
    (.digest algorithm (.getBytes s))))

(defn bytes-to-base64-string [bytes]
  (String. (b64/encode bytes) "UTF-8"))
Then I use the code this way:
(bytes-to-base64-string (md5-raw "some value"))
Now, this works fine for normal strings. However, after processing multiple different examples, I found that the following character causes issues:
’
This is the character #8217 (U+2019, right single quotation mark).
If I run the following PHP code:
base64_encode(md5("’", true))
What is returned is:
yoy9/y97p/gfapvelvqaha==
If I run the following Clojure code:
(bytes-to-base64-string (md5-raw "’"))
I get the following value:
af1zconzutegrn2yxakpoq==
Why is that? I suspect a character encoding issue, but everything appears to be handled as UTF-8 as far as I can see.
It is not guaranteed to be UTF-8 in your example, because the following call depends on the platform's default charset:
(.getBytes s)
You should (well, it depends on your use case) use:
(.getBytes s "UTF-8")
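The same point can be sketched in plain Java, since the Clojure code only calls into the JDK. This is a minimal illustration, not the original poster's code: the class name `CharsetBytesSketch` and its helper methods are hypothetical, and `java.util.Base64` stands in for the `b64/encode` function used above. It shows that the bytes handed to MD5 change with the charset, so the resulting hash changes too.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
final class CharsetBytesSketch {

    // Raw MD5 of s encoded with an explicit charset, then Base64-encoded.
    static String md5Base64(String s, Charset charset) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(charset));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Lowercase hex dump of a byte array, to make the encoded bytes visible.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+2019 written as an escape so the source file encoding cannot interfere.
        String s = "\u2019";

        // Three bytes in UTF-8, one replacement byte ('?') in US-ASCII.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));    // e28099
        System.out.println(hex(s.getBytes(StandardCharsets.US_ASCII))); // 3f

        // Different input bytes, therefore different hashes.
        System.out.println(md5Base64(s, StandardCharsets.UTF_8));
        System.out.println(md5Base64(s, StandardCharsets.US_ASCII));
    }
}
```

Passing the charset explicitly, as in `md5Base64(s, StandardCharsets.UTF_8)`, keeps the result independent of the machine the code runs on.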
Demonstration:
(defn md5-with-charset [s charset]
  (let [algorithm (java.security.MessageDigest/getInstance "MD5")]
    (.digest algorithm (.getBytes s charset))))

(b64 (md5-with-charset "’" "UTF-8"))  ;; => "yoy9/y97p/gfapvelvqaha=="
(b64 (md5-with-charset "’" "ASCII"))  ;; => "0uv7csp7mjomcrja7z6rxq=="
(b64 (md5-with-charset "’" "UTF-16")) ;; => "3clvthylt2kkrocdupxipg=="
(b64 (md5-with-charset "’" "UTF-32")) ;; => "ihbmmmzkwtbpu+n8gchitq=="
(where b64 is the Base64 encoding step)
And here is the charset that reproduces your result:
(b64 (md5-with-charset "’" "windows-1250")) ;; => "af1zconzutegrn2yxakpoq=="
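To see which charset a no-argument `getBytes` call picks up on a given machine, the JVM's default can be inspected directly. The following is a small sketch under the assumption that the code runs on a standard JDK. The class name `DefaultCharsetCheck` and its method are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
final class DefaultCharsetCheck {

    // True when s.getBytes() yields the same bytes as encoding with the
    // JVM default charset explicitly, which is what the no-argument call does.
    static boolean noArgUsesDefault(String s) {
        return Arrays.equals(s.getBytes(), s.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        // Prints the charset that a bare getBytes() call will use on this JVM.
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("No-arg getBytes uses it: " + noArgUsesDefault("\u2019"));
    }
}
```

On a JVM whose default charset is windows-1250, this would explain the mismatching hash above. Newer JDKs (18 and later) default to UTF-8, but passing the charset explicitly avoids depending on that.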