Wednesday, 15 January 2014

.net - PDF Convert to Black And White PNGs -




I'm trying to compress PDFs using iTextSharp. There are a lot of pages with color images stored as JPEGs (DCTDecode), so I'm converting them to black and white PNGs and replacing them in the document (the PNG is much smaller than the JPG for a black and white image).

I have the following methods:

    private static bool TryCompressPdfImages(PdfReader reader)
    {
        try
        {
            int n = reader.XrefSize;
            for (int i = 0; i < n; i++)
            {
                PdfObject obj = reader.GetPdfObject(i);
                if (obj == null || !obj.IsStream())
                {
                    continue;
                }

                var dict = (PdfDictionary)PdfReader.GetPdfObject(obj);
                var subType = (PdfName)PdfReader.GetPdfObject(dict.Get(PdfName.SUBTYPE));
                if (!PdfName.IMAGE.Equals(subType))
                {
                    continue;
                }

                var stream = (PRStream)obj;
                try
                {
                    var image = new PdfImageObject(stream);
                    Image img = image.GetDrawingImage();
                    if (img == null) continue;

                    using (img)
                    {
                        int width = img.Width;
                        int height = img.Height;

                        using (var msImg = new MemoryStream())
                        using (var bw = img.ToBlackAndWhite())
                        {
                            bw.Save(msImg, ImageFormat.Png);
                            msImg.Position = 0;
                            stream.SetData(msImg.ToArray(), false, PdfStream.NO_COMPRESSION);
                            stream.Put(PdfName.TYPE, PdfName.XOBJECT);
                            stream.Put(PdfName.SUBTYPE, PdfName.IMAGE);
                            stream.Put(PdfName.FILTER, PdfName.FLATEDECODE);
                            stream.Put(PdfName.WIDTH, new PdfNumber(width));
                            stream.Put(PdfName.HEIGHT, new PdfNumber(height));
                            stream.Put(PdfName.BITSPERCOMPONENT, new PdfNumber(8));
                            stream.Put(PdfName.COLORSPACE, PdfName.DEVICERGB);
                            stream.Put(PdfName.LENGTH, new PdfNumber(msImg.Length));
                        }
                    }
                }
                catch (Exception ex)
                {
                    Trace.TraceError(ex.ToString());
                }
                finally
                {
                    // may or may not help
                    reader.RemoveUnusedObjects();
                }
            }
            return true;
        }
        catch (Exception ex)
        {
            Trace.TraceError(ex.ToString());
            return false;
        }
    }

    public static Image ToBlackAndWhite(this Image image)
    {
        image = new Bitmap(image);
        using (Graphics gr = Graphics.FromImage(image))
        {
            var grayMatrix = new[]
            {
                new[] {0.299f, 0.299f, 0.299f, 0, 0},
                new[] {0.587f, 0.587f, 0.587f, 0, 0},
                new[] {0.114f, 0.114f, 0.114f, 0, 0},
                new[] {0f, 0, 0, 1, 0},
                new[] {0f, 0, 0, 0, 1}
            };

            var ia = new ImageAttributes();
            ia.SetColorMatrix(new ColorMatrix(grayMatrix));
            ia.SetThreshold((float)0.8); // change this threshold as needed
            var rc = new Rectangle(0, 0, image.Width, image.Height);
            gr.DrawImage(image, rc, 0, 0, image.Width, image.Height, GraphicsUnit.Pixel, ia);
        }
        return image;
    }

I've tried all varieties of ColorSpaces and BitsPerComponents, but I always get "Insufficient data for an image", "Out of memory", or "An error exists on this page" when trying to open the resulting PDF, so I must be doing something wrong. I'm pretty sure FlateDecode is the right thing to use.

Any assistance would be much appreciated.

The question:

You have a PDF with a colored JPG. For instance: image.pdf

If you look inside this PDF, you'll see that the filter of the image stream is /DCTDecode and the color space is /DeviceRGB.

Now you want to replace the image in the PDF, so that the result looks like this: image_replaced.pdf

In this PDF, the filter is /FlateDecode and the color space has changed to /DeviceGray.

In the conversion process, you want to use the PNG format.

The example:

I have made an example that performs this conversion: ReplaceImage

I will explain this example step by step:

Step 1: finding the image

In my example, I know that there's only one image, so I'm retrieving the PRStream with the image dictionary and the image bytes in a quick and dirty way.

    PdfReader reader = new PdfReader(src);
    PdfDictionary page = reader.getPageN(1);
    PdfDictionary resources = page.getAsDict(PdfName.RESOURCES);
    PdfDictionary xobjects = resources.getAsDict(PdfName.XOBJECT);
    PdfName imgRef = xobjects.getKeys().iterator().next();
    PRStream stream = (PRStream) xobjects.getAsStream(imgRef);

I go to the /XObject dictionary in the /Resources listed in the page dictionary of page 1. I take the first XObject I encounter, assuming that it is an image, and I get that image as a PRStream object.

Your code is better than mine, but this part of the code isn't relevant to your question and it works in the context of my example, so let's ignore the fact that it won't work for other PDFs. What you really care about are steps 2 and 3.

Step 2: converting the colored JPG into a black and white PNG

Let's write a method that takes a PdfImageObject and converts it into an Image object whose colors have been changed to gray and which is stored as a PNG:

    public static Image makeBlackAndWhitePng(PdfImageObject image) throws IOException, DocumentException {
        BufferedImage bi = image.getBufferedImage();
        BufferedImage newBi = new BufferedImage(bi.getWidth(), bi.getHeight(), BufferedImage.TYPE_USHORT_GRAY);
        newBi.getGraphics().drawImage(bi, 0, 0, null);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ImageIO.write(newBi, "png", baos);
        return Image.getInstance(baos.toByteArray());
    }

We convert the original image into a black and white image using standard BufferedImage manipulations: we draw the original image bi onto a new image newBi of type TYPE_USHORT_GRAY.

Once this is done, you want the image bytes in the PNG format. This is also done using standard ImageIO functionality: we just write the BufferedImage to a byte array, telling ImageIO that we want "png".
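The two standard-library steps described above can be tried without iText. The following is only a minimal sketch: the class name GrayPngSketch and its helper methods are made up for illustration, and it uses TYPE_BYTE_GRAY and TYPE_BYTE_BINARY rather than the TYPE_USHORT_GRAY used in the answer's method, purely to compare the resulting sizes.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper class, not part of iText or of the original example.
final class GrayPngSketch {

    /** Redraws the source onto a new image of the given BufferedImage type. */
    static BufferedImage redraw(BufferedImage src, int type) {
        BufferedImage out = new BufferedImage(src.getWidth(), src.getHeight(), type);
        Graphics2D g = out.createGraphics();
        try {
            g.drawImage(src, 0, 0, null); // color conversion happens while drawing
        } finally {
            g.dispose();
        }
        return out;
    }

    /** Encodes the image as PNG and returns the encoded file bytes. */
    static byte[] toPngBytes(BufferedImage img) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            if (!ImageIO.write(img, "png", baos)) {
                throw new IOException("No PNG writer available");
            }
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Build a small colored test image instead of reading one from a PDF.
        BufferedImage color = new BufferedImage(64, 64, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = color.createGraphics();
        g.setColor(Color.ORANGE);
        g.fillRect(0, 0, 64, 32);
        g.setColor(Color.BLUE);
        g.fillRect(0, 32, 64, 32);
        g.dispose();

        BufferedImage gray = redraw(color, BufferedImage.TYPE_BYTE_GRAY);
        BufferedImage bilevel = redraw(color, BufferedImage.TYPE_BYTE_BINARY);

        System.out.println("8-bit gray PNG bytes: " + toPngBytes(gray).length);
        System.out.println("1-bit PNG bytes:      " + toPngBytes(bilevel).length);
    }
}
```

TYPE_BYTE_BINARY produces a true 1-bit black and white image. Whether Java's default conversion matches the 0.8 threshold used in the C# code is not something this sketch checks.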

We can use the resulting bytes to create an Image object.

    Image img = makeBlackAndWhitePng(new PdfImageObject(stream));

Now we have an iText Image object, but please note that the image bytes stored in this Image object are no longer in the PNG format. As already mentioned in the comments, PNG is not supported in PDF. iText will change the image bytes into a format that is supported in PDF (for more details, see section 4.2.6.2 of The ABC of PDF).
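To make the difference concrete: a PNG file is a container with its own signature and chunks, whereas a /FlateDecode image stream holds the zlib-compressed pixel samples themselves. The sketch below is an illustration with the JDK only, not what iText does internally. The class RawFlateSketch and its methods are hypothetical, and the luminance weights are borrowed from the color matrix in the question.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class for illustration only.
final class RawFlateSketch {

    /** One 8-bit gray sample per pixel, rows from top to bottom. */
    static byte[] rawGraySamples(BufferedImage img) {
        int w = img.getWidth();
        int h = img.getHeight();
        byte[] samples = new byte[w * h];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Same luminance weights as the gray matrix in the question.
                samples[y * w + x] = (byte) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
            }
        }
        return samples;
    }

    /** Compresses data in zlib format (header plus deflate stream). */
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Decompresses zlib data; returns at most maxLength bytes. */
    static byte[] inflate(byte[] data, int maxLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        byte[] out = new byte[maxLength];
        int off = 0;
        try {
            while (!inflater.finished() && off < out.length) {
                int n = inflater.inflate(out, off, out.length - off);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unusable input
                }
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not valid zlib data", e);
        } finally {
            inflater.end();
        }
        return Arrays.copyOf(out, off);
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(100, 50, BufferedImage.TYPE_INT_RGB);
        byte[] raw = rawGraySamples(img);
        byte[] compressed = deflate(raw);
        System.out.println("raw samples: " + raw.length + ", deflated: " + compressed.length);
        System.out.println("round trip ok: " + Arrays.equals(raw, inflate(compressed, raw.length)));
    }
}
```

A viewer decoding a /FlateDecode stream expects to get the raw samples back, which is not what it gets if an entire PNG file is stored as the stream data.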

Step 3: replacing the original image stream with the new image stream

We now have an Image object, but what we really need is to replace the original image stream with a new one. We also need to adapt the image dictionary: /DCTDecode will change into /FlateDecode, /DeviceRGB will change into /DeviceGray, and the value of /Length will also be different.
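Why the dictionary entries have to agree with the data can be seen from a little arithmetic: the decoded stream must contain one sample per color component per pixel, with each row starting on a byte boundary. The sketch below only does that calculation. The class and method names are hypothetical, and treating a size mismatch as the cause of the errors in the question is an assumption, not something the source confirms.

```java
// Hypothetical helper for illustration; not an iText API.
final class ImageDataSizeSketch {

    /**
     * Number of decoded bytes a PDF image XObject needs for the given
     * /Width, /Height, color components and /BitsPerComponent.
     */
    static long expectedBytes(int width, int height, int components, int bitsPerComponent) {
        long bitsPerRow = (long) width * components * bitsPerComponent;
        long bytesPerRow = (bitsPerRow + 7) / 8; // rows are padded to a whole byte
        return bytesPerRow * height;
    }

    public static void main(String[] args) {
        // A 100 x 50 image under three different dictionary settings.
        System.out.println(expectedBytes(100, 50, 3, 8)); // /DeviceRGB, 8 bits: 15000
        System.out.println(expectedBytes(100, 50, 1, 8)); // /DeviceGray, 8 bits: 5000
        System.out.println(expectedBytes(100, 50, 1, 1)); // /DeviceGray, 1 bit: 650
    }
}
```

A dictionary that declares /DeviceRGB with 8 bits per component therefore asks for three times as many bytes as 8-bit gray data provides.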

You are creating the image stream and its dictionary manually. That's brave. I leave this job to iText's PdfImage object:

    PdfImage image = new PdfImage(makeBlackAndWhitePng(new PdfImageObject(stream)), "", null);

PdfImage extends PdfStream, and I can now replace the original stream with this new stream:

    public static void replaceStream(PRStream orig, PdfStream stream) throws IOException {
        orig.clear();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        stream.writeContent(baos);
        orig.setData(baos.toByteArray(), false);
        for (PdfName name : stream.getKeys()) {
            orig.put(name, stream.get(name));
        }
    }

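The order in which you do things here is important. You don't want the setData() method to tamper with the length and the filter, which is why the dictionary entries of the new stream are copied after the call to setData().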

Step 4: persisting the document after replacing the stream

I guess it's not hard to figure this part out:

    replaceStream(stream, image);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    stamper.close();
    reader.close();

Problem:

I am not a C# developer. I know PDF inside-out and I know Java.

If your problem is caused in step 2, then you'll have to post another question asking how to convert a colored JPEG image into a black and white PNG image. If your problem is caused in step 3 (for instance because you are using /DeviceRGB instead of /DeviceGray), then this answer will solve your problem.

.net pdf itextsharp system.drawing
