.NET - PDF Convert to Black and White PNGs
I'm trying to compress PDFs using iTextSharp. There are a lot of pages with color images stored as JPEGs (DCTDecode), so I'm converting them to black and white PNGs and replacing them in the document (the PNG is much smaller than the JPG for black and white format).
I have the following methods:
    private static bool TryCompressPdfImages(PdfReader reader)
    {
        try
        {
            int n = reader.XrefSize;
            for (int i = 0; i < n; i++)
            {
                PdfObject obj = reader.GetPdfObject(i);
                if (obj == null || !obj.IsStream())
                {
                    continue;
                }

                var dict = (PdfDictionary)PdfReader.GetPdfObject(obj);
                var subType = (PdfName)PdfReader.GetPdfObject(dict.Get(PdfName.SUBTYPE));
                if (!PdfName.IMAGE.Equals(subType))
                {
                    continue;
                }

                var stream = (PRStream)obj;
                try
                {
                    var image = new PdfImageObject(stream);
                    Image img = image.GetDrawingImage();
                    if (img == null) continue;

                    using (img)
                    {
                        int width = img.Width;
                        int height = img.Height;

                        using (var msImg = new MemoryStream())
                        using (var bw = img.ToBlackAndWhite())
                        {
                            bw.Save(msImg, ImageFormat.Png);
                            msImg.Position = 0;
                            stream.SetData(msImg.ToArray(), false, PdfStream.NO_COMPRESSION);
                            stream.Put(PdfName.TYPE, PdfName.XOBJECT);
                            stream.Put(PdfName.SUBTYPE, PdfName.IMAGE);
                            stream.Put(PdfName.FILTER, PdfName.FLATEDECODE);
                            stream.Put(PdfName.WIDTH, new PdfNumber(width));
                            stream.Put(PdfName.HEIGHT, new PdfNumber(height));
                            stream.Put(PdfName.BITSPERCOMPONENT, new PdfNumber(8));
                            stream.Put(PdfName.COLORSPACE, PdfName.DEVICERGB);
                            stream.Put(PdfName.LENGTH, new PdfNumber(msImg.Length));
                        }
                    }
                }
                catch (Exception ex)
                {
                    Trace.TraceError(ex.ToString());
                }
                finally
                {
                    // may or may not help
                    reader.RemoveUnusedObjects();
                }
            }
            return true;
        }
        catch (Exception ex)
        {
            Trace.TraceError(ex.ToString());
            return false;
        }
    }

    public static Image ToBlackAndWhite(this Image image)
    {
        image = new Bitmap(image);
        using (Graphics gr = Graphics.FromImage(image))
        {
            var grayMatrix = new[]
            {
                new[] {0.299f, 0.299f, 0.299f, 0, 0},
                new[] {0.587f, 0.587f, 0.587f, 0, 0},
                new[] {0.114f, 0.114f, 0.114f, 0, 0},
                new[] {0f, 0, 0, 1, 0},
                new[] {0f, 0, 0, 0, 1}
            };

            var ia = new ImageAttributes();
            ia.SetColorMatrix(new ColorMatrix(grayMatrix));
            ia.SetThreshold((float)0.8); // change the threshold as needed
            var rc = new Rectangle(0, 0, image.Width, image.Height);
            gr.DrawImage(image, rc, 0, 0, image.Width, image.Height, GraphicsUnit.Pixel, ia);
        }
        return image;
    }
I've tried various color spaces and BitsPerComponent values, but I always get "Insufficient data for an image", "Out of memory", or "An error exists on this page" when trying to open the resulting PDF, so I must be doing something wrong. I'm pretty sure FlateDecode is the right thing to use.
Any assistance would be much appreciated.
The question:
You have a PDF with a colored JPG. For instance: image.pdf
If you look inside this PDF, you'll see that the filter of the image stream is /DCTDecode and the color space is /DeviceRGB.
Now you want to replace the image in the PDF, so that the result looks like this: image_replaced.pdf
In this PDF, the filter is /FlateDecode and the color space has changed to /DeviceGray.
In the conversion process, you want to use the PNG format.
The example:
I have made an example that performs this conversion: ReplaceImage
I will explain this example step by step:
Step 1: Finding the image
In my example, I know that there's only one image, so I'm retrieving the PRStream with the image dictionary and the image bytes in a quick and dirty way.
    PdfReader reader = new PdfReader(src);
    PdfDictionary page = reader.getPageN(1);
    PdfDictionary resources = page.getAsDict(PdfName.RESOURCES);
    PdfDictionary xobjects = resources.getAsDict(PdfName.XOBJECT);
    PdfName imgRef = xobjects.getKeys().iterator().next();
    PRStream stream = (PRStream) xobjects.getAsStream(imgRef);
I go to the /XObject dictionary in the /Resources listed in the page dictionary of page 1. I take the first XObject I encounter, assuming that it is an image, and I get that image as a PRStream object.
Your code is better than mine, but this part of the code isn't relevant to your question and it works in the context of my example, so let's ignore the fact that it won't work for other PDFs. What you really care about are steps 2 and 3.
Step 2: Converting the colored JPG into a black and white PNG
Let's write a method that takes a PdfImageObject and converts it into an Image object that is changed to gray colors and stored as a PNG:
    public static Image makeBlackAndWhitePng(PdfImageObject image) throws IOException, DocumentException {
        BufferedImage bi = image.getBufferedImage();
        BufferedImage newBi = new BufferedImage(bi.getWidth(), bi.getHeight(), BufferedImage.TYPE_USHORT_GRAY);
        newBi.getGraphics().drawImage(bi, 0, 0, null);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ImageIO.write(newBi, "png", baos);
        return Image.getInstance(baos.toByteArray());
    }
We convert the original image into a black and white image using standard BufferedImage manipulations: we draw the original image bi onto a new image newBi of type TYPE_USHORT_GRAY.
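As a rough, self-contained illustration of that drawing step (plain JDK only, no iText), the sketch below paints a small RGB image onto a gray one. The class name GrayDrawSketch and the test colors are made up for this example, and TYPE_BYTE_GRAY (8 bits per sample) is used instead of TYPE_USHORT_GRAY only to keep the samples small. The exact gray values depend on the JDK's color conversion, so none are listed here.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class GrayDrawSketch {

    /** Draws the source image onto a new single-channel gray image of the same size. */
    static BufferedImage toGray(BufferedImage src) {
        BufferedImage gray = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        try {
            // The color conversion happens while the pixels are copied.
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return gray;
    }

    public static void main(String[] args) {
        // Hypothetical 2x1 test image: one red pixel, one white pixel.
        BufferedImage rgb = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        rgb.setRGB(0, 0, Color.RED.getRGB());
        rgb.setRGB(1, 0, Color.WHITE.getRGB());

        BufferedImage gray = toGray(rgb);
        System.out.println("bands: " + gray.getRaster().getNumBands());
        System.out.println("red   -> " + gray.getRaster().getSample(0, 0, 0));
        System.out.println("white -> " + gray.getRaster().getSample(1, 0, 0));
    }
}
```

The result has a single band per pixel, which is what a /DeviceGray color space describes.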
Once this is done, you want the image bytes in the PNG format. This is also done using standard ImageIO functionality: we just write the BufferedImage to a byte array, telling ImageIO that we want "png".
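For readers who want to try the encoding step in isolation, here is a minimal sketch using only the JDK. The class and method names are hypothetical. It encodes a blank gray image to PNG in memory, checks that the bytes start with the PNG signature, and decodes them again.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

public class PngBytesSketch {

    /** Encodes the image as PNG and returns the encoded bytes. */
    static byte[] toPngBytes(BufferedImage image) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // ImageIO.write returns false when no writer for the format is found.
        if (!ImageIO.write(image, "png", baos)) {
            throw new IOException("No PNG writer available");
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        BufferedImage gray = new BufferedImage(4, 4, BufferedImage.TYPE_BYTE_GRAY);
        byte[] png = toPngBytes(gray);

        // A PNG file begins with an 8-byte signature: 0x89 'P' 'N' 'G' ...
        boolean hasSignature = (png[0] & 0xFF) == 0x89
                && png[1] == 'P' && png[2] == 'N' && png[3] == 'G';
        System.out.println("PNG signature present: " + hasSignature);

        BufferedImage back = ImageIO.read(new ByteArrayInputStream(png));
        System.out.println(back.getWidth() + "x" + back.getHeight());
    }
}
```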
We can use the resulting bytes to create an Image object.
    Image img = makeBlackAndWhitePng(new PdfImageObject(stream));
Now we have an iText Image object, but please note that the image bytes stored in this Image object are no longer in the PNG format. As already mentioned in the comments, PNG is not supported in PDF. iText will change the image bytes into a format that is supported in PDF (for more details, see section 4.2.6.2 of The ABC of PDF).
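To make the difference concrete: a /FlateDecode image stream holds the raw pixel samples compressed with zlib/deflate, not a PNG file with its signature and chunks. The sketch below is not iText's internal code. It uses only java.util.zip to compress a tiny 8-bit gray image in that way and then inflates it again. The class and method names and the sample values are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class FlateSketch {

    /** Compresses raw samples into zlib format, the data a /FlateDecode stream carries. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates zlib data back into rawLength bytes of samples. */
    static byte[] inflate(byte[] compressed, int rawLength) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] raw = new byte[rawLength];
        int off = 0;
        while (off < rawLength && !inflater.finished()) {
            int n = inflater.inflate(raw, off, rawLength - off);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                break; // truncated or unsupported input
            }
            off += n;
        }
        inflater.end();
        return raw;
    }

    public static void main(String[] args) throws DataFormatException {
        // A 4x2 image, one byte per pixel (DeviceGray, 8 bits per component).
        byte[] samples = {0, 64, (byte) 128, (byte) 255, 10, 20, 30, 40};
        byte[] compressed = deflate(samples);
        byte[] restored = inflate(compressed, samples.length);
        System.out.println("round trip ok: " + Arrays.equals(samples, restored));
    }
}
```

Writing PNG file bytes into such a stream fails because the viewer tries to inflate the PNG signature and chunk headers as if they were zlib data.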
Step 3: Replacing the original image stream with the new image stream
We now have an Image object, but what we really need is to replace the original image stream with a new one, and we also need to adapt the image dictionary: /DCTDecode will change to /FlateDecode, /DeviceRGB will change to /DeviceGray, and the value of /Length will also be different.
You are creating the image stream and its dictionary manually. That's brave. I leave this job to iText's PdfImage object:
    PdfImage image = new PdfImage(makeBlackAndWhitePng(new PdfImageObject(stream)), "", null);
PdfImage extends PdfStream, so I can now replace the original stream with this new stream:
    public static void replaceStream(PRStream orig, PdfStream stream) throws IOException {
        orig.clear();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        stream.writeContent(baos);
        orig.setData(baos.toByteArray(), false);
        for (PdfName name : stream.getKeys()) {
            orig.put(name, stream.get(name));
        }
    }
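The order in which you do things here is important. You don't want the setData() method to tamper with the length and the filter, which is why the dictionary entries are copied after the data is set.
Step 4: Persisting the document after replacing the stream
I guess it's not hard to figure this part out: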
    replaceStream(stream, image);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    stamper.close();
    reader.close();
Problem:
I am not a C# developer. I know PDF inside-out and I know Java.
If your problem is caused in step 2, then you'll have to post another question asking how to convert a colored JPEG image into a black and white PNG image. If your problem is caused in step 3 (for instance because you are using /DeviceRGB instead of /DeviceGray), then this answer will solve your problem.
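A quick way to check the step 3 case by hand: the entries in the image dictionary imply how many bytes of decoded sample data a viewer will expect. The sketch below is plain arithmetic based on my reading of the PDF image rules (each row is padded to a whole number of bytes). The class name, method name and the 100x50 dimensions are assumptions for illustration only.

```java
public class ImageDataLength {

    /**
     * Number of decoded sample bytes implied by /Width, /Height, the number of
     * color components (3 for /DeviceRGB, 1 for /DeviceGray) and /BitsPerComponent.
     * Each row is rounded up to a whole byte.
     */
    static long expectedLength(int width, int height, int components, int bitsPerComponent) {
        long bitsPerRow = (long) width * components * bitsPerComponent;
        long bytesPerRow = (bitsPerRow + 7) / 8;
        return bytesPerRow * height;
    }

    public static void main(String[] args) {
        int width = 100;
        int height = 50;
        System.out.println(expectedLength(width, height, 3, 8)); // prints 15000 (DeviceRGB, 8 bits)
        System.out.println(expectedLength(width, height, 1, 8)); // prints 5000  (DeviceGray, 8 bits)
        System.out.println(expectedLength(width, height, 1, 1)); // prints 650   (DeviceGray, 1 bit)
    }
}
```

If the dictionary declares /DeviceRGB with 8 bits per component while the stream only decodes to gray samples, the viewer expects three times as much data as it gets, which would be consistent with an "Insufficient data for an image" message.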
.net pdf itextsharp system.drawing