Tuesday 15 January 2013

Java code to split a text file into chunks based on chunk size




I need to split a given text file into equally sized chunks and store them in an array. The input is a set of many text files in the same folder. I'm using the following code for this:

    // chunk (a String array) and chunkSize (an int) are declared elsewhere
    int inc = 0;
    File dir = new File("c:\\folder");
    File[] files = dir.listFiles();
    for (File f : files) {
        if (f.isFile()) {
            BufferedReader inputStream = null;
            try {
                inputStream = new BufferedReader(new FileReader(f));
                String line;
                while ((line = inputStream.readLine()) != null) {
                    String[] c = splitByLength(line, chunkSize);
                    for (int i = 0; i < c.length; i++) {
                        chunk[inc] = c[i];
                        inc++;
                    }
                }
            } finally {
                if (inputStream != null) {
                    inputStream.close();
                }
            }
        }
    }

    public static String[] splitByLength(String s, int chunkSize) {
        int arraySize = (int) Math.ceil((double) s.length() / chunkSize);
        String[] returnArray = new String[arraySize];
        int index = 0;
        for (int i = 0; i < s.length(); i = i + chunkSize) {
            if (s.length() - i < chunkSize) {
                returnArray[index++] = s.substring(i);
            } else {
                returnArray[index++] = s.substring(i, i + chunkSize);
            }
        }
        return returnArray;
    }

Here the chunk values are stored in the "chunk" array. The problem is that, since I used readLine() to parse the text file, the result is correct only if the chunk size does not exceed the number of characters in a line. Let's say every line has 10 characters and the file has 5 lines. If I provide a chunk size greater than 10, the file is still split into 5 chunks, with one line in each chunk.

For example, consider a file with the following contents:

abcdefghij
abcdefghij
abcdefghij
abcdefghij
abcdefghij

If chunk size = 5, then:

abcde | fghij | abcde | fghij | abcde | fghij | abcde | fghij | abcde | fghij |

If chunk size = 10, then:

abcdefghij | abcdefghij | abcdefghij | abcdefghij | abcdefghij |

If chunk size > 10, the code produces the same output as before:

abcdefghij | abcdefghij | abcdefghij | abcdefghij | abcdefghij |

I tried using RandomAccessFile and FileChannel, but I wasn't able to obtain the needed results. Can anyone help me solve this problem? Thank you.

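That's because BufferedReader.readLine() reads only one line at a time, not the whole file, so a chunk built from a single line can never be longer than that line.

I assume that the line break characters \r and \n are not part of the normal content you are interested in.

Maybe this helps.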

    // ...
    StringBuilder sb = new StringBuilder();
    String line;
    while ((line = inputStream.readLine()) != null) {
        sb.append(line);

        // if enough content has been read, extract a chunk
        while (sb.length() >= chunkSize) {
            String c = sb.substring(0, chunkSize);
            // do something with the string

            // keep the remaining content for the next chunk
            sb.delete(0, chunkSize);
        }
    }
    // that's the last chunk (it may be shorter than chunkSize, or empty)
    String c = sb.toString();
    // do something with the string
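In case it is useful, here is the same idea as a small self-contained sketch that collects the chunks into a list. The names FileChunker and chunk are made up for illustration and do not come from the question, and it keeps the assumption that line breaks are not part of the content. It takes any Reader, so the demo in main uses a StringReader holding the five example lines rather than a real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named here only for illustration.
final class FileChunker {

    // Splits everything readable from `in` into chunks of `chunkSize` characters.
    // Line terminators are dropped (readLine() strips them), so a chunk may
    // span several lines. The final chunk may be shorter than chunkSize.
    static List<String> chunk(Reader in, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            sb.append(line);
            // Emit as many full chunks as the buffer currently holds.
            while (sb.length() >= chunkSize) {
                chunks.add(sb.substring(0, chunkSize));
                sb.delete(0, chunkSize);
            }
        }
        // Whatever is left over is the last, possibly shorter, chunk.
        if (sb.length() > 0) {
            chunks.add(sb.toString());
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        // The five 10-character lines from the question, chunk size 15.
        String text = "abcdefghij\nabcdefghij\nabcdefghij\nabcdefghij\nabcdefghij\n";
        System.out.println(chunk(new StringReader(text), 15));
        // prints [abcdefghijabcde, fghijabcdefghij, abcdefghijabcde, fghij]
    }
}
```

For the files in the folder, you would pass something like new FileReader(f) for each file and close it afterwards, as the code in the question already does. If the chunks of all files should run together, the StringBuilder would have to live outside the per-file loop instead.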

java file chunks chunking
