Saturday 15 February 2014

Using Python to manipulate data from a CSV, only applying it to the first result




I have a CSV and I'm attempting to build a little Python script to 'convert' it (basically, to prepare the data in an acceptable format).

I'm hitting a bit of a road block, as I need to identify the first result out of 'blocks' of results.

For example:

    aabbccdd-1.2-2.4-2.6
    aabbccdd-1.2-2.4-2.6
    aabbccdd-1.2-2.4-2.6
    aabbccdd-1.2-2.4-2.6
    eeffgghhii-2.4-5.6-7.5

The first part (preceding the first dash) has a variable length, and it is the way to identify an 'individual' listing in this particular database. I want to insert a flag in a separate column that marks the first row of each cluster of rows sharing the same code.

There are several hundred to a thousand listings, so I can't come up with a list to search through.

Thanks for any help.

If the data is grouped as shown, itertools.groupby can iterate over the ordered data, grouping rows on a common key:

    import csv
    import itertools
    import operator

    data1 = '''\
    aabbccdd-1.2-2.4-2.6
    aabbccdd-1.2-2.4-2.6
    aabbccdd-1.2-2.4-2.6
    aabbccdd-1.2-2.4-2.6
    eeffgghhii-2.4-5.6-7.5
    '''

    data2 = '''\
    shirt-red
    shirt-blue
    shirt-green
    shoe-red
    shoe-blue
    '''

    def setup():
        '''Generate sample input files.'''
        with open('sample1.hsv', 'w') as f:
            f.write(data1)
        with open('sample2.hsv', 'w') as f:
            f.write(data2)

    def process(infile, outfile):
        with open(infile, 'r', newline='') as ifile, open(outfile, 'w', newline='') as ofile:
            r = csv.reader(ifile, delimiter='-')
            w = csv.writer(ofile, delimiter=',')
            # The key is the first column (offset 0).
            # groupby yields (key, group) pairs, where group is an iterator
            # over the consecutive lines that have the same key.
            for key, group in itertools.groupby(r, operator.itemgetter(0)):
                # Add a final column to the row list: 1 for the first item.
                w.writerow(next(group) + [1])
                # The remaining items in the group get 0 in the new column.
                for other in group:
                    w.writerow(other + [0])

    if __name__ == '__main__':
        setup()
        process('sample1.hsv', 'sample1.csv')
        process('sample2.hsv', 'sample2.csv')

Results:

sample1.hsv

    aabbccdd-1.2-2.4-2.6
    aabbccdd-1.2-2.4-2.6
    aabbccdd-1.2-2.4-2.6
    aabbccdd-1.2-2.4-2.6
    eeffgghhii-2.4-5.6-7.5

sample1.csv

    aabbccdd,1.2,2.4,2.6,1
    aabbccdd,1.2,2.4,2.6,0
    aabbccdd,1.2,2.4,2.6,0
    aabbccdd,1.2,2.4,2.6,0
    eeffgghhii,2.4,5.6,7.5,1

sample2.hsv

    shirt-red
    shirt-blue
    shirt-green
    shoe-red
    shoe-blue

sample2.csv

    shirt,red,1
    shirt,blue,0
    shirt,green,0
    shoe,red,1
    shoe,blue,0
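
The groupby approach above relies on rows with the same key being adjacent, since groupby starts a new group every time the key changes. If that cannot be guaranteed, one possible alternative (not part of the original answer) is to remember which keys have already been seen. This is only a minimal sketch, assuming the input fits in memory; flag_first_seen is a hypothetical helper name, and the sample data is made up for illustration:

```python
import csv
import io

def flag_first_seen(lines, delimiter='-'):
    """Return rows with a trailing 1 for the first row of each key, else 0.

    Hypothetical helper: unlike itertools.groupby, it does not need rows
    that share a key to be adjacent.
    """
    seen = set()   # keys (first column) encountered so far
    flagged = []
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:          # csv.reader yields [] for blank lines; skip them
            continue
        key = row[0]
        flagged.append(row + [0 if key in seen else 1])
        seen.add(key)
    return flagged

# Illustrative input where the two keys are interleaved rather than grouped.
sample = io.StringIO('shirt-red\nshoe-red\nshirt-blue\nshoe-blue\n')
rows = flag_first_seen(sample)
for row in rows:
    print(row)
# ['shirt', 'red', 1]
# ['shoe', 'red', 1]
# ['shirt', 'blue', 0]
# ['shoe', 'blue', 0]
```

The trade-off is that every distinct key is kept in a set, whereas groupby only looks at the current run of rows. For a few hundred to a thousand listings that should not matter.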

python csv
