Saturday 15 March 2014

regex - In Python how to strip dollar signs and commas from dollar related fields only -




I'm reading in a big text file with lots of columns, some dollar related and some not, and I'm trying to figure out how to strip the dollar fields of the $ and , characters.

So say I have:

a|b|c
$1,000|hi,you|$45.43
$300.03|$ms2|$55,000

where a and c are dollar fields and b is not. The output needs to be:

a|b|c
1000|hi,you|45.43
300.03|$ms2|55000

I was thinking a regex would be the way to go, but I can't figure out how to express the replacement:

f = open('sample1_fixed.txt', 'wb')
for line in open('sample1.txt', 'rb'):
    new_line = re.sub(r'(\$\d+([,\.]\d+)?k?)', ????, line)
    f.write(new_line)
f.close()

Does anyone have an idea?

Thanks in advance.

A simple approach (Python 2, where str.translate accepts None plus a string of characters to delete):

>>> import re
>>> exp = r'\$\d+(,|\.)?\d+'
>>> s = '$1,000|hi,you|$45.43'
>>> '|'.join(i.translate(None, '$,') if re.match(exp, i) else i for i in s.split('|'))
'1000|hi,you|45.43'
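The same idea can be sketched for Python 3, where str.translate no longer takes a second argument of characters to delete. This is only a sketch under some assumptions that are not in the original post: the helper name strip_dollar_fields is hypothetical, a "dollar field" is taken to be a $ followed by digits with optional comma-grouped thousands and an optional decimal part, and the trailing k from the question's pattern is not handled. Each field is tested as a whole, so values such as $ms2 or hi,you are left untouched.

```python
import re

# Assumed definition of a dollar field: '$', digits, optional ',ddd' groups,
# optional decimal part. fullmatch() requires the whole field to fit.
DOLLAR_FIELD = re.compile(r'\$\d+(?:,\d{3})*(?:\.\d+)?')


def strip_dollar_fields(line, sep='|'):
    """Remove '$' and ',' only from fields that look like dollar amounts."""
    fields = line.rstrip('\n').split(sep)
    cleaned = [
        field.replace('$', '').replace(',', '')
        if DOLLAR_FIELD.fullmatch(field) else field
        for field in fields
    ]
    return sep.join(cleaned)


# The sample rows from the question.
for row in ['a|b|c', '$1,000|hi,you|$45.43', '$300.03|$ms2|$55,000']:
    print(strip_dollar_fields(row))
```

To process a file, one option is to open both files in text mode, call the helper on each input line, and write the result followed by a newline.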

python regex
