regex - In Python, how to strip dollar signs and commas from dollar-related fields only
I'm reading in a big text file with lots of columns, some dollar-related and some not, and I'm trying to figure out how to strip the dollar fields of the $ and , characters.
So say I have:
    a|b|c
    $1,000|hi,you|$45.43
    $300.03|$ms2|$55,000
where a and c are dollar fields and b is not. The output needs to be:
    a|b|c
    1000|hi,you|45.43
    300.03|$ms2|55000
I'm thinking regex is the way to go, but I can't figure out how to express the replacement:
    f = open('sample1_fixed.txt', 'wb')
    for line in open('sample1.txt', 'rb'):
        new_line = re.sub(r'(\$\d+([,\.]\d+)?k?)', ????, line)
        f.write(new_line)
    f.close()
Does anyone have an idea?
Thanks in advance.
A simple approach:
    >>> import re
    >>> exp = r'\$\d+(,|\.)?\d+'
    >>> s = '$1,000|hi,you|$45.43'
    >>> '|'.join(i.translate(None, '$,') if re.match(exp, i) else i for i in s.split('|'))
    '1000|hi,you|45.43'
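The same idea can be sketched for Python 3, where the two-argument `str.translate(None, '$,')` call is not available. This is only a sketch under some assumptions: the helper name `strip_dollar_fields` and the pattern `DOLLAR_FIELD` are hypothetical, and the pattern encodes a guess at what counts as a dollar field (a leading `$`, digits with optional commas, an optional decimal part). It uses `re.fullmatch` rather than `re.match`, so a field is cleaned only when the whole field looks like an amount, and a single-digit value such as `$5` is also accepted.

```python
import re

# Assumed definition of a "dollar field": a leading $, digits with optional
# comma grouping, and an optional decimal part. Adjust to fit the real data.
DOLLAR_FIELD = re.compile(r'\$\d[\d,]*(?:\.\d+)?')


def strip_dollar_fields(line, sep='|'):
    """Remove '$' and ',' only from fields that look like dollar amounts."""
    fields = line.rstrip('\n').split(sep)
    cleaned = [
        # Clean the field only if the entire field matches the pattern;
        # everything else (e.g. 'hi,you' or '$ms2') is passed through as-is.
        f.replace('$', '').replace(',', '') if DOLLAR_FIELD.fullmatch(f) else f
        for f in fields
    ]
    return sep.join(cleaned)


if __name__ == '__main__':
    # Sample rows taken from the question.
    for row in ['a|b|c', '$1,000|hi,you|$45.43', '$300.03|$ms2|$55,000']:
        print(strip_dollar_fields(row))
```

To process a whole file, the function could be applied line by line inside `with open(...)` blocks opened in text mode, writing `strip_dollar_fields(line) + '\n'` for each input line.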
python regex