Thursday 15 May 2014

regex - Python re.sub vs findall




My code:

import urllib
import re
xml = urllib.urlopen('url').read()

I am interested in removing certain tags and their contents from the XML file, so I am trying to use regular expressions.

For instance:

re.findall(r'<fig(.*?)</fig>', xml, re.DOTALL)

returns matches, and the result is non-empty.

However,

re.sub(r'<fig(.*?)</fig>', ' ', xml, re.DOTALL)

does nothing; the xml string is unchanged. I am confused as to why. Please help.

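The 4th parameter of re.sub is not flags but count. The value of re.DOTALL is 16 (at least in Python 2.7 / 3.4), so re.sub(..., re.DOTALL) means "replace at most 16 times" with no flags set. Because DOTALL is not actually in effect, '.' does not match newlines, and a <fig> element that spans several lines is never matched.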
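
The behaviour described above can be checked with a small sketch. The two sample strings are made up for illustration (they are not from the original XML), and the warning filter is only there because newer Python versions may warn when count is passed positionally.

```python
import re
import warnings

pattern = r'<fig(.*?)</fig>'

# Hypothetical inputs: one <fig> on a single line, one spanning several lines.
single_line = "<doc><fig>a</fig><p>text</p></doc>"
multi_line = "<doc><fig>\na\n</fig><p>text</p></doc>"

# re.DOTALL is an integer flag whose value is 16.
dotall_value = int(re.DOTALL)

with warnings.catch_warnings():
    # Newer Python versions may warn about a positional count; silence that here.
    warnings.simplefilter("ignore")
    # The 4th positional argument is count, so both calls mean count=16, flags=0.
    out_single = re.sub(pattern, ' ', single_line, re.DOTALL)
    out_multi = re.sub(pattern, ' ', multi_line, re.DOTALL)

print(dotall_value)                              # 16
print(out_single == "<doc> <p>text</p></doc>")   # True: matched without DOTALL
print(out_multi == multi_line)                   # True: nothing was replaced
```

The single-line case is still replaced because it does not need DOTALL at all, which is why the mistake only shows up on input where the tag contents contain newlines.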

Specifying flags as a keyword argument solves the problem:

re.sub(r'<fig(.*?)</fig>', ' ', xml, flags=re.DOTALL)

In addition to that, re.sub returns the replaced string; it does not alter the 3rd argument in place. Make sure you assign the return value of the function.
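
As a minimal sketch of that last point, assuming a made-up sample string in place of the downloaded XML: calling re.sub and discarding the result leaves the original string as it was, while assigning the return value gives the cleaned text.

```python
import re

# Hypothetical sample standing in for the downloaded XML.
original = "<doc><fig>\nplot\n</fig><p>text</p></doc>"
xml = original

# The result is discarded here, so xml still refers to the original string.
re.sub(r'<fig(.*?)</fig>', ' ', xml, flags=re.DOTALL)
unchanged = (xml == original)

# Assign the return value to keep the replaced string.
xml = re.sub(r'<fig(.*?)</fig>', ' ', xml, flags=re.DOTALL)

print(unchanged)  # True
print(xml)        # <doc> <p>text</p></doc>
```

Python strings are immutable, so no function can modify xml in place; rebinding the name to the returned value is the only way to see the change.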

python regex
