regex - Python re.sub vs findall
My code:
import urllib
import re
xml = urllib.urlopen('url').read()
I am interested in removing some tags and their contents from the XML file, so I am trying to do it with regular expressions.
For instance:
re.findall(r'<fig(.*?)</fig>', xml, re.DOTALL)
returns the matches, and the result is non-empty.
However,
re.sub(r'<fig(.*?)</fig>', ' ', xml, re.DOTALL)
does nothing; the xml string is unchanged. I am confused as to why. Please help.
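The 4th parameter of re.sub is not flags, but count. The value of re.DOTALL is 16 (at least in Python 2.7 / 3.4), so re.sub(.., re.DOTALL) means "replace at most 16 times". No flag is applied at all, so "." still does not match newlines, and a <fig> element that spans several lines is never matched.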
Specifying flags as a keyword argument solves the problem:
re.sub(r'<fig(.*?)</fig>', ' ', xml, flags=re.DOTALL)
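To see the difference for yourself, here is a minimal sketch. The sample xml string is made up for illustration (it is not the asker's file), and the failing call is written as count=16, which is what passing re.DOTALL in the 4th position amounts to.

```python
import re

# Hypothetical sample document: two <fig> elements, each spanning several lines.
xml = "<p>a</p>\n<fig>\n  one\n</fig>\n<p>b</p>\n<fig>\n  two\n</fig>\n"
pattern = r'<fig(.*?)</fig>'

# re.DOTALL is just an integer flag; passed positionally it lands in count.
dotall_value = int(re.DOTALL)

# Equivalent to the failing call: count=16 and no flags, so "." stops at newlines.
without_flag = re.sub(pattern, ' ', xml, count=16)

# Flags passed by keyword: "." now matches newlines, both elements are replaced.
with_flag = re.sub(pattern, ' ', xml, flags=re.DOTALL)

print(dotall_value)
print(without_flag == xml)
print(repr(with_flag))
```

With this sample input the first call should leave the string as it was, while the second one replaces each <fig> element with a single space.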
In addition to that, re.sub returns the replaced string; it does not alter the 3rd argument in place. Make sure you assign the return value of the function.
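A short sketch of that last point, again with a made-up sample string and a hypothetical variable name (cleaned) for the result:

```python
import re

# Hypothetical sample input, not the asker's real file.
xml = "<doc><fig>\nplot\n</fig><p>text</p></doc>"

# The return value is discarded here, so xml keeps its old content
# (Python strings are immutable and cannot be changed in place).
re.sub(r'<fig(.*?)</fig>', ' ', xml, flags=re.DOTALL)
unchanged = xml

# Keep the result by assigning it to a name (or back to xml itself).
cleaned = re.sub(r'<fig(.*?)</fig>', ' ', xml, flags=re.DOTALL)

print(repr(unchanged))
print(repr(cleaned))
```

Writing xml = re.sub(...) instead of introducing a new name works just as well if the original text is no longer needed.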
python regex