regex - Python re.sub vs re.findall
My code:
    import urllib
    import re
    xml = urllib.urlopen('url').read()
I am interested in removing certain tags and their contents from an XML file, so I am trying to use regular expressions.
For instance:
    re.findall(r'<fig(.*?)</fig>', xml, re.DOTALL)
returns the matches, and the result is non-empty.
However,
    re.sub(r'<fig(.*?)</fig>', ' ', xml, re.DOTALL)
does nothing; the xml string is unchanged. I am confused as to why. Please help.
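The 4th parameter of re.sub is not flags but count. The value of re.DOTALL is 16 (at least in Python 2.7 / 3.4), so re.sub(..., re.DOTALL) means "replace at most 16 times". The DOTALL flag itself is never applied, so . does not match newlines and a <fig> element that spans several lines is not matched at all.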
Specifying flags as a keyword argument solves the problem:
    re.sub(r'<fig(.*?)</fig>', ' ', xml, flags=re.DOTALL)
In addition to that, re.sub returns the replaced string; it does not alter the 3rd argument in place. Make sure you assign the return value of the function.
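A minimal sketch of the difference, assuming a small made-up XML string in place of the downloaded document (the sample text and the variable names wrong and right are illustrative, not from the question). The count is written as a keyword here to make explicit what the positional call ends up meaning:

```python
import re

# Hypothetical sample standing in for the downloaded XML; each <fig> spans lines.
xml = "<doc><fig id='1'>\nline one\n</fig><p>keep me</p><fig id='2'>\nline two\n</fig></doc>"
pattern = r'<fig(.*?)</fig>'

# What re.sub(pattern, ' ', xml, re.DOTALL) amounts to: the flag's integer
# value is taken as count, and no flag is set, so '.' stops at newlines.
wrong = re.sub(pattern, ' ', xml, count=int(re.DOTALL))

# Passing the flag by keyword lets '.' match newlines as intended.
right = re.sub(pattern, ' ', xml, flags=re.DOTALL)

print(wrong == xml)  # True: nothing matched, nothing replaced
print(right)         # <doc> <p>keep me</p> </doc>
```

Note that xml itself is the same after both calls, because strings are immutable; only the returned value carries the replacements, so it has to be assigned (for example xml = re.sub(...)).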
python regex