Regex to parse and replace img src in C#/.NET?
Ahoy,
I have a problem, see; I have strings like:
<img width="594" height="392" src="/sites/it_kb/siteassets/pages/exploding%20the%20vdi%20vdesktop/vdi3.png" alt="" style="margin:5px;width:619px;height:232px" />
They are not consistently formatted.
I need to parse strings like this, and return the following:
<img width="594" height="392" src="/exploding%20the%20vdi%20vdesktop-vdi3.png" alt="" style="margin:5px;width:619px;height:232px" />
Changes:
Remove everything except the immediate directory in which the image file lies. Instead of that directory being a subdirectory, prepend it onto the file name. So if the file is in /blabla/bla/blaaaaah/pickles/pickle.png
then I want the img src attribute to be pickles-pickle.png
Now, I've been trying to do this with regex, but after 3 hours, I've discovered that I'm... awful at regex. I could be at it for weeks, and I'd never get anywhere.
Thus, I am asking this wonderful community two things:
1. How do I do this? Is regex the right answer? I need to be able to parse the src attributes within img tags (whether or not they have height/width or other attributes).
2. What resources would you recommend for me to learn regex in .NET?

Now, for the problem at hand, I suppose with String.Replace I would....

1. Find the img tag, and the indexes of the surrounding '<' and '>'.
2. Find the index of 'src=' and of the ' ' (space) between those two instances.
3. Find the last index of '/' between the src and space indexes.
4. Find the second-last index of '/' between the src and space indexes.
5. Replace... er no, remove... everything before the second-last instance of '/'...
6. ...then String.Replace the remaining '/' with '-'.

....I.. think that'd do it? But it's damn ugly. Regex would be much prettier, don't you think?
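The index-based steps above can be sketched roughly as follows. This is only an illustration, not a tested production routine: the class and method names (`SrcPathFlattener`, `Flatten`, `RewriteTag`) are made up for this example, it assumes the src value is wrapped in double quotes (so it searches for the closing quote rather than a space), and it keeps the leading '/' as in the desired output shown earlier.

```csharp
using System;

public static class SrcPathFlattener
{
    // Hypothetical helper: keep only the last directory and the file name,
    // joined with '-', e.g. "/a/b/pickles/pickle.png" -> "/pickles-pickle.png".
    public static string Flatten(string src)
    {
        int lastSlash = src.LastIndexOf('/');
        if (lastSlash <= 0)
        {
            return src; // no parent directory to fold into the file name
        }

        // Search backwards, starting just before the last '/'.
        int secondLastSlash = src.LastIndexOf('/', lastSlash - 1);

        string dir = src.Substring(secondLastSlash + 1, lastSlash - secondLastSlash - 1);
        string file = src.Substring(lastSlash + 1);
        return "/" + dir + "-" + file;
    }

    // Hypothetical helper: rewrite the src value inside a single img tag.
    // Assumption: the attribute is written as src="..." with double quotes.
    public static string RewriteTag(string imgTag)
    {
        const string marker = "src=\"";
        int start = imgTag.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
        if (start < 0)
        {
            return imgTag; // no src attribute found
        }

        start += marker.Length;
        int end = imgTag.IndexOf('"', start);
        if (end < 0)
        {
            return imgTag; // unterminated attribute, leave the tag alone
        }

        string src = imgTag.Substring(start, end - start);
        return imgTag.Substring(0, start) + Flatten(src) + imgTag.Substring(end);
    }
}
```

Calling `SrcPathFlattener.RewriteTag` on the sample img tag from the question should leave every attribute other than src untouched. A plain substring search like this can also hit attributes such as `data-src`, which is one reason an HTML parser is the safer way to locate the tags.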
Any advice?
Note: I tagged this 'homework', but it's not homework. I'm volunteering to work after-hours to save the company 200k. This is literally the last piece of an incredibly convoluted (to me) puzzle. Of course, I don't see a penny of that 200k, but I'm doing it anyway.
To find the img tags, I suggest using HtmlAgilityPack. It's safer than running a regex over an entire HTML page.
Use this to get the image nodes:
    // requires: using HtmlAgilityPack;
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(html);
    var imgs = doc.DocumentNode.SelectNodes("//img"); // null when no img tags are found
Use this to get/set the attributes:
    foreach (var img in imgs)
    {
        string orig = img.Attributes["src"].Value;
        // Do the replacements on orig to get a new string, newSrc.
        img.SetAttributeValue("src", newSrc);
    }
So, what kind of replacements should you do? I agree that using regex here is much more elegant. Things like these are what it's for, after all!
Something like this should do the trick:
    // requires: using System.Text.RegularExpressions;
    string s = @"/sites/it_kb/siteassets/pages/exploding%20the%20vdi%20vdesktop/vdi3.png";
    string n = Regex.Replace(s, @"(.*?)\/([^\/]*?)\/([^\/]*?)$", @"/$2-$3");
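To make the replacement easier to reuse and to check against a few inputs, the regex above could be wrapped in a small helper. This is a sketch under assumptions: the names `SrcRegexRewriter` and `Rewrite` are invented here, and the pattern is a slightly simplified variant of the one above (anchored at the start, without the optional `\/` escapes, and with only the two groups that are actually used).

```csharp
using System;
using System.Text.RegularExpressions;

public static class SrcRegexRewriter
{
    // Captures the last directory ($1) and the file name ($2) of a path.
    // Anything before the last directory is matched lazily and discarded.
    private static readonly Regex LastTwoSegments =
        new Regex(@"^.*?/([^/]*)/([^/]*)$");

    // Hypothetical helper: "/a/b/pickles/pickle.png" -> "/pickles-pickle.png".
    // Paths containing fewer than two '/' do not match and are returned unchanged.
    public static string Rewrite(string src)
    {
        return LastTwoSegments.Replace(src, "/$1-$2");
    }
}
```

Inside the foreach loop from earlier, `newSrc` would then be `SrcRegexRewriter.Rewrite(orig)`. Note that a relative path such as `pickles/pickle.png` contains only one '/', so this pattern leaves it as it is.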
Some resources you can use to learn C# regexing:
Dot Net Perls: Regex.Match
MSDN: Regex.Match method
MSDN: regex cheat sheet