linux - Find links with regex
I am trying to learn Linux commands and regular expressions, and I am stuck on a little problem. I am trying to find a series of links within a file using sed and regular expressions. Can anyone help me work out where I am going wrong? The links are:
<a href="../a-lot-of-different/words-that/should-link.html">useful links</a>
<a href="..//a-lot-of-different/words-that/should-find-lots-of-links.html">multiple links</a>
<a href="../another-word-and-links/multiple-words/sjshfi-dfg.html">more links</a>
This is what I have:
sed -n '/<a*href=”^[../"]*\([a-z]*\)^[.html](["]*\)/p' /file > newfile
Regular expressions are less than ideal for parsing HTML.
You didn't show your desired output. I am guessing that you want to extract the links. If so, try:
$ sed -rn 's/.*<a\s+href="([^"]*)".*/\1/p' file
../a-lot-of-different/words-that/should-link.html
..//a-lot-of-different/words-that/should-find-lots-of-links.html
../another-word-and-links/multiple-words/sjshfi-dfg.html
How it works:
.*<a\s+href="
This matches everything before the link, up to and including the opening double-quote of the href attribute.
([^"]*)
This matches the link itself and captures it in group 1, which the replacement refers to as \1.
".*
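This matches the double-quote after the link and everything that follows it on the line. Since the whole line is matched and replaced by \1, only the link is printed.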
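One caveat: because the leading .* is greedy, the sed command prints at most one link per line (the last one). If several links can share a line, a possible alternative is to let grep -o print every match on its own line and then strip the attribute wrapper. The following is only a sketch, assuming a grep that supports -o and a sed that supports -E (GNU and BSD versions both do); the function name extract_links and the file sample.html are made up for illustration.

```shell
#!/bin/sh
# extract_links (hypothetical helper): print every href value in a file,
# one per line, even when several links share the same line.
extract_links() {
    # grep -o emits each match separately; sed then strips href=" and the closing quote.
    grep -oE 'href="[^"]*"' "$1" | sed -E 's/^href="//; s/"$//'
}

# Sample input (illustrative): two of the question's links on a single line.
printf '%s\n' '<a href="../a-lot-of-different/words-that/should-link.html">useful links</a> <a href="../another-word-and-links/multiple-words/sjshfi-dfg.html">more links</a>' > sample.html

extract_links sample.html
# prints:
# ../a-lot-of-different/words-that/should-link.html
# ../another-word-and-links/multiple-words/sjshfi-dfg.html
```

Like the sed version, this matches any href="..." attribute with double quotes, so it shares the usual limits of regex-based HTML handling (single-quoted or unquoted attributes are missed).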
linux sed