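[solved] Using sed and awk to extract URLs from browser-exported bookmarks
Posted: 2010-12-28 22:58
I want to extract the URLs from bookmarks exported from Chromium. Below is a fragment of bookmarks.html. Could someone help me with a script or a one-line command that prints the URLs? Thanks.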
<!DOCTYPE NETSCAPE-Bookmark-file-1>
<!-- This is an automatically generated file.
It will be read and overwritten.
DO NOT EDIT! -->
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<TITLE>Bookmarks</TITLE>
<H1>Bookmarks</H1>
<DL><p>
<DT><H3 ADD_DATE="0" LAST_MODIFIED="1293430925" PERSONAL_TOOLBAR_FOLDER="true">Bookmarks Bar</H3>
<DL><p>
<DT><A HREF="https://www.google.com/reader/view/" ADD_DATE="1286711776" ICON="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAACRklEQVQ4jW2SS0hUYRTHf3fuvJ1p1CFprCwLwkQzUIpoE1hQYES7chMYCLkpgvZtKmzfJqJWCrWQKNxli2gj2IgwIBiTD6RobGbUeTn3fo8Wd7jmtbM6538e3/8752/gsfL4mAawCpvIyg4A9XyR8mrOrel49ZZod48B4Ptf">Reader</A>
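For a fragment like the one above, one possible approach is sketched below. It assumes every bookmark sits on its own line and uses an uppercase `<A HREF="...">` attribute, as Chromium's export does here. The function names `extract_urls_sed` and `extract_urls_awk` are made up for illustration. Neither variant is a real HTML parser, so lines holding several links, or lowercase tags, would need a different pattern.

```shell
#!/bin/sh
# Hypothetical helpers: print the HREF value of each bookmark line in the
# file given as $1, one URL per line.

# sed variant: -n turns off automatic printing, and the trailing "p" prints
# only the lines where the substitution matched. \([^"]*\) captures everything
# between HREF=" and the next double quote.
extract_urls_sed() {
    sed -n 's/.*<A HREF="\([^"]*\)".*/\1/p' "$1"
}

# awk variant: split each line on double quotes. On a line containing
# <A HREF=" the first quoted field ($2) is the URL, because HREF is the
# first attribute on the line.
extract_urls_awk() {
    awk -F'"' '/<A HREF="/ { print $2 }' "$1"
}
```

After sourcing these definitions, `extract_urls_sed bookmarks.html > urls.txt` should write the list to a file. The folder lines (`<DT><H3 ...>`) are skipped because they contain no `<A HREF="`.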