With a little Ruby, a little GNU, and some reading, I built a small utility to help me understand web pages.
The utility takes the HTML, picks out the links, and draws a map of the web site. It shows which pages link to other pages. (It works only for static pages, not dynamic ones.) The result is a PostScript image with pages listed and arrows connecting the pages.
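The core idea can be sketched in a few lines of Ruby. This is a minimal illustration, not the actual utility: the names `extract_links` and `site_map_dot` and the sample pages are made up for the example, and a simple regular expression stands in for real HTML parsing. It produces GraphViz DOT text, which GraphViz can then turn into PostScript.

```ruby
# Hypothetical helper: pull href targets out of an HTML string.
# A regex is only a rough stand-in for a real HTML parser; it skips
# pure fragment links such as href="#top".
def extract_links(html)
  html.scan(/<a\s[^>]*?href\s*=\s*["']([^"'#]+)/i).flatten.uniq
end

# Hypothetical helper: build DOT text (GraphViz input) from a hash of
# page name => HTML source. Each page becomes a node, each link an arrow.
def site_map_dot(pages)
  lines = ["digraph site {"]
  pages.each do |name, html|
    lines << %(  "#{name}";)
    extract_links(html).each do |target|
      lines << %(  "#{name}" -> "#{target}";)
    end
  end
  lines << "}"
  lines.join("\n")
end

# Made-up sample site, inlined so the sketch runs without any files.
pages = {
  "index.html" => '<a href="about.html">About</a> <a href="news.html">News</a>',
  "about.html" => '<a href="index.html">Home</a>',
  "news.html"  => "<p>No links here.</p>"
}
puts site_map_dot(pages)
```

Saving that output to a file such as `site.dot` and running `dot -Tps site.dot -o site.ps` yields the PostScript image.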
I can use this to navigate a web site and get a bird's-eye view of it. I can also find 'orphan' web pages: pages that no other page links to.
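Finding orphans comes down to a set difference: every known page, minus every page that some other page links to. The sketch below assumes the links are already collected in a hash of page => list of targets. The name `orphan_pages`, the `root` parameter, and the sample data are all hypothetical.

```ruby
require "set"

# Hypothetical helper: return pages that no *other* page links to.
# links: hash of page name => array of page names it links to.
# The root page is excluded, since visitors reach it directly.
def orphan_pages(links, root: "index.html")
  linked = Set.new
  links.each do |page, targets|
    # Ignore self-links; a page pointing at itself is still unreachable.
    targets.each { |target| linked << target unless target == page }
  end
  links.keys.reject { |page| page == root || linked.include?(page) }
end

# Made-up sample data.
links = {
  "index.html" => ["about.html"],
  "about.html" => ["index.html"],
  "old.html"   => ["index.html", "old.html"]
}
p orphan_pages(links)
```

Here `old.html` links out to other pages but nothing links to it, so it is the one page reported.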
I used a lot of off-the-shelf components: Ruby and its built-in functions for parsing HTML, GNU sort, and GraphViz. The components do the heavy lifting and save me a lot of time.
It was a good exercise. I learned a lot, and now I have a useful tool!