webcheck is a Python program that allows webmasters to: view the structure of a site; track down broken links; find potentially outdated HTML pages; list links pointing to external sites; view a portfolio of inline images; and do all this periodically and without user intervention. Results are displayed in a set of HTML pages.
Warning: I've found that webcheck is not robust; it fails on fairly simple HTML pages. However, it is actively maintained, so it should improve over time.
webcheck is not called directly. Instead, I wrote a simple shell program that sets or modifies the environment variable PYTHONPATH to include the configuration file /etc/webcheck/config.py. The actual executable is placed in the /usr/lib/webcheck/python directory.
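The effect of the wrapper on the environment can be sketched as follows. This is only an illustration written in Python rather than shell, not the wrapper shipped in the package. It assumes that what ends up on PYTHONPATH is the directory /etc/webcheck that holds config.py; the function names with_config_on_path and launch and the script name webcheck.py are hypothetical, since the text names only the directory that contains the executable.

```python
import os
import sys

# Directory assumed to hold config.py (derived from the path given in the text).
CONFIG_DIR = "/etc/webcheck"


def with_config_on_path(environ, config_dir=CONFIG_DIR):
    """Return a copy of environ whose PYTHONPATH starts with config_dir."""
    env = dict(environ)
    existing = env.get("PYTHONPATH", "")
    # Keep any entries already present, but avoid listing config_dir twice.
    parts = [p for p in existing.split(os.pathsep) if p and p != config_dir]
    env["PYTHONPATH"] = os.pathsep.join([config_dir] + parts)
    return env


def launch(args):
    """Replace the current process with the real program (hypothetical name)."""
    script = "/usr/lib/webcheck/python/webcheck.py"  # assumed file name
    env = with_config_on_path(os.environ)
    os.execve(sys.executable, [sys.executable, script] + list(args), env)
```

Prepending rather than overwriting keeps whatever the user already had in PYTHONPATH, which matches the "sets/modifies" wording above.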
At the suggestion of Gregor Hoffleit (the current Debian python package maintainer), I have placed the python packages that come with webcheck into the directory /usr/lib/python2.2/site-packages/webcheck. This follows Gregor's recommendation in /usr/doc/python/README.Debian.gz.
Here is a list of tasks that I'm at least contemplating for webcheck, roughly in the order in which I plan to start them. My main guide is to increase robustness before adding new functionality. Feel free to email me if you have additional suggestions or would like to help.
The Python urllib.py module is documented as capable of using proxy servers. However, I haven't been able to get this to work.
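For reference, this is roughly how proxy use is configured with the standard library in current Python 3, where the functionality lives in urllib.request rather than in the urllib.py shipped with Python 2.2. It is a minimal sketch, not webcheck code, and it has not been tried against webcheck itself. The function name make_proxy_opener and the proxy URL in the usage note are placeholders.

```python
import urllib.request


def make_proxy_opener(proxy_url=None):
    """Build an opener that sends HTTP and HTTPS requests through a proxy.

    With no proxy_url, fall back to whatever the environment defines
    (for example the http_proxy variable), as reported by getproxies().
    """
    if proxy_url is None:
        proxies = urllib.request.getproxies()
    else:
        proxies = {"http": proxy_url, "https": proxy_url}
    handler = urllib.request.ProxyHandler(proxies)
    # Passing our own ProxyHandler replaces the default one in the opener.
    return urllib.request.build_opener(handler)
```

An opener built with make_proxy_opener("http://proxy.example.com:3128") would then be used through its open() method in place of a direct urlopen() call.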
There is currently little documentation on the configuration file and no man page.
marduk <marduk@starship.skyport.net> is the primary author. Oleg Broytmann <phd@comus.ru> contributed the man page. Pierre LeJacq <jplejacq@quoininc.com> wrote this HTML page. Bastian Kleineidam is the Debian package maintainer.