HTML-EXTRACT can be downloaded from http://weitz.de/files/html-extract.tar.gz. It comes with a BSD-style license so you can basically do with it whatever you want.
To build HTML-EXTRACT on a Unix-like OS you just have to execute the shell script build.sh in the HTML-EXTRACT directory. You might have to adjust the CLISP variable there first.
This'll result in a small executable, html-extract, that you can put into, say, /usr/local/bin and use like this:

    html-extract <input.html >output.txt

Here, input.html is an arbitrary HTML file and output.txt will be the result of stripping all HTML tags off of this file.
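
Since html-extract reads from standard input and writes to standard output, it can be wrapped in a loop to convert several files at once. The following is only a sketch: the function name extract_all is made up for illustration and is not part of HTML-EXTRACT, and it assumes that the html-extract executable is already somewhere on your PATH.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with HTML-EXTRACT): convert every
# *.html file in a directory into a *.txt file next to it.
# Assumes html-extract is on PATH and acts as a stdin-to-stdout filter.
extract_all() {
    dir=${1:-.}    # directory to process, defaults to the current one
    for f in "$dir"/*.html; do
        # If no .html files exist the pattern stays unexpanded; skip it.
        [ -e "$f" ] || continue
        # Same invocation as the single-file case, with foo.html -> foo.txt.
        html-extract <"$f" >"${f%.html}.txt"
    done
}
```

After sourcing this, something like extract_all /path/to/pages would leave one .txt file beside each .html file in that directory. Existing .txt files with the same names would be overwritten.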
$Header: /usr/local/cvsrep/html-extract/doc/index.html,v 1.1.1.1 2005/09/22 22:09:22 edi Exp $