URL Utility
Brief:
This program does two things. It can list all src and href values in an
HTML file (the match is not case sensitive; it looks for the sequences
" src" and " href", so ".src" and ".href" are not matched), and it can
convert relative URLs into absolute URLs. It comes in handy if you
maintain a website (I use it with a shell script): I code internally
using relative URLs so that I can check my links even without internet
access, and I use the script to create the copy I upload.
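
The listing behaviour described above can be illustrated with a short
Python sketch. This is not the program's own source: the function name
list_urls and the regular expression are assumptions made for
illustration. It only mirrors the stated rule that " src" and " href"
(any case, preceded by whitespace) are matched while ".src" and ".href"
are not.

```python
import re

# Assumption: a value is whatever follows `src=` or `href=`, either quoted
# or bare. Requiring whitespace before the attribute name mirrors the
# " src" / " href" rule, so script expressions like `x.src=...` are skipped.
ATTR_RE = re.compile(
    r"""\s(?:src|href)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))""",
    re.IGNORECASE,
)


def list_urls(html):
    """Return every src/href value found in html, in document order."""
    urls = []
    for match in ATTR_RE.finditer(html):
        # Exactly one of the three capture groups is set for each match.
        urls.append(next(g for g in match.groups() if g is not None))
    return urls


if __name__ == "__main__":
    sample = '<a HREF="index.html"><img src=logo.png><script>x.src="a.js"</script>'
    for url in list_urls(sample):
        print(url)
```

Run on the sample string, the sketch reports index.html and logo.png and
skips the x.src assignment inside the script element.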
How to use:
Program Usage
-to print these instructions, run the program with no additional parameters
-to print all href and src values in a given html file, run the program
 with the file name as an additional parameter, i.e. the command:
   <executable name> index.html
 prints out all href and src values in index.html
-to prefix a string to all URL values in a file and write the output to
 another file, do something like this:
   <executable name> f1.html f2.html http://cookiebreak.tripod.com
 this would attach http://cookiebreak.tripod.com to all URL values
 (except those starting with "www.", "http://", "ftp://", "mailto:")
 in f1.html, and the output is f2.html
-to ignore all URLs that start with certain keywords, do something like
 this:
   <executable name> f1.html f2.html http://cookiebreak.tripod.com k1 k2 kn
 this would do the same as the previous command, except that it ignores
 all URLs that start with k1, k2 and kn
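
The prefixing rule from the usage notes above can be sketched in Python
as follows. The names prefix_url and DEFAULT_SKIP are hypothetical, and
plain string concatenation is an assumption: the sample output suggests
the real program does more (leading "../" segments disappear and "#top"
is left alone), which this sketch does not attempt.

```python
# Prefixes that the usage notes say are never rewritten.
DEFAULT_SKIP = ("www.", "http://", "ftp://", "mailto:")


def prefix_url(url, prefix, keywords=()):
    """Return prefix + url, or url unchanged if it starts with a skipped string.

    keywords plays the role of the optional k1 k2 kn arguments.
    Assumption: the prefix is attached by plain concatenation, so the
    caller supplies any separating "/".
    """
    if url.startswith(DEFAULT_SKIP + tuple(keywords)):
        return url
    return prefix + url


if __name__ == "__main__":
    base = "http://cookiebreak.tripod.com/"
    for url in ("fdl-license.html", "mailto:someone@example.com", "k1/page.html"):
        print(url, "-->>", prefix_url(url, base, keywords=["k1"]))
```

Here only fdl-license.html is rewritten; the mailto: address and the URL
starting with the keyword k1 come back unchanged.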
Sample output:
[user@localhost src]$ run index.html index2.html styles
styles/style.css -->> style/styles/style.css
../../index.html -->> style/index.html
styles/ban2.png -->> style/styles/ban2.png
../../courses/index.html -->> style/courses/index.html
../../gallery/index.html -->> style/gallery/index.html
../../library/index.html -->> style/library/index.html
../../faqs/index.html -->> style/faqs/index.html
../../links/index.html -->> style/links/index.html
../../speech/index.html -->> style/speech/index.html
../index.html -->> style/index.html
../goodies/downloads.html -->> style/goodies/downloads.html
../goodies/docs.html -->> style/goodies/docs.html
www.someSite.com not changed
ftp://www.someSite.com not changed
http://www.someSite.com not changed
fdl-license.html -->> style/fdl-license.html
mailto:vampire_janus@yahoo.com not changed
../goodies/downloads.html -->> style/goodies/downloads.html
../../index.html -->> style/index.html
styles/home1.png -->> style/styles/home1.png
../index.html -->> style/index.html
styles/main1.png -->> style/styles/main1.png
#top not changed
styles/up1.png -->> style/styles/up1.png
To do:
-testing and debugging (please report bugs)
-just try it and give me suggestions (better yet, patch it yourself =))
Recommended improvements:
-maybe add a bash script that would wrap the program (for example,
   URLUtil -l index.html
 would list all src and href values in index.html)
Contact the original author through vampire_janus@yahoo.com.
Download everything in here.