Shell Script to traverse all internal URLs and report any errors in the “traverse.errors” file

In this article, we will write a shell script that traverses all internal URLs of a website and reports any errors in the “traverse.errors” file. This tutorial is aimed at Linux beginners.

Shell Script To Show All the Internal and External Links From a URL

If you run a web server or are responsible for a website, whether simple or complex, you probably find yourself performing certain tasks very frequently, most notably identifying broken internal and external site links. With shell scripts you can automate many of these tasks, as well as other common client/server chores such as managing access information for a password-protected website directory. The shell script below uses lynx to traverse all internal URLs on the given website and reports errors (if any) that lynx records in the “traverse.errors” file.

Usage: traverse.sh <URL>
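
Because the script depends on lynx, it can help to confirm that lynx is installed, and where, before running it. The following is a minimal sketch and not part of the original script: `find_tool` is a hypothetical helper name, and it relies only on the standard `command -v` lookup.

```shell
#!/bin/sh
# find_tool NAME: print the location of NAME if the shell can find it,
# otherwise print nothing and return a non-zero status.
# (find_tool is a hypothetical helper, not part of traverse.sh.)
find_tool() {
  command -v "$1" 2>/dev/null
}

if lynx_path="$(find_tool lynx)"; then
  echo "lynx found at: $lynx_path"
else
  echo "lynx not found; install it or adjust the path used in the script" >&2
fi
```

If the reported location differs from the path assigned to the `lynx` variable in the script, update that assignment to match.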

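#!/bin/sh
# traverse.sh: traverse all internal URLs on a website with lynx and
# report any errors recorded in the "traverse.errors" file.

lynx="/usr/local/bin/lynx"   # adjust to match where lynx is installed

# Remove lynx's traversal data files when the script exits.
trap 'rm -f traverse.dat traverse2.dat' 0

if [ -z "$1" ] ; then
  echo "Usage: ${0##*/} URL" >&2
  exit 1
fi

# Host part of the URL, used to name the saved output files.
baseurl="$(echo "$1" | cut -d/ -f3)"

# Crawl the site; -realm restricts the traversal to the starting realm.
"$lynx" -traversal -accept_all_cookies -realm "$1" > /dev/null

if [ -s "traverse.errors" ] ; then
  printf '%s errors encountered. ' "$(wc -l < traverse.errors | tr -d ' ')"
  echo "Checked $(grep -c '^http' traverse.dat) pages at ${1}:"
  sed "s|$1||g" < traverse.errors
  mv traverse.errors "${baseurl}.errors"
  echo "A copy of this output has been saved in ${baseurl}.errors"
else
  printf 'No errors encountered. '
  echo "Checked $(grep -c '^http' traverse.dat) pages at ${1}"
fi

# Keep the list of rejected links, if lynx produced one.
if [ -s "reject.dat" ]; then
  mv reject.dat "${baseurl}.rejects"
fi
exit 0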
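
The script names its saved output after the host part of the URL, which it obtains with `cut -d/ -f3`. The sketch below isolates that step so it can be tried on its own. `url_host` is a hypothetical function name, and the example URLs are placeholders.

```shell
#!/bin/sh
# url_host URL: print the host part of an absolute URL.
# (url_host is a hypothetical name; the script inlines this pipeline.)
url_host() {
  # Splitting "scheme://host/path" on "/" gives the fields
  # "scheme:", "" (empty), "host", "path", ... so field 3 is the host.
  printf '%s\n' "$1" | cut -d/ -f3
}

url_host "https://www.example.com/docs/index.html"   # prints www.example.com
```

This only works for URLs that include the scheme. Given `www.example.com/docs/index.html`, the third field would be `index.html`, so the script should be called with a full `http://` or `https://` URL.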

Scenario 1: No Errors

Fig 1.2 – No Errors

Scenario 2: Some Errors

Fig 1.3 – 5 errors encountered
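
The error report in the second scenario comes from two small steps in the script: `wc -l` counts the lines in the errors file, and `sed` strips the starting URL from each line so only the relative part remains. The sketch below reproduces those two steps on a made-up file. `summarize_errors` is a hypothetical helper, the sample lines are invented for illustration, and the actual layout of lynx's traverse.errors file may differ; the sketch assumes only one entry per line, as the script's line count implies.

```shell
#!/bin/sh
# summarize_errors FILE BASE: print the number of lines in FILE, then each
# line with every occurrence of BASE removed.
# BASE is used as a sed pattern, so it must not contain the "|" delimiter.
# (summarize_errors is a hypothetical helper; the sample data is made up.)
summarize_errors() {
  count=$(wc -l < "$1" | tr -d ' ')   # strip padding some wc versions add
  printf '%s errors encountered:\n' "$count"
  sed "s|$2||g" < "$1"
}

# Build a small, invented errors file and summarize it.
sample="$(mktemp)"
printf '%s\n' \
  'https://www.example.com/missing.html' \
  'https://www.example.com/old/page.html' > "$sample"
summarize_errors "$sample" 'https://www.example.com/'
rm -f "$sample"
```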
