Rob van der Woude's Scripting Pages

Source code for spider.bat

@ECHO OFF
:: Check Windows version and command line arguments
IF NOT "%OS%"=="Windows_NT" GOTO Syntax
IF NOT "%~3"=="" GOTO Syntax
IF NOT "%~2"=="" IF /I NOT "%~1"=="/E" GOTO Syntax
IF NOT "%~1"=="" IF /I NOT "%~1"=="/E" GOTO Syntax

:: Localize variables
SETLOCAL

:: Interpret command line arguments
IF /I "%~1"=="/E" (SET ErrOnly=1) ELSE (SET ErrOnly=0)
IF "%ErrOnly%"=="1" (
	IF "%~2"=="" (SET ErrType= ) ELSE (SET ErrType=%~2)
)

:: This batch file may generate some stray files that we want to
:: clean up afterwards, so we'll create our own temporary directory
SET Temp=%Temp:"=%
SET MyTemp="%Temp%.\_spider"
MD %MyTemp% 2>NUL

:: Search all HTML files for hyperlinks, suppress error messages
FOR %%? IN (*.htm *.html) DO CALL :Search "%%~f?" 2>NUL

:: Remove our own temporary directory
PUSHD "%Temp%"
RD /S /Q _spider
POPD

:: Done
ENDLOCAL
GOTO:EOF


:Search
:: %1 = fully qualified HTML file name; every http:// URL found in
:: an <A HREF= tag in that file is passed on to the :Check subroutine
PUSHD %MyTemp%
FOR /F "tokens=*" %%A IN ('TYPE "%~1" ^| FIND.EXE /I "<A HREF=" ^| FIND.EXE "http://" ^| CUT.EXE -F:2 -L:1 -D:^'\^^^"^' ^| FIND.EXE "http://"') DO CALL :Check "%~1" "%%~A"
POPD
GOTO:EOF


:Check
:: %1 = HTML file name, %2 = URL found in that file
SET _html=%1
SET _url=%2
:: Clear the result of the previous URL, then keep only the last line
:: of WGET's output, which is expected to hold the result (e.g. 200 OK)
SET line=
FOR /F "tokens=*" %%x IN ('WGET.EXE --spider --tries=2 --timeout=30 %2 2^>^&1') DO SET line=%%x
IF "%ErrOnly%"=="1" (
	IF NOT "%line%"=="200 OK" (
		ECHO.%line% | FIND.EXE /I "%ErrType%" >NUL
		IF NOT ERRORLEVEL 1 ECHO.%line%	%1	%2
	)
) ELSE (
	ECHO.%line%	%1	%2
)
GOTO:EOF


:Syntax
ECHO.
ECHO Spider.bat,  Version 0.22 alpha for Windows NT 4 and later
ECHO Display the validity of hyperlinks in a group of locally stored HTML files.
ECHO.
ECHO Usage:  SPIDER  [ /E [ error# ] ]
ECHO.
ECHO Where:  /E      forces display of errors only (default is ALL results)
ECHO         error#  will display errors of this type only (e.g. /E 404)
ECHO.
ECHO Notes:  Must be started in the directory where the HTML files are located.
ECHO         Uses WGET --spider to check the validity of the hyperlinks; tested
ECHO         with WGET that came with ApacheFriends XAMPP for Windows Version 1.0
ECHO         available at http://www.apachefriends.org/
ECHO         Also uses CUT, a compiled version of my CUT.PL script, which is
ECHO         available at http://www.robvanderwoude.com/cut.html
ECHO         Being an alpha version, and a quick-and-dirty one at that, I can in
ECHO         no way guarantee this batch file will work correctly on any computer;
ECHO         in fact I can guarantee there are still some bugs that need correction.
ECHO.
ECHO Written by Rob van der Woude to check his own site
ECHO http://www.robvanderwoude.com
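
Usage example for spider.bat

The usage notes in the listing can be illustrated with a few sample invocations. This is only a sketch: the folder C:\Websites\MySite and the report file brokenlinks.txt are hypothetical names that do not come from the script, and it assumes that spider.bat, WGET.EXE and CUT.EXE can all be found in the PATH.

```batch
REM Hypothetical folder; the script must be started in the
REM directory where the HTML files are located
CD /D "C:\Websites\MySite"

REM Show the result for every http:// hyperlink in *.htm and *.html
CALL SPIDER.BAT

REM Show errors only (every result other than "200 OK")
CALL SPIDER.BAT /E

REM Show only errors containing "404" and save them in a
REM (hypothetical) report file
CALL SPIDER.BAT /E 404 > brokenlinks.txt
```

CALL is used so the same lines also work when pasted into another batch file. Each line of output consists of the last line of WGET's output, the name of the HTML file and the URL, separated by tabs.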

page last uploaded: 2017-07-06, 12:37