kepcon/webpage-scraper

webpage-scraper

Python web scraping/info gathering utility created for Applications Development module.

Enter the target URL in the getwebinfo.py script, then run it to call all the other scripts.

Each script can also be run on its own by uncommenting the 'sys.argv.append' line in that script.
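
The standalone pattern described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `DEFAULT_URL` and `target_url` are hypothetical names, and the example URL is a placeholder.

```python
import sys

# Hypothetical placeholder; the real scripts take whatever URL you enter.
DEFAULT_URL = "http://example.com"


def target_url(argv):
    """Return the URL given as the first argument, or the default if none was given."""
    return argv[1] if len(argv) > 1 else DEFAULT_URL


if __name__ == "__main__":
    # Uncommenting a line like this injects a URL into sys.argv,
    # so the script can run on its own without the getwebinfo.py driver:
    # sys.argv.append("http://example.com")
    print(target_url(sys.argv))
```

When the driver calls a script, it would pass the URL as an argument; when the script runs alone, the appended value plays the same role.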

#### webpage_get: Retrieves the HTML structure of a website from its URL.
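
A page fetch of this kind might look like the sketch below, using only the standard library. The function name `fetch_page` and the timeout value are assumptions for illustration; the original script may be structured differently.

```python
import urllib.request


def fetch_page(url, timeout=10):
    """Fetch a URL and return the response body decoded as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `fetch_page("http://example.com")` would return the page's HTML as a single string, which the other scripts can then parse.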

#### webpage_getfiles: Retrieves page content from a URL and parses out file references. Downloads every matched file and produces a formatted list of them (the download path must be specified in the script), then compares the downloaded files against a set of known bad hashes.
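
The parsing and hash-comparison steps could be sketched as below. The list of file extensions, the use of MD5 for the bad-hash set, and the names `find_file_urls`, `md5_of`, and `is_bad` are all assumptions; the download step itself is left out.

```python
import hashlib
import re

# Assumed set of extensions; adjust to whatever file types matter to you.
FILE_PATTERN = re.compile(
    r'(?:href|src)=["\']([^"\']+\.(?:jpg|jpeg|gif|png|bmp|docx?|pdf))["\']',
    re.IGNORECASE,
)


def find_file_urls(html):
    """Return the unique file references found in href/src attributes, sorted."""
    return sorted(set(FILE_PATTERN.findall(html)))


def md5_of(data):
    """Return the MD5 hex digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def is_bad(data, bad_hashes):
    """Report whether the content's digest appears in the set of bad hashes."""
    return md5_of(data) in bad_hashes
```

After downloading each file, its bytes would be passed to `is_bad` together with the set of known bad digests.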

#### webpage_getlinks: Parses links out of the page content and produces a formatted list of matched links.
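
One way to extract links is with the standard library's HTML parser, as in this sketch. `LinkCollector` and `get_links` are hypothetical names, and the original script may well use regular expressions instead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def get_links(html, base_url=""):
    """Return the unique links in the page, in first-seen order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    # dict.fromkeys drops duplicates while keeping insertion order.
    return list(dict.fromkeys(collector.links))
```

Printing one link per line from the returned list gives a simple formatted listing.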

#### webpage_getuniqueinfo_crackhash: Retrieves page content, parses out MD5 hash hex strings, email addresses, and phone numbers, then produces a formatted list of the matched information. Cracks the password hashes using a dictionary of common passwords.
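
The extraction and dictionary attack could be sketched as follows. The regular expressions are deliberately loose assumptions (the phone pattern in particular is a guess at a generic format), and `extract_info` and `crack_md5` are hypothetical names rather than the script's own.

```python
import hashlib
import re

MD5_RE = re.compile(r"\b[a-fA-F0-9]{32}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
# Assumed phone format: optional '+', then at least nine digits/separators.
PHONE_RE = re.compile(r"\+?\d[\d ()-]{7,}\d")


def extract_info(text):
    """Return the unique hashes, emails, and phone numbers found in the text."""
    return {
        "hashes": sorted(set(MD5_RE.findall(text))),
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "phones": sorted(set(PHONE_RE.findall(text))),
    }


def crack_md5(target_hash, wordlist):
    """Return the word whose MD5 digest matches target_hash, or None."""
    target = target_hash.lower()
    for word in wordlist:
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target:
            return word
    return None
```

In practice, `wordlist` would be read from a file of common passwords, one per line, and each hash found by `extract_info` would be passed to `crack_md5` in turn.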
