Reconnaissance, and corporate email harvesting in particular, is an essential part of assessing an organization’s security posture during a penetration test. It involves collecting publicly exposed information about the organization’s email addresses and email infrastructure to identify security weaknesses. Python, with its large ecosystem of HTTP and HTML-parsing libraries, is well suited to automating this work. Here’s an overview of the process:
Information Gathering
Reconnaissance begins with gathering publicly available information about the target organization. This data can include the organization’s name, domain name, and any known email addresses. Python libraries such as requests and beautifulsoup can be used to scrape information from websites, search engines, and social media profiles.
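As a minimal sketch of the email-harvesting step on its own, the snippet below pulls addresses out of page text that has already been downloaded and keeps only those on the target domain. The function name extract_emails, its domain parameter, and the sample HTML are illustrative assumptions rather than part of the script in this post, and the pattern is a heuristic, not a full address validator. It uses only the standard library, so it can be tried without network access.

```python
import re

# Permissive pattern for strings that look like email addresses.
# This is a heuristic, not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"[a-z0-9._+\-]+@[a-z0-9.\-]+\.[a-z]{2,}", re.I)


def extract_emails(text, domain=None):
    """Return the set of lowercased addresses found in text.

    If domain is given (a hypothetical parameter for this sketch), keep
    only addresses at that domain or at one of its subdomains.
    """
    found = {match.lower() for match in EMAIL_RE.findall(text)}
    if domain is None:
        return found
    domain = domain.lower()
    return {
        addr for addr in found
        if addr.split("@", 1)[1] == domain
        or addr.split("@", 1)[1].endswith("." + domain)
    }


# Made-up page content, used only to exercise the function.
sample_html = """
<p>Contact <a href="mailto:Sales@example.com">sales</a> or
support@mail.example.com. Press: pr@other.org</p>
"""

print(sorted(extract_emails(sample_html, domain="example.com")))
```

Filtering by domain is a design choice for corporate reconnaissance: pages often contain third-party addresses (vendors, press contacts) that are outside the scope of the engagement.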
The full script below is also available on my GitHub repo, along with instructions on how to run it.
import re
import urllib.parse
from collections import deque

import requests
import requests.exceptions
from bs4 import BeautifulSoup

original_url = input('[+] Enter Target URL: ')
unscraped_urls = deque([original_url])  # queue of pages still to visit
scraped_urls = set()                    # pages already visited
emails = set()                          # unique addresses found so far

# The scraping process: breadth-first crawl that stops at the 200-page cap
depth = 0
try:
    while len(unscraped_urls):
        depth += 1
        if depth == 200:
            break
        url = unscraped_urls.popleft()
        scraped_urls.add(url)

        # Split the URL so that relative links can be resolved below
        parts = urllib.parse.urlsplit(url)
        base_url = '{0.scheme}://{0.netloc}'.format(parts)
        if '/' in parts.path:
            path = url[:url.rfind('/') + 1]
        else:
            path = url

        print('[%d] Scraping %s' % (depth, url))
        try:
            # The timeout keeps one unresponsive host from stalling the crawl
            response = requests.get(url, timeout=10)
        except (requests.exceptions.MissingSchema,
                requests.exceptions.InvalidSchema,
                requests.exceptions.ConnectionError,
                requests.exceptions.Timeout):
            continue

        # Collect everything on the page that looks like an email address
        new_emails = set(re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", response.text, re.I))
        emails.update(new_emails)

        # Queue every link on the page that has not been seen yet
        soup = BeautifulSoup(response.text, features="lxml")
        for anchor in soup.find_all("a"):
            link = anchor.attrs.get("href", "")
            if link.startswith('/'):
                link = base_url + link
            elif not link.startswith('http'):
                link = path + link
            if link not in unscraped_urls and link not in scraped_urls:
                unscraped_urls.append(link)
except KeyboardInterrupt:
    print('[-] Exiting...')

for email in emails:
    print(email)
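The crawler above builds absolute URLs by string concatenation and will follow links to any host. One possible refinement, sketched here under assumptions, resolves each href with urllib.parse.urljoin and keeps only HTTP(S) links on the starting host. The helper name normalize_link, its allowed_host parameter, and the same-host restriction are illustrative choices, not part of the original script.

```python
from urllib.parse import urldefrag, urljoin, urlsplit


def normalize_link(page_url, href, allowed_host=None):
    """Resolve href against page_url and return an absolute URL, or None.

    Returns None for empty hrefs, for non-HTTP(S) schemes such as mailto:
    or javascript:, and (when allowed_host is set) for links to other hosts.
    """
    if not href:
        return None
    # urljoin handles "/abs", "relative.html" and "../up" forms;
    # urldefrag drops "#fragment" so the same page is not queued twice.
    absolute, _fragment = urldefrag(urljoin(page_url, href.strip()))
    parts = urlsplit(absolute)
    if parts.scheme not in ("http", "https"):
        return None
    if allowed_host is not None and parts.hostname != allowed_host.lower():
        return None
    return absolute


# Made-up page URL and hrefs, used only to exercise the helper.
page = "https://example.com/about/team.html"
for href in ["/contact", "jobs.html", "../index.html#top",
             "mailto:hr@example.com", "https://other.org/"]:
    print(href, "->", normalize_link(page, href, allowed_host="example.com"))
```

Restricting the crawl to the authorized host also helps keep the test inside the agreed scope, since links that lead to third-party sites are dropped before they are ever requested.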
Conducting email-grabbing reconnaissance is a critical step in a penetration test, but it’s essential to do so responsibly and within legal boundaries. Always obtain proper authorization and follow ethical guidelines when conducting penetration testing.