Recon for Pentesting using Python (Email Grabbing)

Reconnaissance, and corporate email grabbing in particular, is an essential step in assessing an organization’s security posture during a penetration test. It involves collecting publicly exposed information about the organization’s email addresses and email infrastructure to identify security weaknesses. With its versatility and numerous libraries, Python is a valuable tool for this kind of reconnaissance. Here’s an overview of the process:

Information Gathering

Reconnaissance begins with gathering publicly available information about the target organization. This data can include the organization’s name, its domain name, and any email addresses that are already known. Python libraries such as requests and BeautifulSoup (bs4) can be used to scrape this information from websites, search engines, and social media profiles.
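
Before touching the network, the extraction step can be tried offline on a sample string. The snippet below is a minimal sketch using only the standard library. The pattern `EMAIL_RE`, the helper `extract_emails`, and the `sample_html` string are my own illustrative names and data, not part of any library, and the pattern is a deliberate simplification that will miss some valid addresses.

```python
import re

# Simplified, assumed pattern: local part, "@", one or more domain labels,
# then an alphabetic top-level domain of at least two letters.
EMAIL_RE = re.compile(
    r"[a-z0-9._+-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}", re.I
)


def extract_emails(text):
    """Return the set of lowercased email-like strings found in text."""
    return {match.lower() for match in EMAIL_RE.findall(text)}


# Hypothetical page fragment used only for illustration.
sample_html = (
    '<p>Contact <a href="mailto:Sales@example.com">sales@example.com</a> '
    "or hr@example.com.</p>"
)

print(sorted(extract_emails(sample_html)))
```

Lowercasing the matches means the same address written with different capitalization is only reported once.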

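The code, with instructions on how to run it, is available on my GitHub repo. The script below starts from a target URL, follows the links it finds page by page, and collects every email address it encounters until its page counter reaches 200.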

import requests
import requests.exceptions
from collections import deque
import urllib.parse
import re
from bs4 import BeautifulSoup

original_url = str(input('[+] Enter Target URL: '))
unscraped_urls = deque([original_url])

scraped_urls = set()
emails = set()

# The scraping process

depth = 0

try:
    while len(unscraped_urls):
        depth += 1
        if depth == 200:
            break
        url = unscraped_urls.popleft()
        scraped_urls.add(url)

        parts = urllib.parse.urlsplit(url)
        base_url = '{0.scheme}://{0.netloc}'.format(parts)
        if '/' in parts.path:
            path=url[:url.rfind('/')+1]
        else:
            path = url
        print('[%d] Scraping %s' % (depth, url))

        try:
            # A timeout keeps one unresponsive host from stalling the crawl
            response = requests.get(url, timeout=10)
        except requests.exceptions.RequestException:
            # Skip URLs that are malformed, unreachable, or time out
            continue

        # Case-insensitive match for email-like strings in the page source
        new_emails = set(re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", response.text, re.I))
        emails.update(new_emails)

        soup = BeautifulSoup(response.text, features="lxml")

        for anchor in soup.find_all("a"):
            if "href" in anchor.attrs:
                link = anchor.attrs["href"]
            else: 
                link = ''
            
            if link.startswith('/'):
                link = base_url + link
            elif not link.startswith('http'):
                link = path + link
            if link not in unscraped_urls and link not in scraped_urls:
                unscraped_urls.append(link)


except KeyboardInterrupt:
    print('[-] Exiting...')

for email in emails:
    print(email)
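
The script above builds absolute links by hand and queues every link it finds, including ones that point to other sites. As a possible refinement, here is a minimal sketch that resolves links with the standard library's `urllib.parse.urljoin` and keeps only those on the target domain, which also helps the crawl stay within the authorized scope. The helpers `resolve_link` and `in_scope`, and the example URLs, are hypothetical names and values of mine, not part of the original script.

```python
from urllib.parse import urldefrag, urljoin, urlsplit


def resolve_link(page_url, href):
    """Resolve href against page_url; return None for non-HTTP(S) links."""
    if not href:
        return None
    # urljoin handles absolute, root-relative, and relative hrefs;
    # urldefrag drops any "#fragment" so the same page is not queued twice.
    absolute, _fragment = urldefrag(urljoin(page_url, href))
    if urlsplit(absolute).scheme not in ("http", "https"):
        return None
    return absolute


def in_scope(url, target_host):
    """True when url's host is target_host or one of its subdomains."""
    host = urlsplit(url).hostname or ""
    return host == target_host or host.endswith("." + target_host)


# Hypothetical page and hrefs used only for illustration.
page = "https://example.com/about/team.html"
hrefs = ["/contact", "jobs.html", "mailto:hr@example.com", "https://other.test/x"]

for href in hrefs:
    link = resolve_link(page, href)
    if link and in_scope(link, "example.com"):
        print(link)
```

In the crawler, the two helpers could stand in for the manual `base_url` and `path` concatenation, with only in-scope links appended to `unscraped_urls`.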

Conducting email-grabbing reconnaissance is a critical step in a penetration test, but it’s essential to do so responsibly and within legal boundaries. Always obtain proper authorization and follow ethical guidelines when conducting penetration testing.

By Vincent Olagbemide