Reconnaissance for a penetration test, and corporate email harvesting in particular, is essential for assessing an organization’s security posture. It involves collecting information about the organization’s email infrastructure to identify vulnerabilities and security weaknesses. With its versatility and large library ecosystem, Python is a useful tool for this kind of reconnaissance. Here’s an overview of the process:

Information Gathering

Reconnaissance begins with gathering publicly available information about the target organization. This data can include the organization’s name, its domain name, and any email addresses that are already known. Python libraries such as requests and BeautifulSoup (the bs4 package) can be used to scrape this information from websites, search engines, and social media profiles.
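
As a minimal sketch of the extraction step, assuming the page text has already been downloaded, the function below pulls address-like strings out of raw text with a regular expression and optionally keeps only those on the target domain. The name `extract_emails`, the pattern, and the `example.com` domain are illustrative assumptions.

```python
import re

# Illustrative pattern: local part, "@", domain labels, and an alphabetic TLD.
# It is deliberately simple and will not match every valid address.
EMAIL_RE = re.compile(r"[a-z0-9._+-]+@[a-z0-9.-]+\.[a-z]{2,}", re.IGNORECASE)


def extract_emails(text, domain=None):
    """Return the set of lower-cased addresses found in text.

    If domain is given, keep only addresses at that domain or its subdomains.
    """
    found = {match.lower() for match in EMAIL_RE.findall(text)}
    if domain is None:
        return found
    domain = domain.lower()
    return {
        email for email in found
        if email.rsplit("@", 1)[1] == domain
        or email.rsplit("@", 1)[1].endswith("." + domain)
    }


if __name__ == "__main__":
    # Hypothetical sample text; example.com stands in for the target domain.
    sample = "Contact Sales@Example.com or it@mail.example.com, not bob@other.org."
    print(sorted(extract_emails(sample, domain="example.com")))
```

Lower-casing the matches and storing them in a set removes duplicates that differ only in case, which is common when the same address appears in both a page body and a footer.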

The code is available on my GitHub repo, along with instructions on how to run it.

import requests
import requests.exceptions
from collections import deque
import urllib.parse
import re
from bs4 import BeautifulSoup

# input() already returns a string; strip stray whitespace from the pasted URL
original_url = input('[+] Enter Target URL: ').strip()
unscraped_urls = deque([original_url])

scraped_urls = set()
emails = set()

# The scraping process

depth = 0

try:
    while len(unscraped_urls):
        depth += 1
        if depth == 200:
            break
        url = unscraped_urls.popleft()
        scraped_urls.add(url)

        parts = urllib.parse.urlsplit(url)
        base_url = '{0.scheme}://{0.netloc}'.format(parts)
        if '/' in parts.path:
            path=url[:url.rfind('/')+1]
        else:
            path = url
        print('[%d] Scraping %s' % (depth, url))

        try:
            # A timeout keeps one unresponsive host from stalling the crawl
            response = requests.get(url, timeout=10)
        except requests.exceptions.RequestException:
            # Skip URLs that are malformed, unreachable, or time out
            continue

        # Collect anything on the page that looks like an email address
        new_emails = set(re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+", response.text, re.I))
        emails.update(new_emails)

        soup = BeautifulSoup(response.text, features="lxml")

        for anchor in soup.find_all("a"):
            # Anchors without an href have nothing to follow
            link = anchor.attrs.get("href", "")
            if not link:
                continue

            # Resolve relative links against the site root or the current directory
            if link.startswith('/'):
                link = base_url + link
            elif not link.startswith('http'):
                link = path + link
            if link not in unscraped_urls and link not in scraped_urls:
                unscraped_urls.append(link)


except KeyboardInterrupt:
    print('[-] Exiting...')

for email in emails:
    print(email)
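
The script above queues every link it finds, including links to other sites and `mailto:` links. One possible refinement, sketched here under the assumption that the engagement's scope is a single host, is to resolve each `href` with `urllib.parse.urljoin` and queue only in-scope HTTP(S) URLs. The helper name `resolve_in_scope` and its parameters are hypothetical and not part of the original code.

```python
from urllib.parse import urldefrag, urljoin, urlsplit


def resolve_in_scope(page_url, href, allowed_host):
    """Resolve href against page_url and return the absolute URL.

    Returns None if the link is empty, is not http(s), or points to a host
    other than allowed_host (which should be given in lower case).
    """
    if not href:
        return None
    # urljoin handles "/path", "page.html", "../up" and absolute URLs alike;
    # urldefrag drops "#fragment" so the same page is not queued twice.
    absolute, _fragment = urldefrag(urljoin(page_url, href))
    parts = urlsplit(absolute)
    if parts.scheme not in ("http", "https"):
        return None
    if parts.hostname != allowed_host:
        return None
    return absolute


if __name__ == "__main__":
    # Hypothetical page and links; example.com stands in for the target host.
    page = "https://example.com/a/b.html"
    for href in ["c.html", "/contact#team", "mailto:info@example.com",
                 "https://other.org/x"]:
        print(href, "->", resolve_in_scope(page, href, "example.com"))
```

Inside the crawl loop, a call such as `resolve_in_scope(url, anchor.get("href"), allowed_host)` could replace the manual string concatenation, with `None` results simply skipped. Staying on the authorized host also keeps the crawl within the boundaries of the engagement.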

Conducting email-grabbing reconnaissance is a critical step in a penetration test, but it’s essential to do so responsibly and within legal boundaries. Always obtain proper authorization and follow ethical guidelines when conducting penetration testing.



Vinay

Lead software engineer with over 5 years in development and 7+ years in network engineering. Recognised educator with a substantial online presence. Holds an MSc in Cybersecurity from the University of Derby, UK, a BSc in Information Technology from UEW Ghana, and a Higher Diploma in Network Engineering from NIIT India.
