📒
My Pentesting Cheatsheet
  • Home
  • Commands Only Summary
    • Some other cool websites
  • Preparation
    • Documents
    • Contract - Checklist
    • Rules of Engagement - Checklist
    • Contractors Agreement - Checklist for Physical Assessments
  • Information Gathering
  • Vulnerability Assessment
  • Pentesting Machine
  • Enumeration
    • NMAP Scan types explained
    • Firewall and IDS/IPS Evasion
  • Footprinting
    • Google Dorks
    • Samba (smb)
    • NFS
    • DNS
    • SMTP
    • IMAP/POP3
    • SNMP
    • MySQL
    • MSSQL
    • Oracle TNS
    • IPMI
    • SSH
    • RDP
    • WinRM
  • Web Information Gathering
    • Whois
    • DNS & Subdomains
    • Fingerprinting
    • Crawlers
    • Search Engine Discovery
    • Automating Recon
  • Vulnerability Assessment
  • File Transfers
    • Windows Target
    • Linux Target
    • Transferring Files with Code
    • Miscellaneous File Transfer Methods
    • Protected Files Transfer
    • Catching Files over HTTP/S (Nginx)
    • Living Off The Land
    • Evading Detection
  • Shells & Payloads
    • Reverse Shells + Bind + Web
  • Password Attacks
    • John the ripper
    • Remote password attacks
    • Password mutations
    • Password Reuse / Default Passwords
    • Windows Local Password Attacks
    • Linux Local Password Attacks
    • Windows Lateral Movement
    • Cracking Files
  • Attacking Common Services
    • FTP
    • SMB
    • SQL
    • RDP
    • DNS
    • Email Services
  • Pivoting, Tunneling, and Port Forwarding
    • Choosing The Dig Site & Starting Our Tunnels
    • Playing Pong with Socat
    • Pivoting Around Obstacles
    • Branching Out Our Tunnels
    • Double Pivots
    • Final considerations
  • Active Directory Enumeration & Attacks
    • Initial Enumeration
    • Sniffing out a Foothold
    • Sighting In, Hunting For A User
    • Spray Responsibly
    • Deeper Down the Rabbit Hole
    • Kerberoasting - Cooking with Fire
    • Access Control List (ACL)
    • Advanced Privilege Escalation in Active Directory: Stacking The Deck
    • Domain trusts
    • Domain Trusts - Cross Forest
    • Defensive Considerations
  • Using Web Proxies
  • Login Brute Forcing
  • SQL Injection Fundamentals
    • Mitigating SQL Injection
  • SQLMap Essentials
    • Building Attacks
    • Database Enumeration
    • Advanced SQLMap Usage
  • Cross-Site Scripting (XSS)
    • Prevention
  • File Inclusion
  • File Upload Attacks
    • Basic Exploitation
    • Bypassing Filters
    • Other Upload Attacks
    • Prevention
  • Command Injections
    • Exploitation
    • Filter Evasion
  • Web Attacks
    • HTTP Verb Tampering
    • Insecure Direct Object References (IDOR)
    • XML External Entity (XXE) Injection
    • GraphQL
  • Attacking Common Applications
    • Application Discovery & Enumeration
    • Content Management Systems (CMS)
    • Servlet Containers/Software Development
    • Infrastructure/Network Monitoring Tools
    • Customer Service Mgmt & Configuration Management
    • Common Gateway Interfaces
    • Thick Client Applications
    • Miscellaneous Applications
  • Privilege Escalation
    • Linux Privilege Escalation
      • Information Gathering
      • Environment-based Privilege Escalation
      • Service-based Privilege Escalation
      • Linux Internals-based Privilege Escalation
      • Recent 0-Days
      • Linux Hardening
    • Windows Privilege Escalation
      • Getting the Lay of the Land
      • Windows User Privileges
      • Windows Group Privileges
      • Attacking the OS
      • Credential Theft
      • Restricted Environments
      • Additional Techniques
      • Dealing with End of Life Systems
      • Windows Hardening
    • Windows (old page)
  • Documentation & Reporting
    • Preparation
    • Reporting
  • Attacking Enterprise Networks
    • Pre-Engagement
    • External Testing
    • Internal Testing
    • Lateral Movement & Privilege Escalation
    • Wrapping Up
  • Deobfuscation
  • Metasploit
    • msfvenom
  • Custom compiled files
  • XSS
  • Azure AD (Entra ID)
Powered by GitBook
On this page
  • Popular Web Crawlers
  • Scrapy
  1. Web Information Gathering

Crawlers

PreviousFingerprintingNextSearch Engine Discovery

Last updated 7 months ago

Remember to check:

  • robots.txt

  • .well-known

Popular Web Crawlers

  1. Burp Suite Spider: Burp Suite, a widely used web application testing platform, includes a powerful active crawler called Spider. Spider excels at mapping out web applications, identifying hidden content, and uncovering potential vulnerabilities.

  2. OWASP ZAP (Zed Attack Proxy): ZAP is a free, open-source web application security scanner. It can be used in automated and manual modes and includes a spider component to crawl web applications and identify potential vulnerabilities.

  3. Scrapy (Python Framework): Scrapy is a versatile and scalable Python framework for building custom web crawlers. It provides rich features for extracting structured data from websites, handling complex crawling scenarios, and automating data processing. Its flexibility makes it ideal for tailored reconnaissance tasks.

  4. Apache Nutch (Scalable Crawler): Nutch is a highly extensible and scalable open-source web crawler written in Java. It's designed to handle massive crawls across the entire web or focus on specific domains. While it requires more technical expertise to set up and configure, its power and flexibility make it a valuable asset for large-scale reconnaissance projects.

Scrapy

pip3 install scrapy

Good example:

2KB
ReconSpider.v1.2.zip
archive