Friday, November 4, 2016

Creating a Subdomain Crawler using Java

Running the Subdomain Crawler:

1) Clone the Git project from https://github.com/csanuragjain/subdomainCrawler

2) Create a Maven project in Eclipse and add the contents of the Git project above to it

3) Run Driver.java

4) Enter the domain for which you would like to extract subdomains

5) Subdomain extraction will start

6) You can extend the crawler by adding more search engines

7) You can export the project from Eclipse so that it runs as a standalone application
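
To give an idea of what step 6 could look like, here is a minimal sketch of a pluggable source design. Everything in it is an assumption made for illustration: the SubdomainSource interface, the StaticSource stand-in and the collect method are hypothetical names, and the actual classes in the repository may be organized differently. A real source would query a search engine instead of returning canned data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SourceSketch {

    /** Hypothetical contract: one implementation per search engine. */
    interface SubdomainSource {
        String name();
        Set<String> find(String domain);
    }

    /** Stand-in source with canned data; a real one would query a search engine. */
    static class StaticSource implements SubdomainSource {
        private final String name;
        private final List<String> hosts;

        StaticSource(String name, String... hosts) {
            this.name = name;
            this.hosts = Arrays.asList(hosts);
        }

        public String name() {
            return name;
        }

        public Set<String> find(String domain) {
            Set<String> out = new TreeSet<>();
            for (String h : hosts) {
                // Keep only hosts that sit under the requested domain.
                if (h.endsWith("." + domain)) {
                    out.add(h);
                }
            }
            return out;
        }
    }

    /** Runs every source and merges the results; the set drops duplicates. */
    static Set<String> collect(String domain, List<SubdomainSource> sources) {
        Set<String> all = new TreeSet<>();
        for (SubdomainSource s : sources) {
            all.addAll(s.find(domain));
        }
        return all;
    }

    public static void main(String[] args) {
        List<SubdomainSource> sources = Arrays.asList(
                new StaticSource("engineA", "www.example.com", "mail.example.com"),
                new StaticSource("engineB", "mail.example.com", "dev.example.com", "other.org"));
        System.out.println(collect("example.com", sources));
    }
}
```

With a design like this, adding a search engine means writing one more class that implements the interface and adding it to the list of sources.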

Search Engines Used for Crawling:

Google, Ask, Baidu, Bing, crt.sh, DNSDumpster, Netcraft, Passive DNS, VirusTotal, SSLSAN, ThreatCrowd, Yahoo
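
Because these sources are so different, the same host can come back in slightly different forms, for example in mixed case, with a trailing dot, or as a wildcard name taken from a certificate. The sketch below shows one possible way to clean the results before storing them. The HostNormalizer class and its methods are hypothetical names used for illustration, not code taken from the project.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

public class HostNormalizer {

    /** Returns a cleaned host name, or null if it is not a subdomain of the domain. */
    static String normalize(String raw, String domain) {
        String d = domain.trim().toLowerCase(Locale.ROOT);
        String h = raw.trim().toLowerCase(Locale.ROOT);
        if (h.startsWith("*.")) {
            h = h.substring(2);                 // drop a wildcard prefix
        }
        if (h.endsWith(".")) {
            h = h.substring(0, h.length() - 1); // drop a trailing root dot
        }
        // The leading dot in the suffix check rejects look-alikes such as notexample.com.
        return h.endsWith("." + d) ? h : null;
    }

    /** Normalizes every raw host and keeps the unique, valid ones in sorted order. */
    static Set<String> merge(String domain, List<String> rawHosts) {
        Set<String> out = new TreeSet<>();
        for (String raw : rawHosts) {
            String h = normalize(raw, domain);
            if (h != null) {
                out.add(h);
            }
        }
        return out;
    }
}
```

Note that the bare domain itself (example.com) is rejected on purpose, since only subdomains are of interest here.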


Project Explanation:

1) The Driver folder contains the chromedriver binary, which is used by Selenium

2) The Crawler folder contains the extraction rules for each search engine

3) Driver is the main class

4) The subdomainList variable in the Driver class holds all the subdomains found
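
To illustrate what an extraction rule might look like, here is a minimal sketch that pulls subdomains out of the text of a results page with a regular expression. It is not the project's actual rule: the ExtractionRuleSketch class and the extract method are hypothetical names, and the real crawlers work on pages loaded through Selenium and may use different logic for each engine.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractionRuleSketch {

    /** Finds host names under the given domain anywhere in the page text. */
    static Set<String> extract(String pageText, String domain) {
        // One or more DNS labels, then the literal domain. The lookahead stops
        // "foo.example.community" from matching as "foo.example.com".
        // Limitation: "a.example.com.evil.org" would still yield "a.example.com".
        Pattern p = Pattern.compile(
                "((?:[a-z0-9-]+\\.)+" + Pattern.quote(domain) + ")(?![a-z0-9-])",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(pageText);
        Set<String> found = new TreeSet<>(); // sorted, no duplicates
        while (m.find()) {
            found.add(m.group(1).toLowerCase(Locale.ROOT));
        }
        return found;
    }

    public static void main(String[] args) {
        String page = "Visit https://api.example.com/docs or MAIL.example.com; "
                + "see also foo.example.community and example.com";
        System.out.println(extract(page, "example.com"));
    }
}
```

Collecting the matches in a set plays the same role as subdomainList: a host reported by several search engines is stored only once.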

Let me know if you have any questions about any step of the program, or if you can suggest new search engines to add to the list.
