diff --git a/.dockerignore b/.dockerignore new file mode 100644 index 00000000..3214bcef --- /dev/null +++ b/.dockerignore @@ -0,0 +1,4 @@ +.git/ +.vscode/ +*.txt +!/requirements.txt \ No newline at end of file diff --git a/.gitignore b/.gitignore index 753dfef7..d1db3b66 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,19 @@ +# Virtual Environment +venv/ + +# vscode +.vscode/ + +# Python +__pycache__/ + # Jupyter Notebook .ipynb_checkpoints *.ipynb + +# Output files, except requirements.txt +*.txt +!requirements.txt + +# Comma-Separated Values (CSV) Reports +*.csv diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md new file mode 100644 index 00000000..f54c40d5 --- /dev/null +++ b/CODE_OF_CONDUCT.md @@ -0,0 +1,76 @@ +# Contributor Covenant Code of Conduct + +## Our Pledge + +In the interest of fostering an open and welcoming environment, we as +contributors and maintainers pledge to making participation in our project and +our community a harassment-free experience for everyone, regardless of age, body +size, disability, ethnicity, sex characteristics, gender identity and expression, +level of experience, education, socio-economic status, nationality, personal +appearance, race, religion, or sexual identity and orientation. 
+ +## Our Standards + +Examples of behavior that contributes to creating a positive environment +include: + +* Using welcoming and inclusive language +* Being respectful of differing viewpoints and experiences +* Gracefully accepting constructive criticism +* Focusing on what is best for the community +* Showing empathy towards other community members + +Examples of unacceptable behavior by participants include: + +* The use of sexualized language or imagery and unwelcome sexual attention or + advances +* Trolling, insulting/derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information, such as a physical or electronic + address, without explicit permission +* Other conduct which could reasonably be considered inappropriate in a + professional setting + +## Our Responsibilities + +Project maintainers are responsible for clarifying the standards of acceptable +behavior and are expected to take appropriate and fair corrective action in +response to any instances of unacceptable behavior. + +Project maintainers have the right and responsibility to remove, edit, or +reject comments, commits, code, wiki edits, issues, and other contributions +that are not aligned to this Code of Conduct, or to ban temporarily or +permanently any contributor for other behaviors that they deem inappropriate, +threatening, offensive, or harmful. + +## Scope + +This Code of Conduct applies both within project spaces and in public spaces +when an individual is representing the project or its community. Examples of +representing a project or community include using an official project e-mail +address, posting via an official social media account, or acting as an appointed +representative at an online or offline event. Representation of a project may be +further defined and clarified by project maintainers. 
+ +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be +reported by contacting the project team at yahya.arbabi@gmail.com. All +complaints will be reviewed and investigated and will result in a response that +is deemed necessary and appropriate to the circumstances. The project team is +obligated to maintain confidentiality with regard to the reporter of an incident. +Further details of specific enforcement policies may be posted separately. + +Project maintainers who do not follow or enforce the Code of Conduct in good +faith may face temporary or permanent repercussions as determined by other +members of the project's leadership. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, +available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html + +[homepage]: https://www.contributor-covenant.org + +For answers to common questions about this code of conduct, see +https://www.contributor-covenant.org/faq diff --git a/Dockerfile b/Dockerfile new file mode 100644 index 00000000..ee118b97 --- /dev/null +++ b/Dockerfile @@ -0,0 +1,6 @@ +FROM python:3.7-alpine +RUN /sbin/apk add tor +COPY . /opt/sherlock/ +RUN /usr/local/bin/pip install -r /opt/sherlock/requirements.txt + +ENTRYPOINT ["python", "/opt/sherlock/sherlock.py"] \ No newline at end of file diff --git a/README.md b/README.md index 467aabb6..7985691a 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,14 @@ # Sherlock -> Find usernames across over 75 social networks +> Find usernames across [social networks](https://github.com/sdushantha/sherlock/blob/master/sites.md)

- - +

## Installation +**NOTE**: Python 3.6 or higher is required. + ```bash # clone the repo $ git clone https://github.com/sdushantha/sherlock.git @@ -20,10 +21,53 @@ $ pip3 install -r requirements.txt ``` ## Usage -Just run ```python3 sherlock.py``` -All of the accounts found will be stored in a text file with their username (e.g ```user123.txt```) +```bash +$ python3 sherlock.py --help +usage: sherlock.py [-h] [--version] [--verbose] [--quiet] [--csv] [--tor] [--unique-tor] + USERNAMES [USERNAMES ...] + +Sherlock: Find Usernames Across Social Networks (Version 2018.12.30) + +positional arguments: + USERNAMES One or more usernames to check with social networks. + +optional arguments: + -h, --help show this help message and exit + --version Display version information and dependencies. + --verbose, -v, -d, --debug + Display extra debugging information. + --quiet, -q Disable debugging information (Default Option). + --csv Create Comma-Separated Values (CSV) File. + --tor, -t Make requests over TOR; increases runtime; requires TOR to be installed and in system path. + --unique-tor, -u Make requests over TOR with new TOR circuit after each request; increases runtime; requires TOR to be installed and in system path. +``` + +For example, run ```python3 sherlock.py user123```, and all of the accounts +found will be stored in a text file named after the username (e.g., ```user123.txt```). + +## Docker Notes +If you have Docker installed, you can build an image and run Sherlock as a container. + +``` +docker build -t mysherlock-image . +``` + +Once the image is built, Sherlock can be invoked by running the following: + +``` +docker run --rm mysherlock-image user123 +``` + +The ```--rm``` flag is optional. It removes the container's filesystem once the run finishes, so stopped containers do not pile up on the host. 
See https://docs.docker.com/engine/reference/run/#clean-up---rm + +One caveat is the text file that is created will only exist in the container so you will not be able to get at that. + +Or you can simply use "Docker Hub" to run `sherlock`: +``` +docker run theyahya/sherlock user123 +``` ## License MIT License diff --git a/data.json b/data.json index 426f8c1f..1fedcea4 100644 --- a/data.json +++ b/data.json @@ -1,355 +1,576 @@ { "Instagram": { "url": "https://www.instagram.com/{}", + "urlMain": "https://www.instagram.com/", "errorType": "message", "errorMsg": "The link you followed may be broken" }, "Twitter": { "url": "https://www.twitter.com/{}", + "urlMain": "https://www.twitter.com/", "errorType": "message", "errorMsg": "page doesn’t exist" }, "Facebook": { "url": "https://www.facebook.com/{}", - "errorType": "status_code" + "urlMain": "https://www.facebook.com/", + "errorType": "status_code", + "regexCheck": "^[a-zA-Z0-9]{4,49}(?>>>>>> 0d857030939da206f9e6098241ff80d869ae80e8 headers = { 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:55.0) Gecko/20100101 Firefox/55.0' } +<<<<<<< HEAD for social_network in social_networks_params: url = social_networks_params.get(social_network).get("url").format(username) error_type = social_networks_params.get(social_network).get("errorType") @@ -91,4 +200,280 @@ def main(): print(f"\033[1;92m[\033[0m\033[1;77m*\033[0m\033[1;92m] Saved: \033[37;1m{filename}\033[0m") if '__main__' in __name__: - main() \ No newline at end of file + main() +======= + # Load the data + data_file_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "data.json") + with open(data_file_path, "r", encoding="utf-8") as raw: + data = json.load(raw) + + # Allow 1 thread for each external service, so `len(data)` threads total + executor = ThreadPoolExecutor(max_workers=len(data)) + + # Create session based on request methodology + underlying_session = requests.session() + underlying_request = requests.Request() + if tor or 
unique_tor: + underlying_request = TorRequest() + underlying_session = underlying_request.session() + + # Create multi-threaded session for all requests + session = FuturesSession(executor=executor, session=underlying_session) + + # Results from analysis of all sites + results_total = {} + + # First create futures for all requests. This allows for the requests to run in parallel + for social_network, net_info in data.items(): + + # Results from analysis of this specific site + results_site = {} + + # Record URL of main site + results_site['url_main'] = net_info.get("urlMain") + + # Don't make request if username is invalid for the site + regex_check = net_info.get("regexCheck") + if regex_check and re.search(regex_check, username) is None: + # No need to do the check at the site: this user name is not allowed. + print((Style.BRIGHT + Fore.WHITE + "[" + + Fore.RED + "-" + + Fore.WHITE + "]" + + Fore.GREEN + " {}:" + + Fore.YELLOW + " Illegal Username Format For This Site!").format(social_network)) + results_site["exists"] = "illegal" + else: + # URL of user on site (if it exists) + url = net_info["url"].format(username) + results_site["url_user"] = url + + # If only the status_code is needed don't download the body + if net_info["errorType"] == 'status_code': + request_method = session.head + else: + request_method = session.get + + # This future starts running the request in a new thread, doesn't block the main thread + future = request_method(url=url, headers=headers) + + # Store future in data for access later + net_info["request_future"] = future + + # Reset identity for tor (if needed) + if unique_tor: + underlying_request.reset_identity() + + # Add this site's results into final dictionary with all of the other results. + results_total[social_network] = results_site + + # Open the file containing account links + f = open_file(fname) + + # Core logic: If tor requests, make them here. 
If multi-threaded requests, wait for responses + for social_network, net_info in data.items(): + + # Retrieve results again + results_site = results_total.get(social_network) + + # Retrieve other site information again + url = results_site.get("url_user") + exists = results_site.get("exists") + if exists is not None: + # We have already determined the user doesn't exist here + continue + + # Get the expected error type + error_type = net_info["errorType"] + + # Default data in case there are any failures in doing a request. + http_status = "?" + response_text = "" + + # Retrieve future and ensure it has finished + future = net_info["request_future"] + r, error_type = get_response(request_future=future, + error_type=error_type, + social_network=social_network, + verbose=verbose) + + # Attempt to get request information + try: + http_status = r.status_code + except: + pass + try: + response_text = r.text.encode(r.encoding) + except: + pass + + if error_type == "message": + error = net_info.get("errorMsg") + # Checks if the error message is in the HTML + if not error in r.text: + + print((Style.BRIGHT + Fore.WHITE + "[" + + Fore.GREEN + "+" + + Fore.WHITE + "]" + + Fore.GREEN + " {}:").format(social_network), url) + write_to_file(url, f) + exists = "yes" + amount=amount+1 + else: + print((Style.BRIGHT + Fore.WHITE + "[" + + Fore.RED + "-" + + Fore.WHITE + "]" + + Fore.GREEN + " {}:" + + Fore.YELLOW + " Not Found!").format(social_network)) + exists = "no" + + elif error_type == "status_code": + # Checks if the status code of the response is 2XX + if 200 <= r.status_code < 300: + + print((Style.BRIGHT + Fore.WHITE + "[" + + Fore.GREEN + "+" + + Fore.WHITE + "]" + + Fore.GREEN + " {}:").format(social_network), url) + write_to_file(url, f) + exists = "yes" + amount=amount+1 + else: + print((Style.BRIGHT + Fore.WHITE + "[" + + Fore.RED + "-" + + Fore.WHITE + "]" + + Fore.GREEN + " {}:" + + Fore.YELLOW + " Not Found!").format(social_network)) + exists = 
"no" + + elif error_type == "response_url": + error = net_info.get("errorUrl") + # Checks if the redirect url is the same as the one defined in data.json + if not error in r.url: + + print((Style.BRIGHT + Fore.WHITE + "[" + + Fore.GREEN + "+" + + Fore.WHITE + "]" + + Fore.GREEN + " {}:").format(social_network), url) + write_to_file(url, f) + exists = "yes" + amount=amount+1 + else: + print((Style.BRIGHT + Fore.WHITE + "[" + + Fore.RED + "-" + + Fore.WHITE + "]" + + Fore.GREEN + " {}:" + + Fore.YELLOW + " Not Found!").format(social_network)) + exists = "no" + + elif error_type == "": + print((Style.BRIGHT + Fore.WHITE + "[" + + Fore.RED + "-" + + Fore.WHITE + "]" + + Fore.GREEN + " {}:" + + Fore.YELLOW + " Error!").format(social_network)) + exists = "error" + + # Save exists flag + results_site['exists'] = exists + + # Save results from request + results_site['http_status'] = http_status + results_site['response_text'] = response_text + + # Add this site's results into final dictionary with all of the other results. + results_total[social_network] = results_site + + print((Style.BRIGHT + Fore.GREEN + "[" + + Fore.YELLOW + "*" + + Fore.GREEN + "] Saved: " + + Fore.WHITE + "{}").format(fname)) + + final_score(amount, f) + return results_total + + +def main(): + # Colorama module's initialization. + init(autoreset=True) + + version_string = f"%(prog)s {__version__}\n" + \ + f"{requests.__description__}: {requests.__version__}\n" + \ + f"Python: {platform.python_version()}" + + parser = ArgumentParser(formatter_class=RawDescriptionHelpFormatter, + description=f"{module_name} (Version {__version__})" + ) + parser.add_argument("--version", + action="version", version=version_string, + help="Display version information and dependencies." + ) + parser.add_argument("--verbose", "-v", "-d", "--debug", + action="store_true", dest="verbose", default=False, + help="Display extra debugging information." 
+ ) + parser.add_argument("--quiet", "-q", + action="store_false", dest="verbose", + help="Disable debugging information (Default Option)." + ) + parser.add_argument("--tor", "-t", + action="store_true", dest="tor", default=False, + help="Make requests over TOR; increases runtime; requires TOR to be installed and in system path.") + parser.add_argument("--unique-tor", "-u", + action="store_true", dest="unique_tor", default=False, + help="Make requests over TOR with new TOR circuit after each request; increases runtime; requires TOR to be installed and in system path.") + parser.add_argument("--csv", + action="store_true", dest="csv", default=False, + help="Create Comma-Separated Values (CSV) File." + ) + parser.add_argument("username", + nargs='+', metavar='USERNAMES', + action="store", + help="One or more usernames to check with social networks." + ) + + args = parser.parse_args() + + # Banner + print(Fore.WHITE + Style.BRIGHT + +""" .\"\"\"-. + / \\ + ____ _ _ _ | _..--'-. +/ ___|| |__ ___ _ __| | ___ ___| |__ >.`__.-\"\"\;\"` +\___ \| '_ \ / _ \ '__| |/ _ \ / __| |/ / / /( ^\\ + ___) | | | | __/ | | | (_) | (__| < '-`) =|-. +|____/|_| |_|\___|_| |_|\___/ \___|_|\_\ /`--.'--' \ .-. + .'`-._ `.\ | J / + / `--.| \__/""") + + if args.tor or args.unique_tor: + print("Warning: some websites might refuse connecting over TOR, so note that using this option might increase connection errors.") + + # Run report on all specified users. 
+ for username in args.username: + print() + results = sherlock(username, verbose=args.verbose, tor=args.tor, unique_tor=args.unique_tor) + + if args.csv: + with open(username + ".csv", "w", newline='') as csv_report: + writer = csv.writer(csv_report) + writer.writerow(['username', + 'name', + 'url_main', + 'url_user', + 'exists', + 'http_status' + ] + ) + for site in results: + writer.writerow([username, + site, + results[site]['url_main'], + results[site].get('url_user', ''), + results[site]['exists'], + results[site].get('http_status', '') + ] + ) + +if __name__ == "__main__": + main() +>>>>>>> 0d857030939da206f9e6098241ff80d869ae80e8 diff --git a/sherlock_preview.png b/sherlock_preview.png deleted file mode 100644 index 30e294d9..00000000 Binary files a/sherlock_preview.png and /dev/null differ diff --git a/site_list.py b/site_list.py new file mode 100644 index 00000000..9cd76461 --- /dev/null +++ b/site_list.py @@ -0,0 +1,19 @@ +"""Sherlock: Supported Site Listing + +This module generates the listing of supported sites. +""" +import json + +with open("data.json", "r", encoding="utf-8") as data_file: + data = json.load(data_file) + +with open("sites.md", "w") as site_file: + site_file.write(f'## List Of Supported Sites ({len(data)} Sites In Total!)\n') + + index = 1 + for social_network in data: + url_main = data.get(social_network).get("urlMain") + site_file.write(f'{index}. [{social_network}]({url_main})\n') + index = index + 1 + +print("Finished updating supported site listing!") diff --git a/sites.md b/sites.md new file mode 100644 index 00000000..b5bb2889 --- /dev/null +++ b/sites.md @@ -0,0 +1,102 @@ +## List Of Supported Sites (101 Sites In Total!) +1. [Instagram](https://www.instagram.com/) +2. [Twitter](https://www.twitter.com/) +3. [Facebook](https://www.facebook.com/) +4. [YouTube](https://www.youtube.com/) +5. [Blogger](https://www.blogger.com/) +6. [Google Plus](https://plus.google.com/) +7. [Reddit](https://www.reddit.com/) +8. 
[Pinterest](https://www.pinterest.com/) +9. [GitHub](https://www.github.com/) +10. [Steam](https://steamcommunity.com/) +11. [Vimeo](https://vimeo.com/) +12. [SoundCloud](https://soundcloud.com/) +13. [Disqus](https://disqus.com/) +14. [Medium](https://medium.com/) +15. [DeviantART](https://deviantart.com) +16. [VK](https://vk.com/) +17. [About.me](https://about.me/) +18. [Imgur](https://imgur.com/) +19. [9GAG](https://9gag.com/) +20. [Flipboard](https://flipboard.com/) +21. [SlideShare](https://slideshare.net/) +22. [Fotolog](https://fotolog.com/) +23. [Spotify](https://open.spotify.com/) +24. [MixCloud](https://www.mixcloud.com/) +25. [Scribd](https://www.scribd.com/) +26. [Patreon](https://www.patreon.com/) +27. [BitBucket](https://bitbucket.org/) +28. [Roblox](https://www.roblox.com/) +29. [Gravatar](http://en.gravatar.com/) +30. [iMGSRC.RU](https://imgsrc.ru/) +31. [DailyMotion](https://www.dailymotion.com/) +32. [Etsy](https://www.etsy.com/) +33. [CashMe](https://cash.me/) +34. [Behance](https://www.behance.net/) +35. [GoodReads](https://www.goodreads.com/) +36. [Instructables](https://www.instructables.com/) +37. [Keybase](https://keybase.io/) +38. [Kongregate](https://www.kongregate.com/) +39. [LiveJournal](https://www.livejournal.com/) +40. [VSCO](https://vsco.co/) +41. [AngelList](https://angel.co/) +42. [last.fm](https://last.fm/) +43. [Dribbble](https://dribbble.com/) +44. [Codecademy](https://www.codecademy.com/) +45. [Pastebin](https://pastebin.com/) +46. [Foursquare](https://foursquare.com/) +47. [Gumroad](https://www.gumroad.com/) +48. [Newgrounds](https://newgrounds.com) +49. [Wattpad](https://www.wattpad.com/) +50. [Canva](https://www.canva.com/) +51. [Trakt](https://www.trakt.tv/) +52. [500px](https://500px.com/) +53. [BuzzFeed](https://buzzfeed.com/) +54. [TripAdvisor](https://tripadvisor.com/) +55. [Contently](https://contently.com/) +56. [Houzz](https://houzz.com/) +57. [BLIP.fm](https://blip.fm/) +58. 
[HackerNews](https://news.ycombinator.com/) +59. [Codementor](https://www.codementor.io/) +60. [ReverbNation](https://www.reverbnation.com/) +61. [Designspiration](https://www.designspiration.net/) +62. [Bandcamp](https://www.bandcamp.com/) +63. [ColourLovers](https://www.colourlovers.com/) +64. [IFTTT](https://www.ifttt.com/) +65. [Ebay](https://www.ebay.com/) +66. [Slack](https://slack.com) +67. [Trip](https://www.trip.skyscanner.com/) +68. [Ello](https://ello.co/) +69. [HackerOne](https://hackerone.com/) +70. [Tinder](https://tinder.com/) +71. [We Heart It](https://weheartit.com/) +72. [Flickr](https://www.flickr.com/) +73. [WordPress](https://wordpress.com) +74. [Unsplash](https://unsplash.com/) +75. [Pexels](https://www.pexels.com/) +76. [devRant](https://devrant.com/) +77. [MyAnimeList](https://myanimelist.net/) +78. [ImageShack](https://imageshack.us/) +79. [Badoo](https://badoo.com/) +80. [MeetMe](https://www.meetme.com/) +81. [Quora](https://www.quora.com/) +82. [Pixabay](https://pixabay.com/) +83. [Giphy](https://giphy.com/) +84. [Taringa](https://taringa.net/) +85. [SourceForge](https://sourceforge.net/) +86. [Codepen](https://codepen.io/) +87. [Launchpad](https://launchpad.net/) +88. [Photobucket](https://photobucket.com/) +89. [Wix](https://wix.com/) +90. [Crevado](https://crevado.com/) +91. [Carbonmade](https://carbonmade.com/) +92. [Coroflot](https://coroflot.com/) +93. [Jimdo](https://jimdosite.com/) +94. [Repl.it](https://repl.it/) +95. [Issuu](https://issuu.com/) +96. [YouPic](https://youpic.com/) +97. [House-Mixes.com](https://www.house-mixes.com/) +98. [Letterboxd](https://letterboxd.com/) +99. [Coderwall](https://coderwall.com/) +100. [Wikipedia](https://www.wikipedia.org/) +101. [Mastodon](https://mstdn.io/)