Add override to HTTP Status Detection so HEAD request is not used. Configure Instagram to use this option.

In most cases, when we detect by status code, it is not necessary to fetch the entire body: the HEAD response is enough. However, Richard Getz discovered that some sites (e.g. Instagram) do not respond properly if Sherlock sends only a HEAD request.

Add a "request_head_only" attribute to data.json so that HTTP Status Detection can be configured either way. The attribute defaults to true when omitted, so existing site entries keep using HEAD. Supporting the change this way is simpler because it requires no changes to the tests.
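
As a rough illustration of how such an attribute might be interpreted, here is a minimal sketch. The "ExampleSite" entry and the head_only helper are hypothetical and not part of Sherlock; only the Instagram fields mirror the diff in this commit.

```python
import json

# Hypothetical data.json fragment. Only the Instagram fields come from the
# commit; "ExampleSite" is invented to show the default behaviour.
SITE_DATA = json.loads("""
{
  "Instagram": {
    "errorType": "status_code",
    "request_head_only": false,
    "url": "https://www.instagram.com/{}"
  },
  "ExampleSite": {
    "errorType": "status_code",
    "url": "https://example.com/{}"
  }
}
""")


def head_only(entry):
    """Return True when a HEAD request is assumed sufficient for this site."""
    # A missing "request_head_only" key is treated as true, so entries
    # that do not mention the attribute keep using HEAD.
    return bool(entry["errorType"] == "status_code"
                and entry.get("request_head_only", True))


for name, entry in SITE_DATA.items():
    print(name, "HEAD" if head_only(entry) else "GET")
```

Because the lookup falls back to true, only sites that misbehave on HEAD need to carry the attribute.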

With Richard Getz <richardgetziii@gmail.com>
pull/599/head
Christopher K. Hoadley 4 years ago
parent 0ba4980887
commit 8619a353e4

@@ -68,7 +68,7 @@ usage: sherlock [-h] [--version] [--verbose] [--rank]
                 [--no-color] [--browse]
                 USERNAMES [USERNAMES ...]
-Sherlock: Find Usernames Across Social Networks (Version 0.11.1)
+Sherlock: Find Usernames Across Social Networks (Version 0.12.0)
 positional arguments:
   USERNAMES One or more usernames to check with social networks.

@@ -882,6 +882,7 @@
   },
   "Instagram": {
     "errorType": "status_code",
+    "request_head_only": false,
     "rank": 35,
     "url": "https://www.instagram.com/{}",
     "urlMain": "https://www.instagram.com/",

@@ -30,7 +30,7 @@ from notify import QueryNotifyPrint
 from sites import SitesInformation
 module_name = "Sherlock: Find Usernames Across Social Networks"
-__version__ = "0.11.1"
+__version__ = "0.12.0"
@@ -237,10 +237,16 @@ def sherlock(username, site_data, query_notify,
                 # from where the user profile normally can be found.
                 url_probe = url_probe.format(username)
             #If only the status_code is needed don't download the body
-            if net_info["errorType"] == 'status_code':
+            if (net_info["errorType"] == 'status_code' and
+                net_info.get("request_head_only", True) == True):
+                #In most cases when we are detecting by status code,
+                #it is not necessary to get the entire body: we can
+                #detect fine with just the HEAD response.
                 request_method = session.head
             else:
+                #Either this detect method needs the content associated
+                #with the GET response, or this specific website will
+                #not respond properly unless we request the whole page.
                 request_method = session.get
             if net_info["errorType"] == "response_url":
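
The selection logic in the hunk above can be tried in isolation without touching the network. The following is only a sketch: RecordingSession and probe are hypothetical stand-ins, not Sherlock code, and the session merely records which method would have been called.

```python
class RecordingSession:
    """Hypothetical stand-in for a requests-style session.

    It records the calls it receives instead of sending anything.
    """

    def __init__(self):
        self.calls = []

    def head(self, url):
        self.calls.append(("HEAD", url))

    def get(self, url):
        self.calls.append(("GET", url))


def probe(session, net_info, username):
    """Pick HEAD or GET the way the diff does, then issue the request."""
    url = net_info["url"].format(username)
    if (net_info["errorType"] == "status_code"
            and net_info.get("request_head_only", True)):
        # Status-code detection normally needs only the headers.
        request_method = session.head
    else:
        # Either the body is needed, or the site opted out of HEAD.
        request_method = session.get
    request_method(url)
    return url


instagram = {
    "errorType": "status_code",
    "request_head_only": False,
    "url": "https://www.instagram.com/{}",
}
session = RecordingSession()
probe(session, instagram, "alice")
print(session.calls)
```

With the Instagram entry from this commit, the recorded call is a GET; removing the "request_head_only" key from the dictionary would make the same call a HEAD.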
