Christopher K. Hoadley
26ef2e1b9b
Convert Codementor to use the Status Code detection method. The site gives a clean 404 error. Add to tests.
6 years ago
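The Status Code detection method named in these commits can be sketched as below. This is a minimal illustration, not the project's code: the function name `account_exists_by_status` is hypothetical, and it assumes that any 2xx response means the profile exists while anything else (such as the clean 404 mentioned above) means it does not.

```python
def account_exists_by_status(status_code: int) -> bool:
    """Classify a profile lookup by HTTP status alone.

    Assumption: 2xx means the profile page exists; every other
    status (including 404) is treated as "not found".
    """
    return 200 <= status_code < 300


if __name__ == "__main__":
    for code in (200, 404):
        print(code, account_exists_by_status(code))
```

This only works for sites that return a real error status for unknown users, which is why each conversion commit notes that the site "gives a clean 404 error".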
Christopher K. Hoadley
110b93a757
Convert Codecademy to use the Status Code detection method. The site gives a clean 404 error. Add to tests.
6 years ago
Christopher K. Hoadley
223d9716cb
Convert BuzzFeed to use the Status Code detection method. The site gives a clean 404 error. Add to tests.
6 years ago
Christopher K. Hoadley
08ac008828
Convert Behance to use the Status Code detection method. The site gives a clean 404 error. Add to tests.
6 years ago
Christopher K. Hoadley
65e3820608
Convert Bandcamp to use the Status Code detection method. The site gives a clean 404 error. Add to tests.
6 years ago
Christopher K. Hoadley
c76b4524da
Convert BLIP.fm to use the Status Code detection method. The site gives a clean 404 error. Add to tests.
6 years ago
Christopher K. Hoadley
8a82d883c6
Convert AngelList to use the Status Code detection method. The site gives a clean 404 error. Add to tests.
6 years ago
Christopher K. Hoadley
89787b1509
Add test methods for HTTP Status detection method as well.
6 years ago
Christopher K. Hoadley
bd941c8034
Convert Academia.edu to use the Status Code detection method. The site gives a clean 404 error.
6 years ago
Christopher K. Hoadley
f609320d3c
Convert Canva to the more robust Response URL detection method. Add to tests to ensure that it is covered.
6 years ago
Yahya SayadArbabi
916fdd0603
Merge branch 'BlucyBlue-master'
6 years ago
Yahya SayadArbabi
f69be05803
Rebase & bump version
6 years ago
BlucyBlue
465f4c85c3
Typo in printout when reading proxies from file.
6 years ago
BlucyBlue
9f523365f7
Finally importing load_proxies module.
6 years ago
BlucyBlue
8587d1a835
If the ProxyError gets raised in the 'get_response' function, the request will be tried with another proxy selected from the 'proxy_list' global var. New parameter 'retry_no' is the number of retries that will be made before throwing a final ProxyError.
6 years ago
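The retry behaviour described in this commit might look roughly like the sketch below. Only `get_response`, `retry_no`, and `proxy_list` come from the commit message; the `fetch` callable, the argument order, and the local `ProxyError` class (a stand-in for `requests.exceptions.ProxyError`, so the example runs offline) are assumptions made for illustration.

```python
import random


class ProxyError(Exception):
    """Stand-in for requests.exceptions.ProxyError (assumption, keeps the sketch offline)."""


def get_response(fetch, url, proxy, proxy_list, retry_no=3):
    """Call fetch(url, proxy); on ProxyError retry with a random proxy from proxy_list.

    After retry_no failed retries (or if proxy_list is empty) the last
    ProxyError is re-raised to the caller.
    """
    try:
        return fetch(url, proxy)
    except ProxyError:
        if retry_no > 0 and proxy_list:
            new_proxy = random.choice(proxy_list)
            return get_response(fetch, url, new_proxy, proxy_list, retry_no - 1)
        raise
```

With `retry_no=3` a request is attempted at most four times: the original call plus three retries.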
BlucyBlue
6bf8358342
Set new parameter 'retry_no' of the 'get_response' function to 3 (can be changed). This will be used if retrying a ProxyError.
6 years ago
BlucyBlue
855f154d9b
If the 'proxy_list' is not empty, we select a random member and pass it as the proxy to the session. If the list is empty, the proxy parameter will be set to arg.proxy, which defaults to None if the user did not pass an individual proxy as well.
6 years ago
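The selection rule in this commit can be illustrated with a few lines. The helper name `choose_proxy` and its parameters are hypothetical; the sketch only assumes that `proxy_list` is a plain list and that the single-proxy value is None when the user did not supply one.

```python
import random


def choose_proxy(proxy_list, single_proxy=None):
    """Return a random entry from proxy_list, or single_proxy if the list is empty."""
    if proxy_list:
        return random.choice(proxy_list)
    return single_proxy
```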
BlucyBlue
2accdcafea
If the user selected --check_proxies option along with --proxy_list option, proxies loaded from the .csv file are checked using the check_proxies function from the load_proxies module. Proxies which pass the test are stored in the proxy_list global var.
6 years ago
BlucyBlue
6cc4e22898
If the user selected --proxy_list option, we attempt to read proxies from the csv, and store the list in global var proxy_list.
6 years ago
BlucyBlue
bd683022b3
Exception is raised if both a single proxy and the proxy_list are used. As needed, this can be changed to merging the single proxy with the proxy list, but seems a bit unnecessary at this time.
6 years ago
BlucyBlue
dc32d473e0
Exception will now be raised if either a single proxy or proxy_list options are used along with Tor.
6 years ago
BlucyBlue
c5e06b068e
Added two new arguments, '--proxy_list'/'-pl' and '--check_proxies'/'-cp', for users to activate options of reading proxies from a document (at this time, only .csv is supported), and check their anonymity before using them.
6 years ago
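A rough sketch of how these two arguments, and the conflict checks from commits bd683022b3 and dc32d473e0, could be wired up with `argparse`. The flag names `--proxy_list`/`-pl` and `--check_proxies`/`-cp` come from the commit message. Everything else is an assumption: the spelling of the `--tor` and `--proxy` flags, modelling `--check_proxies` as a boolean switch, the `validate` helper, and the use of `ValueError`.

```python
import argparse


def build_parser():
    """Build a parser with the proxy-related options only (illustrative subset)."""
    parser = argparse.ArgumentParser(description="proxy option sketch")
    # Assumed flag spellings for the pre-existing Tor and single-proxy options.
    parser.add_argument("--tor", action="store_true", help="route requests over Tor")
    parser.add_argument("--proxy", default=None, help="single proxy URL")
    # Flags named in the commit message.
    parser.add_argument("--proxy_list", "-pl", default=None,
                        help="path to a .csv file of proxies")
    parser.add_argument("--check_proxies", "-cp", action="store_true",
                        help="check proxy anonymity before use")
    return parser


def validate(args):
    """Reject the option combinations the commits describe as errors."""
    if args.tor and (args.proxy is not None or args.proxy_list is not None):
        raise ValueError("Tor cannot be combined with a proxy or a proxy list.")
    if args.proxy is not None and args.proxy_list is not None:
        raise ValueError("A single proxy cannot be combined with a proxy list.")


if __name__ == "__main__":
    parsed = build_parser().parse_args()
    validate(parsed)
    print(parsed)
```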
BlucyBlue
166d224423
First change to 'sherlock.py' for use of load_proxies module. Global variable proxy_list is created, and by default set to an empty list. This variable will store proxies from a proxy list (if this option is used), and will enable different threads to access proxies at the same time.
6 years ago
BlucyBlue
65a040dbbb
Function 'check_proxy_list' which checks anonymity of each proxy contained in a list of named tuples. Proxies are checked by using the 'check_proxy' function.
6 years ago
BlucyBlue
901074ea4e
Function 'check_proxy', which checks anonymity of a single proxy by analyzing return headers received from a request using the proxy in question.
6 years ago
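The commit does not say which headers 'check_proxy' inspects, so the following is only a guess at the general idea. It assumes the request goes to a service that echoes back the headers it received, and it treats a proxy as non-anonymous if common forwarding headers are present or the caller's real IP shows up anywhere. The function name `looks_anonymous` and the header list are hypothetical.

```python
# Headers that commonly disclose that a proxy is in use (assumed list).
REVEALING_HEADERS = ("X-Forwarded-For", "Via", "Forwarded", "X-Real-Ip")


def looks_anonymous(echoed_headers, real_ip):
    """Return True if the echoed headers neither mention a proxy nor leak real_ip."""
    lowered = {name.lower(): str(value) for name, value in echoed_headers.items()}
    if any(name.lower() in lowered for name in REVEALING_HEADERS):
        return False
    return all(real_ip not in value for value in lowered.values())
```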
BlucyBlue
a63bdb3152
Created new file 'load_proxies.py' to store functions for reading proxies from files, and checking proxy anonymity. Created the function 'load_proxies_from_csv' which reads proxies from a .csv file to a list of named tuples.
6 years ago
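Reading a .csv file into a list of named tuples, as this commit describes, might look like the sketch below. The function name `load_proxies_from_csv` is from the commit message; the column names (`ip`, `port`, `protocol`), the `Proxy` tuple, and the `read_proxies` helper are assumptions, since the commit does not state the file layout.

```python
import csv
import io
from collections import namedtuple

# Assumed columns; the real file layout is not given in the commit message.
Proxy = namedtuple("Proxy", ["ip", "port", "protocol"])


def read_proxies(stream):
    """Parse CSV text with a header row into a list of Proxy tuples."""
    reader = csv.DictReader(stream)
    return [Proxy(row["ip"], row["port"], row["protocol"]) for row in reader]


def load_proxies_from_csv(path):
    """Open the file at path and parse it with read_proxies."""
    with open(path, newline="") as csv_file:
        return read_proxies(csv_file)


if __name__ == "__main__":
    sample = io.StringIO("ip,port,protocol\n10.0.0.1,8080,http\n")
    print(read_proxies(sample))
```

Splitting the parsing from the file handling keeps the parser testable with an in-memory stream.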
Yahya SayadArbabi
263b8b3b90
Merge branch 'aditisrinivas97-master'
6 years ago
Yahya SayadArbabi
67108071e5
bump version
6 years ago
aditisrinivas97
619d9ab6bc
Fix issue with site name and url
6 years ago
Yahya SayadArbabi
011df7af55
bump version
6 years ago
Yahya SayadArbabi
fd63e1093f
Merge branch 'avinashshenoy97/patch-1'
6 years ago
Avinash Shenoy
3db3f4558b
Parallelized updating alexa ranking
6 years ago
Avinash Shenoy
1442f333c2
Parallelised updating Alexa.com ranking of sites
Script now fetches Alexa ranks for sites concurrently on separate threads. Cuts down the time to sync ranks from approximately **5 minutes** to about **18 seconds**.
6 years ago
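The speed-up reported here comes from overlapping many slow network lookups. A minimal sketch of that pattern follows; it uses `concurrent.futures.ThreadPoolExecutor` as a convenient substitute and does not claim to match how the script itself creates its threads. The names `fetch_all_ranks` and `fetch_rank`, and the shape of `site_urls`, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_ranks(site_urls, fetch_rank, max_workers=16):
    """Look up a rank for every site concurrently.

    site_urls maps site name -> URL; fetch_rank is any callable that
    takes a URL and returns its rank (for example, an HTTP lookup).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so ranks line up with the keys.
        ranks = list(pool.map(fetch_rank, site_urls.values()))
    return dict(zip(site_urls.keys(), ranks))
```

Because each lookup spends most of its time waiting on the network, threads help here even under Python's global interpreter lock.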
Yahya SayadArbabi
269df6d549
Merge pull request #151 from ptalmeida/master
Fix readme and instrallpackages.sh typo
6 years ago
ptalmeida
8ee50e6717
Fix typo
necessery -> necessary
6 years ago
ptalmeida
85d7be3e77
Actually bring README.md up to date
6 years ago
Yahya SayadArbabi
d6b7c0ac55
Merge branch 'ptalmeida-Add-sorgin-by-alexa-rank-functionality'
6 years ago
ptalmeida
8b681158bc
small corrections to rank sort
6 years ago
ptalmeida
78ade00dee
Update outdated README.md
6 years ago
ptalmeida
5d972a3138
add --rank -r option to sherlock
6 years ago
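Sorting sites by rank, as the `--rank`/`-r` option suggests, could be done along these lines. The data shape is an assumption: each site is taken to be a dict with an optional integer `rank` key, and sites without one are placed last. The helper name `sort_by_rank` is hypothetical.

```python
def sort_by_rank(site_data):
    """Return site names ordered by ascending rank; unranked sites go last."""
    return sorted(
        site_data,
        key=lambda name: site_data[name].get("rank", float("inf")),
    )
```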
ptalmeida
55d43b0ee6
Update requirements.txt
6 years ago
ptalmeida
db0cf7c289
Update requirements.txt
6 years ago
ptalmeida
826af1ec19
remove unused import
6 years ago
Yahya SayadArbabi
2408bb520e
Merge branch 'UltraWelfare/optional_output'
6 years ago
George Tsomlektsis
0e6b8d0dca
Added optional parameters for outputting files and folders.
6 years ago
George Tsomlektsis
f511faab23
Added the ability to load external json files.
6 years ago
ptalmeida
0b96141df0
Merge remote-tracking branch 'upstream/master'
6 years ago
ptalmeida
9c45146da1
remove unused import
6 years ago
ptalmeida
cc2b1cb27a
Improve terminal appearance for site_list.py
6 years ago
ptalmeida
40fc51fc32
add rank parameter to site_list.py
--rank or -r to update all page ranks
6 years ago