List of possible bots to check · Issue #6315 · matomo-org/device-detector · GitHub

List of possible bots to check #6315


Open
sgiehl opened this issue Jun 23, 2020 · 8 comments

@sgiehl
Member
sgiehl commented Jun 23, 2020

List of user agents from matomo-org/matomo-log-analytics#239 that are currently not detected as bots (or libraries):

  • Hatena-Favicon2 (http://www.hatena.ne.jp/faq/)
  • Mozilla/5.0 (X11; Linux x86_64; rv:20.0; Favicon; +https://github.com/ArthurHoaro/favicon) Gecko/20100101 Firefox/32.0
  • com.ddeville.llwebkit.favicon/158 CFNetwork/902.1 Darwin/17.7.0 (x86_64)
  • com.ddeville.llwebkit.favicon/158 CFNetwork/974.1 Darwin/18.0.0 (x86_64)
  • Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.170 Safari/537.36 Thumbshots/10.0.0.0
  • Thumbor/6.4.2
  • rethumb/v1 (http://rethumb.com)
  • FetchStream
  • Hatena::Fetcher/0.01 (master) Furl/3.13
  • Hatena::Fetcher/0.01 (master) Furl/3.06
  • safarifetcherd/604.1 CFNetwork/974.2.1 Darwin/18.0.0
  • safarifetcherd/604.1 CFNetwork/975.0.3 Darwin/18.2.0
  • safarifetcherd/604.1 CFNetwork/901.1 Darwin/17.6.0
  • safarifetcherd/604.1 CFNetwork/976 Darwin/18.2.0
  • safarifetcherd/604.1 CFNetwork/958.1 Darwin/18.0.0
  • HatenaBookmark/0.03 (compatible; entryimage-fetcher)
  • Mozilla/5.0 (compatible; ImageFetcher/7.0; +http://images.weserv.nl/)
  • Mozilla/5.0 (compatible; ImageFetcher/8.0; +http://images.weserv.nl/)
  • Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0 ; BacklinkHttpStatus)
  • Hatena Star UserAgent/2
  • Hatena-Favicon2 (http://www.hatena.ne.jp/faq/)
  • Hatena::Fetcher/0.01 (master) Furl/3.13
  • Hatena::Fetcher/0.01 (master) Furl/3.06
  • HatenaBookmark/0.03 (Hatena::Bookmark; master;) Furl/3.13
  • HatenaBookmark/4.0 (Hatena::Bookmark; Analyzer)
  • Hatena::Scissors/0.01
  • HatenaBookmark/4.0 (Hatena::Bookmark; Scissors)
  • HatenaBookmark/0.03 (compatible; entryimage-fetcher)
  • Elytra/0.10.0 (Macintosh; Ubuntu/14.06) GCDHTTPRequest
  • Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)
  • request.js
  • Zend_Http_Client
  • EventMachine HttpClient
  • HTTPClient/1.0 (2.6.0.1, ruby 2.0.0 (2015-12-16))
  • HTTPClient/1.0 (2.8.3, ruby 2.2.1 (2015-02-26))
  • Jakarta Commons-HttpClient/3.1
  • Mozilla/5.0 (compatible; Funnelback) RPT-HTTPClient/0.3-3E
  • AppEngine-Google; (+http://code.google.com/appengine; appid: s~readability-api-hrd)
  • AppEngine-Google; (+http://code.google.com/appengine; appid: e~finscience-1253)
  • AppEngine-Google; (+http://code.google.com/appengine; appid: s~cdn-dinoia)
  • GAE AppEngine-Google; (+http://code.google.com/appengine; appid: s~ga-mozilla-org-prod-001)
  • Mozilla/5.0 AppEngine-Google; (+http://code.google.com/appengine; appid: s~mendoapp1)
  • Mozilla/5.0 AppEngine-Google; (+http://code.google.com/appengine; appid: s~vodio-app)
  • Mozilla/5.0 AppEngine-Google; (+http://code.google.com/appengine; appid: e~finscience-1253)
  • Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) AppEngine-Google; (+http://code.google.com/appengine; appid: s~xiaohe18675)
  • Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) AppEngine-Google; (+http://code.google.com/appengine; appid: s~thgntk2)
  • Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) AppEngine-Google; (+http://code.google.com/appengine; appid: s~kindle-11)
  • Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) AppEngine-Google; (+http://code.google.com/appengine; appid: s~feedkin)
  • Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) AppEngine-Google; (+http://code.google.com/appengine; appid: s~ebook-rexdf)
  • Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US) AppEngine-Google; (+http://code.google.com/appengine; appid: s~virustotalcloud)
  • Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36 AppEngine-Google; (+http://code.google.com/appengine; appid: s~simple-rss-proxy)
  • Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729) AppEngine-Google; (+http://code.google.com/appengine; appid: s~theajaxpost)
  • Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36 AppEngine-Google; (+http://code.google.com/appengine; appid: s~twilinks123)
  • Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0) AppEngine-Google; (+http://code.google.com/appengine; appid: b~thatkindleear)
  • Evergreen (macOS; RSS Reader; https://ranchero.com/evergreen/)
  • Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit/602.3.12 (KHTML, like Gecko) NetNewsWire/3.3.2
  • Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/534.57.7 (KHTML, like Gecko) NetNewsWire/3.3.2
  • NetNewsWire (macOS; RSS Reader; https://ranchero.com/netnewswire/)
  • NetNewsWire/3.3.2 (Mac OS X; http://netnewswireapp.com/mac/; gzip-happy)
  • NetNewsWire/4.1.0 (Mac OS X; http://netnewswireapp.com/mac/; gzip-happy)
  • Evergreen (macOS; RSS Reader; https://ranchero.com/evergreen/)
  • Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36 AppEngine-Google; ( http://code.google.com/appengine; appid: s~simple-rss-proxy)
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) QuiteRss/0.18.12 Safari/538.1
  • NetNewsWire (macOS; RSS Reader; https://ranchero.com/netnewswire/)
  • rss/10 CFNetwork/976 Darwin/18.2.0
  • rss/5 CFNetwork/897.15 Darwin/17.5.0
  • rss/5 CFNetwork/902.2 Darwin/17.7.0
  • rss/5 CFNetwork/974.2.1 Darwin/17.7.0
  • rss/8 CFNetwork/974.2.1 Darwin/18.0.0
  • rss/8 CFNetwork/975.0.3 Darwin/17.7.0
  • ruby:net.bhaak.rss2html:1.0
  • Visor News Reader RSS/1.0 (admin@visorco.com)
  • Winds: Open Source RSS & Podcast app: https://getstream.io/winds/
  • RSSOwl/2.2.1.201312301314 (Windows; U; en)
  • Ruby
  • Ruby, Twurly v1.1 (https://twurly.org)
  • ruby:net.bhaak.rss2html:1.0
  • HTTPClient/1.0 (2.6.0.1, ruby 2.0.0 (2015-12-16))
  • HTTPClient/1.0 (2.8.3, ruby 2.2.1 (2015-02-26))
  • Mozilla/5.0 (Mixnode) AppleWebKit/537.36 (KHTML, like Gecko)
  • node-superagent/3.8.2
  • node-superagent/2.3.0
  • node-superagent/0.18.2
  • node.io
  • node.js
  • Node.js (linux; U; rv:v6.9.1) AppleWebKit/537.36 (KHTML, like Gecko)
  • http.rb/4.0.0
  • http.rb/3.3.0
  • http.rb/3.0.0
  • Mozilla/5.0 (compatible; adscanner/)
  • Mozilla/5.0 (compatible; DNS SSL/TLS HTTP HTML Website Security Scanner/0.2 beta; +https://www.htmlyse.com/)
  • httpscheck (unknown version) CFNetwork/897.15 Darwin/17.5.0 (x86_64)
  • httpscheck (unknown version) CFNetwork/893.13.1 Darwin/17.4.0 (x86_64)
  • httpscheck (unknown version) CFNetwork/976 Darwin/18.2.0 (x86_64)
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36 (compatible; linkCheckV3.0)
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) wkhtmltoimage Safari/538.1
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) wkhtmltopdf Safari/534.34
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) wkhtmltoimage Safari/534.34
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) wkhtmltoimage Version/7.0 Safari/538.1
  • Mozilla/5.0 (X11; BSD Four) AppleWebKit/534.34 (KHTML, like Gecko) wkhtmltoimage Safari/534.34
  • Houdini%20Sweeper/3000 CFNetwork/893.13.1 Darwin/17.4.0 (x86_64)
  • Houdini%20Sweeper/3000 CFNetwork/897.15 Darwin/17.5.0 (x86_64)
  • GoScraper
  • MetadataScraper
  • Embed PHP Library
  • Embed PHP library
  • SnowHaze Search/1.0 support@snowhaze.com
  • WordupInfoSearch/1.0
  • WordupinfoSearch/1.0
  • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.134 Safari/537.36 http://notifyninja.com/monitoring
  • Iframely/0.8.5 (+http://iframely.com/;)
  • Iframely/1.2.2 (+http://iframely.com/;)
  • Iframely/1.2.5 (+http://iframely.com/;)
  • Iframely/1.2.7 (+http://iframely.com/;)
  • Iframely/1.0.4 (+http://iframely.com/;)
  • Iframely/1.2.7 (+https://iframely.com/;)
  • G-i-g-a-b-o-t
  • B-l-i-t-z-B-O-T/6.3 (Ubuntu 7.0; zh_TW;)
  • B-l-i-t-z-B-O-T/4.3 (AmigaOS 2.3; fr_LU;)
  • B-l-i-t-z-B-O-T/1.9 (Windows Vista 5.7; fr_CA;)
  • inbound.li parser
  • Mozilla/5.0 (compatible; WebDataStats/1.0 ; https://webdatastats.com/policy.html)
  • NetTrack Anonymous Web Statistics https://nettrack.info/support.php
  • CakePHP
  • Bots that Matomo catches in the server (but not the log analyzer):
      • Mozilla/5.0 (compatible; GoogleDocs; documents; +http://docs.google.com)
      • Mozilla/5.0 (compatible; GoogleDocs; +http://docs.google.com; +Google-Document-Conversion)
      • Mozilla/5.0 (compatible; GoogleDocs; apps-presentations; +http://docs.google.com)

Note: I have not yet checked which of these user agents are actually bots and which might be valid mobile apps or something else.
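One way to start the triage is to group the raw strings by a rough "family" name, so that version variants of the same client (for example the several Iframely or node-superagent entries) are reviewed once. The following is a minimal sketch, not how device-detector itself works: the `family` helper, the `GENERIC` token set, and the regular expressions are assumptions made for illustration, and the sample is only a handful of entries copied from the list above.

```python
import re
from collections import Counter

# A handful of user agents copied from the list in this issue.
USER_AGENTS = [
    "Hatena-Favicon2 (http://www.hatena.ne.jp/faq/)",
    "Thumbor/6.4.2",
    "Iframely/0.8.5 (+http://iframely.com/;)",
    "Iframely/1.2.7 (+https://iframely.com/;)",
    "node-superagent/3.8.2",
    "node-superagent/2.3.0",
    "http.rb/4.0.0",
    "Mozilla/5.0 (compatible; ImageFetcher/8.0; +http://images.weserv.nl/)",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) "
    "wkhtmltopdf Safari/534.34",
]

# Product tokens that say nothing about the client itself (assumed, not exhaustive).
GENERIC = {"mozilla", "applewebkit", "safari", "gecko", "chrome", "firefox", "version"}

# A product token such as "Thumbor/6.4.2" or a bare name such as "wkhtmltopdf".
PRODUCT = re.compile(r"([A-Za-z][\w.\-]*)(?:/[\w.\-]+)?")


def family(user_agent: str) -> str:
    """Return a rough family name for grouping; purely heuristic."""
    # Look first at product tokens outside parenthesised comments.
    outside = re.sub(r"\([^)]*\)", " ", user_agent)
    for match in PRODUCT.finditer(outside):
        if match.group(1).lower() not in GENERIC:
            return match.group(1)
    # Fall back to the "(compatible; Name/1.0; ...)" convention.
    match = re.search(r"\(compatible;\s*([A-Za-z][\w.\-]*)", user_agent)
    return match.group(1) if match else "unknown"


families = Counter(family(ua) for ua in USER_AGENTS)
for name, count in families.most_common():
    print(f"{count:2d}  {name}")
```

Each resulting family still has to be checked by hand, since the grouping says nothing about whether a client is a bot, a library, or a regular app.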

@sanchezzzhak
Collaborator
sanchezzzhak commented Jun 23, 2020
  • facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php) (service bot)
  • Mozilla/5.0 (compatible; Plukkie/1.7; http://www.botje.com/plukkie.htm) (search bot)
  • Grammarly/1.0 (http://www.grammarly.com) (service bot)

@arcanedev-maroc

There are some bots missing compared to JayBizzle/Crawler-Detect

@sanchezzzhak
Collaborator
sanchezzzhak commented Sep 26, 2020

The Mighty Website Monitoring Service

sanchezzzhak added a commit to sanchezzzhak/device-detector that referenced this issue Nov 12, 2020
@sanchezzzhak
Collaborator
sanchezzzhak commented Nov 23, 2020

sanchezzzhak added a commit to sanchezzzhak/device-detector that referenced this issue Jan 8, 2021
sanchezzzhak added a commit to sanchezzzhak/device-detector that referenced this issue Jan 8, 2021
sanchezzzhak added a commit to sanchezzzhak/device-detector that referenced this issue Jan 8, 2021
sanchezzzhak added a commit to sanchezzzhak/device-detector that referenced this issue Jan 8, 2021
sanchezzzhak added a commit to sanchezzzhak/device-detector that referenced this issue Jan 8, 2021
sanchezzzhak added a commit to sanchezzzhak/device-detector that referenced this issue Feb 22, 2021
detect(bot) detect bots: Sputnik Favicon Bot, Sputnik Image Bot, Hatena Favicon

issue matomo-org#6315
@chris-y
chris-y commented May 5, 2022

Jakarta Commons-HttpClient is in that list, but I have one with the User-Agent "Jakarta Commons-HttpClient/3.0.1", which isn't being recognised on devicedetector.net.

@sanchezzzhak
Collaborator

devicedetector.net has not been updated to the latest version yet.
We will fix this as soon as there is time.

Try an alternative demo URL:
https://devicedetector.lw1.at/Jakarta%20Commons-HttpClient%2F3.0.1
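The demo URL above appears to carry the user agent as a single percent-encoded path segment. A minimal sketch for building such a URL for any other string follows; the `demo_url` helper is hypothetical, and the URL scheme is inferred only from the one example in this thread, not from any documentation of that site.

```python
from urllib.parse import quote

# Alternative demo host mentioned in this thread.
DEMO_BASE = "https://devicedetector.lw1.at/"


def demo_url(user_agent: str) -> str:
    """Build a demo URL by percent-encoding the whole user agent."""
    # safe="" makes quote() escape "/" too, so the UA stays one path segment.
    return DEMO_BASE + quote(user_agent, safe="")


print(demo_url("Jakarta Commons-HttpClient/3.0.1"))
# prints https://devicedetector.lw1.at/Jakarta%20Commons-HttpClient%2F3.0.1
```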

@andre-kornetzky

Iframely/1.3.1 (+https://iframely.com/docs/about) Atlassian

@sanchezzzhak
Collaborator

@andre-kornetzky, this bot was added on 05/22/2022.
Please update your version.

user_agent: Iframely/1.3.1 (+https://iframely.com/docs/about) Atlassian
bot:
  name: Iframely
  category: Crawler
  url: https://iframely.com/
  producer:
    name: Itteco Software, Corp.
    url: https://iframely.com/
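To illustrate how a result like the one above can come out of a regex-based lookup, here is a minimal sketch. It is not device-detector's implementation: the `BOTS` table layout, the `detect_bot` helper, and the `Iframely/[\d.]+` pattern are assumptions for illustration; only the metadata values are taken from the output above.

```python
import re
from typing import Optional

# One hypothetical entry; the metadata mirrors the result shown above,
# the regex is an assumed pattern and not the library's actual one.
BOTS = [
    {
        "regex": r"Iframely/[\d.]+",
        "name": "Iframely",
        "category": "Crawler",
        "url": "https://iframely.com/",
        "producer": {
            "name": "Itteco Software, Corp.",
            "url": "https://iframely.com/",
        },
    },
]


def detect_bot(user_agent: str) -> Optional[dict]:
    """Return the metadata of the first matching entry, or None."""
    for entry in BOTS:
        if re.search(entry["regex"], user_agent, re.IGNORECASE):
            # Hide the internal pattern from the returned result.
            return {key: value for key, value in entry.items() if key != "regex"}
    return None


result = detect_bot("Iframely/1.3.1 (+https://iframely.com/docs/about) Atlassian")
print(result)
```

Because the pattern does not pin a specific version, older strings from this issue such as `Iframely/0.8.5` match the same entry.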

liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 7, 2024
liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 7, 2024
liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 7, 2024
liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 7, 2024
liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 7, 2024
liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 7, 2024
sanchezzzhak pushed a commit that referenced this issue Feb 8, 2024
…r Android, iPadOS, iOS, macOS operating systems, adds detection for various bots, apps, libraries and browsers (#7574)

* Adds detection for Anthropic AI
* Improves detection for BingBot
* Adds detection for Arc browser
* Improves detection for Klarna
* Improves detection for Lilo
* Improves version detection for iOS and macOS
* Improves detection for MetaMask
* Adds detection for Babashka HTTP Client
* Adds detection for BrightSign operating system
* Adds detection for BrightSign LS445
* Improves version detection for Android
* Adds detection for Airfind Secure Browser
* Improves detection for TikTok
* Adds detection for SecureX browser
* Improves generic bots regex
* Adds detection for DoCoMo browser
* Adds missing category for LTX71
* Improves version detection for Opera
* Improves detection for Yandex Browser
* Improves detection for Samsung Browser
* Adds detection for Lilo
* Adds detection for various F-Secure apps
* Improves detection for Daum
* Improves detection for Via browser
* Adds detection for EZVPN
* Adds detection for NoCard VPN
* Improves detection for F-Secure SAFE
* Adds detection for Nuviu browser
* Improves version detection for iPadOS
* Adds detection for Netpeak Checker
* Adds detection for Sandoba//Crawler
* Adds detection for Sirdata
* Adds detection for CheckMark Network
* Adds detection for http.rb 
ref #6315

* Adds detection for FacebookBot
* Adds detection for Cohere AI
* Adds detection for PerplexityBot
* Adds detection for superagent
ref #6315

* Adds detection for The Trade Desk Content
* Improves regex for generic bots
* Adds detection for Montastic Monitor
ref #6315

* Adds detection for CakePHP
* Adds detection for request
ref #6315

* Adds detection for Twurly
ref #6315

* Adds detection for Mixnode
ref #6315

* Adds detection for fGet browser
* Adds detection for CSSCheck
* Adds detection for +Simple
ref #7039

* Adds detection for Thor browser
ref #7039

* Adds detection for Incognito Browser
ref #7039

* Adds detection for Godzilla Browser
ref #7039

* Adds detection for Ocean Browser
* Adds detection for Qmamu
ref #7039

* Adds detection for BF Browser
* Adds detection for BroKeep Browser
* Improves detection for CM Security
* Adds detection for Microsoft Math Solver
* Improves detection for Microsoft Bing Search
* Adds detection for Bitwarden
* Adds detection for MX Player
* Adds detection for HistoryHound
* Adds detection for NoCard VPN Lite
* Rename BrightSign to BrightSignOS
* Change Lilo to browser and fix link
* Rename BF Browser to BXE Browser
* Improves BF Browser
* Improves Mixnode bot regex
* Improves Googlebot regex
liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 12, 2024
liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 12, 2024
Adds detection for WebCEO
liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 12, 2024
liviuconcioiu added a commit to liviuconcioiu/device-detector that referenced this issue Feb 12, 2024
sanchezzzhak pushed a commit that referenced this issue Feb 13, 2024
* Adds new test for Seekport
* Adds detection for Botify
* Adds detection for Snapchat Ads
* Adds detection for Adscanner
* Adds detection for WebCEO
* Adds detection for NetTrack
* Adds detection for htmlyse
ref #6315

* Improve generic regex
* Adds detection for Trendsmap
* Improve generic regex
* Adds detection for Steve Bot
* Improve generic regex
* Sort generic regex
* Improve generic regex
* Adds detection for KeyCDN Tools
* Adds detection for Google Transparency Report
* Update urls for some bots
* Adds detection for Arquivo.pt
* Adds detection for IsItWP