legitbot

1.11.6 (last stable release 2 weeks ago)
Complexity Score
Low
Open Issues
3
Dependent Projects
0
Weekly Downloads (global)
18

License

  • Apache-2.0
    • Attribution: Yes
    • Linking: Permissive
    • Distribution: Permissive
    • Modification: Permissive
    • Patent grant: Yes
    • Private use: Yes
    • Sublicensing: Permissive
    • Trademark grant: No

Readme

Legitbot

Ruby gem to make sure that an IP really belongs to a bot, typically a search engine.

Usage

Suppose you have a Web request and you would like to check that it is not disguised:

bot = Legitbot.bot(userAgent, ip)

bot will be nil if no bot signature was found in the User-Agent. Otherwise, it will be an object with the following methods:

bot.detected_as # => :google
bot.valid? # => true
bot.fake? # => false
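
A minimal way to wire this into request handling is to treat a request as legitimate when it either matches no bot signature at all or matches one whose IP checks out. The helper below is only an illustrative sketch; the method name is not part of the gem:

require 'legitbot'

def legitimate_request?(user_agent, ip)
  bot = Legitbot.bot(user_agent, ip)
  # nil means no bot signature matched the User-Agent;
  # for a matched bot, require the IP to be confirmed as genuine.
  bot.nil? || bot.valid?
end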

Sometimes you already know which search engine to expect. For example, you might be using rack-attack:

Rack::Attack.blocklist("fake Googlebot") do |req|
  req.user_agent =~ %r(Googlebot) && Legitbot::Google.fake?(req.ip)
end

Or if you do not like all those ghoulish crawlers stealing your content, evaluating it and getting ready to invade your site with spammers, then block them all:

Rack::Attack.blocklist 'fake search engines' do |request|
  Legitbot.bot(request.user_agent, request.ip)&.fake?
end

Versioning

Semantic versioning with the following clarifications:

  • MINOR version is incremented when support for new bots is added.
  • PATCH version is incremented when validation logic for a bot changes (an IP list update, for example); see the Gemfile sketch below.
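
Given this scheme, a pessimistic version constraint in the Gemfile picks up both new bots and refreshed IP lists automatically; the exact constraint below is just one reasonable choice, not a recommendation from the gem itself:

# Gemfile entry: '~> 1.11' accepts new MINOR releases (new bots) and PATCH
# releases (updated IP lists) while staying below a potentially breaking 2.0.
gem 'legitbot', '~> 1.11'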

Supported

  • Ahrefs
  • Amazonbot
  • Amazon AdBot
  • Applebot
  • Baidu spider
  • Bingbot
  • BLEXBot (WebMeUp)
  • DataForSEO
  • DuckDuckGo bot
  • Google crawlers
  • IAS
  • OpenAI GPTBot
  • Oracle Data Cloud Crawler
  • Marginalia
  • Meta / Facebook Web crawlers
  • Petal search engine
  • Pinterest
  • Twitterbot (the list of IPs is on the Troubleshooting page)
  • Yandex robots
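
Each bot above can also be checked individually, following the Legitbot::Google pattern from the rack-attack example; the module names below are assumptions based on that pattern, so confirm them against the gem's source before relying on them:

# Assumed module names, mirroring the Legitbot::Google.fake? call shown earlier.
Legitbot::Bing.fake?(request.ip)    # claimed Bingbot whose IP does not check out
Legitbot::Yandex.fake?(request.ip)  # same check for Yandex robots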

License

Apache 2.0

Other projects

  • Play Framework variant in Scala: play-legitbot
  • Article When (Fake) Googlebots Attack Your Rails App
  • Voight-Kampff is a Ruby gem that detects bots by User-Agent
  • crawler_detect is a Ruby gem and Rack middleware to detect crawlers by a few different request headers, including User-Agent
  • Project Honeypot’s http:BL can not only classify an IP as a search engine, but also label it as suspicious and report the number of days since its last activity. My implementation of the protocol in Scala is here.
  • CIDRAM is a PHP routing manager with built-in support for validating bots.

Dependencies

No runtime dependency information found for this package.

CVE Issues (Active)
0
Scorecards Score
No Data
Test Coverage
99.00%
Follows Semver
Yes
Github Stars
26
Dependencies (total)
2
Dependencies (outdated)
0
Dependencies (deprecated)
0
Threat Modelling
No
Repo Audits
No
