GitHub - andreaskoch/ga-spam-control at v0.5.0

Google Analytics Spam Control

Command-line utility for automating the fight against Google Analytics referral spam

Google Analytics referrer spam is a pain. There are hundreds of known referrer spam domains, and every other day a new one pops up. The only way to keep the spammers from skewing your web analytics reports is to block these spam domain names one by one.

ga-spam-control is a small command-line utility that keeps your Google Analytics spam filters up-to-date, automatically.

How does ga-spam-control work?

ga-spam-control creates filters for your Google Analytics accounts that block known referrer spam domains from your analytics reports, and it keeps these filters up-to-date.

To protect your analytics reports from annoying false entries at all times, ga-spam-control combines multiple community-maintained lists of known spam domains.

This lets you automate your spam protection completely: let ga-spam-control check your Google Analytics accounts daily for new spam and update your filters whenever it detects any.
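As a rough sketch of such a daily job, the following shell function chains the update-spam-domains and update-filters actions listed under Available Commands. The function name refresh_spam_filters, the GA_SPAM_CONTROL_BIN variable, the script path and the cron schedule are illustrative assumptions and not part of ga-spam-control. It also assumes that update-spam-domains takes no arguments and that the OAuth authorization has already been completed interactively.

```shell
#!/bin/sh
# Sketch of a daily refresh job. Assumes ga-spam-control is on the PATH and
# that the OAuth authorization has already been completed interactively.

# Hypothetical override so the binary location can be changed (or stubbed).
GA_SPAM_CONTROL_BIN="${GA_SPAM_CONTROL_BIN:-ga-spam-control}"

# refresh_spam_filters <accountID>
# Refreshes the spam-domain lists, then creates or updates the filters.
refresh_spam_filters() {
    account_id="$1"
    if [ -z "$account_id" ]; then
        echo "usage: refresh_spam_filters <accountID>" >&2
        return 2
    fi
    "$GA_SPAM_CONTROL_BIN" update-spam-domains || return 1
    "$GA_SPAM_CONTROL_BIN" update-filters "$account_id"
}

# Example crontab entry (assumption: this file is saved as
# /usr/local/bin/refresh-spam-filters.sh and ends with a call such as
# `refresh_spam_filters <accountID>`):
#   0 3 * * * /usr/local/bin/refresh-spam-filters.sh
```

Stopping when update-spam-domains fails means the filters are never rebuilt from a partially refreshed list. Whether that is the right trade-off depends on your setup.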

Available Commands

The command line utility provides the following actions.

Spam Control Filter Actions

To protect your Google Analytics account from spam, ga-spam-control creates filters that block known referrer spam domains from your analytics reports. These commands help you review and update your spam filters:

  1. show-status: Display the spam-control status of all your accounts or of a specific account
  2. update-filters: Create or update the spam-control filters for a specific account
  3. remove-filters: Remove all spam-control filters from an account

Referrer Spam Domains Actions

The basis for the spam filters is an up-to-date list of known referrer spam domains. And with these commands you can review and update the spam-domain lists:

  1. list-spam-domains: Print a list of all currently known referrer spam domains
  2. update-spam-domains: Update the list of referrer spam domain names
  3. find-spam-domains: Manually review the last n days of analytics data and mark domain names as spam

The domains that are currently considered spam are stored in ~/.ga-spam-control/spam-domains/community.txt and ~/.ga-spam-control/spam-domains/personal.txt.
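To inspect the combined set of domains without the tool, the two files can be merged with standard utilities. This sketch assumes both files are plain text with one domain name per line, which this README does not state. The function name list_all_spam_domains is hypothetical. The built-in list-spam-domains action remains the supported way to print the list.

```shell
# list_all_spam_domains [directory]
# Prints the de-duplicated union of community.txt and personal.txt.
# Assumption: one domain name per line; blank lines are ignored.
list_all_spam_domains() {
    dir="${1:-$HOME/.ga-spam-control/spam-domains}"
    for file in "$dir/community.txt" "$dir/personal.txt"; do
        # Skip a list that does not exist yet.
        if [ -f "$file" ]; then
            grep -v '^[[:space:]]*$' "$file"
        fi
    done | sort -u
}
```

Calling list_all_spam_domains without an argument reads from the default directory named above. Passing another directory is handy for trying the function on sample files.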

Usage

ga-spam-control <command> [<args> ...]

Print help information

ga-spam-control --help

Display spam-control status

Display the current spam-control status for all accounts that you have access to:

ga-spam-control show-status

Display the spam-control status in a parseable format:

ga-spam-control show-status --quiet

Display the current spam-control status for a specific Google Analytics account:

ga-spam-control show-status <accountID>

Install or update spam-control filters

Create or update the spam-control filters for a specific Google Analytics account:

ga-spam-control update-filters <accountID>

Uninstall spam-control filters

Remove the spam-control filters from a specific Google Analytics account:

ga-spam-control remove-filters <accountID>

Find new referrer spam in your accounts

The find-spam-domains action displays referrer domain names from the last n days of analytics data for you to review.

ga-spam-control find-spam-domains <accountID> <numberOfDaysToLookBack>

By default ga-spam-control uses the last 90 days of analytics data. If you want to review fewer or more days, you can specify the number of days yourself.

Authentication

The first time you perform an action, you will be shown an OAuth authorization dialog. If you grant the requested permissions, the authentication token will be stored in your home directory (~/.ga-spam-control/credentials.json).

To sign out you can either delete the file or de-authorize the "Google Analytics Spam Control" app in your Google App Permissions at https://security.google.com/settings/security/permissions.

Installation

The command-line package is github.com/andreaskoch/ga-spam-control/cli. You can clone the repository or install it with go get github.com/andreaskoch/ga-spam-control and then run the make.go script:

go run make.go -test
go run make.go -install
go run make.go -crosscompile

Or with make:

make test
make install
make crosscompile

Licensing

ga-spam-control is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

Roadmap

Ideally Google would just build spam protection into Google Analytics, but until then here are some ideas for additional features and possible improvements:

  • Make remote spam domain providers configurable
  • Populate my own list of known referrer spam domains with the results from the find-spam-domains action.
    • Automatic daily upload from the ga-spam-control clients
    • Review of the additions by trusted community members or by a tool which checks the listed website
  • Create a "No Referrer Spam" segment and update it during the normal update process. Unfortunately, this requires Google to add create and update support for segments to the Google Analytics API (see: analytics-issues - Issue 174: Create Advanced Segment and Customized Report Through API).
  • Until Google supports segment creation via the API, ga-spam-control can at least print the necessary segment content to support manual editing of spam segments.
  • Use machine learning to automatically identify new referrer spam. Earlier versions of ga-spam-control already used a machine learning model. But unfortunately I could only train the model to detect new referrer spam for a single website - the model did not work well enough when I applied it to websites with different usage patterns.
  • Other options for detecting referrer spam automatically
    • Correlate analytics data with web server logs to identify referrer spam
    • Do a word analysis of the referrer site and use techniques known from regular e-mail spam detection to identify spam sites
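The log-correlation idea from the list above could be sketched as follows. This illustrates the roadmap idea and is not an existing ga-spam-control feature. It assumes that spam hits which never reach your web server leave no trace in its access log, that the log uses the "combined" format, and that you have a plain-text file with one referrer hostname per line taken from your analytics data. All function and file names are hypothetical.

```shell
# Sketch only: all function and file names below are hypothetical.

# log_referrer_hosts <access_log>
# Prints the unique referrer hostnames found in an access log.
# Assumption: the log uses the "combined" format, where the referrer is the
# second double-quoted field on each line.
log_referrer_hosts() {
    awk -F'"' '{ print $4 }' "$1" |
        sed -e 's#^[A-Za-z]*://##' -e 's#[/:?].*$##' |
        grep -v -e '^$' -e '^-$' |
        sort -u
}

# analytics_only_referrers <analytics_hosts_file> <access_log>
# Prints hostnames that appear as referrers in the analytics data but never
# in the server log. These are candidates for manual review, not a verdict.
analytics_only_referrers() {
    seen="$(mktemp)"
    log_referrer_hosts "$2" > "$seen"
    # -x: whole-line match, -F: fixed strings, -v: keep the non-matching lines
    sort -u "$1" | grep -vxFf "$seen"
    rm -f "$seen"
}
```

A hostname reported by analytics_only_referrers may still be legitimate, for example when the log covers a shorter period than the analytics data, so the output is best treated as input for a manual review such as the one find-spam-domains offers.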

Let me know if you have other ideas, or if you want one of the features implemented next.

Related Resources

Referrer Spam

Lists of Referrer Spam Domains

There are multiple curated lists of referrer spam domains out there that you can use to create filters for your analytics accounts.

Other Spam Blocker Tools

ga-spam-control is not the first and not the only tool that helps you to block referrer spam from your Google Analytics accounts.

Google Analytics: Segments

Filters prevent referrer spam from getting into your Google Analytics accounts, but they don't help with referrer spam that has already reached your reports. To filter out this spam you can use segments that exclude the spammy traffic.

Google Analytics: Bot and Spider Filtering

Google Analytics has a setting to block bots and spiders from your Google Analytics reports.

  1. Go to Google Analytics > Admin > Account > Property > View > View Settings
  2. Go to Bot Filtering
  3. Check Exclude all hits from known bots and spiders

This feature is not advertised much by Google. The only time it was officially mentioned is in a Google Plus post: Google Analytics - Introducing Bot and Spider Filtering.

I am not yet sure if this flag does the trick. One would assume that it would be easy for Google to exclude all referrer spam and block the stupid spammers once and for all.

Google Analytics: API
