Scraping Google AdSense

By Tim Kofol

Published: November 12th, 2007

I am obsessed with Google AdSense. I waste a ton of time looking at my daily page impressions, click-through rate and earnings. Well, I found a way to scrape that data to my desktop.

Thanks to schadenfreude for the basics of this idea.

require 'rubygems'
require 'mechanize'

# Mechanize 1.0 and later dropped the WWW namespace; on older releases use WWW::Mechanize.new
agent = Mechanize.new

# AdSense's login form is served inside an iframe, so request the iframe URL directly instead of http://www.google.com/adsense
page = agent.get 'https://www.google.com/accounts/ServiceLoginBox?service=adsense&ltmpl=login&ifr=true&rm=hide&fpui=3&nui=15&alwf=true&passive=true&continue=https%3A%2F%2Fwww.google.com%2Fadsense%2Flogin-box-gaiaauth&followup=https%3A%2F%2Fwww.google.com%2Fadsense%2Flogin-box-gaiaauth&hl=en_US'

# The login form isn't named, but it is the first and only form on the page, so grab it.
form = page.forms.first

# Fill out the form with your credentials and submit
form.Email = 'username'
form.Passwd = 'password'

page = agent.submit form

# This is the tricky part: the login bounces us through some JavaScript redirects, which Mechanize does not execute.
# Simply request the redirect target directly. The URL is not dynamic (no session IDs, etc.), so hard-coding it should be safe.
page = agent.get 'https://www.google.com/accounts/CheckCookie?continue=https%3A%2F%2Fwww.google.com%2Fadsense%2Flogin-box-gaiaauth&followup=https%3A%2F%2Fwww.google.com%2Fadsense%2Flogin-box-gaiaauth&service=adsense&hl=en_US&chtml=LoginDoneHtml'

# We are now logged in and can go to any AdSense page we wish
page = agent.get 'https://www.google.com/adsense/report/overview'

Now you just have to scrape the page using Hpricot to get the metrics you are interested in.

Combine that with GeekTool and you have real-time AdSense updates straight on your desktop.
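The Hpricot selectors depend on the report page's markup, which I have not shown here, so this rough sketch only covers the step after scraping: turning the cell text into numbers and printing one line that GeekTool can display. The helper names (parse_metric, summary_line) and the cell formats ("1,234", "2.5%", "$3.40") are assumptions for illustration, not anything AdSense guarantees.

```ruby
# Hypothetical helper: convert scraped cell text such as "1,234", "2.5%"
# or "$3.40" into a number. Returns nil when the text is not numeric.
def parse_metric(text)
  cleaned = text.strip.delete('$,%')
  cleaned.include?('.') ? Float(cleaned) : Integer(cleaned, 10)
rescue ArgumentError
  nil
end

# Hypothetical helper: build a single line of output from a hash of
# label => scraped cell text. GeekTool shows whatever the script prints.
def summary_line(cells)
  cells.map { |label, text| "#{label}: #{parse_metric(text)}" }.join(' | ')
end

# Example cell text (made up, standing in for what Hpricot would return)
puts summary_line(
  'Page impressions' => '1,234',
  'CTR'              => '2.5%',
  'Earnings'         => '$3.40'
)
```

Point a GeekTool shell entry at the script and set the refresh interval to whatever you are comfortable with.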

Note: Because of the way Google handles AdSense data, you often get inconsistent results. One moment you may see 100 page views, and on the next login you may see 50. Trust the higher number; I assume the AdSense data is simply slow to replicate across the database servers.
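One way to act on "trust the higher number" is to remember the best reading seen so far today and never let a later, lower reading overwrite it. This is only a sketch: the helper names (merge_highest, record_reading), the JSON cache file and the metric keys are all made up for illustration.

```ruby
require 'json'
require 'date'

# Hypothetical helper: for each metric keep the larger of the old and new value.
def merge_highest(previous, latest)
  previous.merge(latest) { |_key, old, new| [old, new].max }
end

# Hypothetical helper: store the day's highest readings in a small JSON file.
# `latest` must use string keys, because that is what JSON.parse returns.
def record_reading(path, latest, today = Date.today.to_s)
  stored = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  # Start over when the stored readings belong to an earlier day.
  previous = stored['date'] == today ? stored['metrics'] : {}
  best = merge_highest(previous, latest)
  File.write(path, JSON.generate('date' => today, 'metrics' => best))
  best
end
```

Call record_reading with each fresh scrape and display its return value instead of the raw numbers, so a lagging reading of 50 cannot replace an earlier 100.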

RubyInline Link Errors on Leopard

By Tim Kofol

Published: November 4th, 2007

If you have upgraded to OS X Leopard and tried to use RubyInline, you may have been greeted with lots of compilation errors.

Well, after a couple of hours of googling, I found the solution here.

It turns out that Ruby on Leopard is built without the flag that tells the linker to ignore missing symbols, so the compiled code has to be linked against libruby explicitly.

The temporary hack to fix it is to go into the RubyInline gem directory, open lib/inline.rb, and change the line that looks like this


flags = @flags.join(' ')

to this


flags = @flags.join(' ') + ' -lruby'

Now everything should work. Hopefully the next RubyInline version will fix this problem, so I don’t have to hack it again.