Using tGoogleAnalyticsInput in JasperETL/Talend

This is a short article on getting Google Analytics into JaspersoftETL/Talend ETL tools. Google Analytics has a nice interface and many great ways of analyzing web traffic, it'd be hard to beat the work Google did there. The purpose of this connector would be to bring the data into your own data warehouse and join it with other intelligence you might have. A great way to display the aggregated data would be with a BI tool like Jaspersoft.

I'm writing because the help for the tGoogleAnalyticsInput is in French and incomplete.

So let's get started. You'll need JasperETL 4.1.2 or older (commercial editions here, open source here).

Installing the plugin:
Create a new Job Design and drag the tGoogleAnalyticsInput component from the Palette (under Internet):


















Configure the component. First start with finding some fields you'd like to use from the Google API: http://code.google.com/apis/analytics/docs/gdata/dimsmets/dimsmets.html. Note that you need to have some metrics (can't be all dimensions, vice versa).

Here's an example schema (you'll have to put in the fields yourself):
Now configure the component. In order to get your Profile ID (note it doesn't start with UA) read the cryptic Google Documentation about it. I got mine by examining the URL while logged into GA and looking at the &id= parameter.

The "value" column comes directly from the Google API and will also have to be filled in. I've also found that I HAVE to filter on period. You can do 1980-2020 if you want to include everything. 

Now you can run the the job and see your visitors:

Funny enough. If you're reading this right now then you will eventually flow through the GA database and the very job I'm displaying above ;)

Comments

Unknown said…
Ernesto - did you implement this? Do you have a working example?
R. A. Burrell said…
The Start and End Date are helpful. But, how do you set it up to pull yesterday?
Ernesto Ongaro said…
RA, I read up on TalendDate.getDate and TalendDate.addDate - with these two I was able to calculate today's date and subtract 1 from today (the -1 param to addDate)

So to get yesterday and today:

End Date (today):
TalendDate.getDate("yyyy-MM-dd")

Start Date(tomorrow):
TalendDate.addDate(TalendDate.getDate("yyyy-MM-dd"),"yyyy-MM-dd",-1,"dd")
* -1 is the field that I added
Anonymous said…
Hi,
I tried giving all the required parms
and the dates are giving error, MM can not be resolved to a variable.Any clue? The date I gave is "2012-04-11" and "2013-07-10".What is wrong here?
Anitha.
Ernesto Ongaro said…
Anitha, did you include the dates in quotes? That's all I can really think could be the problem

Popular posts from this blog

Dealing with Nested Documents in MongoDB and Talend (aka Baking a Cake)

Add sbin to user PATH in Debian

Batch convert Asterisk GSM WAV files to mp3