Google Analytics Code Scraper

For SEO purposes, I knocked together a quick script to scrape clients' websites and make sure they had Google Analytics set up on their sites.
It searches for the string "UA-", which is the standard prefix of a Google Analytics tracking ID. Go and try it out! (On sites that you're allowed to scrape, that is! :P)
Yes, I am well aware there are many caveats, such as enterprise tracking or non-Google analytics, but for a low-tier website hosting company this was plenty.
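
To show the core idea on its own, here is a minimal sketch of pulling a "UA-" ID out of a chunk of HTML. The function name `extract_ua_id` and the assumption that an ID looks like `UA-<digits>-<digits>` are mine, not part of the script below, which uses a longer grep/awk pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): print the first
# UA-style ID found in HTML read from stdin.
# Assumption: an ID has the shape UA-<digits>-<digits>.
extract_ua_id() {
    # -o prints only the matching text, -E enables extended regex.
    # "X-UA-Compatible" is skipped automatically because digits must follow "UA-".
    grep -oE 'UA-[0-9]+-[0-9]+' | head -n 1
}

# Example with an inline snippet instead of a live site:
printf '%s\n' "<script>ga('create', 'UA-12345-1', 'auto');</script>" | extract_ua_id
```

Matching the full ID shape rather than the bare "UA-" prefix avoids most false positives, at the cost of missing anything that does not follow that shape.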

# usage: ./ analytics-output.txt
# real world usage:
# for item in $(cat domains-that-we-scrape.txt); do echo $item; ./ www.$item analytics-output.txt; done
# I suggest testing domains without the www. prefix first. If a site throws an error without it, you are already off to a bad start with your SEO practices.

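# $site, $output and $regex must be set before this point; their definitions are not shown in this listing.
# Fetch the page quietly to stdout with a browser-like user agent, a 30 second timeout and up to 3 tries.
atauaid=$(wget "$site" -qO- -U 'Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20100101 Firefox/10.0.2' -T 30 --tries=3)
if [ -n "$atauaid" ]; then
    # Take the first line mentioning UA-, skipping site-verification tags and X-UA-Compatible, and split it on single quotes.
    uaid=$(echo -n "$atauaid" | grep UA- | grep -v verification | grep -v X-UA | head -n 1 | awk -F"'" '{ for (i=1;i<=NF;i++) if ($i ~ /UA-/) print $i }')
    # If that did not isolate the ID, split on double quotes instead.
    if ! [[ "$uaid" =~ ^UA- ]]; then
        uaid=$(echo -n "$uaid" | awk -F'"' '{ for (i=1;i<=NF;i++) if ($i ~ /UA-/) print $i }')
    fi
    if [[ "$uaid" =~ $regex ]]; then
        echo "$site , $uaid ,  match ," >> "$output"
    else
        # Fall back to every line mentioning UA-, not just the first.
        _uaid=$(echo -n "$atauaid" | grep UA- | grep -v verification | grep -v X-UA | awk -F"'" '{ for (i=1;i<=NF;i++) if ($i ~ /UA-/) print $i }')
        if ! [[ "$_uaid" =~ ^UA- ]]; then
            _uaid=$(echo -n "$_uaid" | awk -F'"' '{ for (i=1;i<=NF;i++) if ($i ~ /UA-/) print $i }')
        fi
        if [[ "$_uaid" == "" ]]; then
            echo "$site , $_uaid , missing" >> "$output"
        else
            echo "$site , $_uaid , backupmatch" >> "$output"
        fi
    fi
else
    echo "Scraping $site FAILED" >> "$output"
fi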
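
The script above reads `$site`, `$output` and `$regex`, but the listing does not show where they are set. Here is a rough sketch of what that setup might look like. The argument order is taken from the usage comment; the pattern in `regex`, the helper `is_ua_id` and the default values are all my assumptions, not the original values.

```shell
#!/usr/bin/env bash
# Assumed pattern for a tracking ID such as UA-12345-1.
# The original listing does not show the value it uses for $regex.
regex='^UA-[0-9]+-[0-9]+$'

# Hypothetical helper: succeed only if the argument looks like a UA-style ID.
# grep -E uses the same extended regex syntax as bash's [[ ... =~ ... ]].
is_ua_id() {
    printf '%s\n' "$1" | grep -qE "$regex"
}

# Argument order follows the usage comment: site first, output file second.
# The defaults are placeholders for illustration only.
site=${1:-www.example.com}
output=${2:-analytics-output.txt}

if is_ua_id "UA-12345-1"; then
    echo "UA-12345-1 looks like a tracking ID"
fi
```

Anchoring the pattern with `^` and `$` matters here: without the anchors, a whole line of JavaScript that merely contains an ID would also count as a match.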

Tim Coombs

Administrator of and world leader of my own mind, the only place our ideas and thoughts are our own in a world gone mad
