Google Analytics Code Scraper
For SEO purposes, I knocked together a quick script to scrape clients' websites and make sure they had Google Analytics set up. It searches for the string "UA-", which is the standard prefix of a Google Analytics tracking ID. Go and try it out! (On sites that you're allowed to scrape, that is! :P) Yes, I am well aware there are many caveats, such as enterprise or non-Google tracking, but for a low-tier website hosting company this was plenty.
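The core of the check, pulling a "UA-" ID out of page source, can be sketched on its own. This is a minimal illustration rather than the script itself: `extract_ua_id` is a hypothetical helper name, and it assumes the ID has the usual UA-digits-digits shape.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): read HTML on stdin
# and print the first ID shaped like UA-<digits>-<digits>, if there is one.
extract_ua_id() {
    # -o prints only the matching text, -E enables extended regex syntax.
    # Requiring digits after "UA-" skips tags such as X-UA-Compatible.
    grep -oE 'UA-[0-9]+-[0-9]+' | head -n 1
}
```

Assuming wget is available, something like `wget -qO- www.example.com | extract_ua_id` would print the ID or nothing at all (www.example.com is only a placeholder).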
#!/bin/bash
# Usage: ./analytics-scrape.sh www.slowb.ro analytics-output.txt
# Real-world usage:
# for item in $(cat domains-that-we-scrape.txt); do echo "$item"; ./analytics-scrape.sh "www.$item" analytics-output.txt; done
# I suggest testing each domain without the www. prefix first. If the bare domain throws an error, that site's SEO is already off to a bad start.
site=$1
output=$2
# Tracking IDs look like UA-<digits>-<digits>.
regex='UA-[0-9]+-[0-9]+'
# Fetch the page with a browser-like user agent, a 30 second timeout and up to 3 tries.
atauaid=$(wget "$site" -qO- -U 'Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20100101 Firefox/10.0.2' -T 30 --tries=3)
if [ "$atauaid" != "" ]; then
    # First pass: take the first line mentioning UA- (ignoring site-verification
    # and X-UA-Compatible tags) and split it on single quotes.
    uaid=$(echo -n "$atauaid" | grep UA- | grep -v verification | grep -v X-UA | head -n 1 | awk -F"'" '{ for (i=1;i<=NF;i++) if ($i ~ /UA-/) print $i }')
    # If that did not isolate the ID, it was probably double-quoted, so split on double quotes.
    if ! [[ "$uaid" =~ ^UA- ]]; then
        uaid=$(echo -n "$uaid" | awk -F'"' '{ for (i=1;i<=NF;i++) if ($i ~ /UA-/) print $i }')
    fi
    if [[ "$uaid" =~ $regex ]]; then
        echo "$site , $uaid , match ," >> "$output"
    else
        # Fallback: run the same extraction over every matching line, not just the first.
        _uaid=$(echo -n "$atauaid" | grep UA- | grep -v verification | grep -v X-UA | awk -F"'" '{ for (i=1;i<=NF;i++) if ($i ~ /UA-/) print $i }')
        if ! [[ "$_uaid" =~ ^UA- ]]; then
            _uaid=$(echo -n "$_uaid" | awk -F'"' '{ for (i=1;i<=NF;i++) if ($i ~ /UA-/) print $i }')
        fi
        if [[ "$_uaid" == "" ]]; then
            echo "$site , $_uaid , missing" >> "$output"
        else
            echo "$site , $_uaid , backupmatch" >> "$output"
        fi
    fi
else
    # wget returned nothing, so record the failure.
    echo "Scraping $site FAILED" >> "$output"
fi
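For longer domain lists, the one-liner from the usage comment can be wrapped in a small function. This is only a sketch: `scrape_domains` and the `SCRAPER` variable are hypothetical names, and it assumes the list holds one bare domain per line (no www.), with blank lines and lines starting with # ignored.

```shell
#!/bin/bash
# Hypothetical batch helper (not part of the original script).
# SCRAPER is an assumed override hook; by default it points at the script above.
SCRAPER=${SCRAPER:-./analytics-scrape.sh}

scrape_domains() {
    local list=$1 output=$2 domain
    # Read one domain per line; the extra test keeps a last line that has no trailing newline.
    while IFS= read -r domain || [ -n "$domain" ]; do
        case $domain in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        echo "$domain"
        # Redirect stdin so the scraper cannot swallow the rest of the list.
        "$SCRAPER" "www.$domain" "$output" < /dev/null
    done < "$list"
}
```

Usage would then look like `scrape_domains domains-that-we-scrape.txt analytics-output.txt`. Reading line by line instead of using `$(cat ...)` avoids word splitting and glob expansion on odd entries in the list.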