It happens. According to Google Analytics, a website I built had 14.4 billion in revenue on one day in November 2008. While I wish it were true, the bigger problem is that any chart that includes this datapoint is essentially useless, since it dwarfs the daily revenue of every other day.
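To make the scale problem concrete, here's a rough sketch with made-up numbers (nothing here is the real data): one absurd datapoint pushes every normal day to effectively zero on a shared axis, and the usual workaround is just to drop or cap it before plotting.

    # Made-up daily revenue figures; one bogus datapoint like the 14.4 billion day.
    daily_revenue = [1200.0, 950.0, 1430.0, 14_400_000_000.0, 1100.0]

    # Plotted on a shared axis, every normal day is a tiny fraction of the peak.
    peak = max(daily_revenue)
    print([v / peak for v in daily_revenue])   # normal days come out around 1e-7

    # Common workaround: drop (or cap) anything wildly beyond the median
    # before charting, accepting that the bad day is simply masked out.
    median = sorted(daily_revenue)[len(daily_revenue) // 2]
    cleaned = [v for v in daily_revenue if v <= 100 * median]
    print(cleaned)                             # [1200.0, 950.0, 1430.0, 1100.0]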
"Originally, Google neglected updating Google Trends on a regular basis. In March 2007, internet bloggers noticed that Google had not added new data since November 2006, and Trends was updated within a week. Google did not update Trends from March until July 30, and only after it was blogged about, again.[2] Google now claims to be "updating the information provided by Google Trends daily; Hot Trends is updated hourly."
Google Insights for Search seems to be better for this sort of analysis since it offers regional filtering options and puts the searched term into context.
Seems like they had an error in how it was calculated? The spike exists for every term I can find that existed back then, but it seems relative to the total, so maybe they accidentally counted one search as two or something? Maybe it was an issue with their use of AJAX that caused two search requests to be fired off to Google? Or maybe the data isn't wrong, and something genuinely caused extra searches?
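Purely to illustrate that guess (the terms and counts here are invented, and nothing is known about Google's actual logging): if some frontend change caused every search in a window to be recorded twice, every term's count would double for that period, and since Trends plots each term against its own history, that looks like a simultaneous spike in everything.

    from collections import Counter

    # Invented traffic for the window in question.
    real_searches = ["xkcd", "penny arcade", "xkcd", "weather"]

    # Normal logging: one record per search.
    normal_log = Counter(real_searches)

    # Hypothetical bug: each search also fires a second request that gets logged.
    buggy_log = Counter(real_searches * 2)

    print(normal_log)  # Counter({'xkcd': 2, 'penny arcade': 1, 'weather': 1})
    print(buggy_log)   # every count doubled -> every term spikes at once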
When I first saw the spike it was for XKCD and Penny Arcade, so I was expecting it to be internet-culture related. At the time XKCD was still growing in readership, so a colossal jump was a bit weird, and I was wondering about a Digg or Reddit boost. Then, when I noticed it was wider and extended into more obscure terms, I wondered if it was an oddball 4chan event. However, I saw it in literally every term I searched; the only ones I couldn't see it in were terms that already had huge random spikes.
So I thought I'd post it here and see if anyone else could figure it out, and it looks like a few people have good suggestions - gotta love HN.
Do they even have the ability to repair the data? If the information is logged in real time and there is no easy way to filter through billions of search query terms to de-dupe them (or apply whatever fix may be required), it might not be possible to correct the dataset.
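For what a fix would even have to look like - a sketch that assumes duplicate rows exist and that raw records were kept, neither of which we actually know - you'd need some key that identifies each logical search and then a pass over every record in the affected window:

    # Hypothetical raw records: (timestamp, client_id, query).
    records = [
        ("2008-11-17T12:00:01", "client-a", "xkcd"),
        ("2008-11-17T12:00:01", "client-a", "xkcd"),   # duplicate from the supposed bug
        ("2008-11-17T12:00:05", "client-b", "penny arcade"),
    ]

    # De-dupe by treating identical (timestamp, client, query) rows as one search.
    seen = set()
    deduped = []
    for row in records:
        if row not in seen:
            seen.add(row)
            deduped.append(row)

    print(deduped)  # the duplicate xkcd row is gone
    # At Google's scale this means re-reading billions of rows for the window,
    # and it only works at all if those raw rows still exist.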
Because the original data on which the statistics are based was probably deleted a long time ago, and that raw data is the only way to recompute the 'right' numbers.
The only option would be to filter out the peak, but then again, that would also lose all the real information in that timespan. Just too much bother.
I would assume that Google, or any other similar data-aggregation company, would log and keep statistics and summary information but discard the actual raw data. They do have some of the biggest storage capabilities around, but that's no reason to fill them full of Apache logs.
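A toy sketch of that kind of rollup (this is not Google's actual pipeline, just the general pattern being described): raw request lines get aggregated into per-day counts and then discarded, so any inflated count is baked into the summary with nothing left to check it against.

    from collections import defaultdict

    # Invented raw log lines for one day.
    raw_log = [
        ("2008-11-17", "xkcd"),
        ("2008-11-17", "xkcd"),          # the extra record you'd want to remove later
        ("2008-11-17", "penny arcade"),
    ]

    # Roll up into per-day, per-term counts and keep only the summary.
    daily_counts = defaultdict(int)
    for day, query in raw_log:
        daily_counts[(day, query)] += 1
    del raw_log  # aggregates survive, raw data doesn't

    print(dict(daily_counts))
    # {('2008-11-17', 'xkcd'): 2, ('2008-11-17', 'penny arcade'): 1}
    # Once this is all that's stored, you can't tell a bogus extra count
    # from a genuine second search.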
For instance, if a spam site ranks for a query, banning it manually is only a very last resort - they would prefer to change the next incarnation of the algorithm to block that spam.