
Anyone remember gsearch by Jeff Dean, circa 2006 at Google? It was a full-scan regex search that ran on 40-80 machines in parallel, not an index.

Also, sourcerer was a full scan of VCS metadata on a single machine. That is, if you listed all the commits by a certain person, it would just run a for loop in C++ over about 1-2 million commits. They were just C++ structs in memory, and it was probably sub-millisecond, even on the hardware of the time.
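To make the "just a for loop over structs" idea concrete, a minimal sketch is below. The `Commit` layout and the `commits_by_author` name are my assumptions, not what sourcerer actually used.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record layout: small, fixed-size, and contiguous in memory.
struct Commit {
    std::uint32_t author_id;  // interned author, so the filter is one integer compare
    std::uint32_t revision;
    std::int64_t  timestamp;  // seconds since epoch
};

// Full scan: visit every commit and keep pointers to the ones by this author.
std::vector<const Commit*> commits_by_author(const std::vector<Commit>& all,
                                             std::uint32_t author_id) {
    std::vector<const Commit*> out;
    for (const Commit& c : all) {
        if (c.author_id == author_id) out.push_back(&c);
    }
    return out;
}
```

With records this small, the loop is a linear pass over a contiguous array, which is about as cache-friendly as a query gets.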

I remember reading those 2 programs and being impressed that they were both very fast, and full scans! (gsearch got slow later as the codebase grew, but the first version was a <1 second distributed grep of the whole Google codebase.)


