Hacker News

For what it's worth, if you have on the order of N things to store and you hash them into a space of size N^3, the probability of any collision is less than N^(-1) (by the birthday bound it is at most N^2 / (2·N^3) = 1/(2N)). Basically nonexistent. So if you want to save space, just use a normal hash table implementation, but store a wider hash in place of the keys. This is very useful when your keys are document-sized.
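A minimal sketch of the idea in Python, assuming a set-style membership structure (the class name and helper are hypothetical, not from the thread): instead of keeping the full document-sized keys, keep only a fixed-size strong digest of each one. With a 128-bit digest the collision probability for n items is roughly n^2 / 2^129, so equality of digests can stand in for equality of documents.

```python
import hashlib

def fingerprint(doc: bytes) -> bytes:
    # 16-byte (128-bit) digest: for n documents the chance of any
    # collision is about n^2 / 2^129 -- negligible for realistic n.
    return hashlib.blake2b(doc, digest_size=16).digest()

class FingerprintSet:
    """Membership test for large keys that stores only their digests."""

    def __init__(self):
        self._seen = set()

    def add(self, doc: bytes) -> None:
        # Store the small digest, not the (possibly huge) document.
        self._seen.add(fingerprint(doc))

    def __contains__(self, doc: bytes) -> bool:
        return fingerprint(doc) in self._seen

s = FingerprintSet()
s.add(b"a very long document ..." * 1000)
print(b"a very long document ..." * 1000 in s)  # True
print(b"some other document" in s)              # False
```

The memory cost per entry is a constant 16 bytes regardless of key size, which is where the savings come from when keys are document-sized.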


I don't understand what you mean.

10 things to store, 1000 buckets to store them in, less than 0.1 probability of a collision? That seems neither very impressive nor obviously correct.


It's just a rough (but valid for all N) upper bound. The precise probability in your example is about 0.044, comfortably under 0.1. The point is that you can replace a precise but expensive equality test on the full keys with a second hash that has a much larger range.
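The numbers above can be checked directly with the exact birthday computation (the function name here is just for illustration): multiply out the probability that all n hashes land in distinct buckets, and compare against the two bounds.

```python
def collision_prob(n: int, m: int) -> float:
    # Exact probability that n independent uniform hashes into m
    # buckets produce at least one collision (birthday problem).
    p_no_collision = 1.0
    for k in range(n):
        p_no_collision *= (m - k) / m
    return 1.0 - p_no_collision

n, m = 10, 1000
print(round(collision_prob(n, m), 4))  # exact value, ~0.044
print(n * n / (2 * m))                 # birthday upper bound: 0.05
print(1 / n)                           # the cruder 1/N bound: 0.1
```

Note the ordering: exact probability < N^2/(2M) < 1/N, so the 1/N figure in the parent comment really is just a loose but safe bound.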



