
Nice. Although it's not mentioned, it seems like the insight here is partially about the lingering effects that one model of persistence has had on the API?

A log is normally considered a stream that's too large to keep in memory and needs to be incrementally persisted to disk on the fly, in timestamp order. (Or at least, the traditional file-based approach does that, and it shapes the API.) But each HTTP request happens in memory, so its log could be kept there too.

A downside might be that if the server crashes in the middle of a request, the request goes entirely unlogged? But maybe you could incrementally log some values to disk and merge them later.
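The in-memory approach with an optional disk spill might look something like this minimal sketch (the `RequestLog` class and `spill` parameter are hypothetical names, not from any real library):

```python
import json


class RequestLog:
    """Accumulate one request's log fields in memory; emit a single line at the end.

    If a `spill` file-like object is given, each field is also written to disk
    as it arrives, so a crash mid-request leaves partial entries that could be
    merged back into a complete line later.
    """

    def __init__(self, spill=None):
        self.fields = {}
        self.spill = spill

    def add(self, key, value):
        self.fields[key] = value
        if self.spill is not None:
            # Incremental persistence for crash recovery.
            self.spill.write(json.dumps({key: value}) + "\n")

    def flush(self):
        # The one log line for the whole request.
        return json.dumps(self.fields)


log = RequestLog()
log.add("path", "/checkout")
log.add("status", 200)
line = log.flush()
```

Under normal operation only `flush` output is persisted; the spill file is just insurance against the crash case above.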



It wouldn't make sense to log most of these events outside the context of the HTTP request -- it makes sense to smash them all into the one log line. Otherwise you have no simple way to tie everything back together; you could mint an id for each request and tag every line with it, but that is inconvenient, slow, and mostly a way to justify a Hadoop cluster you don't actually need.
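One way to get everything into a single line without threading a request id through every call is a request-scoped context. This is a hypothetical sketch using Python's stdlib `contextvars`; the handler names are illustrative:

```python
import contextvars
import json

# Request-scoped log dict: any function in the call stack can add fields
# to it, and the whole request still comes out as one line -- no id join.
current_log = contextvars.ContextVar("current_log")


def check_auth():
    current_log.get()["user_id"] = 42


def query_db():
    current_log.get()["db_rows"] = 17


def handle_request(path):
    current_log.set({"path": path})
    check_auth()
    query_db()
    # One log line for the entire request.
    return json.dumps(current_log.get())


line = handle_request("/orders")
```

The deeper handlers never see a request id; they just write into whatever log the current request owns.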

That's not the key insight though. The key insight is not to try to log the time something took -- instead, log the beginning and the end, and calculate the duration later. This makes the logging APIs much simpler and lets you measure times between events in different scopes without coupling those scopes.
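A minimal sketch of that idea (event names and the `duration` helper are illustrative): each log call records only an event name and a timestamp, and durations between any two events are derived afterwards, even when the events came from scopes that never touch each other.

```python
import time

events = []


def log_event(name):
    # The logging API stays trivial: no timers, no paired start/stop handles.
    events.append({"event": name, "t": time.monotonic()})


log_event("request_start")
log_event("db_query_start")
log_event("db_query_end")
log_event("request_end")


def duration(start, end):
    # Compute durations at read time from the recorded timestamps.
    ts = {e["event"]: e["t"] for e in events}
    return ts[end] - ts[start]


total = duration("request_start", "request_end")
db = duration("db_query_start", "db_query_end")
```

Because timing is reconstructed from timestamps, `duration("request_start", "db_query_end")` works just as well, with no shared timer object between those scopes.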



