Hi josephg, I'm a CRDT researcher. This is great to see so much work around CRDT...

josephg · on July 31, 2021

Cool! It'd be interesting to see those CRDT implementations added to Kevin Jahns' CRDT Benchmarks page[1]. The LogootSplit paper looks interesting. It looks like xray is abandoned, and I'm not sure about teletype. Though teletype's CRDT looks to be entirely implemented in javascript[2]? If the authors are around I'd love to see some benchmarks so we can compare approaches and learn what actually works well.

And I'm not surprised these techniques have been invented before. Realising a tree is an appropriate data structure here is a pretty obvious step if you have a mind for data structures.

To name it, I often find myself feeling defensive when people read my work and respond with a bunch of links to academic papers. Its probably totally unfair and a complete projection from my side, but I hear a voice in my head reword your comment to instead say something awful like: "Cool, but everything you did was done before. Even if they didn't make any of their work practical, usable or good they still published first and you obviously didn't do a good enough literature review if you didn't know that." And I feel an unfair defensiveness arise in me as a result that wants to find excuses to dismiss the work, even if the work might be otherwise interesting.

Its hard to compare their benchmark results because they used synthetic randomized editing traces, which always have different performance profiles than real edits for this stuff. Their own university gathered some great real world data in an earlier study. It would have been much more instructive if that data set was used here. At a glance their RAM usage looks to be about 2 orders of magnitude worse than diamond-types or yjs. And their CPU usage... ?? I can't tell because they have no tables of results. Just some hard to read charts with log scales, so you can't even really eyeball the figures. So its really hard to tell if their work ends up performance-competitive without spending a couple days getting their enterprise style java code running with a better data set. Do you think thats worth doing?

[1] https://github.com/dmonad/crdt-benchmarks

[2] https://github.com/atom/teletype-crdt

conaclos · on Aug 1, 2021

Yes, xray was abandoned and teletype is written in JS.

I understand your point and as a researcher and engineer I know your feeling. I took some cautions by using "Some optimizations". I value engineering as much as research and I'm bothered when I heard any side telling the other side that their work is worthless. Your work and the work of Kevin Jahns are very valuable and could improve the way that researchers and engineers do benchmarks.

This is still hard for me to determine when position-based list CRDT (Logoot, LogootSPlit, ...) are better than tombstone-based list CRDT (RGA, RgaSplit, Yata, ...). It could be worth to assess that.

3 year ago I started an update of LogootSplit. The new CRDT is named Dotted LogootSplit [1] and enables delta-synchronizations. The work is not finished: I had other priorities such as writing my thesis... I have to perform some benchmark. However I'm more interested in the hypothetical advantages of Dotted LogootSplit regarding synchronization over unreliable networks. From an engineering point-of-view, I'm using a partially-persistent-capable AVL tree [2]. Eventually I would like to switch to a partially-persistent-capable b-tree. Unfortunately writing a paper is very time consuming, and time is missing.

I still stick with JS/TS because in my viewpoint Wasm is not mature yet. Ideally, I would like to use a language that compiles both to JS and Wasm. Several years ago I welcomed Rust with a lot of enthusiasm. Now I'm doubtful about Rust due to the inherent complexity of the language.

[1] https://github.com/coast-team/dotted-logootsplit/tree/dev [2] https://github.com/Conaclos/cow-list

josephg · on Aug 2, 2021

Cool! What do you think is missing from wasm for maturity? It seems great for something like CRDTs, since the code is reasonably self contained. I hear you about rust - I'm not convinced it'll ever be as popular as java / C# / JS for exactly that reason. But rust doesn't need to be that popular for me to enjoy it, or for the people who use my software to reap the speed & safety benefits.

I'll have to take a read of LogootSplit. I suspect most / all list CRDTs can work with this approach (using a list internally and doing an insertion sort). But I don't know enough about how logoot / logootsplit works to know!

And I really hear you about writing papers taking time. That blog post we're talking about here took nearly a month of time in aggregate to write. I wrote the initial draft in about 2 days, but editing and adding diagrams and everything was exhausting. There's still more work I could have put into it before publishing - I anticipated some of the things people were confused by in this thread. But at the end of the day, published > perfect and I have code to write as well!

syspec · on Aug 1, 2021

> To name it, I often find myself feeling defensive when people read my work and respond with a bunch of links to academic papers. Its probably totally unfair and a complete projection from my side, but I hear a voice in my head reword your comment to instead say something awful like: "Cool, but everything you did was done before. Even if they didn't make any of their work practical, usable or good they still published first and you obviously didn't do a good enough literature review if you didn't know that." And I feel an unfair defensiveness arise in me as a result that wants to find excuses to dismiss the work, even if the work might be otherwise interesting.

I've followed your work for a longtime (since chipmunk-js days), and that is a very honest self assessment