Nice to see more triple store implementations. What’s the industry uptake of data stores like this? It always struck me as much more academic and not used as much in industry. I saw a SaaS triple store a while ago, but it seemed to disappear or never really take off.
Nubank recently raised $400M at a $10B valuation and they depend on Datomic [1] heavily for their core systems [2].
The RDF database marketplace is very established [3], and the likes of MarkLogic, DB2, and Oracle have clearly encountered profitable reasons to add RDF support. I believe RDF has good traction in knowledge-intensive industry domains such as clinical research and life sciences.
Disclosure: I work on Crux [4] which adds bitemporal versioning and eviction to a document->triplestore model running on top of Kafka.
+1 I got a lot of mileage out of the triple model when working with social media data. You just don’t know what data patterns you will find when you start looking, and need to support generic queries.
Happy Datomic user here – simple, flexible, powerful. I've recently heard good things about https://www.stardog.com/ which is a real triple store (Datomic adds a time dimension)
Are there any differences between a KV-store in the form of "Bob:Knows" — "John" and a triple store in the form of "Bob" — "Knows" — "John"? Redis, for example, can query the first one easily by scanning.
Bonus question: What are some real-life use cases for triple stores?
A slightly more plausible example is “Who knows John?”. I think about turning to triple stores when I’m still exploring an application domain and don’t know what data access patterns will look like yet. Something like a hexastore that maintains full indexes for all query orders seems like a reasonable compromise for read-heavy applications in the prototype stage.
It is not unusual to implement a triple store with multiple indexes, so you could build k-v stores with
s-p -> o
p-o -> s
s-o -> p
and then you have indexes which are good for those triple patterns.
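For illustration, here's a toy sketch of that multi-index idea in Python (hypothetical class and method names, not any real store's API) — each index makes one family of triple patterns a cheap lookup:

```python
from collections import defaultdict

class TripleStore:
    """Toy triple store keeping three covering indexes (a simplified hexastore)."""

    def __init__(self):
        self.spo = defaultdict(lambda: defaultdict(set))  # s -> p -> {o}
        self.pos = defaultdict(lambda: defaultdict(set))  # p -> o -> {s}
        self.osp = defaultdict(lambda: defaultdict(set))  # o -> s -> {p}

    def add(self, s, p, o):
        # Every triple goes into all three indexes.
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def objects(self, s, p):
        """Answers 'What does Bob know?' via the s-p -> o index."""
        return self.spo[s][p]

    def subjects(self, p, o):
        """Answers 'Who knows John?' via the p-o -> s index."""
        return self.pos[p][o]

store = TripleStore()
store.add("Bob", "Knows", "John")
store.add("Alice", "Knows", "John")
print(store.subjects("Knows", "John"))  # the set {'Bob', 'Alice'} (order may vary)
```

The write amplification (three inserts per triple) is the price for making every query shape an index lookup, which is why this fits read-heavy workloads.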
Let's see.
The core table in the salesforce.com system consists of triples, but salesforce.com will materialize whatever indexes and views are necessary to make things fast based on automatic run-time profiling. Their patent on this should run out just about now, so this feature may turn up in real-life triple stores where it would make a big difference in practicality.
The NSA has been shopping around for a triple store which could ingest around 1 trillion triples per day.
The BBC made a nice web site for the world cup which used forward chaining inference in a triple store to determine the consequences of each goal, so the tables would all adjust whenever anything happened.
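The forward-chaining part is easy to sketch: apply rules to the fact set until nothing new can be derived. This is a minimal toy version (the rule and team names are made up, not the BBC's actual ruleset):

```python
def forward_chain(facts, rules):
    """Repeatedly apply rules until a fixpoint: no rule derives a new triple."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for new in rule(facts):
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

def beat_implies_advance(facts):
    # Hypothetical rule: if X Beat Y, then X advances.
    for (s, p, o) in list(facts):
        if p == "Beat":
            yield (s, "HasStatus", "Advances")

facts = {("Germany", "Beat", "Brazil")}
derived = forward_chain(facts, [beat_implies_advance])
print(("Germany", "HasStatus", "Advances") in derived)  # True
```

Real stores do this with rule languages and smarter incremental algorithms, but the "tables adjust whenever anything happens" effect is exactly this fixpoint loop re-run on new facts.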
How you store your triples affects performance, but, conceptually, is only an implementation detail.
But then, why stop at a KV-store? A set with entries “Bob:Knows:John” will work just as well, if you ignore performance.
But then, why stop at a set? A string “Bob:Knows:John;Bob:Loves:John;John:Is:vegetarian” works just as well (conceptually!).
IMO, a major real-life use case is as a means to produce PhD’s :-). The concept is enticing and easily grasped, but there are zillions of papers still to write on query planning, automatic storage optimization, discovering heuristics, etc. It’s just like the early days of SQL: you don’t have to read through decades of papers to get to the front of the field.
Unless I’m missing something, the in-memory backend here appears to actually use the set solution: all of the triple fields are concatenated together and used as a dictionary key. Queries iterate through the dictionary entries until a sufficient number of results have been located.
On the other hand, I’m not really familiar with Go, so I may be reading it wrong.
>Are there any differences between a KV-store in the form of "Bob:Knows" — "John" and a triple store in the form of "Bob" — "Knows" — "John"? Redis, for example, can query the first one easily by scanning.
A triple store can more quickly answer queries about triples. The reason to use triples is that it is what you naturally get when you try to store structured relational data where the schema changes quickly.
Looks like it is written in Go too. I can see yours being much simpler to get up and running initially, though. Akutan doesn't look as simple, since it's built on Docker and runs as a daemon.
Triple stores support a disciplined set of primitive types that come from XML Schema, so you have "xsd:integer", "xsd:dateTime", "xsd:decimal", really the critical things that are missing in JSON. That is, there is a kind of fact where the object is a literal.
Triple stores also support facts where the object is an identifier for another object. That could be a URI which names it, or it could be an internal "blank node" identifier.
Other kinds of "graph database" have different semantics, for instance they might not have support for literals, or have a different set of literal data types, or they might let you attach facts to the edges (hypergraph, property graph, ...)
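A rough way to picture the two kinds of objects (this is a sketch with made-up example IRIs, not any store's actual data model):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    """A typed value: the 'leaf' kind of object."""
    value: str
    datatype: str          # e.g. "xsd:integer", "xsd:dateTime", "xsd:decimal"

@dataclass(frozen=True)
class Node:
    """An identifier for another resource: a URI or a blank-node id like '_:b0'."""
    iri: str

triples = [
    (Node("http://example.org/Bob"), "ex:age",   Literal("42", "xsd:integer")),
    (Node("http://example.org/Bob"), "ex:knows", Node("http://example.org/John")),
]

for s, p, o in triples:
    kind = "literal" if isinstance(o, Literal) else "node"
    print(p, "->", kind)   # ex:age -> literal, then ex:knows -> node
```

Property graphs and hypergraphs relax this in different directions (facts on edges, other literal types), which is why "graph database" alone doesn't pin down the semantics.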