Posted by: John Erickson | January 11, 2010

The Evolution of Linked Data Business Models


  1. Thanks for the kind words, you’ve given me much food for thought.

    The way I think of our approach to linked data is that rather than building the linked data graph up from the atomic level of individual facts, we’re building a web of knowledge down from the molecular level of full datasets.

    As you point out, this is technologically simpler (like you’d expect from a bunch of chimpanzees) and presents a smaller impedance mismatch with current technology. It also avoids several subtle problems. Provenance becomes straightforward: compare the decision to trust (and cite) data from the National Climatic Data Center with that drawn from numerous Wikipedia contributors via DBpedia via infochimps, then abstract the latter to a live cloud of evolving data. Versioning and forking, efficient computing, and license/TOS compliance become significantly easier as well.

    I also want to say that though ‘rectangular’ data is most flexible, contributors should feel free to upload data in whatever shape and format they use, rectangular or graph or almost-structured. We have datasets containing one point and datasets with network graphs in some odd adjacency-list format. Well, of course, a .gml is odd to me, but not to the folks who made the dataset; I’d prefer a .tsv to process with Hadoop, and you might like an .rdf for your graph browser. The thing we all agree on is that we’d rather have the data in an odd format than not have it at all 🙂

  2. “Talis’ Paul Miller” ??? Not anymore! 🙂

  3. The more heavily-linked a dataset is, the more valuable it is, by definition.

    I beg to disagree. To give a simple example, you have a bunch of links sitting on the right side of the blog. If you tripled the number of links, would the list be more valuable, or less? If you had 300 times the number of links, would that make the list more valuable? I think not.

    Unfortunately it’s very cumbersome in linked data to assign a link strength, so only strong links are valuable.

  4. Thanks for your comment, Eric! I agree with you that fitness of the linked nodes is critical; indeed, this is why I’ve preferred thinking about Barabasi’s preferential attachment network model (which considers evolution due to node fitness) over more simplistic models.

    Several weeks ago I explored some of this in a rather lengthy post, but it was lost when my blog was trashed by Blogger. I will “have a think” and try to articulate a more precise statement of this idea of “value = links x fitness” soon!

  5. Taking platform independence and access as given (i.e., inherent to HTTP-based Linked Data), it’s best to look at Linked Data value as a function of Link Density (relatedness), Link Quality, and the Linked Data consumer’s Context Lenses.

