Freebase

Freebase
URL www.freebase.com
Type of site Online database
Content license Creative Commons Attribution License
Owner Metaweb Technologies
Alexa rank 16,446[1]

Freebase is a large collaborative knowledge base consisting of metadata composed mainly by its community members. It is an online collection of structured data harvested from many sources, including individual 'wiki' contributions.[2] Freebase aims to create a global resource which allows people (and machines) to access common information more effectively. It was developed by the American software company Metaweb and has been running publicly since March 2007. Metaweb was acquired by Google in a private sale announced July 16, 2010.

Freebase data is freely available for commercial and non-commercial use under a Creative Commons Attribution License; an open API, an RDF endpoint, and a database dump are provided for programmers. Google's News Timeline includes media information from Freebase.[3]

Overview

On March 3, 2007, Metaweb publicly announced Freebase, described by the company as "an open shared database of the world's knowledge," and "a massive, collaboratively-edited database of cross-linked data." Often described as a Wikipedia-turned-database or as an entity-relationship model, Freebase provides an interface that allows non-programmers to fill in structured data ('metadata') about general information, and to categorize or connect data items in meaningful ('semantic') ways.

At its launch, Tim O'Reilly described it this way: "Freebase is the bridge between the bottom up vision of Web 2.0 collective intelligence and the more structured world of the semantic web."[4]

Freebase contains data harvested from sources such as Wikipedia, ChefMoz, NNDB, and MusicBrainz, as well as data contributed individually by its users. The structured data is licensed under the Creative Commons Attribution License,[4] and a JSON-based HTTP API lets programmers build applications on any platform that use the Freebase data. The source code for the Metaweb application itself is proprietary.

Freebase runs on a database infrastructure created in-house by Metaweb that utilizes a graph model. This means that instead of using tables and keys to define data structures, Freebase defines its data structure as a set of nodes and a set of links that establish relationships between the nodes. Because its data structure is non-hierarchical, Freebase can model much more complex relationships between individual elements than a conventional database, and is open for users to enter new objects and relationships into the underlying graph. Queries to the database are made in "Metaweb Query Language" (MQL).
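The node-and-link model described above can be illustrated with a small sketch. This is a toy in-memory store, not Metaweb's implementation: the `Graph` class, its method names, and the identifiers such as `office_held` are all hypothetical, and the `read` method only imitates the general query-by-example idea (fixed values filter, `None` asks for a value) rather than reproducing MQL itself.

```python
from collections import defaultdict


class Graph:
    """Toy graph store: nodes are string ids, links are (source, property, target)."""

    def __init__(self):
        # source id -> property name -> list of target ids
        self.out = defaultdict(lambda: defaultdict(list))

    def add_link(self, source, prop, target):
        targets = self.out[source][prop]
        if target not in targets:
            targets.append(target)

    def targets(self, source, prop):
        node = self.out.get(source)
        return list(node[prop]) if node and prop in node else []

    def read(self, query):
        """Query by example: fixed values filter nodes, None requests the values."""
        results = []
        for node in self.out:
            row = {"id": node}
            for prop, wanted in query.items():
                found = self.targets(node, prop)
                if wanted is None:
                    row[prop] = found
                elif wanted in found:
                    row[prop] = wanted
                else:
                    break  # this node does not match the example
            else:
                results.append(row)
        return results


g = Graph()
# Hypothetical identifiers, chosen only for illustration.
g.add_link("arnold_schwarzenegger", "type", "actor")
g.add_link("arnold_schwarzenegger", "type", "politician")
g.add_link("arnold_schwarzenegger", "office_held", "governor_of_california")
g.add_link("governor_of_california", "jurisdiction", "california")

matches = g.read({"type": "politician", "office_held": None})
print(matches)
```

New kinds of relationship need no schema migration in this sketch: adding a link with a previously unseen property name is enough, which is the flexibility the graph model is meant to provide.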

Development

Danny Hillis first described his idea for a knowledge web, which he called Aristotle, in a paper in 2000. He said, however, that he did not try to build the system until he had recruited two technical experts as co-founders. Robert Cook, an expert in parallel computing and database design, is Metaweb's executive vice president for product development. John Giannandrea, formerly chief technologist at Tellme Networks and chief technologist of the Web browser group at Netscape/AOL, is the company's chief technology officer.[5]

Originally accessible by invitation only, Freebase opened full anonymous read access to the public in its alpha stage of development, and now requires registration only for data contributions.

On October 29, 2008, at the International Semantic Web Conference 2008, Freebase released its RDF service for generating RDF representations of Freebase topics, allowing Freebase to be used as Linked Data.[6]

Organization and policy

Freebase's subjects (which often correspond to a Wikipedia article) are called topics, and the data stored about them depend on their type, that is, on how they are classified. For example, an entry for Arnold Schwarzenegger, the former Governor of California, would be entered as a topic with a variety of types describing him as an actor, bodybuilder, and politician. Freebase had approximately 11.5 million topics as of April 2010.[7]

Freebase's ontologies (structured categories), known in Freebase as "types", are themselves user-editable.[4] Each type has a number of defined predicates, called "properties".

[U]nlike the W3C approach to the semantic web, which starts with controlled ontologies, Metaweb adopts a folksonomy approach, in which people can add new categories (much like tags), in a messy sprawl of potentially overlapping assertions.[4]

Even so, Freebase differs from the wiki model in several ways. Users can create their own types, but these types are not adopted into the 'public commons' until promoted by a Metaweb employee. Users also cannot modify each other's types. Freebase cannot open up schema permissions because external applications rely on the schemas: changing a type's schema, for instance by deleting or altering a property, might break queries for API users and even within Freebase itself, in saved views, for example.
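Why a schema change can break existing queries may be easier to see in a small sketch. The schema layout, the `broken_properties` helper, and the property names below are hypothetical and are not taken from Freebase; the point is only that a saved query names properties, so removing one from a type invalidates every query that mentions it.

```python
# Hypothetical schema: each type maps to the set of properties it defines.
schema = {"person": {"name", "date_of_birth", "spouse"}}


def broken_properties(query, schema):
    """Return the properties a query uses that its type no longer defines."""
    known = schema.get(query["type"], set())
    return sorted(p for p in query if p != "type" and p not in known)


# A saved query written against the original schema.
saved_query = {"type": "person", "name": None, "spouse": None}

before = broken_properties(saved_query, schema)

# Another user deletes a property from the shared type.
schema["person"].discard("spouse")

after = broken_properties(saved_query, schema)
print(before, after)
```

Before the deletion the saved query is valid; afterwards it refers to a property that no longer exists, which is the kind of breakage that restricting schema edits is meant to prevent.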

Similar to Wikipedia's administrator policy, Metaweb promotes some users to expert status and gives them some administrative permissions.[8]

The underlying data storage supports multilingual data, but as of 2011 every user's display language is fixed to English. This is expected to change, though no date has been given.

As of 2011, the only access to the multilingual data is via MQL.[9]

Business and community

The Freebase system is built and patented by Metaweb, a for-profit company,[10] which delivers targeted advertising on Freebase.com.[2] On its relationship with the open data community, Freebase states:

...we have no formal relationship with other open data projects. Though the definition of open data is pretty loose, we try to follow general open data principles by not restricting access to Freebase information to registered users, charging users to access our information, imposing restrictive licenses over the use of Freebase information, or using proprietary or closed technology as a barrier to accessing Freebase information.[11]

Freebase is planning formal mappings of some of its types to established ontologies like FOAF, though this is not a priority.[12]

In the future, the company hopes to also generate profit by organizing proprietary data.[13]

Criticism

Lack of notability guideline
Unlike Wikipedia, Freebase has no notability guidelines. Instead, it permits any data that might be of interest to other people; it does not permit transient data or data of only personal interest.[14] Under these guidelines, commercial content is permitted if it is structured, factual data. Because of this, some have raised concerns about spam.
Denormalisation
A type or base created on Freebase cannot be edited by anyone but its creator. This is a policy to prevent inexperienced or malevolent users from breaking schemas. A result of this policy is that a half-complete schema cannot be improved by other users and must instead be reproduced completely, producing non-cooperative and often duplicate types.
Information of absence
Freebase has no way to represent null, nothing, unknown, or N/A values. The None topic is badly broken: when it is used as a value, many people appear to share the same spouse. As it stands, someone looking for "fires of unknown cause" would have to look for missing causes, without knowing whether the cause of a fire is really unknown or the data is simply missing.[15]
Bulk import tools
Bulk import tools are used internally at Metaweb, but the reconciliation process for imported data has so far proved too complicated for public release, and public bulk tools are very limited.[16]
Multilingual implementation
Freebase has translations (or translation support) for many of its topics, but its types are currently implemented (or at least described) in English, which makes developing a universal schema difficult.
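The "Information of absence" problem above can be made concrete with a short sketch. This is not how Freebase stores data; the `Absence` marker, the record layout, and the fire names are hypothetical. It only shows the distinction the criticism says is missing: a value explicitly recorded as unknown versus a value that was never entered.

```python
from enum import Enum


class Absence(Enum):
    """Hypothetical explicit markers for values that are absent for a reason."""
    UNKNOWN = "unknown"          # investigated, but the value could not be determined
    NOT_APPLICABLE = "n/a"       # the property does not apply to this record


# Hypothetical records, for illustration only.
fires = [
    {"name": "fire_a", "cause": "lightning"},
    {"name": "fire_b", "cause": Absence.UNKNOWN},
    {"name": "fire_c"},  # no cause was ever recorded
]


def unknown_cause(records):
    """Fires whose cause is explicitly recorded as unknown."""
    return [r["name"] for r in records if r.get("cause") is Absence.UNKNOWN]


def missing_cause(records):
    """Fires for which no cause has been entered at all."""
    return [r["name"] for r in records if "cause" not in r]


print(unknown_cause(fires), missing_cause(fires))
```

Without an explicit marker, the two queries collapse into one, which is the ambiguity described for "fires of unknown cause".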

References

  1. ^ "Freebase - Alexa". Alexa Internet, Inc. http://www.alexa.com/siteinfo/www.freebase.com. Retrieved 17 April 2011.
  2. ^ a b Markoff, John (2007-03-09). "Start-Up Aims for Database to Automate Web Searching". The New York Times. http://www.nytimes.com/2007/03/09/technology/09data.html?ex=1331096400&en=a87d4f61e6052888&ei=5090&partner=rssuserland&emc=rss. Retrieved 2007-03-09. 
  3. ^ "Features : Google News Timeline - Google News Help". www.google.com. http://www.google.com/support/news/bin/answer.py?answer=144273. Retrieved 2009-06-30. 
  4. ^ a b c d "Freebase Will Prove Addictive". O'Reilly Radar. 2007-03-08. http://radar.oreilly.com/archives/2007/03/freebase_will_p_1.html. Retrieved 2007-03-09. 
  5. ^ Markoff, John (2007-03-09). "Start-Up Aims for Database to Automate Web Searching". The New York Times. http://www.nytimes.com/2007/03/09/technology/09data.html?_r=2&oref=slogin. Retrieved 2009-02-07.
  6. ^ "Introducing the Freebase RDF service". http://blog.freebase.com/2008/10/30/introducing_the_rdf_service/. Retrieved 2009-02-19. 
  7. ^ http://www.freebase.com/explore
  8. ^ "new-experts-programme". http://blog.freebase.com/2008/08/29/freebases-new-experts-programme/. Retrieved 2009-02-07. 
  9. ^ http://blog.freebase.com/2007/12/04/internationalization-in-freebase/
  10. ^ "investors". http://www.metaweb.com/about/investors.html. Retrieved 2009-01-02. 
  11. ^ "faq". http://www.freebase.com/help/faq. Retrieved 2009-01-02. 
  12. ^ "introducing_the_rdf_service/". http://blog.freebase.com/2008/10/30/introducing_the_rdf_service/. Retrieved 2009-02-07. 
  13. ^ "Sharing what matters". The Economist. 2007-06-07. http://www.economist.com/printedition/displaystory.cfm?story_id=9249171. Retrieved 2007-06-15. 
  14. ^ "Freebase Contribution Guidelines". http://www.freebase.com/view/en/freebase_contribution_guidelines. Retrieved 2009-02-27. 
  15. ^ "Data-modeling discussion - None as a topic". http://lists.freebase.com/pipermail/data-modeling/2008-October/001159.html. Retrieved 2009-02-19. 
  16. ^ "blog". http://blog.freebase.com/2007/12/14/list-importer/. Retrieved 2009-01-02. 