Introducing Semantic XBRL
Written by Ashu Bhatnagar Posted on April 29, 2009
Ashu Bhatnagar is CEO of Good Morning Research, a Softpark company that specializes in building Semantic XBRL technology. The GoodMorningResearch.com machine automates XBRL tagging of Excel data in RDF format with one-click Save As XBRL functionality. Mr. Bhatnagar moderates the Semantic XBRL group on LinkedIn.
The history of XBRL is already known to most of this blog’s readers — a good concise refresher is available at the Bryant University site — but the history of the Semantic Web is less likely to be well known.
Because both XBRL and the Semantic Web rely upon XML and embedded tags for assigning meaning (i.e., semantics) to elements (data), and use extensible taxonomies for definitions in a standardized manner, their futures are likely intertwined. It’s useful to have some understanding of the Semantic Web’s origins and design in order to better understand its influence on XBRL moving forward.
On the East Coast in September 1998, Tim Berners-Lee, the father of the World Wide Web, coined the term “Semantic Web” in a roadmap for future Web design.
In Tim’s words, the architecture was intended for “achieving a set of connected applications for data on the Web in such a way as to form a consistent logical web of data (semantic web).”
The roadmap noted:
The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help.
One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the web. Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form.
RDF, OWL, and SPARQL are examples of such languages.
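To make "machine processable form" a little more concrete, here is a minimal sketch in Python of the idea behind RDF: every statement is a subject-predicate-object triple, and a query is a pattern with wildcards, loosely in the spirit of a SPARQL pattern. The `ex:` names, the sample data, and the `match` helper are all invented for illustration. A real application would use an RDF library and full URIs rather than plain strings.

```python
from typing import Iterator, List, Optional, Tuple

# A triple is (subject, predicate, object), the basic unit of an RDF statement.
Triple = Tuple[str, str, str]

# Hypothetical statements, invented for illustration only.
TRIPLES: List[Triple] = [
    ("ex:AcmeCorp", "rdf:type", "ex:Company"),
    ("ex:AcmeCorp", "ex:reportedRevenue", "1000000"),
    ("ex:BetaInc", "rdf:type", "ex:Company"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# "Which subjects are companies?" expressed as a pattern over triples.
companies = [s for s, _, _ in match(TRIPLES, p="rdf:type", o="ex:Company")]
print(companies)  # ['ex:AcmeCorp', 'ex:BetaInc']
```

The point is that a program can answer the question without any knowledge of how a page was laid out for human readers, because the structure of the data is explicit.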
Fast forward roughly a decade. In large part because of the support of former SEC chairman Christopher Cox, XBRL today has gained global support and strong momentum from financial regulators in many countries, who increasingly mandate it for filings by publicly listed companies.
Similarly, the Semantic Web has gained a strong following from global academic researchers and the World Wide Web Consortium, the de facto body that defines standards for the web. Fruits of this advanced academic research now are beginning to show in areas like machine-automated ontology.
XBRL leads the space when it comes to development of standardized financial taxonomies with support from market regulators, IFRS, and accounting professionals in many countries. But its continued improvement likely will benefit from the Semantic Web’s advances.
Today’s XBRL is built on today’s Web, but today’s Web is not standing still! It’s growing beyond a web of hypertext-linked documents to a Semantic Web. This web is being enhanced not only by changes to XML but also by next-generation standardized languages like the Resource Description Framework (RDF), the Web Ontology Language (OWL), and the SPARQL query language for RDF, enhancements that complement XBRL’s framework.
In parallel, efforts are underway to impart semantics for machine automation to more of the current non-semantic web:
- From the Semantic Web side, RDFa (RDF in attributes) embeds RDF statements in the attributes of ordinary web-page markup.
- Similarly, Andy Greener describes Inline XBRL as a way to embed XBRL-tagged data in a document that still renders as a current-standard, non-semantic web page. XBRL International’s specification for Inline XBRL, a markup for embedding XBRL metadata inside HTML documents, is available online.
- A January 2009 article from Scientific American, “The Semantic Web in Action: Corporate applications are well underway, and consumer uses are emerging,” reports on real-life applications emerging from the Semantic Web community.
- In the domain of finance, Thomson Reuters offers OpenCalais, a free service that automatically tags unstructured text and returns the results in RDF. It is used for machine-tagging financial news.
Many more tools and applications currently under development are focused on integrating RDF with XBRL. In short, this will amount to Semantic XBRL.
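As a rough illustration of what integrating RDF with XBRL could look like, the sketch below takes one XBRL-style fact (entity, concept, period, unit, value) and writes it out as RDF statements in N-Triples syntax. It does not describe any particular tool. The `http://example.org/xbrl/` namespace, the property names, the `Fact` class, and the sample figures are assumptions made up for this example, not part of any XBRL or RDF standard vocabulary.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical namespace for illustration; not a real XBRL or RDF vocabulary.
EX = "http://example.org/xbrl/"
# Standard XML Schema datatype URI for decimal literals.
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


@dataclass(frozen=True)
class Fact:
    """A simplified stand-in for a single tagged XBRL fact."""
    entity: str
    concept: str
    period: str
    unit: str
    value: str


def fact_to_ntriples(fact: Fact, fact_id: str) -> List[str]:
    """Return one N-Triples line per property of the fact."""
    subject = f"<{EX}fact/{fact_id}>"
    return [
        f"{subject} <{EX}entity> <{EX}entity/{fact.entity}> .",
        f"{subject} <{EX}concept> <{EX}concept/{fact.concept}> .",
        f'{subject} <{EX}period> "{fact.period}" .',
        f'{subject} <{EX}unit> "{fact.unit}" .',
        f'{subject} <{EX}value> "{fact.value}"^^<{XSD_DECIMAL}> .',
    ]


# Sample data invented for the example.
revenue = Fact(entity="AcmeCorp", concept="Revenues", period="2008",
               unit="USD", value="1000000")
lines = fact_to_ntriples(revenue, "f1")
print("\n".join(lines))
```

Once facts are expressed this way, they can sit in the same graph as triples from other sources, such as machine-tagged news, and be queried together.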
In the coming weeks, I hope to explore some of these developments more specifically, including wiki-tagging, data quality, and transparency. In the meantime, check out Tim Berners-Lee’s talk on linked data.

