Introducing Semantic XBRL

Written by Bob Schneider
Posted on April 29, 2009

Written by Ashu Bhatnagar     Posted on April 29, 2009

Ashu Bhatnagar is CEO of Good Morning Research, a Softpark company that specializes in building Semantic XBRL technology. The GoodMorningResearch.com machine  automates XBRL tagging of Excel data in RDF format with one-click Save As XBRL functionality. Mr. Bhatnagar moderates the Semantic XBRL group on LinkedIn.

The history of XBRL is already known to most of this blog’s readers — a good concise refresher is available at the Bryant University site — but the history of the Semantic Web is less likely to be well known.

Because both XBRL and the Semantic Web rely upon XML and embedded tags for assigning meaning (i.e., semantics) to elements (data), and use extensible taxonomies for definitions in a standardized manner, their futures are likely intertwined. It’s useful to have some understanding of the Semantic Web’s origins and design in order to better understand its influence on XBRL moving forward.

On the East Coast in September 1998, Tim Berners-Lee, the father of the World Wide Web, coined the term “Semantic Web” in a roadmap for future Web design.

In Tim’s words, the architecture was intended for “achieving a set of connected applications for data on the Web in such a way as to form a consistent logical web of data (semantic web).”

The roadmap noted:

The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help.

One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the web. Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form.

RDF, OWL, and SPARQL are examples of such languages.

Fast forward ten years. In large part because of the support of former SEC chairman Christopher Cox, XBRL today has gained global support and strong momentum from financial regulators in many countries, who increasingly mandate its adoption for company filings from publicly listed companies.

Similarly, the Semantic Web has gained a strong following from global academic researchers and the World Wide Web Consortium, the de facto body that defines standards for the web. Fruits of this advanced academic research now are beginning to show in areas like machine-automated ontology.  

XBRL leads the space when it comes to development of standardized financial taxonomies with support from market regulators, IFRS, and accounting professionals in many countries. But its continued improvements likely will benefit from Semantic Web’s advances.

Today’s XBRL is built on today’s Web, but today’s Web is not standing still! It’s growing beyond a web of hypertext-linked documents to a Semantic Web. This web is being enhanced not only by changes to XML but also by next-generation standardized languages like Resource Description Framework (RDF), Web Ontology Language (OWL), and SPARQL Protocol and RDF Query Language (SPARQL), enhancements that complement XBRL’s framework.

In parallel, efforts are underway to impart semantics for machine automation to more of the current non-semantic web.

Many more tools and applications currently under development are focused on integrating RDF with XBRL. In short, this will amount to Semantic XBRL.
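To give a rough sense of what integrating RDF with XBRL might look like, here is a minimal Python sketch that reads a simplified, XBRL-like fragment and restates each tagged fact as a subject-predicate-object triple, the basic unit of RDF. The `ex` namespace, the element names, the company URI, and the values are all hypothetical, and the sketch ignores contexts and units, which a real conversion would need to carry over.

```python
import xml.etree.ElementTree as ET

# Simplified, XBRL-like fragment. The "ex" namespace, element names,
# and values are hypothetical, not drawn from a real taxonomy.
INSTANCE = """
<xbrl xmlns:ex="http://example.com/taxonomy">
  <ex:Revenue contextRef="FY2008" unitRef="USD">1500</ex:Revenue>
  <ex:NetIncome contextRef="FY2008" unitRef="USD">200</ex:NetIncome>
</xbrl>
"""

def facts_to_triples(xml_text, company_uri):
    """Restate each tagged fact as a (subject, predicate, object) triple."""
    root = ET.fromstring(xml_text)
    triples = []
    for fact in root:
        # ElementTree expands a prefixed tag to "{namespace}LocalName".
        namespace, local_name = fact.tag[1:].split("}")
        predicate = f"{namespace}#{local_name}"
        triples.append((company_uri, predicate, fact.text.strip()))
    return triples

triples = facts_to_triples(INSTANCE, "http://example.com/company/acme")
for subject, predicate, obj in triples:
    # Print in an N-Triples-like form for readability.
    print(f'<{subject}> <{predicate}> "{obj}" .')
```

Each printed line says, in effect, "this company has this reported value for this concept," which is the kind of statement Semantic Web tools are built to link and query.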

In the coming weeks, I hope to explore some of these developments more specifically, including wiki-tagging, data quality, and transparency. In the meantime, check out Tim Berners-Lee’s talk on linked data.

Hernando de Soto’s Insights Inspire XBRL Solutions

Written by Bob Schneider
Posted on April 24, 2009

A few weeks ago Paul Wilkinson published a terrific post on Hernando de Soto's WSJ op-ed Toxic Assets Were Hidden Assets. To oversimplify his multifaceted blog, Paul argues that XBRL is part of the solution to the current inadequacies in recording, tracking, and analyzing toxic asset data that de Soto describes in his piece.

The Peruvian economist’s thinking had also figured prominently in a post I wrote a year ago on XBRL and microfinance institutions, which provide microloans to the poor in developing nations. In his masterwork The Mystery of Capital, de Soto described how, contrary to received opinion, the poor entrepreneurs in these countries possess real assets (in the form of land), business acumen, and the same "animal spirits" as small business people in advanced economies; what they lack is a system for legal ownership of their property, making it impossible for them to borrow. Microfinance institutions provide the capital they need, and their repayment rates have been excellent. The XBRL data standard has been adopted by the Microfinance Information Exchange (MIX) for the detailed financial and social performance information generated by leading microfinance institutions.

Paul's post made me wonder what it is about de Soto's arguments that they simultaneously speak so powerfully about two asset classes that couldn't be further apart on the economic hierarchy, namely, (1) the unregistered land holdings of the poor in countries like Peru, Haiti, and Indonesia, and (2) the quant-driven derivatives traded by the 21st century Masters of the Universe (well, until recently) in New York and London.  Why do de Soto’s writings, which so far as I know never mention interactive data, inspire calls for XBRL-based solutions?

The question demanded I read The Mystery of Capital. Reviewing the front and back matter first, I noted in the acknowledgments that the book's title was suggested by Bob Litan of the left-of-center Brookings Institution, while the text was worked by David Frum, a noted right-leaning pundit. Moreover, the book had received kind words from both liberal organs like the New York Review of Books and conservative journals like Commentary. Clearly we weren't in the balkanized states of Olbermann and Hannity anymore.

Read against the context of TARP bailouts and a bankrupt financial sector, de Soto provides a startling reminder of how parts of our economy have descended into Third World status. Here's the introduction to the chapter entitled The Mystery of Missing Information:

Imagine a country where nobody can identify who owns what, addresses cannot be easily verified, people cannot be made to pay their debts, resources cannot be conveniently turned into money, ownership cannot be divided into shares, descriptions of assets are not standardized and cannot be easily compared….You have just put yourself into the life of a developing country.

How distressingly similar that is to the toxic-asset sector described in de Soto's WSJ article:

These derivatives are the root of the credit crunch. Why? Unlike all other property paper, derivatives are not required by law to be recorded, continually tracked and tied to the assets they represent. Nobody knows precisely how many there are, where they are, and who is finally accountable for them. Thus, there is widespread fear that potential borrowers and recipients of capital with too many nonperforming derivatives will be unable to repay their loans.

Whether it's land in the poor neighborhoods of Port-au-Prince or securities cooked up by the Best and the Brightest on Wall Street, assets' value is crucially tied to the efficiency and efficacy of their recordkeeping systems. From de Soto's WSJ article:

Property is much more than a body of norms. It is also a huge information system that processes raw data until it is transformed into facts that can be tested for truth, and thereby destroys the main catalysts of recessions and panics — ambiguity and opacity.

De Soto spends a good part of his book describing how the legal system for real property we take for granted today developed over many decades in the United States. In the beginning:

Every farm or settlement recorded its assets and the rules governing them in rudimentary ledgers, symbols, or oral testimony. But the information was atomized, dispersed, and not available to any one agent at any given moment. As we know too well today, an abundance of facts is not necessarily an abundance of knowledge. For knowledge to be functional, advanced nations have to integrate into one comprehensive system all their loose and isolated data about property.

Unlike the recording of securities based on real property, recordkeeping for U.S. real property itself is excellent. But it is the end result of a complex historical process extending 200 years. In his book, de Soto pays particular attention to the abundance of squatters who attempted to establish ownership through extralegal bodies like claim associations and miners' organizations. Only after decades-long battles about title in the 18th and 19th centuries were extralegal and legal systems of property combined and these conflicts ultimately resolved. The economic success the U.S. subsequently enjoyed depended vitally on that resolution.

In contrast, recordkeeping for mortgage-backed securities (MBS) is a throwback to the chaos of earlier times. In a sense, it is a betrayal of our economic heritage. Here's how Philip Moyer of EDGAR Online describes MBS reporting in his January 2009 whitepaper Bringing Transparency to the Mortgage-backed Securities Market:

As a loan moves through the many participants in the MBS supply chain, each member of the supply chain — originators, retail banks, wholesale banks, issuers, servicers and ratings agencies — decides what to report publicly and when to report it. Additionally, all players use different report formats, different data labels, different ways of tracking the status of the collateral and even different models for tracking the identity of the individual loans. As a result, a loan can receive as many as five unique IDs between its origination and when it is bundled into an MBS. There is no centralized regulator that validates or collects all of this data. There is no central repository that can be queried… Therefore, it is difficult to track the status of a single loan in an MBS — even if it is in default — because every participant has completely different reporting models. The industry is awash in a sea of disconnected and incomparable data.

How can XBRL help? On pages 17-18 of his paper, Moyer describes eight benefits of using XBRL in a centralized MBS reporting system instead of current document/spreadsheet formats:

  • Data Quality   Consistent labels and content, a structure for organizing data, and, most important, "a model to validate that the structure and the content of a report adheres to appropriate standards."
  • Security and Privacy   Traceability and detailed control of the information reported.
  • Historical Comparability   Versioning and an enduring structure allow comparisons regardless of changes in reporting requirements over time.
  • Platform Compatibility   XBRL is platform agnostic.
  • Multicurrency and Multilingual   Investors can view data in their own language and multiple currencies.
  • Reduced Reporting Costs   Sunk costs for existing regulatory and banking XBRL reporting can be leveraged.
  • Reduced Data Processing Costs   Receiving better data from upstream business partners offsets the costs incurred by each member of the supply chain.
  • International Industry Standard   Leveraging the existing body of knowledge for XBRL saves resources that would otherwise go to resolving technical and financial reporting problems.

XBRL is not a cure-all, of course. To the extent that the data continues to be plagued by incompleteness and both intentional and unintentional error, MBS reporting still will have significant problems.

Nevertheless, introducing XBRL into the process will go a long way to providing the consistent, comparable, and accessible information that de Soto recognizes as an essential element in asset value, no matter what the asset class.
 

EBRC Proposes New XBRL Taxonomy for the MD&A

Written by Bob Schneider
Posted on April 15, 2009

Written by Bob Laux     Posted on April 15, 2009 

Bob Laux is the Senior Director of Financial Accounting and Reporting at Microsoft Corporation. He is responsible for Microsoft’s technical accounting, including interacting with and responding to accounting standard setters on numerous issues. Prior to joining Microsoft in 2000, Mr. Laux was an Industry Fellow at the Financial Accounting Standards Board (FASB) where he was responsible for coordinating the activities of the Emerging Issues Task Force.

As indicated in previous posts on this blog, the SEC’s rule for interactive data is a major milestone in XBRL implementation.  However, it is important that there are continued improvements in XBRL taxonomies, especially with respect to disclosures outside the financial statements and footnotes.

The SEC rule made specific reference to disclosures outside the financial statements and footnotes:

We did not propose, and are not adopting, a requirement that filers provide interactive data for their Management’s Discussion and Analysis (MD&A), executive compensation, or other financial, statistical or narrative disclosure . . . In deciding not to require the tagging of this information at this time, we agree with the commenters who believed that more experience with interactive data and a greater understanding of the costs and time associated with compliance with the requirements as proposed is needed before expanding the requirement to other information. We will continue to consider, however, the advisability of permissible optional or required interactive data for disclosures made outside a set of financial statements prepared in accordance with U.S. GAAP or IFRS as issued by the IASB or related financial statement schedules required under Commission rules (33-9002, pages 40-41).

The Enhanced Business Reporting Consortium (EBRC) is currently in the process of improving the taxonomy for Management’s Discussion and Analysis (MD&A). As a follow-on to the American Institute of Certified Public Accountants (AICPA) Special Committee on Enhanced Business Reporting, the EBRC was formed by four founding members: PricewaterhouseCoopers, Microsoft, Grant Thornton, and the AICPA. The mission of the EBRC is to establish a consortium of investors, creditors, regulators, management, and other stakeholders to improve the quality, integrity, and transparency of information used for decision-making. In particular, the EBRC is working on enhancing the reporting model to focus not only on financial information, but also on a range of contextual and nonfinancial information that provides an enriched understanding of company performance, value drivers, strategies, and potential.

The current MD&A taxonomy consists of approximately 70 tags with most of the elements at the same hierarchical level. This required the creation of numerous company-specific extensions to make the taxonomy useful for those companies that tagged their MD&A under the SEC’s Voluntary Filing Program. While the new taxonomy being proposed by the EBRC currently only consists of approximately 140 tags, it has a structure that should significantly simplify the tagging of the information as well as facilitate access for users of this information.

It is hoped that the financial reporting supply chain will openly collaborate on the proposed taxonomy in order to create more detailed tags so that narrative information can be provided in an interactive data format. Although the SEC is not requiring the tagging of narrative information at this time, the ability to tag and consume nonfinancial information (i.e., the narrative type of disclosures that are made via MD&A and other channels of corporate communication) is increasingly useful to both companies and consumers of business information, and is vital to providing enhanced transparency in the markets.

The EBRC invites interested parties to review the proposed MD&A taxonomy at this website. Follow the instructions for logging in and the MD&A taxonomy will be midway down the navigation tree once you are logged in.
 

XBRL: An Interview with Chie Mitsui

Written by Bob Schneider
Posted on April 10, 2009

Chie Mitsui is a Data Analyst at Nomura Research Institute, where she is helping design NRI's new XBRL-enhanced services. She previously worked as system designer for information service products at Jiji Press, a Japanese financial news and data distribution company. Ms. Mitsui holds a master's in physics from the Tokyo University of Science and is pursuing her MBA.

1. In February you conducted a well-attended workshop at the Tokyo Stock Exchange that was sponsored by XBRL Japan. Could you tell us its purpose and who signed up for it?

XBRL has been implemented at both the Tokyo Stock Exchange’s TDnet and the Financial Services Agency’s EDINET. Naturally, interest in using XBRL for investment analysis has increased; analysts and investors want to know “What can XBRL do for me?” These end-users represent the final and key link in the financial reporting supply chain, yet their views on XBRLized disclosures haven’t been given great weight in the implementation process.

Given that TDnet’s mission is to deliver timely financial information to the investment community, there was a strong sense that such workshops are essential for helping end-users utilize the system’s XBRLized data.  The attendees were (primarily) analysts, data intermediaries, and representatives of securities firms.

2. Were they receptive to the need for XBRLized data?

I think many end-users are not satisfied with their current choices for retrieving financial data, namely, using the services of a data vendor or inputting the data themselves from a PDF into, say, Excel.  But that doesn't necessarily mean they are completely convinced of the need for XBRLized data.

Let me give you an example from our workshop. The XBRL tool we used has dialogs to choose elements on the calculation sheets and provides easy viewing for beginners. Once you construct an initial sheet with an opened XBRL file, you can use the same sheet for other files. When attendees made the calculations for the target company we selected, some users felt it would indeed be easier to simply stick with copying data from a PDF.

But what I emphasized to these users was that, while that might be true for a single company’s data, it’s not the case when you want to compare all companies in the same industry. That’s when XBRL really shows its colors.

We used XBRLized data for 3Q 2008 to calculate the ratio of accounts receivable to sales for both the target and a broad group of companies; this would help identify the risk in reported profits and potential cash flow problems. When we compared the results, the target company’s ratio was a little higher than average. The attendees did realize the usefulness of XBRL in this situation. Nevertheless, we did have some hurdles to overcome.
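The comparison described here comes down to simple arithmetic once the tagged values are in hand. The following Python sketch uses invented figures (not actual filings) and hypothetical company names to compute each company's ratio of receivables to sales and flag those above the group average.

```python
from statistics import mean

# Hypothetical figures in millions of yen; these are not actual filings.
companies = {
    "Target": {"receivables": 300.0, "sales": 1000.0},
    "PeerA": {"receivables": 200.0, "sales": 1000.0},
    "PeerB": {"receivables": 220.0, "sales": 1000.0},
}

def receivables_to_sales(figures):
    """Ratio of accounts receivable to sales for one company."""
    return figures["receivables"] / figures["sales"]

ratios = {name: receivables_to_sales(f) for name, f in companies.items()}
average = mean(ratios.values())

# Companies whose ratio exceeds the group average merit a closer look.
above_average = [name for name, ratio in ratios.items() if ratio > average]

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} (group average {average:.2f})")
```

With only three companies there is little advantage over a spreadsheet; the payoff comes when the same few lines run unchanged over every filer in an industry.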

3. Which were?       

For the first company we selected, the actual element was notes and accounts receivable-trade, and for most of the companies we selected this was the right choice. But for one company the element was just accounts receivable-trade, which doesn’t include notes receivable, and including that company generated an error message. The experience highlighted for attendees the special nature of using XBRL data.

When analysts and investors use data from data intermediaries, the data is normalized so that items are combined for better comparability. Having the original data allows for more specificity and better analysis, but it also raises issues like the one I just cited. If you want to compare a financial ratio for dozens of companies, you do not have time to check each instance for errors. You need to choose the right element to verify your hypothesis.
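One way to cope with the element mismatch described above, sketched in Python under stated assumptions, is to try a short list of candidate elements in order of preference and record which one was actually used. The element names, company names, and figures below are illustrative only and would need to be checked against the taxonomy in use.

```python
# Candidate element names in order of preference. These names are
# illustrative and should be checked against the actual taxonomy.
RECEIVABLE_ELEMENTS = [
    "NotesAndAccountsReceivableTrade",
    "AccountsReceivableTrade",
]

def find_receivables(facts):
    """Return (element_name, value) for the first candidate present."""
    for name in RECEIVABLE_ELEMENTS:
        if name in facts:
            return name, facts[name]
    raise KeyError("no receivables element found")

# Hypothetical filings: one company reports the combined element,
# the other reports accounts receivable only.
filings = {
    "CompanyA": {"NotesAndAccountsReceivableTrade": 500, "NetSales": 2000},
    "CompanyB": {"AccountsReceivableTrade": 120, "NetSales": 800},
}

results = {}
for company, facts in filings.items():
    element, value = find_receivables(facts)
    # Keep the element name so differing definitions stay visible.
    results[company] = (element, value / facts["NetSales"])

for company, (element, ratio) in results.items():
    print(f"{company}: {ratio:.2f} using {element}")
```

The fallback avoids the error message, but it does not make the two figures equivalent: one includes notes receivable and the other does not, so the analyst still has to decide whether the ratios are comparable.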

4. So in your view using TDnet’s XBRL data for comparing financial ratios of many Japanese companies simultaneously can prove difficult?

Well, it’s not a problem with XBRL itself; it’s the nature of Japanese financial reporting. The taxonomies used in the US reflect US GAAP accounting and have much more structure. The software enables users to identify the parent element, and they can use tree-like navigation to select the elements they need. Japanese taxonomies, in contrast, are relatively flat, and it is more difficult to know the relationships between elements. That is the real issue: how the taxonomies are built.

There are other significant differences in US and Japanese taxonomies. For example, attendees were surprised that US GAAP taxonomies had contexts that identify the specific time period. In contrast, Japanese taxonomies have contexts like CurrentYearConsolidatedDuration. That doesn’t mean you’re likely to confuse 2007 and 2008 data for Japanese companies; the taxonomy provides for that distinction. But it does require that you adjust your software settings accordingly.
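As a rough illustration of how software can recover the actual dates behind a relative label like CurrentYearConsolidatedDuration, the Python sketch below reads the context definitions in an instance document and maps each context id to its start and end dates. The fragment is trimmed and hypothetical: the dates are invented, and the entity information a complete context carries is left out.

```python
import xml.etree.ElementTree as ET

# Namespace of the XBRL 2.1 instance schema, in ElementTree's form.
XBRLI = "{http://www.xbrl.org/2003/instance}"

# Trimmed, hypothetical instance fragment; the dates are invented and the
# entity element that a complete context carries is omitted for brevity.
INSTANCE = """
<xbrli:xbrl xmlns:xbrli="http://www.xbrl.org/2003/instance">
  <xbrli:context id="CurrentYearConsolidatedDuration">
    <xbrli:period>
      <xbrli:startDate>2008-04-01</xbrli:startDate>
      <xbrli:endDate>2009-03-31</xbrli:endDate>
    </xbrli:period>
  </xbrli:context>
</xbrli:xbrl>
"""

def context_periods(xml_text):
    """Map each duration context id to its (start, end) dates."""
    root = ET.fromstring(xml_text)
    periods = {}
    for ctx in root.iter(f"{XBRLI}context"):
        start = ctx.findtext(f"{XBRLI}period/{XBRLI}startDate")
        end = ctx.findtext(f"{XBRLI}period/{XBRLI}endDate")
        # Contexts without both dates (e.g. instants) are skipped here.
        if start and end:
            periods[ctx.get("id")] = (start, end)
    return periods

periods = context_periods(INSTANCE)
print(periods["CurrentYearConsolidatedDuration"])
```

Resolving the label to dates before comparing filings keeps one company's "current year" from being lined up against a different period at another company.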

5. So you see consequences for international comparisons as well?

I do. Of course, institutional investors, as well as some individual investors, trade equities on an international basis. Differing accounting standards among nations naturally result in differing taxonomies. But beyond the dissimilarities that result from accounting standards, the nature of XBRL taxonomies, the naming of elements, and so forth can vary among countries in important ways. So for now, sector analysts who want to analyze their industries on a worldwide basis will encounter comparability issues directly related to XBRL itself.

At the same time, I see opportunities for improved financial reporting through XBRL. Items that have been traditionally combined for paper-based reporting, such as notes and accounts receivable, can be disaggregated and reported separately with XBRL. More granularity will mean more accurate reporting.

6. What was the main conclusion you drew from the workshop?

The most important thing I came away with is the need for end-users to be part of the XBRL conversation. After all, all of the work of XBRL implementation is ultimately aimed at them. This meeting was a good opportunity for their voices to be heard, and I hope there will be more such meetings in the future.