The Entity Problem
Written by David vun Kannon. Posted January 16, 2007.
David vun Kannon was one of the first Co-Chairs of the XBRL Specification Working Group and has been an Editor of every version of the XBRL Specification. He is a Director for PricewaterhouseCoopers, LLP.
In my blog post last week, I discussed in general how XBRL could be applied to problems in Master Data (metadata) Management. In this post, I’d like to focus on the problem of MDM for business entities.
XBRL assumes that entities exist, and that they are one of the fundamental objects about which people want to know facts. That is why an entity is a necessary part of the context for every fact in an XBRL instance document. On the other hand, facts that don’t apply to an entity (or apply to all entities) are a poor match for XBRL. How, for example, should the value of pi be represented in XBRL?
XBRL also assumes that all entities can be named, somehow, but doesn’t create or force the use of a particular naming scheme. There are many naming systems for entities, and the instance author can choose the most appropriate one. The SEC assigns CIK numbers to companies for filing purposes; the FFIEC variously uses RSSD IDs, Cert numbers, and OCC charter numbers. Dun and Bradstreet creates DUNS numbers.
All very nice, but what happens when I want to combine XBRL data from a 10-K, a call report, and a credit report for the same bank? The combined instance documents don’t contain any clue that this CIK, that Cert, and that DUNS number all refer to the same legal entity.
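To make the mismatch concrete, here is a minimal Python sketch of what a naive merge sees. It assumes each fact carries the entity identifier from its context as a (scheme, identifier) pair. The scheme URIs, identifier values, concept names, and amounts are all made-up placeholders for illustration, not real filing data.

```python
from collections import defaultdict

# Three facts about the same bank, taken from three different reports.
# Every scheme URI, identifier, and value here is an illustrative placeholder.
facts = [
    {"source": "10-K", "scheme": "http://example.com/cik",
     "identifier": "0000000001", "concept": "Assets", "value": 1000},
    {"source": "call report", "scheme": "http://example.com/cert",
     "identifier": "99999", "concept": "Assets", "value": 1000},
    {"source": "credit report", "scheme": "http://example.com/duns",
     "identifier": "123456789", "concept": "CreditRating", "value": 80},
]


def group_by_entity(facts):
    """Group facts by the only entity key the instances provide."""
    grouped = defaultdict(list)
    for fact in facts:
        grouped[(fact["scheme"], fact["identifier"])].append(fact)
    return dict(grouped)


groups = group_by_entity(facts)
print(len(groups))  # 3: one bank looks like three unrelated entities
```

Nothing in the data itself tells the merge that the three keys belong together, so the grouping is technically correct and practically useless.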
I call this the identity management problem in XBRL. It crops up in many industries; the KYC (Know Your Customer) rules in banking are one example. I suffer my own personal version of the problem because my name has an unusual spelling, and many people and computers assume I don’t know how to spell my own name. Mail arriving at my home is frequently addressed to Mr Vun, Mr Kannon, Mr Vonkannon, Mr Vukahaho... you get the picture. Direct mail advertisers must think a small militia has encamped on my property.
This problem of multiple identities in different schemes is an error when it comes to junk mail. But if I were a bank (BankDavid, the friendly bank) it would be an unavoidable fact of life.
XBRL taxonomies to the rescue. We can define an abstract element that represents the legal entity BankDavid, and associate with that abstraction the identifiers it has been given in different identification schemes, such as the CIK, Cert, and DUNS.

This kind of linkbase could be shoehorned into the standard XBRL reference linkbase, but it is probably more appropriate to put this data into its own kind of linkbase (a use of the generic linkbase concept).
As a data pattern, this is the classic hub and spoke model. Adding a new identifier magnifies the value of all previously linked identifiers, an example of Metcalfe’s Law.
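The hub and spoke idea can be sketched in a few lines of Python. This is only an illustration of the data pattern, not an XBRL linkbase: the class name `EntityRegistry`, its methods, and all identifier values are hypothetical, and the sketch assumes each entity has at most one identifier per scheme.

```python
class EntityRegistry:
    """Hub and spoke map: one abstract entity (hub), many identifiers (spokes)."""

    def __init__(self):
        self._hub_of = {}   # (scheme, identifier) -> hub name
        self._spokes = {}   # hub name -> {scheme: identifier}

    def link(self, hub, scheme, identifier):
        """Attach an identifier in some scheme to an abstract entity."""
        key = (scheme, identifier)
        existing = self._hub_of.get(key)
        if existing is not None and existing != hub:
            raise ValueError(f"{key} is already linked to {existing}")
        self._hub_of[key] = hub
        self._spokes.setdefault(hub, {})[scheme] = identifier

    def resolve(self, scheme, identifier):
        """Return the abstract entity for an identifier, or None if unknown."""
        return self._hub_of.get((scheme, identifier))

    def translate(self, scheme, identifier, target_scheme):
        """Go from one scheme to another by passing through the hub."""
        hub = self.resolve(scheme, identifier)
        if hub is None:
            return None
        return self._spokes[hub].get(target_scheme)

    def translatable_pairs(self, hub):
        """Count the ordered scheme-to-scheme translations the hub enables."""
        n = len(self._spokes.get(hub, {}))
        return n * (n - 1)


# Placeholder identifiers for the fictional BankDavid.
registry = EntityRegistry()
registry.link("BankDavid", "CIK", "0000000001")
registry.link("BankDavid", "Cert", "99999")
registry.link("BankDavid", "DUNS", "123456789")

duns = registry.translate("CIK", "0000000001", "DUNS")
pairs_before = registry.translatable_pairs("BankDavid")

# Linking one more identifier makes every earlier one more useful.
registry.link("BankDavid", "RSSD", "7654321")
pairs_after = registry.translatable_pairs("BankDavid")
```

With three identifiers linked, there are six possible scheme-to-scheme translations; adding a fourth raises that to twelve, even though only one new link was recorded. That growth is the Metcalfe’s Law effect in miniature.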
As an example of the problems of metadata management, we only need a few thousand elements to capture all the publicly traded companies in the US. This is similar in scale to the set of financial reporting concepts in the US GAAP taxonomy, and well within the scale of XBRL implementations. By comparison, it is vastly smaller than what even the smallest data warehouse is designed for.

