<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Admitting the Obvious About XBRL</title>
	<atom:link href="http://hitachidatainteractive.com/2009/06/30/admitting-the-obvious-about-xbrl/feed/" rel="self" type="application/rss+xml" />
	<link>http://hitachidatainteractive.com/2009/06/30/admitting-the-obvious-about-xbrl/</link>
	<description>XBRL News and Commentary from the Hitachi XBRL Business Unit</description>
	<lastBuildDate>Wed, 10 Mar 2010 05:20:53 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: David vun Kannon</title>
		<link>http://hitachidatainteractive.com/2009/06/30/admitting-the-obvious-about-xbrl/comment-page-1/#comment-30388</link>
		<dc:creator>David vun Kannon</dc:creator>
		<pubDate>Sun, 05 Jul 2009 14:34:03 +0000</pubDate>
		<guid isPermaLink="false">http://hitachidatainteractive.com/?p=651#comment-30388</guid>
		<description>Kurt, I think the discussion has become a lot more interesting now that you are bringing it down to a specific use case. This addresses one of my principal criticisms of your first blog post.

Let&#039;s begin by noting two distinct use cases - one for data points and one for formatted disclosures. If you think about your hypothetical request to the SEC, you are asking for data, an arbitrary set of data. Responding with whole documents is probably not the right way to answer this query! You&#039;ve got a nice WHERE clause, but your SELECT needs work. For example, look at the posts by Peter Haynesworth on xbrl-public. He&#039;s the kind of financial analyst you are thinking of, and all he wants is five or six data points on every Rhode Island based company. (I&#039;m not sure the SEC will ever have a web service like this, but other data intermediaries will.)

The other use case is for specific disclosures – data with structure and presentational qualities. For a number of netbook and web-phone uses, I do not think that client-side mashup is the right way to structure this, it should be formatted on the server.

You could possibly satisfy both of these “last mile” use cases with Inline XBRL – XBRL instance data wrapped in XHTML. The server side app is not going to send either user the entire US GAAP Taxonomy in one go, no more than any AJAX app dumps an entire database onto the client. The trick is to know how to chunk the data. The server side of “HyperAnalyst” is going to intermediate between the  web clients and the regulatory filings. It has to, because most queries are going to ask for comparative data across several filings from different companies.

It&#039;s not a question of “Is XBRL lightweight enough?” It is a question of “How do I store XBRL in a RDBMS or CMS for most effective satisfaction of the expected queries?”

Can you substitute JSON or YAML data structures for the XBRL? Sure, but then you are erecting a semantic firewall between the user and the source data.

The somewhat more substantive issue underneath all of this is whether XBRL needs to shift from a syntax based standard to an infoset/model based standard with official implementations of that model in different syntaxes. Several years ago I argued against such a move, but it is a question that should be periodically re-examined.</description>
		<content:encoded><![CDATA[<p>Kurt, I think the discussion has become a lot more interesting now that you are bringing it down to a specific use case. This addresses one of my principal criticisms of your first blog post.</p>
<p>Let&#8217;s begin by noting two distinct use cases &#8211; one for data points and one for formatted disclosures. If you think about your hypothetical request to the SEC, you are asking for data, an arbitrary set of data. Responding with whole documents is probably not the right way to answer this query! You&#8217;ve got a nice WHERE clause, but your SELECT needs work. For example, look at the posts by Peter Haynesworth on xbrl-public. He&#8217;s the kind of financial analyst you are thinking of, and all he wants is five or six data points on every Rhode Island based company. (I&#8217;m not sure the SEC will ever have a web service like this, but other data intermediaries will.)</p>
<p>The other use case is for specific disclosures – data with structure and presentational qualities. For a number of netbook and web-phone uses, I do not think that client-side mashup is the right way to structure this, it should be formatted on the server.</p>
<p>You could possibly satisfy both of these “last mile” use cases with Inline XBRL – XBRL instance data wrapped in XHTML. The server side app is not going to send either user the entire US GAAP Taxonomy in one go, no more than any AJAX app dumps an entire database onto the client. The trick is to know how to chunk the data. The server side of “HyperAnalyst” is going to intermediate between the  web clients and the regulatory filings. It has to, because most queries are going to ask for comparative data across several filings from different companies.</p>
<p>It&#8217;s not a question of “Is XBRL lightweight enough?” It is a question of “How do I store XBRL in a RDBMS or CMS for most effective satisfaction of the expected queries?”</p>
<p>Can you substitute JSON or YAML data structures for the XBRL? Sure, but then you are erecting a semantic firewall between the user and the source data.</p>
<p>The somewhat more substantive issue underneath all of this is whether XBRL needs to shift from a syntax based standard to an infoset/model based standard with official implementations of that model in different syntaxes. Several years ago I argued against such a move, but it is a question that should be periodically re-examined.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paul Wilkinson</title>
		<link>http://hitachidatainteractive.com/2009/06/30/admitting-the-obvious-about-xbrl/comment-page-1/#comment-30185</link>
		<dc:creator>Paul Wilkinson</dc:creator>
		<pubDate>Thu, 02 Jul 2009 14:34:26 +0000</pubDate>
		<guid isPermaLink="false">http://hitachidatainteractive.com/?p=651#comment-30185</guid>
		<description>Readers may be interested in conversations along these lines at the Web site soliciting comment on technology to expedite economic recovery:

http://www.thenationaldialogue.org/ideas/data-quality-and-data-standards-are-key-to-gaining-public-trust

One clarification: By &quot;pro-sumer,&quot; I mean the Toffler concept: http://en.wikipedia.org/wiki/Prosumer.

If David Weinberger is correct that &quot;Everything is Miscellaneous,&quot; my attempt to apply a taxonomy to data users may be misguided. In any case, I don&#039;t think DVK is saying Web technical users are unimportant. David? I agree Web technical users are vital -- and must also have standards available to them so they can make it as easy as possible for non-technical and deep-technical users alike to move from documents to data. Of course, standards should be as light as possible, but no lighter. I look forward to witnessing progress along all of these lines at the Santa Clara conference, July 28-30 (http://xbrl.us/News/Pages/20090629.aspx).

Today&#039;s unemployment data underscore the importance of fast action to improve market trust and confidence -- and bring to mind Elvis:

A little less conversation, a little more action please
All this aggravation aint satisfactioning me
A little more bite and a little less bark
A little less fight and a little more spark

Or, to paraphrase another thinker, it&#039;s not important how it happens, just that it happens -- it being better information for economic growth. Notwithstanding Elvis, whatever David, Kurt, and others can do to break down data standard silos can only expedite progress. That requires conversations like yours. Keep up the good work!</description>
		<content:encoded><![CDATA[<p>Readers may be interested in conversations along these lines at the Web site soliciting comment on technology to expedite economic recovery:</p>
<p><a href="http://www.thenationaldialogue.org/ideas/data-quality-and-data-standards-are-key-to-gaining-public-trust" rel="nofollow">http://www.thenationaldialogue.org/ideas/data-quality-and-data-standards-are-key-to-gaining-public-trust</a></p>
<p>One clarification: By &#8220;pro-sumer,&#8221; I mean the Toffler concept: <a href="http://en.wikipedia.org/wiki/Prosumer" rel="nofollow">http://en.wikipedia.org/wiki/Prosumer</a>.</p>
<p>If David Weinberger is correct that &#8220;Everything is Miscellaneous,&#8221; my attempt to apply a taxonomy to data users may be misguided. In any case, I don&#8217;t think DVK is saying Web technical users are unimportant. David? I agree Web technical users are vital &#8212; and must also have standards available to them so they can make it as easy as possible for non-technical and deep-technical users alike to move from documents to data. Of course, standards should be as light as possible, but no lighter. I look forward to witnessing progress along all of these lines at the Santa Clara conference, July 28-30 (<a href="http://xbrl.us/News/Pages/20090629.aspx" rel="nofollow">http://xbrl.us/News/Pages/20090629.aspx</a>).</p>
<p>Today&#8217;s unemployment data underscore the importance of fast action to improve market trust and confidence &#8212; and bring to mind Elvis:</p>
<p>A little less conversation, a little more action please<br />
All this aggravation aint satisfactioning me<br />
A little more bite and a little less bark<br />
A little less fight and a little more spark</p>
<p>Or, to paraphrase another thinker, it&#8217;s not important how it happens, just that it happens &#8212; it being better information for economic growth. Notwithstanding Elvis, whatever David, Kurt, and others can do to break down data standard silos can only expedite progress. That requires conversations like yours. Keep up the good work!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kurt Cagle</title>
		<link>http://hitachidatainteractive.com/2009/06/30/admitting-the-obvious-about-xbrl/comment-page-1/#comment-30156</link>
		<dc:creator>Kurt Cagle</dc:creator>
		<pubDate>Thu, 02 Jul 2009 07:13:20 +0000</pubDate>
		<guid isPermaLink="false">http://hitachidatainteractive.com/?p=651#comment-30156</guid>
		<description>David and Paul,

This is turning into a very interesting discussion, though I’d like to see it move away from being confrontational (and will endeavor to do the same on my end). To make it a little easier to address points, I’m going to number them, in response to both comments.

1. &lt;b&gt;The Simplicity Principle.&lt;/b&gt; I think that Paul gives a couple of very good examples, perhaps not intentionally, about the simplicity aspect. I know Tim Bray (have even had the rare privilege of his kids and mine playing together once or twice) and while I can’t speak for him, I suspect that what he’s saying actually is quite applicable. Paul lays out the distinction for the specification as non-expert consumers (lay people), expert consumers (analysts), and pro-sumers (regulators &amp; Fortune 500/1000 accountants). While I think this is a valid division, I’d also recognize a second dimension - non-technical, deep technical users, and web technical users, which I’d define as follows:

&lt;b&gt;Non-Technical Users.&lt;/b&gt; These are people that may have some accounting, finance or investing skill, but that do not have the technical ability to build applications that can utilize XBRL in any manner, and as such, are dependent upon tools that will produce meaningful results from XBRL resources.
&lt;b&gt;Deep-Technical Users.&lt;/b&gt; These are application developers that are deeply skilled with XBRL, have the technical acumen to integrate these into commercial applications or pipeline processing applications (such as BPEL), and are usually employed by vendors developing tools in this space.
&lt;b&gt;Web Technical Users.&lt;/b&gt; These are people who work with data streams and data processors to build what are called “mashups” in the vernacular, though I think a better term for these people is integration developers.

David’s contention is that this third group of people (if I’m reading your last post correctly) are not that important to the evolution of XBRL. My contention is that they are in fact critical, which is in fact the underlying thesis of the article that I originally wrote.

There are four primary trends going on in application development circles at this stage, and each of these has some impact upon XBRL. The first is the rise of RESTful Web Services. What this means in practical terms is that we are moving to a deployment model in which resource URLs act as front ends to databases, HTTP GET, POST, PUT and DELETE become primary mechanisms for interacting with databases, and parameterizing these URLs (essentially binding them to queries within the database) can be used to return feeds of content that are custom to a particular goal.

For instance, I should be able to point to the SEC&#039;s website relatively soon with a URL of the form 

&lt;code&gt;http://www.sec.gov/xbrls?q=total-assets:1.2B,5.5B;reporting-quarters:1Q08-3Q09;industry:pharmaceuticals&lt;/code&gt;

and get a feed (either RSS, Atom or possibly plain old XML) which contains links to all XBRL filings that satisfy the above constraints (the specific URL is fictional, just used for illustration purposes).
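
As a rough illustration of how a client or server might unpack such a query, here is a minimal Python sketch. It assumes the fictional URL form above, with constraints separated by semicolons and each name separated from its value by a colon; the function name parse_constraints is hypothetical and not part of any SEC service.

```python
from urllib.parse import urlsplit

# Fictional URL, following the illustrative form used above.
URL = ("http://www.sec.gov/xbrls?q=total-assets:1.2B,5.5B;"
       "reporting-quarters:1Q08-3Q09;industry:pharmaceuticals")


def parse_constraints(url):
    """Split the q parameter into a dict of constraint name to raw value."""
    query = urlsplit(url).query
    name, _, value = query.partition("=")
    if name != "q":
        raise ValueError("expected a single q parameter")
    constraints = {}
    for clause in value.split(";"):
        # Each clause looks like name:value (assumed syntax).
        key, _, raw = clause.partition(":")
        constraints[key] = raw
    return constraints


constraints = parse_constraints(URL)
low, high = constraints["total-assets"].split(",")
print(constraints)
print(low, high)
```

A real service would go on to bind each constraint to a database query; the sketch stops at parsing because the query syntax itself is only illustrative.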

The links in turn would return the XBRL documents in forms that could be utilized by the lay consumer (from simple HTML based report listings for that XBRL) to the analyst (forms that provide graphical analysis of all pharmaceuticals in that domain) and to the regulator (showing trend analysis for company X compared to the industry overall). The mechanisms for generating these different faces come about via XQuery or XSLT Transformations, or via jQuery or similar components.

Similarly, I should be able to post an XBRL document to the SEC, probably along a secure HTTPS conduit, which would then be validated, processed and made available as part of the feeds described above. What&#039;s more, this same kind of mechanism also allows for the development of web enabled tools (XForms, XBL based or one of a couple of dozen AJAX toolkits from TIBCO to jQuery) to allow for the creation of XBRL content, right from your browser.

Such information, moreover, can be processed in such ways as to provide a certain degree of opacity as necessary for the companies in question in order to prevent exposure of sensitive content - through role-based filters or similar mechanisms. This type of mechanism is one of the key characteristics of medical record systems (which I have worked on, in the HL7 space) and is instrumental to being able to provide adequate adoption of records in that space. 

This kind of architecture is now becoming prevalent in computing circles, and similar architectures exist for working with such content in XQuery, Ruby, Python, RESTful Java Servlets and .NET, among others.

This technology is here. It has proven itself on smaller scale implementations, and is now scaling its way up to domain content in areas as intractable as health records, government specifications and, yes, business reporting formats. Thus, when I&#039;m talking about the kid building mashups, keep in mind that this kid is a programmer working for the SEC or fill in the blank Fortune 1000 company.

The second trend that&#039;s occurring at this stage is the evolution of document-centric (and property centric) databases. When XBRL was initially designed, most of the databases that were available to handle processing were relational SQL databases, and it shows clearly in the underlying design of XBRL, in which structures are maintained in as flat a manner as possible in order to better handle such databases. 

However, since 2007, there has been a profound shift in the market towards the emergence of XML-capable databases, most of them employing XQuery.  In mid-2009, as I write this, every major database has an XML database offering, often integrated directly into their next generation product line. Companies also have dedicated XML databases that are completely optimized for XML querying, able to serve up XML content in a small fraction of the time of established vendor systems, and there are even a number of very capable XML databases such as eXist-db available in the open source realm.

Most of my comments about XBRL&#039;s lack of optimization come from the fact that the linkbase structures that XBRL employs are the least efficient way of being able to pull together relevant data within the technologies employed by these databases, decreasing performance in some cases by an order of magnitude or more compared to document structures. I know - I&#039;ve run a number of benchmarks comparing multiply linked flat content with document-centric content (XBRL documents, specifically from the SEC), and the performance from such unoptimized XBRL frankly sucks. The most efficient solution is, upon importing such content from the SEC feed, to actually reify it into folded document structures and use that, which makes the accessibility of the documents reasonable for queries. This is my central argument: if canonical XBRL is in fact this inefficient, perhaps an alternate format that is more document-like may be more amenable to processing.

The next trend that is occurring is the move towards client modularization. We&#039;re moving away from storing HTML content and moving towards streaming of structured data content (mostly XML and JSON, but occasionally YAML or similar formats) to web-enabled components that can then handle conversion of this structured data into an appropriate output format. In this case, small and lightweight matter.

To give you an example of this, consider the case of a hypothetical tool called HyperAnalyst. It&#039;s an extension to Firefox or (soon) Google Chrome that will query a repository of financial reports in XBRL* as illustrated above, and receive a feed to the relevant reports through an appropriate web or web-like gui. This component may be a browser extension. It may be an inline component. It may be a stand-alone widget built on Webkit or similar application, or even an iPhone app, but the key is that it will be accessible at a moment&#039;s notice, and it will allow for fairly sophisticated metrics and similar comparisons through services. The tool may also grab FinancialXML and UML or custom XML or JSON from the NYSE or Bourse.

The question I&#039;d raise is whether XBRL, as it exists now, is lightweight enough to work within such an environment. Your response time is no longer going to be acceptable if it takes several minutes to download a set of five or six large XML files on even a high bandwidth system, and if that code also needs to renormalize the internal references and bind that to the appropriate presentation layers for validation, this application is going to be a dog performance-wise.

Is that analyst not in your use case for XBRL? Most financial analysts and financial reporters that I know are as likely to be watching the market over their iPhone or Netbook while talking with clients or investigating trends while sitting at Starbucks. Those are tomorrow&#039;s platforms (and rapidly are becoming today&#039;s platforms).

The final trend is the shift from static to streaming content. The scenario that I see here is one that I expect is going to be extraordinarily common for areas such as carbon markets (because they are essentially run by the same young turks that are writing those widgets), but will eventually shift to most equity and commodity markets. In this particular case, what you see within a company is a RESTful XBRL server that is tied into the financial operating system of a company. What this means is that, far from reporting information on an annual or quarterly basis, this information is being updated &lt;i&gt;in real time&lt;/i&gt; - the financials of a given company are essentially up to date as of the last day, or hour, or even minute.

This scenario is of course one of the holy grails of XBRL (and should be if it&#039;s not) - real time financial reporting is what makes this technology so compelling. Do I think that banks and big multinationals are going to move to this? No (at least not immediately), but I also think that banks and big multinationals actually represent only a small portion of the overall potential for XBRL as a technology. This technology will become dominant with new companies that are going to come out of the current economic carnage, because they will see it as a selling point - their financials are transparent, people can trust them. Again, I think that if such a system becomes reliant upon a complex format that doesn&#039;t integrate cleanly with their XML pipelines, that doesn&#039;t respect the idea of services, and that isn&#039;t performant, then they will seek solutions that are - and XBRL will become an also-ran technology.

This is the case to be made for lightweight protocols that Tim Bray, Jon Udell and others are advocating. Both Tim and Jon and a number of others that I know in the technology community that are following emerging technologies (and in some cases creating those same technologies) understand that the above scenario is happening, and recognize that simply having a common accounting language is not enough. Some concession needs to be made to its use within the web in order for that technology to succeed.

BTW, there are certainly precedents for this. The Geography Markup Language was formulated by the OGC at about the same time as XBRL, and shares certain similarities in terms of complexity and weight. As more and more people used GML, they came to realize that it was not sufficiently performant for web use, and two different standards - one based upon Google&#039;s KML format for GoogleEarth, the second, GeoRSS, based upon the Atom specification, were developed to provide top-level metadata and a packaging mechanism over syndication channels. These proved so successful that the OGC incorporated them as part of the GML specification, essentially as wire formats.

This is what I was referring to in my original posting that prompted this discussion. There is in fact no need to radically change the underlying XBRL model, but such wire formats would automatically perform the task that currently needs to be handled programmatically - denormalize the contexts, associate labels, calculations, and reference bindings to property nodes as attributes, consolidate the XSD schema, and bind them together as a single XML document, identifying contexts so that in those cases where you have one-to-many associations you can create idrefs back to the initial id.
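
To make the context-folding step concrete, here is a minimal Python sketch using only the standard library. It is an assumption-laden toy, not a proposed wire format: the sample instance, the un-namespaced Assets fact, the placeholder identifier and the helper names q, build_sample and fold_contexts are all hypothetical, and only instant periods are handled.

```python
import xml.etree.ElementTree as ET

XBRLI = "http://www.xbrl.org/2003/instance"  # XBRL 2.1 instance namespace


def q(tag):
    """Return the Clark-notation name of an xbrli element."""
    return "{%s}%s" % (XBRLI, tag)


def build_sample():
    """Build a tiny instance: one context plus one fact that refers to it."""
    root = ET.Element(q("xbrl"))
    ctx = ET.SubElement(root, q("context"), {"id": "c1"})
    entity = ET.SubElement(ctx, q("entity"))
    ident = ET.SubElement(entity, q("identifier"),
                          {"scheme": "http://www.sec.gov/CIK"})
    ident.text = "0000000000"  # placeholder identifier
    period = ET.SubElement(ctx, q("period"))
    ET.SubElement(period, q("instant")).text = "2009-03-31"
    fact = ET.SubElement(root, "Assets", {"contextRef": "c1"})
    fact.text = "1200000000"
    return root


def fold_contexts(instance):
    """Copy each fact's entity and instant onto the fact as attributes."""
    contexts = {}
    for ctx in instance.findall(q("context")):
        raw = {
            "entity": ctx.findtext("%s/%s" % (q("entity"), q("identifier"))),
            "instant": ctx.findtext("%s/%s" % (q("period"), q("instant"))),
        }
        # Drop parts that are absent so every attribute value is a string.
        contexts[ctx.get("id")] = {k: v for k, v in raw.items() if v is not None}
    folded = ET.Element("facts")
    for child in instance:
        ref = child.get("contextRef")
        if ref is None:
            continue  # contexts and other non-fact children are skipped
        out = ET.SubElement(folded, child.tag, contexts[ref])
        out.text = child.text
    return folded


folded = fold_contexts(build_sample())
print(ET.tostring(folded, encoding="unicode"))
```

Labels, calculations and reference bindings could be attached in the same pass; they are left out here to keep the sketch short.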

2. &lt;b&gt;The Semantic Web.&lt;/b&gt; I endorse both document-centric development and semantic web structures for a very simple reason - they do very different things. XML is relatively lightweight, can be quickly indexed and queried and can be transformed readily into different formats, making it ideal for both web-centric and pipeline oriented applications (which are, in my experience, converging to the same thing). XML is not good for imputing relationships, because it is node-centric rather than link-centric, and as such, it has less utility for performing inferential analysis with other data that may not necessarily fit within its schema.

RDF/OWL, on the other hand, is edge-centric, and its primary mechanism is to create relational graphs. XML lets you compare apples and apples, and for many, many tasks, that is primarily what is needed. RDF/OWL lets you compare apples and oranges, and infer from that that both are fruit, both are roughly round shape, that they originate from different parts of the world and that one happens to be earning more in the Chicago commodities exchange.
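
The apples and oranges point can be sketched in a few lines of Python. This is a toy, assuming a hand-written set of subject, predicate, object tuples and a single subClassOf predicate; the names TRIPLES, superclasses and common_classes are hypothetical, and a real triple store with an OWL reasoner does far more.

```python
# Toy triples: (subject, predicate, object). Contents are illustrative only.
TRIPLES = {
    ("apple", "subClassOf", "fruit"),
    ("orange", "subClassOf", "fruit"),
    ("fruit", "subClassOf", "food"),
}


def superclasses(term, triples):
    """Return every class reachable from term by following subClassOf edges."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subj, pred, obj in triples:
            if subj == current and pred == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def common_classes(a, b, triples):
    """Infer the classes that two terms share."""
    return superclasses(a, triples).intersection(superclasses(b, triples))


shared = common_classes("apple", "orange", TRIPLES)
print(sorted(shared))
```

The walk over edges, rather than over a fixed tree, is what makes this kind of question natural in a graph model.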

As such, RDF/OWL requires both a different set of computational mechanisms and a different way of storing and working with that information. It is also, generally, a couple of orders of magnitude slower in processing time than XML because what you are doing with inferencing is creating networks of informational content on the fly - a significantly more complicated task than traversing trees. As such, SW (I don&#039;t like typing RDF/OWL all the time, so will use this as an abbreviation for Semantic Web) has different types of applicability than XML.

The fundamental challenge that XBRL has in the Semantic world is that the linkage architecture that comes from link bases is very primitive in comparison to what SW is able to bring to bear, from my understanding of both. There&#039;s a certain degree of inference which needs to be made in order to extract enough information out of an XBRL document to make it worthwhile. That doesn&#039;t mean it&#039;s not worth the effort, it just means that it is not a trivial operation.

One of the things I&#039;ve come to realize with regard to SW is that it really requires the advent of a powerful triples store database in order to gain significant benefit from it - just putting an XBRL document into an RDF or Turtle format doesn&#039;t gain you that much advantage, save that it makes it easier for such a database to consume the XBRL and build the triples linkages from it.

I think that this is an effort that should be seen as another &quot;format&quot; for XBRL, one for which you can write a single centralized transformation (such as being worked on now by Dave Raggett of the W3C). Its utility comes from the fact that once you have an XBRL document in RDF, it can be consumed by a triples database (and converted, as a consequence, into a set of triples) which in turn handles the associated mappings and bindings for creating appropriate &quot;lenses&quot; to view the data. This means that you can do such things as described above - being able to see how the R&amp;D investment vs. earnings ratios for those same pharmaceuticals correlate to the efficacy of treating depression pharmacologically, something that would be difficult to do with XML because what&#039;s being explored here are relationships, not nodal content.

Again, and I think this is an important point, what is being explored here are ways of utilizing the information of the models in formats that are appropriate to their respective needs. The XBRL format that was developed over the last decade did so largely as an institution-to-institution mechanism, and as such evolved to fill that niche. I believe it is a mistake to confuse the ontological design with the delivery format, and that it is important to recognize the benefit of having expressions of the language that can readily (and directly) work with other technologies such as the Semantic Web.

3. &lt;b&gt;Document Enrichment and Documentation.&lt;/b&gt; There is yet another middle ground that is being explored by the XBRL organization, and that is the notion of RDFa in the context of document enrichment. Enrichment&#039;s an interesting concept - by the use of attribute bindings you can associate information in a document with a particular conceptual namespace. There are two approaches that can be employed here. The first approach is that an accountant, probably via tools, can embed within an annual report written in HTML, DITA, ODF or WORDML, the relevant associations to property bindings against an XBRL ontology. The relationships so embedded can then be parsed by a GRDDL processor to generate RDF, which can then be consumed by a triples store.

The second approach is perhaps more intriguing, and that is to utilize this process in reverse to generate reports. In this case, the report writer establishes the RDFa elements in a document template, and, by running these against either an XQuery or SPARQL transformation (or an XSLT, for that matter), this populates the template with the appropriate values for the given context or operation. I see it being employed much more extensively now that the W3C Semantic Web Group has stepped up their XBRL Activity.

By the way, you also asked about the role that DITA plays, vis-a-vis XBRL, as well as its &quot;goodness&quot;. Again, my focus is essentially on utility. DITA is a topic-oriented documentation system, and, while it&#039;s something of a pain in the butt to set up, is very effective at being able to create discrete topical blocks that can, nonetheless, be compiled together via DITA maps into localized help and information context systems. In that regard, I think that XBRL and DITA are ideally suited for one another, especially as you see XBRL move from being a static format to being more of a services oriented technology. DITA can be transformed into HTML content, can embed XBRL/RDFa content and contexts, and can similarly be compiled into more extensive documentation, from CFM help files to large scale living annual reports.

4. &lt;b&gt;NIEM.&lt;/b&gt; There was a group of about five senior ontologists and information architects that collectively worked on the NIEM architecture from its origins in the DoJ to its current &quot;wildfire&quot; growth through much of the Federal government. I know two of them personally, and they are easily two of the best data architects I&#039;ve ever met, globally, so I&#039;d be careful about dismissing NIEM out of hand. More to the point, they began this effort largely under the radar of the Bush administration, and as such, the level of politicization was consequently very low.

I brought up NIEM to illustrate differences in architectural approach, not necessarily to endorse it as a better solution (though I think, obviously, that it is a good one). More to the point, I also think that it has the potential to be a competitive solution to XBRL within other governmental agencies, especially in areas such as procurement, resource accounting, financial contract management, and energy systems management. Without Chris Cox championing the standard at the SEC, and Mary Schapiro singularly cool about XBRL, I think that understanding specifications such as NIEM and how XBRL could be adapted to work in an IEP model might prove to be most prudent.

5. &lt;b&gt;Namespaces.&lt;/b&gt; Again, this is a difference in design principles. NIEM utilizes namespaces as the cornerstone of a document design architecture - in essence, there&#039;s a one-to-one correlation between a namespace and an IEP (not strictly true in all cases, but a good general rule). XBRL utilizes namespaces to identify authority rather than topicality, as far as I can tell. Both are valid approaches, and there are some deep debates within the XML community about the relationship between namespaces and object classification. As a personal preference, I&#039;ve come to align toward namespaces establishing a topical reference rather than just an authoritative one as this seems to result in cleaner ontologies, but I&#039;d also concede that large scale data design is still more art than science, and as such is subject to interpretation.

6. &lt;b&gt;External citations.&lt;/b&gt; This may be perhaps a misunderstanding of mine with regard to XBRL. External citations in this regard referred to Reference Linkbases, which serve, I assume, to establish equivalency of terminology (if term A in XBRL document A and term B in XBRL document B both reference the same reference citation under the same arc role, then this should imply that there is a semantic equivalency between A and B). However, I&#039;ve noticed that this isn&#039;t always satisfied properly in XBRL documents that I&#039;ve read, so going by real world examples, I may be missing something.

7. &lt;b&gt;XLink&lt;/b&gt;. XLink has been somewhat contentious for a while, largely because as a standard it has needed to satisfy the concept of linkages in a number of different contexts. The RDF/OWL community would argue that the Semantic Web effectively obviates the need for XLink. The HTML community gets twitchy about anything but simple links, the XML community has perhaps been deliberately vague with regard to this specification, and its adoption in other areas tends to be spotty - SVG and XForms utilize it, XHTML theoretically does but only XML purists use XLink, DocBook (which isn&#039;t a W3C spec) uses it, but other technologies don&#039;t. XInclude theoretically does use it, but XInclude seems to scare the heck out of the HTML Working Group (of course, anything even vaguely XMLish scares at least some members of that group, but that&#039;s not a discussion for this forum).

David, I personally agree with you on XLink. I&#039;d like to see the W3C come to some formal decision as to what it actually is and does, and where its scope lies, and would prefer to see it universally required in the underlying schemas, but I&#039;m probably one of the few people who feels that way.



In summary, my suggestions with regard to XBRL involve examining that analyst use case, as ultimately I think it has a great deal of bearing upon the further evolution of the language and its adoption. I am not suggesting that XBRL 2.1 change in any way - it solves handily the use cases that it was initially designed for, it has widespread adoption in the institution-to-institution space, and it effectively does so in a manner that is consonant with the existing participants. However, I do feel that this optimization leaves it suboptimal in other areas for which there are relatively simple fixes - making it more streamlined and better able to work in a services environment, providing an accepted mechanism for mapping XBRL into Semantic Web terms for consumption in inference engines, establishing ways that the language can co-exist in mission critical arenas with other specifications, from NIEM to DITA to RDFa. None of these reduce the importance of the language, its design or its goals, and I believe that all of these may actually encourage wider-spread adoption of the language outside of its initial area of regulatory reporting.

</description>
		<content:encoded><![CDATA[<p>David and Paul,</p>
<p>This is turning into a very interesting discussion, though I’d like to see it move away from being confrontational (and will endeavor to do the same on my end). To make it a little easier to address points, I’m going to number them, in response to both comments.</p>
<p>1. <b>The Simplicity Principle.</b> I think that Paul gives a couple of very good examples, perhaps not intentionally, about the simplicity aspect. I know Tim Bray (have even had the rare privilege of his kids and mine playing together once or twice) and while I can’t speak for him, I suspect that what he’s saying actually is quite applicable. Paul lays out the distinction for the specification as non-expert consumers (lay people), expert consumers (analysts), and pro-sumers (regulators &amp; Fortune 500/1000 accountants). While I think this is a valid division, I’d also recognize a second dimension &#8211; non-technical, deep-technical, and web technical users, which I’d define as follows:</p>
<p><b>Non-Technical Users.</b> These are people that may have some accounting, finance or investing skill, but that do not have the technical ability to build applications that can utilize XBRL in any manner, and as such, are dependent upon tools that will produce meaningful results from XBRL resources.<br />
<b>Deep-Technical Users.</b> These are application developers that are deeply skilled with XBRL, have the technical acumen to integrate these into commercial applications or pipeline processing applications (such as BPEL), and are usually employed by vendors developing tools in this space.<br />
<b>Web Technical Users.</b> These are people who work with data streams and data processors to build what are called “mashups” in the vernacular, though I think a better term for these people is integration developers.</p>
<p>David’s contention is that this third group of people (if I’m reading your last post correctly) are not that important to the evolution of XBRL. My contention is that they are in fact critical, which is in fact the underlying thesis of the article that I originally wrote.</p>
<p>There are four primary trends going on in application development circles at this stage, and each of these has some impact upon XBRL. The first is the rise of RESTful Web Services. What this means in practical terms is that we are moving to a deployment model in which resource URLs act as front ends to databases, HTTP GET, POST, PUT and DELETE become primary mechanisms for interacting with databases, and parameterizing these URLs (essentially binding them to queries within the database) can be used to return feeds of content that are custom to a particular goal.</p>
<p>For instance, I should be able to point to the SEC&#8217;s website relatively soon with a URL of the form </p>
<p><code><a href="http://www.sec.gov/xbrls?q=total-assets:1.2B,5.5B;reporting-quarters:1Q08-3Q09;industry:pharmaceuticals" rel="nofollow">http://www.sec.gov/xbrls?q=total-assets:1.2B,5.5B;reporting-quarters:1Q08-3Q09;industry:pharmaceuticals</a></code></p>
<p>and get a feed (either RSS, Atom or possibly plain old XML) which contains links to all XBRL filings that satisfy the above constraints (the specific URL is fictional, just used for illustration purposes).</p>
<p>The links in turn would return the XBRL documents in forms that could be utilized by everyone from the lay consumer (simple HTML based report listings for that XBRL) to the analyst (forms that provide graphical analysis of all pharmaceuticals in that domain) to the regulator (showing trend analysis for company X compared to the industry overall). The mechanisms for generating these different faces come about via XQuery or XSLT Transformations, or via jQuery or similar components.</p>
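<p>As a rough sketch of how a client might assemble such a parameterized resource URL, here is a small Python example. The endpoint, the parameter names and the <code>build_query_url</code> helper are all hypothetical, taken from or invented for the fictional URL in this comment; no such SEC service is implied.</p>

```python
from urllib.parse import urlencode

# Hypothetical endpoint from the illustration in this comment; not a real service.
BASE_URL = "http://www.sec.gov/xbrls"


def build_query_url(constraints):
    """Join name:value constraints with ';' and send them as one 'q' parameter."""
    q = ";".join(f"{name}:{value}" for name, value in constraints.items())
    # safe=":;," keeps the separators readable instead of percent-encoding them
    return BASE_URL + "?" + urlencode({"q": q}, safe=":;,")


url = build_query_url({
    "total-assets": "1.2B,5.5B",
    "reporting-quarters": "1Q08-3Q09",
    "industry": "pharmaceuticals",
})
print(url)
```

<p>The point of the sketch is only that the whole query lives in the URL, so any HTTP client (or a browser address bar) can issue it with a plain GET.</p>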
<p>Similarly, I should be able to post an XBRL document to the SEC, probably along a secure HTTPS conduit, which would then be validated, processed and made available as part of the feeds described above. What&#8217;s more, this same kind of mechanism also allows for the development of web enabled tools (XForms, XBL based or one of a couple of dozen AJAX toolkits from TIBCO to jQuery) to allow for the creation of XBRL content, right from your browser.</p>
<p>Such information, moreover, can be processed in such ways as to provide a certain degree of opacity as necessary for the companies in question in order to prevent exposure of sensitive content &#8211; through role-based filters or similar mechanisms. This type of mechanism is one of the key characteristics of medical record systems (which I have worked on, in the HL7 space) and is instrumental to being able to provide adequate adoption of records in that space. </p>
<p>This kind of architecture is now becoming prevalent in computing circles, and similar architectures exist for working with such content in XQuery, Ruby, Python, RESTful Java Servlets and .NET, among others.</p>
<p>This technology is here. It has proven itself on smaller scale implementations, and is now scaling its way up to domain content in areas as intractable as health records, government specifications and, yes, business reporting formats. Thus, when I&#8217;m talking about the kid building mashups, keep in mind that this kid is a programmer working for the SEC or fill in the blank Fortune 1000 company.</p>
<p>The second trend that&#8217;s occurring at this stage is the evolution of document-centric (and property centric) databases. When XBRL was initially designed, most of the databases that were available to handle processing were relational SQL databases, and it shows clearly in the underlying design of XBRL, in which structures are maintained in as flat a manner as possible in order to better handle such databases. </p>
<p>However, since 2007, there has been a profound shift in the market towards the emergence of XML-capable databases, most of them employing XQuery.  In mid-2009, as I write this, every major database has an XML database offering, often integrated directly into their next generation product line. Companies also have dedicated XML databases that are completely optimized for XML querying, able to serve up XML content in a small fraction of the time of established vendor systems, and there are even a number of very capable XML databases such as eXist-db available in the open source realm.</p>
<p>Most of my comments about XBRL&#8217;s lack of optimization come from the fact that the linkbase structures that XBRL employs are the least efficient way of being able to pull together relevant data within the technologies employed by these databases, decreasing performance in some cases by an order of magnitude or more compared to document structures. I know &#8211; I&#8217;ve run a number of benchmarks comparing multiply linked flat content (XBRL documents, specifically from the SEC) with document-centric content, and the performance from such unoptimized XBRL frankly sucks. The most efficient solution is, upon importing such content from the SEC feed, to actually reify it into folded document structures and use that, which makes the accessibility of the documents reasonable for queries. This is my central argument: if canonical XBRL is in fact this inefficient, perhaps an alternate format that is more document-like may be more amenable to processing.</p>
<p>The next trend that is occurring is the move towards client modularization. We&#8217;re moving away from storing HTML content and moving towards streaming of structured data content (mostly XML and JSON, but occasionally YAML or similar formats) to web-enabled components that can then handle conversion of this structured data into an appropriate output format. In this case, small and lightweight matter.</p>
<p>To give you an example of this, consider the case of a hypothetical tool called HyperAnalyst. It&#8217;s an extension to Firefox or (soon) Google Chrome that will query a repository of financial reports in XBRL* as illustrated above, and receive a feed to the relevant reports through an appropriate web or web-like GUI. This component may be a browser extension. It may be an inline component. It may be a stand-alone widget built on WebKit or a similar application, or even an iPhone app, but the key is that it will be accessible at a moment&#8217;s notice, and it will allow for fairly sophisticated metrics and similar comparisons through services. The tool may also grab FinancialXML and UML or custom XML or JSON from the NYSE or Bourse.</p>
<p>The question I&#8217;d raise is whether XBRL, as it exists now, is lightweight enough to work within such an environment. Your response time is no longer going to be acceptable if it takes several minutes to download a set of five or six large XML files on even a high bandwidth system, and if that code also needs to renormalize the internal references and bind that to the appropriate presentation layers for validation, this application is going to be a dog performance-wise.</p>
<p>Is that analyst not in your use case for XBRL? Most financial analysts and financial reporters that I know are as likely to be watching the market over their iPhone or Netbook while talking with clients or investigating trends while sitting at Starbucks. Those are tomorrow&#8217;s platforms (and rapidly are becoming today&#8217;s platforms).</p>
<p>The final trend is the shift from static to streaming content. The scenario that I see here is one that I expect is going to be extraordinarily common for areas such as carbon markets (because they are essentially run by the same young turks that are writing those widgets), but will eventually shift to most equity and commodity markets. In this particular case, what you see within a company is a RESTful XBRL server that is tied into the financial operating system of a company. What this means is that, far from reporting information on an annual or quarterly basis, this information is being updated <i>in real time</i> &#8211; the financials of a given company are essentially up to date as of the last day, or hour, or even minute.</p>
<p>This scenario is of course one of the holy grails of XBRL (and should be if it&#8217;s not) &#8211; real time financial reporting is what makes this technology so compelling. Do I think that banks and big multinationals are going to move to this? No (at least not immediately), but I also think that banks and big multinationals actually represent only a small portion of the overall potential for XBRL as a technology. This technology will become dominant with new companies that are going to come out of the current economic carnage, because they will see it as a selling point &#8211; their financials are transparent, people can trust them. Again, I think that if such a system becomes reliant upon a complex format that doesn&#8217;t integrate cleanly with their XML pipelines, that doesn&#8217;t respect the idea of services, and that isn&#8217;t performant, then they will seek solutions that are &#8211; and XBRL will become an also-ran technology.</p>
<p>This is the case to be made for lightweight protocols that Tim Bray, Jon Udell and others are advocating. Both Tim and Jon and a number of others that I know in the technology community that are following emerging technologies (and in some cases creating those same technologies) understand that the above scenario is happening, and recognize that simply having a common accounting language is not enough. Some concession needs to be made to its use within the web in order for that technology to succeed.</p>
<p>BTW, there are certainly precedents for this. The Geography Markup Language was formulated by the OGC at about the same time as XBRL, and shares certain similarities in terms of complexity and weight. As more and more people used GML, they came to realize that it was not sufficiently performant for web use, and two different standards &#8211; one based upon Google&#8217;s KML format for Google Earth, the second, GeoRSS, based upon the Atom specification &#8211; were developed to provide top-level metadata and a packaging mechanism over syndication channels. These proved so successful that the OGC incorporated them as part of the GML specification, essentially as wire formats.</p>
<p>This is what I was referring to in my original posting that prompted this discussion. There is in fact no need to radically change the underlying XBRL model, but such wire formats would automatically perform the task that currently needs to be handled programmatically &#8211; denormalize the contexts, associate labels, calculations, and reference bindings to property nodes as attributes, consolidate the XSD schema, and bind them together as a single XML document, identifying contexts so that in those cases where you have one-to-many associations you can create idrefs back to the initial id.</p>
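<p>To illustrate the kind of folding I mean, here is a minimal Python sketch under heavy simplifying assumptions. The dictionaries stand in for parsed XBRL contexts, labels and facts; the sample values and the <code>denormalize</code> helper are hypothetical, and real XBRL processing (units, dimensions, calculation and reference linkbases) is considerably more involved.</p>

```python
# Hypothetical, heavily simplified stand-ins for parsed XBRL content.
contexts = {
    "FY2008": {"entity": "0000123456", "period": "2008-12-31"},
}
labels = {"us-gaap:Assets": "Total assets"}
facts = [
    {"concept": "us-gaap:Assets", "contextRef": "FY2008",
     "unit": "USD", "value": "1200000000"},
]


def denormalize(facts, contexts, labels):
    """Fold each fact's context and label into one self-describing record."""
    folded = []
    for fact in facts:
        # Copy everything except the pointer to the context
        record = {k: v for k, v in fact.items() if k != "contextRef"}
        # Inline the context fields directly on the record
        record.update(contexts[fact["contextRef"]])
        # Attach the human-readable label, falling back to the concept name
        record["label"] = labels.get(fact["concept"], fact["concept"])
        # Keep the original id so one-to-many cases can still refer back to it
        record["contextId"] = fact["contextRef"]
        folded.append(record)
    return folded


records = denormalize(facts, contexts, labels)
print(records[0])
```

<p>Each folded record can then be serialized as a single element with attributes, which is the document-like shape that XML databases query efficiently.</p>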
<p>2. <b>The Semantic Web.</b> I endorse both document-centric development and semantic web structures for a very simple reason &#8211; they do very different things. XML is relatively lightweight, can be quickly indexed and queried and can be transformed readily into different formats, making it ideal for both web-centric and pipeline oriented applications (which are, in my experience, converging to the same thing). XML is not good for imputing relationships, because it is node-centric rather than link-centric, and as such, it has less utility for performing inferential analysis with other data that may not necessarily fit within its schema.</p>
<p>RDF/OWL, on the other hand, is edge-centric, and its primary mechanism is to create relational graphs. XML lets you compare apples and apples, and for many, many tasks, that is primarily what is needed. RDF/OWL lets you compare apples and oranges, and infer from that that both are fruit, both are roughly round shape, that they originate from different parts of the world and that one happens to be earning more in the Chicago commodities exchange.</p>
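<p>A toy Python sketch of that edge-centric idea follows. It is not RDF or OWL tooling, just a set of (subject, predicate, object) tuples with made-up names and a hypothetical <code>superclasses</code> helper, meant only to show how shared classifications fall out of walking edges.</p>

```python
# A toy set of (subject, predicate, object) triples; all names are illustrative.
triples = {
    ("apple", "subClassOf", "fruit"),
    ("orange", "subClassOf", "fruit"),
    ("fruit", "subClassOf", "food"),
}


def superclasses(term, triples):
    """Follow subClassOf edges transitively and return every class reached."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


# Classes that apples and oranges have in common
common = superclasses("apple", triples).intersection(superclasses("orange", triples))
print(sorted(common))
```

<p>Even in this tiny form, the answer comes from traversing a graph rather than a tree, which is why the storage and processing models differ from those of XML.</p>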
<p>As such, RDF/OWL requires both a different set of computational mechanisms and a different way of storing and working with that information. It is also, generally, a couple of orders of magnitude slower in processing time than XML because what you are doing with inferencing is creating networks of informational content on the fly &#8211; a significantly more complicated task than traversing trees. As such, SW (I don&#8217;t like typing RDF/OWL all the time, so will use this as an abbreviation for Semantic Web) has different types of applicability than XML.</p>
<p>The fundamental challenge that XBRL has in the Semantic world is that the linkage architecture that comes from link bases is very primitive in comparison to what SW is able to bring to bear, from my understanding of both. There&#8217;s a certain degree of inference which needs to be made in order to extract enough information out of an XBRL document to make it worthwhile. That doesn&#8217;t mean it&#8217;s not worth the effort, it just means that it is not a trivial operation.</p>
<p>One of the things I&#8217;ve come to realize with regard to SW is that it really requires the advent of a powerful triples store database in order to gain significant benefit from it &#8211; just putting an XBRL document into an RDF or Turtle format doesn&#8217;t gain you that much advantage, save that it makes it easier for such a database to consume the XBRL and build the triples linkages from it.</p>
<p>I think that this is an effort that should be seen as another &#8220;format&#8221; for XBRL, one for which you can write a single centralized transformation (such as being worked on now by Dave Raggett of the W3C). Its utility comes from the fact that once you have an XBRL document in RDF, it can be consumed by a triples database (and converted, as a consequence, into a set of triples) which in turn handles the associated mappings and bindings for creating appropriate &#8220;lenses&#8221; to view the data. This means that you can do such things as described above &#8211; being able to see how the R&amp;D investment vs. earnings ratios for those same pharmaceuticals correlate to the efficacy of treating depression pharmacologically, something that would be difficult to do with XML because what&#8217;s being explored here are relationships, not nodal content.</p>
<p>Again, and I think this is an important point, what is being explored here are ways of utilizing the information of the models in formats that are appropriate to their respective needs. The XBRL format that was developed over the last decade did so largely as an institution-to-institution mechanism, and as such evolved to fill that niche. I believe it is a mistake to confuse the ontological design with the delivery format, and that it is important to recognize the benefit of having expressions of the language that can readily (and directly) work with other technologies such as the Semantic Web.</p>
<p>3. <b>Document Enrichment and Documentation.</b> There is yet another middle ground that is being explored by the XBRL organization, and that is the notion of RDFa in the context of document enrichment. Enrichment&#8217;s an interesting concept &#8211; by the use of attribute bindings you can associate information in a document with a particular conceptual namespace. There are two approaches that can be employed here. The first approach is that an accountant, probably via tools, can embed within an annual report written in HTML, DITA, ODF or WORDML, the relevant associations to property bindings against an XBRL ontology. The relationships so embedded can then be parsed by a GRDDL processor to generate RDF, which can then be consumed by a triples store.</p>
<p>The second approach is perhaps more intriguing, and that is to utilize this process in reverse to generate reports. In this case, the report writer establishes the RDFa elements in a document template, and, by running these against either an XQuery or SPARQL transformation (or an XSLT, for that matter), this populates the template with the appropriate values for the given context or operation. I see it being employed much more extensively now that the W3C Semantic Web Group has stepped up their XBRL Activity.</p>
<p>By the way, you also asked about the role that DITA plays, vis-a-vis XBRL, as well as its &#8220;goodness&#8221;. Again, my focus is essentially on utility. DITA is a topic-oriented documentation system, and, while it&#8217;s something of a pain in the butt to set up, is very effective at being able to create discrete topical blocks that can, nonetheless, be compiled together via DITA maps into localized help and information context systems. In that regard, I think that XBRL and DITA are ideally suited for one another, especially as you see XBRL move from being a static format to being more of a services oriented technology. DITA can be transformed into HTML content, can embed XBRL/RDFa content and contexts, and can similarly be compiled into more extensive documentation, from CHM help files to large scale living annual reports.</p>
<p>4. <b>NIEM.</b> There was a group of about five senior ontologists and information architects that collectively worked on the NIEM architecture from its origins in the DoJ to its current &#8220;wildfire&#8221; growth through much of the Federal government. I know two of them personally, and they are easily two of the best data architects I&#8217;ve ever met, globally, so I&#8217;d be careful about dismissing NIEM out of hand. More to the point, they began this effort largely under the radar of the Bush administration, and as such, the level of politicization was consequently very low.</p>
<p>I brought up NIEM to illustrate differences in architectural approach, not necessarily to endorse it as a better solution (though I think, obviously, that it is a good one). More to the point, I also think that it has the potential to be a competitive solution to XBRL within other governmental agencies, especially in areas such as procurement, resource accounting, financial contract management, and energy systems management. Without Chris Cox championing the standard at the SEC, and Mary Schapiro singularly cool about XBRL, I think that understanding specifications such as NIEM and how XBRL could be adapted to work in an IEP model might prove to be most prudent.</p>
<p>5. <b>Namespaces.</b> Again, this is a difference in design principles. NIEM utilizes namespaces as the cornerstone of a document design architecture &#8211; in essence, there&#8217;s a one-to-one correlation between a namespace and an IEP (not strictly true in all cases, but a good general rule). XBRL utilizes namespaces to identify authority rather than topicality, as far as I can tell. Both are valid approaches, and there are some deep debates within the XML community about the relationship between namespaces and object classification. As a personal preference, I&#8217;ve come to align toward namespaces establishing a topical reference rather than just an authoritative one, as this seems to result in cleaner ontologies, but I&#8217;d also concede that large scale data design is still more art than science, and as such is subject to interpretation.</p>
<p>6. <b>External citations.</b> This may be a misunderstanding of mine with regard to XBRL. External citations in this regard referred to Reference Linkbases, which serve, I assume, to establish equivalency of terminology (if term A in XBRL document A and term B in XBRL document B both reference the same reference citation under the same arc role, then this should imply that there is a semantic equivalency between A and B). However, I&#8217;ve noticed that this isn&#8217;t always satisfied properly in XBRL documents that I&#8217;ve read, so going by real world examples, I may be missing something.</p>
<p>7. <b>XLink</b>. XLink has been somewhat contentious for a while, largely because as a standard it has needed to satisfy the concept of linkages in a number of different contexts. The RDF/OWL community would argue that the Semantic Web effectively obviates the need for XLink. The HTML community gets twitchy about anything but simple links, the XML community has perhaps been deliberately vague with regard to this specification, and its adoption in other areas tends to be spotty &#8211; SVG and XForms utilize it, XHTML theoretically does but only XML purists use XLink, DocBook (which isn&#8217;t a W3C spec) uses it, but other technologies don&#8217;t. XInclude theoretically does use it, but XInclude seems to scare the heck out of the HTML Working Group (of course, anything even vaguely XMLish scares at least some members of that group, but that&#8217;s not a discussion for this forum).</p>
<p>David, I personally agree with you on XLink. I&#8217;d like to see the W3C come to some formal decision as to what it actually is and does, and where its scope lies, and would prefer to see it universally required in the underlying schemas, but I&#8217;m probably one of the few people who feels that way.</p>
<p>In summary, my suggestions with regard to XBRL involve examining that analyst use case, as ultimately I think it has a great deal of bearing upon the further evolution of the language and its adoption. I am not suggesting that XBRL 2.1 change in any way &#8211; it solves handily the use cases that it was initially designed for, it has widespread adoption in the institution-to-institution space, and it effectively does so in a manner that is consonant with the existing participants. However, I do feel that this optimization leaves it suboptimal in other areas for which there are relatively simple fixes &#8211; making it more streamlined and better able to work in a services environment, providing an accepted mechanism for mapping XBRL into Semantic Web terms for consumption in inference engines, establishing ways that the language can co-exist in mission critical arenas with other specifications, from NIEM to DITA to RDFa. None of these reduce the importance of the language, its design or its goals, and I believe that all of these may actually encourage wider-spread adoption of the language outside of its initial area of regulatory reporting.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David vun Kannon</title>
		<link>http://hitachidatainteractive.com/2009/06/30/admitting-the-obvious-about-xbrl/comment-page-1/#comment-30089</link>
		<dc:creator>David vun Kannon</dc:creator>
		<pubDate>Wed, 01 Jul 2009 16:27:08 +0000</pubDate>
		<guid isPermaLink="false">http://hitachidatainteractive.com/?p=651#comment-30089</guid>
		<description>Kurt, thanks for making this a conversation. We might want to break these topics down into smaller chunks if it continues over several exchanges.

	I agree that ontologies need to be designed and modeled carefully, and independent of process if we intend them to have applicability past the original process definition. This is the lesson of relational vs hierarchical databases. George Santayana had something to say about this.

	The idea that XBRL is not an efficient representation for XML processing systems is a red herring. As you have stated before, we are always “just a stylesheet away” from a more efficient processing format. As you have stated before, storage, network speeds and processing speeds constantly make these issues less important. The main point is not to judge a transfer format as a processing format. There is this concept of normalization and denormalization in the implementation of relational databases that is very apropos here. (Santayana, ibid.)

	I’m glad you are not trying to credential yourself as a language designer on the basis of your writing experience. I think your point that there have been advances in the last ten years that XBRL should take note of in any future refactoring or redevelopment of the core standards is very important and valid. There are also things that haven’t been done in that time period (by the W3C, for example) that are even more important than RESTful architecture, AJAX, and Drupal, if that is conceivable. For example, SQL, Java, UML, and XML Schema all have incompatibilities in their data typing systems. Wouldn’t it be nice if there were no corner cases to code for because of these differences?

	With respect to the NIEM, I have to admit to an initial skepticism about government developed designs or methodologies. Remember Ada? However, given NIEM’s focus on issues of criminal justice and homeland security, it is inevitable that NIEM and XBRL are going to intersect in their concerns over the area of forensic accounting, fraud, money laundering and general “follow the money” issues.

	I agree with your position that XBRL is closer to RDF/OWL than may appear on the surface. This is in part because of the reification of multiple structures in XBRL (those pesky but oh so useful linkbases!).

	If the bottom-line benefit of IEP, etc is that it is a document centric model, I will agree that there is a key philosophical divide about designing data centric vs document centric languages. I will happily admit that XBRL is designed to be data centric, but can fit into a document centric model. Just as NIEM has a data dictionary schema that it mines for elements for its packages and models, XBRL taxonomies serve the same purpose. You can even do document centric modeling with XBRL, to some degree, using tuples. I’d be very happy to explore mixed NIEM-XBRL applications.

	I’m not sure why you say XBRL avoids namespaces. XBRL uses namespaces just as much as other XML Schema based XML designs. Can you give an example of what you mean? I’m also not sure what kind of ‘external citations’ you mean, and why you think they have something to do with equivalency. Let’s say taxonomy author Alice creates an element in the substitution group of us-gaap:assets, and includes in her taxonomy a reference linkbase with a reference arc from alice:A-assets to the same definition of assets in the FASB Codification as was used in the US GAAP Taxonomy. Taxonomy author Bob independently does the same thing in his taxonomy for bob:B-assets. Instance author Charles uses both taxonomies. Nothing in the above guarantees that alice:A-assets  = bob:B-assets in an instance in the same context.

	On the subject of DITA, since DITA is being used by the FASB for the new Codification, there are plenty of opportunities for XBRL-DITA interaction. Where does DITA sit in your scale of design goodness?

	XML is semantically neutral. That neutrality guarantees that data centric designs are as ‘valid’ as document centric designs. If tools are not neutral, that is not something that can or should constrain design. Tools can change faster than language designs. EDI/X12 is document centric. (Santayana, ibid.)

	I’m not sure why you would say that XBRL is or wants or tries to be a third alternative to SW or XML designs. XBRL is nodes and edges, graphs of data and metadata. XBRL instances and schemas are lists of nodes, XBRL linkbases are lists of edges. It is that simple.

	Web apps may become a flood, but does that mean they won’t use databases? I agree that an AJAX app should not be flipping XBRL back and forth between the browser client and a server. Why would you even start down that path? I think you’ve seriously misunderstood the use cases and actors if you think an investment analyst or fund manager needs an XBRL parser written in Drupal.

	I’m having trouble reconciling your enthusiasm for NIEM document centric design with your enthusiasm for SemWeb designs using RDF/OWL etc. As you’ve said, XBRL (under the surface) is close to the “bag of triples” designs of the Semantic Web. Do you not agree that there is a tension between these enthusiasms or do you feel that XBRL is somehow exempt from the way you resolve that tension?

	Of all the W3C technologies upon which XBRL is based, I would hold that only XLink is seriously in trouble (out of date and largely abandoned). To XML Schema, Namespaces, XML Base, and XPath, this criticism does not apply. Do you think there should be a standard XML hyperlinking language? I do, and I would advocate reinvigorating XLink to meet that need.

	As a conversational gambit, “I’m a humble writer” somewhat undercuts your previous “I’m a credible XML expert” position. I absolutely disagree that XBRL’s success or failure rests on its appeal to developers. Tim Bray has made similar comments that XBRL needs to be a wildfire, up-from-the-coder success to truly be a winning technology. That is mistaking the superficial (XBRL is XML and must follow the path of RSS) for the essential (XBRL serves the needs of corporate data transfer).</description>
		<content:encoded><![CDATA[<p>Kurt, thanks for making this a conversation. We might want to break these topics down into smaller chunks if it continues over several exchanges.</p>
<p>	I agree that ontologies need to be designed and modeled carefully, and independent of process if we intend them to have applicability past the original process definition. This is the lesson of relational vs hierarchical databases. George Santayana had something to say about this.</p>
<p>	The idea that XBRL is not an efficient representation for XML processing systems is a red herring. As you have stated before, we are always “just a stylesheet away” from a more efficient processing format. As you have stated before, storage, network speeds and processing speeds constantly make these issues less important. The main point is not to judge a transfer format as a processing format. There is this concept of normalization and denormalization in the implementation of relational databases that is very apropos here. (Santayana, ibid.)</p>
<p>	I’m glad you are not trying to credential yourself as a language designer on the basis of your writing experience. I think your point that there have been advances in the last ten years that XBRL should take note of in any future refactoring or redevelopment of the core standards is very important and valid. There are also things that haven’t been done in that time period (by the W3C, for example) that are even more important than RESTful architecture, AJAX, and Drupal, if that is conceivable. For example, SQL, Java, UML, and XML Schema all have incompatibilities in their data typing systems. Wouldn’t it be nice if there were no corner cases to code for because of these differences?</p>
<p>	With respect to the NIEM, I have to admit to an initial skepticism about government developed designs or methodologies. Remember Ada? However, given NIEM’s focus on issues of criminal justice and homeland security, it is inevitable that NIEM and XBRL are going to intersect in their concerns over the area of forensic accounting, fraud, money laundering and general “follow the money” issues.</p>
<p>	I agree with your position that XBRL is closer to RDF/OWL than may appear on the surface. This is in part because of the reification of multiple structures in XBRL (those pesky but oh so useful linkbases!).</p>
<p>	If the bottom-line benefit of IEP, etc is that it is a document centric model, I will agree that there is a key philosophical divide about designing data centric vs document centric languages. I will happily admit that XBRL is designed to be data centric, but can fit into a document centric model. Just as NIEM has a data dictionary schema that it mines for elements for its packages and models, XBRL taxonomies serve the same purpose. You can even do document centric modeling with XBRL, to some degree, using tuples. I’d be very happy to explore mixed NIEM-XBRL applications.</p>
<p>	I’m not sure why you say XBRL avoids namespaces. XBRL uses namespaces just as much as other XML Schema based XML designs. Can you give an example of what you mean? I’m also not sure what kind of ‘external citations’ you mean, and why you think they have something to do with equivalency. Let’s say taxonomy author Alice creates an element in the substitution group of us-gaap:assets, and includes in her taxonomy a reference linkbase with a reference arc from alice:A-assets to the same definition of assets in the FASB Codification as was used in the US GAAP Taxonomy. Taxonomy author Bob independently does the same thing in his taxonomy for bob:B-assets. Instance author Charles uses both taxonomies. Nothing in the above guarantees that alice:A-assets  = bob:B-assets in an instance in the same context.</p>
<p>	On the subject of DITA, since DITA is being used by the FASB for the new Codification, there are plenty of opportunities for XBRL-DITA interaction. Where does DITA sit in your scale of design goodness?</p>
<p>	XML is semantically neutral. That neutrality guarantees that data centric designs are as ‘valid’ as document centric designs. If tools are not neutral, that is not something that can or should constrain design. Tools can change faster than language designs. EDI/X12 is document centric. (Santayana, ibid.)</p>
<p>	I’m not sure why you would say that XBRL is or wants or tries to be a third alternative to SW or XML designs. XBRL is nodes and edges, graphs of data and metadata. XBRL instances and schemas are lists of nodes, XBRL linkbases are lists of edges. It is that simple.</p>
<p>	Web apps may become a flood, but does that mean they won’t use databases? I agree that an AJAX app should not be flipping XBRL back and forth between the browser client and a server. Why would you even start down that path? I think you’ve seriously misunderstood the use cases and actors if you think an investment analyst or fund manager needs an XBRL parser written in Drupal.</p>
<p>	I’m having trouble reconciling your enthusiasm for NIEM document centric design with your enthusiasm for SemWeb designs using RDF/OWL etc. As you’ve said, XBRL (under the surface) is close to the “bag of triples” designs of the Semantic Web. Do you not agree that there is a tension between these enthusiasms or do you feel that XBRL is somehow exempt from the way you resolve that tension?</p>
<p>	Of all the W3C technologies upon which XBRL is based, I would hold that only XLink is seriously in trouble (out of date and largely abandoned). To XML Schema, Namespaces, XML Base, and XPath, this criticism does not apply. Do you think there should be a standard XML hyperlinking language? I do, and I would advocate reinvigorating XLink to meet that need.</p>
<p>	As a conversational gambit, “I’m a humble writer” somewhat undercuts your previous “I’m a credible XML expert” position. I absolutely disagree that XBRL’s success or failure rests on its appeal to developers. Tim Bray has made similar comments that XBRL needs to be a wildfire, up-from-the-coder success to truly be a winning technology. That is mistaking the superficial (XBRL is XML and must follow the path of RSS) for the essential (XBRL serves the needs of corporate data transfer).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paul Wilkinson</title>
		<link>http://hitachidatainteractive.com/2009/06/30/admitting-the-obvious-about-xbrl/comment-page-1/#comment-30042</link>
		<dc:creator>Paul Wilkinson</dc:creator>
		<pubDate>Wed, 01 Jul 2009 00:31:28 +0000</pubDate>
		<guid isPermaLink="false">http://hitachidatainteractive.com/?p=651#comment-30042</guid>
		<description>David notes: &quot;It still is not clear that the Web, defined as the experience of non-expert consumers, has — or should — drive design and technology choices for XBRL.&quot;

That is an excellent point worth repeating. It goes to the heart of the role of information in the marketplace.

Some information is more likely to be used by &quot;expert consumers,&quot; other information by &quot;pro-sumers,&quot; and still other information by &quot;non-expert consumers.&quot; A remarkable thing about XBRL is its robust capability to support a variety of uses. (As for admitting “the obvious,” note that XBRL US has called itself “the national consortium for XML business reporting standards” on its Web site for some time.)

The more information in the marketplace available to information consumers at various levels, the better the price system works — whether it&#039;s the price of gasoline, the price of food, the price of stocks, or some other price. In the case of gasoline, experts can tell you what octane rating is better for your car; a pro-sumer might account for the car wash discount at a particular station; you can balance the distance to a particular gas station against its lower price. (Unfortunately, it remains difficult to raise capital to compete against oil companies because it’s costly and relatively complex for small companies to live in the relevant information eco-system — perhaps XML and XBRL can help with that, too.)

What should &quot;drive design and technology choices for XBRL&quot; is what gets the most information into the marketplace in the most timely manner so that information consumers at various levels can use the information, directly or in collaboration with other information consumers with varying degrees of expertise.

Here’s one example of why getting information into the marketplace sooner could be more important than delaying things for “non-expert consumers.” While nothing in life is guaranteed, had information about asset-backed securities (ABS) been available to the marketplace in a more useful structured format rather than &quot;Web-friendly&quot; html format, the odds of an expert or an academic or a pro-sumer discovering the underlying defects in the asset class sooner would have been higher. Html was used for ABS because html, like the thousands of pages of paper underlying ABS, is &quot;document-centric.&quot; Alas, that didn&#039;t turn out well at all.

(Many of us know people who don&#039;t qualify as the &quot;sophisticated&quot; investors deemed smart enough or wealthy enough to trade ABS, but who might have noticed ABS defects sooner had they been able to analyze structured disclosure. Now, thanks to opacity, we all own document-centric ABS, or at least its wreckage.)

As it happened, thanks to a private implementation of XBRL, expert short-sellers discovered the ABS problems rather late, but before the rest of the marketplace, meaning they likely profited much more handsomely than would have been the case had the defects been discovered via disclosure of structured information to the larger marketplace. (See http://www.wired.com/techbiz/it/magazine/17-03/wp_reboot.) Well-structured GAAP reporting helped end the Great Depression by restoring trust in public companies; well-structured ABS reporting could do the same by restoring trust in ABS. It certainly couldn&#039;t be worse than unstructured or semi-structured ABS “information” – information that was, in fact, posted on EDGAR in “Web-friendly,” document-centric, html format! Neither html nor the documents it represented were up to the task of fully describing complex securities big enough to create systemic risk.

A simpler example is that I&#039;m not a CPA. I rely on the market to value public companies according to their financial statements and am thrilled the market will be able to do so more efficiently thanks to XBRL. That means my key investment criteria, such as business strategy and product quality, will become relatively more important. Alas, there aren&#039;t mandatory XBRL tags for those criteria -- yet. But it sure will be nice when, someday, I won&#039;t need to wade through the text of hundreds of 10-Ks and 10-Qs to locate and compare what every company says about its own strategy and product quality. And if people who are better at accounting than I am can mash-up strategy, quality and accounting, because they&#039;re reported in a compatible widely-accepted format that they had to learn to be able to use modern financial statements, more power to them. That simply means better allocation of capital and a rising tide for all.

That is all to make two main points:

First, making the perfect the enemy of the good can have significant consequences, and good people like David who have dedicated countless hours to XBRL have experience that can do for business reporting in the 21st century what GAAP did for business reporting in the 20th century.

Second, while GAAP is the most famous use case for XBRL so far, because the XBRL standard is applicable to all business information it&#039;s difficult to predict future benefits. We do know this: Better information makes better markets, and the more applications for which a standard is used, the better the tools to use the standard are likely to become. To the extent XBRL can expedite the disclosure and analysis of information that market participants may find important, and to the extent its application to other purposes can lead to the creation of even more useful tools, full speed ahead. (Since executive compensation disclosure relies on GAAP for part of its content and XBRL data tags already exist, that might be a logical next step.)

Disclosure: Long EDGR and BR, both of which include XBRL in their strategy.</description>
		<content:encoded><![CDATA[<p>David notes: &#8220;It still is not clear that the Web, defined as the experience of non-expert consumers, has — or should — drive design and technology choices for XBRL.&#8221;</p>
<p>That is an excellent point worth repeating. It goes to the heart of the role of information in the marketplace.</p>
<p>Some information is more likely to be used by &#8220;expert consumers,&#8221; other information by &#8220;pro-sumers,&#8221; and still other information by &#8220;non-expert consumers.&#8221; A remarkable thing about XBRL is its robust capability to support a variety of uses. (As for admitting “the obvious,” note that XBRL US has called itself “the national consortium for XML business reporting standards” on its Web site for some time.)</p>
<p>The more information in the marketplace available to information consumers at various levels, the better the price system works — whether it&#8217;s the price of gasoline, the price of food, the price of stocks, or some other price. In the case of gasoline, experts can tell you what octane rating is better for your car; a pro-sumer might account for the car wash discount at a particular station; you can balance the distance to a particular gas station against its lower price. (Unfortunately, it remains difficult to raise capital to compete against oil companies because it’s costly and relatively complex for small companies to live in the relevant information eco-system — perhaps XML and XBRL can help with that, too.)</p>
<p>What should &#8220;drive design and technology choices for XBRL&#8221; is what gets the most information into the marketplace in the most timely manner so that information consumers at various levels can use the information, directly or in collaboration with other information consumers with varying degrees of expertise.</p>
<p>Here’s one example of why getting information into the marketplace sooner could be more important than delaying things for “non-expert consumers.” While nothing in life is guaranteed, had information about asset-backed securities (ABS) been available to the marketplace in a more useful structured format rather than &#8220;Web-friendly&#8221; html format, the odds of an expert or an academic or a pro-sumer discovering the underlying defects in the asset class sooner would have been higher. Html was used for ABS because html, like the thousands of pages of paper underlying ABS, is &#8220;document-centric.&#8221; Alas, that didn&#8217;t turn out well at all.</p>
<p>(Many of us know people who don&#8217;t qualify as the &#8220;sophisticated&#8221; investors deemed smart enough or wealthy enough to trade ABS, but who might have noticed ABS defects sooner had they been able to analyze structured disclosure. Now, thanks to opacity, we all own document-centric ABS, or at least its wreckage.)</p>
<p>As it happened, thanks to a private implementation of XBRL, expert short-sellers discovered the ABS problems rather late, but before the rest of the marketplace, meaning they likely profited much more handsomely than would have been the case had the defects been discovered via disclosure of structured information to the larger marketplace. (See <a href="http://www.wired.com/techbiz/it/magazine/17-03/wp_reboot" rel="nofollow">http://www.wired.com/techbiz/it/magazine/17-03/wp_reboot</a>.) Well-structured GAAP reporting helped end the Great Depression by restoring trust in public companies; well-structured ABS reporting could do the same by restoring trust in ABS. It certainly couldn&#8217;t be worse than unstructured or semi-structured ABS “information” – information that was, in fact, posted on EDGAR in “Web-friendly,” document-centric, html format! Neither html nor the documents it represented were up to the task of fully describing complex securities big enough to create systemic risk.</p>
<p>A simpler example is that I&#8217;m not a CPA. I rely on the market to value public companies according to their financial statements and am thrilled the market will be able to do so more efficiently thanks to XBRL. That means my key investment criteria, such as business strategy and product quality, will become relatively more important. Alas, there aren&#8217;t mandatory XBRL tags for those criteria &#8212; yet. But it sure will be nice when, someday, I won&#8217;t need to wade through the text of hundreds of 10-Ks and 10-Qs to locate and compare what every company says about its own strategy and product quality. And if people who are better at accounting than I am can mash-up strategy, quality and accounting, because they&#8217;re reported in a compatible widely-accepted format that they had to learn to be able to use modern financial statements, more power to them. That simply means better allocation of capital and a rising tide for all.</p>
<p>That is all to make two main points:</p>
<p>First, making the perfect the enemy of the good can have significant consequences, and good people like David who have dedicated countless hours to XBRL have experience that can do for business reporting in the 21st century what GAAP did for business reporting in the 20th century.</p>
<p>Second, while GAAP is the most famous use case for XBRL so far, because the XBRL standard is applicable to all business information it&#8217;s difficult to predict future benefits. We do know this: Better information makes better markets, and the more applications for which a standard is used, the better the tools to use the standard are likely to become. To the extent XBRL can expedite the disclosure and analysis of information that market participants may find important, and to the extent its application to other purposes can lead to the creation of even more useful tools, full speed ahead. (Since executive compensation disclosure relies on GAAP for part of its content and XBRL data tags already exist, that might be a logical next step.)</p>
<p>Disclosure: Long EDGR and BR, both of which include XBRL in their strategy.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kurt Cagle</title>
		<link>http://hitachidatainteractive.com/2009/06/30/admitting-the-obvious-about-xbrl/comment-page-1/#comment-30031</link>
		<dc:creator>Kurt Cagle</dc:creator>
		<pubDate>Tue, 30 Jun 2009 22:40:12 +0000</pubDate>
		<guid isPermaLink="false">http://hitachidatainteractive.com/?p=651#comment-30031</guid>
		<description>David,

I figured that my post might be somewhat controversial, and am honored that you chose to respond to it as you have.

We&#039;re entering an interesting age. Over the last ten to fifteen years, there has been a steady ramp of complexity of data models and modeling mechanisms. I&#039;ve been following a fairly fundamental sea-change in the move towards RESTful architectures, a direct consequence of the understanding that data models and ontologies need to be understood separate from process. Especially when you move into large ontologies (e.g., financial reporting, health level services, aircraft component specifications), the challenge that people face is how best to model them, not just in terms of the ontology but in terms of the language formalism of the ontology. 

Moreover, once a modeling formalism has been determined, the question comes back to whether such a formalism can in fact be expressed in a language that is efficient for all the potential uses, or whether there are in fact multiple representations that may be more efficient for certain types of applications rather than others. This is the heart of my contention, more than anything - that whereas XBRL has certain advantages for certain types of implementations (many deriving either from a desire to target spreadsheet or SQL data stores) and implementors (financial services companies) it is perhaps not as efficient a representation (largely owing to the need to rectify links) for use within XML processing systems.

I am not an accountant. I have, through the last couple of years, learned enough of the terminology to understand most of what&#039;s needed from an implementation designer&#039;s standpoint, but have neither the talent nor the inclination to be a CPA. I am, however, an XML developer, and have fourteen books to my credit as well as a couple of hundred articles, white papers, seminar presentations and the like within the XML and web development space. The upshot of this is not to play a game of my skills are better than yours are, but rather to point out that I have the experience to see that many of the issues that XBRL has spent years on are issues that could readily be resolved much more cleanly by the tools (and perhaps more importantly, methodologies) that have emerged in the last decade on the XML side.

Consider, for instance, the model currently deployed by the National Information Exchange Model XML (NIEM) body as part of dealing with complex ontologies in the US Federal information space. The NIEM model is intriguing - the use of information exchange packages and packaged documentation (known as IEPs and IEPDs, respectively) provides a way for a number of federal agencies, each dealing with its own particular domain needs, to nonetheless share a common set of ontologies.

Certainly these domains are no less complex in their own right than accounting standards are. They do deploy differing namespaces to help manage the complexity, because namespaces establish the boundaries of domains, and yet those namespaces do not materially contribute to the complexity of the specifications - indeed, they actually work well in simplifying application development around instances of these standards. What&#039;s more, because there is in fact a central clearing house for NIEM IEPDs - for the schemas and not the instances - what this does is allow for a good middle ground in which common entities can be defined, yet that also provides room for the evolution of new, distinct objects that can either add to or subclass existing IEPs.

In essence, there is a distinct difference in philosophy between NIEM and XBRL. XBRL works upon a property-centric ideal (something that again works better in relational databases or spreadsheets than in XML per se) in which a given report is a bag of properties, the role of the schema is only to identify those properties, and structure exists only by dint of contexts. It is a model that is, ironically, much more similar in approach to RDF/OWL than may be obvious on the surface, though the latter technologies, like XBRL, tend to suffer in performance because all structure has to be imputed.

The NIEM architecture, on the other hand, is document-centric. Each IEP effectively defines a set of structural constraints, either through XSD or through other schematic languages (such as Schematron, which could go a long way to solving the validation complexities associated with equations or functions within XBRL if expressed in a document-centric approach). Presentation into XHTML, DocBook, OOXML, etc. can be easily accomplished in XSLT (and more so in XSLT2). IEP document instances can readily be stored in XML databases, and queried via XQuery in order to form reports and analyses.

Because of the document-centric nature of IEPs, development of web based tools for both viewing and editing such reports online through XForms, Air apps, .NET or Java clients and so forth is made much simpler as well; I can put together an online application for viewing and editing police reports or similar IEPD apps in a few hours, and because of the underlying standardization that&#039;s involved, I also feel reasonably comfortable in asserting that police reports that I create for use in Washington state could just as readily be read and understood in Texas, Virginia or the US Dept. of Justice.

In the interest of avoiding namespaces and structures (even if moderated), XBRL relies upon using external citations as a mechanism for equivalency (a namespace would have done the job just as well, by encoding the equivalency as a citation reference within an &lt;appinfo&gt; field within a schema definition, and would have been far more efficient in terms of processing). Labels similarly would have resided within structures, or would have been referenced en masse by the schema in question in an external document (perhaps tying in DITA, which would have allowed you to maintain all of this information in a distributed topical system).

Put another way, rather than relying upon a plethora of perfectly good standards, XBRL has reinvented most of them under the justification that Accounting is different than any other form of human intellectual endeavor. This is perhaps the most egregious problem that I see with XBRL. XML is defined to be a semantically neutral language, and is designed so for a number of very good reasons. As a semantically neutral language (the definition of the concept of an infoset along with a few syntactical rules for constructing representations of such infosets) it is independent of transport mechanism, it can be transformed and processed in any number of different ways with any number of tools, it can be queried in a manner that is independent of the underlying semantics, and you can mix and match multiple conceptual namespaces as appropriate without fear of collision. These take place because at their core, XML tools are very document-centric. 

XBRL&#039;s property-centric approach could work well in a semantic environment ... and contrary to your assertion, Semantic Web tools such as SPARQL, RDFa, GRDDL, and RIF are very solid, there are a number of vendors providing such tools, and those tools do everything that XBRL can do, from building associations and referencing citations to performing inference analysis. SW differs from XML in that, while both work upon graphs, SW&#039;s focus is edge-centric rather than the node-specific mode that XML employs. XBRL is not some third alternative - a graph has edges and nodes, it doesn&#039;t have (nor does it need) accounting.

Ultimately, my contentions here are that XBRL, &lt;i&gt;as it stands now&lt;/i&gt;, neither has the comparatively lightweight structure that makes it attractive as a web technology, nor does it have the rigorous formalism that makes it attractive as a semantic technology. This means that while XBRL can be utilized from within stand-alone applications that have very dedicated functionality, it will be harder and harder to justify this as the migration to web applications becomes a flood. You will have few people that will write XBRL extensions in Firefox or XBRL parsers in Drupal, perhaps not an issue if you are a company reporting earnings to the SEC, but a definite issue if you are an analyst or fund manager.

It means that it becomes a harder sell for XBRL as an accounting standard outside of the immediate accounting circles, because NIEM type models are gaining adherents among agencies that need to work with more than just accounting information (it may also be part of the reason for the foot-dragging of the SEC in regards to XBRL adoption, as they have a chance to evaluate alternative models).

I am not advocating that XBRL abandon the standard that they have worked on diligently for more than a dozen years. In the domain where it grew up, the tools are adequate for the needs asked of it, and changing standards is always a difficult process. What I am advocating, however, is that some serious thought be given to an XBRL 3.0 that is able to be expressed canonically in an efficient XML representation and in OWL, that can be effectively packaged (and there should be a package mechanism), that can work well both for the Fortune 500 company and for the SME market. Perhaps it is time that a re-examination of XBRL in light of NIEM, OWL, the XQuery stack and similar architectures be made - the technology has changed profoundly in the last decade, and to insist upon using out of date and largely abandoned technology simply because it is too difficult to make the social adoption changes seems disingenuous to me.

One final point in that regard. I am simply a writer. What I say will not matter to my company one way or the other - I cover the XML industry as an analyst, nothing more. What will matter at the end of the day is whether developers adopt this technology - not CEOs, not CFOs. If the technology does not provide a compelling benefit in terms of what it can do in terms of the existing pipelines and toolsets that they have to work with, then you will not see widespread adoption of the technology beyond the absolute minimum necessary to comply with state mandates. I was originally very positive about XBRL because I believed that the goals that it was designed to solve were noble and necessary. Yet the more time I&#039;ve spent working with it, the more I have to wonder whether it will in fact solve the problems it was intended to solve. Perhaps I am wrong, and my own perspective blinds me to the brilliance of the language, but overall I see a language that attempts to optimize on constraints that are no longer important while missing the constraints that are.

Thank you for the engaging conversation.

Kurt</description>
		<content:encoded><![CDATA[<p>David,</p>
<p>I figured that my post might be somewhat controversial, and am honored that you chose to respond to it as you have.</p>
<p>We&#8217;re entering an interesting age. Over the last ten to fifteen years, there has been a steady ramp of complexity of data models and modeling mechanisms. I&#8217;ve been following a fairly fundamental sea-change in the move towards RESTful architectures, a direct consequence of the understanding that data models and ontologies need to be understood separate from process. Especially when you move into large ontologies (e.g., financial reporting, health level services, aircraft component specifications), the challenge that people face is how best to model them, not just in terms of the ontology but in terms of the language formalism of the ontology. </p>
<p>Moreover, once a modeling formalism has been determined, the question comes back to whether such a formalism can in fact be expressed in a language that is efficient for all the potential uses, or whether there are in fact multiple representations that may be more efficient for certain types of applications rather than others. This is the heart of my contention, more than anything &#8211; that whereas XBRL has certain advantages for certain types of implementations (many deriving either from a desire to target spreadsheet or SQL data stores) and implementors (financial services companies) it is perhaps not as efficient a representation (largely owing to the need to rectify links) for use within XML processing systems.</p>
<p>I am not an accountant. I have, through the last couple of years, learned enough of the terminology to understand most of what&#8217;s needed from an implementation designer&#8217;s standpoint, but have neither the talent nor the inclination to be a CPA. I am, however, an XML developer, and have fourteen books to my credit as well as a couple of hundred articles, white papers, seminar presentations and the like within the XML and web development space. The upshot of this is not to play a game of my skills are better than yours are, but rather to point out that I have the experience to see that many of the issues that XBRL has spent years on are issues that could readily be resolved much more cleanly by the tools (and perhaps more importantly, methodologies) that have emerged in the last decade on the XML side.</p>
<p>Consider, for instance, the model currently deployed by the National Information Exchange Model XML (NIEM) body as part of dealing with complex ontologies in the US Federal information space. The NIEM model is intriguing &#8211; the use of information exchange packages and packaged documentation (known as IEPs and IEPDs, respectively) provides a way for a number of federal agencies, each dealing with their own particular domain needs, to nonetheless share a common set of ontologies. </p>
<p>Certainly these domains are no less complex in their own right than accounting standards are. They do deploy differing namespaces to help manage the complexity, because namespaces establish the boundaries of domains, and yet those namespaces do not materially contribute to the complexity of the specifications &#8211; indeed, they actually work well in simplifying application development around instances of these standards. What&#8217;s more, because there is in fact a central clearing house for NIEM IEPDs &#8211; for the schemas and not the instances &#8211; what this does is allow for a good middle ground in which common entities can be defined, yet that also provides room for the evolution of new, distinct objects that can either add to or subclass existing IEPs.</p>
<p>In essence, there is a distinct difference in philosophy between NIEM and XBRL. XBRL works upon a property-centric ideal (something that again works better in relational databases or spreadsheets than in XML per se) in which a given report is a bag of properties, the role of the schema is only to identify those properties, and structure exists only by dint of contexts. It is a model that is, ironically, much more similar in approach to RDF/OWL than may be obvious on the surface, though the latter technologies, like XBRL, tend to suffer from poor performance because all structure has to be imputed.</p>
<p>The NIEM architecture, on the other hand, is document-centric. Each IEP effectively defines a set of structural constraints, either through XSD or through other schematic languages (such as Schematron, which could go a long way to solving the validation complexities associated with equations or functions within XBRL if expressed in a document-centric approach). Presentation into XHTML, DocBook, OOXML, etc. can be easily accomplished in XSLT (and more so in XSLT2). IEP document instances can readily be stored in XML databases, and queried via XQuery in order to form reports and analyses. </p>
<p>Because of the document-centric nature of IEPs, development of web based tools for both viewing and editing such reports online through XForms, Air apps, .NET or Java clients and so forth is made much simpler as well; I can put together an online application for viewing and editing police reports or similar IEPD apps in a few hours, and because of the underlying standardization that&#8217;s involved, I also feel reasonably comfortable in asserting that police reports that I create for use in Washington state could just as readily be read and understood in Texas, Virginia or the US Dept. of Justice.</p>
<p>In the interest of avoiding namespaces and structures (even if moderated), XBRL relies upon using external citations as a mechanism for equivalency (a namespace would have done the job just as well, by encoding the equivalency as a citation reference within an &lt;appinfo&gt; field within a schema definition, and would have been far more efficient in terms of processing). Labels similarly would have resided within structures, or would have been referenced en masse by the schema in question in an external document (perhaps tying in DITA, which would have allowed you to maintain all of this information in a distributed topical system).</p>
<p>Put another way, rather than relying upon a plethora of perfectly good standards, XBRL has reinvented most of them under the justification that Accounting is different than any other form of human intellectual endeavor. This is perhaps the most egregious problem that I see with XBRL. XML is defined to be a semantically neutral language, and is designed so for a number of very good reasons. As a semantically neutral language (the definition of the concept of an infoset along with a few syntactical rules for constructing representations of such infosets) it is independent of transport mechanism, it can be transformed and processed in any number of different ways with any number of tools, it can be queried in a manner that is independent of the underlying semantics, and you can mix and match multiple conceptual namespaces as appropriate without fear of collision. These take place because at their core, XML tools are very document-centric. </p>
<p>XBRL&#8217;s property-centric approach could work well in a semantic environment &#8230; and contrary to your assertion, Semantic web tools such as SPARQL, RDFa, GRDDL, and RIF are very solid, there are a number of vendors providing such tools, and those tools do everything that XBRL can do, from building associations and referencing citations to performing inference analysis. SW differs from XML in that, while both work upon graphs, SW&#8217;s focus is edge-centric rather than the node-specific mode that XML employs. XBRL is not some third alternative &#8211; a graph has edges and nodes, it doesn&#8217;t have (nor does it need) accounting.</p>
<p>Ultimately, my contentions here are that XBRL, <i>as it stands now</i>, neither has the comparatively lightweight structure that makes it attractive as a web technology, nor does it have the rigorous formalism that makes it attractive as a semantic technology. This means that while XBRL can be utilized from within stand-alone applications that have very dedicated functionality, it will be harder and harder to justify this as the migration to web applications becomes a flood. You will have few people that will write XBRL extensions in Firefox or XBRL parsers in Drupal, perhaps not an issue if you are a company reporting earnings to the SEC, but a definite issue if you are an analyst or fund manager. </p>
<p>It means that it becomes a harder sell for XBRL as an accounting standard outside of the immediate accounting circles, because NIEM type models are gaining adherents among agencies that need to work with more than just accounting information (it may also be part of the reason for the foot-dragging of the SEC in regards to XBRL adoption, as they have a chance to evaluate alternative models).</p>
<p>I am not advocating that XBRL abandon the standard that they have worked diligently on for more than a dozen years. In the domain where it grew up, the tools are adequate for the needs asked of it, and changing standards is always a difficult process. What I am advocating, however, is that some serious thought be given to an XBRL 3.0 that is able to be expressed canonically in an efficient XML representation and in OWL, that can be effectively packaged (and there should be a package mechanism), that can work well both for the Fortune 500 company and for the SME market. Perhaps it is time that a re-examination of XBRL in light of NIEM, OWL, the XQuery stack and similar architectures be made &#8211; the technology has changed profoundly in the last decade, and to insist upon using out of date and largely abandoned technology simply because it is too difficult to make the social adoption changes seems disingenuous to me.</p>
<p>One final point in that regard. I am simply a writer. What I say will not matter to my company one way or the other &#8211; I cover the XML industry as an analyst, nothing more. What will matter at the end of the day is whether developers adopt this technology &#8211; not CEOs, not CFOs. If the technology does not provide a compelling benefit in what it can do within the existing pipelines and toolsets that they have to work with, then you will not see widespread adoption of the technology beyond the absolute minimum necessary to comply with state mandates. I was originally very positive about XBRL because I believed that the goals that it was designed to achieve were noble and necessary. Yet the more time I&#8217;ve spent working with it, the more I have to wonder whether it will in fact solve the problems it was intended to solve. Perhaps I am wrong, and my own perspective blinds me to the brilliance of the language, but overall I see a language that attempts to optimize on constraints that are no longer important while missing the constraints that are.</p>
<p>Thank you for the engaging conversation.</p>
<p>Kurt</p>
]]></content:encoded>
	</item>
</channel>
</rss>
