Why roll your own?


Why should a developer use their own code when there's a ton of high quality open source already out there?

I hear a lot of different reasons:

  1. It's too hard to learn someone else's stuff.
  2. How do I know it is good?
  3. How will we maintain it?  If there are bugs, I have to be able to fix them.
  4. It doesn't fit our design.
All of these are good reasons IF in fact, they are good reasons.  Too few do the work to figure this out.
  1. It is easier to write and understand your own code base than it is to understand what someone else did.  But most people don't even try.  Look folks, this is why they pay you the big bucks ... to understand complex stuff.
  2. You have to look at it.  And I mean really look.  There are a lot of ways to find good stuff.  Is anyone else using it? That's a good sign.  Is it frequently updated? Another good one.  Does it have a vibrant community around it? Another good one.  If the last update was two years ago, you may be SOOL (it's an acronym that basically means out of luck).
  3. That's one of the cool things about maintenance if you find good Open Source because a vibrant open source community will both accept and fix bugs.  And if you are serious (and I mean truly serious) about #1, and choose that source base, you'll also become a contributor back to it.  So not only can you adopt the fixes of others, but you can also make your own fixes.
  4. This can often be a problem.  How flexible is your stack?  How many more dependencies can you take?  If you have a good stack, you are much more likely to find open source that fits onto it without additional dependencies.  You may be challenged around upgrades stack components though, and that's where contributing back becomes important.
What's the value here for adopting open source?
  • Time To Market -- Adopting somebody else's code can let you focus on other things that are necessary to bring your product to market.
  • Maintenance -- It isn't just new development that you could be saving, it might also be on maintenance.  A good open source project will also maintain the solution for you.  And that has tremendous value.  Software costs are often quoted as 20% development, 80% maintenance.
  • Reputation -- Using good open source can enhance the reputation of your own products (if the open source base has a good reputation).  Contributing back can also be a significant reputation enhancer for your organization.
How does this fit in with standards?  A lot of open source projects integrate using standards. Some even BECOME standards (e.g., Schematron).  That's one of the values of standards.

Tomorrow, I'll talk about some cool new open source projects for CDA.

   -- Keith

Implementing Partial CDA Validation


In Partial Rejection and Levels of Validity in CDA (or anything else for that matter) I discussed levels of validation of CDA content.  Now I have to make that real.  There are two different ways to go about it.  As you might recall, here are a few of the levels in the partial validation hierarchy:

Level 0: Totally bogus content.  Is this even XML?
Level 1: The CDA Header is valid.
Level 2a: Level 1 + the narrative content is valid according to the CDA Schema
Level 2b: Level 2 + the LOINC codes for documents and sections are recognized as valid.

The first level is just doing an XML Parse without validation.  This will ensure content is well-formed XML.  If you fail this test, no need to go further.

The next level validates everything up through nonXMLBody or structuredBody.  This is easy.  Craft a new CDA Schema by editing POCD_MT000040.xsd as follows (delete struck out material and insert underlined material):

  <xs:complexType name="POCD_MT000040.ClinicalDocument">
      <xs:element name="realmCode" type="CS" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="typeId" type="POCD_MT000040.InfrastructureRoot.typeId"/>
      <xs:element name="templateId" type="II" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="id" type="II"/>
      <xs:element name="code" type="CE"/>
      <xs:element name="title" type="ST" minOccurs="0"/>
      <xs:element name="effectiveTime" type="TS"/>
      <xs:element name="confidentialityCode" type="CE"/>
      <xs:element name="languageCode" type="CS" minOccurs="0"/>
      <xs:element name="setId" type="II" minOccurs="0"/>
      <xs:element name="versionNumber" type="INT" minOccurs="0"/>
      <xs:element name="copyTime" type="TS" minOccurs="0"/>
      <xs:element name="recordTarget" type="POCD_MT000040.RecordTarget" maxOccurs="unbounded"/>
      <xs:element name="author" type="POCD_MT000040.Author" maxOccurs="unbounded"/>
      <xs:element name="dataEnterer" type="POCD_MT000040.DataEnterer" minOccurs="0"/>
      <xs:element name="informant" type="POCD_MT000040.Informant12" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="custodian" type="POCD_MT000040.Custodian"/>
      <xs:element name="informationRecipient" type="POCD_MT000040.InformationRecipient" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="legalAuthenticator" type="POCD_MT000040.LegalAuthenticator" minOccurs="0"/>
      <xs:element name="authenticator" type="POCD_MT000040.Authenticator" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="participant" type="POCD_MT000040.Participant1" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="inFulfillmentOf" type="POCD_MT000040.InFulfillmentOf" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="documentationOf" type="POCD_MT000040.DocumentationOf" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="relatedDocument" type="POCD_MT000040.RelatedDocument" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="authorization" type="POCD_MT000040.Authorization" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="componentOf" type="POCD_MT000040.Component1" minOccurs="0"/>
      <xs:element name="component" type="POCD_MT000040.Component2"/>
      <xs:any processContents/>
    <xs:attribute name="nullFlavor" type="NullFlavor" use="optional"/>
    <xs:attribute name="classCode" type="ActClinicalDocument" use="optional" fixed="DOCCLIN"/>
    <xs:attribute name="moodCode" type="ActMood" use="optional" fixed="EVN"/>

This will result in the schema processor ignoring anything after the CDA Header.  Or will it? Actually, this will fail, as the schema now violates the Unique Particle Attribution constraint of XML Schema 1.0.  However, if you could be sure that componentOf would be present, setting minOccurs="1" on that declaration resolves the problem.  But not every CCDA requires that, and so that little fix won't work.  OK, what if we change the definition of that last component so that it can contain anything?  Yep, that works.

It should look something like this:
  <xs:complexType name="POCD_MT000040.Component2">
      <xs:any processContents="skip" />
    <xs:anyAttribute processContents="skip"/>

So, now <component> can contain any sort of well formed XML content, and your "header validator" won't care.

An alternative implementation would use a specialized XSL identity template with some exceptions to skip any unrecognized content after componentOf, and simply delete the component element definition in POCD_MT000040.ClinicalDocument.

The next challenge is validating narrative only content.  For that, you want to tweak section definitions within the document so that you don't care about validating any content that isn't <text>, <title> or perhaps <code> within <section>, and that you validate subsection content.

That's a bit trickier.  For this case, you could define a <component> element at the top level which would be overriden by specializations of <component> defined within the header or entries (which you really don't care about), but which would be processed when matched by <xs:any processContents='lax'>.  However, rather than do that, my recommendation would be to create a specialized identity template that copies only what you want to validate within sections, and skips anything you don't care to validate.  Then you can just use the standard CDA Schema to validate the content without any changes (because all content within a section is optional according to the schema).

In that way, what you've just done is eliminated the potentially invalid content.  There's extra value there, because now what you have is a transform of the original content which, if "narrative valid", is probably safe to keep around for viewing and transformation by a stylesheet.

That identity template is a simple exercise in software engineering.  I'll leave it to the interested reader to figure it out.

Oh, one final thing: Don't be dumb and validate in easy to hard order.  Validate in the other order, because it will cost less in processing time for good documents.  Let the bad ones pay the performance penalty for multiple validation stages.

   -- Keith