Friday, November 25, 2011

keensocial.com has been discontinued

My keensocial.com site has been deactivated. Some of my blog posts will not work properly, since they need files from keensocial.com. I'm terribly busy right now, so I will not be able to fix this in the next few days. Sorry for the inconvenience.

Friday, November 04, 2011

Defining a file format using XML and XML Schema (XSD) in C#/Java - V

The complete implementation of the XML file format is divided into the following steps:
  1. XML Schema (XSD) definition
    • Choose an extension for my file type: .xef (XML example file-type)
    • Generate an XML file with data compatible with the previously defined schema
  2. Validate my XML with the schema
  3. Compress huge data fragments
  4. Encrypt sensitive data
  5. Implementation in Java (Compliance with XSD 1.1)
In the previous four posts (I, II, III, IV), we developed an XML file format on the .NET platform. While developing a new application using this technique, I needed some advanced XSD 1.1 features, such as type alternatives and assertions, which are not yet supported by Microsoft.

Type alternatives provide a mechanism to choose a particular type for an element based on the value of an attribute. The xs:alternative element does the trick. The following XSD snippet defines an alternative data type:
<xs:element name="Animal" type="AnimalType">
    <xs:alternative test="@kind eq 'dog'" type="DogType" />
    <xs:alternative test="@kind eq 'cat'" type="CatType" />
</xs:element>

<xs:complexType name="AnimalType">
    <xs:sequence>
        <xs:element name="Name" type="xs:string" />
        <xs:element name="Talk" type="xs:string" />
    </xs:sequence>
    <xs:attribute name="kind" />
</xs:complexType>

<xs:complexType name="DogType">
    <xs:complexContent>
        <xs:extension base="AnimalType">
            <xs:sequence>
                <xs:element name="Veterinarian" type="xs:string" />
            </xs:sequence>
        </xs:extension>
    </xs:complexContent>
</xs:complexType>

<xs:complexType name="CatType">
    <xs:complexContent>
        <xs:extension base="AnimalType">
            <xs:sequence>
                <xs:element name="Owner" type="xs:string" />
            </xs:sequence>
        </xs:extension>
    </xs:complexContent>
</xs:complexType>
The corresponding XML can be the following:
<Animal kind="dog">
    <Name>Woof</Name>
    <Talk>Woof! Woof!</Talk>
    <Veterinarian>Mike</Veterinarian>
</Animal>
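To make the selection rule concrete, here is a minimal Java sketch that mimics by hand what the two xs:alternative tests above do: it reads the kind attribute and names the type an XSD 1.1 validator would pick. The class and method names (TypeAlternativeDemo, selectType) are hypothetical and only for illustration; a real validator performs this selection internally.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration class, not part of the original project.
final class TypeAlternativeDemo {

    // Mirrors the two xs:alternative tests; when neither matches,
    // the declared type (AnimalType) applies.
    static String selectType(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            String kind = root.getAttribute("kind"); // empty string when absent
            switch (kind) {
                case "dog": return "DogType";
                case "cat": return "CatType";
                default:    return "AnimalType";
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectType("<Animal kind=\"dog\"><Name>Woof</Name></Animal>"));
    }
}
```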
An assertion, as the name suggests, is a mechanism to put constraints on a particular element. It is achieved through the xs:assert element, which in a type derived by extension goes inside xs:extension, after the content model:
<xs:complexType name="DogType">
    <xs:complexContent>
        <xs:extension base="AnimalType">
            <xs:sequence>
                <xs:element name="Veterinarian" type="xs:string" />
            </xs:sequence>
            <xs:assert test="(Talk eq 'Woof!') or (Talk eq 'Woof! Woof!')" />
        </xs:extension>
    </xs:complexContent>
</xs:complexType>
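To see what this assertion actually checks, the following minimal Java sketch evaluates the same condition by hand with the XPath API that ships with the JDK. That API implements XPath 1.0, so the XPath 2.0 operator eq is rewritten as = here; this is my adaptation, not what an XSD 1.1 validator runs. The names AssertionDemo and satisfiesAssertion are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration class, not part of the original project.
final class AssertionDemo {

    // XPath 1.0 rewrite of the xs:assert test ('=' instead of 'eq').
    private static final String TEST = "(Talk = 'Woof!') or (Talk = 'Woof! Woof!')";

    // Evaluates the test with the root element as the context node.
    static boolean satisfiesAssertion(String xml) {
        try {
            Element animal = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Boolean) xpath.evaluate(TEST, animal, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate assertion", e);
        }
    }

    public static void main(String[] args) {
        String dog = "<Animal kind=\"dog\"><Name>Woof</Name><Talk>Woof! Woof!</Talk></Animal>";
        System.out.println(satisfiesAssertion(dog));
    }
}
```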
While validating my XML file against the new schema with my C# program, validation kept failing. After some investigation I found that Microsoft doesn't support XSD 1.1 yet, so I had to look for alternatives. Apache Xerces-C++ hasn't implemented XSD 1.1 support yet, but Xerces-J has.

After some searching on the Internet, it appeared that Java 7 had taken over the implementation from Xerces-J. I created a Java desktop application in Eclipse and recreated the C# program. Compression and encryption were done using java.util.zip (GZIPOutputStream) and the Apache security packages (com.sun.org.apache.xml.internal.security), respectively. To validate against XSD 1.1 you need to create a SchemaFactory and pass all your source schemas to it. The following code snippet shows how to validate and parse an XML file:
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Path xmlFile = Paths.get(fileName);
Document xmlDoc = null;
try (BufferedInputStream in = new BufferedInputStream(Files.newInputStream(xmlFile))){

 SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
 sf.setErrorHandler(new ErrorHandler() {

  @Override
  public void error(SAXParseException arg0) throws SAXException {
   System.out.println("Error occurred: " + arg0.getMessage());
  }

  @Override
  public void fatalError(SAXParseException arg0)
    throws SAXException {
   System.out.println("Fatal error occurred: " + arg0.getMessage());
  }

  @Override
  public void warning(SAXParseException arg0) throws SAXException {
   System.out.println("Warning: " + arg0.getMessage());
  }
 });

 //StreamSource schemaDocument = new StreamSource(schemaName);
 //Schema s = sf.newSchema(schemaDocument);
 
 // we want to validate against the following schemas, which already exist in our res folder;
 // ResourceResolver is a custom class (not shown here) that must implement LSResourceResolver
 sf.setResourceResolver(new ResourceResolver());
 Schema s = sf.newSchema(new StreamSource[] {
   new StreamSource(this.getClass().getResourceAsStream("res/xmldsig-core-schema.xsd"), "xmldsig-core-schema.xsd"),
   new StreamSource(this.getClass().getResourceAsStream("res/xenc-schema.xsd"), "xenc-schema.xsd"),
   new StreamSource(this.getClass().getResourceAsStream("res/set.xsd"), "set.xsd")
 });
 
 Validator v = s.newValidator();
 StreamSource instanceDocument = new StreamSource(fileName);
 v.validate(instanceDocument);
 
 xmlDoc = builder.parse(in);
  
} catch(SAXException | IOException e) {
  e.printStackTrace();
  System.exit(1);
}
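One caveat worth checking in your own setup: whether xs:alternative and xs:assert are accepted depends on which SchemaFactory implementation is found at run time. The Xerces-J XSD 1.1 distribution documents a separate schema language URI, http://www.w3.org/XML/XMLSchema/v1.1, for requesting an XSD 1.1 factory. The following minimal sketch probes whether a factory for a given schema language is available; the class and method names (Xsd11Probe, isAvailable) are hypothetical, and the result depends entirely on what is on your classpath.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

// Hypothetical illustration class, not part of the original project.
final class Xsd11Probe {

    // Schema language URI documented by the Xerces-J XSD 1.1 distribution.
    static final String XSD11_URI = "http://www.w3.org/XML/XMLSchema/v1.1";

    // SchemaFactory.newInstance throws IllegalArgumentException when no
    // implementation supports the requested schema language.
    static boolean isAvailable(String schemaLanguage) {
        try {
            SchemaFactory.newInstance(schemaLanguage);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("XSD 1.0 factory: " + isAvailable(XMLConstants.W3C_XML_SCHEMA_NS_URI));
        System.out.println("XSD 1.1 factory: " + isAvailable(XSD11_URI));
    }
}
```

If the second line prints false, the validation code above runs with an XSD 1.0 processor and will reject the XSD 1.1 constructs in the schema.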
The set.xsd file has been extended to work with compressed and encrypted elements.
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:xenc="http://www.w3.org/2001/04/xmlenc#" elementFormDefault="qualified">

  <xsd:import namespace="http://www.w3.org/2001/04/xmlenc#"></xsd:import>

  <xsd:element name="XmlFileFormat">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Value1" type="xsd:integer"></xsd:element>
        <xsd:element name="Value2" type="xsd:string"></xsd:element>
        <xsd:element name="Value3" type="xsd:float"></xsd:element>
        <xsd:element name="Settings2">
          <xsd:complexType>
            <xsd:sequence minOccurs="0" maxOccurs="1">
              <xsd:element name="ValueX" type="xsd:integer">
              </xsd:element>
              <xsd:element name="ValueBulks">
                <xsd:complexType mixed="true">
                  <xsd:sequence>
                    <xsd:element name="ValueBulk" type="xsd:float"
                        minOccurs="0" maxOccurs="unbounded"/>
                    <xsd:element ref="xenc:EncryptedData" minOccurs="0" maxOccurs="1" />
                  </xsd:sequence>
                  <xsd:attribute name="Compress" use="optional" type="xsd:boolean" />
                  <xsd:attribute name="Encrypt" use="optional" type="xsd:boolean" />
                </xsd:complexType>
              </xsd:element>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
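As an illustration of how a reader of this format might honor the Compress attribute, here is a minimal Java sketch. It assumes that compressed content is stored as Base64-encoded GZIP text inside ValueBulks; this post does not specify that encoding, so treat it as my assumption. The names ValueBulksDemo, pack and readBulks are hypothetical, and java.util.Base64 requires Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical illustration class, not part of the original project.
final class ValueBulksDemo {

    // Gzips the text and Base64-encodes it so it can live inside an XML element
    // (assumed storage format).
    static String pack(String text) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
                gzip.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return Base64.getEncoder().encodeToString(buffer.toByteArray());
        } catch (Exception e) {
            throw new IllegalStateException("Compression failed", e);
        }
    }

    // Returns the text of the first ValueBulks element, unpacking it when Compress is set.
    static String readBulks(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            Element bulks = (Element) root.getElementsByTagName("ValueBulks").item(0);
            String content = bulks.getTextContent().trim();
            String flag = bulks.getAttribute("Compress").trim();
            // xsd:boolean allows "true"/"false" and "1"/"0"
            boolean compressed = flag.equals("true") || flag.equals("1");
            if (!compressed) {
                return content;
            }
            byte[] raw = Base64.getDecoder().decode(content);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(raw))) {
                byte[] chunk = new byte[4096];
                int n;
                while ((n = gzip.read(chunk)) != -1) {
                    out.write(chunk, 0, n);
                }
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException("Could not read ValueBulks", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<XmlFileFormat><Settings2><ValueBulks Compress=\"true\">"
                + pack("1.5 2.5 3.5") + "</ValueBulks></Settings2></XmlFileFormat>";
        System.out.println(readBulks(xml));
    }
}
```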

The complete source code can be downloaded from the following link:
http://keensocial.freeiz.com/blogs/xmlfileformat/xmlfileformat-j.zip