May 24, 2011

JAXB and DTD - Apache log4j Example

There is a common misconception that JAXB requires an XML Schema.  This is understandable since JAXB offers the ability to derive Java classes from an XML schema, and generate XML schemas from annotated Java classes.  In reality the JAXB annotations are used to map XML to objects, and although not part of the JAXB specification, the XJC tool offers the ability to generate Java classes from a DTD.  In this post we will use JAXB to interact with XML that corresponds to the Apache log4j DTD.


Apache log4j DTD

From the Apache log4j download, the DTD can be found in the following location:

<log4j_home>/source/main/resources/org/apache/log4j/xml/

Running XJC - Generate the Java Model

Since we are using XJC on a DTD file and not an XML schema we need to specify the -dtd flag:

xjc -p blog.log4j -d out -dtd log4j.dtd

This will produce the following classes:

blog\log4j\Appender.java
blog\log4j\AppenderRef.java
blog\log4j\Category.java
blog\log4j\CategoryFactory.java
blog\log4j\ConnectionSource.java
blog\log4j\DataSource.java
blog\log4j\ErrorHandler.java
blog\log4j\Filter.java
blog\log4j\Layout.java
blog\log4j\Level.java
blog\log4j\Log4JConfiguration.ja
blog\log4j\Log4JData.java
blog\log4j\Log4JEvent.java
blog\log4j\Log4JEventSet.java
blog\log4j\Log4JLocationInfo.jav
blog\log4j\Log4JProperties.java
blog\log4j\Logger.java
blog\log4j\LoggerFactory.java
blog\log4j\LoggerRef.java
blog\log4j\ObjectFactory.java
blog\log4j\Param.java
blog\log4j\Plugin.java
blog\log4j\Priority.java
blog\log4j\Renderer.java
blog\log4j\RollingPolicy.java
blog\log4j\Root.java
blog\log4j\RootRef.java
blog\log4j\ThrowableRenderer.jav
blog\log4j\TriggeringPolicy.java

XML Input (sample1.xml)

The input file we will be using for this example comes from the following location in the Apache log4j download:

<log4j_home>/site/apidocs/org/apache/log4j/examples/doc-files/

One interesting thing to note is that some of the element and attribute names contain the colon character.  If this XML file corresponded to an XML schema, the portion before the colon would represent a namespace prefix, but in this case the colon is simply part of the node name.

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration PUBLIC "-//APACHE//DTD LOG4J 1.2//EN" "log4j.dtd">
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.

-->

<log4j:configuration xmlns:log4j='http://jakarta.apache.org/log4j/'>

 <appender name="STDOUT" class="org.apache.log4j.ConsoleAppender">
           <layout class="org.apache.log4j.PatternLayout">
             <param name="ConversionPattern"
      value="%d %-5p [%t] %C{2} (%F:%L) - %m%n"/>
           </layout>     
 </appender>
 
 <category name="org.apache.log4j.xml">
   <priority value="info" />
 </category>
 
 <root>
    <priority value ="debug" />
       <appender-ref ref="STDOUT" />
 </root>
 
</log4j:configuration>

Below is the relevant fragment from the Apache log4j.dtd:

<!-- A configuration element consists of optional renderer
elements,appender elements, categories and an optional root
element. -->

<!ELEMENT log4j:configuration (renderer*, throwableRenderer?,
                               appender*,plugin*, (category|logger)*,root?,
                               (categoryFactory|loggerFactory)?)>

<!-- The "threshold" attribute takes a level value below which -->
<!-- all logging statements are disabled. -->

<!-- Setting the "debug" enable the printing of internal log4j logging   -->
<!-- statements.                                                         -->

<!-- By default, debug attribute is "null", meaning that we not do touch -->
<!-- internal log4j logging settings. The "null" value for the threshold -->
<!-- attribute can be misleading. The threshold field of a repository  -->
<!-- cannot be set to null. The "null" value for the threshold attribute -->
<!-- simply means don't touch the threshold field, the threshold field   --> 
<!-- keeps its old value.                                                -->
     
<!ATTLIST log4j:configuration
  xmlns:log4j              CDATA #FIXED "http://jakarta.apache.org/log4j/" 
  threshold                (all|trace|debug|info|warn|error|fatal|off|null) "null"
  debug                    (true|false|null)  "null"
  reset                    (true|false) "false"
>
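
The ATTLIST above declares default values, and a parser that reads the DTD supplies those defaults even when the document omits the attributes. The following is a minimal sketch of that behaviour using only the JDK's SAX parser. The class name DtdDefaultsSketch and the trimmed inline DTD are assumptions made for illustration; they are not part of log4j or of the XJC output.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class DtdDefaultsSketch {

    // Trimmed copy of the ATTLIST above, placed in an internal DTD subset so
    // the sketch needs no external log4j.dtd file (an assumption for brevity).
    static final String SAMPLE =
        "<!DOCTYPE log4j:configuration [\n"
        + "<!ELEMENT log4j:configuration EMPTY>\n"
        + "<!ATTLIST log4j:configuration\n"
        + "  xmlns:log4j CDATA #FIXED \"http://jakarta.apache.org/log4j/\"\n"
        + "  threshold (all|trace|debug|info|warn|error|fatal|off|null) \"null\"\n"
        + "  debug (true|false|null) \"null\"\n"
        + "  reset (true|false) \"false\">\n"
        + "]>\n"
        + "<log4j:configuration/>";

    // Returns the attributes the parser reports for the single element in the input.
    static Map<String, String> readRootAttributes(String xml) {
        final Map<String, String> result = new LinkedHashMap<>();
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(false); // treat "log4j:configuration" as a plain name
            spf.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    // With namespace processing off, attribute names are read via getQName.
                    for (int i = 0; i < atts.getLength(); i++) {
                        result.put(atts.getQName(i), atts.getValue(i));
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("Parsing the sample failed", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // The root element is written without attributes; any entries printed
        // here were supplied by the parser from the DTD defaults.
        readRootAttributes(SAMPLE).forEach((name, value) ->
            System.out.println(name + " = " + value));
    }
}
```

The literal string "null" reported for threshold and debug is the DTD default described in the comments above, not a Java null. With a parser that is not namespace aware, the #FIXED xmlns:log4j declaration should also show up as an ordinary attribute.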

Log4JConfiguration

If you look at the blog.log4j.Log4JConfiguration class you will see that the colon character appears in the names specified in both the @XmlRootElement and @XmlAttribute annotations:

package blog.log4j;

import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlElements;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;
import javax.xml.bind.annotation.adapters.CollapsedStringAdapter;
import javax.xml.bind.annotation.adapters.NormalizedStringAdapter;
import javax.xml.bind.annotation.adapters.XmlJavaTypeAdapter;

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "", propOrder = {
    "renderer",
    "throwableRenderer",
    "appender",
    "plugin",
    "categoryOrLogger",
    "root",
    "categoryFactoryOrLoggerFactory"
})
@XmlRootElement(name = "log4j:configuration")
public class Log4JConfiguration {

    @XmlAttribute(name = "xmlns:log4j")
    @XmlJavaTypeAdapter(NormalizedStringAdapter.class)
    protected String xmlnsLog4J;

    // Other content omitted

}

Demo Code

Since there is a colon (':') character in some of the node names, we need a parser that is not namespace aware.  One way to do this is to create a SAX parser that is not namespace aware (the default for SAXParserFactory) and set a JAXB UnmarshallerHandler as its ContentHandler.  Once parsing is finished we can obtain the unmarshalled object from the UnmarshallerHandler.

package blog.log4j;

import java.io.FileInputStream;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.UnmarshallerHandler;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class Demo {

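    public static void main(String[] args) throws Exception {
        // SAXParserFactory is not namespace aware by default; setting it
        // explicitly documents that names such as "log4j:configuration"
        // are treated as plain node names.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(false);
        SAXParser sp = spf.newSAXParser();
        XMLReader xr = sp.getXMLReader();

        // JAXB receives the SAX events through an UnmarshallerHandler.
        JAXBContext jc = JAXBContext.newInstance("blog.log4j");
        Unmarshaller unmarshaller = jc.createUnmarshaller();
        UnmarshallerHandler unmarshallerHandler = unmarshaller.getUnmarshallerHandler();
        xr.setContentHandler(unmarshallerHandler);

        // Parse the input; try-with-resources closes the stream afterwards.
        try (FileInputStream xmlStream = new FileInputStream("src/blog/log4j/sample1.xml")) {
            InputSource xmlSource = new InputSource(xmlStream);
            xr.parse(xmlSource);
        }

        // The unmarshalled object is available once parsing has finished.
        Log4JConfiguration config = (Log4JConfiguration) unmarshallerHandler.getResult();

        // Marshal the object back to XML to confirm the round trip.
        Marshaller marshaller = jc.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
        marshaller.marshal(config, System.out);
    }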

}
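
To see what the namespace awareness setting changes, the sketch below parses a cut-down version of the sample document twice and records how SAX reports the root element each time. It uses only the JDK; the class name NamespaceAwarenessSketch and the shortened input are assumptions made for illustration.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceAwarenessSketch {

    // Shortened stand-in for sample1.xml (no DOCTYPE, one child element).
    static final String SAMPLE =
        "<log4j:configuration xmlns:log4j='http://jakarta.apache.org/log4j/'>"
        + "<root/></log4j:configuration>";

    // Returns {uri, localName, qName} as reported for the root element.
    static String[] rootNames(boolean namespaceAware) {
        final String[] names = new String[3];
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(namespaceAware);
            spf.newSAXParser().parse(new InputSource(new StringReader(SAMPLE)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    if (names[2] == null) { // record the first (root) element only
                        names[0] = uri;
                        names[1] = localName;
                        names[2] = qName;
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("Parsing the sample failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        for (boolean aware : new boolean[] {false, true}) {
            String[] n = rootNames(aware);
            System.out.println("namespaceAware=" + aware
                + " uri='" + n[0] + "' localName='" + n[1] + "' qName='" + n[2] + "'");
        }
    }
}
```

With namespace processing off, the whole name, colon included, arrives as the qualified name and the local name is empty, which lines up with the name used in @XmlRootElement(name = "log4j:configuration"). With it on, the parser splits the name into a namespace URI and the local name "configuration", which would no longer match the DTD-derived mapping.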

5 comments:

  1. Great post, man. You have covered the topic in great detail with a code example. JAXB has been core to Java-to-XML and XML-to-Java object marshalling, and thanks for clearing up the misconception that JAXB requires an XML Schema and can't work with a DTD.

    Javin
    Why String is immutable in Java

  2. great xjc example. I've been using jaxb on a more complex dtd, which causes xjc to throw name conflict errors, like this one:

    [ERROR] A class/interface with the same name '' is already in use. Use a class customization to resolve this conflict.

    I've been trying to resolve this with external customisations, as mentioned here http://jaxb.java.net/tutorial/section_5_3-Overriding-Names.html - however that doesn't seem to work for me.

    Have you been able to use customisation for DTDs? Would be great if you could post an example, as I haven't been able to find anything.

    Cheers,
    Deniz

  3. Hi Deniz,

    I haven't tried a customization file with a DTD. You may want to ask this question on the JAXB Java.net forum:
    - http://www.java.net/forums/glassfish/metro-and-jaxb

    I'll try to put together an example, but the JAXB forum or StackOverflow.com might get you an answer faster.

    -Blaise

  4. really helpful thanks a lot
