Xtrans: Difference between revisions
Revision as of 19:48, 11 October 2016
Applications
JavaImportChecker
This class is a pseudo serializer to be applied after JavaTransformer. It checks the import statements in a Java source file and reports:
- superfluous imports which are never used
- missing imports, reported either because:
  - the class is in the same package
  - classes prefixed with their package name (for example java.util.Date) are not properly recognized by the tool; these should also be imported explicitly
  - inherited enums are not properly recognized by the tool
The tool treats an identifier as a class name only when it starts with an uppercase letter [A-Z] and contains a lowercase letter [a-z] somewhere after it.
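The heuristic and the two kinds of report can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code of JavaImportChecker: the class `ImportCheckSketch` and its methods are hypothetical, the sketch scans raw source text instead of the output of JavaTransformer, and it reads "import only" as "imported but never used" and "use only" as "used but never imported".

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of the class-name heuristic; not part of Xtrans. */
public class ImportCheckSketch {
    // Uppercase first letter, then at least one lowercase letter somewhere later.
    static final Pattern CLASS_NAME = Pattern.compile("\\b[A-Z]\\w*[a-z]\\w*");
    // Plain single-class imports only; wildcard and static imports do not match.
    static final Pattern IMPORT_LINE =
            Pattern.compile("^[ \\t]*import[ \\t]+([\\w.]+)[ \\t]*;", Pattern.MULTILINE);

    /** Simple names of all explicitly imported classes. */
    static Set<String> importedNames(String source) {
        Set<String> names = new TreeSet<>();
        Matcher m = IMPORT_LINE.matcher(source);
        while (m.find()) {
            String qualified = m.group(1);
            names.add(qualified.substring(qualified.lastIndexOf('.') + 1));
        }
        return names;
    }

    /** Names outside the import statements that look like class names. */
    static Set<String> usedNames(String source) {
        String body = IMPORT_LINE.matcher(source).replaceAll("");
        Set<String> names = new TreeSet<>();
        Matcher m = CLASS_NAME.matcher(body);
        while (m.find()) {
            names.add(m.group());
        }
        return names;
    }

    /** Imported but never used. */
    static Set<String> importOnly(String source) {
        Set<String> result = importedNames(source);
        result.removeAll(usedNames(source));
        return result;
    }

    /** Used but never imported (unfiltered, see the note below the code). */
    static Set<String> useOnly(String source) {
        Set<String> result = usedNames(source);
        result.removeAll(importedNames(source));
        return result;
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "import java.util.Date;",
                "import java.util.Iterator;",
                "class Sample { Date created; Timestamp changed; URL home; }");
        for (String name : importOnly(source)) {
            System.out.println("import only: " + name);
        }
        for (String name : useOnly(source)) {
            System.out.println("use only: " + name);
        }
    }
}
```

Because of the lowercase requirement, all-uppercase names such as URL are never checked. The sketch does not filter the declared class itself, classes from the same package, or java.lang, so its "use only" list is noisier than a real checker's would be.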
All sources in a project can be checked with a shell command line like:
find ../xtool/src -iname "*.java" | xargs -l -ißß java -jar dist/xtrans.jar -java ßß -jimp
The corresponding output was:
SchemaBean
PathStack
XPathLink
XPathSelect
XmlnsPrefix
XtoolServlet
import only: Enumeration
import only: InputStream
import only: ServletConfig
import only: ServletContext
import only: ZipFile
SchemaBeanBase
import only: Date
import only: Timestamp
PathElement
IndexPage
import only: HttpSession
import only: Iterator
Messages
SchemaList
XmlnsXref
NonClosingInputStream
SchemaArray
use only: Date
Bugs
General Problems
- Though most transformers convert from the raw (specific) format to an XMLized representation, there are a few exceptions where general binary or text files are converted to the specific format, which is then wrapped into XML. Examples are Base64, Quoted-Printable and Morse code.
- Most transformers store values in XML elements, but sometimes it seemed easier to store them in attributes of elements. DTA and Datev are examples of the latter case.
- For formats with many different tags (SWIFT for example) the question arises whether such tags are syntax or data. These tags can be converted to id attributes of a generalized XML "field" element, or a separate element can be generated for each such tag. The SwiftTransformer takes the latter approach.
Test
- Not all format conversions are precisely reversible.
- There are only a few test cases.
Incomplete Transformers
- general.XMLTransformer - insufficient serialization of entities; the serializer should be replaced by Apache's
- general.CountingTransformer - cannot generate, but serializes any XML to a sorted list with counts for all elements, and the accumulated length of their direct character content
- net.URITransformer - the set of supported URI schemes is incomplete, and serializing is not implemented.
- organizer.LDIFTransformer - not well tested, and serializing is not implemented.
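The kind of statistics described for general.CountingTransformer can be approximated with a small SAX handler. The following is a sketch under assumptions, not the Xtrans implementation: the class name `ElementCountSketch`, the tab-separated output and the choice of qualified names as sort keys are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: counts elements and their direct character content. */
public class ElementCountSketch extends DefaultHandler {
    // Element name -> {number of occurrences, accumulated direct text length};
    // the TreeMap keeps the list sorted by name.
    final Map<String, int[]> stats = new TreeMap<>();
    // Stack of currently open elements; the top one owns incoming characters.
    private final Deque<String> open = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        stats.computeIfAbsent(qName, key -> new int[2])[0]++;
        open.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) {
            stats.get(open.peek())[1] += length;
        }
    }

    /** Parses the XML text and returns the sorted statistics. */
    static Map<String, int[]> count(String xml) throws Exception {
        ElementCountSketch handler = new ElementCountSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.stats;
    }

    public static void main(String[] args) throws Exception {
        for (Map.Entry<String, int[]> entry : count("<a><b>xy</b><b>z</b>q</a>").entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue()[0] + "\t" + entry.getValue()[1]);
        }
    }
}
```

Only text that is a direct child of an element is added to that element's length; text inside nested elements is attributed to the nested element. Whitespace between elements is counted like any other character content.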
Hints for Developers
Xtrans currently processes only a limited set of formats. You are encouraged to:
- play with the format transformer classes,
- email any suggestions for improvement,
- contribute patches for corrections,
- contribute new transformer classes.
Coding conventions
Please try to remain close to the current programming style:
- Write Javadoc comments before all methods and public members.
- Note that the Java sources are compiled with UTF-8 source encoding:
<javac srcdir="${src.home}" destdir="${build.classes}" listfiles="yes" encoding="utf8" source="1.4" target="1.4" debug="${javac.debug}" debuglevel="${javac.debuglevel}">
- Determine the proper accents and non-ASCII characters, and write them in Unicode in the Java source files. Use a Unicode-enabled editor that handles UTF-8 properly; write some Unicode characters in the header comment so that the editor can detect the UTF-8 encoding.
- Use reliable sources for the format definition like RFCs or ISO standards, and document them in the Javadoc header of the class.
Reversibility
The transformers should try to serialize XML to exactly the same specific format from which they are able to generate XML. The test Ant targets perform a "generate - serialize - binary compare" sequence to check the reversibility of the transformation.
Some formats don't have a well-defined canonical representation. In JCL, for example, the line breaks and the spaces for field separation are lost in the XML representation, and cannot be reproduced exactly by the serializer. In these cases, repeated "generate - serialize" sequences should eventually produce an identical result.
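Both checks can be expressed compactly. The sketch below is an illustration only: `RoundTripSketch`, the use of `Function` for the two directions, and the whitespace-separated toy format are assumptions made for this example and do not reflect the actual transformer API or the Ant test targets.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.Function;

/** Hypothetical sketch of the two reversibility checks; not the Xtrans API. */
public class RoundTripSketch {

    /** Strict check: generate - serialize - binary compare. */
    static boolean isReversible(byte[] original,
            Function<byte[], String> generate, Function<String, byte[]> serialize) {
        return Arrays.equals(original, serialize.apply(generate.apply(original)));
    }

    /** Weaker check: the result of one round trip must itself be reversible. */
    static boolean isStableAfterOnePass(byte[] original,
            Function<byte[], String> generate, Function<String, byte[]> serialize) {
        byte[] once = serialize.apply(generate.apply(original));
        return isReversible(once, generate, serialize);
    }

    /** Toy generator: whitespace-separated fields become field elements (no escaping). */
    static String toyGenerate(byte[] raw) {
        StringBuilder xml = new StringBuilder("<record>");
        for (String field : new String(raw, StandardCharsets.UTF_8).trim().split("\\s+")) {
            xml.append("<field>").append(field).append("</field>");
        }
        return xml.append("</record>").toString();
    }

    /** Toy serializer: fields are joined with exactly one space. */
    static byte[] toySerialize(String xml) {
        String text = xml.replace("<record>", "").replace("</record>", "")
                .replace("</field><field>", " ")
                .replace("<field>", "").replace("</field>", "");
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = "JOB  A".getBytes(StandardCharsets.UTF_8); // two spaces
        System.out.println("reversible: "
                + isReversible(raw, RoundTripSketch::toyGenerate, RoundTripSketch::toySerialize));
        System.out.println("stable after one pass: "
                + isStableAfterOnePass(raw, RoundTripSketch::toyGenerate, RoundTripSketch::toySerialize));
    }
}
```

In the toy format the number of spaces between fields is lost, as with JCL, so the strict check fails for the input with two spaces while the weaker check succeeds.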
Future Extensions
- more text processing formats:
- (La)TeX - similar to RTF
- dot instruction oriented formats: IBM DCF, nroff, troff, perldoc
- binary formats like IBM DCA/RFT, Siemens Hit, WordPerfect
- common tagset for text processing features
- raster image processing formats:
- TIFF
- EXIF - at least the header
- GIF, BMP etc.
- vector image processing formats with target SVG:
- WMF
- Flash?
- RTF DO, AmiPro, WordPerfect Graphics ...
- ZIP file tree pseudo transformer