How to validate an XML file against an XSD schema and list all validation errors
Recently, in my project, I needed to validate an XML file against an XSD schema file and list all the validation errors. XSD (XML Schema Definition) is a way to specify metadata (schema, constraints, etc.) about XML data. To validate an XML file against an XSD file, we normally do something like this:
[groovy]
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.Schema
import javax.xml.validation.SchemaFactory
import javax.xml.validation.Validator
import org.xml.sax.SAXParseException

File datafile = new File(xmlFilePath)   // the XML file to validate
File schemaFile = new File(xsdFilePath) // the XSD schema file
try {
    SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
    Schema schema = factory.newSchema(schemaFile) // compiled schema
    Validator validator = schema.newValidator()   // validator for that schema
    // Passing the File itself (rather than a FileReader) lets the parser
    // honour the encoding declared in the XML and leaves no reader unclosed.
    validator.validate(new StreamSource(datafile))
} catch (SAXParseException ex) {
    log.error "Error at Line: ${ex.lineNumber} Column: ${ex.columnNumber} in ${datafile.name}"
    log.error "Message: ${ex.message}"
}
[/groovy]
validate() throws a SAXParseException as soon as it encounters the first validation error in the XML file. We can use the ‘lineNumber’ and ‘columnNumber’ properties of the SAXParseException object to find out exactly where the error is. If there are no validation errors (i.e. the XML file conforms to the specified schema), no exception is thrown.
However, with this approach we cannot get information about all the validation errors in the XML file in a single run. If an XML file has multiple validation errors (say four), then on the first run the exception is thrown as soon as the first error is encountered, and we learn nothing about the remaining three. To find the subsequent errors, we have to fix the previous error and validate the file again, repeating until no exception is thrown.
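To see this stop-at-the-first-error behaviour in isolation, here is a minimal self-contained sketch. The inline schema and document (the `people` and `age` elements) are made up purely for illustration and are not from my project. The document has invalid values on lines 2 and 4, yet only one exception ever reaches the catch block.

[groovy]
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.Schema
import javax.xml.validation.SchemaFactory
import org.xml.sax.SAXParseException

// Hypothetical schema: <people> holds one or more integer <age> elements.
String xsd = '''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="people">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="age" type="xs:int" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'''

// Hypothetical document with two invalid values, on lines 2 and 4.
String xml = '''<people>
  <age>thirty</age>
  <age>42</age>
  <age>forty</age>
</people>'''

SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)))

SAXParseException firstError = null
try {
    // No ErrorHandler is set, so the validator throws on the first error it meets.
    schema.newValidator().validate(new StreamSource(new StringReader(xml)))
} catch (SAXParseException ex) {
    firstError = ex // the problem on line 4 is never reported
}
println "Stopped at line ${firstError?.lineNumber}: ${firstError?.message}"
[/groovy]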
To list all errors in one go, we need to override the error-handling behaviour of validate(): we set a custom ErrorHandler on the validator that adds each validation error to a list instead of throwing it.
All we need to do is:
[groovy]
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXException
import org.xml.sax.SAXParseException

Validator validator = schema.newValidator()
List<SAXParseException> exceptions = [] // empty list to collect validation problems
// Custom error handler that records each problem instead of throwing it.
validator.setErrorHandler(new ErrorHandler() {
    @Override
    void warning(SAXParseException exception) throws SAXException {
        exceptions << exception
    }
    @Override
    void fatalError(SAXParseException exception) throws SAXException {
        exceptions << exception
    }
    @Override
    void error(SAXParseException exception) throws SAXException {
        exceptions << exception
    }
})
// Run the validation; errors now land in the list instead of aborting the run.
validator.validate(new StreamSource(datafile))
[/groovy]
Now we can log the errors:
[groovy]
if (exceptions) {
    exceptions.each { ex ->
        log.error "Error at Line: ${ex.lineNumber} Column: ${ex.columnNumber} in ${datafile.name}"
        log.error "Message: ${ex.message}"
    }
    return false
} else {
    log.info("No errors found in ${datafile.name}")
    return true
}
[/groovy]
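Putting the pieces together, the whole approach can be sketched as one small helper. This is only an illustration under some assumptions: the helper name `collectValidationErrors`, its parameters, and the inline `people`/`age` schema and document are hypothetical and not from my project. One deliberate difference from the handler above: this sketch rethrows fatal errors and records them in the catch block, because a parser is not required to continue after a fatal error (such as malformed XML) and may throw from validate() anyway.

[groovy]
import javax.xml.XMLConstants
import javax.xml.transform.Source
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import javax.xml.validation.Validator
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXException
import org.xml.sax.SAXParseException

// Hypothetical helper: returns every problem reported while validating
// xmlSource against the schema read from xsdSource.
List<SAXParseException> collectValidationErrors(Source xmlSource, Source xsdSource) {
    List<SAXParseException> problems = []
    SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
    Validator validator = factory.newSchema(xsdSource).newValidator()
    validator.setErrorHandler(new ErrorHandler() {
        @Override
        void warning(SAXParseException exception) throws SAXException {
            problems << exception
        }
        @Override
        void error(SAXParseException exception) throws SAXException {
            problems << exception
        }
        @Override
        void fatalError(SAXParseException exception) throws SAXException {
            throw exception // recorded once, in the catch block below
        }
    })
    try {
        validator.validate(xmlSource)
    } catch (SAXParseException fatal) {
        problems << fatal // validation could not continue past this point
    }
    return problems
}

// Hypothetical schema: <people> holds one or more integer <age> elements.
String xsd = '''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="people">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="age" type="xs:int" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'''

// Hypothetical document with two invalid values, on lines 2 and 4.
String xml = '''<people>
  <age>thirty</age>
  <age>42</age>
  <age>forty</age>
</people>'''

List<SAXParseException> errors = collectValidationErrors(
        new StreamSource(new StringReader(xml)),
        new StreamSource(new StringReader(xsd)))
errors.each { println "Line ${it.lineNumber}, column ${it.columnNumber}: ${it.message}" }
[/groovy]

Note that a parser may report more than one message for a single invalid value, so the list can be longer than the number of bad elements; grouping the entries by line number gives a quick per-location summary.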
Hope it is helpful.