XML to JSON using MapNeat
Following my previous article, I wanted to expand on the capabilities of the mapneat library.
In this tutorial, I will show you how you can transform an existing XML source into a/the desired JSON format.
Let’s start with the following XML structure:
<customer>
<firstname>Mike</firstname>
<lastname>Smith</lastname>
<visits count="3">
<visit>
<country>France</country>
<date>2010-01-22</date>
</visit>
<visit>
<country>Italy</country>
<date>1983-01-22</date>
</visit>
<visit>
<country>Romania</country>
<date>2010-01-22</date>
</visit>
<visit>
<country>Bulgaria</country>
<date>2010-01-25</date>
</visit>
</visits>
<email type="business">mail@bsi.com</email>
<email type="personal">mail@pers.com</email>
<age>67</age>
</customer>
From it, we want to obtain a JSON like:
{
"person" : {
"firstName" : "Mike",
"lastName" : "Smith",
"personalEmails" : [ "mail@pers.com" ],
"businessEmails" : [ "mail@bsi.com" ],
"hasVisitedRomania" : "true"
},
"visits" : {
"yearsActive" : [ "2010", "1983" ],
"countries" : [ "France", "Italy", "Romania", "Bulgaria" ]
}
}
We want to morph the source XML into a JSON that:
- Has two separated nodes for
person
andvisits
; - Has the customer’s emails grouped into two separated arrays based on their type (
<email type="...">
); - Has an optional
hasVisistedRomania
field in case Romania appears in the visits list; - Has an array containing all the years during which the customer was active (visited countries around the globe) - no duplications accepted
- Has an array containing all the countries the customer has visited - no duplications allowed
The corresponding mapneat transformation might look like:
json(MapNeatSource.fromXml(xml)) {
"person" /= json {
"firstName" *= "$.customer.firstname"
"lastName" *= "$.customer.lastname"
"personalEmails" *= "$.customer.email[?(@.type == 'personal')].content"
"businessEmails" *= "$.customer.email[?(@.type == 'business')].content"
if (sourceCtx().read<MutableList<String>>("$.customer.visits.visit[*].country").contains("Romania")) {
"hasVisitedRomania" /= "true"
}
}
"visits" /= json {
"yearsActive" *= {
expression = "$.customer.visits.visit[*].date"
processor = {
(it as MutableList<String>)
.map { ds -> LocalDate.parse(ds, df).year.toString() }
.toSet()
}
}
"countries" *= "$.customer.visits.visit[*].country"
}
}
Explanation
Under the hood, mapneat uses the JSON-java library to convert an XML Source to an intermediary JSON Source automatically.
This step si done automatically when MapNeatSource.fromXml(xml)
is invoked.
At this point, any XML information/reference will be “forever” lost.
For debugging purposes, if you want to see how the intermediary JSON source looks like, especially for debugging purposes, you can do the following:
json(MapNeatSource.fromXml(xml)) {
copySourceToTarget()
println(this)
}
Running the above code on our input XML, would return this:
{
"customer" : {
"visits" : {
"count" : 3,
"visit" : [ {
"date" : "2010-01-22",
"country" : "France"
}, {
"date" : "1983-01-22",
"country" : "Italy"
}, {
"date" : "2010-01-22",
"country" : "Romania"
}, {
"date" : "2010-01-25",
"country" : "Bulgaria"
} ]
},
"firstname" : "Mike",
"email" : [ {
"type" : "business",
"content" : "mail@bsi.com"
}, {
"type" : "personal",
"content" : "mail@pers.com"
} ],
"age" : 67,
"lastname" : "Smith"
}
}
This is the actual JSON source that we morph into our desired format.
Now, looking at the following operations:
"person" /= json {
"firstName" *= "$.customer.firstname"
"lastName" *= "$.customer.lastname"
"personalEmails" *= "$.customer.email[?(@.type == 'personal')].content"
"businessEmails" *= "$.customer.email[?(@.type == 'business')].content"
// ....
First, we observe that we can have json{}
inside json{}
.
This behavior allows us to merge various sources into a single file.
Creating an inner json{}
inside of an outer json{}
is done using the assign operation: /=
.
"$.customer.email[?(@.type == 'personal')].content"
is a json-path expression, that not only selects all emails, but also filters them by their type.
Retrieving information from the source is usually done using
*=
Shift Operations.
Next, given Kotlin’s excellent DSL features, we can actually mix control statements (if/else/case) inside our transformation:
if (sourceCtx().read<MutableList<String>>("$.customer.visits.visit[*].country").contains("Romania")) {
"hasVisitedRomania" /= "true"
}
The above code will make sure, the optional field hasVisitedRomania
only appears if the list of visited countries ("$.customer.visits.visit[*].country"
) contains "Romania"
.
The last part:
val df = DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.US)
//....
"yearsActive" *= {
expression = "$.customer.visits.visit[*].date"
processor = {
(it as MutableList<String>)
.map { ds -> LocalDate.parse(ds, df).year.toString() }
.toSet()
}
}
Iterates of all the visits extract the year of the visit and collect elements to a Set
(in order to avoid duplications).
Comments