I know there are libraries that can convert evtx files to JSON objects. However, they don't fit my use case. This post shares how I convert an individual evtx event into JSON format.
The data type within the event
The structure of each evtx event is quite straightforward: the data is stored under Event/System and Event/EventData.
However, the tricky part is that some values are stored as XML attributes while others are stored as element text. Here is an example of how I handle that.
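To make the attribute-versus-text difference concrete, here is a minimal sketch of how xmltodict represents each case. The sample event, the provider name, and the GUID are made up for illustration, and a real event carries many more fields.

```python
import xmltodict

# Hypothetical, trimmed-down event: the provider name and GUID are placeholders.
SAMPLE = """<Event>
  <System>
    <Provider Name="Example-Provider" Guid="{00000000-0000-0000-0000-000000000000}"/>
    <EventID>4624</EventID>
  </System>
</Event>"""

system = xmltodict.parse(SAMPLE)["Event"]["System"]

# Attributes become keys prefixed with '@' inside a nested mapping.
print(dict(system["Provider"]))

# Element text becomes a plain string.
print(system["EventID"])
```

The leading "@" is xmltodict's default attribute prefix, which is why the attribute keys need a little cleanup before they go into the final JSON.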
Resolve the dynamic XML DOM
To parse the event XML, I chose "xmltodict". It gives you direct access to a specific node by key, and there are really only two nodes of interest here: System and EventData.
Load the XML
With xmltodict, start with a simple "with open" statement and pass the file content to the .parse() function. The parsed data is then available as a dictionary.
import xmltodict
with open('./sample.xml') as fd:
    evts = xmltodict.parse(fd.read())
Read the System node
Within the System node, we can see the challenge: some data is stored as attributes, such as Provider, and some is stored as plain element text, like EventID.
To access the System node, simply index the parsed result by key:
system = evts['Event']['System']
Since it has already been converted to a dictionary, let's see what we get if we iterate over all the keys and values. The simplest way to iterate over the keys and values of a dictionary is to use .items().
You can see there are two types of values: OrderedDict and str. If you check with type(), you can tell that OrderedDict is a class from the collections module.
Taking a closer look at the Provider node, the OrderedDict acts as a wrapper for both attributes, Name and Guid, which xmltodict stores under the keys @Name and @Guid.
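As a rough sketch of that iteration, the snippet below prints each key together with the type name of its value. The inline sample event is a made-up stand-in for the real file, and the printed type name depends on the installed xmltodict version, so no expected output is shown.

```python
import xmltodict

# Hypothetical sample standing in for ./sample.xml; the values are placeholders.
SAMPLE = """<Event>
  <System>
    <Provider Name="Example-Provider" Guid="{00000000-0000-0000-0000-000000000000}"/>
    <EventID>4624</EventID>
  </System>
</Event>"""

system = xmltodict.parse(SAMPLE)["Event"]["System"]

# .items() yields (key, value) pairs; type(value).__name__ shows which kind
# of value each node was converted to (a mapping or a plain string).
for key, value in system.items():
    print(key, type(value).__name__, value)
```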
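Now that we understand the data structure, the conversion is easy to implement with a type check on each value. The check below uses isinstance(v, dict) rather than comparing type() against collections.OrderedDict: it needs no extra import, and it also matches the plain dict that recent xmltodict releases return on Python 3.7+:
_evtDict = {}
for k, v in system.items():
    if isinstance(v, dict):
        # Attribute node such as Provider: strip the leading '@' from each key
        _evtAttr = {}
        for k1, v1 in v.items():
            _evtAttr[k1[1:]] = v1
        _evtDict[k] = _evtAttr
    elif isinstance(v, str):
        # Plain text node such as EventID
        _evtDict[k] = v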
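Since the end goal is JSON, the loop above could be wrapped in a small helper and passed to json.dumps. This is only a sketch: flatten_system is a hypothetical name, and the input dictionary is a hand-written stand-in for what xmltodict would return, so the example runs without a sample file.

```python
import json


def flatten_system(system):
    """Flatten an xmltodict-style System mapping into plain dicts and strings."""
    flat = {}
    for key, value in system.items():
        if isinstance(value, dict):
            # Drop the one-character prefix ('@' for attributes) from each key.
            flat[key] = {k[1:]: v for k, v in value.items()}
        elif isinstance(value, str):
            flat[key] = value
        # Other values (an empty element parses to None) are skipped,
        # just as they are in the loop above.
    return flat


# Hand-written stand-in for xmltodict output; all values are placeholders.
parsed_system = {
    "Provider": {
        "@Name": "Example-Provider",
        "@Guid": "{00000000-0000-0000-0000-000000000000}",
    },
    "EventID": "4624",
    "Security": None,
}

result = flatten_system(parsed_system)
print(json.dumps(result, indent=2))
```

One thing to keep in mind: a node that has both attributes and text is parsed with a "#text" key, so stripping the first character turns it into "text" in the output.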