XML parsing and Python: the lxml way

Today I was tinkering with an idea: parsing an HTML file by treating it as an XML structure and building a simple XQuery engine in Python.

I know... most of my "world-changing ideas" end up in my unfinished/fragile GitHub projects category.
But knowing how much it would help me in the long run, I decided to give it a shot anyway.

And this is when I discovered lxml and immediately fell in love with it.
But rather than going into the intricacies, I'm gonna share how simple it was for me to parse an XML file with it.

First, we'll make sure we have everything in place.
Things we need
  • lxml
Yeah... that's kinda the only thing you'll need :) (and of course Python!!!).
If you are using Ubuntu or Xubuntu (like me), just go to the Synaptic Package Manager and install it. Or get it from the website if you are gonna tinker on Windows.

Now, let's assume that we want to parse an XML file named 'file.xml' with the following content:
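The sample listing is not reproduced here, so what follows is only a hypothetical stand-in: a small script that writes a 'file.xml' whose shape matches the tag names used in the code further down (`element` and `element_id`). The root tag name `elements` and the id values are assumptions made up for illustration.

```python
from pathlib import Path

# Hypothetical sample data: only the tag names <element> and <element_id>
# come from the post's code; the root name and the id values are made up.
SAMPLE_XML = """<?xml version="1.0"?>
<elements>
  <element>
    <element_id>101</element_id>
  </element>
  <element>
    <element_id>102</element_id>
  </element>
  <element>
    <element_id>103</element_id>
  </element>
</elements>
"""

# Write the sample next to the script so etree.parse('file.xml') can find it
Path('file.xml').write_text(SAMPLE_XML, encoding='utf-8')
```

With a file like this in place, every `element` sits directly under the root, which is the layout the `findall('element')` call expects.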



Now I want to get all the element_id values from the file. The following Python code does that:


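from lxml import etree

# Parse the file into an ElementTree
doc = etree.parse('file.xml')
# Find every <element> that is a direct child of the root
element_list = doc.findall('element')

for element in element_list:
    # Text of the <element_id> child (None if it is missing)
    element_id = element.findtext('element_id')
    print(element_id)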


Another way would be to iterate over the tree:


from lxml import etree

doc = etree.parse('file.xml')

# iter() walks the whole tree; getiterator() is its deprecated older name
for element in doc.iter('element'):
    element_id = element.findtext('element_id')
    print(element_id)
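The two approaches are not quite interchangeable: `findall('element')` only matches direct children of the root, while iterating over the tree visits matching elements at any depth. Here is a minimal sketch of that difference. The XML string, the `root` and `group` tag names, and the id values are all hypothetical, and the fallback import to the standard library's ElementTree is only there so the snippet also runs where lxml is not installed (the calls used here behave the same way in both).

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree is an acceptable stand-in here,
    # since fromstring/findall/iter/findtext behave the same in both.
    import xml.etree.ElementTree as etree

# Hypothetical document: one <element> directly under the root,
# and one nested a level deeper inside a made-up <group> tag.
SAMPLE = """
<root>
  <element><element_id>1</element_id></element>
  <group>
    <element><element_id>2</element_id></element>
  </group>
</root>
""".strip()

root = etree.fromstring(SAMPLE)

# findall('element') looks only at direct children of root
direct_ids = [e.findtext('element_id') for e in root.findall('element')]

# iter('element') walks the whole tree, whatever the nesting depth
all_ids = [e.findtext('element_id') for e in root.iter('element')]

print(direct_ids)  # ['1']
print(all_ids)     # ['1', '2']
```

So if the elements can be nested at different levels, the iterating version is the safer pick; for a flat file, both give the same result.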
