I have a major showstopper in transforming XML data into a table structure. The system is real-time and receives hundreds of packed XML messages per second. I can do everything I need to outside of the server to get the data to clients, but getting the data into a table in the database is killing me.
I'm going to attach a .sql file to demonstrate my problem. I use SQL Server 2005 (and tried on 2008 with no improvement). The whole problem lies in parsing the XML into data that I can use to upsert a table - it takes too long.
In the sample provided I have two approaches - one uses XQuery (as recommended) and the other uses OPENXML (which is definitely not recommended these days).
I consistently get about 90ms for the first approach, and that is just no good because it means I can only really handle 10-odd messages per second.
Very unexpectedly, the second (OPENXML) approach gives far better performance.
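For reference (the attached .sql file isn't shown here), the two approaches look roughly like this. This is only a sketch: the document shape, element/attribute names, and column types below are assumptions, not the actual schema:

```sql
-- Hypothetical message shape, standing in for the real packed messages.
DECLARE @xml XML = N'<Messages><Message Id="1" Price="9.99"/></Messages>';

-- Approach 1: XQuery shredding via nodes()/value() (the recommended style).
SELECT m.value('@Id',    'INT')   AS Id,
       m.value('@Price', 'MONEY') AS Price
FROM @xml.nodes('/Messages/Message') AS t(m);

-- Approach 2: legacy OPENXML, which needs an explicit prepare/remove pair.
DECLARE @h INT;
EXEC sp_xml_preparedocument @h OUTPUT, @xml;
SELECT Id, Price
FROM OPENXML(@h, '/Messages/Message', 1)  -- 1 = attribute-centric mapping
WITH (Id INT '@Id', Price MONEY '@Price');
EXEC sp_xml_removedocument @h;             -- always release the handle
```

Either result set can feed the upsert directly (e.g. via MERGE or an UPDATE/INSERT pair).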
So, 2 questions:
1. How the heck can I get better parsing performance with this approach? (I will look at throttling, so that instead of processing every message as it arrives I wait for a second to pass, but I hate to do this and will still have the XML parsing to do.)
2. Why the heck is the old OPENXML approach so much faster (5 times faster on my server here)?
Any thoughts, corrections, etc. will be gladly accepted and greatly appreciated.
XML parsing is very expensive; you may want to decouple the processes of receiving XML data and parsing it further by inserting incoming documents into an XML column in a staging table and processing it asynchronously.
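A minimal sketch of that staging pattern, assuming a hypothetical staging table and a periodic batch job (all names and the target table are illustrative):

```sql
-- The receiver's only job becomes a cheap single-row INSERT of the raw document.
CREATE TABLE dbo.XmlStaging (
    StagingId BIGINT IDENTITY(1,1) PRIMARY KEY,
    Payload   XML NOT NULL,
    Processed BIT NOT NULL DEFAULT 0
);

-- A periodic job (e.g. SQL Agent, once per second) shreds a batch in one pass.
DECLARE @batch TABLE (StagingId BIGINT, Payload XML);

UPDATE TOP (500) s
SET    Processed = 1
OUTPUT inserted.StagingId, inserted.Payload INTO @batch
FROM   dbo.XmlStaging AS s
WHERE  Processed = 0;

INSERT INTO dbo.TargetTable (Id, Price)   -- hypothetical target table
SELECT m.value('@Id', 'INT'), m.value('@Price', 'MONEY')
FROM   @batch AS b
CROSS APPLY b.Payload.nodes('/Messages/Message') AS t(m);
```

The point of the design is that per-message work collapses to one insert, and the shredding cost is amortized across each batch instead of being paid on the receive path.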
If XML parsing itself is the expensive part, you can use a pull parser or VTD-XML. I particularly recommend VTD-XML; it is a recent and advanced XML processing API, and it claims to be the fastest and most memory-efficient, with built-in XPath support.