Breaking large XML documents into chunks

Category: Python - Development Tools

Date: 14 April, 2012

One of the few problems with using Python to process XML is speed: once a document becomes somewhat large (more than about 1 MB), processing slows down sharply as the size increases. One way to speed things up is to break the XML into chunks by tag name. This is especially handy if you are only interested in one part of the XML, or in the content between certain elements throughout the document. This script contains a function that handles this problem. It uses the SAX reader from PyXML. The input parameters are the XML as a string, the tag name that you want to build the DOM around, and an optional position to start at within the XML. It returns a DOM tree and the character position at which it stopped.
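The interface described above can be sketched roughly as follows. This is not the recipe's own code: the original relies on the SAX reader from PyXML, while this sketch assumes only the standard library (re and xml.dom.minidom) and locates the element with a regular expression. The function name dom_for_next_element and the sample document are hypothetical. It also assumes that attribute values contain no ">" characters, that the tag name does not appear inside comments or CDATA sections, and that the chunk does not depend on namespace prefixes declared on an ancestor element.

```python
import re
from xml.dom import minidom


def dom_for_next_element(xml_text, tag_name, start=0):
    """Return (dom, end_pos) for the next <tag_name> element at or after start.

    Returns (None, start) when no complete element is found.
    Hypothetical helper, not the original PyXML-based function.
    """
    # Matches an opening, self-closing, or closing tag with exactly this name.
    tag_re = re.compile(r"<(/?)%s(?=[\s/>])[^>]*>" % re.escape(tag_name))
    depth = 0
    begin = None
    for match in tag_re.finditer(xml_text, start):
        is_close = match.group(1) == "/"
        is_empty = match.group(0).endswith("/>")
        if begin is None:
            if is_close:
                continue  # stray closing tag before any opening tag
            begin = match.start()
        if is_close:
            depth -= 1
        elif not is_empty:
            depth += 1
        if depth == 0:
            # The element is balanced: parse only this chunk into a DOM.
            fragment = xml_text[begin:match.end()]
            return minidom.parseString(fragment), match.end()
    return None, start


# Walk through a document one chunk at a time, resuming from the
# position returned by the previous call.
doc = "<log><entry id='1'>a</entry><entry id='2'>b</entry></log>"
pos = 0
ids = []
while True:
    dom, pos = dom_for_next_element(doc, "entry", pos)
    if dom is None:
        break
    ids.append(dom.documentElement.getAttribute("id"))
```

Because each call builds a DOM for a single element only, memory use and parse time depend on the size of the chunk rather than the size of the whole document. For input that does not meet the assumptions above, a real parser-driven approach such as the SAX-based one in the original script would be the safer choice.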



Homepage: http://code.activestate.com

Developer: code.activestate.com

License: Artistic License, GNU General Public License (GPL)

Operating System: Windows, Linux, Mac OS, BSD, Solaris


Related Scripts Download

Indite is a plugin for HtmlArea.

Developer: troels
License: GNU Lesser General Public License (LGPL)
Operating System: ie5.5+, mozilla


AxPoint generates slideshows in PDF format from a simple XML description format.

Developer: Matt Sergeant
License: GNU General Public License (GPL)
Operating System: All


Kumera is an Open Source Content Management System written in Perl and using XML for data storage, designed for small to medium web sites.

Developer: http://www.cyber4.org/ku...
License: Freeware
Operating System: Unix, Linux


Hippo CMS is an open source information centered content management system.

Developer: Tjeerd Brenninkmeijer
License: Apache Software License
Operating System: Unix, Windows


The Data Generator is a free, GNU-licensed, open source script written in JavaScript, PHP and MySQL that lets you quickly generate large volumes of custom data in a variety of formats for use in testing software and populating databases.

Developer: Benjamin Keen
License: GNU General Public License (GPL)
Operating System: All


SYDI is a project that aims to help system administrators document their networks.

Developer: Patrick Ogenstad
License: BSD License
Operating System: Windows


This small tool helps you to convert your MySQL database layout into XML.

Developer: PhpToys
License: GNU General Public License (GPL)
Operating System: All


This script takes an XML file as input and outputs a colorized version of it, using HTML or DocBook (with emphasis elements and a particular role).

Developer: code.activestate.com
License: Artistic License, GNU General Public License (GPL)
Operating System: Windows, Linux, Mac OS, BSD, Solaris


This script shows you a reusable way to use "xml.

Developer: code.activestate.com
License: Artistic License, GNU General Public License (GPL)
Operating System: Windows, Linux, Mac OS, BSD, Solaris