Modular toolkit for data processing

Category: Python - Miscellaneous

The Modular toolkit for Data Processing (MDP) is a Python data processing framework. Implemented algorithms include Principal Component Analysis (PCA), Independent Component Analysis (ICA), Slow Feature Analysis (SFA), Growing Neural Gas (GNG), Factor Analysis, Fisher Discriminant Analysis (FDA), and Gaussian Classifiers.

From the user's perspective, MDP consists of a collection of trainable supervised and unsupervised algorithms and other data processing units (nodes) that can be combined into data processing flows. Given a sequence of input data, MDP takes care of successively training or executing all nodes in the flow. This structure makes it possible to specify complex algorithms as a sequence of simpler data processing steps in a natural way. Training can be performed on small chunks of input data, which makes very large data sets usable while keeping memory requirements low. Memory usage can be reduced further by defining the internals of the nodes to be single precision.

From the developer's perspective, MDP is a framework that makes the implementation of new algorithms easier. The base class 'Node' takes care of tedious tasks such as numerical type and dimensionality checking, leaving the developer free to concentrate on the implementation of the training and execution phases. The node then automatically integrates with the rest of the library and can be used in a flow together with other nodes. A node can have multiple training phases and even an undetermined number of phases. This allows, for example, the implementation of algorithms that need to collect some statistics on the whole input before proceeding with the actual training, or of algorithms that need to iterate over a training phase until a convergence criterion is satisfied.

MDP was written in the context of theoretical research in neuroscience, but it has been designed to be helpful in any context where trainable data processing algorithms are used. Its simplicity on the user side, together with the reusability of the implemented nodes, also makes it a valid educational tool.

Date: 24 February, 2012
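The node and flow structure described above can be illustrated with a small, self-contained sketch. It uses only the Python standard library and does not reproduce MDP's actual code: the classes `Node` and `Flow` below are simplified stand-ins, and `MeanRemoval`, `MaxAbsScaling`, and `make_chunks` are hypothetical names chosen for this example. The sketch shows two ideas from the description: a base class that handles the bookkeeping so that a new node only implements its training and execution steps, and a flow that trains its nodes one after another on small chunks of data.

```python
# Illustrative sketch only. These classes are hypothetical stand-ins and
# are NOT the implementation or the exact API of the MDP library.


class Node:
    """Base class: handles training-state bookkeeping for subclasses."""

    def __init__(self):
        self._training = True

    def train(self, chunk):
        if not self._training:
            raise RuntimeError("training phase is already closed")
        self._train(chunk)

    def stop_training(self):
        # Close the training phase exactly once.
        if self._training:
            self._stop_training()
            self._training = False

    def execute(self, chunk):
        self.stop_training()
        return self._execute(chunk)

    # Hooks that a concrete node overrides.
    def _train(self, chunk):
        pass

    def _stop_training(self):
        pass

    def _execute(self, chunk):
        raise NotImplementedError


class MeanRemoval(Node):
    """Accumulates a running sum over chunks, then subtracts the mean."""

    def __init__(self):
        super().__init__()
        self._sum, self._count, self.mean = 0.0, 0, None

    def _train(self, chunk):
        self._sum += sum(chunk)
        self._count += len(chunk)

    def _stop_training(self):
        self.mean = self._sum / self._count

    def _execute(self, chunk):
        return [x - self.mean for x in chunk]


class MaxAbsScaling(Node):
    """Tracks the largest absolute value seen, then divides by it."""

    def __init__(self):
        super().__init__()
        self.scale = 0.0

    def _train(self, chunk):
        self.scale = max(self.scale, max(abs(x) for x in chunk))

    def _execute(self, chunk):
        return [x / self.scale for x in chunk]


class Flow:
    """Trains nodes in order; each node sees data processed by earlier nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def train(self, make_chunks):
        # make_chunks() must return a fresh iterable of chunks on each call,
        # so the whole data set never has to be held in memory at once.
        for i, node in enumerate(self.nodes):
            for chunk in make_chunks():
                for earlier in self.nodes[:i]:
                    chunk = earlier.execute(chunk)
                node.train(chunk)
            node.stop_training()

    def execute(self, chunk):
        for node in self.nodes:
            chunk = node.execute(chunk)
        return chunk


def make_chunks():
    # Stand-in for reading a large data set piece by piece.
    yield [1.0, 2.0, 3.0]
    yield [4.0, 5.0]


flow = Flow([MeanRemoval(), MaxAbsScaling()])
flow.train(make_chunks)
result = flow.execute([1.0, 3.0, 5.0])
print(result)
```

In this sketch each node has a single training phase. A node with several phases, as mentioned in the description, would need the flow to pass over the chunks once per phase; that is left out here to keep the example short.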


Data Processing - Principal Component Analysis - Gaussian Classifiers

Homepage: http://sourceforge.net/

Developer: SourceForge.net

License: Freeware

Operating System: All


Related Scripts Download

FreeMat is a free environment for rapid engineering and scientific prototyping and data processing.

Developer: SourceForge.net
License: Artistic License, GNU General Public License (GPL)
Operating System: Windows, Linux, Mac OS, BSD


This module introduces an alternative syntax, in the style of shell pipes, for sequence-oriented functions such as filter and map.

Developer: code.activestate.com
License: Artistic License, GNU General Public License (GPL)
Operating System: Windows, Linux, Mac OS, BSD, Solaris


Only the forms need to be supplied and the plugin handles the data gathering.

Developer: wordpress.org
License: Artistic License, GNU General Public License (GPL)
Operating System: Windows, Linux, Mac OS, BSD, Solaris


GNU ddrescue copies data from one file or block device (hard disc, cdrom, etc) to another, trying hard to rescue data in case of read errors.

Developer: Free Software Foundation, Inc.
License: GNU General Public License (GPL)
Operating System: OS Independent


The Data Generator is a free, GNU-licensed, open source script written in JavaScript, PHP and MySQL that lets you quickly generate large volumes of custom data in a variety of formats for use in testing software and populating databases.

Developer: Benjamin Keen
License: GNU General Public License (GPL)
Operating System: All


Create dynamic, easy-to-use data grid controls for your web site in seconds.

Developer: Mike Frank
License: BSD License
Operating System: Any


A data recovery software provider offers a file rescue solution for retrieving lost or missing data from damaged storage media.

Developer: Data Recovery Reviews
License: GNU General Public License (GPL)
Operating System: Not Available


Implements the behaviours of the Regression Class as an extended MovieClip.

Developer: members.shaw.ca
License: Freeware
Operating System: All


Implements the static behaviours of the Stat Class, which deals with Probability and Statistics math algorithms.

Developer: members.shaw.ca
License: Freeware
Operating System: All