.. -*- mode: rst; fill-column: 78 -*-
.. ex: set sts=4 ts=4 sw=4 et tw=79:
  ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ###
  #
  #   See COPYING file distributed along with the PyMVPA package for the
  #   copyright and license terms.
  #
  ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ###

.. _intro:

************
Introduction
************

.. index:: MVPA

PyMVPA is a Python_ module intended to ease pattern classification
analysis of large datasets. It provides high-level abstraction of typical
processing steps and a number of implementations of some popular algorithms.
While it is not limited to neuroimaging data it is eminently suited for such
datasets. PyMVPA is truly free software (in every respect) and additionally
requires nothing but free software to run. Theoretically PyMVPA should run
on anything that can run a Python_ interpreter, although the proof is yet to
come.

PyMVPA stands for *Multivariate Pattern Analysis* in Python_.

.. _Python: http://www.python.org


What this Manual is NOT
~~~~~~~~~~~~~~~~~~~~~~~

.. index:: textbook, review, API reference, examples

This manual does not attempt to be a comprehensive introduction to machine
learning theory or pattern recognition techniques. There is a wealth of
high-quality textbooks about this field available. A very good example is
`Pattern Recognition and Machine Learning`_ by `Christopher M. Bishop`_.

Good starting points for learning about the application of machine learning
algorithms to (f)MRI data are two recent reviews by Norman et al. [1]_ and
Haynes and Rees [2]_.

This manual also does not describe every bit and piece of the PyMVPA package.
For more information, please have a look at the `API documentation`_, which is a
comprehensive and up-to-date description of the whole package.

More examples and usage patterns extending the ones described here can be taken
from the examples shipped with the PyMVPA source distribution (`doc/examples/`)
or even the unit test battery, also part of the source distribution
(in the `tests/` directory).

.. _API Documentation: api/index.html
.. _Christopher M. Bishop: http://research.microsoft.com/~cmbishop/
.. _Pattern Recognition and Machine Learning: http://research.microsoft.com/~cmbishop/PRML

.. [1] Norman, K.A., Polyn, S.M., Detre, G.J. & Haxby, J.V. (2006). Beyond
       mind-reading: multi-voxel pattern analysis of fMRI data. Trends in
       Cognitive Sciences, 10, 424–430.
.. [2] Haynes, J.D. & Rees, G. (2006). Decoding mental states from brain
       activity in humans. Nature Reviews Neuroscience, 7, 523–534.


.. _history:

.. index:: history, MVPA toolbox for Matlab, license, free software

A bit of History
~~~~~~~~~~~~~~~~

The roots of PyMVPA date back to early 2005. At that time it was a C++ library
(no Python_ yet) developed by Michael Hanke and Sebastian Krüger, intended to
make it easy to apply artificial neural networks to pattern recognition
problems.

During a visit to `Princeton University`_ in spring 2005, Michael Hanke
was introduced to the `MVPA toolbox`_ for `Matlab
<http://buchholz.hs-bremen.de/aes/aes_matlab.gif>`_, which had several
advantages over a C++ library. Most importantly it was easier to use. While a
user of a C++ library is forced to write a significant amount of front-end
code, users of the MVPA toolbox could simply load their data and start
analyzing it, because the toolbox provides a common interface to functions
drawn from a variety of libraries.

.. _Princeton University: http://www.princeton.edu
.. _MVPA toolbox: http://www.csbmb.princeton.edu/mvpa/

However, there are some disadvantages to writing a toolbox in Matlab. While
users in general benefit from the powers of Matlab, they are at the same time
bound to the goodwill of a commercial company. That this is indeed a problem
becomes obvious when one considers the time when the vendor of Matlab was not
willing to support the Mac platform. Therefore, even though the MVPA toolbox
is `GPL-licensed`_, it cannot fully benefit from the enormous advantages of
the free software development model (free as in free speech, not only as in
free beer).

.. _GPL-licensed: http://www.gnu.org/copyleft/gpl.html

For these reasons, Michael thought that a successor to the C++ library
should remain truly free software, remain fully object-oriented (in contrast
to the MVPA toolbox), but should be at least as easy to use and extensible
as the MVPA toolbox.

After evaluating some possibilities Michael decided that `Python`_ was the
most promising candidate, fully capable of fulfilling the intended
development goals. Python is a very powerful language that combines the
possibility to write really fast code with a simplicity that allows one to
learn the basic concepts within a few days.

.. index:: RPy, PyMatlab

One of the major advantages of Python is the availability of a huge number of
so-called *modules*. Modules can include extensions written in a hardcore
language like C (or even FORTRAN) and therefore allow one to incorporate
high-performance code without having to leave the Python
environment. Additionally some Python modules even provide links to other
toolkits. For example `RPy`_ allows one to use the full functionality of R_
from inside Python. Even Matlab can be used via some Python modules (see
PyMatlab_ for an example).

.. _RPy: http://rpy.sourceforge.net/
.. _R: http://www.r-project.org
.. _PyMatlab: http://code.google.com/p/pymatlab/

After the decision for Python was made, Michael started development with a
simple k-Nearest-Neighbour classifier and a cross-validation class. Using
the mighty NumPy_ package made it easy to support data of any dimensionality.
Therefore PyMVPA can easily be used with 4d fMRI datasets, but equally well
with EEG/MEG data (3d) or even non-neuroimaging datasets.
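
How arrays of different dimensionality can be treated alike is sketched below.
This is only an illustration of the idea using plain NumPy, not PyMVPA's
actual implementation. The function name `flatten_samples` and the array sizes
are made up for this example, and the first axis is assumed to index the
samples::

  import numpy as np


  def flatten_samples(data):
      """Reshape an array of samples into a 2d (samples x features) matrix.

      The first axis is assumed to index the samples; all remaining axes,
      however many there are, are collapsed into a single feature axis.
      """
      data = np.asarray(data)
      return data.reshape(data.shape[0], -1)


  # Hypothetical sizes: 10 fMRI volumes of 4 x 5 x 6 voxels (4d) and
  # 10 EEG trials with 8 channels x 50 timepoints (3d).
  fmri = np.zeros((10, 4, 5, 6))
  eeg = np.zeros((10, 8, 50))
  print(flatten_samples(fmri).shape)  # (10, 120)
  print(flatten_samples(eeg).shape)   # (10, 400)

Once every sample is a flat feature vector, the same classifier code can
operate on either dataset.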

.. index:: NIfTI

By September 2007 PyMVPA included support for reading and writing datasets
from and to the `NIfTI format`_, kNN and Support Vector Machine classifiers,
as well as several analysis algorithms (e.g. searchlight and incremental
feature search).

.. _NIfTI format: http://nifti.nimh.nih.gov/

During another visit to Princeton in October 2007 Michael met with `Yaroslav
Halchenko`_ and `Per B. Sederberg`_. That meeting and the following
discussions and hacking sessions of Michael and Yaroslav led to a major
refactoring of the PyMVPA codebase, making it much more flexible/extensible,
faster and easier to use than it had ever been before.

.. _Yaroslav Halchenko: http://www.onerussian.com/
.. _Per B. Sederberg: http://www.princeton.edu/~persed/


.. _requirements:
.. index:: requirements

Prerequisites
~~~~~~~~~~~~~

Like every other Python module PyMVPA requires at least a basic knowledge of
the Python language. However, if one has no prior experience with Python one
can benefit from the simplicity of the Python language and acquire this
knowledge within a few days by studying some of the many tutorials available
on the web.

.. links to good tutorials (numpy for matlab users, dive into python, ...)

As PyMVPA is about pattern recognition, a basic understanding of machine
learning principles is necessary to apply its methods correctly and to
ensure that the results are interpretable.

.. index:: dependencies, Python, NumPy

Dependencies
''''''''''''

The following software packages are required or PyMVPA will not work at all.

  Python_ 2.4 (or later)
    With some modifications PyMVPA could probably work with Python 2.3, but as
    it is quite old already and Python 2.4 is widely available there should be
    no need to do this.
  NumPy_
    PyMVPA makes extensive use of NumPy to store and handle data. There is no
    way around it.

.. _NumPy: http://numpy.scipy.org/


.. index:: recommendations, SciPy, PyNIfTI, Shogun, R, RPy

Strong Recommendations
''''''''''''''''''''''

While most parts of PyMVPA will work without any additional software, some
functionality makes use of additional software packages. It is strongly
recommended to install these packages as well.

  SciPy_: linear algebra, standard distributions
    SciPy_ is mainly used by the statistical testing and the logistic
    regression classifier code. However, in the long run SciPy might be used a
    lot more and could become a required dependency of PyMVPA.
  PyNIfTI_: access to NIfTI files
    PyMVPA provides a convenient wrapper for datasets stored in the NIfTI
    format. If you don't need that, PyNIfTI is not necessary, but otherwise
    it makes it really easy to read from and write to NIfTI images.
  Shogun_: various classifiers
    PyMVPA currently can make use of several SVM implementations of the
    Shogun_ toolbox. It requires the modular python interface of Shogun to be
    installed. Any version from 0.6 on should work.
  R_ and RPy_: more classifiers
    Currently PyMVPA provides a wrapper around the LARS library.

.. _SciPy: http://www.scipy.org/
.. _LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
.. _PyNIfTI: http://niftilib.sourceforge.net/pynifti/
.. _Shogun: http://www.shogun-toolbox.org


.. index:: suggestions, IPython, FSL, AFNI, LIBSVM

Suggestions
''''''''''''

The following list of software is not required by PyMVPA, but it might make
life a lot easier and lead to more efficiency when using PyMVPA.

  IPython_: frontend
    If you want to use PyMVPA interactively it is strongly recommended to use
    IPython_. If you think: *"Oh no, not another one, I already have to learn
    about PyMVPA."* please invest a tiny bit of time to watch the `Five Minutes
    with IPython`_ screencasts at showmedo.com_, so at least you know what you
    are missing.
  FSL_: preprocessing and analysis of (f)MRI data
    PyMVPA provides some simple bindings to FSL output and filetypes (e.g. EV
    files and MELODIC output directories). This makes it fairly easy to e.g.
    use FSL's implementation of ICA for data reduction and proceed with
    analyzing the estimated ICs in PyMVPA.
  AFNI_: preprocessing and analysis of (f)MRI data
    Similar to FSL, AFNI is a free package for processing (f)MRI data.
    Though its primary data file format is BRIK files, it has the ability
    to read and write NIfTI files, which easily integrate with PyMVPA.
  LIBSVM_: fast SVM classifier
    Only the C library is required and none of the Python bindings that are
    available on the upstream website. PyMVPA provides its own Python wrapper
    for LIBSVM which is a fork based on the one included in the LIBSVM package.
    Additionally the upstream LIBSVM distribution causes flooding of the console
    with a huge amount of debugging messages. Please see the `Building from
    Source`_ section for information on how to build an alternative version that
    does not have this problem.
  matplotlib_: Matlab-style plotting library for Python
    This is a very powerful plotting library that allows you to export into
    a large variety of raster and vector formats, and thus, is ideal to
    produce publication quality figures.

.. _matplotlib: http://matplotlib.sourceforge.net/
.. _IPython: http://ipython.scipy.org
.. _Five Minutes with IPython: http://showmedo.com/videos/series?name=CnluURUTV
.. _showmedo.com: http://showmedo.com
.. _FSL: http://www.fmrib.ox.ac.uk/fsl/
.. _AFNI: http://afni.nimh.nih.gov/afni/


.. _obtaining:
.. index:: installation

Installation
~~~~~~~~~~~~

.. Point to source and binary distribution. Preach idea of free software.
   Step by step guide to install it on difficult systems like Windows.

.. Don't forget to mention that the only reasonable way to use this piece
   of software (like every other piece) is under Debian! Also mention that
   Ubuntu is no excuse ;-)

The easiest way to obtain PyMVPA is to use pre-built binary packages.
Currently we provide such packages or installers for the Debian/Ubuntu family
and 32-bit Windows (see below). Since version 0.2.2 there is also an initial
version of a RPM package for OpenSUSE 10.3. If there are no binary packages
for your operating system or platform yet, you can build PyMVPA from source.
Please refer to `Building from Source`_ for more information.

.. index:: binary packages
.. index:: Debian

.. _install_debian:

Debian
''''''

PyMVPA is available as an `official Debian package`_ (`python-mvpa`;
since *lenny*). The documentation is provided by the optional
`python-mvpa-doc` package. To install PyMVPA simply do::

  sudo aptitude install python-mvpa

.. _official Debian package: http://packages.debian.org/python-mvpa

.. index:: backports, Debian, Ubuntu
.. _install_debianbackports:

Debian backports and unofficial Ubuntu packages
'''''''''''''''''''''''''''''''''''''''''''''''

Backports for the current Debian stable release and binary packages for recent
Ubuntu releases are available from a `repository at the University of
Magdeburg`_. Please read the `package repository instructions`_ to learn about
how to obtain them. Otherwise install as you would do with any other Debian
package.

.. _repository at the University of Magdeburg: http://apsy.gse.uni-magdeburg.de
.. _package repository instructions: http://apsy.gse.uni-magdeburg.de/main/index.psp?sec=1&page=hanke/debian&lang=en


.. _install_win:

Windows
'''''''

There are a few Python distributions for Windows. In theory all of them should
work equally well. However, we only tested the standard Python distribution
from www.python.org (with version 2.5.2).

First you need to download and install Python. Use the Python installer for
this job. You do not need to install the Python test suite and utility scripts.
From now on we will assume that Python was installed in `C:\\Python25` and that
this directory has been added to the `PATH` environment variable.

For a minimal installation of PyMVPA the only thing you need in addition is
NumPy_. Download a matching NumPy windows installer for your Python version
(in this case 2.5) from the `SciPy download page`_ and install it.

Now, you can use the PyMVPA windows installer to install PyMVPA on your system.
When done, verify that everything went fine by opening a command prompt and
starting Python by typing `python` and hitting enter. Now you should see the
Python prompt. Import the mvpa module, which should cause no error messages.

  >>> import mvpa
  >>>

Although you have a working installation already, most likely you want to
install some additional software. First and foremost install SciPy_ -- download
it from the same page where you also got the NumPy installer.

If you want to use PyMVPA to analyze fMRI datasets, you probably also want to
install PyNIfTI_. Download the corresponding installer from the website of the
`NIfTI libraries`_ and install it. PyNIfTI does not come with the required
`zlib` library, so you also need to download and install it. A binary installer
is available from the `GnuWin32 project`_. Install it in some arbitrary folder
(just the binaries, nothing else), find the `zlib1.dll` file in the `bin`
subdirectory and move it into the Windows `system32` directory. Verify that it
works by importing the `nifti` module in Python.

  >>> import nifti
  >>>

Another piece of software you might want to install is matplotlib_. The project
website offers a binary installer for Windows. If you are using the standard
Python distribution and matplotlib complains about a missing `msvcp71.dll`, be
sure to obey the installation instructions for Windows on the matplotlib
website.

With this set of packages you should be able to run most of the PyMVPA examples
which are shipped with the source code in the `doc/examples` directory.
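
The import checks above can also be collected in a small script. The following
is a minimal sketch and not part of PyMVPA: the helper names are made up, and
the module names other than `mvpa` and `nifti` are assumed to be the usual
import names of the respective packages::

  # Module names as they are imported; 'mvpa' and 'nifti' are taken from the
  # text above, the others are assumed to be the usual import names.
  REQUIRED = ["numpy", "mvpa"]
  OPTIONAL = ["scipy", "nifti", "matplotlib"]


  def can_import(name):
      """Return True if the module `name` can be imported."""
      try:
          __import__(name)
      except ImportError:
          return False
      return True


  def report(names):
      """Return a dict mapping each module name to its availability."""
      return dict([(name, can_import(name)) for name in names])


  if __name__ == "__main__":
      for label, names in [("required", REQUIRED), ("optional", OPTIONAL)]:
          for name, ok in sorted(report(names).items()):
              print("%-10s %-12s %s" % (label, name, ok and "ok" or "MISSING"))

A missing optional module only disables the functionality that depends on it,
whereas a missing required module means PyMVPA cannot be used at all.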

.. _SciPy download page: http://scipy.org/Download
.. _NIfTI libraries: http://niftilib.sourceforge.net/
.. _GnuWin32 project: http://gnuwin32.sourceforge.net/


.. _install_suse:

OpenSUSE
''''''''

To install the provided RPM package for OpenSUSE, simply download it, open a
console and invoke (the example command refers to PyMVPA 0.2.2 and OpenSUSE
10.3)::

  rpm -i pymvpa-0.2.2-1suse10_3.i586.rpm

Please refer to the section about :ref:`building on OpenSUSE <build_suse>` for
notes about the installation of the dependencies.


.. _buildfromsource:
.. index:: building from source, source package, MacOSX

Building from Source
~~~~~~~~~~~~~~~~~~~~

If a binary package for your platform and operating system is provided, you do
not have to build the packages on your own -- use the corresponding pre-built
packages instead. However, if there are no binary packages for your system, or
you want to try a new (unreleased) version of PyMVPA, you can easily build
PyMVPA on your own. Any recent Linux distribution should be capable of doing it
(e.g. RedHat). Additionally, building PyMVPA also works on Mac OS X and Windows
systems.

.. _PyMVPA project website: http://pkg-exppsy.alioth.debian.org/pymvpa/


.. index:: releases, development snapshot

Three Ways to Obtain the Sources
''''''''''''''''''''''''''''''''

The first step is obtaining the sources. The source code tarballs of all
PyMVPA releases are available from the `PyMVPA project website`_.
Alternatively, one can also download a tarball of the latest development
snapshot_ (i.e. the current state of the *master* branch of the PyMVPA source
code repository).

.. _snapshot:  http://git.debian.org/?p=pkg-exppsy/pymvpa.git;a=snapshot;h=refs/heads/master;sf=tgz
.. index:: Git repository

If you want to have access to both the full PyMVPA history and the latest
development code, you can use the PyMVPA Git_ repository, which is publicly
available. To view the repository, please point your web browser to gitweb:

  http://git.debian.org/?p=pkg-exppsy/pymvpa.git

The gitweb browser also allows one to download arbitrary development snapshots
of PyMVPA. For a full clone (aka checkout) of the PyMVPA repository simply
do:

  :command:`git clone git://git.debian.org/git/pkg-exppsy/pymvpa.git`

After a short while you will have a `pymvpa` directory below your current
working directory, that contains the PyMVPA repository.

.. _Git: http://git.or.cz/


Build it (General instructions)
'''''''''''''''''''''''''''''''

In general you can build PyMVPA like any other Python module (using the Python
*distutils*). This general method will be outlined first. However, in some
situations or on some platforms alternative ways of building PyMVPA might be
more convenient -- alternative approaches are listed at the end of this section.

To build PyMVPA from source simply enter the root of the source tree (obtained
by either extracting the source package or cloning the repository) and run:

  :command:`python setup.py build_ext`

If you are using a Python version older than 2.5, you need to have
python-ctypes (>= 1.0.1) installed to be able to do this.

Now, you are ready to install the package. Do this by invoking:

  :command:`python setup.py install`

Most likely you need superuser privileges for this step. If you want to install
in a non-standard location, please take a look at the :command:`--prefix`
option. You also might want to consider :command:`--optimize`.

Now you should be ready to use PyMVPA on your system.

.. index:: LIBSVM, SWIG

Build with enabled LIBSVM bindings
''''''''''''''''''''''''''''''''''

From the 0.2 release of PyMVPA on, the LIBSVM_ classifier extension is no
longer built by default. However, it is still shipped with PyMVPA and can be
enabled at build time. To be able to do this you need to have SWIG_ installed
on your system.

PyMVPA needs a patched LIBSVM version, as the original distribution generates
a huge amount of debugging messages and therefore makes the console and PyMVPA
output almost unusable. Debian (since lenny: 2.84.0-1) and Ubuntu (since gutsy)
already include the patched version. For all other systems a minimal copy of
the patched sources is included in the PyMVPA source package (`3rd/libsvm`).

If you do not have a proper LIBSVM_ package, you can build the library from 
the copy of the code that is shipped with PyMVPA. To do this, simply invoke::

  make 3rd

Now build PyMVPA as described above. The build script will automatically
detect that LIBSVM_ is available and builds the LIBSVM wrapper module for you.

If your system provides an appropriate LIBSVM_ version, you need to have the
development files (headers and library) installed. Depending on where you
installed them, it might be necessary to specify the full path to that location
with the `--include-dirs`, `--library-dirs` and `--swig` options. Now add the
'--with-libsvm' flag when building PyMVPA::

  python setup.py build_ext --with-libsvm \
      [ -I<LIBSVM_INCLUDEDIR> -L<LIBSVM_LIBDIR> ]

The installation procedure is equivalent to the build setup without LIBSVM_,
except that the '--with-libsvm' flag also has to be set when installing::

  python setup.py install --with-libsvm
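
If the build is driven from a script, the two calls above can be assembled
programmatically. The following is a minimal sketch under stated assumptions:
the helper `setup_command` and the example directories are hypothetical, and
only the flags shown above are used::

  import sys


  def setup_command(action, include_dir=None, library_dir=None):
      """Assemble the argument list for a LIBSVM-enabled setup.py call.

      `action` is 'build_ext' or 'install'. The optional directories are
      passed on as -I and -L, which are only added for 'build_ext'.
      """
      cmd = [sys.executable, "setup.py", action, "--with-libsvm"]
      if action == "build_ext":
          if include_dir:
              cmd.append("-I" + include_dir)
          if library_dir:
              cmd.append("-L" + library_dir)
      return cmd


  # Hypothetical LIBSVM location; adjust to your system.
  build = setup_command("build_ext", "/opt/libsvm/include", "/opt/libsvm/lib")
  install = setup_command("install")

Each list can be handed to `subprocess.call()` from the root of the source
tree, first `build`, then `install`.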

.. _SWIG: http://www.swig.org/

.. index:: alternative build procedure

Alternative build procedure
'''''''''''''''''''''''''''

Alternatively, if you are doing development in PyMVPA, or if you simply do not
want to install PyMVPA system-wide (or lack the permissions to do so), you can
call `make` (same as `make build`) in the top-level directory of the source
tree to build PyMVPA. Then extend or define your environment variable
`PYTHONPATH` to point to the root of the PyMVPA sources (i.e. where you invoked
all previous commands from)::

  export PYTHONPATH=$PWD

However, please note that this procedure also always builds the LIBSVM_
extension and therefore also requires the patched LIBSVM version and SWIG to be
available.
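
A similar effect to setting `PYTHONPATH` can be achieved for a single Python
session by extending the module search path at runtime. This is a minimal
sketch; the helper name `prepend_to_path` and the example directory are
hypothetical::

  import os
  import sys


  def prepend_to_path(source_root):
      """Put `source_root` at the front of the module search path.

      Only the running interpreter is affected; the environment is left
      untouched.
      """
      source_root = os.path.abspath(source_root)
      while source_root in sys.path:
          sys.path.remove(source_root)
      sys.path.insert(0, source_root)
      return source_root


  # Hypothetical location of the PyMVPA source tree; adjust to your checkout.
  root = prepend_to_path(os.path.join(os.path.expanduser("~"), "src", "pymvpa"))
  # From here on 'import mvpa' looks in that directory first, provided
  # PyMVPA has been built there.

Unlike the environment variable, this setting is lost when the interpreter
exits, which can be handy when switching between several source trees.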


.. index:: citation, PyMVPA poster


Building on Windows Systems
'''''''''''''''''''''''''''

On Windows the whole situation is a little more tricky, as the system doesn't
come with a compiler by default. Nevertheless, it is easily possible to build
PyMVPA from source. One could use the Microsoft compiler that comes with
Visual Studio to do it, but as this is commercial software and not everybody
has access to it, we will outline a way that exclusively involves free and
open source software.

First one needs to install the packages required to run PyMVPA as explained
:ref:`above <install_win>`.

Next we need to obtain and install the MinGW compiler collection. Download the
*Automated MinGW Installer* from the `MinGW project website`_. Now, run it and
choose to install the `current` package. You will need the *MinGW base tools*,
*g++* compiler and *MinGW Make*. For the remaining parts of the section, we
will assume that MinGW got installed in `C:\\MinGW` and the directory
`C:\\MinGW\\bin` has been added to the `PATH` environment variable, to be able
to easily access all MinGW tools. Note that it is not necessary to install
`MSYS`_ to build PyMVPA, but it might be handy to have it.

If you want to build the LIBSVM wrapper for PyMVPA, you also need to download
SWIG_ (actually *swigwin*, the distribution for Windows). SWIG does not have to
be installed, just unzip the file you downloaded and add the root directory of
the extracted sources to the `PATH` environment variable (make sure that this
directory contains `swig.exe`, if not, you haven't downloaded `swigwin`).

PyMVPA comes with a specific build setup configuration for Windows
-- `setup.cfg.win` in the root of the source tarball. Please rename this file
to `setup.cfg` (and overwrite the existing one). This is only necessary if you
have *not* configured your Python distutils installation to always use MinGW
instead of the Microsoft compilers.

Now, we are ready to build PyMVPA. The easiest way to do this is to make use
of the `Makefile.win` that is shipped with PyMVPA to build a binary installer
package (`.exe`). Make sure that the settings at the top of `Makefile.win`
(the file is located in the root directory of the source distribution)
correspond to your Python installation -- if not, adjust them accordingly
before you proceed. When everything is set, do::

  mingw32-make -f Makefile.win installer

Upon success you can find the installer in the `dist` subdirectory. Install it
as described :ref:`above <install_win>`.


.. _MinGW project website: http://www.mingw.org/
.. _MSYS: http://www.mingw.org/msys.shtml


.. _build_suse:

OpenSUSE
''''''''

Building PyMVPA on OpenSUSE involves the following steps (tested with 10.3):
First add the OpenSUSE science repository, that contains most of the required
packages (e.g. NumPy, SciPy, matplotlib), to the Yast configuration. The URL
for OpenSUSE 10.3 is::

  http://download.opensuse.org/repositories/science/openSUSE_10.3/

Now, install the following required packages: 

  * a recent C and C++ compiler (e.g. GCC 4.1)
  * `python-devel` (Python development package)
  * `python-numpy` (NumPy)
  * `swig` (SWIG is only necessary, if you want to make use of LIBSVM)

Now you can simply compile and install PyMVPA, as outlined above, in the
general build instructions (or alternatively using the method with LIBSVM).

If you have problems compiling the NIfTI libraries and PyNIfTI on OpenSUSE, try
the following: Download the `nifticlib` source tarball, extract it and run
`make` in the top-level source directory. Be sure to install the `zlib-devel`
package beforehand. Now, download the `pynifti` source tarball, extract it, and
edit `setup.py`. Change the line::

  libraries = [ 'niftiio' ],

to::

  libraries = [ 'niftiio', 'znz', 'z' ],

as mentioned in the PyNIfTI installation instructions. This is necessary, as
the above approach only generates static NIfTI libraries which are not
properly linked with all dependencies. Now, compile PyNIfTI with::

  python setup.py build_ext -I <path_to_nifti>/include \
      -L <path_to_nifti>/lib --swig-opts="-I<path_to_nifti>/include"

where `<path_to_nifti>` is the directory that contains the extracted
`nifticlib` sources. Finally, install PyNIfTI with::

  sudo python setup.py install
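
The manual edit of the `libraries` line in PyNIfTI's `setup.py` described
above can also be scripted. The sketch below is not part of PyNIfTI or PyMVPA,
and its function names are made up. It assumes the line appears in `setup.py`
exactly as quoted above and refuses to change the file otherwise::

  OLD = "libraries = [ 'niftiio' ],"
  NEW = "libraries = [ 'niftiio', 'znz', 'z' ],"


  def patch_libraries(text):
      """Return `text` with the `libraries` line extended by znz and z.

      Raises ValueError unless the expected line occurs exactly once.
      """
      if text.count(OLD) != 1:
          raise ValueError("expected exactly one %r line" % OLD)
      return text.replace(OLD, NEW)


  def patch_file(path="setup.py"):
      """Apply patch_libraries() to the file at `path` in place."""
      f = open(path)
      try:
          text = f.read()
      finally:
          f.close()
      # Compute the new content first, so that an unexpected file is
      # never opened for writing.
      patched = patch_libraries(text)
      f = open(path, "w")
      try:
          f.write(patched)
      finally:
          f.close()

Calling `patch_file()` in the extracted `pynifti` directory before the
`build_ext` step replaces the manual edit.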

If you want to run the PyMVPA examples, including the ones that make use of the
plotting capabilities of `matplotlib`, you need to install a few more packages
(mostly due to broken dependencies in the corresponding OpenSUSE packages):

  * `python-scipy`
  * `python-gobject2`
  * `python-gtk`


How to cite PyMVPA
~~~~~~~~~~~~~~~~~~

The PyMVPA toolbox was first presented with a poster_ at the annual meeting of
the *German Society for Psychophysiology and its Application* in Magdeburg,
2008. This is currently the preferred way to cite PyMVPA. However, we have
submitted a paper introducing the toolbox, which should replace the poster soon.

.. _poster: http://pkg-exppsy.alioth.debian.org/pymvpa/files/PyMVPA_PuG2008.pdf


Credits
~~~~~~~

(needs some more words, for now just a list)

  * NumPy, SciPy
  * LIBSVM
  * Shogun
  * IPython
  * Debian (for hosting, environment, ...)
  * FOSS community
  * Credits to individual labs if they officially donate time ;-)

.. Please add some notes when you think that you should give credits to someone
   that enables or motivates you to work on PyMVPA ;-)
