[Numpy-discussion] [SciPy-Dev] Good-bye, sort of (John Hunter)
josef.pktd@gmai...
Sun Aug 15 14:40:17 CDT 2010
On Sun, Aug 15, 2010 at 1:43 PM, Sturla Molden <sturla@molden.no> wrote:
>> Could those contributing here put up a Cookbook page of "reasons why
>> we've moved on from MATLAB", to be used as a resource by people trying
>> to convince supervisors/professors/sponsors/clients that they should
>> be allowed to use Python?
> There are many reasons, to name a few:
> 1. Ease of programming. Python is just a better language. Python is a
> general purpose language, MATLAB is restricted to simple numerical code.
> Python is easier to read (Matlab is more like Ruby or Lua).
Organizing the code for a project is a lot more difficult in matlab
than in python: modules, classes, and data structures other than
structs are missing, weaker or messier.
I find numerical code easy to read and write in matlab.
> 2. Pass-by-value of arrays in Matlab. This basically means Matlab is
> unusable for work with large data sets. Serious Matlab programming usually
> means C programming with MEX wrappers. I find myself writing much more C or
> C++ when working with Matlab, which seriously impairs my productivity and
> creativity (it is easier to prototype and experiment in Python than C).
It's not fully pass-by-value: a copy is only made if the matrix/array
is changed, not if it is only accessed to read the data. And from
the comments I have seen, they keep improving lazy copying.
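
For comparison on the NumPy side, a minimal sketch of how array arguments behave. The helper names `scale_in_place` and `scale_copy` are made up for illustration. An array is passed as a reference to the same buffer, so an in-place change inside a function is visible to the caller, and a copy is made only when the code asks for one:

```python
import numpy as np

def scale_in_place(a, factor):
    # Hypothetical helper: modifies the caller's buffer, no copy is made.
    a *= factor

def scale_copy(a, factor):
    # Hypothetical helper: a * factor allocates and returns a new array.
    return a * factor

x = np.arange(5, dtype=float)
y = scale_copy(x, 2.0)       # x is unchanged, y is a new array
scale_in_place(x, 2.0)       # x itself is now doubled

print(np.shares_memory(x, y))  # False: y lives in its own buffer
print(np.array_equal(x, y))    # True: both now hold the doubled values
```
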
> 3. Better integration with C and Fortran in Python. Cython, ctypes and
> f2py are a lot easier than hand-coding CMEX and FMEX wrappers. C and
> Fortran integration is important for scientific computing. Also, Matlab
> does not play well with Fortran 95, just Fortran 77. For scientific
> computing, C, C++ and Fortran 77 are PITAs compared to Fortran 95. I even
> prefer Cython to the former three.
> 4. Matlab is single-threaded. Python allows threads, although there is a
> GIL (which can be freed when running C or Fortran library code). Matlab
> uses MKL for multi-cores, NumPy/SciPy can be compiled against MKL or ACML.
> Threads are not just for parallel processing, but also for I/O and GUI
> interaction.
since 2007a:
"MATLAB has multithreaded computation support for many linear algebra
and element-wise numeric operations, allowing performance improvement
on multicore and multiprocessor systems."
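
To illustrate the point about Python threads being useful for I/O despite the GIL, here is a small sketch using only the standard library. The function `slow_io` is a stand-in invented for this example: it just sleeps, which releases the GIL the same way a blocking read or network call would, so the four waits overlap instead of running one after another.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_io(n):
    # Hypothetical stand-in for a blocking read or network request.
    # time.sleep releases the GIL, as real blocking I/O does.
    time.sleep(0.1)
    return n * n

# Four worker threads wait concurrently; map() returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(slow_io, range(4)))

print(results)  # [0, 1, 4, 9]
```
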
> 5. Heap-fragmentation makes long-running (32-bit) Matlab processes
> unstable. Instead of fixing the broken allocator, Matlab has a "pack"
> command that must be run e.g. once an hour. A long-running Python process
> can be trusted to complete if the code is stable, not so with Matlab in my
> experience. One has to save the state of the Matlab simulation/analysis
> regularly, and restart from where it stopped.
> 6. Matlab saves figures by taking a copy of the screen buffer. It is very
> annoying to discover that a long-running process has saved dozens of
> images of the screen saver, Windows' login screen, totally black images,
> or the mouse pointer.
I have never heard of or seen this.
> 7. Can we use OpenGL or VTK easily from Matlab?
>
> 8. GUI programming: Compare Matlab's rudimentary GUI support with wxPython
> or PyQt.
>
> 9. It is easy to use MPI from Python (e.g. mpi4py or just ctypes with the
> MPI library). The multiprocessing package also makes Python
> multiprocessing and IPC easy. Matlab has a huge footprint. Do you want
> multiple Matlab instances running in parallel? How much RAM do you have?
I found the parallel processing toolbox (mainly using parfor)
relatively easy to use to keep four cores busy with some lengthy
nested optimization loops.
> 10. Memory-mapping is not usable in Matlab due to e.g. pass-by-value
> semantics, even though it is theoretically possible to memory map a file
> using a C MEX function. In Python we can memory map a file and alias the
> buffer with a NumPy array.
> 11. Database-support. Python has support for most major databases, and
> even has embedded databases included in the standard library (e.g. bsddb and
> sqlite). Python supports HDF5 (PyTables and h5py). Efficient use of HDF5
> is difficult from Matlab due to pass-by-value semantics.
I never needed it, but with odbc and jdbc support and
http://developer.berlios.de/projects/mksqlite/ database support
shouldn't be so bad.
http://www.mathworks.com/access/helpdesk/help/toolbox/database/
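
As an illustration of the embedded database that ships with Python, a minimal sketch using the standard-library `sqlite3` module. The table name `runs` and its columns are made up for this example, and an in-memory database is used so nothing is written to disk:

```python
import sqlite3

# ":memory:" creates a throwaway database that lives only in this process.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (name TEXT, value REAL)")  # hypothetical schema
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [("a", 1.5), ("b", 2.5), ("a", 3.0)],
)
conn.commit()

# Aggregate per name; ORDER BY makes the row order deterministic.
rows = conn.execute(
    "SELECT name, SUM(value) FROM runs GROUP BY name ORDER BY name"
).fetchall()
conn.close()

print(rows)  # [('a', 4.5), ('b', 2.5)]
```
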
> 12. Python costs less, even when subscribing to Enthought.
>
> 13. Matlab toolboxes cost extra. There is no extra fee to use scipy.signal
> and scipy.stats.
>
> 14. Google and NASA use Python.
>
> 15. It is easier to teach Python to students. My university (Oslo) uses
> Python as teaching language for that reason. That is, with this textbook
> written for teaching Python to science students:
>
> http://books.google.no/books?id=cVof07z_rA4C&printsec=frontcover#v=onepage&q&f=false
>
> 16. Matplotlib makes nicer figures (e.g. antialiased rendering).
>
> 17. It is easy to spawn processes from Python and communicate with pipes.
> Thus we can use external programs easily from Python.
>
> 18. Python's standard library has support for almost any conceivable
> problem. The number of external libraries is immense.
>
> 19. Another memory issue: NumPy does not make a copy when creating a view.
> We don't need to copy data to reference strided array sections or
> subarrays. NumPy works like Fortran 95 pointers, which is very memory
> efficient and expressive for numerical work.
>
> 20. Etc. (I could go on and on, but have to stop somewhere.)
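
Points 10 and 19 in the list above (memory-mapping a file and aliasing it with a NumPy array, and views that do not copy) can be sketched together. This is only an illustration under simple assumptions: the file name `data.bin`, the shape, and the dtype are chosen for the example, and the file is created in a temporary directory.

```python
import os
import tempfile
import numpy as np

# Write 12 doubles to a throwaway file (hypothetical name and location).
path = os.path.join(tempfile.mkdtemp(), "data.bin")
np.arange(12, dtype=np.float64).tofile(path)

# Map the file into memory and treat it as a 3x4 array; no data is read eagerly.
m = np.memmap(path, dtype=np.float64, mode="r+", shape=(3, 4))

col = m[:, 1]        # strided view of the second column, no copy
col[:] = -1.0        # writes go through the view into the mapped file
m.flush()            # push the changes to disk

shares = np.shares_memory(m, col)
on_disk = np.fromfile(path, dtype=np.float64).reshape(3, 4)

print(shares)          # True: the view aliases the mapped buffer
print(on_disk[:, 1])   # the second column read back from the file
```
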
> Some strong points for Matlab:
>
> 1. The IDE, debugger, and interactive prompt. ipython cannot compare to
> Matlab.
>
> 2. FFTW.
>
> 3. Java VM.
>
> 4. More users.
Also more freely available packages in some (scientific) areas/fields,
and (commercial) toolboxes are available for many areas.
> 5. Better documentation.
My impression is that matlab is improving pretty fast. I have seen big
improvements between 2006a and 2009b or 2010a: stats and econometrics
toolboxes, parallel support, new classes. Some of the criticism of an
old matlab might not be accurate anymore.
And matlab is doing pretty well:
http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
I feel I have to defend MATLAB, because besides GAUSS (and of course
numpy), it's the best matrix/array language, which is very nice for
writing econometrics/statistics code (and I find it much more readable
than R, Stata or SAS code).
Josef
>> A similar page for IDL would be great....and did anyone notice that
>> IDL 8.0 has a number of language enhancements, all designed to make it
>> more like Python? Sadly, they fall well short.
> IDL feels too much like FORTRAN IV, which by the way is an F-word.
> Sturla
