[Numpy-discussion] ANN: pandas 0.10.0 released

Wes McKinney wesmckinn@gmail....
Mon Dec 17 11:19:49 CST 2012


hi all,

I'm super excited to announce the pandas 0.10.0 release. This is
a major release. It includes a new high-performance file reading
engine with tons of new user-facing functionality, a bunch of
work on the HDF5/PyTables integration layer, much-expanded
Unicode support, a new option/configuration interface,
integration with the Google Analytics API, and a wide array of
other new features, bug fixes, and performance improvements. I
strongly recommend that all users upgrade as soon as feasible.
Many of the performance improvements over 0.9.x are quite
substantial; see the vbenchmarks at the end of this e-mail.

As of this release, we are no longer supporting Python 2.5. Also,
this is the first release to officially support Python 3.3.

Note: there are a number of minor but necessary API changes that
long-time pandas users should pay attention to in the What's New.

Thanks to all who contributed to this release, especially Chang
She, Yoval P, and Jeff Reback (and everyone else listed in the
commit log!).

As always, source archives and Windows installers are on PyPI.

What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html
Installers: http://pypi.python.org/pypi/pandas

$ git log v0.9.1..v0.10.0 --pretty=format:%aN | sort | uniq -c | sort -rn
    246 Wes McKinney
    140 y-p
     99 Chang She
     45 jreback
     18 Abraham Flaxman
     17 Jeff Reback
     14 locojaydev
     11 Keith Hughitt
      5 Adam Obeng
      2 Dieter Vandenbussche
      1 zach powers
      1 Luke Lee
      1 Laurent Gautier
      1 Ken Van Haren
      1 Jay Bourque
      1 Donald Curtis
      1 Chris Mulligan
      1 alex arsenovic
      1 A. Flaxman
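
If you'd rather do the tally in Python than with sort and uniq, here is a minimal sketch of the same idea. The function names author_names and tally and the sample list are made up for illustration; author_names assumes git is on your PATH and that you run it inside a pandas checkout. Authors with the same commit count may come out in a different order than with sort -rn.

```python
from collections import Counter
import subprocess


def author_names(rev_range="v0.9.1..v0.10.0"):
    """Return one author name per commit in rev_range.

    Assumes git is installed and the current directory is a checkout
    that contains both tags.
    """
    out = subprocess.run(
        ["git", "log", rev_range, "--pretty=format:%aN"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line.strip()]


def tally(names):
    """Count commits per author, highest count first."""
    return Counter(names).most_common()


# Made-up sample input, not real commit data.
sample = ["alice", "bob", "alice", "carol", "alice", "bob"]
for name, count in tally(sample):
    print(f"{count:7d} {name}")
```

To run it against a real repository, pass author_names() to tally instead of the sample list.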

Happy data hacking!

- Wes

What is it
==========
pandas is a Python package providing fast, flexible, and
expressive data structures designed to make working with
relational, time series, or any other kind of labeled data both
easy and intuitive. It aims to be the fundamental high-level
building block for doing practical, real-world data analysis in
Python.

Links
=====
Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst
Documentation: http://pandas.pydata.org
Installers: http://pypi.python.org/pypi/pandas
Code Repository: http://github.com/pydata/pandas
Mailing List: http://groups.google.com/group/pydata

Performance vs. v0.9.0
======================

Benchmarks from https://github.com/pydata/pandas/tree/master/vb_suite
Ratio < 1 means that v0.10.0 is faster

                                           v0.10.0     v0.9.0      ratio
name
unstack_sparse_keyspace                     1.2813   144.1262     0.0089
groupby_frame_apply_overhead               20.1520   337.3330     0.0597
read_csv_comment2                          25.3097   363.2860     0.0697
groupbym_frame_apply                       75.1554   504.1661     0.1491
frame_iteritems_cached                      0.0711     0.3919     0.1815
read_csv_thou_vb                           35.2690   191.9360     0.1838
concat_small_frames                        12.9019    55.3561     0.2331
join_dataframe_integer_2key                 5.8184    21.5823     0.2696
series_value_counts_strings                 5.3824    19.1262     0.2814
append_frame_single_homogenous              0.3413     0.9319     0.3662
read_csv_vb                                18.4084    46.9500     0.3921
read_csv_standard                          12.0651    29.9940     0.4023
panel_from_dict_all_different_indexes      73.6860   158.2949     0.4655
frame_constructor_ndarray                   0.0471     0.0958     0.4918
groupby_first                               3.8502     7.1988     0.5348
groupby_last                                3.6962     6.7792     0.5452
panel_from_dict_two_different_indexes      50.7428    86.4980     0.5866
append_frame_single_mixed                   1.2950     2.1930     0.5905
frame_get_numeric_data                      0.0695     0.1119     0.6212
replace_fillna                              4.6349     7.0540     0.6571
frame_to_csv                              281.9340   427.7921     0.6590
replace_replacena                           4.7154     7.1207     0.6622
frame_iteritems                             2.5862     3.7463     0.6903
series_align_int64_index                   29.7370    41.2791     0.7204
join_dataframe_integer_key                  1.7980     2.4303     0.7398
groupby_multi_size                         31.0066    41.7001     0.7436
groupby_frame_singlekey_integer             2.3579     3.1649     0.7450
write_csv_standard                        326.8259   427.3241     0.7648
groupby_simple_compress_timing             41.2113    52.3993     0.7865
frame_fillna_inplace                       16.2843    20.0491     0.8122
reindex_fillna_backfill                     0.1364     0.1667     0.8181
groupby_multi_series_op                    15.2914    18.6651     0.8193
groupby_multi_cython                       17.2169    20.4420     0.8422
frame_fillna_many_columns_pad              14.9510    17.5114     0.8538
panel_from_dict_equiv_indexes              25.8427    29.9682     0.8623
merge_2intkey_nosort                       19.0755    22.1138     0.8626
sparse_series_to_frame                    167.8529   192.9920     0.8697
reindex_fillna_pad                          0.1410     0.1617     0.8720
merge_2intkey_sort                         44.7863    51.3315     0.8725
reshape_stack_simple                        2.6698     3.0502     0.8753
groupby_indices                             7.2264     8.2314     0.8779
sort_level_one                              4.3845     4.9902     0.8786
sort_level_zero                             4.3362     4.9198     0.8814
write_store                                16.0587    18.2042     0.8821
frame_reindex_both_axes                     0.3726     0.4183     0.8907
groupby_multi_different_numpy_functions    13.4164    15.0509     0.8914
index_int64_intersection                   25.3705    28.1867     0.9001
groupby_frame_median                        7.7491     8.6011     0.9009
frame_drop_dup_na_inplace                   2.6290     2.9155     0.9017
dataframe_reindex_columns                   0.3052     0.3372     0.9049
join_dataframe_index_multi                 20.5651    22.6893     0.9064
frame_ctor_list_of_dict                   101.7439   112.2260     0.9066
groupby_pivot_table                        18.4551    20.3184     0.9083
reindex_frame_level_align                   0.9644     1.0531     0.9158
stat_ops_level_series_sum_multiple          7.3637     8.0230     0.9178
write_store_mixed                          38.2528    41.6604     0.9182
frame_reindex_both_axes_ix                  0.4550     0.4950     0.9192
stat_ops_level_frame_sum_multiple           8.1975     8.9055     0.9205
panel_from_dict_same_index                 25.7938    28.0147     0.9207
groupby_series_simple_cython                5.1310     5.5624     0.9224
frame_sort_index_by_columns                41.9577    45.1816     0.9286
groupby_multi_python                       54.9727    59.0400     0.9311
datetimeindex_add_offset                    0.2417     0.2584     0.9356
frame_boolean_row_select                    0.2905     0.3100     0.9373
frame_reindex_axis1                         2.9760     3.1742     0.9376
stat_ops_level_series_sum                   2.3382     2.4937     0.9376
groupby_multi_different_functions          14.0333    14.9571     0.9382
timeseries_timestamp_tzinfo_cons            0.0159     0.0169     0.9397
stats_rolling_mean                          1.6904     1.7959     0.9413
melt_dataframe                              1.5236     1.6181     0.9416
timeseries_asof_single                      0.0548     0.0582     0.9416
frame_ctor_nested_dict_int64              134.3100   142.6389     0.9416
join_dataframe_index_single_key_bigger     15.6578    16.5949     0.9435
stat_ops_level_frame_sum                    3.2475     3.4414     0.9437
indexing_dataframe_boolean_rows             0.2382     0.2518     0.9459
timeseries_asof_nan                        10.0433    10.6006     0.9474
frame_reindex_axis0                         1.4403     1.5184     0.9485
concat_series_axis1                        69.2988    72.8099     0.9518
join_dataframe_index_single_key_small       6.8492     7.1847     0.9533
dataframe_reindex_daterange                 0.4054     0.4240     0.9562
join_dataframe_index_single_key_bigger      6.4616     6.7578     0.9562
timeseries_timestamp_downsample_mean        4.5849     4.7787     0.9594
frame_fancy_lookup                          2.5498     2.6544     0.9606
series_value_counts_int64                   2.5569     2.6581     0.9619
frame_fancy_lookup_all                     30.7510    31.8465     0.9656
index_int64_union                          82.2279    85.1500     0.9657
indexing_dataframe_boolean_rows_object      0.4809     0.4977     0.9662
frame_ctor_nested_dict                     91.6129    94.8122     0.9663
stat_ops_series_std                         0.2450     0.2533     0.9673
groupby_frame_cython_many_columns           3.7642     3.8894     0.9678
timeseries_asof                            10.4352    10.7721     0.9687
series_ctor_from_dict                       3.7707     3.8749     0.9731
frame_drop_dup_inplace                      3.0007     3.0746     0.9760
timeseries_large_lookup_value               0.0242     0.0248     0.9764
read_table_multiple_date_baseline        1201.2930  1224.3881     0.9811
dti_reset_index                             0.6339     0.6457     0.9817
read_table_multiple_date                 2600.7280  2647.8729     0.9822
reindex_frame_level_reindex                 0.9524     0.9674     0.9845
reindex_multiindex                          1.3483     1.3685     0.9853
frame_insert_500_columns                  102.1249   103.4329     0.9874
frame_drop_duplicates                      19.3780    19.6157     0.9879
reindex_daterange_backfill                  0.1870     0.1889     0.9899
stats_rank2d_axis0_average                 25.0480    25.2801     0.9908
series_align_left_monotonic                13.1929    13.2558     0.9953
timeseries_add_irregular                   22.4635    22.5122     0.9978
read_store_mixed                           13.4398    13.4560     0.9988
lib_fast_zip                               11.1289    11.1354     0.9994
match_strings                               0.3831     0.3833     0.9995
read_store                                  5.5526     5.5290     1.0043
timeseries_sort_index                      22.7172    22.5976     1.0053
timeseries_1min_5min_mean                   0.6224     0.6175     1.0079
stats_rank2d_axis1_average                 14.6569    14.5339     1.0085
reindex_daterange_pad                       0.1886     0.1867     1.0102
timeseries_period_downsample_mean           6.4241     6.3480     1.0120
frame_drop_duplicates_na                   19.3303    19.0970     1.0122
stats_rank_average_int                     23.3569    22.9996     1.0155
lib_fast_zip_fillna                        14.1394    13.8473     1.0211
index_datetime_intersection                17.2626    16.8986     1.0215
timeseries_1min_5min_ohlc                   0.7054     0.6891     1.0237
stats_rank_average                         31.3440    30.3845     1.0316
timeseries_infer_freq                      10.9854    10.6439     1.0321
timeseries_slice_minutely                   0.0637     0.0611     1.0418
index_datetime_union                       17.9083    17.1640     1.0434
series_align_irregular_string              89.9470    85.1344     1.0565
series_constructor_ndarray                  0.0127     0.0119     1.0742
indexing_panel_subset                       0.5692     0.5214     1.0917
groupby_apply_dict_return                  46.3497    42.3220     1.0952
reshape_unstack_simple                      3.2901     2.9089     1.1310
timeseries_to_datetime_iso8601              4.2305     3.6015     1.1746
frame_to_string_floats                     53.6217    37.2041     1.4413
reshape_pivot_time_series                 170.4340   107.9068     1.5795
sparse_frame_constructor                    6.2714     3.5053     1.7891
datetimeindex_normalize                    37.2718     6.9329     5.3761

Columns: test_name | target_duration [ms] | baseline_duration [ms] | ratio
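
To make the ratio column concrete, here is a small sketch that recomputes it as target duration divided by baseline duration for three rows copied from the table above. The helper name speed_ratio and the printed layout are my own and not part of vb_suite.

```python
def speed_ratio(target_ms, baseline_ms):
    """Target (v0.10.0) time divided by baseline (v0.9.0) time.

    A value below 1 means the target version is faster.
    """
    return target_ms / baseline_ms


# (name, v0.10.0 ms, v0.9.0 ms), copied from the table above.
rows = [
    ("unstack_sparse_keyspace", 1.2813, 144.1262),
    ("read_csv_vb", 18.4084, 46.9500),
    ("datetimeindex_normalize", 37.2718, 6.9329),
]

for name, target, baseline in rows:
    ratio = speed_ratio(target, baseline)
    verdict = "faster" if ratio < 1 else "slower"
    # Express the change as a "times faster/slower" factor >= 1.
    factor = max(ratio, 1 / ratio)
    print(f"{name:<28}{ratio:8.4f}  {factor:6.1f}x {verdict}")
```

Read this way, a ratio of 0.5 means the benchmark runs in half the time on v0.10.0, and a ratio of 2 means it takes twice as long.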

