[Scipy-svn] r3129 - in trunk/Lib/sandbox/pyem/data: . iris iris/src

scipy-svn@scip... scipy-svn@scip...
Sun Jul 1 06:36:06 CDT 2007


Author: cdavid
Date: 2007-07-01 06:35:54 -0500 (Sun, 01 Jul 2007)
New Revision: 3129

Added:
   trunk/Lib/sandbox/pyem/data/iris/
   trunk/Lib/sandbox/pyem/data/iris/COPYING
   trunk/Lib/sandbox/pyem/data/iris/data.py
   trunk/Lib/sandbox/pyem/data/iris/data.pyc
   trunk/Lib/sandbox/pyem/data/iris/iris.py
   trunk/Lib/sandbox/pyem/data/iris/iris.pyc
   trunk/Lib/sandbox/pyem/data/iris/src/
   trunk/Lib/sandbox/pyem/data/iris/src/convert.py
   trunk/Lib/sandbox/pyem/data/iris/src/iris.data
   trunk/Lib/sandbox/pyem/data/iris/src/iris.names
Modified:
   trunk/Lib/sandbox/pyem/data/setup.py
Log:
Add iris data from UCI ML database

Added: trunk/Lib/sandbox/pyem/data/iris/COPYING
===================================================================
--- trunk/Lib/sandbox/pyem/data/iris/COPYING	2007-07-01 10:04:07 UTC (rev 3128)
+++ trunk/Lib/sandbox/pyem/data/iris/COPYING	2007-07-01 11:35:54 UTC (rev 3129)
@@ -0,0 +1,34 @@
+# The code and descriptive text is copyrighted and offered under the terms of
+# the BSD License from the authors; see below. However, the actual dataset may
+# have a different origin and intellectual property status. See the SOURCE and
+# COPYRIGHT variables for this information.
+
+# Copyright (c) 2007 David Cournapeau <cournape@gmail.com>
+#
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the author nor the names of any contributors may be used
+#       to endorse or promote products derived from this software without
+#       specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+# TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Added: trunk/Lib/sandbox/pyem/data/iris/data.py
===================================================================
--- trunk/Lib/sandbox/pyem/data/iris/data.py	2007-07-01 10:04:07 UTC (rev 3128)
+++ trunk/Lib/sandbox/pyem/data/iris/data.py	2007-07-01 11:35:54 UTC (rev 3129)
@@ -0,0 +1,122 @@
+#! /usr/bin/env python
+# -*- coding: utf-8 -*-
+# Last Change: Sun Jul 01 08:00 PM 2007 J
+
+# The code and descriptive text is copyrighted and offered under the terms of
+# the BSD License from the authors; see below. However, the actual dataset may
+# have a different origin and intellectual property status. See the SOURCE and
+# COPYRIGHT variables for this information.
+
+# Copyright (c) 2007 David Cournapeau <cournape@gmail.com>
+#
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the author nor the names of any contributors may be used
+#       to endorse or promote products derived from this software without
+#       specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
+# TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
+# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+# WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+# ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+"""Iris dataset."""
+
+__docformat__ = 'restructuredtext'
+
+COPYRIGHT   = """See SOURCE. """
+TITLE       = "Iris Plants Database"
+SOURCE      = """Creator: R.A. Fisher 
+Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
+Date: July, 1988
+
+This is a copy of the UCI ML iris dataset, except that the data are in mm
+instead of cm, so that the values can be stored exactly as integers.
+
+References: 
+   - Fisher,R.A. "The use of multiple measurements in taxonomic problems"
+     Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to
+     Mathematical Statistics" (John Wiley, NY, 1950).
+   - Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
+     (Q327.D83) John Wiley & Sons.  ISBN 0-471-22361-1.  See page 218.
+   - Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
+     Structure and Classification Rule for Recognition in Partially Exposed
+     Environments".  IEEE Transactions on Pattern Analysis and Machine
+     Intelligence, Vol. PAMI-2, No. 1, 67-71.
+   - Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule".  IEEE Transactions
+     on Information Theory, May 1972, 431-433.
+   - See also: 1988 MLC Proceedings, 54-64.  Cheeseman et al's AUTOCLASS II
+     conceptual clustering system finds 3 classes in the data."""
+
+DESCRSHORT  = """The famous Iris database, first used by Sir R.A. Fisher"""
+
+DESCRLONG   = """This is perhaps the best known database to be found in the
+pattern recognition literature.  Fisher's paper is a classic in the field and
+is referenced frequently to this day.  (See Duda & Hart, for example.)  The
+data set contains 3 classes of 50 instances each, where each class refers to a
+type of iris plant.  One class is linearly separable from the other 2; the
+latter are NOT linearly separable from each other.  """
+
+NOTE        = """
+Number of Instances: 150 (50 in each of three classes)
+
+Number of Attributes: 4 numeric, predictive attributes and the class
+
+Attribute Information:
+   - sepal length in mm
+   - sepal width in mm
+   - petal length in mm
+   - petal width in mm
+   - class: 
+        - Iris-setosa
+        - Iris-versicolor
+        - Iris-virginica
+
+Missing Attribute Values: None
+
+Class Distribution: 33.3% for each of 3 classes.
+"""
+
+def load():
+    """Load the iris data and return them.
+    
+    :returns:
+        data: dict
+            a dict mapping each class name to a record array of its data.
+    """
+    import numpy
+    from iris import SL, SW, PL, PW, CLI
+    PW = numpy.array(PW).astype(float)
+    PL = numpy.array(PL).astype(float)
+    SW = numpy.array(SW).astype(float)
+    SL = numpy.array(SL).astype(float)
+    data    = {}
+    for i in CLI.items():
+        name = i[0][5:]
+        data[name] = numpy.empty(len(i[1]), [('petal width', int),
+                        ('petal length', int),
+                        ('sepal width', int),
+                        ('sepal length', int)])
+        data[name]['petal width'] = numpy.round(PW[i[1]] * 10)
+        data[name]['petal length'] = numpy.round(PL[i[1]] * 10)
+        data[name]['sepal width'] = numpy.round(SW[i[1]] * 10)
+        data[name]['sepal length'] = numpy.round(SL[i[1]] * 10)
+    
+    return data
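
The conversion done in load() can be tried on a few values without the rest of the package. What follows is a minimal sketch, not the committed code: it assumes a recent NumPy, uses the first three setosa rows copied from iris.data, and the helper name to_mm is hypothetical.

```python
import numpy as np

# First three setosa rows of iris.data, kept as strings in cm like iris.py does.
SL = ['5.1', '4.9', '4.7']   # sepal length
PW = ['0.2', '0.2', '0.2']   # petal width


def to_mm(values):
    """Hypothetical helper: decimal-cm strings -> integer millimetres."""
    cm = np.array(values).astype(float)
    # Round before casting so floating-point error cannot shift a value.
    return np.round(cm * 10).astype(int)


# Structured (record-style) array with named integer fields, as in load().
setosa = np.empty(len(SL), dtype=[('sepal length', int), ('petal width', int)])
setosa['sepal length'] = to_mm(SL)
setosa['petal width'] = to_mm(PW)

# load() derives the dict key by dropping the 'Iris-' prefix from the label.
name = 'Iris-setosa'[5:]

print(name, setosa['sepal length'], setosa['petal width'])
```

Fields are then read by name, e.g. setosa['sepal length'], which mirrors how the per-class arrays returned by load() would be indexed.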

Added: trunk/Lib/sandbox/pyem/data/iris/data.pyc
===================================================================
(Binary files differ)


Property changes on: trunk/Lib/sandbox/pyem/data/iris/data.pyc
___________________________________________________________________
Name: svn:mime-type
   + application/octet-stream

Added: trunk/Lib/sandbox/pyem/data/iris/iris.py
===================================================================
--- trunk/Lib/sandbox/pyem/data/iris/iris.py	2007-07-01 10:04:07 UTC (rev 3128)
+++ trunk/Lib/sandbox/pyem/data/iris/iris.py	2007-07-01 11:35:54 UTC (rev 3129)
@@ -0,0 +1,12 @@
+# Autogenerated by convert.py at Sun, 01 Jul 2007 10:36:56 +0000
+
+SL = ['5.1', '4.9', '4.7', '4.6', '5.0', '5.4', '4.6', '5.0', '4.4', '4.9', '5.4', '4.8', '4.8', '4.3', '5.8', '5.7', '5.4', '5.1', '5.7', '5.1', '5.4', '5.1', '4.6', '5.1', '4.8', '5.0', '5.0', '5.2', '5.2', '4.7', '4.8', '5.4', '5.2', '5.5', '4.9', '5.0', '5.5', '4.9', '4.4', '5.1', '5.0', '4.5', '4.4', '5.0', '5.1', '4.8', '5.1', '4.6', '5.3', '5.0', '7.0', '6.4', '6.9', '5.5', '6.5', '5.7', '6.3', '4.9', '6.6', '5.2', '5.0', '5.9', '6.0', '6.1', '5.6', '6.7', '5.6', '5.8', '6.2', '5.6', '5.9', '6.1', '6.3', '6.1', '6.4', '6.6', '6.8', '6.7', '6.0', '5.7', '5.5', '5.5', '5.8', '6.0', '5.4', '6.0', '6.7', '6.3', '5.6', '5.5', '5.5', '6.1', '5.8', '5.0', '5.6', '5.7', '5.7', '6.2', '5.1', '5.7', '6.3', '5.8', '7.1', '6.3', '6.5', '7.6', '4.9', '7.3', '6.7', '7.2', '6.5', '6.4', '6.8', '5.7', '5.8', '6.4', '6.5', '7.7', '7.7', '6.0', '6.9', '5.6', '7.7', '6.3', '6.7', '7.2', '6.2', '6.1', '6.4', '7.2', '7.4', '7.9', '6.4', '6.3', '6.1', '7.7', '6.3', '6.4', '6.0', '6.9', '6.7', '6.9', '5.8', '6.8', '6.7', '6.7', '6.3', '6.5', '6.2', '5.9']
+
+SW = ['3.5', '3.0', '3.2', '3.1', '3.6', '3.9', '3.4', '3.4', '2.9', '3.1', '3.7', '3.4', '3.0', '3.0', '4.0', '4.4', '3.9', '3.5', '3.8', '3.8', '3.4', '3.7', '3.6', '3.3', '3.4', '3.0', '3.4', '3.5', '3.4', '3.2', '3.1', '3.4', '4.1', '4.2', '3.1', '3.2', '3.5', '3.1', '3.0', '3.4', '3.5', '2.3', '3.2', '3.5', '3.8', '3.0', '3.8', '3.2', '3.7', '3.3', '3.2', '3.2', '3.1', '2.3', '2.8', '2.8', '3.3', '2.4', '2.9', '2.7', '2.0', '3.0', '2.2', '2.9', '2.9', '3.1', '3.0', '2.7', '2.2', '2.5', '3.2', '2.8', '2.5', '2.8', '2.9', '3.0', '2.8', '3.0', '2.9', '2.6', '2.4', '2.4', '2.7', '2.7', '3.0', '3.4', '3.1', '2.3', '3.0', '2.5', '2.6', '3.0', '2.6', '2.3', '2.7', '3.0', '2.9', '2.9', '2.5', '2.8', '3.3', '2.7', '3.0', '2.9', '3.0', '3.0', '2.5', '2.9', '2.5', '3.6', '3.2', '2.7', '3.0', '2.5', '2.8', '3.2', '3.0', '3.8', '2.6', '2.2', '3.2', '2.8', '2.8', '2.7', '3.3', '3.2', '2.8', '3.0', '2.8', '3.0', '2.8', '3.8', '2.8', '2.8', '2.6', '3.0', '3.4', '3.1', '3.0', '3.1', '3.1', '3.1', '2.7', '3.2', '3.3', '3.0', '2.5', '3.0', '3.4', '3.0']
+
+PL = ['1.4', '1.4', '1.3', '1.5', '1.4', '1.7', '1.4', '1.5', '1.4', '1.5', '1.5', '1.6', '1.4', '1.1', '1.2', '1.5', '1.3', '1.4', '1.7', '1.5', '1.7', '1.5', '1.0', '1.7', '1.9', '1.6', '1.6', '1.5', '1.4', '1.6', '1.6', '1.5', '1.5', '1.4', '1.5', '1.2', '1.3', '1.5', '1.3', '1.5', '1.3', '1.3', '1.3', '1.6', '1.9', '1.4', '1.6', '1.4', '1.5', '1.4', '4.7', '4.5', '4.9', '4.0', '4.6', '4.5', '4.7', '3.3', '4.6', '3.9', '3.5', '4.2', '4.0', '4.7', '3.6', '4.4', '4.5', '4.1', '4.5', '3.9', '4.8', '4.0', '4.9', '4.7', '4.3', '4.4', '4.8', '5.0', '4.5', '3.5', '3.8', '3.7', '3.9', '5.1', '4.5', '4.5', '4.7', '4.4', '4.1', '4.0', '4.4', '4.6', '4.0', '3.3', '4.2', '4.2', '4.2', '4.3', '3.0', '4.1', '6.0', '5.1', '5.9', '5.6', '5.8', '6.6', '4.5', '6.3', '5.8', '6.1', '5.1', '5.3', '5.5', '5.0', '5.1', '5.3', '5.5', '6.7', '6.9', '5.0', '5.7', '4.9', '6.7', '4.9', '5.7', '6.0', '4.8', '4.9', '5.6', '5.8', '6.1', '6.4', '5.6', '5.1', '5.6', '6.1', '5.6', '5.5', '4.8', '5.4', '5.6', '5.1', '5.1', '5.9', '5.7', '5.2', '5.0', '5.2', '5.4', '5.1']
+
+PW = ['0.2', '0.2', '0.2', '0.2', '0.2', '0.4', '0.3', '0.2', '0.2', '0.1', '0.2', '0.2', '0.1', '0.1', '0.2', '0.4', '0.4', '0.3', '0.3', '0.3', '0.2', '0.4', '0.2', '0.5', '0.2', '0.2', '0.4', '0.2', '0.2', '0.2', '0.2', '0.4', '0.1', '0.2', '0.1', '0.2', '0.2', '0.1', '0.2', '0.2', '0.3', '0.3', '0.2', '0.6', '0.4', '0.3', '0.2', '0.2', '0.2', '0.2', '1.4', '1.5', '1.5', '1.3', '1.5', '1.3', '1.6', '1.0', '1.3', '1.4', '1.0', '1.5', '1.0', '1.4', '1.3', '1.4', '1.5', '1.0', '1.5', '1.1', '1.8', '1.3', '1.5', '1.2', '1.3', '1.4', '1.4', '1.7', '1.5', '1.0', '1.1', '1.0', '1.2', '1.6', '1.5', '1.6', '1.5', '1.3', '1.3', '1.3', '1.2', '1.4', '1.2', '1.0', '1.3', '1.2', '1.3', '1.3', '1.1', '1.3', '2.5', '1.9', '2.1', '1.8', '2.2', '2.1', '1.7', '1.8', '1.8', '2.5', '2.0', '1.9', '2.1', '2.0', '2.4', '2.3', '1.8', '2.2', '2.3', '1.5', '2.3', '2.0', '2.0', '1.8', '2.1', '1.8', '1.8', '1.8', '2.1', '1.6', '1.9', '2.0', '2.2', '1.5', '1.4', '2.3', '2.4', '1.8', '1.8', '2.1', '2.4', '2.3', '1.9', '2.3', '2.5', '2.3', '1.9', '2.0', '2.3', '1.8']
+
+CLI = {'Iris-virginica': [100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149], 'Iris-setosa': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49], 'Iris-versicolor': [50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]}
+

Added: trunk/Lib/sandbox/pyem/data/iris/iris.pyc
===================================================================
(Binary files differ)


Property changes on: trunk/Lib/sandbox/pyem/data/iris/iris.pyc
___________________________________________________________________
Name: svn:mime-type
   + application/octet-stream

Added: trunk/Lib/sandbox/pyem/data/iris/src/convert.py
===================================================================
--- trunk/Lib/sandbox/pyem/data/iris/src/convert.py	2007-07-01 10:04:07 UTC (rev 3128)
+++ trunk/Lib/sandbox/pyem/data/iris/src/convert.py	2007-07-01 11:35:54 UTC (rev 3129)
@@ -0,0 +1,40 @@
+#! /usr/bin/env python
+# Last Change: Sun Jul 01 07:00 PM 2007 J
+
+# This script generates a python file from the txt data
+import time
+import csv
+
+dataname = 'iris.data'
+f = open(dataname, 'r')
+a = csv.reader(f)
+el = [i for i in a]
+# Drop the trailing empty row produced by the blank last line of the data file
+del el[-1]
+assert len(el) == 150
+
+sl = [i[0] for i in el]
+sw = [i[1] for i in el]
+pl = [i[2] for i in el]
+pw = [i[3] for i in el]
+cl = [i[4] for i in el]
+
+dcl = dict([(i, []) for i in cl])
+for i in range(len(cl)):
+    dcl[cl[i]].append(i)
+
+# Write the data to iris.py
+a = open("iris.py", "w")
+a.write('# Autogenerated by convert.py at %s\n\n' % 
+        time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.gmtime()))
+
+def dump_var(var, varname):
+    a.write(varname + " = ")
+    a.write(str(var))
+    a.write("\n\n")
+
+dump_var(sl, 'SL')
+dump_var(sw, 'SW')
+dump_var(pl, 'PL')
+dump_var(pw, 'PW')
+dump_var(dcl, 'CLI')
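
The column split and class-index grouping in convert.py can be sketched on an in-memory sample. This is an illustration under stated assumptions, not the script itself: it reads from io.StringIO instead of iris.data, filters empty rows rather than deleting the last one, and the function name parse_iris is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Three rows copied from iris.data, plus the blank line the file ends with.
SAMPLE = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
    "\n"
)


def parse_iris(text):
    """Hypothetical helper: split CSV text into columns and a class index."""
    # csv.reader yields an empty list for a blank line; skip those rows.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    # Transpose rows into the five columns (values stay as strings).
    sl, sw, pl, pw, cl = (list(col) for col in zip(*rows))
    # Map each class label to the row indices that carry it (the CLI dict).
    class_index = defaultdict(list)
    for i, label in enumerate(cl):
        class_index[label].append(i)
    return sl, sw, pl, pw, dict(class_index)


sl, sw, pl, pw, cli = parse_iris(SAMPLE)
print(sl, cli)
```

Keeping the measurements as strings matches what the generated iris.py stores; the numeric conversion is left to load().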


Property changes on: trunk/Lib/sandbox/pyem/data/iris/src/convert.py
___________________________________________________________________
Name: svn:executable
   + *

Added: trunk/Lib/sandbox/pyem/data/iris/src/iris.data
===================================================================
--- trunk/Lib/sandbox/pyem/data/iris/src/iris.data	2007-07-01 10:04:07 UTC (rev 3128)
+++ trunk/Lib/sandbox/pyem/data/iris/src/iris.data	2007-07-01 11:35:54 UTC (rev 3129)
@@ -0,0 +1,151 @@
+5.1,3.5,1.4,0.2,Iris-setosa
+4.9,3.0,1.4,0.2,Iris-setosa
+4.7,3.2,1.3,0.2,Iris-setosa
+4.6,3.1,1.5,0.2,Iris-setosa
+5.0,3.6,1.4,0.2,Iris-setosa
+5.4,3.9,1.7,0.4,Iris-setosa
+4.6,3.4,1.4,0.3,Iris-setosa
+5.0,3.4,1.5,0.2,Iris-setosa
+4.4,2.9,1.4,0.2,Iris-setosa
+4.9,3.1,1.5,0.1,Iris-setosa
+5.4,3.7,1.5,0.2,Iris-setosa
+4.8,3.4,1.6,0.2,Iris-setosa
+4.8,3.0,1.4,0.1,Iris-setosa
+4.3,3.0,1.1,0.1,Iris-setosa
+5.8,4.0,1.2,0.2,Iris-setosa
+5.7,4.4,1.5,0.4,Iris-setosa
+5.4,3.9,1.3,0.4,Iris-setosa
+5.1,3.5,1.4,0.3,Iris-setosa
+5.7,3.8,1.7,0.3,Iris-setosa
+5.1,3.8,1.5,0.3,Iris-setosa
+5.4,3.4,1.7,0.2,Iris-setosa
+5.1,3.7,1.5,0.4,Iris-setosa
+4.6,3.6,1.0,0.2,Iris-setosa
+5.1,3.3,1.7,0.5,Iris-setosa
+4.8,3.4,1.9,0.2,Iris-setosa
+5.0,3.0,1.6,0.2,Iris-setosa
+5.0,3.4,1.6,0.4,Iris-setosa
+5.2,3.5,1.5,0.2,Iris-setosa
+5.2,3.4,1.4,0.2,Iris-setosa
+4.7,3.2,1.6,0.2,Iris-setosa
+4.8,3.1,1.6,0.2,Iris-setosa
+5.4,3.4,1.5,0.4,Iris-setosa
+5.2,4.1,1.5,0.1,Iris-setosa
+5.5,4.2,1.4,0.2,Iris-setosa
+4.9,3.1,1.5,0.1,Iris-setosa
+5.0,3.2,1.2,0.2,Iris-setosa
+5.5,3.5,1.3,0.2,Iris-setosa
+4.9,3.1,1.5,0.1,Iris-setosa
+4.4,3.0,1.3,0.2,Iris-setosa
+5.1,3.4,1.5,0.2,Iris-setosa
+5.0,3.5,1.3,0.3,Iris-setosa
+4.5,2.3,1.3,0.3,Iris-setosa
+4.4,3.2,1.3,0.2,Iris-setosa
+5.0,3.5,1.6,0.6,Iris-setosa
+5.1,3.8,1.9,0.4,Iris-setosa
+4.8,3.0,1.4,0.3,Iris-setosa
+5.1,3.8,1.6,0.2,Iris-setosa
+4.6,3.2,1.4,0.2,Iris-setosa
+5.3,3.7,1.5,0.2,Iris-setosa
+5.0,3.3,1.4,0.2,Iris-setosa
+7.0,3.2,4.7,1.4,Iris-versicolor
+6.4,3.2,4.5,1.5,Iris-versicolor
+6.9,3.1,4.9,1.5,Iris-versicolor
+5.5,2.3,4.0,1.3,Iris-versicolor
+6.5,2.8,4.6,1.5,Iris-versicolor
+5.7,2.8,4.5,1.3,Iris-versicolor
+6.3,3.3,4.7,1.6,Iris-versicolor
+4.9,2.4,3.3,1.0,Iris-versicolor
+6.6,2.9,4.6,1.3,Iris-versicolor
+5.2,2.7,3.9,1.4,Iris-versicolor
+5.0,2.0,3.5,1.0,Iris-versicolor
+5.9,3.0,4.2,1.5,Iris-versicolor
+6.0,2.2,4.0,1.0,Iris-versicolor
+6.1,2.9,4.7,1.4,Iris-versicolor
+5.6,2.9,3.6,1.3,Iris-versicolor
+6.7,3.1,4.4,1.4,Iris-versicolor
+5.6,3.0,4.5,1.5,Iris-versicolor
+5.8,2.7,4.1,1.0,Iris-versicolor
+6.2,2.2,4.5,1.5,Iris-versicolor
+5.6,2.5,3.9,1.1,Iris-versicolor
+5.9,3.2,4.8,1.8,Iris-versicolor
+6.1,2.8,4.0,1.3,Iris-versicolor
+6.3,2.5,4.9,1.5,Iris-versicolor
+6.1,2.8,4.7,1.2,Iris-versicolor
+6.4,2.9,4.3,1.3,Iris-versicolor
+6.6,3.0,4.4,1.4,Iris-versicolor
+6.8,2.8,4.8,1.4,Iris-versicolor
+6.7,3.0,5.0,1.7,Iris-versicolor
+6.0,2.9,4.5,1.5,Iris-versicolor
+5.7,2.6,3.5,1.0,Iris-versicolor
+5.5,2.4,3.8,1.1,Iris-versicolor
+5.5,2.4,3.7,1.0,Iris-versicolor
+5.8,2.7,3.9,1.2,Iris-versicolor
+6.0,2.7,5.1,1.6,Iris-versicolor
+5.4,3.0,4.5,1.5,Iris-versicolor
+6.0,3.4,4.5,1.6,Iris-versicolor
+6.7,3.1,4.7,1.5,Iris-versicolor
+6.3,2.3,4.4,1.3,Iris-versicolor
+5.6,3.0,4.1,1.3,Iris-versicolor
+5.5,2.5,4.0,1.3,Iris-versicolor
+5.5,2.6,4.4,1.2,Iris-versicolor
+6.1,3.0,4.6,1.4,Iris-versicolor
+5.8,2.6,4.0,1.2,Iris-versicolor
+5.0,2.3,3.3,1.0,Iris-versicolor
+5.6,2.7,4.2,1.3,Iris-versicolor
+5.7,3.0,4.2,1.2,Iris-versicolor
+5.7,2.9,4.2,1.3,Iris-versicolor
+6.2,2.9,4.3,1.3,Iris-versicolor
+5.1,2.5,3.0,1.1,Iris-versicolor
+5.7,2.8,4.1,1.3,Iris-versicolor
+6.3,3.3,6.0,2.5,Iris-virginica
+5.8,2.7,5.1,1.9,Iris-virginica
+7.1,3.0,5.9,2.1,Iris-virginica
+6.3,2.9,5.6,1.8,Iris-virginica
+6.5,3.0,5.8,2.2,Iris-virginica
+7.6,3.0,6.6,2.1,Iris-virginica
+4.9,2.5,4.5,1.7,Iris-virginica
+7.3,2.9,6.3,1.8,Iris-virginica
+6.7,2.5,5.8,1.8,Iris-virginica
+7.2,3.6,6.1,2.5,Iris-virginica
+6.5,3.2,5.1,2.0,Iris-virginica
+6.4,2.7,5.3,1.9,Iris-virginica
+6.8,3.0,5.5,2.1,Iris-virginica
+5.7,2.5,5.0,2.0,Iris-virginica
+5.8,2.8,5.1,2.4,Iris-virginica
+6.4,3.2,5.3,2.3,Iris-virginica
+6.5,3.0,5.5,1.8,Iris-virginica
+7.7,3.8,6.7,2.2,Iris-virginica
+7.7,2.6,6.9,2.3,Iris-virginica
+6.0,2.2,5.0,1.5,Iris-virginica
+6.9,3.2,5.7,2.3,Iris-virginica
+5.6,2.8,4.9,2.0,Iris-virginica
+7.7,2.8,6.7,2.0,Iris-virginica
+6.3,2.7,4.9,1.8,Iris-virginica
+6.7,3.3,5.7,2.1,Iris-virginica
+7.2,3.2,6.0,1.8,Iris-virginica
+6.2,2.8,4.8,1.8,Iris-virginica
+6.1,3.0,4.9,1.8,Iris-virginica
+6.4,2.8,5.6,2.1,Iris-virginica
+7.2,3.0,5.8,1.6,Iris-virginica
+7.4,2.8,6.1,1.9,Iris-virginica
+7.9,3.8,6.4,2.0,Iris-virginica
+6.4,2.8,5.6,2.2,Iris-virginica
+6.3,2.8,5.1,1.5,Iris-virginica
+6.1,2.6,5.6,1.4,Iris-virginica
+7.7,3.0,6.1,2.3,Iris-virginica
+6.3,3.4,5.6,2.4,Iris-virginica
+6.4,3.1,5.5,1.8,Iris-virginica
+6.0,3.0,4.8,1.8,Iris-virginica
+6.9,3.1,5.4,2.1,Iris-virginica
+6.7,3.1,5.6,2.4,Iris-virginica
+6.9,3.1,5.1,2.3,Iris-virginica
+5.8,2.7,5.1,1.9,Iris-virginica
+6.8,3.2,5.9,2.3,Iris-virginica
+6.7,3.3,5.7,2.5,Iris-virginica
+6.7,3.0,5.2,2.3,Iris-virginica
+6.3,2.5,5.0,1.9,Iris-virginica
+6.5,3.0,5.2,2.0,Iris-virginica
+6.2,3.4,5.4,2.3,Iris-virginica
+5.9,3.0,5.1,1.8,Iris-virginica
+

Added: trunk/Lib/sandbox/pyem/data/iris/src/iris.names
===================================================================
--- trunk/Lib/sandbox/pyem/data/iris/src/iris.names	2007-07-01 10:04:07 UTC (rev 3128)
+++ trunk/Lib/sandbox/pyem/data/iris/src/iris.names	2007-07-01 11:35:54 UTC (rev 3129)
@@ -0,0 +1,62 @@
+1. Title: Iris Plants Database
+
+2. Sources:
+     (a) Creator: R.A. Fisher
+     (b) Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
+     (c) Date: July, 1988
+
+3. Past Usage:
+   - Publications: too many to mention!!!  Here are a few.
+   1. Fisher,R.A. "The use of multiple measurements in taxonomic problems"
+      Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions
+      to Mathematical Statistics" (John Wiley, NY, 1950).
+   2. Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis.
+      (Q327.D83) John Wiley & Sons.  ISBN 0-471-22361-1.  See page 218.
+   3. Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System
+      Structure and Classification Rule for Recognition in Partially Exposed
+      Environments".  IEEE Transactions on Pattern Analysis and Machine
+      Intelligence, Vol. PAMI-2, No. 1, 67-71.
+      -- Results:
+         -- very low misclassification rates (0% for the setosa class)
+   4. Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule".  IEEE 
+      Transactions on Information Theory, May 1972, 431-433.
+      -- Results:
+         -- very low misclassification rates again
+   5. See also: 1988 MLC Proceedings, 54-64.  Cheeseman et al's AUTOCLASS II
+      conceptual clustering system finds 3 classes in the data.
+
+4. Relevant Information:
+   --- This is perhaps the best known database to be found in the pattern
+       recognition literature.  Fisher's paper is a classic in the field
+       and is referenced frequently to this day.  (See Duda & Hart, for
+       example.)  The data set contains 3 classes of 50 instances each,
+       where each class refers to a type of iris plant.  One class is
+       linearly separable from the other 2; the latter are NOT linearly
+       separable from each other.
+   --- Predicted attribute: class of iris plant.
+   --- This is an exceedingly simple domain.
+
+5. Number of Instances: 150 (50 in each of three classes)
+
+6. Number of Attributes: 4 numeric, predictive attributes and the class
+
+7. Attribute Information:
+   1. sepal length in cm
+   2. sepal width in cm
+   3. petal length in cm
+   4. petal width in cm
+   5. class: 
+      -- Iris Setosa
+      -- Iris Versicolour
+      -- Iris Virginica
+
+8. Missing Attribute Values: None
+
+Summary Statistics:
+	         Min  Max   Mean    SD   Class Correlation
+   sepal length: 4.3  7.9   5.84  0.83    0.7826   
+    sepal width: 2.0  4.4   3.05  0.43   -0.4194
+   petal length: 1.0  6.9   3.76  1.76    0.9490  (high!)
+    petal width: 0.1  2.5   1.20  0.76    0.9565  (high!)
+
+9. Class Distribution: 33.3% for each of 3 classes.

Modified: trunk/Lib/sandbox/pyem/data/setup.py
===================================================================
--- trunk/Lib/sandbox/pyem/data/setup.py	2007-07-01 10:04:07 UTC (rev 3128)
+++ trunk/Lib/sandbox/pyem/data/setup.py	2007-07-01 11:35:54 UTC (rev 3129)
@@ -5,6 +5,7 @@
     config = Configuration('data',parent_package,top_path)
     config.add_subpackage('oldfaithful')
     config.add_subpackage('pendigits')
+    config.add_subpackage('iris')
     config.make_config_py() # installs __config__.py
     return config
 


