PA2: Array Set
Introduction
The objective of this assignment is to implement a contiguous implementation of a Set ADT that includes most of the functionality of Python's built-in set type. The functionality of our Set class will be almost the same as the functionality of Python's set class, with some exceptions described below.
It is important that you understand the material in Section 5.3 of our textbook as you work on this project. That section shows how to implement a variable-sized collection type using fixed-sized arrays. The specific example described in the book is the Python list class, but many of the issues will be the same for a set.
Your set class will store elements in an unordered dynamic array. This is not a
very efficient approach, because determining set membership requires a
sequential search through the array. As a result, some set operations may take
quadratic time. For example: set1.isdisjoint(set2)
will require
O(n*m) steps in the worst case: we need to search through all
m elements in set2
for each of the n items in
set1
. We will consider more efficient implementations of the Set
ADT later in the semester.
Background: Array Class
You must use the provided Array
class defined in the module t_array.py as the underlying data structure for your
Set class. This is similar to the Array
class described in our
textbook. However, this Array
class has visualization code that
will allow us to inspect the contents of all of the arrays that are in existence
at any one time. You should use this class as the basis of a dynamic array
implementation for your Set
internal storage.
Here is the pydoc documentation for the Array
class:
class Array(__builtin__.object) | Array class that supports turtle graphics visualization. | | There are two public, class-level variables that can be used | to control the visualization: | | Array.visible - Set this to False to turn off graphics. | Array.speed - Set this value to change the speed of the animation. | This value indicates the length of the pauses (in seconds) | between updates. | | Methods defined here: | | __getitem__(self, index) | x.__getitem__(y) <==> x[y] | Returns the contents of the indicated array entry. | | __init__(self, size) | Creates an array with a given capacity. | | __iter__(self) | Arrays do not support iteration. | | __len__(self) | x.__len__() <==> len(x) | Returns the size of the array. | | __setitem__(self, index, value) | x.__setitem__(i, y) <==> x[i]=y | Stores the value at the indicated index. | | free(self) | Frees this array. | | The array will no longer appear in the visualization, and may | be garbage collected as long as no external references exist. | | ----------------------------------------------------------------------
You will probably want to turn off the graphics for some tests, as they can slow down program execution considerably.
You should not modify t_array.py
.
The Set ADT
You should implement a Python class called Set
in a file called
array_set.py
. We are not providing boilerplate code for your Set
class; you will need to write the array_set.py
file from scratch.
The description below is copied almost verbatim from
the
official Python set documentation. We have removed some details
that are specific to the Python implementation of the set class. In
addition, a few methods and operators have been left out for the sake
of simplicity: difference_update
, -=
etc.
You are welcome to complete the remaining methods if you wish, but
they will not affect your grade.
A Set object is an unordered collection of distinct objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.
Like other collections, sets support x in set, len(set), and for x in set. Being an unordered collection, sets do not record element position or order of insertion. Accordingly, sets do not support indexing, slicing, or other sequence-like behavior.
- class Set([iterable])
Return a new Set object whose elements are taken from optional parameter iterable. If iterable is not specified, a new empty set is returned.
Instances of
Set
provide the following operations:
- add(elem)
Add element elem to the set.
- remove(elem)
Remove element elem from the set. Raises
KeyError
if elem is not contained in the set.
- discard(elem)
Remove element elem from the set if it is present.
- pop()
Remove and return an arbitrary element from the set. Raises
KeyError
if the set is empty.
- clear()
Remove all elements from the set.
- x in s
Test x for membership in s.
- isdisjoint(other)
Return True if the set has no elements in common with other. Sets are disjoint if and only if their intersection is the empty set. Raises a TypeError if other is not a Set instance.
- issubset(other)
- set <= other
Test whether every element in the set is in other. Raises a TypeError if other is not a Set instance.
- set < other
Test whether the set is a proper subset of other, that is, set <= other and set != other. Raises a TypeError if other is not a Set instance.
- issuperset(other)
- set >= other
Test whether every element in other is in the set. Raises a TypeError if other is not a Set instance.
- set > other
Test whether the set is a proper superset of other, that is, set >= other and set != other. Raises a TypeError if other is not a Set instance.
- union(other)
- set | other | ...
Return a new set with elements from the set and all others. Raises a TypeError if other is not a Set instance.
- intersection(other)
- set & other & ...
Return a new set with elements common to the set and all others. Raises a TypeError if other is not a Set instance.
- difference(other)
- set - other - ...
Return a new set with elements in the set that are not in the others. Raises a TypeError if other is not a Set instance.
- symmetric_difference(other)
- set ^ other
Return a new set with elements in either the set or other but not both. Raises a TypeError if other is not a Set instance.
- copy()
Return a new set with a shallow copy of s.
Set
supports set to set comparisons. Two sets are equal if and
only if every element of each set is contained in the other (each is a subset of
the other). A set is less than another set if and only if the first set is a
proper subset of the second set (is a subset, but is not equal). A set is
greater than another set if and only if the first set is a proper superset of
the second set (is a superset, but is not equal).
The subset and equality comparisons do not generalize to a complete ordering
function. For example, any two disjoint sets are not equal and are not subsets
of each other, so all of the following return False: a<b, a==b, or a>b. Accordingly, sets do not implement the
__cmp__
method.
- __len__()
Return the cardinality of the set.
- __eq__(other)
Returns true if the set is equal to another set. Two sets are equal if and only if every element of each set is contained in the other (each is a subset of the other). If other is not a
Set
instance, this should return False.
-
__str__()
__repr__() Return a string representation of the set. The format should be as follows:
'set([ELEMENT_1, ELEMENT_2,..., ELEMENT_N])'
Where the string representations of each element should be obtained using therepr()
method.
- __iter__()
Return an iterator for the set.
Note that both the operator and non-operator versions
of union()
, intersection()
, difference()
,
and symmetric_difference()
, issubset()
and issuperset()
require their arguments to be Set
objects. A TypeError
should be raised if any of those
methods receive an argument of another type.
Hint: There are many opportunities for code reuse in this assignment; take advantage of them!
Testing
Part of your grade for this assignment will be based on developing unit
tests for your code. You may use the file set_tests.py as a starting point. That file includes
tests for all Set methods except discard
, remove
and
pop
. 15% of your grade for this assignment will be based on
writing adequate tests for those three methods.
As is the case for all programming assignments in this class, part of your project grade may be based on tests that are not posted publicly. The public test cases that are run when you submit your code are provided as a courtesy to help you verify that your submission is free of syntax errors; they should not be regarded as comprehensive. You should thoroughly test your own code and code stricly to the project specification. If you have any questions about whether a particular behavior is admissible by the spec, post a question on Piazza.
Memory Management (Optional)
The Python runtime provides automatic garbage collection, cleaning up unused
data structures when they are no longer referenced. However, the visualization
portion of the Array
class in t_array.py
keeps its own
list of all Array
objects in existence, thus preventing Python from
cleaning up as it goes. This can lead to the visualization becoming cluttered
after a while with unused Array
objects.
If you would like to clean up the visualization for easier debugging, you
will need to help the Array
class to manage its memory by doing two
things:
- Clean up the old arrays when you resize your internal storage for the
Set
class. When you are done with the oldArray
, you can tell theArray
class you are done with it by calling thefree()
method on the object you no longer want to track in the visualization. - Clean up the internal array when a
Set
object itself is cleaned up by the Python runtime. In order to do this, create a__del__()
method in theSet
class. Inside that method, callfree()
on your internal array.
Submission
DEADLINE: Wednesday, October 8, 2014 at 23:59 (11:59PM) ET
You should submit a zip file containing two .py files through the Web-CAT online submission system before the
deadline: a module named array_set.py
containing your completed Set
class, and an updated version of the unit testing module
(set_tests.py
). As always, your code should conform to the CS240 Python
Coding Conventions. Make sure that the array_set.py
file
contains your name and the JMU Honor Code statement. You do not need to provide
docstrings for the methods in the unit testing class, and you do not need to
submit t_array.py
.
Refer to the course syllabus for the policy on late submissions. These penalties will be applied manually while grading. Web-CAT does not apply them correctly; therefore, that behavior has been disabled.
Acknowledgments
The description of the Set class is largely taken from the official Python documentation. © 2001-2010 Python Software Foundation; All Rights Reserved. Here is a copy of the PSF License: psf_license.txt. The documentation has been modified for clarity and consistency with this project.