Group. Set Thinking.
Time to push a bit further on the grouping operations. I don’t get far before realizing that I need to break away and try again later. Not great, but good.
Happy St. Patrick’s Day to those who celebrate, and happy whatever day you perceive it to be to those who do not.
Although diverted by a squirrel(!) yesterday, I remain aware that the grouping operation is only partly done and that we “should” work toward finishing it up.
Above, you see the most distracting object in human history, the Shiny Chrome Squirrel of Unending Distraction. There may be many like it, but this one is mine. Look at it. Try to forget it. You’re still thinking about it, aren’t you?
Readers, to the extent that there are any, may remember that our XGroup XImplementation subclass contains a dictionary whose keys are an XSet of specific values from the grouping scopes, and whose values are a list (not an XSet) of the XSet “records” whose key values are those in the dictionary key.
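To make that shape concrete, here is a minimal sketch that uses plain Python stand-ins rather than the real XSet and XGroup classes. A frozenset of (element, scope) pairs plays the part of an XSet, and the helpers `make_record` and `re_scope_keys` are hypothetical names invented for this illustration, as is the sales record's pay.

```python
from collections import defaultdict

def make_record(**fields):
    # Stand-in for an XSet record: a hashable set of (element, scope) pairs.
    return frozenset((value, scope) for scope, value in fields.items())

def re_scope_keys(record, scopes):
    # Hypothetical helper: keep only the pairs whose scope is a grouping scope.
    return frozenset((value, scope) for value, scope in record if scope in scopes)

records = [
    make_record(department="it", job="serf", pay=1000),
    make_record(department="it", job="serf", pay=1100),
    make_record(department="sales", job="closer", pay=1200),  # invented pay
]

# Keys are hashable "sets" of key values; values are plain lists of records.
group_dictionary = defaultdict(list)
for record in records:
    keys = re_scope_keys(record, {"department", "job"})
    group_dictionary[keys].append(record)
```

The dictionary key has to be hashable, which is why the stand-in uses frozenset; the list on the value side has no such restriction.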
That’s the state of things so far: we haven’t finished the XGroup yet. Here’s our current final test:
def test_build_x_group(self):
    peeps = self.build_peeps()
    scopes = SetBuilder()\
        .put("department", "department")\
        .put("job", "job")\
        .set()
    group_dictionary = defaultdict(list)
    for person, scope in peeps:
        keys = person.re_scope(scopes)
        group_dictionary[keys].append((person, scope))
    x_group = XGroup(group_dictionary)
    group_set = XSet(x_group)
    xset_it = iter(group_set)
    x1, s = next(xset_it)
    assert x1['department'] == 'it'
    assert x1['job'] == 'serf'
    x2, s = next(xset_it)
    assert x2['department'] == 'it'
    assert x2['job'] == 'sdet'
    x3, s = next(xset_it)
    assert x3['department'] == 'sales' and x3['job'] == 'closer'
We need to choose the form of the destructor, er, um, I mean we need to choose the form of the XSet that the XGroup will provide. What we have above may or may not be part of what we choose.
The purpose of the grouping operation is to produce a set with a set of field values for every combination of values in the records of the source set. Putting that in something closer to set theoretic parlance … hold my chai …
Let R be a set { r_i^i: ? } for arbitrary r and i. Think records.
Let S be a set { s^s: ? } for arbitrary s. Think scopes or field names.
Group(R, S) is a set of pairs, each with an element with scope ‘keys’ and an element with scope ‘values’.
Group(R, S) = { { K^keys, R_K^values } } such that
for every k in K, there is at least one record r in R such that r.re_scope(S) == k. Think “r has the same key field values as the keys in k”.
And R_K is the set of all records r_k such that r_k.re_scope(S) == k. And for every record r_i in R such that r_i.re_scope(S) == k, r_i is in R_K. (I mean for the records r_k to have their original scopes. I have not specified that.)
I will not try to write that out in Latex for you. I may try to write it out with a pen and if I do, I’ll include a picture.
We want our group set to produce a set of pairs, with keys ‘keys’ and ‘values’, with one pair for each unique combination of key values in R in the ‘keys’ set, and all the records with those particular key values in the ‘values’ set.
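Since the words are hard to follow, a sketch of the intended result may help. This is not the XGroup implementation: it again assumes a frozenset of (element, scope) pairs as a stand-in for an XSet, the function name `group` is hypothetical, and the records lose their original scopes inside the ‘values’ element, which the real set would presumably keep.

```python
from collections import defaultdict

def group(records, scopes):
    # Bucket each record under the subset of its pairs that carry a grouping scope.
    buckets = defaultdict(set)
    for record in records:
        keys = frozenset((value, scope) for value, scope in record if scope in scopes)
        buckets[keys].add(record)
    # Emit one pair per distinct key combination: one element scoped 'keys',
    # one element scoped 'values'.
    return frozenset(
        frozenset({(keys, "keys"), (frozenset(members), "values")})
        for keys, members in buckets.items()
    )

# Sample data loosely modeled on the test records; the third record's pay is invented.
records = [
    frozenset({("it", "department"), ("serf", "job"), (1000, "pay")}),
    frozenset({("it", "department"), ("serf", "job"), (1100, "pay")}),
    frozenset({("sales", "department"), ("closer", "job"), (1200, "pay")}),
]
grouped = group(records, {"department", "job"})

# How many records landed in each group's 'values' element.
sizes = sorted(
    len(element) for pair in grouped for element, scope in pair if scope == "values"
)
```

Every pair holds exactly two elements, and every record in a pair's ‘values’ element contains that pair's ‘keys’ as a subset, which is the property the definition is reaching for.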
This could be a very good idea, or a very bad one. I think we’ll find it easy enough to create and useful for processing, but clearly it is not easy to specify in words.
We’ll modify the test above, and create new ones, leading to the structure we want. This is enough to fail:
def test_build_x_group(self):
    peeps = self.build_peeps()
    scopes = SetBuilder()\
        .put("department", "department")\
        .put("job", "job")\
        .set()
    group_dictionary = defaultdict(list)
    for person, scope in peeps:
        keys = person.re_scope(scopes)
        group_dictionary[keys].append((person, scope))
    x_group = XGroup(group_dictionary)
    group_set = XSet(x_group)  # contains pairs of XSets with scopes 'keys' and 'values'
    xset_it = iter(group_set)
    rec, s = next(xset_it)  # pair with scopes 'keys' and 'values'
    keys = rec['keys']
    assert keys['department'] == 'it'
    assert keys['job'] == 'serf'
    # x2, s = next(xset_it)
    # assert x2['department'] == 'it'
    # assert x2['job'] == 'sdet'
    # x3, s = next(xset_it)
    # assert x3['department'] == 'sales' and x3['job'] == 'closer'
I expect a None on the assignment to keys.
        keys = rec['keys']
>       assert keys['department'] == 'it'
E       TypeError: 'NoneType' object is not subscriptable
We need to change the __iter__ in XGroup, from this:
def __iter__(self):
    for group_keys, records in self._dict.items():
        print("iter", group_keys)
        yield group_keys, XSet.null
This now needs to yield a two-element set for each entry in the dictionary, one element with scope ‘keys’ holding the group_keys and one with scope ‘values’ holding the records. Like this:
def __iter__(self):
    for group_keys, records in self._dict.items():
        result = XSet.from_tuples(((group_keys, 'keys'), (tuple(records), 'values')))
        yield result, XSet.null
The code there goes beyond the test, since I have included the records, converted to a tuple because they have to be hashable. The test passes. Let’s commit, perhaps prematurely, but the XGroup is still under development and is not used elsewhere yet. Commit: progressing XGroup.
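The hashability point is standard Python behavior and easy to check in isolation. The sketch below does not touch XSet at all; it assumes only that set elements get hashed, as they do in a built-in frozenset, and the record contents are placeholders.

```python
# Placeholder records; only their container type matters here.
records = [("alex", 1000), ("someone-else", 1100)]

# A list cannot be hashed, so a pair containing one cannot go into a set.
try:
    frozenset({(records, "values")})
except TypeError as error:
    message = str(error)

# Converting the list to a tuple makes the whole element hashable.
element = (tuple(records), "values")
group_pair = frozenset({element})
```

The conversion works only because the records themselves are hashable; a tuple containing a list would fail the same way.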
I add these two lines to the test and it continues to pass:
recs = rec['values']
assert len(recs) == 2
The two records that are in there are these:
e1 = SetBuilder() \
    .put("it", "department") \
    .put("serf", "job") \
    .put(1000, "pay") \
    .set()
e2 = SetBuilder() \
    .put("it", "department") \
    .put("serf", "job") \
    .put(1100, "pay") \
    .set()
Let’s make the sets provided include the employee name so that we can check for it. With that done (you’ll see the tests below), we can extend the test a bit.
I have found a defect! Check the last few lines of my test:
    rec, s = next(xset_it)  # pair with scopes 'keys' and 'values'
    keys = rec['keys']
    assert keys['department'] == 'it'
    assert keys['job'] == 'serf'
    recs = rec['values']
    for ee, s in recs:
        if s == 1:
            break
    assert ee['name'] == 'alex'
    r_alex = recs[1]
    assert r_alex == ee
When I search for the set element in recs with scope == 1, I get the record I expect. When I use the indexing form recs[1], I do not get the same record. In fact, what I get is the second record (the one whose index is 1, since Python indexes start at zero).
Somehow, although indexing by a string works, indexing by a number does not.
The issue is that recs is not an XSet, it is a tuple.
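The mismatch can be reproduced without any of the set classes. In this sketch the function `element_at_scope` is a hypothetical stand-in for scope-based lookup, the record values are placeholder strings, and the scopes are assumed to start at 1, as they appear to in the test.

```python
def element_at_scope(pairs, wanted_scope):
    # Hypothetical stand-in for scope lookup: find the element stored under a scope.
    for element, scope in pairs:
        if scope == wanted_scope:
            return element
    return None

# A tuple of (element, scope) pairs, with scopes numbered from 1.
recs = (("alex-record", 1), ("second-record", 2))

by_scope = element_at_scope(recs, 1)  # searches for the pair whose scope is 1
by_position = recs[1]                 # plain tuple indexing, counted from zero
```

Here `by_scope` is the first record and `by_position` is the whole second pair, so the two can never agree while the scopes and the positions are numbered from different starting points.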
Stop Stop Stop
At this point, I realize that the code in the test, working toward the set I have in mind, is not as far along as I thought it was. I try a couple of things and realize that I need to take a step or two back.
I decide to roll back to the commit above, and stop for the morning, or perhaps even for the day. The code isn’t ready, and neither am I. The wise move is to back away for a bit and then go at it again. For once, I do the wise thing.
Summary
Is this a bad outcome, or a good one? I’d certainly have preferred a smooth set of small steps resulting in the XGroup object I’m aiming for, so it’s not a great outcome. But I think it’s a good one, because after just a very small amount of trying to adjust the current code to produce an answer, I realized that I am not clear either on the way to build the set or on the way to iterate it if I had it.
I think this means I need some smaller tests to drive smaller steps. At least one of those tests will probably want to deal with a hand-crafted XGroup set and a test for iterating it. We’ll see.
When we step away before doing damage, it’s not a great result, but it is a good one.
See you soon!