Collator Repo on GitHub

Just a few small refinements, because I was sitting here anyway.

I’ve added type hints for the returns from methods that return other than None, for example:

class Collator:
    def results(self) -> Generator[Result, None, None]:
        return (self._result_for(aged_name.name, is_new=aged_name.is_new)
                for aged_name in self.sequencer.aged_names())

    def _result_for(self, name: str, is_new: bool)-> Result:
        return Result(name=name, outcome=(self._outcome_for(name)), is_new=is_new)

class Sequencer:
    def aged_names(self) -> Generator[AgedName, None, None]:
        yield from self._yield_known_names()
        yield from self._yield_and_age_new_names()
        self.new_names = []

    def _yield_known_names(self) -> Generator[AgedName, None, None]:
        for name in self.known_names:
            yield AgedName(name=name, is_new=False)

    def _yield_and_age_new_names(self) -> Generator[AgedName, None, None]:
        for name in self.new_names:
            self.known_names.append(name)
            yield AgedName(name=name, is_new=True)

I changed the form of the generator in results above, and now that I look at it here on the page, I don’t like it as well as the explicit form.. I’ll put it back:

    def results(self) -> Generator[Result, None, None]:
        for aged_name in self.sequencer.aged_names():
            yield self._result_for(aged_name.name, is_new=aged_name.is_new)

    def _result_for(self, name: str, is_new: bool)-> Result:
        return Result(name=name, outcome=(self._outcome_for(name)), is_new=is_new)

I think we might like a shorter name for the loop variable. How about this:

    def results(self) -> Generator[Result, None, None]:
        for an in self.sequencer.aged_names():
            yield self._result_for(an.name, is_new=an.is_new)

What would actually be better, I think, and I kind of hate to say it, would be if aged_names just returned two values, the name and the is_new, like this:

    def results(self) -> Generator[Result, None, None]:
        for name, is_new in self.sequencer.aged_names():
            yield self._result_for(name, is_new)

Ah, let’s pass the aged_name to _result_for. We already know how to reference the AgedName: we’ll just push the dereferencing down a level:

    def results(self) -> Generator[Result, None, None]:
        for aged_name in self.sequencer.aged_names():
            yield self._result_for(aged_name)

    def _result_for(self, aged_name) -> Result:
        name = aged_name.name
        return Result(name, self._outcome_for(name), aged_name.is_new)

Two tests break. I don’t want to lose them, so belay that change.

    def results(self) -> Generator[Result, None, None]:
        return (self._result_for(aged_name.name, aged_name.is_new)
                for aged_name in self.sequencer.aged_names())

    def _result_for(self, name: str, is_new: bool)-> Result:
        return Result(name, self._outcome_for(name), is_new)

I took out the parameter names, which makes it all a bit less messy and easier to read, I think.

I think we’re done here for now.

Summary

I’ve created a repo on GitHub in case anyone wants to browse.

I think we have these objects pretty close to book-publication quality. I am not sure whether I’d suggest going this far with production code. I kind of wish all my code looked this good, but I’m not really sure that it’s worth the time. The point of this kind of thing is to reduce large future work due to hard-to-understand code, by investing smaller amounts of work to make the code easier to understand. It’s not clear where to draw the line on that.

I’d probably advise erring on the side of too much polishing, if I gave advice, which I do not. Since we are usually rushed in production work, a force holding us back from rushing too much would probably be valuable, and we’d probably still rush too much. Anyway, I don’t give advice, I just do things and show you what happens.

See you next time! Probably back on the Forth wagon.