(Apr 25, 2024)
I'm not sure I'm done with this XST experiment, but I've come to some conclusions that I want to set down. Having done that, maybe I'll know what to do next.
(Mar 26, 2024)
Mistakes? Technical Debt? Better ideas? All the above?
(Mar 25, 2024)
OK with that out of my system, what shall we code today? It goes so nicely, until oops.
(Mar 25, 2024)
I'm not out of ideas, but I am out of directions, and my momentum is low. What shall I do?
(Mar 23, 2024)
The swirl that is my life leads me to think about duplicates and such. Very speculative musings, no conclusion. Sort of Captain's Log Stardate 77691.1.
(Mar 21, 2024)
The statistics code is much better now. But a little widget might be just the thing. Turns into a medium-sized widget, but I think it's an improvement.
(Mar 21, 2024)
The statistics code runs, but is written in 1960's style—yes, I DO know what 1960's code looks like—and needs improvement. An interesting series of events with a fine result.
(Mar 20, 2024)
Commonly, when we group data by some criteria, we desire summary information about the groups: counts, totals, averages, and so on. We take a couple of nice steps before my brain is fried.
(Mar 19, 2024)
Yesterday's experiment with `group_by` seemed very successful. What does it tell us about what we're doing and what we should be doing? Bit of a retrospective.
(Mar 18, 2024)
Let's try an experiment, along the lines of my sketch at the end of this morning's article. It goes very well. Very well indeed!
(Mar 18, 2024)
I'm here for another step on the way to the grouping operator. I have at least one new realization going in. More than one coming out. Some code progress as well.
(Mar 17, 2024)
Time to push a bit further on the grouping operations. I don't get far before realizing that I need to break away and try again later. Not great, but good.
(Mar 16, 2024)
A bit of research into the past of Extended Set Theory suggests an interesting possibility for the future, and provides the joy of working something out on my own. Includes some possibly amusing history about how primitive man created such documents.
(Mar 15, 2024)
I'm continuing directly on from the preceding article, to create my XGroup implementation. It may turn out that a break was needed. We'll see.
(Mar 15, 2024)
This morning I'm working toward a join operator, by way of a grouping operator. I have vague ideas, and need some help from my code.
(Mar 14, 2024)
Fancy as my new streaming select is, I think it's, well, not quite the thing. Easily replaced, but do I need to pivot? I surprise myself.
(Mar 13, 2024)
OK, new rule, we're going to read XFlatFile sets into memory. Should we refactor to this new result, or TDD a new class? Let's refactor. Works quite well!
(Mar 13, 2024)
It's 0415 and I am thinking about buffering. I create an XSet with a billion records in it. Python does not sneeze.
(Mar 12, 2024)
The thing about learning is that when we have sucked most of the learning juice out of some fruitful idea, there is often work to be done to get the thing finished. We have at least two of those hanging fruits right now. One of them has an interesting effect!
(Mar 12, 2024)
The Powers That Be have invited me to discuss the risks around this Python XST effort, on the assumption that it is anything other than play.
(Mar 11, 2024)
There's no doubt that we can create sets or set operations that stream their results, avoiding creating large temporary sets in memory. There are issues to think about. First experiment works!
(Mar 10, 2024)
In which, your intrepid author considers times now and times then, assessing whether and how to adjust what we do with XST given the new reality.
(Mar 9, 2024)
A check of our list of ideas leads to discoveries, to key decisions, and to reflection on the realities of software development. Quite a span, as perhaps it always should be. Interesting article reference included.
(Mar 7, 2024)
Ran across this. Rather nice. Then I took one more step. Off a cliff. Fortunately, I didn't look down.
(Mar 7, 2024)
We've been working on calculated fields. Let's see about getting them built into sets. Yucch! I touched a debugger.
(Mar 7, 2024)
Our work on calculations raises some longer-term concerns about large datasets and statistics. Are we in big trouble?
(Mar 6, 2024)
Let's fetch values from XST records for our Expressions. I expect this to go smoothly. It does. We discuss small steps. No, smaller than that.
(Mar 5, 2024)
I have a bit of time, with distractions. Let's see about error conditions in our expressions.
(Mar 5, 2024)
We need to work on assignments and field values. I was thinking values, and then we do assignments instead. That's where the path looked best.
(Mar 4, 2024)
We'll take some steps along the expression path. I'm slightly questioning part of what we have. The path is zig-zag but leads to a good place.
(Mar 4, 2024)
Spikes for the expression parsing have taught us enough. Let's see how we can best move from the spike code into decent production objects. There's one somewhat large issue: symbols. No, two: data types.
(Mar 3, 2024)
Back to the interpreter. Shall we try the lambda thing again, or something else? Lambda seems fraught. Partial function FTW.
(Mar 3, 2024)
We can parse a simple expression. Let's sketch the expression interpreter. Bear gives us a nip, but we have progress.
(Mar 1, 2024)
I think I'll break down and do expressions. I'm thinking a simple Dijkstra-style parser will do the job for us. Let's see if we can find a simple way to develop one.
(Mar 1, 2024)
Suppose we want a set where one element is the sum of two others. What about calculating a scope? I am interrupted and then spot a squirrel.
(Mar 1, 2024)
Somewhat enlightened about how early XSP systems worked, there are a few ways we might go. I need to reflect on possibilities and pick a direction.
(Feb 29, 2024)
Study of some ancient scrolls leaves me thoughtful, and a bit disappointed.
(Feb 27, 2024)
You probably do not want to read this. I'm writing down my thoughts, in the vain hope of figuring out what they are and what they should be. Dave Childs was kind enough to send me some links to articles and other materials about XST. I include them at the end of this article, for reasons. The materials gave me some things to think about.
(Feb 26, 2024)
I was whining earlier about the difficulty of writing tests that assert about elements inside a result set. A thought has come to me.
(Feb 26, 2024)
I want to work on some indexing operations. Along the way I rediscover XFlatFile's existing scope set feature. Just what we need. Or is it? Includes a judicious rollback.
(Feb 25, 2024)
Laurent challenged me to find a better, more elegant formulation for getting the correct re_scoping set from a provided renaming set. I thought I had it. Then I thought I hadn't. Then I thought I had.
(Feb 24, 2024)
We have a long-form cardinality method. Let's use the `len` function and require it as part of the implementations.
(Feb 24, 2024)
Let's look around and see what we might do. I'll even make a list. Jira is an abomination upon the land.
(Feb 23, 2024)
I have received a bug report! This is great news! Someone is paying attention!
(Feb 23, 2024)
Putting a specialized rename into our flat set showed us that the symbol table there is quite ad hoc. Let's make it more like a set. (Turns out: no.)
(Feb 22, 2024)
Since the Flat sets know their field names once and for all, we could use a better rename that doesn't copy the data. Simple enough, rather nice.
(Feb 22, 2024)
If there was a kind of set that was expressed as a function, we could possibly pipeline operations, reducing memory impact. Is that possible, and is it a good idea? So far, maybe not.
(Feb 21, 2024)
What if a function IS a set instead of returning a set? This might be significant.
(Feb 21, 2024)
We try and succeed in implementing an XFlatFile that refers only to a subset of the file. This is a big deal!
(Feb 21, 2024)
I'm working on the idea of an XFlatFile that only reads a defined subset of the file. Am I working without a net? Not really.
(Feb 20, 2024)
Plus wasn't the right operator for union. Let's update that and then a few other operators.
(Feb 20, 2024)
I had thought that I'd work on the flat data. My thoughts led me astray, to a glimmering of a possibly good idea. So I'll just code something cute to finish the morning's work.
(Feb 19, 2024)
Thinking about rename in the Flat implementation leads to discovery of an interesting defect, and some thinking about use of the powerful generality of set theory.
(Feb 19, 2024)
In an astounding flurry, we are going to build a rename operation.
(Feb 18, 2024)
Childs has defined two "re-scope" operations. Let's see if we can implement one of them, and if we're glad we did.
(Feb 18, 2024)
We'll start with a simple removal of the requirement for implementations to implement `__contains__`. After that, we'll see. (And that's not what happens.)
(Feb 17, 2024)
In your absence, I made a simple but significant change. I had an interesting idea. And an important observation. And more.
(Feb 16, 2024)
We make some progress on our flat files, but progress is slow and there are a lot of words here. Best skim or skip?
(Feb 15, 2024)
Let's see about getting set creation sorted. We need to be better able to create sets with any possible base implementation. Deep confusion ensues.
(Feb 15, 2024)
To process flat files, we want to avoid leaving a file open, and we don't want to open it a zillion times. Do we have to invent buffering? Perhaps not.
(Feb 14, 2024)
Our experimental tests on flat records look good. Let me report on my off-line work, and then let's see what's next.
(Feb 13, 2024)
Let's get started on a flat-file-focused form of set. We make a bit of progress.
(Feb 13, 2024)
I want to begin by thinking about storage: how do we best produce specific memory / file formats? The article turns out to be pure speculation, but perhaps useful speculation.
(Feb 12, 2024)
Having prepared better than we did yesterday morning, let's proceed with wrapping our `frozenset` with a class of our own.
(Feb 11, 2024)
Let's have another try at wrapping our `frozenset` in an object that will work for us. Things go much better ... so far.
(Feb 11, 2024)
I make some useful initial observations, and then, well, I crash and burn. And mix a metaphor.
(Feb 10, 2024)
I think that this morning, I'll take a small step toward having more than one implementation of a set's data.
(Feb 9, 2024)
If I'm going to get serious about this Extended Set Theory thing, a tiny bit of design thinking seems to be in order.
(Feb 9, 2024)
It is 0253 hours. I have a report and another report.
(Feb 8, 2024)
Let's do `project`, as in projection, as preparation for trying a generator approach to set expressions. This part should be easy.
(Feb 8, 2024)
There's a thing I want to do with this XSet stuff, and I don't quite know how to do it. I need to better understand iterators and generators.
(Feb 7, 2024)
In which, we report on a nifty little thing, and then, well, I don't know yet what I'll do. Some concluding remarks on what it is that I do here.
(Feb 6, 2024)
In which, I show you what I'm dealing with, and mention an insight.
(Feb 5, 2024)
Let's see how we can select records from an XSet, using the `restrict` operator.
(Feb 5, 2024)
I propose to push a bit further on the use of Python `frozenset` to do a little Extended Set Theory. I mention symmetric difference!
(Feb 4, 2024)
Some random reading, and the messing about that I've done in the FAFO series, lead me to want to explore Extended Set Theory in Python. I do not expect it to be useful but it might be interesting.
(Feb 24, 2022)
Yesterday's search for strawberries in the XST patch discovered problems. I want to at least double check what we did. TL;DR: It all works. Odd morning.
(Jan 28, 2022)
I'm pressing forward with the Lispy thing, but I have some concerns. I find it difficult to think about but I think I have an angle. (Spoiler: It's a wrap!)
(Jan 11, 2022)
Just a bit more playing with the Lispy Calculator to while away a few minutes in the afternoon.
(Jan 11, 2022)
Thoughts and observations. Stuff and nonsense.
(Jan 10, 2022)
I'm going to push forward with this LISP / Scheme dialect. I'll begin by explaining why, and why not.
(Jan 9, 2022)
It's still the weekend, so I'm going to follow Peter Norvig's Python LISP Implementation a while and see where it takes me.
(Jan 8, 2022)
It's my weekend and I'll try if I want to. You could try too if it happened to you. Spoiler: This takes a very weird turn. Final line: Your move, Bill!
(Jan 7, 2022)
I guess there's nothing for it but to figure out how to rewrite, i.e. refactor, a set operation based on the existence of helper structures. But how? I have ideas but are any of them any good?
(Jan 6, 2022)
Bill Wake is trying to get me to think in terms of trees. I don't want to, but he does have some good ideas. Thanks, Bill!
(Jan 5, 2022)
Today I plan to experiment with creating some form of expressions that might be optimized. I expect to stumble a lot. Come along, point, and laugh.
(Jan 4, 2022)
No, not smiles and frowns. Algebra. At this moment I don't think much code will be done today. Feel cute, might delete later.
(Jan 3, 2022)
It's 6 AM and I have an idea. This could be very good or very bad.
(Jan 2, 2022)
I don't love the interface for adding functions into an XSet. And I want to add them 'one level in'. Will hilarity ensue? Probably not, but something will happen.
(Jan 1, 2022)
Some HNY thoughts, and more on the function as an element idea. Joy, philosophy, code. What's not to like?
(Dec 31, 2021)
I was thinking before I got up about median and mode. Then I had a truly marvelous idea.
(Dec 30, 2021)
I want Lua tables to be more useful as XSets. There's a hard way. But the current design also offers an easy way. (The answer will surprise you. It surprised me.)
(Dec 29, 2021)
I'm on a path to make ordinary tables behave like XSets. But first, I have to figure out how this thing actually works! Much musing, then some code.
(Dec 28, 2021)
I have in mind small things for today, starting with an interesting and confusing mistake left over from yesterday.
(Dec 27, 2021)
No, I'm not hearing voices. But the code does tell us things, just like any working material. We need to learn to listen. Today, we listen and the result is good.
(Dec 26, 2021)
A look at the code. Maybe a bit more on stats. P.S. I learn something and ditch almost all the code I wrote this morning.
(Dec 25, 2021)
It's Christmas, I'm waiting for the household to wake up, and I enjoy what I'm doing. Perfect holiday so far!
(Dec 24, 2021)
Today I plan to get grouping and summing working. Who knows, it might happen. If not, there's always tomorrow or my birthday.
(Dec 23, 2021)
Let's think about what the current drafts of summing and grouping tell us about our system. Then code (anyway).
(Dec 22, 2021)
Today, rather than make any deep progress, I plan to work on something I consider interesting, sums, averages, and grouping. I promise to publish this even if it explodes. (It doesn't, quite.)
(Dec 21, 2021)
Transformations, optimizations, and the relationship between OO and XST. Got some thinking to do. You get to watch, if you're tough enough. At least a tiny bit of code.
(Dec 20, 2021)
Step by step, inch by inch, slowly we turn long searches into more direct accesses.
(Dec 18, 2021)
I don't want to get stuck in a never-ending series of new setops: there won't be much learning there. Where's the beef?
(Dec 17, 2021)
I'm sure a lot of you have been saying 'Yes, but what about tuples?', or 'Why XST anyway?'. Today, we address those fascinating concerns.
(Dec 16, 2021)
Let's work on those new operations a bit.
(Dec 16, 2021)
Short morning today. I have a tentative plan for indexes. I'll scribble some sets. Might code.
(Dec 15, 2021)
Last night's Zoom Ensemble netted me a few ideas. I'll start exploring those today. Hilarity or perhaps something good will ensue. I can't wait to find out.
(Dec 14, 2021)
Further reading leads me to think about design, and design motivation. Castles in the air. Or underground. Good stuff happens.
(Dec 13, 2021)
In conversation with Bill Wake and with the Internet, I have an idea for something to try. And I'm just about ready to assess where we are and where we should go.
(Dec 12, 2021)
I'm not sure whether this will be useful, but Bill Wake gets the credit if it is.
(Dec 11, 2021)
Bill Wake was trying to hammer an idea into my head. I must think about that. And I have a small idea of my own.
(Dec 10, 2021)
Time to work on the actual restrict operator for CSV, since the pattern-maker experiment was a success.
(Dec 9, 2021)
There's no way around it, I've got to work on the fast restrict today. Might not finish. We'll see.
(Dec 8, 2021)
Getting started with CSV data. And reporting a conversation.
(Dec 7, 2021)
Reflection leads me to focus a bit more on set operations, and less on internal methods. Does this call for a new layer? Also: real technical debt! Updated with idea from Carl Manaster!
(Dec 6, 2021)
I'm going to try to create pipelines using coroutines. I think they may make for a more expressive interface. I turn out to be partially right.
(Dec 6, 2021)
How do you design a thing like this, Brain? Same as everything else? Or not?
(Dec 5, 2021)
I was puzzling over an issue with 'union' and gained an insight that either I've never had, or that I had lost. Whee!
(Dec 3, 2021)
I found the easy way to build an iterator in Lua, so we'll do that and see whether it improves the code as much as I think it will.
(Dec 2, 2021)
I think I'm going to start on restrict today. There are some issues around atoms.
(Dec 1, 2021)
Last night I understood how to do something with XST that I've not in the past been able to do. So let's talk about why XST is interesting and what one might do with it.
(Nov 30, 2021)
I have a random idea about the data structure for sets, so thought I'd give it a try.
(Nov 30, 2021)
Save me, I'm thinking about Extended Set Theory again.