Today, let's move the auxiliary files.

As I put up the Configurator article this morning, I noticed that Transmit sends up my feed.xml file, which I had forgotten about in thinking about the requirements for our iPad project. Since I have to remember it anyway, I’m thinking we’ll work on moving the auxiliary files this morning.

When Jekyll builds my site, it builds various auxiliary files. In particular, I can think of:

the home page index
category indexes
the RSS feed file
[commit and push the site to Github?]

There may be more files than these. We’ll look to be sure. My cunning plan is that the JekyllRunner will just unconditionally move all these files, without regard to whether they are contain new information or not. To move only the changed ones, we’d have to analyze the source files to see what categories they have, and that’s not worth it. The plan at present is just to move them all.

It’s a bit tricky to decide what has to move. In principle, I could modify any of the source files in the site folder on my computer, and Jekyll would regenerate it and then it would need to be moved up. (Otherwise it generates a new copy of the old one, which doesn’t vary from the one that’s already on the site, and doesn’t need to be moved.) The iPad app can’t change any of those files anyway, so I can eyeball the list and decide. All I see in the root is feed.xml and index.html. The former is generated by some magic that I don’t remember. The latter is pretty fancy, including all the code it takes to generate my slightly fancy index page. We don’t need to worry about how it works, we just need to remember to move the copy in the _site folder up to the web page.

And there are category folders, all conveniently under a folder named categories. Again, each of these sub-folders contains an index.html file with some magical scripty stuff in it, and each gets generated and moved to _site, and needs to be moved up to the web site.

So this seems pretty easy, we just need to …

move up feed.xml and index.html from _site;
move up all files found in rj_com/categories.

Since we have code that reads files names and folders from the input and moves corresponding files. Note however the words just and easy. These always bode ill.

NARRATOR:
Indeed, it did bode ill.

I’ll create a feed and index and some category folders, with dummy files in the test site. Then we can write a test about them.

Tozier has arrived and mentions that maybe we should think about having an FTP object. Curiously, Chet and I talked about that yesterday. There’s no doubt that the code is asking for one, since we’re doing all that saving and restoring of the folder stuff. However … for now we’ll build our feature.

It turns out that Jekyll has default feed and index. Good enough for our purposes. We still need the category ones. I’ll just make a few.

Since Jekyll generates my real categories magically, we’ll trick it by making some source folders in our test site, in the same format as the category files it generates. Basic Jekyll behavior will move those up to _site. At least that’s my theory. Let’s find out. We’ll run our tests.

We look into _site and sure enough the files are there. We can write some kind of test to see that we move these and our files in the root.

Darn. Enough prep. We have to start thinking now. We find the test easy to write:

  def test_categories_are_uploaded
    jr = set_up_for_jekyll_testing
    jr.run
    sought = './_target/categories/cat-1/index.html'
    assert(File.exist?(sought), 
      "end to end can't find ftped #{sought}")  
  end

It’s easy mostly because we copied it from our existing end-to-end test. Be that as it may, it fails as expected:

Test_JekyllRunner#test_categories_are_uploaded [/Users/ron/programming/test-ftp/test-jekyll-runner.rb:81]:
end to end can't find ftped ./_target/categories/cat-1/index.html

We need to implement the feature. Our ftp_the_files is incomplete, since it doesn’t do this new stuff. When we look at it we also see that it’s not lovely:

  def ftp_the_files(simulated = false)
    perform_in_folder("#{@jekyll_folder}/_site/articles") do 
      ftp_connection = ftp_connected_to_target unless simulated
      result = what_to_ftp().collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

We rename what_to_ftp:

  def articles_to_ftp
    perform_in_folder(@ipad) do
      Dir.glob('**/*').collect { |f| rename_md_to_html(f) }
    end
  end

  def ftp_the_files(simulated = false)
    perform_in_folder("#{@jekyll_folder}/_site/articles") do 
      ftp_connection = ftp_connected_to_target unless simulated
      result = articles_to_ftp().collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

This makes it more clear that the first bit here is just moving the articles that correspond to our input folder. Our mission now is to add another bit that moves the categories. However, that will clearly glom up that ftp_the_files method even more than it already is. I plan to extract that complex code and put it into a new method, so that ftp_the_files can someday wind up like this:

  # imaginary code ...
  def ftp_the_files(simulated = false)
    move_the_articles(simulated)
    move_the_categories(simulated)
    move_the_root_files(simulated)
  end

As we look at the code blob, I notice that inside there is a method call ftp_connected_to_target. Toziers’s early remarks about needing an FTP object loom large. But we have to get a feature done. let’s press on:

  def ftp_the_files(simulated = false)
    move_the_articles(simulated)
  end

  def move_the_articles(simulated)
    perform_in_folder("#{@jekyll_folder}/_site/articles") do 
      ftp_connection = ftp_connected_to_target unless simulated
      result = articles_to_ftp().collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

I see this this as a pure refactoring and expect that all the tests will run, except for our new one. I’m totally sure. So we run them. Modulo the fact that we didn’t rename the articles_to_ftp that we call directly in our tests, we get the expected result.

Now the new category mover will be very much like the article mover. We begin by copy/paste:

  def ftp_the_files(simulated = false)
    move_the_articles(simulated)
    move_the_categories(simulated)
  end

  def move_the_categories(simulated)
    perform_in_folder("#{@jekyll_folder}/_site/categories") do 
      ftp_connection = ftp_connected_to_target unless simulated
      result = categories_to_ftp().collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

Same as the articles one, except that we write to a categories folder and we need a new method categories_to_ftp. We do see some duplication and some of it is particularly troubling. But we expect to be able to make the test run without fixing the duplication.

NARRATOR:
It was not to be.

[Time passes. Then more time passes.] Oh! Hello! We’ve just been in a rat hole, because with our new move_the_categories method, our tests throw an error trying to set to the wrong folder. The reason, after some confusion, turns out to be that we are calling perform_in_folder recursively, so that our inner call addresses a Dir instance that isn’t yet restored. (The first call’s ensure has not run.) The fix is not clear.

At base, I think the issue is the meaning of perform_in_folder. What it means as implemented is “starting from wherever Dir happens to be, set it to this new place and then put it back”. What we intend, I believe, is “starting from the original root of wherever we were, set to this new place and then put it back”.

We could, we think, hammer on the path in our categories call to perform_in_folder, and make it work, but then those methods would be coupled to an assumption about Dir and that’s what we’re trying to avoid.

What we need is an instance of Dir with its own memory. However, Dir doesn’t work that way.

NARRATOR:
They told themselves that.
Why didn’t they listen?
Why do they never listen?

Tozier does some research and discovers that Ruby Pathname has enough of the functionality of Dir that it may be useful. In particular you can say Pathname.glob(pattern) and get back the thing you had in mind. So … what if we never move Dir's little man around, never doing chdir. All we ever do with Dir is glob anyway, I think.

Tozier speculates at length. I nod wisely. (Well, no I just sat there.) I am inclined toward using Pathname instances, if only because they might be more safe. I am also inclined to think that our use of Dir.glob is flawed because it doesn’t express our intention. Our intention, probably, is “Hey, give me a list of all the folders and files around this here place”.

The Pathname research is interesting, but we need to get to running tests. Then we can change our implementation of whatever the heck we’re doing with Dir.

NARRATOR:
It will fail.
Oh, will it ever fail.

I was starting to feel we should just revert and start over, but now we see what’s happening, so we decide to go ahead. Let’s see if we can figure a way to make this work. We’ll now put the test back and let it fail.

  def test_categories_are_uploaded
    jr = set_up_for_jekyll_testing
    jr.run
    sought = './_target/categories/cat-1/index.html'
    assert(File.exist?(sought), 
      "end to end can't find ftped #{sought}")  
  end

Sweet. Now we can put our category move call back and then implement it.

Anyway we get back to our single failing test. Right now we have no implementation for move_the_categories. We know that if we build it in the obvious way, we’ll get the perform_in_folder can’t be recursive bug. I’m inclined to do that and make perform_in_folder work. And I have a cunning and evil plan.

class JekyllRunner

  def articles_to_ftp
    perform_in_folder(@ipad) do
      Dir.glob('**/*').collect { |f| rename_md_to_html(f) }
    end
  end

  def categories_to_ftp
    perform_in_folder('test_jekyll_site/_site/categories') do
      Dir.glob('**/*')
    end
  end

  def ftp_the_files(simulated = false)
    move_the_categories(simulated)
    move_the_articles(simulated) #this returns that crappy list of strings
  end

  def move_the_articles(simulated)
    perform_in_folder("#{@jekyll_folder}/_site/articles") do 
      ftp_connection = ftp_connected_to_target unless simulated
      result = articles_to_ftp().collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

  def move_the_categories(simulated)
    perform_in_folder("#{@jekyll_folder}/_site/categories") do 
      ftp_connection = ftp_connected_to_target unless simulated
      result = categories_to_ftp().collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

This should fail because Dir isn’t in the right place at the time of our cal to glob. If it runs it brings us back to the error we had before we became confused.

Yes. Our error is back:

Errno::ENOENT: No such file or directory @ dir_chdir - test_jekyll_site/_site/categories

Now we “just” have to fix perform_in_folder. Stand back, this may pinch a bit.

First, I hack perform_in_folder to always work from the “root”, thusly:

  def initialize(ipad_folder, jekyll_folder, host_name, host_folder, prefix)
    @ipad = ipad_folder
    @jekyll_folder = jekyll_folder
    @ftp_host_name = host_name
    @ftp_folder = host_folder
    @id_prefix = prefix
    @dir_pwd = Dir.pwd ## New, save starting location
  end

  def perform_in_folder(new_dir, &block)
      pwd = Dir.pwd
      Dir.chdir(@dir_pwd) ## New, to set known starting location
      Dir.chdir(new_dir)
    begin
      result = block.call
    ensure
      Dir.chdir(pwd)
    end
    return result
  end

This grotesque hack just makes sure we always start from our root. Now our test fails for a new reason. This is progress. We discover that it has moved the category folders, but moved them under articles rather than under categories. That’s because when we initialize the JekyllRunner object we make it point at the articles folder:

  def set_up_for_jekyll_testing
    FileUtils.rm_rf("./_target")
    FileUtils.mkdir("./_target")
    FileUtils.mkdir("./_target/articles")
    FileUtils.rm_rf("./test_jekyll_site/articles")
    ipad_folder = '/Users/ron/Dropbox/_source'
    jekyll_folder = './test_jekyll_site'

    # See? We refer to `.../articles`
    return JekyllRunner.new(ipad_folder, jekyll_folder, 
      'localhost', 'programming/test-ftp/_target/articles', 'TEST_')
  end

The last line is aimed at articles and no one ever changes that. I suggest the “fix” is to aim here at _target and have the JekyllRunner aim at articles or categories as appropriate.

I’m feeling this is a bit junky, and we’ve been red for a long time. But maybe we’re close …

  def move_the_articles(simulated)
    perform_in_folder("#{@jekyll_folder}/_site/articles") do 
      ftp_connection = ftp_connected_to_target unless simulated
      ftp_connection.chdir('articles') unless simulated
      result = articles_to_ftp().collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

  def move_the_categories(simulated)
    perform_in_folder("#{@jekyll_folder}/_site/categories") do 
      ftp_connection = ftp_connected_to_target unless simulated
      ftp_connection.chdir('categories') unless simulated
      result = categories_to_ftp().collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

OK, these new calls to ftp_connection.chdir are ugly … but the tests run. It did take us a while to realize that our new hacked-in chdir needed to be conditioned by the simulated flag.

But the do tests run and the files are really moved. (We couldn’t resist looking even though our tests are surely ever so robust.) Our feature works.

We’re asking ourselves whether we have the brains, judgment, and maturity to clean this up. It’s noon. I have an appointment in Ann Arbor, but that’s not till 2 PM. What might we do by way of cleaning this up?

We don’t like all those unless things. More evil is that these two functions are almost entirely duplicated, so we should be able to do an Extract Method, remove that duplication, and then the unless problem can then be fixed in only one place.

Our two move_ methods get different items to move, by calling categories_to_ftp and articles_to_ftp, which look like this:

  def articles_to_ftp
    perform_in_folder(@ipad) do
      Dir.glob('**/*').collect { |f| rename_md_to_html(f) }
    end
  end

  def categories_to_ftp
    perform_in_folder('test_jekyll_site/_site/categories') do
      Dir.glob('**/*')
    end
  end

These are almost the same, but not so similar that we can merge them. So first we make them more similar:

  def articles_to_ftp
    perform_in_folder(@ipad) do
      Dir.glob('**/*').collect { |f| rename_md_to_html(f) }
    end
  end

  def categories_to_ftp
    perform_in_folder('test_jekyll_site/_site/categories') do
      Dir.glob('**/*').collect { |f| rename_md_to_html(f) }
    end
  end

The rename is benign, just wasteful. Now we can pull the path used up into a parameter, in a new method files_to_ftp. We do categories first:

  def articles_to_ftp
    files_to_ftp(@ipad)
  end

  def categories_to_ftp
    files_to_ftp('test_jekyll_site/_site/categories')
  end

  def files_to_ftp(folder)
    perform_in_folder(folder) do
      Dir.glob('**/*').collect { |f| rename_md_to_html(f) }
    end
  end

We still have the difference in our two moving methods that they call different methods to get the files to me. But now we can use our new files_to_ftp function in each move_ method … using Inline Method on this:

  def move_the_articles(simulated)
    perform_in_folder("#{@jekyll_folder}/_site/articles") do 
      ftp_connection = ftp_connected_to_target unless simulated
      ftp_connection.chdir('articles') unless simulated
      result = articles_to_ftp().collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

To get this:

  def move_the_articles(simulated)
    perform_in_folder("#{@jekyll_folder}/_site/articles") do 
      ftp_connection = ftp_connected_to_target unless simulated
      ftp_connection.chdir('articles') unless simulated
      result = files_to_ftp(@ipad).collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end
      ftp_connection.close unless simulated
      result
    end
  end

move_the_articles now calls our new files_to_ftp method. However, there’s a test that directly calls articles_to_ftp. We tried making it call the files_to_ftp function and it wouldn’t hook up. For now, we commented that test out. This is only semi-righteous but we do check for things actually being FTP’d now so maybe it’s OK.

Again I feel we are getting desperate. The red flags are waving and the siren is going Ahhoooga. But our tests are green again. We’ll push on to remove the categories thing the same way.

  def move_the_categories(simulated)
    perform_in_folder("#{@jekyll_folder}/_site/categories") do 
      ftp_connection = ftp_connected_to_target unless simulated
      ftp_connection.chdir('categories') unless simulated
      result = 
    files_to_ftp('test_jekyll_site/_site/categories').collect do |file|
        ftp_one_file(ftp_connection, file, simulated)
      end

      ftp_connection.close unless simulated
      result
    end
  end

Now we have the two functions similar enough that we could extract and parameterize. The tests are green. We could go ahead.

I declare, however, that we are burned out, we’ve fallen into error enough times, and the next refactoring should be done with fresh eyes if not fresh programmers. Tests are green, it’s legit to pause.

Lessons learned …

NARRATOR:
Will they learn?
Will they?

The Mangle strikes hard today. The code pushed back. Tozier feels a bit that we are Wile E. Coyote getting hit by our own hammer. There’s some kind of ornateness going on that’s troubling him. Truth is, we’ve made some pretty profound changes even though they were only a line or two.

And we’ve been confused a lot, again by these class-level Dir and FTP objects and their silly static behavior. We sort of fixed that by forcing Dir.chdir to remember the root and go back.

My intention with that hack was to remove it later, by building a class that works more like we had in mind. We didn’t get there yet. Tozier still thinks Pathname might help with the problem.

Tozier also thinks our specific path names for the folders aren’t helping us. Is it the Jekyll source folder, the folder we’re building into, or what? There’s too much to keep in our heads.

Part of our problem is that we first wrote some tests pointing to particular folders, and as the program matured, some of those names have turned into arguments in JekyllRunner creation. But we haven’t fully cleaned out the fixed names in the tests.

Relatedly, some of our tests are, by the nature of the way we built them, very invasive. They ask “what are you going to do” rather than letting you do it and then checking. These are later covered by “did you do it” tests but they didn’t get removed. Except for the one we commented out just now.

There was a moment, quite a while back, where, perhaps, we should have reverted and started over. We were confused. We pressed on and worked it out. Would it have been better to revert? My rule of thumb says yes, but we did learn some things and then went forward to fix them. Well, semi-forward. We removed a bunch of stuff and then put it back. I guess that’s a sort of revert.

The main thing is, though, we went pretty deep into confusion (at least I did) and my better judgment tells me to stop and get unconfused. My long experience is that often I don’t do that, I just keep digging. This time, it worked.

Here’s the code if you want to look at it

Friday

The above was done on Thursday. It’s Friday morning as I ready this article for publication. Going over the draft, I vaguely remember that in one of those “Time passes” periods that happen when I’m not writing for a while, we made little changes to “fix” things, and I didn’t record them. And I don’t even remember them.

What does this mean? You know those times when you’re doing something big, and it doesn’t quite work and you see some tiny thing that relates and you fix it, and you move on, and by the time you write your commit message, you don’t even remember what you did, much less why? (Or am I the only one who ever does that?)

For me, at least, that little fix, popped in quickly, represents a little bit of clarity that has been forgotten. It’s probably good. The tests run, we have a good idea what we want to do next for our feature, and we have some clues on how to fix the perform_in_folder to work more like we intend. We might even remember to do it.

On the C3 project, we had a saying that something you did all day yesterday could be pulled out and redone, better, in an hour today. We really didn’t want to revert yesterday. We talked about it two or three times, and I felt it more often than that. But if we reverted, we would have failed, and maybe there wouldn’t be time to get our feature done.

Imagine if anyone really cared about our feature. We’d have felt more pressure to push on. Even without anyone managing us, I think we’ve probably left some cruft in the code, and missed some understanding that a reversion might have given us. We’ll talk about reverting next Tuesday, I imagine. But we’ll decide to move ahead, with extra vigilance to find any cruft we’ve left in.

I am not convinced that’s the right thing to do, but we’ll do it. We’ll surely find some cruft and gain more understanding. Still, I think we wrote a bit of legacy code yesterday, and now we’ll never get rid of all of it, because we won’t revert.

We should revert. We won’t. How about you?