Tozier and I agree that, this time, we should build an object of some kind to contain all the odd behavior that we need for publishing things to my site, including finding the authentication information, creating necessary folders, and moving files. Last time we had this behavior embedded procedurally in our single JekyllRunner object and while it worked, it wasn’t a particularly good design. We’ll try to do better this time.

We talked about authentication last time. Tozier is troubled by having the passwords somewhere in a file, and I certainly agree that having them be part of the web source folders, which they are right now, is dangerous. So I’ve moved them to a private location on my laptop, in the file /Users/ron/programming/passwords.rb. I’m mentioning that in case you want to break into my house and find those passwords. In our program today, we’ll just require in that file and use the same scheme we had before, which looks like this:

module Passwords
  TEST_USER = 'test_user_name'
  TEST_PASSWORD = 'test_password'
  PROD_USER = 'web_site_user_name'
  PROD_PASSWORD = 'web_site_password'
end

We’ll see shortly how we access this info.

It’s time for a test. Much as I hate this answer, I believe that our first test has to be logging in.

“Why do you hate this, Ron?” you ask. I hate it because I think early stories should focus on actual behavior, and no one really wants to log in. But in this case, I can’t see any sense to starting somewhere else. So here goes …

  def test_ftp_login
    content_location = @test_jekyll_output_folder
    publication_location = Pathname.new('../site_builder_ftp_root/httpdocs/')
    host = 'localhost'
    mode = 'TEST'
    publisher = Publisher.new(content_location, publication_location, host, mode)
    assert_equal(
        '/Users/ron/programming/site_builder_ftp_root/', 
        publisher.pwd)
  end

Here’s our guess at our first test. We’ve set up a new folder, /Users/ron/programming/site_builder_ftp_root, and, to mimic what we know happens on the publication site, we’re going to put everything we move into the httpdocs folder below that point. We’ve tested what the real FTP object will receive when it’s opened, and it will have pwd equal to /Users/ron. We also know that when we log in on the real site, pwd will be /. We plan not to use this information, but to save it. We do feel the need to know it so that we’ll be as confused as possible.

Note that we (well, I, over protests) decided that what we’re building is a Publisher. We think this test is close enough to drive some development. Here goes. We learn a bit and revise the test:

  def test_ftp_login
    content_location = @test_jekyll_output_folder
    publication_location = Pathname.new('../site_builder_ftp_root/httpdocs/')
    host = 'localhost'
    mode = 'TEST'
    publisher = Publisher.new(content_location, publication_location, host, mode)
    assert_equal('/Users/ron', publisher.pwd)
    publisher.close
  end

And we make it run with this class:

# Publisher

require 'net/ftp'
require '/Users/ron/programming/passwords.rb'

class Publisher
  def initialize(content_location, publication_location, host, mode)
    @content_location = content_location
    @publication_location = publication_location
    user = Object.const_get('Passwords::' + mode + '_USER')
    password = Object.const_get('Passwords::' + mode + '_PASSWORD')
    @ftp = Net::FTP.new(host, user, password)
  end

  def pwd
    return @ftp.pwd
  end

  def close
    @ftp.close
  end

end

This test runs correctly. We even pointed it at my site, and got the error we expected, because my site’s root pwd is /. So we think we have a connection and can start moving some files.
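
The credential lookup in initialize builds a constant name from the mode string. Here is a minimal sketch of that idea on its own, assuming a Passwords module shaped like ours with placeholder values. The credentials_for helper is a name made up for illustration; it is not part of Publisher.

```ruby
# Placeholder credentials, mirroring the shape of the private passwords.rb.
module Passwords
  TEST_USER = 'test_user_name'
  TEST_PASSWORD = 'test_password'
  PROD_USER = 'web_site_user_name'
  PROD_PASSWORD = 'web_site_password'
end

# Hypothetical helper: returns [user, password] for a mode such as 'TEST'.
# An unknown mode raises NameError, because no such constant exists.
def credentials_for(mode)
  [
    Passwords.const_get("#{mode}_USER"),
    Passwords.const_get("#{mode}_PASSWORD")
  ]
end
```

Asking the module for the constant directly should behave the same as the Object.const_get call with the 'Passwords::' prefix, and it fails loudly if someone passes a mode we never defined.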

We think our next test will be to move one file, into a folder that does not exist. Shortly after that works, we’ll do one where the folder does exist, because we need to be sure we don’t wipe out existing folders. And we’ll test moving a file that exists, to be sure that we over-write it. One thing at a time.

We discuss whether this is the best first test. T points out that moving a file to the root is simpler. Yeah, it is. Small steps. Let’s do that:

  def test_ftp_root_file
    clear_output_folder
    put_file_in_root
    publisher = create_test_publisher
    publisher.publish(pathname_of_root_file)
    assert(pathname_of_published_file.exist?)
  end

There’s our sketch. Now to fill it in. Well, that took some time, and we find it easy to get confused. Here’s what we’ve got, though, and it runs:

  def test_ftp_root_file
    clear_output_folder
    put_file_in_root
    publisher = create_test_publisher
    pathname_of_root_file = @test_jekyll_output_folder + 'index.html'
    pathname_of_published_file = Pathname.new('/Users/ron') + @test_publication_location + 'index.html'
    publisher.publish(pathname_of_root_file)
    assert(pathname_of_published_file.exist?)
  end

It’s driving this class:

# Publisher

require 'net/ftp'
require '/Users/ron/programming/passwords.rb'

class Publisher
  def initialize(content_location, publication_location, host, mode)
    @content_location = content_location
    @publication_location = publication_location # relative to ftp root
    user = Object.const_get('Passwords::' + mode + '_USER')
    password = Object.const_get('Passwords::' + mode + '_PASSWORD')
    @ftp = Net::FTP.new(host, user, password)
    @ftp_root = @ftp.pwd
    @ftp.chdir(@publication_location.to_s)
  end

  def publish(content_pathname)
    # input is @content_location + 'index.html'
    # desired is @publication_location + 'index.html'
    # BUT ... there could be folders after @content_location
    # and if there are, we want those included.

    # right now our connection is in /Users/ron
    @ftp.putbinaryfile(content_pathname.to_s)
  end

  def pwd
    return @ftp.pwd
  end

  def close
    @ftp.close
  end

end

You can see some notes we wrote in the publish method, because we were so confused. The issue is, again, that we have test folders in a “convenient” location far away from the root of localhost, and our FTP object starts out pointing at the host root. Then we have a desired sub-folder where things should go. We chose httpdocs in both cases, hoping this would simplify our thinking.

We fumbled a bit and decided that the thing to do is to create our FTP connection so that it points to our target sub-folder. That’s done in the initialize, though we had at first done it on the fly as we moved our file.

We are still not creating any sub-folders under httpdocs, which we’ll leave for another day.

Tozier observes that while we want to think that there are two corresponding hierarchies in the source and target for the FTP, FTP doesn’t really think that way. For example, note this line:

    @ftp.putbinaryfile(content_pathname.to_s)

We didn’t even tell @ftp where to put the file. It puts it wherever it happens to be pointing at the moment. We had it pointing into httpdocs and since our first test file is local, that’s all good. Now it turns out that you could provide a path as the second parameter there, and if you did, FTP would put the file down that path, relative to wherever it’s pointing.

This may actually be a good thing. We will have pathnames in our source folder that are in fact supposed to wind up in the same location relative to httpdocs, so if we could prefix that putbinaryfile with something that made sure all the folders were there, it would “just work”.
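
To make the second-parameter idea concrete, here is a rough sketch. RecordingFtp is a made-up stand-in that only records calls so the example runs without a server, and publish_with_remote_path is a hypothetical name. The one real thing is that Net::FTP’s putbinaryfile accepts a remote file name as its second argument. Whether a server accepts a path there when the folder is missing is exactly the part we have not solved.

```ruby
require 'pathname'

# Stand-in for Net::FTP that only records calls, so the sketch runs offline.
class RecordingFtp
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Same argument order as Net::FTP#putbinaryfile: local file, then remote name.
  def putbinaryfile(local_file, remote_file = File.basename(local_file))
    @calls << [local_file, remote_file]
  end
end

# Hypothetical variant of publish: the same relative path is used on both
# sides, prefixed locally by the source root and remotely by wherever the
# connection happens to be pointing.
def publish_with_remote_path(ftp, source_root, relative_path)
  local = (Pathname.new(source_root) + relative_path).to_s
  ftp.putbinaryfile(local, relative_path.to_s)
end
```

With the connection parked in httpdocs, the relative path would be the only thing the caller has to supply.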

This makes me believe that while Tozier and I are confused, the code is not. The code is incomplete but may well be structured just right for the next step.

In any case, we’re done for the day except perhaps for a bit more retrospecting. Or not.

We are having trouble seeing why this should be a simple tree-copy, but what we have here makes us believe that it’s within reach. Maybe we’re lumping too much into those paths. Maybe they need to be partitioned … into something like “where the operating system has ron’s files”, plus “where the project happens to be”, plus “where the file is in the logical tree”.

One way or another, however, this is for another day. At least one of us is fried. In fact, I started out fried. See you next time.

Tuesday, 24 Oct

It’s next time. Fall has arrived, if not with a vengeance then at least with what seems like malicious intent. It’s damp and cool and overcast enough to be almost dark out at 9 AM here at BAR. Tozier’s not here yet, which is good by me, as I like a little time to settle in before we get down to it.

Last time, we agreed that there’s something wonky about our Pathname choices. This has been a perennial problem right along, and this has to be considered seriously. When the same kind of issue pops up again and again, I try to treat it as a signal that there’s something wonky in my thinking. Sure, sometimes the world is wonky, but in my experience it’s far more likely that the world makes some kind of sense and I haven’t quite worked out what it is.

Our thought last time was that our paths seem to have at least two aspects. One is the detailed pathname of the file or folder in question, and the other is some kind of base or root address. Because of the way the folders and FTP objects work, the base address can be quite long. On my computer, with the tests we’re writing right now, a new FTP object has '/Users/ron' as its base, and we determined that on my web site, we’ll see '/' as the base.

In our tests, we want to publish files via FTP to 'programming/site_builder_ftp_root/httpdocs/', inside '/Users/ron'. On the web site, files should go to '/httpdocs', that is, to 'httpdocs' inside '/'.

Tozier describes three pieces of concern:

  1. the path where the FTP opens up, the ftp_start;
  2. the top of the area where we’re working, the remote_working_folder;
  3. the detailed path to a specific file we want to publish.

It seems that if we “knew” we were positioned at the second one, and if our manifests of work were based at that point, it should become “easy”.
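
The three pieces can be played with using nothing but Pathname. This is a small sketch, assuming the folder names from our tests. The method names relative_to_source and published_location are invented here purely for illustration.

```ruby
require 'pathname'

# Piece 3: the part of a local file's path that both trees share.
def relative_to_source(source_root, local_file)
  Pathname.new(local_file).relative_path_from(Pathname.new(source_root))
end

# Pieces 1 + 2 + 3: where the file ends up, as seen from the local file
# system in our tests (ftp_start is '/Users/ron' there, '/' on the real site).
def published_location(ftp_start, remote_working_folder, relative_path)
  Pathname.new(ftp_start) + remote_working_folder + relative_path
end
```

For example, a file at site_builder_jekyll_output/sub_folder/page.html reduces to sub_folder/page.html, and that same short path is all that should be needed on the remote side once the connection is sitting in the working folder.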

Tozier asks whether we should change what we have or whether this is just a thing to watch for. My answer is that the code should reflect our understanding, so we should see if we can put this idea into the code now (before we forget it yet again).

With these new definitions:

  def setup
    @test_source_folder = Pathname.new('../site_builder_test_source')
    @test_jekyll_output_folder = 
      Pathname.new('/Users/ron/programming/site_builder_jekyll_output')
    @ftp_start = Pathname.new('/Users/ron')
    @remote_working_folder = 
      Pathname.new('programming/site_builder_ftp_root/httpdocs/')
  end

We modify our one file-moving test:

  def test_ftp_a_root_file
    clear_output_folder
    put_file_in_root
    publisher = create_test_publisher
    pathname_of_root_file = @test_jekyll_output_folder + 'index.html'
    pathname_of_published_file = @ftp_start + @remote_working_folder + 'index.html'
    publisher.publish(pathname_of_root_file)
    assert(pathname_of_published_file.exist?)
  end

This runs. However, the pathname_of_published_file is not following our desired new intention. Since it’s at “root” in both the source (jekyll output folder) and target (FTP remote working folder), we shouldn’t have to provide all that upstream path information. We want to say something like “I’d like to make an FTP connection between this local working folder and that remote folder”, and thereafter just say things like “move index.html” and have that automatically mean to move the file from the local working folder (and some provided subpath) to the remote folder (plus the provided subpath).

So let’s refine the test to say that and then make the code do it. Along the way, let’s try not to stuff a file named index.html into some random place in my computer. Maybe if we gave it a very unique name, that would help us find it after it inevitably happens. And we’ll be really careful.

We want our test to look like this:

  def test_ftp_a_root_file
    clear_output_folder
    put_file_in_root
    source_root = @test_jekyll_output_folder
    published_root = @ftp_start + @remote_working_folder
    publisher = create_test_publisher(some_information)
    pathname_of_file_to_move = 'index.html'
    publisher.publish(pathname_of_file_to_move)
    assert(full_pathname_of_published_file.exist?)
  end

So we’ll create a publisher that has all that separate information – we think – and then when we move a file, as in the last two statements before the assert, we’ll just send in the path that’s local to the two roots. The publisher’s responsibility is to be ready. And we’ll have to come up with one new value, the full_pathname_of_published_file, to test against.

We revise Publisher first, because we needed to think about what it needs:

  def initialize(source_root, remote_working_folder, host, mode)
    #of interest to us:
    #  source root (a full path on local computer)
    #  remote working folder (relative to what FTP system provides)
    @source_root = source_root
    @remote_working_folder = remote_working_folder # relative to ftp root
    user = Object.const_get('Passwords::' + mode + '_USER')
    password = Object.const_get('Passwords::' + mode + '_PASSWORD')
    @ftp = Net::FTP.new(host, user, password)
    @ftp_root = @ftp.pwd
    @ftp.chdir(@remote_working_folder.to_s)
  end

We have also disabled it so that it no longer moves any files. Now to set it up. We think this is what we want. I’m putting it here now so when it fails you can share our embarrassment, I mean learning.

  def test_ftp_a_root_file
    clear_output_folder
    put_file_in_root
    source_root = @test_jekyll_output_folder
    published_root = @ftp_start + @remote_working_folder
    publisher = create_test_publisher(
      source_root, @remote_working_folder)
    pathname_of_file_to_move = 'strange_index.html'
    publisher.publish(pathname_of_file_to_move)
    full_pathname_of_published_file = published_root + 'strange_index.html'
    assert(full_pathname_of_published_file.exist?)
  end

  def create_test_publisher(source_root, remote_working_folder)
    host = 'localhost'
    mode = 'TEST'
    Publisher.new(source_root, remote_working_folder, host, mode)
  end

Only a few moments later – he claimed – we have our Publisher working correctly so far. It looks like this:

# Publisher

require 'net/ftp'
require '/Users/ron/programming/passwords.rb'

class Publisher
  def initialize(source_root, remote_working_folder, host, mode)
    #of interest to us:
    #  source root (a full path on local computer)
    #  remote working folder (relative to what FTP system provides)
    @source_root = source_root
    @remote_working_folder = remote_working_folder # relative to ftp root
    user = Object.const_get('Passwords::' + mode + '_USER')
    password = Object.const_get('Passwords::' + mode + '_PASSWORD')
    @ftp = Net::FTP.new(host, user, password)
    @ftp_root = @ftp.pwd
    @ftp.chdir(@remote_working_folder.to_s)
  end

  def publish(content_pathname)
    @ftp.putbinaryfile((@source_root + content_pathname).to_s)
  end

  def pwd
    return @ftp.pwd
  end

  def close
    @ftp.close
  end

end

Seeing the close method, I have the feeling we haven’t closed our Publisher. After we do that, we need a new test, or to extend this one, to deal with something in a sub-folder.
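
One way to stop forgetting the close would be a block form that closes for us. This is only a sketch: with_publisher is a hypothetical helper, not something we have written, and it assumes nothing about the publisher except that it responds to close.

```ruby
# Hypothetical helper: hands the publisher to the block and closes it
# afterwards, even if the block raises.
def with_publisher(publisher)
  yield publisher
ensure
  publisher.close
end
```

A test would then wrap its publish calls in the block and never mention close at all.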

I’m also concerned about folder creation, which we’ll probably hit in making the subfolder test run. I think we can count on our root target folder, httpdocs, existing.

Moving along, we create this test:

  def test_ftp_a_non_root_file
    clear_output_folder
    put_file_in_subfolder
    source_root = @test_jekyll_output_folder
    published_root = @ftp_start + @remote_working_folder
    publisher = create_test_publisher(
      source_root, @remote_working_folder)
    pathname_of_file_to_move = Pathname.new('sub_folder/substrange_index.html')
    publisher.publish(pathname_of_file_to_move)
    full_pathname_of_published_file = published_root + pathname_of_file_to_move
    assert(full_pathname_of_published_file.exist?, "file didn't get created")
    publisher.close
  end

With a corresponding creation of a new file, done in the put_file_in_subfolder helper that the test calls.
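
The listing for that step is not shown here. As a stand-in, this is a minimal sketch of what such a helper might look like, assuming it takes the output folder as an argument (ours uses an instance variable from setup) and that the file contents do not matter.

```ruby
require 'pathname'

# Hypothetical stand-in for put_file_in_subfolder: writes a small file under
# output_folder/sub_folder, creating the folder first if it is missing.
def put_file_in_subfolder(output_folder, name = 'substrange_index.html')
  target = Pathname.new(output_folder) + 'sub_folder' + name
  target.dirname.mkpath
  target.write("<html>#{name}</html>\n")
  target
end
```

Note that the local side gets to use Pathname’s mkpath, which creates every missing level in one call.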

The test fails because the FTP can’t find the folder. (Actually it first failed because we hadn’t created the source file correctly.) Anyway, we extend Publisher:

  def publish(content_pathname)
    target_path = content_pathname.dirname
    @ftp.mkdir(target_path)
    @ftp.putbinaryfile((@source_root + content_pathname).to_s)
  end

We’re just going to tell FTP to create the necessary subfolders. We are, for now, assuming that it will do so recursively (which Ron doubts) and that it will do so non-destructively. (Ron expects an error saying folder already exists as soon as we do a second file in this folder.) We’ll see …

We get lots of errors saying “File Exists” from FTP, and we find two issues. First, our mkdir will try to create the root folder that we’re in, when we send our file to the root. (The dirname returns “.” in that case.) Second, ‘mkdir’ seems not to want to create a folder that already exists, which will give us problems when we put two files in the same folder.
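
The first issue is easy to see with Pathname alone. In this small sketch, folder_needed_for is a name invented for illustration. It just wraps the dirname call and treats “.” as “nothing to create”.

```ruby
require 'pathname'

# Hypothetical helper: the remote folder a file needs, or nil when the file
# belongs directly in the working folder and nothing has to be created.
def folder_needed_for(content_pathname)
  folder = Pathname.new(content_pathname).dirname
  folder.to_s == '.' ? nil : folder
end
```

For 'index.html' this gives nil, and for 'sub_folder/substrange_index.html' it gives the Pathname 'sub_folder'.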

However, our first step needs to be to remove the test FTP output folders and set up afresh. We know how to do this … sort of. We wind up with this working:

  def publish(content_pathname)
    target_path = content_pathname.dirname
    puts "target path/#{target_path}/"
    @ftp.mkdir(target_path.to_s) unless target_path.to_s == '.'
    @ftp.chdir(target_path.to_s)
    @ftp.putbinaryfile((@source_root + content_pathname).to_s)
  end

We scratch our heads, though, at why we have to go through all this crap to make FTP do what we want. And we also have one more issue to deal with, which is a second file in the folder we just made. I expect that to fail. Let’s extend our test:

  def test_ftp_non_root_files
    clear_output_folders
    put_files_in_subfolder
    source_root = @test_jekyll_output_folder
    published_root = @ftp_start + @remote_working_folder
    publisher = create_test_publisher(
      source_root, @remote_working_folder)

    pathname_of_file_to_move = Pathname.new('sub_folder/substrange_index.html')
    publisher.publish(pathname_of_file_to_move)
    pathname_of_second_file_to_move = Pathname.new('sub_folder/secondstrange_index.html')
    publisher.publish(pathname_of_second_file_to_move)

    full_pathname_of_published_file =
      published_root + pathname_of_file_to_move
    assert(full_pathname_of_published_file.exist?, "file didn't get created")
    full_pathname_of_second_published_file = 
      published_root + pathname_of_second_file_to_move
    assert(full_pathname_of_second_published_file.exist?, "second file didn't get created")
    
    publisher.close
  end

With some fiddling that we’ll see in a moment, this code now fails because we’re trying to create the subfolder twice. The first file goes OK and the second time, FTP fails trying to create the folder. We had some really strange code in our old version that dealt with that situation. It looked like this:

  def ftp_folder_exists?(ftp, folder_string)
    split = folder_string.split("/")
    proposed_folder = split.last
    ppl = split[0...-1]
    proposed_prefix = "./" + ppl.join("/")
    ftp.list(proposed_prefix).any? { |name| name.match(proposed_folder) }
  end

If we understand what that abomination does, it splits the proposed path into a parent and a final folder name, lists the parent, and checks whether any entry in the listing matches that name. If one does, we don’t try to create the folder. Seems fair enough. All that splitting is there to do what we now get from Pathname’s dirname and basename.

We wind up with this in Publisher:

  def publish(content_pathname)
    target_path = content_pathname.dirname
    @ftp.mkdir(target_path.to_s) unless path_exists?(target_path)
    @ftp.chdir(target_path.to_s)
    @ftp.putbinaryfile((@source_root + content_pathname).to_s)
    @ftp.chdir(@ftp_root)
    @ftp.chdir(@remote_working_folder.to_s)
  end


  def path_exists?(path)
    return true if path.to_s == '.' 
    parent = path.parent
    base = path.basename
    things = @ftp.list(parent.to_s)
    things.any? { |name| name.match(base.to_s) }
  end

This is hideous, and it works so far. I have serious doubts about sub-sub-folders. In the previous version, our “manifest” always included the names of the folders first, top to bottom, and we created them from that part of the manifest, a, then a/b, then a/b/c. Here we don’t have that. At least not now.
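
Part of what makes it hideous is the matching. Passing a string to match treats it as a pattern and finds it anywhere in the listing line, so a folder with a similar name could fool us. The sketch below shows the difference using made-up listing lines. Real servers format their listings in their own ways, and exactly_listed? is a hypothetical helper that would itself break on names containing spaces.

```ruby
# Made-up LIST-style lines, for illustration only.
def sample_listing
  [
    'drwxr-xr-x  2 ron  staff   64 Oct 24 09:10 sub_folder_old',
    '-rw-r--r--  1 ron  staff  120 Oct 24 09:11 index.html'
  ]
end

# The kind of check path_exists? does: a pattern match anywhere in the line.
def loosely_listed?(listing, name)
  listing.any? { |line| line.match(name) }
end

# Hypothetical stricter check: compare the last whitespace-separated field.
def exactly_listed?(listing, name)
  listing.any? { |line| line.split.last == name }
end
```

With this listing, the loose check claims that sub_folder exists because sub_folder_old does, while the exact check says it does not.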

A test of a subfolder inside sub_folder will tell the tale. Arguably, we could parse our existing manifest and create the tree of stuff to create, if we have to. Let’s extend this test one more level and see what happens.

  def test_ftp_non_root_files
    clear_output_folders
    put_files_in_subfolder
    source_root = @test_jekyll_output_folder
    published_root = @ftp_start + @remote_working_folder
    publisher = create_test_publisher(
      source_root, @remote_working_folder)

    pathname_of_file_to_move = Pathname.new('sub_folder/substrange_index.html')
    publisher.publish(pathname_of_file_to_move)
    pathname_of_second_file_to_move = Pathname.new('sub_folder/secondstrange_index.html')
    publisher.publish(pathname_of_second_file_to_move)
    pathname_of_third_file_to_move = Pathname.new('sub_folder/child/childindex.html')
    publisher.publish(pathname_of_third_file_to_move)

    full_pathname_of_published_file =
      published_root + pathname_of_file_to_move
    assert(full_pathname_of_published_file.exist?, "file didn't get created")
    full_pathname_of_second_published_file = 
      published_root + pathname_of_second_file_to_move
    assert(full_pathname_of_second_published_file.exist?, "second file didn't get created")
    full_pathname_of_third_published_file = 
      published_root + pathname_of_third_file_to_move
    assert(full_pathname_of_third_published_file.exist?, "third file didn't get created")
    
    publisher.close
  end

Here, we’ve created a sub-sub-folder with a file in it. We publish our files in top-down order, and the Publisher handles it correctly. It works because of the coincidence that we only made a one-level jump down. If we had tried to make something like grandparent/child/grandchild/index.html without first putting something in child, that would, we think, fail. The mkdir operation of FTP is not the same as the mkpath operation of Pathname: it can only make a folder one level down, and it’s not even smart enough to return happily if the folder is already there.

We are sore tempted to build all the right functionality into our Publisher, but that is for another day. For now the tests are all running and we are tired.

Summing up we note that we’re sort of back where we started, in that we still have a horrendous kludge dealing with FTP's shortcomings. But it feels a bit better, in that we now have the kludge code embedded in an object we own, rather than spread out inside our JekyllRunner. Of course, we don’t even have a JekyllRunner yet, but when we build it, we won’t have to contaminate it with FTP concerns. So that’s nice, as the meme says.

Probably it’s a good idea to extend Publisher to have a mkpath capability, like that of Pathname. If we had that, then each time we were given a Pathname to publish, we’d just use mkpath on the front of it to make sure we had a place to stand, and then stand there and move the file. This is sort of happening now, but relies on our calling Publisher with paths in the right order (lexical order top to bottom, roughly). So next time maybe we’ll look at doing a mkpath kind of thing, which I think I’d call, informally at least, “make path safely”.
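
For the curious, here is a rough sketch of what “make path safely” might look like. It is not what we have built. FakeFtp is an in-memory stand-in so the sketch runs without a server, and make_path_safely is a hypothetical name. The only assumption about the real thing is that Net::FTP offers nlst and mkdir, and that nlst returns names we can reduce with File.basename, which may vary by server.

```ruby
require 'pathname'

# In-memory stand-in for an FTP connection. Like the real mkdir, it refuses
# to create a folder that exists or whose parent is missing.
class FakeFtp
  attr_reader :folders

  def initialize
    @folders = []
  end

  # Names of the folders directly inside dir ('.' is the working folder).
  def nlst(dir)
    @folders.select { |folder| File.dirname(folder) == dir }
  end

  def mkdir(path)
    parent = File.dirname(path)
    raise "File exists: #{path}" if @folders.include?(path)
    raise "No such folder: #{parent}" unless parent == '.' || @folders.include?(parent)

    @folders << path
  end
end

# Hypothetical helper: walk the path top-down (a, then a/b, then a/b/c) and
# create only the levels that are not already listed in their parent.
def make_path_safely(ftp, relative_dir)
  Pathname.new(relative_dir).descend do |folder|
    next if folder.to_s == '.'

    siblings = ftp.nlst(folder.dirname.to_s).map { |entry| File.basename(entry) }
    ftp.mkdir(folder.to_s) unless siblings.include?(folder.basename.to_s)
  end
end
```

Calling it twice with 'grandparent/child/grandchild' should create three folders the first time and nothing the second, which is the order-independence we’re missing today.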

See you next time!