The Case Against Writing Backlogs

If you have things you want to write about, I’ll make a case against keeping a large backlog.

Immediacy and Inspiration

It’s more useful to write about experiences at a recent conference right now than two months from now. The time delay not only dampens memory, it also weakens excitement. It definitely helps to write about things when I am first excited about them, because once the enthusiasm fades, writing feels more difficult.

Topics that I once thought were fascinating are no longer so after I have been exposed to them for a while. Strike while you are inspired by learning, because after that energy passes, you are less likely to be interested, having already internalized the concepts. It’s hard to get fired up about something that you see as obvious. Here is a great summary of this idea.

I internally model this as the brain being in a rough steady state. Something comes along that shakes up your mental models, and the attempt to put your brain back into equilibrium gives off a great deal of energy. After this happens, it’s tough to recreate that level of energy, and it’s also difficult to remember what your previous brain state was.

This goes back to Steve Pavlina’s concept of writing within 48 hours of getting the idea:

I don’t maintain a list of article ideas, I don’t actively brainstorm ideas in advance, and I generally don’t ask for suggestions. I’ve done all of those things in the past, but they don’t work well for me in practice. At one point I had a list of about 200 new article ideas. When I scanned it for something to write about, I was usually bored by everything on it.

If I get a suggestion from someone for a new article, I’ll normally write about it that same day if it excites me. Otherwise I simply let it go. Ideas by themselves have no value to me. There’s an infinite supply of ideas. The present-moment inspired ideas are the ones worth exploring.

Inspirational energy has a half life of about 24 hours. If I act on an idea immediately (or at least within the first few hours), I feel optimally motivated, and I can surf that wave of energy all the way to clicking “Publish.” If I sit on an idea for one day, I feel only half as inspired by it, and I have to paddle a lot more to get it done. If I sit on it for 2 days, the inspiration level has dropped by 75%, and for all practical purposes, the idea is dead. If I try to write it at that point, it feels like pulling teeth. It’s much better for me to let it go and wait for a fresh wave. There will always be another wave, so there’s no need to chase the ones I missed.

Why Writing Backlogs Are Harmful

One problem with maintaining a backlog of things to write about is the overhead of managing all of those ideas. There is too much work in process, and thinking about too many things causes thrashing. This reminds me of my post on limiting reading work in progress and books in progress.

Keeping a backlog of writing ideas becomes especially problematic when there is a long lead time between when the ideas are written down and when they are acted on. Often I’ll forget what I was thinking when I wrote down a nugget for a post seed. Or I will read a book and have to re-read parts of it to get the context back. This process causes waste.

It’s certainly possible for cutting-edge ideas to become stale over time. This is also a form of waste.

Of course, it can be tough to get important things done while making time to write. Queueing ideas in a backlog is one potential path, but a better solution seems to be more frequent and regular writing. By just sitting down and getting the ideas out right after a new association is made, better writing can happen with less effort.

I think having less of a backlog contrasts a bit with the Fieldstone Method of Writing by one of my favorite authors, Jerry Weinberg. However, there seems to be some overlap, in that action with inspiration is easier than action without inspiration.

Do you find that you have too many ideas for posts, or too few? If too few, do you want some ideas? :)

Using Scalpel to Recover Lost Data in Ubuntu

So there I was, editing my personal writing journal. I realized that the file had somehow lost a large chunk of data, and only had the last few entries. My backups had the same truncated contents, so I was staring at six months of data loss on an important personal file.

This post covers how I got the data back, good as new.

Is the disk bad?

The first thing I did was search the hard drive to see if the text was lying around somewhere, either in another file or in a vim swap file I had deleted. When you use vim (in my case, gvim), by default it creates a hidden swap file (ending in .swp) alongside the file you are editing. Then, if the machine goes down while you are editing, you can restore the file from the autosaved swap file. Unfortunately, I didn’t find the data on my hard drive after searching around.

The disk itself didn’t seem to be having any problems. There were no audible indications of failure, and no other files were missing to the best of my knowledge. However, I figured that I should run a disk check to ensure that I wasn’t dealing with the early stages of more widespread data loss. Tools like fsck require that the drive not be mounted when you run them, so I needed to find a way to unmount the drive and run the check. Since I was running Ubuntu, I found a helpful command:

sudo touch /forcefsck

This tells Ubuntu to run fsck on the next startup, before the file system is mounted. I ran this, and the file system came back unchanged, so the disk itself appeared to be fine.

So how did I lose data?

The file is my 2011 journal, and I really wanted to get it back. My best guess as to how this happened was:

  1. I somehow delete most of the file

  2. I unwittingly save the file, and go to bed for the night

  3. My mirrored backup runs, syncing the bad version of the file to my shared server

  4. I look at the file and see it’s corrupt, and that the backup is corrupt as well

An alternate theory involves some sort of corruption or power failure at a critical time that did something nasty.

I knew that the nightly mirroring had a failure mode if I deleted or modified a file and later wanted the previous contents, but the effort of a more sophisticated backup scheme had not seemed entirely worth it. I just wanted something to protect against catastrophic data loss (fire, hard disk crash), not something to handle minor goofs. In retrospect, I probably spent as much time restoring the file as I would have spent setting up a better backup solution.

Knowing a bit about how file systems and hard drives work, I reasoned at this point that the portion of the file that I deleted might still be around somewhere in an unindexed area of the hard drive. Basically, the file system keeps a list of pointers to blocks, and if one of those pointers got messed up, or an old version of the file was lying around, I might be able to get at least some of the file back. Plus, I still had some small snippets in Gmail, so it wouldn’t be a complete wash either.

If there was recoverable data on the hard drive, I didn’t want to write to the disk and risk overwriting it. Hence, I needed to search the drive, preferably without mounting it.

Scalpel

I messed around with USB installations of Ubuntu, but this was a rabbit hole. I ended up burning an Ubuntu Live CD, and I must say, it was fantastic to work with.

From here, the tool I primarily used to recover the data was Scalpel, which looks through the hard drive for patterns of characters that delimit the start (and optionally, the end) of whatever you are looking for. This lets you go through the entire disk, not just what the operating system thinks is there.

Scalpel operates in two passes. First, it scans the whole hard drive for any of the start markers you specified. It remembers these, and then on the second pass it goes to each start marker and carves out data until it hits whichever comes first:

  • the maximum number of bytes that you told it to stop at, OR

  • an ending delimiter, if you provided one

It has some clever backtracking in case you know the end of the file and not the beginning, but this is not the place for an explanation of that (mostly because I didn’t use it myself). See the documentation.
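
To make the two-pass idea concrete, here is a toy sketch of header-based carving in Ruby. This is not Scalpel’s actual implementation (Scalpel streams the disk with bounded memory and is much faster); it’s just an illustration of the approach, run against a raw image made with dd:

#!/usr/bin/env ruby

# Toy illustration of header-based carving (not Scalpel's real algorithm).
# Usage: ./carve.rb disk.img   (e.g., an image made with dd)
header    = "This is the personal"   # start marker to search for
max_bytes = 100_000                  # how much to carve after each hit

# Reading everything into memory is fine for a toy, not for a whole disk
data = File.open(ARGV[0], "rb") { |f| f.read }

# Pass 1: record the offset of every occurrence of the header
offsets = []
pos = 0
while (pos = data.index(header, pos))
  offsets << pos
  pos += 1
end

# Pass 2: carve max_bytes starting at each recorded offset
offsets.each_with_index do |offset, i|
  File.open("carved-#{i}.txt", "wb") { |out| out.write(data[offset, max_bytes]) }
end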

Anyway, I knew a line near the very beginning of the file, so I figured this would be a good place to start. One nice thing about this file was that it had date/time stamps throughout in the format “YYYYMMDD - HHmm” on lines by themselves. That way, I could easily see which portions of the file I had recovered and make sure that I didn’t miss anything.

Scalpel doesn’t search for anything at first. You need to specify what to look for in a configuration file. The global configuration file (fine for now) is at /etc/scalpel/scalpel.conf. The line below is one that was given as an example in the configuration file.

#       txt     y       100000  -----BEGIN\040PGP

I uncommented this line and let Scalpel do its work with:

$ sudo scalpel /dev/sda1

It printed a status bar, and after an hour or so dumped the results to a subdirectory of the current working directory.
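
As an aside, you don’t have to edit the global configuration or accept the default output location; as I recall, Scalpel takes flags for both (double-check with scalpel -h):

$ sudo scalpel -c ./myscalpel.conf -o ./recovered /dev/sda1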

I looked at the files it found, and they all started with “-----BEGIN”, so they were basically PGP keys. This wasn’t what I wanted (I thought it would return all text files). It was a useful experiment though, as I was able to refine the query to:

        txt     y       100000  This\sis\sthe\spersonal

The “txt” was not actually important; it could just as well have been:

        foo     y       100000  This\sis\sthe\spersonal

The first field is basically just the file extension you want to put on the recovered fragments. (Many of them would be detected as binary files, which was not a problem.) The second field, as best I can tell, is a flag for whether the pattern match is case sensitive. The third is the number of bytes to carve after a starting point is found. When you are in recovery mode, this should be about as big as you think the file is, so that you don’t exceed the space you have for writing the output files. That happened to me once, and I needed to make the size smaller. No files were output, so it was a waste of an hour of Scalpel time.
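
Annotated, the line breaks down like this (my reading of the second column is hedged as noted above, so verify against the comments in your copy of scalpel.conf):

#  extension   case-sensitive?   max bytes to carve   start pattern
   txt         y                 100000               This\sis\sthe\spersonal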

I ran the command again, and it output a bunch of files. I then took all of the files, found the ones with the most useful fragments, and compiled them by hand. Obviously this was made a lot easier by having time/date stamps all through the file. Another thing that made my recovery easier was that Vim had written swap files all over the place, and I had typically deleted them upon finding them. So old copies were on the hard disk, just not accessible. If there had been only one version of the file, it might have been tougher, but still doable.

I made it look pretty simple here, but it took a while in aggregate due to Scalpel’s long running times and therefore slow feedback cycle. One tip: you can email yourself when the command completes:

$ sudo scalpel /dev/sda1 && (echo "done" | mail -s "done" your@email.com)

smtp.rb:14: [BUG] Segmentation fault

I just fixed a problem that I was running into on my Mac development machine. Things were running fine in my production environment (Heroku), but when I tried to send mail locally, I got the following error:

> rails c        
Loading development environment (Rails 3.0.8)
ree-1.8.7-head :001 > MyMailer.daily_email.deliver
~/.rvm/gems/ree-1.8.7-head@mygemset/gems/mail-2.2.19/
  lib/mail/core_extensions/smtp.rb:14: [BUG] Segmentation fault
ruby 1.8.7 (2011-02-18 patchlevel 334) [i686-darwin10.7.0]
zsh: abort      rails c

Hmm. I indeed use SMTP to send mail, but nothing too crazy. Plus it worked in production, and up until recently it was working locally. For more background, I was also using the Sendgrid plugin.

The first thing I thought of was googling for the error message that I found (“mail-2.2.19/lib/mail/core_extensions/smtp.rb:14”). This produced some useful links, but nothing that fixed the problem.

One option was to disregard the error, but I was trying to fix some email layout issues and it would have lengthened my development feedback cycle considerably (push to production, then test manually.)

Next, I thought of upgrading the mail gem, because perhaps there was a patch that fixed the problem I ran into. However, I was using the ActionMailer gem, which pins mail to a version range that is the same in the version of Rails I was using (3.0.8) and the latest (3.0.9). So this was not a viable solution path. I took a look at the offending Net::SMTP line (which I probably should have done a bit earlier) and there was nothing obviously wrong with it. The whole file looked like:

module Net
  class SMTP
    # This is a backport of r30294 from ruby trunk because of a bug in net/smtp.
    # http://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=rev&revision=30294
    #
    # Fixed in what will be Ruby 1.9.3 - tlsconnect also does not exist in some early versions of ruby
    remove_method :tlsconnect if defined?(Net::SMTP.new.tlsconnect)

    def tlsconnect(s)
      verified = false
      s = OpenSSL::SSL::SSLSocket.new s, @ssl_context
      logging "TLS connection started"
      s.sync_close = true
      s.connect                 # <= smtp.rb:14, the crash line
      if @ssl_context.verify_mode != OpenSSL::SSL::VERIFY_NONE
        s.post_connection_check(@address)
      end
      verified = true
      s
    ensure
      s.close unless verified
    end
  end
end

I then made a more general search for “ruby net/smtp segmentation fault” and about halfway down the page ran into this helpful post. It’s pretty lengthy, and the bug submitter is a champ for following through and providing as much detail as he did. I feel like I understand the whole Ruby / OpenSSL situation a lot better after reading through it. Plus, I figured out where Ruby crash logs live on the machine. I had run into this problem before, but couldn’t figure out a good solution, and then it went away inexplicably (not much fodder for a writeup).

Anyway, a comment toward the end of that post led me in the right direction. I ended up adding the following to my .zshrc for the Mac platform after testing the export manually.
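
It was a RUBYOPT export along these lines (I’m reconstructing the exact value from memory, so check the linked post for the specifics):

export RUBYOPT="-ropenssl"   # assumed value: preload OpenSSL so the right library is in place before net/smtp loads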

This fixed the problem, and mail went through as expected in the development environment. My understanding is that Ruby was using a buggy version of OpenSSL, and this points it to use the correct version.

For more info on how RUBYOPT works, check out this post.

The State of Ruby and Testing

At the May 2011 Indy.rb meetup, I suggested creating a survey to figure out what versions of Ruby people were using, what testing stacks they were on, and what they would like to use. I created this survey and tweeted it out, and was impressed with the results! Over a hundred people filled out the survey, from several continents and numerous countries. Thanks to everyone who participated!

The questions and their results

  • What versions of Ruby have you ever tried out?
  • What versions of Ruby do you currently use in production or for real apps?
  • What testing frameworks are your active projects using?
  • If you were starting a new Rails project right now, what testing frameworks would you use?
  • What mocking/stubbing frameworks are your active projects using?
  • If you were starting a new Rails project right now, what mocking/stubbing frameworks would you use?
  • What do your active projects use to populate testing data?
  • If you were starting a new Rails project right now, what would you use for populating testing data?

What versions of Ruby have you ever tried out?

Summary: a wide variety of Ruby versions used. What the heck is kiji, you might ask? This was a useful post on kiji.


What versions of Ruby do you currently use in production or for real apps?

Summary: Mostly 1.8.7 and 1.9.2 in production use right now. REE is production-ready.


What testing frameworks are your active projects using?


If you were starting a new Rails project right now, what testing frameworks would you use?

Conclusion: Expect to see MiniTest in more production apps in the future.


What mocking/stubbing frameworks are your active projects using?


If you were starting a new Rails project right now, what mocking/stubbing frameworks would you use?

Conclusion: RSpec mocks are here to stay.


What do your active projects use to populate testing data?


If you were starting a new Rails project right now, what would you use for populating testing data?

Conclusion: Factory Girl and Machinist are going to remain popular. Rails fixtures are hanging around.


The Data Source

Here is the spreadsheet of results (with contact information and bad rows stripped). I didn’t spend much time making killer visualizations, and there are some great comments in there; I highly recommend looking through it to get a feel for what people are using. Someone else could probably create a correlation table of common tool sets, for example showing that people who use RSpec commonly use a certain mocking framework. There is also geographic info, so we can see which region is on the cutting edge… :)

Edit: A note on the charts. People could respond with multiple answers, and I simply tallied the answers for each category. For example, if 50 people said they used 1.8.6 and 1.8.7, and another 50 people said just 1.8.7, then 1.8.7 would get 2/3 of the chart (100 “votes”) and 1.8.6 would get 1/3 (50 “votes”). The graphs could have been clearer; feel free to create a better visualization with the data. Thanks to the Hacker News commenters for bringing this up.
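
To make the tallying concrete, here is a quick Ruby sketch of that example (toy data, matching the numbers above):

# 50 people answered "1.8.6, 1.8.7" and 50 answered just "1.8.7";
# each mention counts as one vote
responses = ["1.8.6, 1.8.7"] * 50 + ["1.8.7"] * 50

votes = Hash.new(0)
responses.each { |r| r.split(", ").each { |v| votes[v] += 1 } }

total = votes.values.reduce(:+).to_f
votes.each { |version, n| puts "#{version}: #{n} votes (#{(100 * n / total).round}%)" }
# => 1.8.6: 50 votes (33%)
#    1.8.7: 100 votes (67%)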

Logistics

I took the data, and filtered it with this script:

#!/usr/bin/env ruby

# Collect every comma-separated answer from the input file
types = []
File.open(ARGV[0]).each do |line|
  line.strip!
  types += line.split(", ") unless line.length == 0
end

# Tally the answers, ignoring case
counts = Hash.new(0)
types.each do |type|
  counts[type.downcase] += 1
end

# Print tab-separated "answer<TAB>count" pairs for pasting into a spreadsheet
counts.each do |key, value|
  puts "#{key}\t#{value}"
end
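
Assuming the answers for one question are saved to a file with one respondent’s comma-separated answers per line, and the script is saved as tally.rb (the names here are arbitrary), running it looks like:

$ ./tally.rb ruby_versions.txt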

Then I put the output in an OpenOffice spreadsheet, cleaned up the data a bit, and sorted by count, descending. I probably should have outsourced this task, as it took a while to get things looking right. Getting the chart images out of OpenOffice was terrible, though: I had to copy each chart to Draw, then export the selection to an image file…

What did I miss?

Is your favorite testing framework not represented here? I left out some frameworks that only one person mentioned. Check the spreadsheet for all of the data and some great comments. If you want to fill out the survey after having read this, go ahead and do so here. I might update this blog post at some point in the future or create a second one with updates…

What else could I have done, and do you want to be notified when future surveys take place? I’d imagine something in my survey process could have been improved. Leave a comment or email me at panozzaj@gmail.com!

Ruby Filter Script

It’s pretty easy to use a Ruby script as part of a Linux or Unix pipeline to filter the output of another script or set of commands. You can just use something like the following:

#!/usr/bin/env ruby

# Read whatever is piped in on standard input, one line at a time,
# and transform each line before printing it
while line = STDIN.gets
  puts "filtered: #{line}"
end

The STDIN.gets is the magic: it reads the output of the preceding commands one line at a time, returning nil when the input runs out. Then you can make the script executable (chmod +x filter.rb) and run it. If this script is named filter.rb, then you could run something like:

$ cat log.txt | ./filter.rb

If your log.txt file looks like:

line 1
line 2
line three

Then the results of this program will be:

filtered: line 1
filtered: line 2
filtered: line three

I wrote this up because I regularly search for this and it has been hard to find exactly what I’m looking for, so hopefully it will be easier the next time I search. :)