Departing from the Script: Allen organ project - a software card reader [UPDATED]

Earlier posts in this series:

Allen organ project - introduction

Allen organ project - reverse engineering the card format

First things first. I can live without alterable voices for a short while. But these IBM cards are getting yellowed and brittle. They're 35 years old, and it's not going to be many more years until they crumble to dust. How to get the data off them? Transcribing by hand, the way that I did the first couple is not going to work. I know myself, I'll make too many errors and get eyestrain. There has to be a better way.

I don't want to make any electical modifications to the organ until I know more about it. I don't want to find myself frying some key irreplaceable part. So cobbling some interface to the card reader on the organ is Right Out. Similarly, the handful of services that are still out there who will read card decks are getting Really Expen$ive - I don't want to go that way either. So, what do I have that will read cards?

How about my document scanner? It'll read anything. (No, I haven't scanned my nether regions. It's been a long time since I was a college kid, and about as long since I'd have thought that was funny.) But I have experimented with scanning photographic negatives (only mixed success), coins (too much reflection), and pressed flowers (left residue that was hard to scrub off the glass). Maybe I'll have better luck with IBM cards.

Immediately, I make the executive decision to put the card on the glass face up. All the printing on the card will add visual clutter and confuse any image recognition that I try to do. The back of the card is cleaner, albeit not nearly pristine. But let's give it a try. I also decide to try to give the card a clean, contrasting background. I riffle through my stash of coloured papers and pull out a sheet in robin's-egg blue that contrasts nicely with the yellowed paper of the card. I set the scanner for color at 150 dpi resolution, push the button, and out comes:

Not bad, there's plenty of contrast. There are also a lot of specks of schmutz on the card. Let's hope that doesn't make for a bad read.

Believe it or not, from here things ran pretty much in a straight line. Everything I tried worked once it was debugged. (Of course, I have had to do computer vision at several points in my career, so my educated guesses about what to try are usually pretty good.) Here's an outline of how the program took shape:

1. Segment the image.
I used the k-means algorithm to divide the pixels into three categories: cardboard, background and schmutz. To seed the clusters, I started with an initial guess that cardstock is yellow, background is cyan, and schmutz is black. The algorithm converged quickly (only about ten iterations), and resulted in a segmentation that looked quite usable:

(In the image, white is cardboard, black is background, grey is everything else.) Again, it looks usable; the vast majority of pixels are classified correctly.

2. Extract edges from the image.
Given the segmentation, extracting the edges simply means highlighting pixels that differ from their neighbours above or to the left. Easy.

3. Find straight lines among the edges.
We're trying to find a frame of reference for the card so that we can lay the grid of holes over it, despite the fact that the position on the scanner glass was uncertain. A first step is to find straight lines. To do this, I use an integral transform called the Hough transform.

The Hough transform works by shooting bundles of parallel rays through the image, at angles ranging from zero to π. It counts the number of pixels marked as being 'edges' on each ray.

Update: One colleague called me on the lack of mathematical rigour in this discussion. Another praised me for an immediately-graspable physical analogy. To the first, I say: I could have begun the discussion with a sentence like, 'The Hough transform is the integral,

.'
But I didn't."

The result is a plot of pixel counts, with the horizontal axis being the angle of the ray and the vertical axis being the distance of the ray from the centre of the image. The rays that go right down the edges of the card have a great many edge pixels on them, and show up as brilliant white spots in the plot in Hough space.

In this plot of a card in Hough space, you can see the angle of the short edge of the card as a vertical column of points near the center of the image. The short edges of the card are represented by brilliant white spots near the top and bottom. You can also see the spots representing other vertical lines on the card (vertical rows of holes) in the same column, giving us a way to measure hole spacing, since the vertical axis is measured in pixels. The long edges of the card fall in a column near the right-hand border, and once again you can see other lines (the card rows) of 'edge' pixels in the same column (and therefore representing parallel rays). These are the rows of the card, and again give us a convenient ruler for laying out our grid.

4. Lay out a grid parallel to the card axes.
Given the maxima in the Hough image, which represent the card edges, it's a simple matter of trigonometry to locate the card corners and create a grid to search for holes, given the known dimensions of an IBM card.

Update:A colleague asked me how robust this process is in the face of card misalignment. I took a near-worst-case misalignment (the card canted about 30° from the correct angle), and did a quick hack to the code to overlay the computed grid in red. It's not perfect, but that's mostly because the card edges are modelled as parallel lines - and are not quite parallel (or indeed, quite straight) in real life.

5. Count up how many 'background' pixels are in each hole location, and threshold on the background-to-total-pixel ratio.
This operation gives us the location of the holes. And it worked first try! (Well, once I fixed the typos in the dimensions of the card, it worked. The algorithm was fine, the numbers were wrong. Garbage in, garbage out.) As a little bit extra, I added a decoder for the Hollerith data at the right side of the card (Allen's part number, plus some number whose significance so far eludes me.) The resulting program prints out an ASCII picture of the card, and saves the 16 words of binary data.


         11111111112222222222333333333344444444445555555555666666666677777777778
12345678901234567890123456789012345678901234567890123456789012345678901234567890
--------------------------------------------------------------------------------
      8  8  8  8  8  8  8  8  8  8  8  8  8  8  8  8               20.0 S00R1305
                                                                     ]          
                                                                           ]    
              ]        ]        ]  ]     ]        ]        ]        ] ] ]]]   ] 
     ]     ]  ]  ]     ]           ]           ]  ]        ]                ]   
     ]  ]  ]        ]  ]        ]        ]     ]           ]       ]    ]       
           ]     ]     ]  ]     ]     ]        ]  ]        ]         ]       ]  
           ]           ]     ]  ]  ]        ]     ]        ]                    
     ]           ]     ]              ]  ]        ]        ]                   ]
     ]           ]     ]        ]     ]           ]        ]                    
                                                                                
      ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]                 ]          
     ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]  ]                        ]    
--------------------------------------------------------------------------------

Beautiful, it exactly matches the punches. Now to try recovering more cards!

Departing from the Script

Monday, July 6, 2009

Allen organ project - a software card reader [UPDATED]

No comments:

About Me

Blog Archive

Followers