August 6, 2014
Playing around with Ansible

Configuration Management is all the rage, certainly makes bootstrapping machines easier.

In a previous life I used Puppet and came to hate it. Seemed like bringing a dump truck to a knife fight. Also dealing with the whole puppetmaster became a fiasco.

I’ve been hearing about Ansible and thought I would give it a spin. I like the idea of driving the operations from a machine (what I call “push”), which is in contrast to the Puppet flow where each node “pulls” down its config.

Anyways, I’ve started a repository which contains various playbooks I used to get started.
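For a flavor of what those playbooks look like, here's a minimal one (host group and package names are illustrative, and the syntax matches Ansible of that era; run it from the controlling machine with `ansible-playbook -i hosts site.yml`):

```yaml
# site.yml - a small "push" playbook run from the control machine
- hosts: webservers
  sudo: yes
  tasks:
    - name: Install nginx
      apt: name=nginx state=present update_cache=yes

    - name: Ensure nginx is running
      service: name=nginx state=started enabled=yes
```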

June 7, 2014
Rails & Quickbooks Web Connector (QWC)

Recently I have needed to integrate a Quickbooks Desktop instance with my Rails app. 

As of Summer 2013 Intuit has deprecated the REST API whereby a client can communicate with a Quickbooks Desktop instance via REST. This leaves either the QWC route or straight-up Windows COM programming. I’d rather hit my thumbs with a 10 lb. sledge than use COM. So QWC it is!

The QWC approach uses SOAP - which stands for Slippery Obtuse Alligator Playtime. Which is a perfectly apt description of the byzantine SOAP protocol.

Fortunately the WashOut gem does most of the heavy lifting. The hardest part was properly configuring the WashOut endpoint. Check out this Gist which contains the pertinent parts:
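I can't inline the Gist here, but the shape of a WashOut endpoint for QBWC looks roughly like this. This is a sketch from memory: check the WashOut README for the exact options in your version, `build_qbxml_request` is a hypothetical helper, and a real QBWC endpoint needs several more actions (receiveResponseXML, closeConnection, etc.):

```ruby
# app/controllers/qbwc_controller.rb (sketch only)
class QbwcController < ApplicationController
  include WashOut::SOAP

  # QBWC calls authenticate first with the credentials from the .qwc file.
  soap_action 'authenticate',
              args:   { strUserName: :string, strPassword: :string },
              return: { authenticate: [:string] }
  def authenticate
    # A session ticket plus '' (work pending) or 'none' (nothing to do).
    render soap: { authenticate: [SecureRandom.uuid, ''] }
  end

  # QBWC then asks for the qbXML to run against the Quickbooks instance.
  soap_action 'sendRequestXML',
              args:   { ticket: :string },
              return: { sendRequestXMLResult: :string }
  def sendRequestXML
    render soap: { sendRequestXMLResult: build_qbxml_request }
  end
end

# config/routes.rb
# wash_out :qbwc
```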

In my case I have Quickbooks Pro 2013 running in a VMWare instance on an OS X host, and my Rails app runs on that same OS X host. Thus we need a publicly accessible URL for the AppUrl in the QWC config: in other words, my Rails app in development exposed to the World Wide Web.

I have found the ngrok service to be fantastic for this. I run my Rails app locally on port 3000, then run `ngrok 3000`. I am given an https URL which I can then use in the `.qwc` file.
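For reference, a skeleton `.qwc` file; every value here is illustrative. The GUIDs, username, and URLs must match your own setup, and the AppURL path depends on how your SOAP endpoint is routed:

```xml
<?xml version="1.0"?>
<QBWCXML>
  <AppName>My Rails Integration</AppName>
  <AppID></AppID>
  <!-- The ngrok https URL in front of the local Rails app -->
  <AppURL>https://abc123.ngrok.com/qbwc/action</AppURL>
  <AppDescription>Syncs data with my Rails app</AppDescription>
  <AppSupport>https://abc123.ngrok.com/support</AppSupport>
  <UserName>qbwc_user</UserName>
  <!-- Any unique GUIDs of your choosing -->
  <OwnerID>{90A44FB5-0000-0000-0000-000000000001}</OwnerID>
  <FileID>{90A44FB5-0000-0000-0000-000000000002}</FileID>
  <QBType>QBFS</QBType>
  <Scheduler>
    <RunEveryNMinutes>5</RunEveryNMinutes>
  </Scheduler>
</QBWCXML>
```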

May 25, 2012
Saaspose: Cloud based document conversion & recognition

Appears to be a versatile service for barcode recognition and OCR. This could come in very handy.

May 17, 2012
Protip: Store file imports for better support / debugging

If you run a service which allows one to import data, say by uploading a CSV file, then do yourself a favor and store the original file in a place where you can access it later for support or debugging.

Our service allows a person to upload a CSV file containing postal addresses and it has about 8 columns that need to be in a certain order.

Before our code starts to actually process the file we shove it up to S3 in its original form. Later, if an error is encountered during processing, a support person can look at the original file and spot any obvious problems. Maybe you expect a CSV file but get an Excel file (or an Apple Numbers file)? Maybe the customer wasn't paying attention and totally re-arranged the columns. Sure, most of these “soft” errors (such as CSV vs Excel) can be caught pretty easily by inspecting the file extension, but as a whole you are guaranteed to run into trickier situations.
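The flow can be sketched like this. The names are my own, and the S3 put is swapped for a local copy so the sketch runs anywhere:

```ruby
require 'fileutils'

# Build a key that makes the file easy to find later: account + timestamp.
def archive_key(account_id, original_filename)
  stamp = Time.now.utc.strftime('%Y%m%d%H%M%S')
  "imports/#{account_id}/#{stamp}-#{original_filename}"
end

# Archive the raw upload BEFORE any parsing happens. In production this
# would be an S3 put; a local copy stands in here so the sketch is runnable.
def archive_upload(path, account_id, archive_root = '/tmp/import-archive')
  dest = File.join(archive_root, archive_key(account_id, File.basename(path)))
  FileUtils.mkdir_p(File.dirname(dest))
  FileUtils.cp(path, dest)
  dest
end
```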

Either way, having the original file on hand for debugging is super helpful. Much better than emailing the customer and asking them to re-send you the file. In most cases we can be pro-active and email the customer back and say “Hi, I saw that you just attempted a data import and it failed, it looks like it was due to ….”

April 30, 2012
Postgres search: normalizing accented characters

As part of my move from Sphinx to Postgres Full-Text Search I needed a way to normalize accented characters. My data contains lots of diacritics, a common example is the varietal name “Grüner Veltliner”.

My users do not want to enter that Umlaut each time they want to search for this varietal.

Fortunately there is an awesome Postgres contribution package called “unaccent” which replaces diacritics with their plain text equivalent, effectively normalizing the data set.

Using the package is pretty straight-forward. We first install the extension and then create a new text search configuration and ensure that we use it in all indexing and searching.
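Roughly like so; I'm calling the new configuration `en` here, pick whatever name you like:

```sql
-- Install the extension (Postgres 9.1+; earlier versions run the
-- contrib/unaccent SQL script instead).
CREATE EXTENSION unaccent;

-- New configuration based on 'english', folding accents before stemming.
CREATE TEXT SEARCH CONFIGURATION en ( COPY = english );

ALTER TEXT SEARCH CONFIGURATION en
  ALTER MAPPING FOR hword, hword_part, word
  WITH unaccent, english_stem;
```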

When searching make sure we reference the new configuration instead of the default ‘english’:
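Assuming the configuration is named `en` as above, a query like this matches with or without the umlaut (table name illustrative):

```sql
SELECT name
FROM wines
WHERE to_tsvector('en', name) @@ plainto_tsquery('en', 'gruner veltliner');
```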

April 26, 2012
Sidekiq: alternative to Resque for background processing

Mike Perham of Dalli fame has written an awesome library called Sidekiq which enables background processing via threads.

Resque is a popular background processing solution and it's what we use at Batch. It's battle tested and for the most part has stood up well.

However, it feels heavyweight because each Resque worker is a distinct process. Which is where Sidekiq comes in: it uses a pool of threads to process jobs, so you have at most one master process.

For smaller apps on constrained systems (read: VPS) I feel this can be a better solution. I’ve been using Sidekiq for a couple of weeks now and I really like it. Sidekiq is Resque compatible so you can write jobs with existing Resque code and then process them using new Sidekiq workers. But in my case I just went straight Sidekiq.

One major change is that in Resque your workers must implement perform as a class method, but in Sidekiq perform must be an instance method.
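The difference in a nutshell (plain Ruby classes; in a real app the Sidekiq worker would also `include Sidekiq::Worker`, omitted so the sketch runs standalone):

```ruby
# Resque style: perform is a class method.
class ResqueWorker
  @queue = :default

  def self.perform(user_id)
    "resque processed #{user_id}"
  end
end

# Sidekiq style: perform is an instance method.
class SidekiqWorker
  def perform(user_id)
    "sidekiq processed #{user_id}"
  end
end
```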

So far I am a fan of Sidekiq. Give it a shot!

April 24, 2012
Switched from Sphinx to Postgres Full Text

I’ve recently migrated two Rails projects from Sphinx search to Postgres Full Text Search. Mainly because the applications were small and I didn’t see the benefit of running another service, hence another point of failure.

In both cases the number of documents and the level of search activity were not very high, so it was not a question of load on the Postgres server.

In one of the applications searchable content has to be filtered for security and just treating Sphinx as a dumb search content repository was annoying. By moving it all into Postgres I can have the actual fuzzy text searching and the necessary security checks (by checking other tables) all in one place.

My general architecture / flow is:

* For each searchable table: add a search_content column of type tsvector

* Create a GIN index on the search_content column

* If the columns can be indexed as-is and we don't need any other searchable columns then we can use the native tsvector_update_trigger trigger to update the search index:
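Sketched out, assuming a contacts table with a couple of name columns (adjust to your schema):

```sql
ALTER TABLE contacts ADD COLUMN search_content tsvector;

CREATE INDEX contacts_search_idx ON contacts USING gin(search_content);

-- Keeps search_content current on INSERT/UPDATE from the named columns.
CREATE TRIGGER contacts_search_update
  BEFORE INSERT OR UPDATE ON contacts
  FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger(search_content, 'pg_catalog.english',
                            first_name, last_name);
```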

However, if the search content requires data from other tables then we need to write a manual trigger function. Don't forget to use COALESCE if any of your searchable columns can be NULL. If you forget this and allow NULLs to creep in then the whole tsvector becomes NULL and you'll wonder why you have no searchable content.
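A manual trigger function along these lines (table and column names are illustrative):

```sql
CREATE FUNCTION contacts_search_trigger() RETURNS trigger AS $$
BEGIN
  -- COALESCE each column: one NULL would otherwise NULL the whole vector.
  NEW.search_content :=
    to_tsvector('english',
      COALESCE(NEW.first_name, '') || ' ' ||
      COALESCE(NEW.last_name,  '') || ' ' ||
      COALESCE(NEW.notes,      ''));
  RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER contacts_search_update
  BEFORE INSERT OR UPDATE ON contacts
  FOR EACH ROW EXECUTE PROCEDURE contacts_search_trigger();
```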

Now we need to seed the initial search index. In this case I just leveraged the touch method in ActiveRecord.

Contact.all.each { |c| c.touch }

Finally, performing searches is pretty straight-forward. I wish there were a way to avoid having Postgres parse and construct the query structure twice, but for these projects it's a negligible cost.
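A typical search looks like this, with the tsquery built once in the WHERE clause and again for ranking (table and column names illustrative):

```sql
SELECT id, first_name, last_name
FROM contacts
WHERE search_content @@ plainto_tsquery('english', 'smith')
ORDER BY ts_rank(search_content, plainto_tsquery('english', 'smith')) DESC
LIMIT 20;
```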

March 9, 2012
Quickbooks + Ruby

Recently I needed to integrate my Rails app with Quickbooks. Looking around on GitHub I didn’t find any fresh libraries. 

Intuit has their Data Services API (currently at v2) which exposes a REST API to their customer data. It's a pretty clean API. It's annoying that it uses XML, but Intuit is hard at work on v3 which will support JSON, so that's nice.

But anyways, I didn’t find any good libraries so I decided to roll up my sleeves and write my own: Quickeebooks

It supports reading and writing of basic objects using the v2 API. 

I think it’s pretty easy to use, you work with plain Ruby objects which get marshaled to XML and back. 

Give it a shot and let me know what you think.

March 2, 2012
Bundler install error: ArgumentError: invalid byte sequence in US-ASCII

I upgraded my Rails app to 3.2.2 and was doing a deploy via Capistrano when, during the bundler install step, the deploy bombed out with the following error:

** [out] ArgumentError: invalid byte sequence in US-ASCII
** [out] An error occured while installing will_paginate (3.0.3), and Bundler cannot continue.
** [out] Make sure that `gem install will_paginate -v '3.0.3'` succeeds before bundling.

I think the reference to will_paginate is really a red herring. Anyways, the solution is to properly set the default environment in Capistrano, which gets passed to the environment that Capistrano and Bundler use during the deploy.

Specifically you need to set the LANG attribute.

Add this to your Capfile. Note that the PATH is not strictly necessary, but I had that block in there already.
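Something like this, assuming Capistrano 2; the exact PATH value reflects my setup, adjust or drop it:

```ruby
# Capfile - make sure the remote shell gets a UTF-8 locale
set :default_environment, {
  'LANG' => 'en_US.UTF-8',
  'PATH' => '/usr/local/bin:/usr/bin:/bin:$PATH'
}
```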

February 21, 2012
Writing a basic image filter in Android using NDK

I needed to implement some image filters on bitmaps for my Android project. I first attempted the filters in pure Java but they turned out to be too slow and consumed too much memory.

Bitmap handling in Android has always been a pain point and there is too much memory overhead.

To get access to the underlying pixel data from a Bitmap in Java you use Bitmap.getPixels(), which copies the pixels into an int array of packed ARGB color values. Doing the per-pixel unpacking and arithmetic in Java just carries too much overhead.

So the solution is to go native C.

For a proof of concept I wanted to try implementing a filter in native C using the Android Native Development Kit (NDK). My test case adjusts a bitmap's brightness level.

The complete source is available on GitHub:

First off you will need to download the Android NDK and place the directory somewhere handy, I put mine at /AndroidSDK/android-ndk-r7b alongside my standard Android SDK.

In your Activity you will need to declare a method using the native keyword. This tells Android that the implementation for this method is in a C library, which we will generate shortly.

In the jni folder you will need your C source files (I have just a single file) and an Android.mk makefile template.

Compile the library using the ndk-build command in the NDK:
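With the NDK at the path above, from the project root:

```shell
cd ~/projects/imageprocessing   # your project root
/AndroidSDK/android-ndk-r7b/ndk-build
```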

If all goes well then it should compile cleanly and leave you with a shared object.

Crack open jni/imageprocessing.c and let's have a look. Notice that the method name we call is
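something like the following (the package and class names here are hypothetical; the real name mirrors your own package as Java_<package with underscores>_<Activity>_<method>):

```c
JNIEXPORT void JNICALL
Java_com_example_imageprocessing_ImageActivity_brightness(
    JNIEnv *env, jobject obj, jobject bitmap, jfloat brightness);
```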


This is the complete Java package name plus the method that we annotated with the native keyword in the Activity. The first argument to this method is the JNI environment, which Android supplies for us. Any additional arguments are the user-supplied ones from the Java signature. Looking at the signature in Java:

public native void brightness(Bitmap bmp, float brightness);

We pass in the Bitmap and the brightness value.

Looking back at the Java Activity notice that C receives the Bitmap and writes back the modified pixels to the same object.

The Android NDK gives us some basic bitmap functions that we need to use to get at the actual pixel data:
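The relevant calls live in `<android/bitmap.h>` and are used in a lock/unlock pattern (fragment; `env` and `bitmap` come from the JNI function arguments):

```c
#include <android/bitmap.h>

AndroidBitmapInfo info;
void *pixels;

AndroidBitmap_getInfo(env, bitmap, &info);       /* width, height, stride, format */
AndroidBitmap_lockPixels(env, bitmap, &pixels);  /* pin the pixel buffer */
/* ... read and write pixels here ... */
AndroidBitmap_unlockPixels(env, bitmap);         /* always balance the lock */
```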


It's important that any call to AndroidBitmap_lockPixels is complemented with an AndroidBitmap_unlockPixels when you are done.

In the core filter method brightness the logic is the following:

1) Iterate over each row of pixel data.

2) After we grab the whole row we iterate over each column.

3) The pixel data is a packed integer which contains the actual RGB values

4) Since we need to manipulate each RGB value we extract the components out

5) We multiply by the brightness factor and then constrain it to a value between 0 and 255

6) Then finally we write the modified pixels back into the structure.

Some before and after screenshots: