Friday, December 19th, 2008

Groovy and Python for Quick Scripting Wins

By Avi Flax

I recently had a need to convert a PDF file into a Base64-encoded string, quickly — as in, within the next few minutes.

I had recently been writing some scripts in Python, both to accomplish some actual work, and to teach myself Python. I like a lot about Python, and the “batteries included” nature of its standard library makes it great for scripting.

So I launched TextMate, started typing a script, and, after a few quick Google searches, whipped this up:

import binascii
# Binary modes: a PDF is not text, and the encoded result is plain ASCII bytes
sourceFile = open('documents.pdf', 'rb')
targetFile = open('output.txt', 'wb')
targetFile.write(binascii.b2a_base64(sourceFile.read()))

That worked out really well. A couple of lines of code, a few quick searches, and after a few minutes, I’m done. Nice!

The next day, it occurred to me that I probably could have just as easily accomplished my task with Groovy, another one of my favorite languages/platforms. Like Python, Groovy is intended to support scripting and application development equally well, so its out-of-the-box capabilities are quite extensive.

Another couple of quick Google searches, a new TextMate window, and I had a Groovy script which does the same exact thing as the Python script:

sourceFile = new File("documents.pdf")
targetFile = new File("output.txt")
targetFile.write(sourceFile.readBytes().encodeBase64().toString())

Some observations:

  • The Groovy version is a little higher-level than the Python code. I kinda like that, particularly for scripting.
  • The Groovy version needs one fewer line, since there is nothing to import. That’s nice, but no big deal.
  • Both scripts could have done the work in a single line instead of creating the variables sourceFile and targetFile. But I think using those variables made the scripts easier to write, and makes them easier to read as well.
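
On the "higher-level" point, Python's standard library also ships a base64 module that sits a step above binascii. The following is only a sketch of how the same task might look with it: the function name encode_file_to_base64 is made up for illustration, and the file names are the ones used in the scripts above. Unlike binascii.b2a_base64, base64.b64encode does not append a trailing newline to its result.

```python
import base64


def encode_file_to_base64(source_path, target_path):
    """Read source_path as raw bytes and write its Base64 encoding to target_path.

    Hypothetical helper for illustration; not part of any library.
    """
    # 'rb' matters: a PDF is binary data, so it must not be read in text mode.
    with open(source_path, "rb") as source_file:
        encoded = base64.b64encode(source_file.read())
    # b64encode returns ASCII-only bytes, so the target is opened in binary mode too.
    with open(target_path, "wb") as target_file:
        target_file.write(encoded)
    return encoded
```

Calling encode_file_to_base64('documents.pdf', 'output.txt') would then do the same job as the scripts above, with the with blocks closing both files even if something fails partway through.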

And that’s it for today. I hope this is of some value to readers interested in Python, Groovy, or scripting.

8 Responses

  1. 12/19/2008
    Chris Dary Said:

    This little terminal one liner to encode/decode is also pretty neat:
    http://www.macosxhints.com/article.php?story=20030721010526390
    openssl also supports a lot of different encodings/digests (man dgst and man enc for more)

  2. 12/19/2008
    Avi Flax Said:

    Good tip Chris! Thanks!
    It’s also possible to run Python and Groovy scripts directly from the command line:
    python -c "import binascii;open('output.txt', 'wb').write(binascii.b2a_base64(open('documents.pdf', 'rb').read()))"
    groovy -e 'new File("output.txt").write(new File("documents.pdf").readBytes().encodeBase64().toString())'

  3. 12/25/2008
    Matt Williams Said:

    Python and Groovy are nice, but for “in the next few minutes”, don’t forget your resident php devs!

  4. 12/25/2008
    Matt Williams Said:

    Apparently, php tags cause all following text to be stripped…
    file_put_contents(
    $argv[2],
    file_get_contents($argv[1])
    );

  5. 12/25/2008
    Matt Williams Said:

    Wow, don’t post while distracted by the television, kids :P
    One more try…
    file_put_contents(
    $argv[2],
    base64_encode(file_get_contents($argv[1]))
    );

  6. 12/25/2008
    Avi Flax Said:

    Thanks Matt! Good stuff.
    I think the only language from the Arc90 family that we’re missing at this point is Ruby… I wonder how long it’ll be before Nir or Dan post up a Ruby script for this purpose. And what’s after that? LOLcode?

  7. 12/25/2008
    Matt Williams Said:

    Couldn’t resist…
    Ruby script:
    require "base64"
    File.open(ARGV[0], 'rb') do |input|
      File.open(ARGV[1], 'w') do |output|
        output.write(Base64.encode64(input.read))
      end
    end
    Ruby command line:
    ruby -e 'print [IO.read(File.join(Dir.pwd, ARGV[0]))].pack("m")' input.pdf > output.txt
    No luck on the LOLcode…

  8. 1/15/2009
    Joel Potischman Said:

    Let’s not forget C#:
    File.WriteAllText("output.txt", Convert.ToBase64String(File.ReadAllBytes("input.pdf")));
