Nolan's Dev Log: August 2010

Monday, August 30, 2010

The Best Tools

Recently at work, we’ve been discussing NetBeans as our “standard” development platform. Personally, I had switched off of it several months ago to use TextMate, but I’ll get to that in a little bit.

What prompted the switch from NetBeans was all the little papercuts it had - long start up time, poor support for Python compared to other languages, weird syntax highlighting (that completely breaks when using something like Jinja templates), and a bunch of other things. So we started looking around for some alternatives.

Our two main criteria were that it must be cross platform, and it must have Subversion support.

Before I had started here, they had tried Eclipse and didn’t like it (I don’t know their reasoning) and moved to NetBeans. Now, because of the quirks listed above, and the fact that Oracle’s future support for it is questionable, we’re looking around. So far the only proposal has been Komodo, which I’m trying out before I can say too much about it.

My main issue, though, is that this ignores the point of the tools.

Any tools you use should be used because they allow you to be more productive. That’s really at the heart of the issue. If you’re more productive in Windows and Word, then you probably shouldn’t buy a Mac and get Pages.

For myself, the tools that are the best are the ones that don’t get in my way. TextMate has all the power of the IDE, but it’s not thrown in my face and put in my way. It’s extensible through bundles, supports tons of languages out of the box, and is just all around nice to work in. Hell, I’m writing in it right now.

But, it’s not cross platform, which I do admit can suck sometimes, but not enough for me to drop it entirely. Instead, I’ve made sure I learned enough of VIM to be able to survive on a machine without OS X. No, it’s not VI, but I had to draw the line somewhere.

Same goes for the command prompt - I like using that with the other tools (Subversion, Mercurial, etc) because it stays out of my way. Yes, remembering the commands is a little arcane, but it’s faster than the GUI for me and I can usually get exactly what I want done with minimal effort.

Finally, I’ve been playing around with Mercurial at home instead of with Subversion like at work. Subversion’s branching mechanisms are a major headache, and merging from one branch into another is a time-consuming, manual process. While I have yet to do much with branching and merging in Mercurial, I’ve already found instances where it’s better than Subversion - like monitoring for commits, updates, and changes in the entire code branch, not just the directory you happen to be in.

Ultimately, though, this is all a lot of masturbatory talk - if the tools aren’t being used to make something, all those features and comparisons don’t matter.

Thursday, August 26, 2010

Ruby vs Python

Alex Martelli, the guy who spearheaded the Python Cookbook, has a nice essay on Python vs Ruby.

It hits on a lot of the things I’ve noticed between the two languages. They’re very similar, far more similar than you’d think at first glance; both allow a great degree of metaprogramming (though admittedly, Ruby’s is far more prevalent and pervasive), both allow people to put things into production with pretty straightforward syntax, and both kick the crap out of Perl.

The biggest differences that Martelli highlights are things I’ve noticed, too - there are little syntax nits, but the big issues are the level of metaprogramming allowed, and iterators/codeblocks vs iterators/generators (and I’d throw decorators in there, too). Ruby allows you to open anything up, and while that can be nice in some scenarios, it could be a nightmare in large scale projects. Python still lets you clobber builtins, but you can’t, for instance, open up the Python 3 str class and edit it, outside of subclassing.
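
To make the builtins point concrete, here is a minimal sketch, assuming CPython 3. The names shout, LoudStr, and clobber_demo are made up for illustration. Assigning a new attribute onto the built-in str type raises TypeError, a subclass can carry the extra method instead, and shadowing a builtin name is still allowed.

```python
def shout(self):
    """Hypothetical extra method we would like strings to have."""
    return self.upper() + "!"


# Trying to "open up" the built-in str type fails in CPython.
try:
    str.shout = shout
    patched = True
except TypeError:
    patched = False


# Subclassing is the supported route.
class LoudStr(str):
    shout = shout


def clobber_demo():
    # A local name shadows the builtin len; Python does not object.
    len = lambda obj: 42
    return len("abc")
```

Calling LoudStr("hi").shout() gives "HI!", while the real len is untouched outside of clobber_demo.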

It’s far more common in Ruby to crack open a class and do stuff with it; Rails is fundamentally built on that. Python lets you do that, but I just don’t run into it as much in actual code. There’s some, but it’s closer to the straight-forward C-style stuff. Perhaps it’s the projects and code I’ve encountered, I dunno. But, for contrast, Pylons and Django are far more comfortable with requiring the programmer to use them as libraries, rather than the DSL-ness of Rails.
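
For comparison, this is roughly what cracking open a class looks like in Python. It's only a sketch, and Account and greeting are hypothetical names, not taken from any framework.

```python
class Account:
    def __init__(self, owner):
        self.owner = owner


# An instance created before the class is modified.
acct = Account("nolan")


def greeting(self):
    return "Hello, " + self.owner


# "Reopen" the class after the fact by attaching a new method.
Account.greeting = greeting

# Existing instances pick up the new method too.
message = acct.greeting()
```

It works fine for user-defined classes; it's just not something I see done much in the Python code I read.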

Ultimately, between the two it’s really a matter of taste. Yes, that applies to all programming languages, but Python and Ruby are close enough to each other that the choice comes down to taste far more than it would when comparing either one to, say, Java or C.

All that said, I’ll continue to use and learn what I can about both. Sometimes, an idea can be more easily expressed in different syntax, and that’s ok.

Wednesday, August 25, 2010

RDD

I found an excellent piece by one of the founders of GitHub talking about Readme Driven Development.

It's an interesting take on "agile" and "test-driven development" methodologies; as a response to the waterfall methods they were great. But documentation accompanying them can often leave much to be desired. That's certainly the case with our "agile" projects at work. The software's done, the code is commented, and people are ready to move on.

Preston-Werner makes a good case for getting the Readme down. The file shouldn't be long and doesn't need a lot of detail, but it helps you think about the bigger picture first. It organizes your thoughts. Because of it, you have a better idea of what to focus on first, and how pieces of the program fit together.

His other comment about getting other people up and going on the project without having seen your code is valuable, too. I'm trying to be very diligent on getting any kind of design decisions or discussions into the project's Trac wiki for anything new that we do for this exact reason. When it's there, in writing, it's far easier to remember what was agreed on, what direction we should take, and offers a springboard for new discussions.

Tuesday, August 24, 2010

What to learn

A friend sent me an article today entitled 'tactics, tactics, tactics'. I loved it.

It's about stepping back and really looking at what you're trying to learn. Are you focusing on the tools (in the case of programming: design patterns, languages, source control, IDEs, and so on), or on tactics? Tactics are something transferable to any programming language, something that isn't a fad and that doesn't get in the way of actually becoming better.

The comparison of a "junior" programmer to a fledgling chess player is very astute, I think. Simply being able to code does give you a rush; hell, I remember looking around at all my coworkers when I began automating tasks at my old job and thinking "What the hell is wrong with these people? Why do they think this is so hard?" Admittedly, I sort of enjoyed the attention it got me, at least for a while, but eventually I started interacting with people who were beyond my skill level, and I found that I was only scratching the surface.

I think this was a good thing. It pushed me to learn more. However, I haven't always been learning the right things. Stressing over Ruby or Python, Pylons or Django, CouchDB or PostgreSQL: while those choices are important, they're not nearly as important as learning how to make loosely-coupled, highly cohesive classes. Or crafting an appropriate data solution to your application. Or understanding network communications and HTTP connections between clients and servers.

Ultimately, for me, I think this was a bit of a wake up call to get back to the stuff I get all fired up about when listening to Merlin Mann.

Get back to doing things and learning important things, not stressing over tools.

Monday, August 23, 2010

Oracle

Oracle's not done a whole lot to impress users since buying Sun. The latest jackass move was that they essentially slammed the door on OpenSolaris without saying a word. In response, the entire OpenSolaris Governing Board resigned.

I don't think this is good for anyone, even though OpenSolaris was comparatively small next to Linux or the BSDs. Having some of the Solaris tools, like DTrace and ZFS, out in the open and maintained in the open was great. FreeBSD especially was pretty gung ho about integrating these and providing support for them. I'm sure that this kind of behavior is seen as a threat to Solaris by its new keepers now, though.

Oracle seems to have gone on a killing spree since acquiring Sun, either by damaging the image of a product or by attacking former partners and users, either through their internal memos or things like the Google lawsuit.

We've even been hit directly at work; academic pricing on Sun hardware is now gone, and the Sun Grid Engine we used for scheduling the clusters is getting the open source axe, too.

In February, we began migrating any projects we could off of MySQL to PostgreSQL, simply because Oracle's intentions were such an unknown. Given what they've been doing to the portions of Sun that didn't conflict with their brainchild, I can only imagine that it will be a matter of time before they set about tearing apart MySQL.

Pretty sad that in the past year, Microsoft's gone from being the 'evil empire' to Oracle and Google competing for the crown.

Friday, August 20, 2010

Python 3 and WSGI

I like Python. I really do. It's clean, easy to read, quick to write, and is usually fun. It's been our primary choice when developing for the web at work, and in its current state it fits that role exceptionally well.

Well, kind of.

You see, the WSGI standard that most Python web frameworks and libraries adhere to right now doesn't work in Python 3.x. Why? It has to do with how strings are implemented in 3.x vs 2.x.

In Python 2.x, the str data type was a series of bytes with methods that made it convenient to treat them as, well, strings. There was also the unicode type that had many of the same methods. One thing that Python 2.x did that was really annoying, though, was it would implicitly change data back and forth between the two types on you. This led to all kinds of hard-to-find bugs (I ran into TONS of them while working on my first project for my current employer, a web crawler). Especially in web programming, there were times where it was unclear whether a value was str or unicode.

Enter Python 3.x, which was largely motivated by a desire to clean up weird bugs like the above in the language. To fix the above problem, what they did was this: Python 2.x's unicode became Python 3.x's str, and Python 2.x's str became Python 3.x's byte arrays. Kind of. The Python core developers felt that giving the new byte arrays all of the old str methods would result in exactly the same problem; developers would inevitably get strs and byte arrays mixed up.
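
Here's a small sketch of what that separation looks like in practice, assuming Python 3 and UTF-8 as the encoding (my choice for the example, nothing mandates it). Mixing the two types raises an error instead of silently converting, and you go from one to the other with an explicit encode or decode.

```python
text = "caf\u00e9"              # str: four Unicode code points
raw = text.encode("utf-8")      # bytes: five bytes, the accent takes two

# Python 3 refuses to mix the two types implicitly.
try:
    combined = raw + text
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Going back to text requires an explicit decode with a named encoding.
round_trip = raw.decode("utf-8")
```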

This obviously affects anything that has to deal with strings, but it especially affects WSGI. A lot of WSGI implementations seemed to rely on the implicit type changing behavior from before, and Python 3.x breaks that pretty hard. So some of the people involved with the original WSGI spec got together and tried to propose new solutions.

From what I have been able to ascertain, there are three main camps:

Make everything native (that is, unicode) strings
Make everything byte arrays
Use a combination of them (usually bytes or strings in the header, and the body being the other)

There is also a fourth view, to petition the Python core developers to re-introduce the string methods on byte arrays, or at the very least create a new data type that does so. This hasn't gotten much traction from either the web community or Python core.

Using native strings across the board sounds nice, except that HTTP isn't implemented in Unicode; it's ASCII. So, there would have to be a conversion to the byte array at the response/request boundary, when the data is leaving or entering the server. This could result in data loss, depending on the encoding used by the WSGI application(s).
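
A quick sketch of that data loss, assuming Python 3 and a made-up header value: a strict ASCII encode refuses non-ASCII text outright, and a lenient one quietly throws the characters away.

```python
header_value = "r\u00e9sum\u00e9.pdf"   # hypothetical filename with accents

# A strict conversion at the boundary fails loudly.
try:
    header_value.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# A lenient conversion succeeds, but the accented characters are gone.
lossy = header_value.encode("ascii", errors="replace")
```

Here lossy ends up as b'r?sum?.pdf', and there is no way to get the original text back from it.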

And speaking of encoding, it seems unclear which would be the default. There are a lot of Unicode encodings, and without a clear definition of which one is the 'default', it becomes hard for an implementation that relies on WSGI middleware-wrapping to keep them straight.
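
To show why the default matters, here is a sketch where two pieces of code disagree about the encoding of the same bytes. The two encodings are just examples I picked; neither raises an error, which is what makes this kind of bug hard to spot.

```python
payload = "caf\u00e9".encode("utf-8")    # what one layer sends

as_utf8 = payload.decode("utf-8")        # a layer that guesses right
as_latin1 = payload.decode("latin-1")    # a layer that guesses wrong

# Both decodes succeed, but only one matches the original text.
same_text = as_utf8 == as_latin1
```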

That leads us to the byte array proposal. This matches up with the HTTP spec quite well, and would put us back where we're at now in Python 2.x, right? Unfortunately, that's not quite true. Again, because the byte arrays don't have the old string methods, you can't do string operations on them inside of your application without doing an explicit conversion into Unicode, which again runs into the problem of which encoding to use.

Compounding that, when using WSGI middleware, an implementation with bytes would have to decode-act-encode in every single middleware application, which could certainly add up.
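
The per-middleware round trip can be sketched like this. It assumes a bytes-bodied, WSGI-style application and UTF-8 throughout; upper_middleware and exclaim_middleware are hypothetical, and this is not any of the actual proposals, just the shape of the problem.

```python
def app(environ, start_response):
    """Toy WSGI-style application that returns a bytes body."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"hello, world"]


def upper_middleware(wrapped, encoding="utf-8"):
    """Hypothetical middleware that upper-cases the body text."""
    def middleware(environ, start_response):
        for chunk in wrapped(environ, start_response):
            text = chunk.decode(encoding)     # decode
            text = text.upper()               # act
            yield text.encode(encoding)       # encode
    return middleware


def exclaim_middleware(wrapped, encoding="utf-8"):
    """Hypothetical middleware that appends an exclamation mark."""
    def middleware(environ, start_response):
        for chunk in wrapped(environ, start_response):
            yield (chunk.decode(encoding) + "!").encode(encoding)
    return middleware


# Two layers of middleware means two full round trips per chunk.
stack = exclaim_middleware(upper_middleware(app))
statuses = []
body = b"".join(stack({}, lambda status, headers: statuses.append(status)))
```

Every layer has to know (or guess) the encoding, and every layer pays for the conversion in both directions.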

Finally, there are the mixed approaches. These carry the same problems as the first two approaches, along with the added confusion of working with two different Python types in a single response or request. And, some proposals have even had mixed data types inside of the WSGI environ dictionary, something I'm sure would be a headache.

So where does that put us?

Right now, I don't know of any concrete attempts to build any code implementing any of these proposals. None of the library authors (Paste, BFG, Werkzeug) want to take the time to do a large-scale conversion for fear it would be wasted effort if that particular proposal lost out. At least, that's my understanding from digging through their blogs and the mailing lists.

Everyone agrees that a new standard is necessary for Python 3. After all, 2.7 was released last month, and it's the last of the 2.x line. Sure, it will still work, but it's not going to receive any new features that future 3 versions will, like Unladen Swallow.

It's hard to predict what's going to happen. My fear is that something will either just get implemented without much input, or a PEP is accepted just to break the deadlock, and we end up with a problematic standard.

Maybe that's everyone else's fear, too.

Thursday, August 19, 2010

"Weak Android Player Proves Jobs Right"

This article is great, but the comments are even funnier. Seriously, someone who claims to be from Adobe saying the author stole the software? And the people asking about TaskKiller? Yeah, real people don't work like that. They expect the thing to work out of the box, and that's something Flash on mobile just cannot deliver right now.

But of course, like desktop Linux, any day now...

Wednesday, August 18, 2010

"The worst one in the band."

I recently read The Passionate Programmer by Chad Fowler, and gotta say, I'm a raving fan. It will probably come up a lot, largely because I think it's good, solid advice, and because I'm fairly confident most people know a lot of it already, they just need the push.

One of the essays in it has a topic that's been pretty central in my mind lately:

"Be the worst one in the band."

Sounds like really bad advice, huh? I mean, who would want to be the worst in their group? Not really an enviable position.

But Chad makes a clarification that completely turns this on its head. If you're always striving to be the worst person in the band, you're constantly going to be surrounded by great people.

I mean, think about it - would you rather be the most talented person in Nickelback, or the least talented person in The Beatles? The best person in your local quartet, or the worst violinist in the London Symphony Orchestra? Best player on a minor league team, or the worst in a major league?

Chad points out that this almost necessitates you change bands; after all, you're not trying to stay the worst. So, as your skills grow, it makes sense to move on.

To me, this is incredibly inspiring. I faced pretty much this exact choice around October of last year: I could either stay on at the company I was with and be a 'senior network engineer' (if not in title, definitely in practice) straight out of college or I could start applying to other places and expand my skills. A big deciding factor for me was that I was already doing the work I would be doing after school; I had gotten as high as I could there.

In reflecting on it, I had realized that in terms of skill set, I was at the point where I had wanted to be upon entering college. So, I decided to make the leap and look around.

There were a few places I applied, one of them being thesixtyone. I had loved their mission, and I loved the site itself for a while at that point. When I had heard they were looking for candidates, I was thrilled. Not to mention, they were pitching a wanted ad that spoke to me - no resumes, no cover letters, no degrees. Here's a set of problems, work on them and send us your code.

So I did.

I got turned down, and in looking at my code now, nearly a year later, I can definitely see why. But, even in that attempt, I enjoyed what I was doing, and there was nowhere to go but up in terms of programming skill. Thankfully, my current employer recognized that, and things have been going well. I get to work on interesting problems, and I'm surrounded by incredibly intelligent people. At this point in time, it's an excellent band for me to explore and learn.

So, the next time an opportunity comes up to join a 'better band', don't be put off because you think you'll be the worst. In reality, you'll probably surprise yourself and find out you aren't. And even if you are, so what? That only means you're at the ideal place to take risks and grow.

Tuesday, August 17, 2010

Managing Trac

Today I 'launched' the first release of an internal tool for managing our various Trac instances at work. We've been trying to keep all the projects contained in there for issue tracking and source control, and I ended up being the one administrating the whole thing. Mostly because I asked 'Hey, have you guys given any thought to version control and bug tracking?' Hopefully I can use a similar approach to get more people doing unit testing, too...

So, in order to get people off my back, I worked on a relatively small Flask application to create the database, VCS directory (with a choice of Bazaar and Subversion; no one wanted git or mercurial, though I'm liking them a lot) and finally the Trac instance.
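
The provisioning steps boil down to something like the sketch below. This is not the internal tool's code: the function name, the base directory, and the SQLite connection string are all assumptions for illustration (the real thing creates a proper database first), and the function only builds the command lines rather than running them.

```python
import posixpath


def provisioning_commands(name, vcs, base="/srv/projects"):
    """Build the commands for a new project's repository and Trac instance.

    The directory layout and the SQLite database string are assumptions.
    """
    if vcs not in ("svn", "bzr"):
        raise ValueError("unsupported VCS: " + vcs)

    repo = posixpath.join(base, "repos", name)
    env = posixpath.join(base, "trac", name)

    if vcs == "svn":
        create_repo = ["svnadmin", "create", repo]
    else:
        create_repo = ["bzr", "init-repo", repo]

    # trac-admin <env> initenv <project name> <database connection string>
    create_trac = ["trac-admin", env, "initenv", name, "sqlite:db/trac.db"]
    return [create_repo, create_trac]
```

Each list could then be handed to something like subprocess.check_call, once the web form has validated the project name.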

It manages security through some Apache group files and PAM users hooked up to our central Kerberos server. I don't like having the local users, but it's less painful to manage with the web interface now. We have a CAS system that would work for this, except the Subversion and Bazaar repos require authentication, too, and the standard clients break when they hit CAS's login redirects. Oh well.

I might try to release the code as open source, but for now it's going to have to stay internal for a few reasons. Not to mention, some of the stuff is dependent on our environment and wouldn't move very well.

Feels good to have my first self-directed project up and going; I've finished a few other things since starting in February, but nothing where I had pretty much free rein in managing the whole project. Admittedly, a lot of time was wasted while I bikeshedded over whether to stick with Python, Ruby (with something like Sinatra rather than Rails), or try a server-side JavaScript implementation. Other than node.js, though, I couldn't find one that excited me. Most were based on the Rhino JVM implementation; it'd be cool to see some other server-side JS implementations use V8.

Monday, August 16, 2010

Gosling on Java's version of 'freedom'

So James Gosling is weighing in some more on the whole Oracle vs. Google lawsuit. His take on the 'freedom' side of this isn't surprising, though it is rare to see. He sees Java as providing the freedom to run applications 'anywhere', instead of the common open-source licenses. And that's fine. Though, I have to say, I think languages like Python, Ruby, and Perl have done a far better job, and in a much less annoying fashion.

I personally haven't had a pleasant experience with Java on any OS, though Linux and Windows have by far been the most annoying. He does point out that Apple wrangled a deal to have their own distribution, but I don't think that makes it any better. At best, Java apps still have a weird, not-quite-native feel on OS X, NetBeans being the best of breed that I've personally tried. However, it still doesn't blend in well to the overall environment. That's not as much of a problem on Linux and Windows, where there really *isn't* a sense of continuity between all the applications on a system. For their part, getting Java to work on Linux is an exercise in frustration and tedium (even a simple plugin for Firefox 3, where these directions didn't work), and on Windows it becomes a maintenance nightmare, balancing many different versions in order to retain compatibility with legacy systems.

Now, I know there are people who love Java. That's fine; if you're more productive in it, I'm not trying to change your mind. But for myself, I haven't gotten any of the benefit out of it that Gosling is talking about. The more traditionally 'free' programming languages have served me far better in this respect, and on UNIX-like environments feel far more...native. There's certainly benefit to having highly portable code, but the power of consistency on a given platform just cannot be ignored.

One thing's for sure - this lawsuit was shopped by Jonathan Schwartz from the beginning of Sun's end, and I don't think it bodes well for anyone, but especially the Java/JVM ecosystem. I have to wonder what this will mean for all the hot new platforms that run on the JVM, like Clojure and Scala - from what I can tell, they look genuinely exciting, but is it worth risking Oracle's patent gun?

Your Red Zone

Seth Godin's got a great little snippet up on his blog about the 'red zone' of overcoming the hassle of learning something compared to the joy of learning new things about it. I love Godin's stuff; his book, Linchpin, is fantastic and has helped me with my career.

Check it out at How big is your red zone?

Sunday, August 15, 2010

Flask

I started using Flask soon after it was announced back in April, but I've been really digging into it in the last 2-3 weeks. It's for a small project at work, building an interface to generate Trac and Subversion/Bazaar repositories for new projects. Hopefully I'll be deploying the first iteration of the interface to the production server this week.

What I like about Flask is that it's got a few things that make it ideal for small projects over something like Rails, Django, or Pylons:

Locality of reference
All my routes and view functions are in one place, no need to track them down between a controller, model, and view folder. Obviously, this doesn't necessarily scale well, given that the main file will be a pain in the ass to edit before long. That being said, we're looking into using Flask and its Module architecture for a larger project.

Simplicity
Given that it's a 'microframework', Flask doesn't come with the huge file structure and extra utilities that big frameworks do. I like that. It also cuts out the controller; routes are applied directly to the view functions, which manipulate the model. The repoze.bfg framework does this as well, and I think it's a sensible approach, at least for small projects. Again, I'm not sure how this will scale, but it definitely cuts down the learning curve.

Amazing documentation
While Rails has great documentation, I have to say that Flask stands alone in documentation among the Python frameworks I've looked at. Django's pretty close, but also really big for what I was planning on doing. Also, their 'hello world' example on the main page is awesome, because it really does give a flavor of how the thing works in only a few lines of code. None of the other major ones can say that.

Loose structure
Flask seems to be largely hands-off in determining how your code should be structured, again unlike the larger and more well-known frameworks. This also means there's more code you have to write, but you're not forced to program in someone else's style.

Active community
A really active community has sprung up around Flask. There are plenty of extensions getting written, and people are experimenting with larger projects. It will be interesting to see where they go.
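
To illustrate the Simplicity point about routes being applied directly to view functions, here is a toy sketch of the idea in plain Python. TinyApp is a hypothetical stand-in I made up so the example has no dependencies; it is not Flask's API or implementation, just the general decorator-registers-a-function pattern.

```python
class TinyApp:
    """Hypothetical microframework stand-in; not Flask's actual API."""

    def __init__(self):
        self.routes = {}

    def route(self, path):
        # The decorator maps a URL path straight to the view function.
        def decorator(view):
            self.routes[path] = view
            return view
        return decorator

    def dispatch(self, path):
        view = self.routes.get(path)
        if view is None:
            return "404 Not Found"
        return view()


app = TinyApp()


@app.route("/")
def index():
    return "Hello World!"
```

There's no controller layer in between: app.dispatch("/") looks up the function and calls it, and the route sits right next to the code that handles it.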

So, while I'm sure I'll end up pulling out one of the existing, bigger frameworks for something, I'm going to work with Flask on my projects, both in my free time and at work. We've started on doing some REST-based peer-to-peer work this week, and it'll be interesting to see how that plays out.