What do 100 strangers think? (A science experiment on ethnic perception)

Yesterday in the car, Abby and I fell into a conversation about how other people will perceive our daughters’ ethnicity. They’re both half White / half Chinese, hardly rare in Los Angeles, and we were split on how people would see them as they grow up (Zoe, who is 4, was the main focus of the discussion).

Rather than leave it as a hypothetical, something we’d wait years to see play out mostly through whatever Zoe chose to tell us, I decided to do something more scientific. I had Abby pick a picture of Zoe, and we wrote up a Mechanical Turk job asking people what ethnicity they thought the person in the picture was.

Mechanical Turk, for those who haven’t heard of it, is an Amazon service that lets you ask a pool of humans to perform some work on their computers and report back the results. You pay them a small reward (we paid $0.05 each) and pick how many answers you’d like to get (we picked 100). You formulate the work, in our case a picture with a multiple-choice question, and Amazon farms it out to people who make money performing it.
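For the curious, the job itself is easy to script. Here’s a minimal sketch of building the kind of multiple-choice question we used, in the QuestionForm XML that Mechanical Turk expects. The image URL, choice identifiers, and labels below are placeholders, not the actual HIT we ran:

```ruby
# Build a Mechanical Turk QuestionForm for a multiple-choice
# "what ethnicity is this person?" question. The image URL and the
# choices are illustrative placeholders, not the real HIT.
require "cgi"

def ethnicity_question_xml(image_url, choices)
  selections = choices.map do |id, label|
    "<Selection><SelectionIdentifier>#{id}</SelectionIdentifier>" \
    "<Text>#{CGI.escapeHTML(label)}</Text></Selection>"
  end.join

  <<~XML
    <QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
      <Question>
        <QuestionIdentifier>ethnicity</QuestionIdentifier>
        <QuestionContent>
          <Binary>
            <MimeType><Type>image</Type><SubType>jpg</SubType></MimeType>
            <DataURL>#{image_url}</DataURL>
            <AltText>photo</AltText>
          </Binary>
          <Text>What ethnicity is the person in the picture?</Text>
        </QuestionContent>
        <AnswerSpecification>
          <SelectionAnswer>
            <StyleSuggestion>checkbox</StyleSuggestion>
            <Selections>#{selections}</Selections>
          </SelectionAnswer>
        </AnswerSpecification>
      </Question>
    </QuestionForm>
  XML
end

xml = ethnicity_question_xml("https://example.com/zoe.jpg",
                             { "white" => "White", "asian" => "Asian",
                               "mixed" => "Mixed / multiple" })
```

That XML is then handed to the service when creating the HIT (in today’s aws-sdk-mturk gem, as the question: parameter of create_hit, along with max_assignments: 100 and reward: "0.05"); I’m recalling that API from memory, so treat the exact call as an assumption.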

To my delight, it only took about an hour to get 100 replies, making this science experiment quite fulfilling. The results were also really interesting.

So without further ado, the picture:

And the results:


I gave people the ability to pick multiple answers, which is why the numbers don’t add up to 100. Of the people who picked multiple, none picked the correct combination either.
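To make the counting concrete, here’s a tiny tally showing why multi-select totals exceed the respondent count (the responses here are made up, not our actual data):

```ruby
# Each response is the set of ethnicities one worker selected.
# Made-up data, purely to show why per-choice totals can sum to
# more than the number of respondents.
responses = [
  ["hispanic"],
  ["white", "hispanic"],
  ["asian"],
  ["white"],
]

counts = Hash.new(0)
responses.each { |picks| picks.each { |p| counts[p] += 1 } }

counts             # => {"hispanic"=>2, "white"=>2, "asian"=>1}
counts.values.sum  # => 5, from only 4 respondents
```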

Ethnicity identification is very much a regional effect. There’s no way to know where the answers came from, but if we assume they were mostly American (given the time of day I was using Mechanical Turk), this provides an interesting insight into how people will see Zoe and Kira throughout their lives.

It was also a fun mini-experiment.


On Portable Computation…

Since the first time two computers communicated directly over a network, programmers have mused about the programs running on those computers. Back in those early days, there was careful external synchronization of programs: “Bob, will you load number 3 while I do the same over here?” We like to think that we’re well past that, but in reality it’s basically still the same situation today. We’ve invented versioning schemes to make that external synchronization easier to swallow, but we’re still just manually putting two programs on different computers and hoping they can talk to each other.

And so the industry has slowly tried to evolve solutions to this problem. Java, for instance, strove to allow a server to push down the client version of a program for the browser to run, letting the server decide at load time what to push (you remember applets, don’t you?). Modern web programmers have effectively worked their butts off to avoid the problem, instead pushing a very heavy standards-driven approach (hi there, W3C!). And because web programming has dominated so much of the industry for the past decade, there hasn’t been a big push to work on portable computing.

Disclaimer: I’m sure people will comment and say “OMG Evan! This person and that person have been working on it for 20+ years!” I’m sure that’s true, but almost nothing has gone mainstream, so it has had no effect on your everyday programmer.

There are 3 really interesting attempts at bringing portable computing to the everyday programmer that I’ve been thinking about this weekend, so I thought I’d do a quick post about them.

  1. NaCl – Google’s approach to applets, basically. With a heavy emphasis on C++ as the target language, the aim is squarely at game developers. But if you look past immediate browser usage, you can see a separate idea: the ability to generate managed machine code that only runs inside a special harness and can be easily transported. I’m sure the irony of that last sentence NOT being applied to Java will be lost on no one. And in that irony, a nugget of truth: this approach isn’t new. The exact same thing could be applied to Java or really any VM. Where NaCl deviates is that, because it generates machine code and its toolkit is a normal C/C++ compiler, it’s possible to compile a VM itself with NaCl and deploy it. Thus NaCl can be seen as a safe way to distribute binaries, which makes it pretty interesting. I’ll call this approach to portable computing the specialized environment approach.
  2. Docker – I’m talking about Docker rather than containers generically (or other similar technologies) because what matters is Docker’s pairing of containers with images delivered over a network. A service to download a binary and run it would certainly be trivial; it’s the use of containers that turns Docker into a proper portable computing environment. The interesting aspect of Docker is that its approach is aimed squarely at the computing “backend”. NaCl and its spiritual father, the Java applet, focus on getting new computation units to an end user directly, whereas Docker is about the management of backend services. It’s just that the code for those backend services is delivered remotely, on demand, making it a portable computing environment. This is the constrained environment type of portable computing.
  3. Urbit – Urbit is, well, it feels like the implementation of an idea from a sci-fi novel. Part of that is on purpose; its founder is known to be quite esoteric. Cutting through all the weirdness, I find the kernel of an idea for portable computing. They’ve defined a VM that runs a very simple bytecode, and a whole ecosystem around it. The piece I found most interesting is that all programs are written in that bytecode and stored in their global filesystem service. This means it’s trivial for programs written and stored by one computer to be run by another. Add in versioning, public identities, and encryption and you’ve got yourself a really nice little portable computing environment. Because the VM is effectively stateless and the services promote storing immutable data, it’s easy to see how a distributed system would work. The problem? It’s all in such an esoteric environment that it’s going to be incredibly difficult for even the most motivated programmers to get into. This is the isolated environment type.

So, what’s the point of cataloging these here? Well, I started by looking at Urbit and thinking how difficult it will be for it to get uptake. But the kernel of the idea is super interesting. So could that kernel be transplanted into something like NaCl or Docker to provide a meaningful environment for distributed programming? Urbit’s immutable global file system seems perfect to apply CRDTs to, allowing multiple nodes to share a meaningful view of some shared data. Wire in a discovery mechanism to tie environments together and I think there could be something big there.
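To make the CRDT idea a bit more concrete, here’s a sketch of about the simplest CRDT there is, a grow-only counter. Each node only ever increments its own slot, and merging takes the element-wise max, so replicas converge no matter what order updates arrive in. This is just an illustration of the general technique, not anything from Urbit:

```ruby
# A grow-only counter CRDT: each node owns one slot, and merge is
# element-wise max. Merging is commutative, associative, and
# idempotent, so replicas converge regardless of delivery order.
class GCounter
  attr_reader :slots

  def initialize(node_id)
    @node_id = node_id
    @slots = Hash.new(0)
  end

  def increment(by = 1)
    @slots[@node_id] += by
  end

  def value
    @slots.values.sum
  end

  # Merge another replica's state into this one.
  def merge(other)
    other.slots.each { |node, n| @slots[node] = [@slots[node], n].max }
    self
  end
end

a = GCounter.new(:node_a)
b = GCounter.new(:node_b)
3.times { a.increment }
b.increment(2)

a.merge(b)
b.merge(a)
a.value # => 5
b.value # => 5
```

The same shape generalizes to sets and maps, which is why an immutable, replicated file system looks like such a natural host for it.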

A new door opens…

I’m excited to announce that as of today (March 28th), I’ve accepted a new position at LivingSocial! I’m an Engineering Director, managing a few teams that work on backend architecture such as email, scaling, etc.

For the past 5 years, Engine Yard has been an amazing employer. Back in 2007, Tom Mornini took a chance on hiring me, enabling Rubinius to progress in a way it never would have otherwise. I can’t say enough about how much I appreciate everything they’ve done for me. It has been an embarrassment of riches.

So what does all this mean for Rubinius? Rubinius started as a passion project and continues to be. I’ll keep working on the project in all the same roles I have in the past. Because I won’t have the same amount of time, I’ll be doing more project management so that the amazing Rubinius community has a clearer picture of the work that is outstanding and can apply their talents to help move things forward.

I know some of you will read this and say “Evan left EY? Rubinius is dead.” I ask that you reserve judgement. The Rubinius community is an amazing group and I know that we all will continue to build a great platform. Brian Ford will continue working on Rubinius full-time and doing his amazing work getting Rubinius 2.0 finished.

I want to close by saying again how truly amazing Engine Yard is. I’ll always call this experience one of the most amazing opportunities of my life. For me, this decision was driven entirely by the chance to help LivingSocial build their architecture. I wish Engine Yard all the best and I’ll continue to help them with Rubinius related ventures in the future.

Helpful syntax errors

Syntax errors are a part of life for programmers. The language of the computer, no matter how flexible, is very picky.

And thus how the language communicates back to the user about what it didn’t understand is important, because programmers of every skill level spend time in this phase.

In Ruby specifically, MRI’s parser (and by extension the melbourne parser Rubinius uses) is built with yacc, and thus suffers from syntax errors that can be particularly difficult to understand. One that commonly occurs is when an ‘end’ is missing from an expression. This results in the dreaded syntax error, unexpected $end, expecting kEND message.

Here is a quick example:

class Spaghetti
  class Sauce
    def add(plate)
      while more?
        plate << self

Now, this is a short example and so spotting the error is fairly easy. But this error typically occurs when you’re working on a 600 line file with multiple classes inside classes and with complicated logic, making it quite difficult to find.

This evening, I decided to try and at least help make this easier to find. So now in Rubinius, rather than

syntax error, unexpected $end, expecting kEND

you get

missing ‘end’ for ‘class’ started on line 1.

That’s a big improvement: first off, it clearly communicates what is wrong, i.e. that you’ve forgotten an ‘end’. In addition, it tells you which element still requires an ‘end’, in this case a ‘class’ on line 1.

Now, this is far from perfect. It points you to the element that was left unclosed, rather than the one you actually forgot the ‘end’ on. But at least it now points you to the offending chunk of code. In practice, this can be a big help.
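The bookkeeping behind the improved message is simple. Here’s a toy sketch of the idea (not the actual Rubinius implementation, which lives inside the grammar itself): track block openers on a stack and, at end of input, report the outermost opener that never got its ‘end’:

```ruby
# Toy missing-'end' reporter: push each block opener with its line
# number, pop on 'end'. Anything left on the stack at EOF never got
# its 'end'. A real parser does this inside the grammar, and this
# line-based scan will miss modifier forms, but the idea is the same.
OPENERS = /\A\s*(class|module|def|while|until|if|unless|begin|case)\b/

def report_missing_end(source)
  stack = []
  source.each_line.with_index(1) do |line, lineno|
    if (m = line.match(OPENERS))
      stack.push([m[1], lineno])
    elsif line =~ /\A\s*end\b/
      stack.pop
    end
  end
  return nil if stack.empty?
  keyword, line = stack.first
  "missing 'end' for '#{keyword}' started on line #{line}"
end

src = <<~RUBY
  class Spaghetti
    class Sauce
      def add(plate)
        while more?
          plate << self
RUBY

report_missing_end(src) # => "missing 'end' for 'class' started on line 1"
```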

In the future, it might be possible to use indentation to narrow down where the missing ‘end’ should be. But for now, every little bit helps.
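As a sketch of what that indentation heuristic might look like (purely illustrative, not something Rubinius does): an ‘end’ that sits at a shallower indentation than the opener on top of the stack suggests that opener is the one missing its ‘end’:

```ruby
# Illustrative heuristic: use indentation to guess WHERE the missing
# 'end' is, not just which opener is unclosed. An 'end' shallower
# than the opener on top of the stack means that opener likely never
# got its own 'end'.
def guess_missing_end(source)
  stack = []
  source.each_line.with_index(1) do |line, lineno|
    indent = line[/\A */].size
    if (m = line.match(/\A\s*(class|module|def|while|until|if|unless)\b/))
      stack.push([m[1], lineno, indent])
    elsif line =~ /\A\s*end\b/
      if stack.any? && stack.last[2] > indent
        keyword, opener_line, _indent = stack.pop
        return "guess: 'end' missing for '#{keyword}' on line #{opener_line}"
      end
      stack.pop
    end
  end
  stack.any? ? "guess: 'end' missing for '#{stack.last[0]}' on line #{stack.last[1]}" : nil
end

src = <<~RUBY
  class Spaghetti
    def add(plate)
      while more?
        plate << self
    end
  end
RUBY

guess_missing_end(src) # => "guess: 'end' missing for 'while' on line 3"
```

Note that this points at the ‘while’ itself rather than at line 1, which is exactly the improvement indentation could buy.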

Fixing colors in Terminal.app on 10.6

It’s Snow Leopard day zero, so of course I had to upgrade. All in all, everything is great. (Especially the multiple monitor window migration fix!)

But the #1 thing that annoys me about all OS X releases is the colors in Terminal.app. They’re pretty much unusable on a dark background (especially the blue). For some time there have been hacks to fix the problem. Of course, those hacks no longer work on 10.6.

Never one to shy away from the problem, I dove in. And we have success!

Here’s how to make it work:

  • Find Terminal.app in Finder (/Applications/Utilities), right click, “Get Info”
  • There is a checkbox “Open in 32-bit mode”, Check it!
  • Install SIMBL. PlugSuit was installed on my machine before, and it freaks out because of the 10.6 changes; SIMBL silently just works (or doesn’t).
  • Get My updated TerminalColours SIMBL plugin. See the original post for details on how to install it.
  • Restart Terminal.app
  • Enjoy your readable colors!

This works because InputManagers still work in 32-bit mode, but not 64-bit mode. So by forcing Terminal.app to run in 32-bit mode, SIMBL can still hook in. I just had to update TerminalColours to swizzle a new method that 10.6 uses to pick colors.

Hope you enjoy!

Update: I’ve changed the tar.gz download link to one that should work better.

Doing a Time Machine full restore even if it doesn’t want to

Got a new hard drive for my wife’s MacBook. Couldn’t find her 10.5 upgrade DVD, so I tried to use the install DVD from my new MacBook Pro. The DVD promptly told me it would not install (Apple cripples the OEM DVDs to only reinstall on the same make of machine), but it did not exit. So I select Restore from Time Machine Backup from the Utilities menu.

Nothing happens.

I try a few times; nothing. I give up for a while. Upon returning and seeing it still doesn’t work, I open the Install Log under the Window menu.

I notice there are a number of entries listed: Unable to load XIPanel_RestoreIntroduction nib file. OK, so something is wrong with the Installer.

Terminal to the rescue. Open it up, run find / -name "XIPanel*", and find a number of matches in a Resources folder. OK, so the file is there; this must be a bug in the Installer. Now we get serious. If you’ve hit this, here’s what you do to get the Time Machine restore to launch:

  • Open Terminal
  • Run ps ax and find the listing for Mac OS X Installer.app; note the number on the far left (the PID)
  • Run kill pid_of_installer
  • You should now get a dark grey background, you’re doing great.
  • Run export LANG=en_US.UTF-8. I’m not sure if this matters, but I did it, so you should too
  • Run cd “/System/Installation/CDIS/Mac OS X Installer.app/Contents/MacOS/”
  • Run ./Mac\ OS\ X\ Installer “/System/Installation/Packages/OSUpgrade.pkg”
  • The installer should pop up and say Welcome and such (likely with no graphics, that’s OK)
  • Click Yes/Continue enough for the Utilities Menu to appear at the top and select Restore from Time Machine Backup
  • With luck, it will load! You can now do the restore!

Thanks for playing! Booo to Apple for having bugs in their Installer, Yay to Apple for leaving Terminal available in the Installer!

NOTE: Be sure to leave the Installer in the foreground while it restores, otherwise it will stall!

Rumors of our Demise are Greatly Exaggerated

We’ve been pretty quiet with Rubinius developments for a while, so I thought I’d bring people up to speed.

The previous year has seen a lot for the project. We were sad to see a number of developers laid off from the project, but that has only increased our desire to get the project to a usable state.

Some of the highlights include, but are not limited to:

  • Rewriting the VM in C++
  • Experimenting and building multiple JIT compilers
  • Pushing RubySpec completeness and compliance levels
  • Getting large scale libraries like Rails and RubyGems running

All of those things are available today in our git repo on GitHub.

Recently, Brian Ford and I published a roadmap, laying out the activities of most importance over the next few months. We’re going to try and be more vigilant about updating blogs and roadmaps in the coming months, to keep people more up-to-date.

Finally, a lot of people ask me “How can I help on Rubinius? I don’t have a lot of time.” The answer is simple:

  • Download your favorite library
  • Try it under Rubinius:
    • bin/rbx test/test_whatever.rb or
    • bin/rbx gem install rspec; bin/rbx -S spec my_spec_dir
  • Report bugs that you find to our GitHub issue tracker.

The more people report bugs, the more coverage we get over the vastness of the Ruby landscape. So while we’re hard at work getting the performance up, you can help out getting the compliance up.

Thanks again to the ruby community for all the patience you have shown the team over the years. Rubinius has been a long road, but I really feel like we’re onto something big.

In the coming months, I’m going to try and post more posts about technical aspects of Rubinius, so look for those.

The Ruby Community Rocks

The last time we checked in, I was delivering the bad news about having to let a bunch of my team go. I received a lot of kind words of encouragement during the hard time, which I want to thank everyone for.

In addition to kind words, a number of people stepped up and indicated they had positions available that my newly unemployed friends would be great fits for.

One such offer came from Daniel Yoder and Charles Hornberger at AT&T Interactive, in the R&D department of the makers of yellowpages.com. I’m extremely happy to announce that Eero Saynatkari (rue on IRC) has recently been hired by them, and has even been given time to continue work on Rubinius!

This development makes me so happy: to see the community pull together in a tough time, and even continue to make an external investment in Rubinius.

Thanks again, guys! You’re what makes me love being a Rubyist.

A Sad Day

There have been some sad developments within the Engine Yard Rubinius team that I’d like to address head on.

Earlier today, I had the unfortunate task of reducing the team size to 2 people, which meant laying off the rest of the team.

I’m sure this comes as a shock to many, as it did to my friends to whom I had to give walking papers. This was certainly never a scenario that I had ever hoped to find myself in when Engine Yard offered me this dream job early in 2007.

The reason for the layoffs is not Engine Yard divesting interest in Rubinius, but rather a necessary reorganization of budget priorities. That’s a fancy way of saying that EY could no longer afford to sustain the large team we had.

This is a sad day for me, one that I’ve been dreading. It stings not only because of what it means to Rubinius but also because of what it means to my friends with whom I will no longer be working. They’ve put blood, sweat, and tears into Rubinius, and their everyday presence will be sorely missed. I hope that they do not think badly of me or Engine Yard.

When Engine Yard gave me the go-ahead to hire a team, they did it with the best intention: to help Rubinius grow. And we have definitely done that. In the last year, we’ve achieved amazing goals within the project:

  • We went from running very little Ruby code to running Rails.
  • We got RubyGems up and running well.
  • We got a parser written entirely in Ruby integrated.
  • We wrote a whole new VM to build on.

We’ve had our fair share of setbacks, but the team has always rallied.

Rubinius will continue to move forward, continually bolstered by the awesome group of people who give up their free time to help on the project.

Tom Mornini has posted on the EY blog as well; you should read his take.

CPP work branch change

Hi everyone. I’m super happy to announce that we’ve gotten the C++ branch stable enough that we’re making it the default branch. This means that those of you with existing clones will likely have to do a little work to get them sane, though.

Here is what was done:

  • The old master branch was renamed shotgun.
  • The cpp branch was copied to the name master.
  • The cpp branch was then deleted.

Anyone that has up to now been working on the cpp branch has a couple of options.

  1. Delete your clone and re-clone. This is the easiest. The default checkout will be the code from the old cpp branch and you’re off and going.
  2. Fix up your current repo. I did this by doing the following commands:
    1. git checkout master
    2. git reset --hard origin/master
    3. git branch -D cpp

    This will get your local master branch repointed and properly checked out. In addition, the old cpp local branch can be deleted.

Hopefully no one experiences much pain due to this change. It’s been a long time coming and I’m really excited.
If you do run into problems, post a comment or stop on by IRC and we’ll work it out for ya.