<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
 
 <title>Ted Dziuba</title>
 <link href="http://widgetsandshit.com/teddziuba/atom2.xml" rel="self"/>
 <link href="http://widgetsandshit.com/teddziuba/"/>
 <updated>2012-09-15T12:01:52-07:00</updated>
 <id>http://widgetsandshit.com/teddziuba/</id>
 <author>
   <name>Ted Dziuba</name>
   <email>tjdziuba@gmail.com</email>
 </author>
 
 
 <entry>
   <title>Dan Lyons vs. The Valley</title>
   <link href="http://widgetsandshit.com/teddziuba/2012/02/butthurt-in-the-valley.html"/>
   <updated>2012-02-12T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2012/02/butthurt-in-the-valley</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/teddziuba/images/bring-back-that-numa-numa-kid.jpg&quot; class=&quot;post-lead-image&quot;&gt;
Evidently Dan Lyons is on the warpath now, against MG Siegler and Mike Arrington, formerly of TechCrunch, now of CrunchFund. You remember Dan Lyons, he was entertaining for about five minutes a couple of years ago as Fake Steve Jobs? This sort of depraved cannibalism is common in Silicon Valley tech media, so it should surprise nobody.
&lt;/p&gt;

&lt;p&gt;But I missed it all, I was too busy building a product.&lt;/p&gt;

&lt;p&gt;See, when people spend all day making absolutely nothing any more valuable, they leave fingerprint smudges all over shit that used to be nice. That's a general observation: tech journalists, the government, anywhere that people are comfortable.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Who Needs Process?</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/12/process.html"/>
   <updated>2011-12-27T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/12/process</id>
   <content type="html">&lt;p&gt;
&lt;img src=&quot;/teddziuba/images/if-there-was-a-god-strippers-would-be-his-chosen-people.gif&quot; class=&quot;post-lead-image&quot; style=&quot;position: relative; top: -80px; left: 0; margin-bottom: -80px;&quot;&gt;

Software development methodology is organizational Valtrex. Sure, it treats a symptom, but the only cure for the underlying disease is to &lt;em&gt;never have contracted it in the first place&lt;/em&gt;. This is not to say that process and methodology are bad. They are means to an end. But the ability of your team to execute on a goal is inversely proportional to the amount of process you have in place. It's not direct causation, though. The underlying cause is that the variance of developer skill on your team is too high, which means your team can't execute well, and you need process to wrangle the laggards.&lt;/p&gt;

&lt;p&gt;
Software development processes exist to manage the bell curve of ability in developers.  It's simple mathematics. The more people you collect on a team, the more likely it is that the team's average skill is the average skill of software developers as a whole. This is the Strong Law of Large Numbers, and it is non-negotiable. There's not a meeting you can hold to make it go away. Most organizations simply accept their fate, and design policies and procedures to keep the back half of the bell curve from causing damage. Unfortunately, policies and procedures irritate top performers, and are just more grievances on their list that will some day metastasize into a resignation letter. (Side note: if someone keeps such a list, there's a high probability they are a top performer.)
&lt;/p&gt;

&lt;p&gt;So, how do you produce good software?&lt;/p&gt;

&lt;h4&gt;Rule 1: Resist Process from the Start&lt;/h4&gt;
&lt;p&gt;Anti-process needs to be in your blood from the beginning. Growing the team causes problems, but adding people who like to make process is catastrophic. Busybodies are toxic. When we were making Milo, the development process was, for a long time, loosely coordinated chaos. Of course this caused problems, and there are a few bodies in the codebase to prove it. When there was some big fuckup, a bunch of us would sit in a room (&lt;em&gt;mistake 1&lt;/em&gt;), try to &quot;trace the root&quot; of the problem, which is code for &quot;place blame&quot; (&lt;em&gt;mistake 2&lt;/em&gt;), and then figure out what process we need to put in place to make sure it never happens again (&lt;em&gt;mistake 3&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;Every time we tried to force process on the team, it failed, because process was not part of the corporate culture. Here's a concrete example:&lt;/p&gt;

&lt;p&gt;Around the time we were raising our Series A financing, I was doing some maintenance work on our one and only database machine, Zeus. Because I had overconfigured Nagios, it was going apeshit as I was working, so I disabled all alerts for that machine. Sure enough, I forgot to re-enable alerts when I was done, and sure enough, that night, Zeus's RAID controller decided he'd had enough of our bullshit and up and died.&lt;/p&gt;

&lt;p&gt;The site was down for 6 hours and nobody noticed because we were all asleep. Do all the root cause analysis you want on that one, I fucked up, and everyone knew it. We tried policy: don't ever disable Nagios alerts, just tell people when you're working on something so they know to ignore the alerts. But nobody followed the policy because &lt;em&gt;homie don't play dat&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It turned out that nobody ever made that mistake ever again. Maybe it was just by chance, but maybe it's because of the Old Testament type stomping that the next person to do that would get.&lt;/p&gt;

&lt;p&gt;Learn from your mistakes, but don't flagellate yourself with them.&lt;/p&gt;

&lt;h4&gt;Rule 2: Grow Headcount as a Last Resort&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Headcount is not a virtue&lt;/em&gt;. Functionality is a virtue. Ability to execute is a virtue. But having a lot of people working on the project? No. It's a liability.&lt;/p&gt;

&lt;p&gt;The Strong Law of Large Numbers burns you when the team grows, so be reluctant to grow the team. When you think there is too much work for your current team to handle, do what any good programmer does: profile. Quantify the amount of time your team spends on every task, then figure out what you can optimize.&lt;/p&gt;

&lt;p&gt;When you do hire, do it carefully. Really carefully. ... More careful than that. At Milo, we do &quot;trial periods&quot;, where we invite a candidate to work with us for a few days so we can judge their work. Here's a really simple trial period task: &lt;em&gt;make this thing better&lt;/em&gt;. Use your judgment and your programming skills, just make it better. This will keep a common vision in your team.&lt;/p&gt;

&lt;h4&gt;Rule 3: Use The GitHub Workflow (And Other Good Stuff)&lt;/h4&gt;

&lt;p&gt;For brevity:&lt;/p&gt;

&lt;img class=&quot;centered&quot; src=&quot;/teddziuba/images/github-branch.png&quot;&gt;

&lt;p&gt;versus:&lt;/p&gt;

&lt;img class=&quot;centered&quot; src=&quot;/teddziuba/images/othergit.png&quot;&gt;

&lt;p&gt;Seriously, what the fuck is going on in that branch model? Policy and procedure, that's what. We used the latter branch model at Milo, as GitHub was not yet ready enough to support their awesome workflow. That complicated but popular workflow is vicious, there is just too much state to track in your head as a developer. It's not something you need to spend your time on.&lt;/p&gt;

&lt;p&gt;It's not specific to the GitHub workflow: your tooling should have the same attitude toward policy that your team does. GitHub's (the &lt;em&gt;software&lt;/em&gt;, not the &lt;em&gt;company&lt;/em&gt;) opinion is that process is unnecessary, and having the tools to support process will only beget process.&lt;/p&gt;

&lt;p&gt;If you leave enough handguns hanging around, eventually someone's going to get shot.&lt;/p&gt;

&lt;p&gt;If a tool is forcing process on you, or even encourages you to follow some process, ditch the tool and find another. Ticket tracker has a lot of fields on the ticket form? Dump it. Code review tool has a lot of shit going on? Find a new one. You get the idea. Don't let your tools infect you.&lt;/p&gt;

&lt;h4&gt;Rule 4: Just Let Go&lt;/h4&gt;

&lt;p&gt;This is the hardest one, and, if you're serious about doing away with process, is the one from which all others follow. Just stop having process. Cancel your weekly planning meetings. Get rid of your prioritized list of tasks. Stop having your daily stand-ups. No more status report e-mails.&lt;/p&gt;

&lt;p&gt;Just stop doing that stuff, and get rid of anyone who can't cope.&lt;/p&gt;

&lt;p&gt;Nothing terrible will happen.&lt;/p&gt;

&lt;p&gt;My current project, which will be launching in the first half of 2012, is as close to anarchy as a project can get. And it's working. I trust my teammates to get their work done and do it well. Sure enough, they do.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Dirty Words</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/11/dirty-words.html"/>
   <updated>2011-11-09T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/11/dirty-words</id>
   <content type="html">&lt;p&gt;Anyone who ever told you that swear words have no place in technical discussion is
right. They're right, and sadly, they're part of the problem because they
miss the point. The sterile word placement that's supposed to support an argument
makes any true motivation indistinguishable from all the hired bullshit.
&lt;/p&gt;

&lt;p&gt;Objective technical discussion is a God damned lie, and it's the most rotten kind of lie
because it's a way to stick your nose in the air, disguised as altruism.
Every time you post benchmarks, you're not moving any discussion forward.
When you compare web frameworks on one useless dimension or another, you don't
bring any value to the world. What you've done is relieve me of a task that's kind of a
pain in the ass, but by no means insurmountable. Ass scratching is not value.
You want to be heralded as a great visionary for your work,
and you think that getting on the front page of Hacker News or
Reddit means that people respect your opinion.&lt;/p&gt;


&lt;p&gt;No. Your link is just space between the ads, and fuck-all if mine aren't too.&lt;/p&gt;

&lt;p&gt;The skill it takes to write objectively about technology can be automated, and publishing
it yourself is disingenuous because it lacks passion.&lt;/p&gt;


&lt;p&gt;However, when someone starts swearing in technical discussion, showing emotion,
that's a strong indicator that I'm about to receive wisdom.  Wisdom is
earned the hard way, and it is permanent, not like some statistically shaky performance
benchmark that we'll all forget about next week.&lt;/p&gt;

&lt;p&gt;Anyone who has ever told you that swear words are a cheap way to get an audience is
right, too. I've been on both the amateur and professional side of technical journalism,
and I'll tell you this: every way to get an audience is cheap. Let's take Paul Graham, for
example. Any emotion you detect in his essays is purely by accident, but he conveys a
message, and has a following. He would not have that following if he were some guy off
the street, so rattling off any damn thing and putting a Paul Graham byline on it is a
cheap way to get an audience.&lt;/p&gt;

&lt;p&gt;But admit it, somewhere in the back of your gut is a rebellious nerve that wonders what
happens when Paul gets pissed off.&lt;/p&gt;

&lt;p&gt;People like me, Zed Shaw, and Zach Holman will give you a brutally honest answer if
you ask for it. People like Paul won't. You will get a response, but it's in
newspaper words. The same newspaper words, that, by the way, with their self-imposed
emotional blockade, allow the nicest haircut to slither into the White House every couple
of years.&lt;/p&gt;

&lt;p&gt;That's not to say that Paul is so proper in private. I don't know first hand, but I have met
other leaders in technology who are more than willing to give you an earful over
cocktails. In public, however, they cultivate a persona.&lt;/p&gt;

&lt;p&gt;Just like I do.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Straight Talk on Event Loops</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/10/straight-talk-on-event-loops.html"/>
   <updated>2011-10-04T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/10/straight-talk-on-event-loops</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/teddziuba/images/the-node-js-community-is-now-on-auto-troll.jpg&quot; class=&quot;post-lead-image&quot;&gt;Two days ago, I &lt;a href=&quot;/2011/10/node-js-is-cancer.html&quot;&gt;pointed out&lt;/a&gt; how Node.js, an event-driven web framework,
will eat it hard if it's given any nontrivial amount of CPU work to do in its request handler. After I published that,
it seemed that the &lt;em&gt;point&lt;/em&gt; of the article went sailing right past the Node.js camp, who proceeded to see how
fast they can make a Fibonacci number generator.&lt;/p&gt;

&lt;p&gt;The Fibonacci function was arbitrary. It was inefficient on purpose. I needed a function that would use CPU time,
and chose that because it's familiar and easy to implement. So, now I offer a more formal analysis of what CPU usage
does to the throughput of an event-driven system as compared to a threaded system.&lt;/p&gt;

&lt;p&gt;Since it's now clear that reading comprehension and critical thinking are not strong suits of the Node.js programmer, I would suggest
that all Noders reading this article read it aloud, slowly and loudly, like an American tourist trying to find a train
station in Tokyo.  Furthermore, to assist the Node camp, I will highlight the important parts in large lettering,
like this:&lt;/p&gt;

&lt;h5&gt;&lt;img src=&quot;/teddziuba/images/nodejs.png&quot;&gt;When the weather is threatening rain, bring an umbrella with you.&lt;/h5&gt;

&lt;h3&gt;A Math Model of Throughput&lt;/h3&gt;
&lt;p&gt;Assume we've got a request handler that processes an HTTP request and sends back the result. Let's see how threads and event loops differ on processing these requests. Note that we're measuring
&lt;em&gt;throughput&lt;/em&gt; here, not &lt;em&gt;latency&lt;/em&gt;. That's an article for another day.&lt;/p&gt;

&lt;h5&gt;&lt;img src=&quot;/teddziuba/images/nodejs.png&quot;&gt;This is an analysis of Queries Per Second (QPS) only.&lt;/h5&gt;

&lt;p&gt;Let's start with some definitions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Let &lt;strong&gt;&lt;em&gt;C&lt;/em&gt;&lt;/strong&gt; be the amount of CPU time used by the handler, in milliseconds.&lt;/li&gt;
  &lt;li&gt;Let &lt;strong&gt;&lt;em&gt;I&lt;/em&gt;&lt;/strong&gt; be the amount of I/O time used by the handler, in milliseconds.&lt;/li&gt;
  &lt;li&gt;Let &lt;strong&gt;&lt;em&gt;W&lt;/em&gt;&lt;/strong&gt; be the wall clock time it takes for a handler to execute.
    By definition, &lt;strong&gt;&lt;em&gt;W = I + C&lt;/em&gt;&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Let &lt;strong&gt;&lt;em&gt;N&lt;/em&gt;&lt;/strong&gt; be the number of threads running in the threaded system.&lt;/li&gt;
  &lt;li&gt;Let &lt;strong&gt;&lt;em&gt;E&lt;/em&gt;&lt;/strong&gt; be the throughput of the event driven system.&lt;/li&gt;
  &lt;li&gt;Let &lt;strong&gt;&lt;em&gt;T&lt;/em&gt;&lt;/strong&gt; be the throughput of the threaded system.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Given that the times are measured in milliseconds, we can define &lt;img src=&quot;/teddziuba/images/node/eq1.png&quot;&gt; and &lt;img src=&quot;/teddziuba/images/node/eq2.png&quot;&gt;.&lt;/p&gt;

&lt;p&gt;Since the wall time &lt;em&gt;&lt;strong&gt;W&lt;/strong&gt;&lt;/em&gt; is expressible in terms of CPU time &lt;em&gt;&lt;strong&gt;C&lt;/strong&gt;&lt;/em&gt; and I/O time &lt;em&gt;&lt;strong&gt;I&lt;/strong&gt;&lt;/em&gt;,
and considering that both &lt;em&gt;&lt;strong&gt;C&lt;/strong&gt;&lt;/em&gt; and &lt;em&gt;&lt;strong&gt;I&lt;/strong&gt;&lt;/em&gt; are positive, nonzero, it is helpful to define &lt;img src=&quot;/teddziuba/images/node/eq3.png&quot;&gt;, with the factor &lt;em&gt;&lt;strong&gt;k&lt;/strong&gt;&lt;/em&gt; expressing the relationship between &lt;em&gt;&lt;strong&gt;C&lt;/strong&gt;&lt;/em&gt; and &lt;em&gt;&lt;strong&gt;I&lt;/strong&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It follows then that &lt;img src=&quot;/teddziuba/images/node/eq4.png&quot;&gt; and &lt;img src=&quot;/teddziuba/images/node/eq5.png&quot;&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;THEOREM 1.&lt;/strong&gt; When a handler takes more CPU time than I/O time, an event-driven system has greater throughput than a threaded system if and only if the threaded system has exactly one thread.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;PROOF (partial).&lt;/em&gt; &lt;small&gt;(note: for brevity, I will only prove one direction. The other direction is an exercise left for the reader.)&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;Suppose &lt;img src=&quot;/teddziuba/images/node/eq6.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Simplifying the inequality, &lt;img src=&quot;/teddziuba/images/node/eq7.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Given that &lt;img src=&quot;/teddziuba/images/node/eq8.png&quot;&gt;, we can bound the inner term &lt;img src=&quot;/teddziuba/images/node/eq9.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;Further simplifying &lt;img src=&quot;/teddziuba/images/node/eq10.png&quot;&gt;&lt;/p&gt;

&lt;p&gt;Since &lt;em&gt;&lt;strong&gt;N&lt;/strong&gt;&lt;/em&gt; is integral and nonzero, it follows that &lt;img src=&quot;/teddziuba/images/node/eq11.png&quot;&gt;.&lt;/p&gt;

&lt;h5&gt;&lt;img src=&quot;/teddziuba/images/nodejs.png&quot;&gt;If you do more CPU than I/O, use threads.&lt;/h5&gt;

&lt;p&gt;&lt;strong&gt;THEOREM 2.&lt;/strong&gt; When the handler takes more I/O time than CPU time, an event-driven system has greater throughput than a threaded system if and only if &lt;img src=&quot;/teddziuba/images/node/eq12.png&quot;&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;PROOF (partial).&lt;/em&gt; &lt;small&gt;(note: again for brevity, I will only prove one direction.)&lt;/small&gt;&lt;/p&gt;
&lt;p&gt;Given our previous construction, &lt;img src=&quot;/teddziuba/images/node/eq7.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;and the alternate expression &lt;img src=&quot;/teddziuba/images/node/eq13.png&quot;&gt;&lt;/p&gt;
&lt;p&gt;it follows that &lt;img src=&quot;/teddziuba/images/node/eq14.png&quot;&gt;.&lt;/p&gt;

&lt;h5&gt;&lt;img src=&quot;/teddziuba/images/nodejs.png&quot;&gt;If you do more I/O than CPU, use more threads.&lt;/h5&gt;
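&lt;p&gt;If you'd rather see that threshold as code than squint at an inequality, here is a minimal sketch. It assumes the model is &lt;em&gt;E = 1000/C&lt;/em&gt; and &lt;em&gt;T = 1000N/(C + I)&lt;/em&gt;, which is the arithmetic the practical example uses. The function name &lt;code&gt;breakEvenThreads&lt;/code&gt; and the 20 ms / 60 ms handler are made up for illustration.&lt;/p&gt;

```javascript
// Assumed model: E = 1000 / C and T = N * 1000 / (C + I),
// with C and I in milliseconds.
// Setting E = T and solving for N gives the break-even thread count.
function breakEvenThreads(cpuMs, ioMs) {
  return (cpuMs + ioMs) / cpuMs; // same as 1 + I / C
}

// Hypothetical handler: 20 ms of CPU, 60 ms of I/O.
const threadsToMatch = breakEvenThreads(20, 60);
console.log(threadsToMatch); // 4
```

&lt;p&gt;Under those assumptions, a single event loop only comes out ahead while the thread pool is smaller than that number. Past it, the threads pull away.&lt;/p&gt;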

&lt;h3&gt;A Practical Example&lt;/h3&gt;

&lt;p&gt;Let's suppose you have a request handler that does 10 milliseconds of CPU work and 50 milliseconds of database I/O. Would you choose threads or events?&lt;/p&gt;

&lt;p&gt;In this case, the theoretical maximum throughput of the event driven system is &lt;em&gt;&lt;strong&gt;1000/10 = 100 QPS&lt;/strong&gt;&lt;/em&gt;, whereas a threaded system with 50 threads has a theoretical maximum throughput of &lt;em&gt;&lt;strong&gt;50,000/60 = 833.33 QPS&lt;/strong&gt;&lt;/em&gt;. Of course, in the threaded case, you need to worry about being bound by the CPU, but given the number of cores on modern hardware, threads seem like a winner here.&lt;/p&gt;
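&lt;p&gt;The same arithmetic, as a minimal sketch you can poke at with your own numbers. This is the model only, not a benchmark, and the function names &lt;code&gt;eventThroughput&lt;/code&gt; and &lt;code&gt;threadedThroughput&lt;/code&gt; are invented for illustration.&lt;/p&gt;

```javascript
// Theoretical maximum QPS for a single event loop: the loop cannot start the
// next handler until the current one's C milliseconds of CPU work are done.
function eventThroughput(cpuMs) {
  return 1000 / cpuMs;
}

// Theoretical maximum QPS for N threads, each tied up for W = C + I
// milliseconds of wall clock time per request.
function threadedThroughput(cpuMs, ioMs, threads) {
  return (threads * 1000) / (cpuMs + ioMs);
}

const C = 10; // CPU time per request, in milliseconds
const I = 50; // I/O time per request, in milliseconds
const N = 50; // number of threads

console.log(eventThroughput(C).toFixed(2));          // 100.00
console.log(threadedThroughput(C, I, N).toFixed(2)); // 833.33
```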

&lt;h3&gt;Multiple Event Workers&lt;/h3&gt;

&lt;p&gt;The Noders got really into this one: forking &quot;workers&quot; from your event loop to do the heavy CPU work, and having them call back to the event loop when they're done. One parent process coordinates work among many children? Where have I heard that before?&lt;/p&gt;

&lt;p&gt;Anyhow, let's extend the model to that case. Just for funsies.&lt;/p&gt;

&lt;p&gt;Since your asynchronous processes do not block on I/O, at full utilization, they will theoretically take 100% of the CPU. Therefore, the number of worker processes to spawn must be equal to the number of CPUs in the system to avoid oversubscribing the machine. Let's introduce a new variable, &lt;em&gt;&lt;strong&gt;M&lt;/strong&gt;&lt;/em&gt;, to represent the number of CPUs in the computer.&lt;/p&gt;

&lt;p&gt;The throughput formula for the event driven system therefore becomes &lt;img src=&quot;/teddziuba/images/node/eq15.png&quot;&gt;&lt;/p&gt;

&lt;p&gt;Now, with threads, we also need to avoid oversubscribing the CPU. Considering that during a single handler execution, only &lt;em&gt;&lt;strong&gt;C&lt;/strong&gt;&lt;/em&gt; milliseconds of CPU are used, it follows that the number of threads that will achieve theoretical maximum utilization is &lt;img src=&quot;/teddziuba/images/node/eq16.png&quot;&gt;.&lt;/p&gt;

&lt;p&gt;Our formula for the threaded system's throughput is therefore &lt;img src=&quot;/teddziuba/images/node/eq17.png&quot;&gt;&lt;/p&gt;

&lt;p&gt;...but look at this: &lt;img src=&quot;/teddziuba/images/node/eq18.png&quot;&gt;&lt;/p&gt;

&lt;h5&gt;&lt;img src=&quot;/teddziuba/images/nodejs.png&quot;&gt;At full utilization, threads and events have the same theoretical throughput.&lt;/h5&gt;

&lt;p&gt;This makes intuitive sense: if the CPUs are working as hard as they can, then all else being equal, they should yield the same performance regardless of the framework used.&lt;/p&gt;
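&lt;p&gt;To check the equality with actual numbers, here is a rough sketch. The three formulas in the comments are my reading of the model described in this section, and the function names and the 4-CPU machine are hypothetical.&lt;/p&gt;

```javascript
// Assumed model, with C and I in milliseconds and M CPUs:
//   event workers:  E = M * 1000 / C       (one worker per CPU)
//   thread count:   N = M * (C + I) / C    (just enough to keep M CPUs busy)
//   threads:        T = N * 1000 / (C + I)
function workerThroughput(cpus, cpuMs) {
  return (cpus * 1000) / cpuMs;
}

function saturatingThreads(cpus, cpuMs, ioMs) {
  return (cpus * (cpuMs + ioMs)) / cpuMs;
}

function threadPoolThroughput(threads, cpuMs, ioMs) {
  return (threads * 1000) / (cpuMs + ioMs);
}

// Hypothetical machine: 4 CPUs, handler with 10 ms of CPU and 50 ms of I/O.
const threads = saturatingThreads(4, 10, 50);
console.log(threads);                               // 24
console.log(workerThroughput(4, 10));               // 400
console.log(threadPoolThroughput(threads, 10, 50)); // 400
```

&lt;p&gt;Substitute the thread count into the threaded formula and the &lt;em&gt;(C + I)&lt;/em&gt; terms cancel, which is why both sides land on the same number.&lt;/p&gt;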

&lt;h5&gt;&lt;img src=&quot;/teddziuba/images/nodejs.png&quot;&gt;Hold up, this does not prove that Node is good.&lt;/h5&gt;

&lt;p&gt;Of course, in a practical setting, threads have a greater memory overhead, and programming an event loop with multiple workers just seems silly: if you're doing that much CPU work in an event looped system, you've already fucked up somewhere, so why add to it?&lt;/p&gt;

&lt;h3&gt;Node.js Is Still Cancer&lt;/h3&gt;
&lt;p&gt;So, let's review.&lt;/p&gt;

&lt;p&gt;Suppose you're a less-than-expert programmer, which Node seems to attract in droves for some reason. You are using Node for the supposed &quot;scalability&quot; of it, but as we have just seen, threaded programming, which is easier to understand than callback driven programming, meets or exceeds the asynchronous model in the vast majority of cases. Chances are, you're not going to be forking worker processes to do CPU jobs, what with the less-than-expert and all.&lt;/p&gt;

&lt;p&gt;Therefore, the reason you're using Node is not a lack of technical ability, it's because all the cool kids are doing it.&lt;/p&gt;

&lt;h5&gt;&lt;img src=&quot;/teddziuba/images/nodejs.png&quot;&gt;Node.js is a danger to novice programmers.&lt;/h5&gt;

&lt;p&gt;Next, suppose you're an expert programmer, and you've got some CPU bound work that you fork off to child processes to keep your event loop trucking. OK man, how complicated do you want to make this thing? At full capacity, you're at par with threads, provided it's not memory bound. At this point, you are less focused on solving the problem at hand than you are on coming up with something you can blog about and get on programming Reddit.&lt;/p&gt;

&lt;h5&gt;&lt;img src=&quot;/teddziuba/images/nodejs.png&quot;&gt;If you're forking workers in Node, you've got bigger problems.&lt;/h5&gt;

&lt;p&gt;Plus, it's fucking &lt;em&gt;JavaScript&lt;/em&gt; ... on the &lt;em&gt;server&lt;/em&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Node.js is Cancer</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/10/node-js-is-cancer.html"/>
   <updated>2011-10-01T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/10/node-js-is-cancer</id>
   <content type="html">&lt;p&gt;
&lt;img class=&quot;post-lead-image&quot; src=&quot;/teddziuba/images/a-toddler-is-like-a-raccoon-that-knows-how-to-lie.jpg&quot;&gt;
If there's one thing web developers love, it's knowing better than conventional wisdom, but conventional wisdom is conventional for a reason: &lt;em&gt;&lt;strong&gt;that shit works&lt;/strong&gt;&lt;/em&gt;. Something's been bothering me for a while about this node.js nonsense, but I never took the time to figure it out until I read this &lt;a href=&quot;https://plus.google.com/115094562986465477143/posts/Di6RwCNKCrf&quot; rel=&quot;nofollow&quot;&gt;butthurt post&lt;/a&gt; from Ryan Dahl, Node's creator. I was going to shrug it off as just another jackass who whines because Unix is hard. But, like a police officer who senses that something isn't quite right about the family in a minivan he just pulled over and discovers fifty kilos of black horse heroin in the back, I thought that something &lt;em&gt;wasn't quite right&lt;/em&gt; about this guy's aw-shucks sob story, and that maybe, just maybe, he has no idea what he is doing, and has been writing code unchecked for years.
&lt;/p&gt;

&lt;p&gt;Since you're reading about it here, you probably know how my hunch turned out.&lt;/p&gt;

&lt;p&gt;Node.js is a tumor on the programming community, in that not only is it completely braindead, but the people who use it go on to infect other people who can't think for themselves, until eventually, every asshole I run into wants to tell me the gospel of event loops. &lt;em&gt;Have you accepted epoll into your heart?&lt;/em&gt;
&lt;/p&gt;

&lt;h3&gt;A Scalability Disaster Waiting to Happen&lt;/h3&gt;
&lt;p&gt;Let's start with the most horrifying lie: that node.js is scalable because it &quot;never blocks&quot; &lt;em&gt;(Radiation is good for you! We'll &lt;a href=&quot;http://www.orau.org/ptp/collection/quackcures/toothpaste.htm&quot;&gt;put it in your toothpaste!&lt;/a&gt;)&lt;/em&gt;. On the Node home page, they say this:


&lt;blockquote&gt;
Almost no function in Node directly performs I/O, so the process never blocks. Because nothing blocks, less-than-expert programmers are able to develop fast systems.
&lt;/blockquote&gt;

This statement is enticing, encouraging, and completely fucking wrong.&lt;/p&gt;

&lt;p&gt;Let's start with a definition, because you Reddit know-it-alls keep your specifics in the pedantry. A function call is said to &lt;strong&gt;block&lt;/strong&gt; when the current thread of execution's flow waits until that function is finished before continuing. Typically, we think of I/O as &quot;blocking&quot;, for example, if you are calling &lt;code&gt;socket.read()&lt;/code&gt;, the program will wait for that call to finish before continuing, as you need to do something with the return value.&lt;/p&gt;

&lt;p&gt;Here's a fun fact: every function call that does CPU work also blocks. This function, which calculates the n'th Fibonacci number, will block the current thread of execution because it's using the CPU.

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;js&quot;&gt;&lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;fibonacci&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;fibonacci&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;fibonacci&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;em&gt;(Yes, I know there's a closed form solution. Shouldn't you be in front of a mirror somewhere, figuring out how to introduce yourself to her?)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let's see what happens to a node.js program that has this little gem as its request handler:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;js&quot;&gt;&lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;http&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;require&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;http&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;nx&quot;&gt;http&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;createServer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;req&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;writeHead&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&amp;#39;Content-Type&amp;#39;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&amp;#39;text/plain&amp;#39;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;// Current Node versions require a string or Buffer here, hence String().&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;res&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;fibonacci&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;40&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)));&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;listen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1337&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;127.0.0.1&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;
On my older laptop, this is the result:

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;console&quot;&gt;&lt;span class=&quot;gp&quot;&gt;ted@lorenz:~$&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;curl http://localhost:1337/
&lt;span class=&quot;go&quot;&gt;165580141&lt;/span&gt;
&lt;span class=&quot;go&quot;&gt;real	0m5.676s&lt;/span&gt;
&lt;span class=&quot;go&quot;&gt;user	0m0.010s&lt;/span&gt;
&lt;span class=&quot;go&quot;&gt;sys	0m0.000s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


5 second response time. Cool. So we all know JavaScript isn't a terribly fast language, but why is this such an indictment? It's because Node's evented model and brain damaged fanboys make you think everything is OK. In really abusive pseudocode, this is how an event loop works:
&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;cpp&quot;&gt;&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;ready_file_descriptor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;event_library&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;poll&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;handle_request&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ready_file_descriptor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;
That's all well and good if you know what you're doing, but when you apply this to a server problem, you've pluralized that shit. If this loop is running in the same thread that &lt;code&gt;handle_request&lt;/code&gt; is in, any programmer with a pulse will notice that &lt;em&gt;the request handler can hold up the event loop&lt;/em&gt;, no matter how asynchronous your library is.&lt;/p&gt;
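&lt;p&gt;To make that concrete, here is a minimal sketch using Python's &lt;code&gt;asyncio&lt;/code&gt; instead of Node, so treat it as an analogy and not a benchmark. The &lt;code&gt;fib&lt;/code&gt; and &lt;code&gt;handle_request&lt;/code&gt; names are made up for illustration. The handler never awaits anything, so the loop gets no chance to start the next request until the current one is finished.&lt;/p&gt;

```python
import asyncio


def fib(n):
    """Deliberately slow recursive Fibonacci, standing in for CPU-bound work."""
    return n if n in (0, 1) else fib(n - 1) + fib(n - 2)


async def handle_request(name, log):
    log.append(f"{name} start")
    fib(18)  # no await in here: the event loop cannot run anything else
    log.append(f"{name} done")


async def main():
    log = []
    # Submit three requests "concurrently" on a single-threaded event loop.
    await asyncio.gather(*(handle_request(f"req{i}", log) for i in range(3)))
    return log


log = asyncio.run(main())
print(log)
```

&lt;p&gt;The log comes out strictly serialized: each request starts only after the previous one is done, even though all three were handed to the loop at once.&lt;/p&gt;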

&lt;p&gt;So, given that, let's see how my little node server behaves under the most modest load, 10 requests, 5 concurrent:

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;console&quot;&gt;&lt;span class=&quot;gp&quot;&gt;ted@lorenz:~$&lt;/span&gt; ab -n 10 -c 5 http://localhost:1337/
&lt;span class=&quot;go&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;go&quot;&gt;Requests per second:    0.17 [#/sec] (mean)&lt;/span&gt;
&lt;span class=&quot;go&quot;&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;



&lt;em&gt;0.17 queries per second&lt;/em&gt;. Diesel. Sure, Node allows you to fork child processes, but at that point your threading/event model is so tightly coupled that you've got bigger problems than scalability.&lt;/p&gt;
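&lt;p&gt;That number is about what arithmetic alone predicts. A back-of-the-envelope sketch, assuming every request costs the same 5.676 seconds that curl measured above and that the handlers run strictly one after another (the variable names are mine):&lt;/p&gt;

```python
SERVICE_TIME_S = 5.676  # wall-clock time of the single curl request above
REQUESTS = 10           # ab -n 10; the -c 5 concurrency buys nothing here

# One thread, CPU-bound handlers: requests finish one after another.
total_time_s = REQUESTS * SERVICE_TIME_S
throughput = REQUESTS / total_time_s

print(f"{total_time_s:.2f} s total, {throughput:.2f} requests/sec")
```

&lt;p&gt;Under those assumptions the estimate lands at roughly 0.18 requests per second, in the same neighborhood as the 0.17 that ab reported.&lt;/p&gt;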

&lt;p&gt;Considering Node's original selling point, I'm God Damned terrified of any &quot;fast systems&quot; that &quot;less-than-expert programmers&quot; bring into this world.&lt;/p&gt;

&lt;h3&gt;Node Punishes Developers Because it Disobeys the Unix Way&lt;/h3&gt;

&lt;p&gt;A long time ago, the original neckbeards decided that it was a good idea to chain together small programs that each performed a specific task, and that the universal interface between them should be text.&lt;/p&gt;

&lt;p&gt;If you develop on a Unix platform and you abide by this principle, the operating system will reward you with simplicity and prosperity. As an example, when web applications first began, the &lt;em&gt;web application&lt;/em&gt; was just a program that printed text to standard output. The &lt;em&gt;web server&lt;/em&gt; was responsible for taking incoming requests, executing this program, and returning the result to the requester. We called this CGI, and it was a good way to do business until the micro-optimizers sank their grubby meathooks into it.&lt;/p&gt;
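&lt;p&gt;For anyone who never saw one, a CGI program really is that dumb. A minimal sketch in Python, assuming the web server hands over the query string in the &lt;code&gt;QUERY_STRING&lt;/code&gt; environment variable, as CGI does; the &lt;code&gt;render&lt;/code&gt; function is a name I made up:&lt;/p&gt;

```python
import os
import sys


def render(environ):
    """Build a complete CGI response: headers, a blank line, then the body."""
    query = environ.get("QUERY_STRING", "")
    body = f"Hello from CGI. Query string: {query!r}\n"
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    # The web server runs this program and relays whatever lands on stdout.
    sys.stdout.write(render(os.environ))
```

&lt;p&gt;Everything about sockets, HTTP parsing and connection handling stays in the web server; the application only prints text.&lt;/p&gt;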

&lt;p&gt;Conceptually, this is how any web application architecture that's not cancer still works today: you have a web server program whose job is to accept incoming requests, parse them, and figure out the appropriate action to take. That can be either serving a static file, running a CGI script, proxying the connection somewhere else, whatever. The point is that the HTTP server isn't the same entity doing the application work. Developers who have been around the block call this &lt;em&gt;separation of responsibility&lt;/em&gt;, and it exists for a reason: loosely coupled architectures are very easy to maintain.&lt;/p&gt;

&lt;p&gt;And yet, Node seems oblivious to this. Node has (and don't laugh, I am not making this shit up) its own HTTP server, &lt;em&gt;and that's what you're supposed to use to serve production traffic&lt;/em&gt;. Yeah, that example above where I called &lt;code&gt;http.createServer()&lt;/code&gt;, that's the preferred setup.&lt;/p&gt;

&lt;p&gt;If you search around for &quot;node.js deployment&quot;, you find a bunch of people putting Nginx in front of Node, and some people use a thing called Fugue, which is another JavaScript HTTP server that forks a bunch of processes to handle incoming requests, as if somebody maybe thought that this &quot;nonblocking&quot; snake oil might have an issue with CPU-bound performance.&lt;/p&gt;

&lt;p&gt;If you're using Node, there's a 99% probability that you are both the developer and the system administrator, because any system administrator would have talked you out of using Node in the first place. So you, the developer, must face the punishment of setting up this HTTP proxying orgy if you want to put a real web server in front of Node for things like serving statics, query rewriting, rate limiting, load balancing, SSL, or any of the other futuristic things that modern HTTP servers can do. That, and it's another layer of health checks that your system will need.&lt;/p&gt;

&lt;p&gt;Although, let's be honest with ourselves here, if you're a Node developer, you are probably serving the application directly from Node, running in a screen session under your account.&lt;/p&gt;

&lt;h3&gt;It's Fucking &lt;em&gt;JavaScript&lt;/em&gt;&lt;/h3&gt;

&lt;p&gt;This is probably the worst thing any server-side framework can do: be written in JavaScript.

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;js&quot;&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;typeof&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;my_var&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!==&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;undefined&amp;quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;my_var&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!==&lt;/span&gt; &lt;span class=&quot;kc&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;// you idiots put Rasmus Lerdorf to shame&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


What is this I don't even...
&lt;/p&gt;

&lt;h3&gt;tl;dr&lt;/h3&gt;
&lt;p&gt;Node.js is an unpleasant software library and I will not use it.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>The Craigslist Reverse Programmer Troll</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/07/the-craigslist-reverse-programmer-troll.html"/>
   <updated>2011-07-13T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/07/the-craigslist-reverse-programmer-troll</id>
   <content type="html">&lt;p&gt;&lt;img class=&quot;post-lead-image&quot; src=&quot;/teddziuba/images/you-can-lead-a-whore-to-culture-but-you-cant-make-her-think--also-applies-to-software-developers.jpg&quot;&gt; Stop me if you have heard this before. I'm a business guy, not so good on the technical side, and I've got a great idea that I need a programmer to develop for me. I don't have any funding yet, but I've got a really nebulous connection to the venture capital world. That being said, I'll start paying you once we get funding or we start making a lot of money from the project! All you need to do is write a Facebook clone in 2 weeks. For a smart programmer like you, that should be easy, right? I'll also cut you in on a little bit of equity. Let's get started!&lt;/p&gt;

&lt;p&gt;This kind of shit lands on Craigslist so often that it makes you wonder what they actually teach at business schools. (Side note: I recently learned that if you earn $1, you get to multiply that by about 20, the price/earnings ratio, meaning that the $1 you've earned is actually $20 in value. Makes me wonder how MBAs do differential equations.) It's time for us programmers to take revenge. So, a couple of months ago, I did a reverse-programmer troll on Craigslist. It went something like this:&lt;/p&gt;

&lt;p&gt;
&lt;blockquote&gt;
&lt;p&gt;
Title:(computer gigs) Looking for Tech Idea Person
&lt;/p&gt;
&lt;p&gt;
I am a computer programmer looking for a top-notch idea person to help build the next great internet company.
Being a good programmer, I don't have many business ideas of my own. That's where you come in.
&lt;/p&gt;
&lt;p&gt;
The perfect idea person to work with me will have:
&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A great business idea!&lt;/li&gt;
  &lt;li&gt;At least a passing knowledge of computers and the internet.&lt;/li&gt;
  &lt;li&gt;A vague reference to knowing somebody in the venture capital industry.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;
I have whipped up peoples' ideas very quickly in the past.
 Here is a list of some of the things I've built, and how long it's taken:
&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;A facebook clone (4 days)&lt;/li&gt;
  &lt;li&gt;A flickr-like photo sharing web site (3 days)&lt;/li&gt;
  &lt;li&gt;A Google-like search engine (2 weeks - longer because you have to stop spam!)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;
And hopefully, I can add YOUR idea to the list!
&lt;/p&gt;

&lt;p&gt;
Note that I have a structured settlement from a lawsuit that covers my basic expenses,
so I don't need to be paid, but of course once we start making money, I'd like to be paid.  Since I put the code together for these web sites so quickly,
it's not fair to do a 50/50 split of ownership,
but I would like to have at least a couple of percent.
&lt;/p&gt;
&lt;/blockquote&gt;


&lt;p&gt;Too obvious, right? This can't possibly generate any responses, I thought. Nope. 31 replies in about 2 hours, before Craigslist pulled the post. Here are some of the highlights:&lt;/p&gt;


&lt;blockquote&gt;
Hi there,
&lt;br/&gt;
I have been looking for a Man like you for years now.
&lt;br/&gt;
i have a few superb ideas which have big potential.
&lt;br/&gt;
please share you phone number.
&lt;br/&gt;
we will discuss more about it.
&lt;br/&gt;
thanks,
&lt;br/&gt;
SP.
&lt;/blockquote&gt;

&lt;p&gt;Swingers have a term, 'unicorn'. Look it up. It's called a unicorn for a reason.&lt;/p&gt;




&lt;blockquote&gt;
We are introducing a similar site to Groupon.com. We are currently speaking with [VC I have never heard of] of [fund I have never heard of], who is interested in funding our project after beta is up. We businesses people who have been extremely successful in our past positions, looking for someone like you. We to get our site up and running, I already have one programmer. I would love to speak with you more, can we arrange a meeting at my Fremont office?
&lt;br/&gt;&lt;br/&gt;
[Business Dude]
&lt;/blockquote&gt;

&lt;p&gt;How refreshingly original. Do go on.&lt;/p&gt;




&lt;blockquote&gt;
Hi,
&lt;br/&gt;
I'm in the middle of a business plan and looking for co-founders to develop a web appliance for the medical industry.   Do you know how to develop on a LAMP platform?   Also, please let me know where you are located so we can arrange a meeting.
&lt;br/&gt;
Cheers
&lt;br/&gt;
--j
&lt;/blockquote&gt;

&lt;p&gt;LAMP? Web appliance? Medical industry? Excuse me, I'm getting flustered by how &lt;em&gt;awesome&lt;/em&gt; this idea probably is. Either that or it's the taste of bile. Can't tell.&lt;/p&gt;


&lt;blockquote&gt;
People are afraid of lonely but dont't want to go out to street. People want to share and want
communicate.  That why social network is sucess such as facebook, youtube ...
&lt;/blockquote&gt;
&lt;p&gt;Has anyone really been far even as decided to use even go want to do look more like?&lt;/p&gt;


&lt;blockquote&gt;
Hello ,
&lt;br/&gt;
I am happy to contact you with a set of fresh offerings.&lt;br/&gt;
- Saas&lt;br/&gt;
- Web 2.0&lt;br/&gt;
- Enterprise 2.0&lt;br/&gt;
- Open source consulting and Implementation&lt;br/&gt;
&lt;br/&gt;
We are sure you will find it compelling. We have full time web developers and designers working with us in various technology stacks &amp; moreover you can also hire them according to your need and get your stuff completed from your virtual team working dedicatedly for you fulltime (We can arrange for a telephonic Interview also with these designers &amp; developers).  We do have a global presence catering clients in US, UK, Australia and Canada.  We have tons of expertise in developing  Web 2.0, Action Script, AJAX, CSS, C, C++, JavaScript, XML, PHP, Joomla,Drupal,Wordpress, E commerce, SEO and .NET Framework, Mobile application  based projects.
&lt;br/&gt;&lt;br/&gt;
We can help with designing team who as expertise on designing tools such as ADOBE Suites,CS3,CS4,Flash,HTML etc
&lt;br/&gt;&lt;br/&gt;
It will be great if you can have a short call/chat for a better understanding. As we work round the clock, time zone will not be a problem. Please let me know your time of convenience and a number / Skype id through which I can reach you..
&lt;br/&gt;&lt;br/&gt;
For Web Application development we work for $12/hr and for Mobile application development we work for $15/hr.
&lt;br/&gt;&lt;br/&gt;
Price will never be a deal breaker its always negotiable depending upon the project requirement.
&lt;br/&gt;&lt;br/&gt;
Looking forward to hear from you for a win win business relationship.
&lt;/blockquote&gt;

&lt;p&gt;Dammit, I knew I should have listed &lt;em&gt;reading comprehension&lt;/em&gt; as a requirement.&lt;/p&gt;



&lt;blockquote&gt;
You Can't Be Serious.&lt;br/&gt;
You sound like what every CL poster's dream, a poster dream&lt;br/&gt;
You should have added links.
&lt;/blockquote&gt;
&lt;p&gt;This was from a fellow programmer who apparently got the joke. Jim F, keep on keepin' on, my brother.&lt;/p&gt;


&lt;p&gt;Not long after that, Craigslist pulled the post, or, &lt;em&gt;flagged and removed&lt;/em&gt;, which is the jargon for &lt;em&gt;troll detected&lt;/em&gt;. Oh well, it was fun for a while.&lt;/p&gt;



</content>
 </entry>
 
 <entry>
   <title>The Most Important Concept in Systems Design</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/06/most-important-concept-systems-design.html"/>
   <updated>2011-06-30T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/06/most-important-concept-systems-design</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/teddziuba/images/the-other-knowledge-i-will-bestow-is-to-never-squander-an-opportunity-to-piss.jpg&quot; class=&quot;post-lead-image&quot;/&gt;

About a year ago, I hit a turning point in my career as a programmer after reading &lt;em&gt;&lt;a href=&quot;http://www.catb.org/~esr/writings/taoup/html/&quot;&gt;The Art of Unix Programming&lt;/a&gt;&lt;/em&gt; by Eric S. Raymond. It was written before web apps became popular, but everything in it still applies. ESR tries to communicate the Unix Way to programmers, and if you can read and understand that book, you'll have an &lt;em&gt;A-HA!&lt;/em&gt; moment where, for just a second, everything will make sense. It's kind of like the first time you smoke pot and realize that everyone and everything are made out of atoms, or the first time you figure out how fuckin' magnets work.
&lt;/p&gt;

&lt;p&gt;Then, of course, the understanding is gone, and you spend a long time trying to get it back.&lt;/p&gt;

&lt;p&gt;If you don't feel like reading it, the most important take-away for web programmers is the &lt;strong&gt;Single Point of Truth Rule&lt;/strong&gt;, that is, &lt;em&gt;&quot;every piece of knowledge must have a single, unambiguous, authoritative representation within a system&quot;&lt;/em&gt;. If you design a system that violates this rule, you are setting yourself up for endless headaches and disasters.&lt;/p&gt;

&lt;h3&gt;Stuff You Need to Stop Doing&lt;/h3&gt;

&lt;p&gt;There are a couple of really common SPOT violations I've seen in the web world, and they almost all revolve around misconceptions about databases. No, wait, strike that. Almost every SPOT violation I've seen stems, in one way or another, from MySQL's failure at being a database. Actually, most of the NoSQL catastrophe is a product of MySQL's perverse representation of the SQL spec, but that's not the drum I'm here to beat.&lt;/p&gt;

&lt;h4&gt;1. Using Solr to Search a Database&lt;/h4&gt;

&lt;p&gt;I know you have all gone through this same thought process at least once. I've gone through it twice. You need to add search to your database-backed website, and decide to use Apache Solr. You set up an &quot;indexing pipeline&quot; that either sucks information from your database regularly and throws it into Solr, or you have a single entry point for inserts/updates that updates the DB as well as Solr.&lt;/p&gt;

&lt;p&gt;The problem is that your application doesn't know what the single point of truth for records is.  If results come back from Solr, do you display the data that came from the search index? Do you take those row IDs, query the database, and then display the DB-backed data? Any way you try to solve this problem, it gets painful.  And, oh Jesus, what happens if your indexing bridge program fails? How long will it go on failing before you notice that something is wrong?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution: &lt;/strong&gt; &lt;em&gt;Don't do that.&lt;/em&gt; PostgreSQL actually has pretty awesome full-text search capabilities. You can layer your query tuning on top of it if you like; that doesn't violate SPOT. Hell, you could even use Solr as your database, if it will satisfy the requirements.&lt;/p&gt;

&lt;h4&gt;2. Precaching SQL Results in a NoSQL System&lt;/h4&gt;
&lt;p&gt;This is somewhat less common, but still happens. You've got all your data in MySQL with a great object-relational model, but querying it from your web app involves a 3 or 4 way JOIN, which causes MySQL to choke, as it hasn't yet learned how to open its throat all the way. So, to fix it, you precompute some data structures and store them in something like Memcached or Redis.&lt;/p&gt;

&lt;p&gt;Durr, what if something changes? Do you recompute the data structure on the fly? Update the NoSQL and queue the SQL write for later? You can rig up something so that it appears to work, but as soon as your &quot;sync&quot; script stops running, you're proper fucked, because you can't be sure which representation of the data is authoritative.&lt;/p&gt;
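&lt;p&gt;Here is the failure in miniature. This is a toy sketch, with plain dictionaries standing in for MySQL and for the Memcached/Redis copy; the keys, fields and the &lt;code&gt;precompute&lt;/code&gt; helper are all invented for illustration.&lt;/p&gt;

```python
# Authoritative store (stand-in for the SQL database).
database = {"user:1": {"name": "Ted", "posts": 2}}

# Precomputed copies (stand-in for Memcached or Redis).
cache = {}


def precompute(key):
    """Snapshot a record into the cache, the way a sync script would."""
    cache[key] = dict(database[key])


precompute("user:1")

# A write lands in the database after the last sync ran.
database["user:1"]["posts"] = 3

# The two representations now disagree, and nothing in the system says which wins.
stale = cache["user:1"]["posts"] != database["user:1"]["posts"]
print(stale)
```

&lt;p&gt;Until the sync runs again, readers of the cache and readers of the database see different answers to the same question.&lt;/p&gt;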

&lt;p&gt;&lt;strong&gt;The Solution: &lt;/strong&gt; &lt;em&gt;Don't do that.&lt;/em&gt; If your authoritative DB is too slow, either re-jigger your data model or change databases to one that will work. If you truly want to add a caching layer to your application, do it at the outer-most point, that is, in front of your web servers. In my experience, any attempt to cache below the presentation layer just leads to consistency disasters. Plus, caching at the edge is easy: you can use Squid or a commercial content delivery network.&lt;/p&gt;

&lt;h4&gt;3. Frankendatabases&lt;/h4&gt;
&lt;p&gt;OK, I've spit enough truth about MySQL. This is a problem that I've recently had to design around: you have information from multiple databases that you want to query in one database. If you work with a legacy system, you know what I'm talking about. There's such a sweaty-palmed temptation to write a sync script to pull data from database A, database B, and update records in your database.&lt;/p&gt;

&lt;p&gt;You get yourself into logic like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
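def sync_record(their_record):
  # Look up our local copy by the remote record's id (query() is a stand-in helper).
  my_record = query(&quot;SELECT * FROM records WHERE id = %s&quot;, their_record.id)
  if not my_record:
    insert_into_my_db(their_record)   # we have never seen this record
  elif their_record != my_record:
    update_my_db(their_record)        # our copy has drifted, so overwrite it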
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is a consistency nightmare, among other things. If the sync script breaks, blah blah blah, if you need to write &lt;em&gt;new&lt;/em&gt; records to your DB, you're just making the problem worse for the next guy who needs to do something similar, and of course, this will turn into a shit-ton of queries at update-o-clock.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution:&lt;/strong&gt; You guessed it: &lt;em&gt;Don't do that.&lt;/em&gt; Do your best to canonically consolidate the databases. Sometimes this isn't practical in legacy systems, so avoid the aggregation DB if you can - query both A and B directly. If you can't do that, then hide your shame somewhere out of the way.&lt;/p&gt;
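&lt;p&gt;If you do end up querying A and B directly, it can be as boring as this. A sketch only, using two in-memory SQLite databases as stand-ins for the legacy systems; the table layout, the &lt;code&gt;find_record&lt;/code&gt; helper, and the assumption that ids never collide between the two sources are all mine.&lt;/p&gt;

```python
import sqlite3


def make_db(rows):
    """Create a throwaway in-memory database holding the given (id, name) rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT)")
    db.executemany("INSERT INTO records VALUES (?, ?)", rows)
    return db


db_a = make_db([(1, "alice"), (2, "bob")])  # stand-in for legacy database A
db_b = make_db([(3, "carol")])              # stand-in for legacy database B


def find_record(record_id):
    """Ask each authoritative source in turn; nothing is copied or synced."""
    for source in (db_a, db_b):
        row = source.execute(
            "SELECT id, name FROM records WHERE id = ?", (record_id,)
        ).fetchone()
        if row is not None:
            return row
    return None


print(find_record(3))
```

&lt;p&gt;There is no third copy to drift out of date: each record still lives in exactly one place, and the read path just knows where to look.&lt;/p&gt;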

&lt;h3&gt;A Sign That You're Doing it Wrong&lt;/h3&gt;

&lt;p&gt;In general, if you find yourself writing a lot of plumbing code, or &quot;updater&quot; scripts, or if your crontab is longer than like 10 lines, chances are you've fucked up and have a SPOT violation somewhere in your architecture. Writing code like that is tedious and painful, and pain is generally a sign that you should stop doing something.&lt;/p&gt;

&lt;p&gt;So stop doing that.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Come to OSCON Data</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/06/oscon-data.html"/>
   <updated>2011-06-15T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/06/oscon-data</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/teddziuba/images/they-asked-for-something-edgy-so-30-minutes-before-the-show-im-going-to-drop-2-hits-of-high-powered-blotter-acid.jpg&quot; class=&quot;post-lead-image&quot;/&gt;Hey, what are you doing July 25th at around 1:30PM? I know what &amp;mdash; you're watching my live-action troll at OSCON Data in Portland, Oregon: deep inside bat country.&lt;/p&gt;

&lt;p&gt;The formal title of my talk is &quot;What Every Programmer Needs to Know About Disks&quot;, an overview of why everything you know about disk I/O is wrong, how vendors lie to you, and how a little knowledge of how disks work down to the hardware will make EC2 customers think &lt;em&gt;Jesus, this neighborhood's really gone to shit. I've got to get out of here before those fucking brutes throw a flower pot through the window and make off with my TV.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It's going to be geared toward Linux platform programmers, as Linux will do most of your job for you if you point it in the right direction. Bad systems administrators may learn a thing or two. If you want to show up to troll me, that's fine too, just remember &lt;em&gt;I'm the one with the microphone.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;You can &lt;a href=&quot;https://en.oreilly.com/oscon2011/public/register&quot;&gt;buy tickets here&lt;/a&gt;, and use the code &lt;strong&gt;os11fos&lt;/strong&gt; for 20% off. They're not paying me, I'm in it for the luls.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Amazon &amp;mdash; The Purpose of Pain</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/04/amazon-the-purpose-of-pain.html"/>
   <updated>2011-04-23T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/04/amazon-the-purpose-of-pain</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/teddziuba/images/juggalos-welcome-but-kings-of-leon-fans-need-not-apply.jpg&quot; class=&quot;post-lead-image&quot;/&gt;Pain is nature's way of telling you that you have just fucked up.  It's a hint to your future self that maybe you should never do that again. Yet, you dumbasses continue to host things full-bore in Amazon. Since its inception, EC2 has gone down, S3 has dropped off the face of the earth, and Amazon's Elastic Block Store bludgeons Reddit to death every few weeks, but yet every time, the &lt;a href=&quot;http://broadcast.oreilly.com/2011/04/the-aws-outage-the-clouds-shining-moment.html&quot; rel=&quot;nofollow&quot;&gt;apologists&lt;/a&gt; line up. (Since it's foolish to waste a perfectly good crisis, some jackass is even selling an &lt;a href=&quot;http://www.ablebots.com/ec2enabled/&quot;&gt;eBook&lt;/a&gt; about how to design your service around EC2 failures).&lt;/p&gt;

&lt;p&gt;There's a point where &quot;I told you so&quot; doesn't quite fit, so in light of Amazon's most recent aristocrats joke, let's explore some common myths about Cloud Computing that developers actually believe.&lt;/p&gt;


&lt;h3&gt;Myth 1: SLAs Are Meaningful&lt;/h3&gt;
&lt;p&gt;Amazon EC2 has a stated service level agreement of 99.95% uptime, yearly. As of right now, EC2's uptime is 99.23%, well below the SLA. Since computer programmers like to take a pathologically literal interpretation of the law and contracts, they usually don't understand the reality of such matters.&lt;/p&gt;
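&lt;p&gt;To put those percentages in hours, here is a quick sketch, assuming a plain 365-day year (the &lt;code&gt;downtime_hours&lt;/code&gt; helper is a made-up name):&lt;/p&gt;

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours(uptime_percent):
    """Hours per year a service is down at the given uptime percentage."""
    return (100 - uptime_percent) / 100 * HOURS_PER_YEAR


sla_hours = downtime_hours(99.95)     # what the SLA promises
actual_hours = downtime_hours(99.23)  # the uptime figure quoted above

print(f"SLA allows {sla_hours:.2f} h/year, actual is {actual_hours:.2f} h/year")
```

&lt;p&gt;That works out to about 4.38 hours of permitted downtime per year against roughly 67.45 hours at 99.23%, which is more than fifteen times the allowance.&lt;/p&gt;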

&lt;p&gt;&lt;em&gt;&quot;But, but, EC2 is violating their SLA! That can't happen!&quot;&lt;br/&gt;
&quot;It just did.&quot;&lt;br/&gt;
&quot;But, but...Segmentation Fault (core dumped)&quot;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The trouble with SLAs is that &lt;strong&gt;&lt;em&gt;shit happens&lt;/em&gt;&lt;/strong&gt; is not yet in the vernacular of modern jurisprudence.  You should never try to compare hosts based on SLA; compare them based on how they respond to downtime, because it will happen everywhere you go, without fail. For example, the machine that is serving you this web page is a physical box hosted by SoftLayer at a data center in Seattle. Last week, I had about an hour's worth of downtime because of some networking problems in their data center. Whatever, like I said, &lt;em&gt;shit happens&lt;/em&gt;. What I'm really looking for is communication. I logged a ticket with support, and in six minutes they updated me about the situation, how widespread it was, and an ETA on the fix. The tech also asked if there was anything else he could do for me. They restored connectivity quickly, but did not keep me in the dark about what was going on.&lt;/p&gt;

&lt;p&gt;Try that with Amazon. There's a &lt;a href=&quot;https://forums.aws.amazon.com/thread.jspa?threadID=65649&amp;tstart=0&quot;&gt;thread&lt;/a&gt; on the AWS forum where some genius decided to host safety critical software on EC2, and can't get his data up. The thread was posted on Friday, it's now Saturday, and with Sunday coming afterward, I'm pretty sure that nobody whose safety depends on EC2 is lookin' forward to the weekend. Now, maybe it's a troll, but not even a &quot;we're working on it&quot; reply?&lt;/p&gt;

&lt;h3&gt;Myth 2: Architecture Will Save You from Cloud Failures&lt;/h3&gt;

&lt;p&gt;Fault-tolerant architecture is a centerpiece of the NoSQL dog and pony show, but by and large, the programmers using it don't understand that the software depends on hardware. Note, I said &lt;em&gt;hardware&lt;/em&gt;, not &lt;em&gt;virtual machines&lt;/em&gt;. The trouble with using virtual machines is that your visibility into the actual metal of the device ends at the hypervisor.  There are certain things that software packages must hold sacrosanct, for example, the &lt;code&gt;fsync()&lt;/code&gt; system call, which instructs the kernel to make sure that data is written to physical disk. In virtual machine land, whether or not &lt;code&gt;fsync()&lt;/code&gt; does what it should is a bit of a mystery. This gets even more entertaining with Amazon Elastic Block Store, which, as the Reddit administrators have found, will happily accept calls to &lt;code&gt;fsync()&lt;/code&gt;, and lie to your face, saying that the data has been written to disk, when it may not have been.&lt;/p&gt;
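&lt;p&gt;For reference, this is roughly what the application side of that contract looks like. A minimal sketch in Python, where &lt;code&gt;durable_write&lt;/code&gt; and the file name are invented for illustration; &lt;code&gt;os.fsync()&lt;/code&gt; is the standard library's wrapper around the system call.&lt;/p&gt;

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes, then ask the kernel to push them to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to flush its cache to the device


# Usage: returning without an error only means every layer below reported success.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "commit.log")
    durable_write(target, b"COMMIT 42\n")
    with open(target, "rb") as f:
        contents = f.read()

print(contents)
```

&lt;p&gt;The application can do all of this correctly and still lose data if something underneath acknowledges the flush without performing it, and no code at this level can detect that.&lt;/p&gt;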

&lt;p&gt;No amount of architecture is going to save you from lying virtual hardware. Applications, especially databases, are built on the assumption that there is an atomic way to commit data to disk. Sure, there are problems with disk writeback caches sometimes, but anybody who knows what they are doing can check to see if it's actually going to be an issue. If you're running on Amazon's virtual machines, take a guess; it's turtles all the way down.&lt;/p&gt;


&lt;h3&gt;Myth 3: A Virtual Machine is an Appropriate Gift for All Occasions&lt;/h3&gt;

&lt;p&gt;This, perhaps, is the cause of such widespread service downtime &amp;mdash; developers who are hosting entire services full-bore on virtual machines. VMs have their place, sure, but they are by no means the solution to every hardware problem. In my experience, you should use VMs for:

&lt;ul&gt;
  &lt;li&gt;Web application servers&lt;/li&gt;
  &lt;li&gt;Offline data processing&lt;/li&gt;
  &lt;li&gt;Squid/Memcache servers&lt;/li&gt;
  &lt;li&gt;One-off utility computing&lt;/li&gt;
&lt;/ul&gt;

The general rule is that if the machine eats shit, nothing of value will be lost. Remember what I said about pain in the beginning? If you're hosting a database on a VM, well, at some point it will become abundantly obvious.&lt;/p&gt;

&lt;p&gt;The trouble with &quot;commodity&quot; computing is that servers are not really something that should be commoditized. There is so much variability in these devices that a &quot;six sizes fit all&quot; offering is insulting. The things you can do with disk controllers alone are more than worth the effort to colocate hardware in a data center. Personally, I prefer to solve problems with hardware rather than software, for example, throwing SSD drives in a database machine that is dogging it on disk I/O. It's much more business-efficient to throw money at performance problems than it is to throw code at them, but I guess some of you guys just really like to type.&lt;/p&gt;

&lt;p&gt;Personal preference, I suppose.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>MacOS X is an Unsuitable Platform for Web Development</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/03/osx-unsuitable-web-development.html"/>
   <updated>2011-03-27T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/03/osx-unsuitable-web-development</id>
   <content type="html">&lt;p&gt;&lt;img class=&quot;post-lead-image&quot; src=&quot;/teddziuba/images/you-never-have-questions-about-whether-linux-users-are-male-or-female.jpg&quot;&gt;Part of the process of becoming a new eBay employee is selecting your company laptop. I was offered a choice: Lenovo Thinkpad or MacBook Pro. Coming from a Linux development world, I picked the Mac, thinking it would be closer to what I am used to. &lt;/p&gt;

&lt;p&gt;Man, did I fuck up.&lt;/p&gt;

&lt;p&gt;Thankfully, I still have my Ubuntu workstation to get &lt;em&gt;real work&lt;/em&gt; done on, but the Mac does its duty &amp;mdash; running Outlook, maybe Firefox or Google Chrome every now and then. Oh, I also have VMWare installed on it so I can boot Windows to browser test in Internet Explorer.  I should have picked the PC; at least then I would save myself the step of booting VMWare.&lt;/p&gt;

&lt;p&gt;So what's wrong with using the Mac as a development machine for Milo, a Python application backed by PostgreSQL and Redis (or any web project, for that matter)? Well, sacred cow, here come the spears.&lt;/p&gt;

&lt;h3&gt;Horrific package management&lt;/h3&gt;
&lt;p&gt;I've only poked around a little, but so far I've found three separate package managers for OS X: Fink, MacPorts &amp;amp; Homebrew. Each is heinous in its own special way, but the fact that you have three competing package managers, &lt;em&gt;that don't talk to each other&lt;/em&gt; has convinced me that Mac users, in the typical hipster fashion, brutally raped the Unix culture, throwing away everything that made it unique because they did not understand it. These Visigoths are the single best case for government mandated licensing for computer programmers.&lt;/p&gt;

&lt;p&gt;What's wrong with having disparate package managers? Being completely comfortable with the risk of sounding like your grandfather, the problem is &lt;em&gt;I'll tell you when we get to production.&lt;/em&gt; No, really. Installing your packages in production is going to be a pain in the balls if you are using any or all of these OS X package managers. If it's not, then you haven't been doing this long enough. I realize that if you're a Mac web developer, your deployments probably consist of &lt;code&gt;ssh&lt;/code&gt; and &lt;code&gt;git pull&lt;/code&gt;, but when you are older, you will understand the value of automated version dependency satisfaction. Better not tell you now, it would spoil the surprise.&lt;/p&gt;

&lt;p&gt;The scary part about having many general use package managers is that it pushes programmers toward using programming language specific package managers like &lt;code&gt;gem&lt;/code&gt; and &lt;code&gt;pip&lt;/code&gt;, which only serve to metastasize the problem. I generally make Debian packages for everything, even if it means repackaging a &lt;code&gt;pip&lt;/code&gt; package, but thankfully, there are scripts for that.&lt;/p&gt;

&lt;p&gt;One of the unfortunate trends in OS X package management is the idea that the user should be compiling everything. This is being perpetrated mostly by the Homebrew package manager, whose basic building block is the &lt;em&gt;formula&lt;/em&gt;, basically a Ruby script that tells it how to download, compile, and install the package. Well congratulations, dipshit, you've reinvented &lt;code&gt;dpkg&lt;/code&gt;, poorly. I am simply trying to develop an application, is there a good reason why I am compiling &lt;code&gt;libxml2&lt;/code&gt; and all of its dependencies? What is this shit, Gentoo?&lt;/p&gt;

&lt;h3&gt;You don't deploy to BSD&lt;/h3&gt;
&lt;p&gt;So why the hell are you developing on it? You can't possibly expect software packages to behave similarly across MacOS X and Linux, which is probably your production environment. OS X and Linux have different kernels, which means different I/O &amp;amp; process schedulers, different file systems, and a whole host of other implementation details that you'll write off as having been abstracted away until you have your first serious encounter with &lt;em&gt;It Works On My Machine&lt;/em&gt; (confrontations with this beast that were resolved by installing packages are nothing compared to the unwelcome violation you'll get when you discover a &lt;em&gt;real&lt;/em&gt; operating system difference). These problems won't come often, but when they do, you'll conveniently forget that I ever warned you.&lt;/p&gt;

&lt;p&gt;This is one of those lessons that it will take you a few catastrophic failures to learn. Your development platform should be as close as possible to your production platform.&lt;/p&gt;


&lt;h3&gt;Textmate sucks&lt;/h3&gt;
&lt;p&gt;Sooner or later, you have to face facts. Man up and learn Emacs.&lt;/p&gt;

&lt;h3&gt;The hardware is overpriced&lt;/h3&gt;
&lt;p&gt;A basic 15&quot; MacBook Pro will run you $1,800 while you can buy a comparable model from Lenovo for $1,200. An extra $600 for the privilege of running a shitty operating system, where do I sign up? You could put that extra cash into time-saving hardware like SSDs, time-altering hardware like tequila, or a few months' worth of project hosting.&lt;/p&gt;

&lt;p&gt;While the flesh has certainly rotted off this horse's bones by now, the price issue for Mac developers is more of an indicator of other problems: you're spending too much money on a device that looks nice, but ultimately makes your job harder. If you're a startup, this is called &lt;em&gt;dick swinging&lt;/em&gt; and doesn't serve anybody.&lt;/p&gt;

&lt;h3&gt;'Lost' apologists are almost always Mac users. Scientists are baffled.&lt;/h3&gt;
&lt;p&gt;I used to be a big time Mac fanboy. In fact, I even had a &lt;a href=&quot;http://www.macworld.com/article/15516/1999/11/letters.html&quot;&gt;letter published in Macworld magazine&lt;/a&gt; when I was 15. However, at some point, I started writing code to put food on my table, and found that the Mac just does not cut it. I'm generally all for developers using the tools that they want to use and feel the most productive with, but tools like MacOS X cause more problems for the rest of the development team, and are a net negative. Bring your pet Rhesus monkey to work if you want, but the first time that little fucker throws feces at me, he's going to the hot dog factory.&lt;/p&gt;

&lt;p&gt;But really, Mac developers, stay out of the command line. You'll hurt yourselves.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Devops Is a Poorly Executed Scam</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/03/devops-scam.html"/>
   <updated>2011-03-20T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/03/devops-scam</id>
   <content type="html">&lt;p&gt;&lt;img class=&quot;post-lead-image&quot; src=&quot;/teddziuba/images/and-this-is-a-poorly-executed-troll.png&quot;/&gt;I've got to hand it to the Agile development guys &amp;mdash; they were really good at liberating money out of organizations that all had trouble with something inherently difficult. The geniuses who developed Scrum and Extreme Programming executed masterfully; selling books and training; and they made some serious bank doing it. If you hang around Silicon Valley long enough, you know to applaud the hustle.  It's the classic &lt;em&gt;Rainmaker&lt;/em&gt; scam.  You pay a man to make it rain on your crops, and when it rains, he takes the credit. If it doesn't rain, he comes up with an excuse that involves you paying more money.&lt;/p&gt;

&lt;p&gt;So, given that, I'm befuddled by the Devops movement. It's got the potential to make a handful of people a lot of money in the same way that Agile did, but nobody is really executing on it. It's proper snake oil with all the trimmings: prescription of &quot;culture change&quot;, few and vague concrete steps for implementation, and most of all, the promise to solve an age old problem.&lt;/p&gt;

&lt;h3&gt;How do you implement Devops?&lt;/h3&gt;

&lt;p&gt;Point one. Nobody seems to know. At least with Scrum, you could buy the book and take the course. From what I have gathered by reading blogs, if you want to apply Devops to your organization, you do any or all of these things:

&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;Automate configuration with Puppet or whatever.&lt;/em&gt; You should be doing this anyway. Not earth shattering.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;One-step build and deploy.&lt;/em&gt; I'm still waiting for you to tell me how these steps will solve my problems.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Culture of respect &amp;amp; trust, good attitude toward failure.&lt;/em&gt; How about &quot;culture of &lt;em&gt;stop fucking up&lt;/em&gt;&quot;? This is one of those obvious happy-horseshit type statements that makes you believe the salesman is benevolent. A developer who consistently ships broken code or a sysadmin who consistently pushes out broken configuration aren't going to get any better with respect or trust.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Have cross-functional team members to facilitate communication.&lt;/em&gt; Every technical person you hire should be cross-functional enough that you don't need this.&lt;/li&gt;
&lt;/ul&gt;

These things are all the basics you pick up by reading &lt;em&gt;Learn How Not to be a Complete Failure at Software Development in 24 Hours&lt;/em&gt;. None of it will make your developers any less prone to do stupid shit, and none of it will prevent your systems administrators from roadblocking developers just for funsies.&lt;/p&gt;

&lt;p&gt;One of the things I read frequently is that &lt;em&gt;Devops is about building bridges and communication&lt;/em&gt;. What the shit does that even mean? Cute, but not useful. Clearly everybody deserves to be treated with respect in the workplace, but you can't make two different groups work together just by telling them to, or even by having cross-functional team members to coordinate. If you've hired people explicitly as peacemakers between development and ops, you fucked up somewhere in your hiring process; it's fixing a self-inflicted problem.&lt;/p&gt;

&lt;p&gt;If you are going to pimp this stuff as &quot;the new way of doing things&quot;, at least try to sell me a book.&lt;/p&gt;

&lt;h3&gt;What is the problem you want to solve?&lt;/h3&gt;
&lt;p&gt;The main issue with the Devops movement is that it treats symptoms, not problems. Yes, everybody wants to ship new code frequently and keep it stable, but the dev vs. ops feud is as old as the phrase &quot;it's 98% done, I just have to test it&quot;. The symptoms of the problem are these:

&lt;ul&gt;
  &lt;li&gt;Developers write code on their workstations and it doesn't work in production.&lt;/li&gt;
  &lt;li&gt;Systems administrators are slow and reluctant to change production configurations.&lt;/li&gt;
&lt;/ul&gt;

As a result, it takes longer to ship features than it should.&lt;/p&gt;

&lt;p&gt;The underlying problem, however, is that dev and ops have different goals, and each group's problem solving skills are a product of those goals. The Devops movement does try to cultivate some kind of understanding that developers and systems administrators are both working toward the same end, to put food on the table, but you will never be able to effect cultural change just by saying so. &lt;em&gt;Surly's only looking out for one person, and that's Surly.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;What condition your condition is in&lt;/h3&gt;

&lt;p&gt;You will always have problems between development and operations if the two groups think so differently about technical problems. So, I offer a test. One technical question that will show you how different development and operations really are:

&lt;blockquote&gt;
Devise a caching infrastructure for responses from the Google Maps Geocoder API.
&lt;/blockquote&gt;

The end. Gather dev in one room, and ops in the other, and have them each come up with an answer. If they come up with the same answer, there is hope for your organization. If they don't, put them in one room and have them work it out until there is a unanimous solution, and everyone agrees that it is the best. If they can't agree on a solution, you have problems that no methodology can fix. (For bonus points, make this a universal interview question.)&lt;/p&gt;

&lt;p&gt;After that exercise, development and operations should reasonably be on the same page. Next, you need to implement policy that will force convergence between development and operations. This is my prescription:

&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;&lt;strong&gt;Developers are in the on-call rotation&lt;/strong&gt;&lt;/em&gt; If you ship a feature, you help support it. This one is first because it's the most important. If you architect a system, you write the Nagios alerts for it, and they page your phone. Believe me, you will get a crystal clear understanding of why ops throws up so many roadblocks after doing this for a few months.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;&lt;strong&gt;Developers develop in the same environment production runs in&lt;/strong&gt;&lt;/em&gt; If you deploy to Linux, you develop on Linux. No more of this coding on your MacBook Pro and deploying to Ubuntu: that is why you can't have nice things.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;&lt;strong&gt;Downtime never happens twice&lt;/strong&gt;&lt;/em&gt; After problems are fixed short-term, you make it first priority to ensure that the same failure does not happen again.&lt;/li&gt;
&lt;/ul&gt;

That's it. When developers are woken up during downtime, they will adjust their attitude toward operations in a hurry. Yes, the site is down because your architecture sucks. No, cosmic rays did not flip bits in RAM. Clearly, you want development and ops to solve problems collaboratively, and it just won't work if the two groups are too different.&lt;/p&gt;

&lt;p&gt;Since no methodology peddler ever wants to say this, I will: &lt;em&gt;there's a point where you're simply fucked.&lt;/em&gt; Meaning, you can't solve the problem with the tools available. Sometimes, you have to fire people who aren't working out. Sometimes, you're too deep in technical debt and too pressed for time to do it the &quot;right way&quot;. And sometimes, projects fail. It is what it is. This isn't defeatist, it's realist.&lt;/p&gt;

&lt;p&gt;I am not trying to sell you a book, I am just being honest about the problems you face. None of this amounts to a &lt;em&gt;methodology&lt;/em&gt;, as the Devops people would have you believe. If your developers and your sys admins are so culturally different that they can't agree on a solution to a simple technical problem, then your organization will not be fixed by some sunshine-up-your-ass methodology you read about in a blog or hear about at a conference. You need to change the culture the hard way, or replace people as necessary until the culture works.&lt;/p&gt;
&lt;p&gt;The Devops movement smells of a scam in the making, not that I have any problem with that, after all, don't knock the hustle. However, I'd rather not see people with real problems get roped in, thinking that there's a magical 12-step program that will solve deep rooted problems. It just doesn't work that way. Even so, the Devops people have a bit of traction, and they're failing to capitalize on it. You've got a good thing going here, &lt;em&gt;&lt;strong&gt;profit&lt;/strong&gt;&lt;/em&gt; from that shit. Books, training, conferences, the whole bit. Get down to it.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Monitoring Theory</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/03/monitoring-theory.html"/>
   <updated>2011-03-11T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/03/monitoring-theory</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/teddziuba/images/cheezburger-inc-tries-to-copyright-this-shit-but-i-cut-out-their-watermark-because-fuck-them.jpg&quot; class=&quot;post-lead-image&quot;/&gt;Around the time in my life when I stopped ordering drinks made with more than one ingredient, I was woken up for the last time by a hypochondriac Nagios monitoring installation. If you are on-call long enough, you cultivate a violent reaction to the sound of your cell phone's text message alert. If your monitoring is overconfigured, that reaction boils over hastily, as it will interrupt you during meals, sex, sleep &amp;mdash; all of the basics &amp;mdash; with the excruciating operational details of your web site.&lt;/p&gt;

&lt;p&gt;I've since developed, with the help of some noble systems administrators, a theory around service monitoring: monitors can be &lt;strong&gt;informative&lt;/strong&gt;, &lt;strong&gt;actionable&lt;/strong&gt;, or both. By &lt;em&gt;informative&lt;/em&gt;, I mean that the alert must tell you categorically that there is a problem. By &lt;em&gt;actionable&lt;/em&gt; I mean that receiving the alert must prompt some kind of immediate response. Alerts can therefore break down like this:&lt;/p&gt;

&lt;hr class=&quot;space&quot;/&gt;
&lt;h4&gt;Neither Informative nor Actionable&lt;/h4&gt;
&lt;p&gt;I call these types of alerts &lt;strong&gt;Cool Story, Bro&lt;/strong&gt; for short. These are bits of information that do not indicate any sort of problem state, and do not prompt any action. Cool Stories are things that you should not even have alerts for. They waste your time and make you paranoid. &lt;strong&gt;If you want to track a metric whose state is neither informative nor actionable, make it a graph, not an alert.&lt;/strong&gt; Cool Story Bro alerts are things like:

&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;The load average on a server is above 20.&lt;/em&gt; This doesn't actually indicate a problem. In Linux, the load average is a moving average of the number of processes that are runnable or stuck in uninterruptible sleep (usually waiting on disk), and as long as the CPUs, disk channels, network interfaces, and memory are at less than 100% capacity, the machine is not busy enough. A high load by itself is nothing to worry about. I don't even have alerts on load average. I have production machines with load averages well over 100 that are working just fine.&lt;/li&gt;

  &lt;li&gt;&lt;em&gt;A job queue has more than X work units in it.&lt;/em&gt; Congratulations, dipshit, your queue is doing exactly what it is supposed to do. It's a subjective decision whether or not this is a failure state. One of the reasons that I hate queues.&lt;/li&gt;

  &lt;li&gt;&lt;em&gt;Some metric is greater than an empirically determined mean.&lt;/em&gt; I get personally offended by shit like this. &quot;Have it alert if it's 10 more than the average&quot;; first of all, read Zed Shaw's &lt;a href=&quot;http://zedshaw.com/essays/programmer_stats.html&quot;&gt;Programmers Need To Learn Statistics Or I Will Kill Them All&lt;/a&gt;, and second, alerts based on empirical data will frequently give false positives, as the measurements the alert takes are &lt;em&gt;new empirical data you haven't seen before&lt;/em&gt;. Furthermore, any up/down check program that needs to keep state about the return results of previous checks is a recipe for tears.&lt;/li&gt;
&lt;/ul&gt;
&lt;/p&gt;

&lt;hr class=&quot;space&quot;/&gt;

&lt;h4&gt;Informative, but not Actionable&lt;/h4&gt;
&lt;p&gt;Informative but unactionable alerts are ones that indicate an abnormal state, but are not things that you need to handle immediately, so they should be e-mail alerts, not pager messages. They are things that you can handle during the workday, and don't need your surly, undivided attention at four o'clock in the morning. For example:


&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;The primary disk on your database server is at 90% capacity.&lt;/em&gt; Is the site down? No? Then fuck off, I'll get to it.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Memory on your MongoDB server is at 80% capacity.&lt;/em&gt; For those of you working at Foursquare, when a critical server is approaching its maximum physical memory capacity, you should be aware, as it means bad shit may happen soon if it keeps growing.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;One out of three load-balanced web servers is down.&lt;/em&gt; Good to know, but I don't have to get off my ass just yet.&lt;/li&gt;
&lt;/ul&gt;
&lt;/p&gt;

&lt;hr class=&quot;space&quot;/&gt;
&lt;h4&gt;Informative and Actionable&lt;/h4&gt;
&lt;p&gt;This is your meat and potatoes. When one of these fuckers goes off, it's &lt;em&gt;battlestations&lt;/em&gt;. Drop your cocks and grab your socks, we got shit to fix. Yes, these are the alerts that should be waking you up in the middle of the night. Milo's production Nagios config has roughly 10 of these, double that in e-mail only alerts, and quite a few cool stories to satisfy some paranoid delusions. Some example alerts that should get your lazy ass out of bed:


&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;The search handler of your public site serves HTTP 500.&lt;/em&gt; Your users probably weren't looking for that.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;Your API's latency is outside of SLA.&lt;/em&gt; Welcome to losing-moneyville. Population: you.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;The CSS statics on your home page fail to load.&lt;/em&gt; It's a really obscure HTTP status code, you probably haven't heard of it.&lt;/li&gt;
&lt;/ul&gt;

&lt;/p&gt;
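&lt;p&gt;The three categories above boil down to a small routing table. Here is a rough Python sketch of it. The channel names (&lt;code&gt;graph&lt;/code&gt;, &lt;code&gt;email&lt;/code&gt;, &lt;code&gt;page&lt;/code&gt;) and the &lt;code&gt;route_alert&lt;/code&gt; function are made up for illustration, and the fourth combination is rejected only because it is not defined above.&lt;/p&gt;

&lt;pre&gt;
```python
# Sketch only: the channel names ('graph', 'email', 'page') are assumptions.
ROUTES = {
    (False, False): 'graph',  # Cool Story, Bro: track it, never alert on it
    (True, False): 'email',   # abnormal, but it can wait for the workday
    (True, True): 'page',     # battlestations
}


def route_alert(informative, actionable):
    """Return the delivery channel for an alert with the given properties."""
    key = (bool(informative), bool(actionable))
    if key not in ROUTES:
        # No 'actionable but not informative' category is defined.
        raise ValueError('no category for an uninformative, actionable alert')
    return ROUTES[key]


if __name__ == '__main__':
    print(route_alert(True, False))  # e.g. the database disk is at 90%
    print(route_alert(True, True))   # e.g. the search handler serves HTTP 500
```
&lt;/pre&gt;

&lt;p&gt;If a new alert does not fit one of the three rows, that is a hint it should not be an alert.&lt;/p&gt;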

&lt;hr class=&quot;space&quot;/&gt;
&lt;h4&gt;Systems Design Considerations&lt;/h4&gt;

&lt;p&gt;When you design a new system, design it to be monitorable. The basic criterion is this: &lt;em&gt;there must be a &lt;strong&gt;stateless&lt;/strong&gt;, &lt;strong&gt;deterministic&lt;/strong&gt; way to check the system's health&lt;/em&gt;.

&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;stateless:&lt;/em&gt; The check run at time &lt;code&gt;t&lt;/code&gt; must not depend on the outcome of the check run at time &lt;code&gt;t - 1&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;deterministic:&lt;/em&gt; There is no random variability or subjective judgment to determine whether or not the system is healthy. Health is a binary value.&lt;/li&gt;
&lt;/ul&gt;

This seems obvious but I've seen a lot of smart people fuck it up. I keep harping on message queues, but they are a good example of poor system design that is unmonitorable. Consider this system:

&lt;blockquote&gt;
Producer &amp;rarr; Queue &amp;rarr; Blocking Consumers
&lt;/blockquote&gt;

End to end, how do you check that this system is OK? The logical entry point for a monitor is the producer: send a job through the system and check that it gets processed by a consumer. But the asynchronous queue makes that determination a judgment call. What if it takes a minute? What if it takes an hour? Is the system still OK if the time from job production to job completion is a day?&lt;/p&gt;
&lt;p&gt;If you need the asynchronous model, what you generally want is a &lt;em&gt;spool&lt;/em&gt;, where you say that it is not a requirement that work be processed as soon as possible, but rather, in an offline batch job. In this case, you can simply monitor that the spool size has not overflowed the capacity of the physical device it's on, and that your processing batch job has run successfully.&lt;/p&gt;

&lt;p&gt;If you want work processed as it comes in, think about this:

&lt;blockquote&gt;
Producer &amp;rarr; Load Balancer &amp;rarr; Consumers
&lt;/blockquote&gt;

You still have a fixed number of consumers, except work is being done synchronously, and the load balancer decides which of the consumers gets the work. If all consumers are busy, the load balancer refuses incoming work, because your system is &lt;strong&gt;over capacity&lt;/strong&gt;.  This is a failure state, and can be deterministically measured.&lt;/p&gt;
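&lt;p&gt;A check for the load-balanced design can be both stateless and deterministic. The Python sketch below is one way to write it, assuming the load balancer exposes an HTTP status code and a count of busy and total consumers (both are assumptions, as is the &lt;code&gt;check_health&lt;/code&gt; name). The return values follow the Nagios plugin convention of 0 for OK and 2 for CRITICAL.&lt;/p&gt;

&lt;pre&gt;
```python
# Nagios plugin exit codes.
OK = 0
CRITICAL = 2


def check_health(status_code, busy_consumers, total_consumers):
    """Stateless and deterministic: the result depends only on the arguments,
    never on a previous check or on a tunable threshold."""
    if status_code != 200:
        return CRITICAL
    if busy_consumers == total_consumers:
        # Every consumer is busy, so the load balancer is refusing work.
        return CRITICAL
    return OK


if __name__ == '__main__':
    # Hypothetical readings; a real check would probe the load balancer.
    print(check_health(200, 3, 8))
    print(check_health(200, 8, 8))
```
&lt;/pre&gt;

&lt;p&gt;There is no history and no threshold to argue about: the same readings always produce the same answer.&lt;/p&gt;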

&lt;p&gt;&lt;em&gt;But Ted&lt;/em&gt; you say, &lt;em&gt;what if there's a lot of traffic and the consumers get backed up?&lt;/em&gt; Aye, welcome to the world of capacity planning. In this case, using an asynchronous queue is a crutch that helps you avoid thinking about your system's actual resource utilization, and it will come back to burn you because it is unmonitorable. In the face of increased traffic, if you're so confident that you can &quot;spin up more queue workers&quot;, then you can sure as hell &quot;spin up more synchronous workers&quot;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Stupid Unix Tricks: Workflow Control with GNU Make</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/02/stupid-unix-tricks-workflow-control-with-gnu-make.html"/>
   <updated>2011-02-26T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/02/stupid-unix-tricks-workflow-control-with-gnu-make</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/teddziuba/images/this-is-my-morning-work-face.png&quot; class=&quot;post-lead-image&quot;/&gt;
I was hacking around on part of a project last week that looked something like this:
&lt;/p&gt;

&lt;blockquote&gt;
Fetch an API Response &amp;rarr; Validate response &amp;rarr; Munge response &amp;rarr; Write response to a DB
&lt;/blockquote&gt;


&lt;p&gt;If any of the steps fail, then the whole production should stop or else Bad Things happen. Furthermore, I would like to be able to pick up where I left off in the process, should one of the steps fail - the validate step was somewhat CPU intensive, so I'd rather not lose that work if it succeeds. This is a pretty common workflow, so I wanted to apply as much of the Unix Way to it as I could, in hopes that my solution would be easier and more robust. That turned out to be true.&lt;/p&gt;

&lt;h4&gt;Makefile Abuse&lt;/h4&gt;

&lt;p&gt;As chance would have it, GNU Make solved this problem for me without a whole lot of effort. Here's what my makefile looked like:&lt;/p&gt;

&lt;pre&gt;
# Recipe lines must start with a tab character, not spaces.
api_response.json:
	# --fail makes curl exit nonzero on an HTTP error, so make stops here
	curl --fail -o api_response.json http://api.company.com/endpoint

validated_response.json: api_response.json
	validate_response -o validated_response.json api_response.json

munged_response.json: validated_response.json
	munge_response -o munged_response.json validated_response.json

update_database: munged_response.json
	copy_response_to_db munged_response.json

clean:
	rm -f munged_response.json
	rm -f validated_response.json
	rm -f api_response.json

.PHONY: update_database clean
&lt;/pre&gt;

&lt;p&gt;To execute the workflow, I invoke &lt;code&gt;make -f workflow.mk update_database&lt;/code&gt;, which will do the following:

&lt;ol&gt;
  &lt;li&gt;Compute the dependency tree: &lt;code&gt;munged_response.json&lt;/code&gt; depends on &lt;code&gt;validated_response.json&lt;/code&gt; which depends on &lt;code&gt;api_response.json&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;If any of these files is not on disk, it will recursively execute the &lt;code&gt;make&lt;/code&gt; targets to build the missing ones.&lt;/li&gt;
  &lt;li&gt;Update the database&lt;/li&gt;
  &lt;li&gt;If any step of the recursive execution fails (command returns nonzero), freak the fuck out and print an error message.&lt;/li&gt;
&lt;/ol&gt;

The &lt;code&gt;.PHONY&lt;/code&gt; line tells &lt;code&gt;make&lt;/code&gt; that the &lt;code&gt;clean&lt;/code&gt; and &lt;code&gt;update_database&lt;/code&gt; targets are always out of date, and need to be run every time.
&lt;/p&gt;

&lt;h4&gt;Graceful and Robust&lt;/h4&gt;
&lt;p&gt;There are a couple of things that I really like about this gadget:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;&lt;em&gt;Fail-fast execution:&lt;/em&gt;&lt;/strong&gt; if any of the steps before &lt;code&gt;update_database&lt;/code&gt; fail, the database doesn't get updated.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;em&gt;Pick up where you left off:&lt;/em&gt;&lt;/strong&gt; if &lt;code&gt;munge_response&lt;/code&gt; fails after the fetch and validate steps succeed, the next time it executes, it won't fetch and validate again unless I &lt;code&gt;make clean&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;em&gt;Small programs that do small things:&lt;/em&gt;&lt;/strong&gt; instead of one monolithic program, there are 4 independent programs that perform the work: &lt;code&gt;curl&lt;/code&gt;, &lt;code&gt;validate_response&lt;/code&gt;, &lt;code&gt;munge_response&lt;/code&gt;, and &lt;code&gt;copy_response_to_db&lt;/code&gt;. This modular system is more debuggable and robust than a single program that does everything.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;em&gt;Free parallelization where available:&lt;/em&gt;&lt;/strong&gt; since the workflow is a linear dependency chain, &lt;code&gt;make&lt;/code&gt; can't parallelize it. However, if there were another step that only depended on &lt;code&gt;munged_response.json&lt;/code&gt;, say, &lt;code&gt;publish_munged_response&lt;/code&gt;, then &lt;code&gt;make&lt;/code&gt; invoked with the &lt;code&gt;-j&lt;/code&gt; flag would be able to run &lt;code&gt;publish_munged_response&lt;/code&gt; and &lt;code&gt;update_database&lt;/code&gt; in parallel, as they do not depend on one another.&lt;/li&gt;
&lt;/ul&gt;
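&lt;p&gt;If the pick-up-where-you-left-off behavior seems like magic, here is a rough Python sketch of the rule behind it. It is a simplification: it only checks whether a target file exists, as described above, while real &lt;code&gt;make&lt;/code&gt; also compares modification times. The &lt;code&gt;build&lt;/code&gt;, &lt;code&gt;write_file&lt;/code&gt; and &lt;code&gt;demo&lt;/code&gt; names are invented, and the recipes are stand-ins that just write a file.&lt;/p&gt;

&lt;pre&gt;
```python
import os
import tempfile


def build(target, rules, log):
    """Build target the way the makefile resumes work: a file that is
    already on disk is left alone; a missing one has its prerequisite
    built first and then its own recipe run."""
    if os.path.exists(target):
        return
    prerequisite, recipe = rules[target]
    if prerequisite is not None:
        build(prerequisite, rules, log)
    recipe(target, prerequisite)  # an exception here stops the whole chain
    log.append(os.path.basename(target))


def write_file(target, prerequisite):
    """Stand-in recipe: a real one would run curl, validate_response, etc."""
    with open(target, 'w') as handle:
        handle.write('derived from %s\n' % prerequisite)


def demo():
    workdir = tempfile.mkdtemp()
    api = os.path.join(workdir, 'api_response.json')
    validated = os.path.join(workdir, 'validated_response.json')
    munged = os.path.join(workdir, 'munged_response.json')
    rules = {
        api: (None, write_file),
        validated: (api, write_file),
        munged: (validated, write_file),
    }
    first, second = [], []
    build(validated, rules, first)  # pretend the munge step failed last time
    build(munged, rules, second)    # the rerun only has to munge
    return first, second


if __name__ == '__main__':
    print(demo())
```
&lt;/pre&gt;

&lt;p&gt;The second call skips the fetch and validate steps because their output files are already on disk, which is the whole trick.&lt;/p&gt;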


&lt;p&gt;I could have used a pretty standard pipeline to solve this problem, for example:

&lt;pre&gt;
curl http://api.company.com/endpoint | validate_response | munge_response | copy_response_to_db
&lt;/pre&gt;

But that would not satisfy the &quot;pick up where you left off&quot; requirement without some Aristocrats joke within each of the processing programs to track state, and pipelines are linear - it's hard to get the free parallelization without doing some shameful things.
&lt;br/&gt;&lt;br/&gt;
Feels good, man.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>The Case Against Queues</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/02/the-case-against-queues.html"/>
   <updated>2011-02-06T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/02/the-case-against-queues</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;/teddziuba/images/nine-readers-out-of-ten-have-no-fucking-idea-who-this-man-is.jpg&quot; class=&quot;post-lead-image&quot;/&gt;Some people, when confronted with a problem, think &quot;I know, I'll use a queue.&quot; Now they have an unbounded number of problems.&lt;/p&gt;

&lt;p&gt;Networked message queues like ActiveMQ, RabbitMQ, ZeroMQ, and a host of other Java inspired software tumors are crutches of systems design.  I love asynchronous stuff as much as the next guy, but think of a queue like Java: it encourages large swaths of mediocre programmers to overengineer shit, and it keeps systems administrators in business by giving them something to respond to at 4AM.&lt;/p&gt;

&lt;p&gt;Here is some of the dumb stuff that queues enable:&lt;/p&gt;

&lt;h4&gt;The Blocking Consumer&lt;/h4&gt;
&lt;p&gt;You have some work that is sometimes produced faster than it can be done; a common problem. One of the commonly problematic solutions to it is to stick the work in a message queue, and have one or more consumers that block on the queue, picking work off as soon as it's available and doing it.&lt;/p&gt;

&lt;p&gt;What's wrong with this? First of all, it blurs your mental model of what's going on. You end up expecting synchronous behavior out of a system that's asynchronous. One of the concrete outcomes of that is the question &lt;strong&gt;how do you deterministically monitor this system?&lt;/strong&gt; If the queue size is greater than zero, is this a failure state? If the queue size is greater than zero, it means that your system is over capacity, but what is your response to that? Spin more workers or let it ride? If your answer is &quot;spin more workers&quot;, then you should be doing the work synchronously because the implication is that you care about the amount of time it takes for a worker to get to the work.  If your answer is &quot;let it ride&quot;, then how do you know when your system is in trouble: if there are ten jobs in the queue? Ten thousand?&lt;/p&gt;

&lt;p&gt;If you are designing a system that relies on a blocking queue consumer, you should likely be doing the work synchronously, without the queue. System gets overloaded? I've got a solution for that, too: &lt;em&gt;capacity planning&lt;/em&gt;.&lt;/p&gt;

&lt;h4&gt;Collecting Data for Offline Processing&lt;/h4&gt;

&lt;p&gt;Say you've got some events that you want to record, and then process offline in a batch job. Using a message queue for this will only lead to tears.&lt;/p&gt;

&lt;p&gt;In such a system, you've usually got multiple data producers, and you want the data aggregated in a single place. As chance would have it, UNIX ships with a facility that can do this consistently and reliably. We call it syslog.&lt;/p&gt;

&lt;p&gt;Depending on your queue implementation, when you pop a message off, it's gone. The consumer acknowledges receipt of it, and the queue forgets about it. So, if your data processor fails, you've got data loss. If you collect messages with syslog, your processor program is just processing a text file, and can process it again if something goes wrong. Throw in some &lt;code&gt;split&lt;/code&gt; and &lt;code&gt;xargs&lt;/code&gt;, and you've got parallel processing. Event messages aren't text? You fucked up. Go buy a subscription to the Microsoft Developer Network.&lt;/p&gt;
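&lt;p&gt;To make the reprocessing point concrete, here is a small Python sketch of a batch job over a syslog-style text file. The line format, the &lt;code&gt;events&lt;/code&gt; tag and the event names are all assumptions for the sake of the example. The job only reads the file, so running it twice gives the same answer.&lt;/p&gt;

&lt;pre&gt;
```python
import collections
import os
import tempfile


def count_events(log_path):
    """Count events by type in a syslog-style text file.

    The function only reads the file, so a failed run can simply be repeated.
    """
    counts = collections.Counter()
    with open(log_path) as handle:
        for line in handle:
            # Assumed line format: 'timestamp host tag: event_type payload'
            _, _, message = line.partition(': ')
            fields = message.split()
            if fields:
                counts[fields[0]] += 1
    return dict(counts)


def demo():
    lines = [
        'Feb  6 04:00:01 web1 events: signup user=1\n',
        'Feb  6 04:00:02 web2 events: purchase user=1\n',
        'Feb  6 04:00:03 web1 events: signup user=2\n',
    ]
    handle = tempfile.NamedTemporaryFile('w', suffix='.log', delete=False)
    with handle:
        handle.writelines(lines)
    try:
        first = count_events(handle.name)
        second = count_events(handle.name)  # reprocessing loses nothing
    finally:
        os.remove(handle.name)
    return first, second


if __name__ == '__main__':
    print(demo())
```
&lt;/pre&gt;

&lt;p&gt;Compare that with a queue consumer that crashes after acknowledging a message: there is nothing left to run again.&lt;/p&gt;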


&lt;h4&gt;Everybody Loves System Complexity&lt;/h4&gt;

&lt;p&gt;Obviously I have been generalizing thus far. There are a host of situations where you need to separate the production of work from the consumption of work, and where you fully understand the consequences. I'm not hating on that asynchronous pattern, I'm hating on introducing more software into your stack unnecessarily. Can you use a database table for it? Can you use files on disk or a named pipe? Syslog? (Modern syslog implementations will write to a database.) Bringing new services in should be the absolute last resort, because every new service is an unknown that needs to be configured and maintained. Adding a queue to your stack isn't just adding a service. It's ancillary maintenance code, libraries, monitoring scripts - all more things that can and will fail.&lt;/p&gt;

&lt;p&gt;Liabilities, as it were.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Multiple Concurrent Linux Distributions</title>
   <link href="http://widgetsandshit.com/teddziuba/2011/01/multiple-concurrent-linux-distros.html"/>
   <updated>2011-01-01T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2011/01/multiple-concurrent-linux-distros</id>
   <content type="html">&lt;p&gt;&lt;img class=&quot;post-lead-image&quot; src=&quot;/teddziuba/images/sometimes-all-you-need-is-a-solid-puke-to-set-yourself-right.png&quot;&gt;
You can run multiple Linux distributions at the same time, on the same computer, without a virtual machine. Milo's production environment is a mix of Ubuntu Hardy and Lucid, while eBay's production Linux is Red Hat. Eventually, this will all converge on one environment, but in the meantime, while we port, we need a way to rapidly iterate changes on a handful of Linux distributions. A virtual machine seems like the obvious answer, but that's overkill for this situation.
&lt;/p&gt;

&lt;p&gt;A Linux distribution is a kernel, some libraries, binaries, and a package manager. The kernel is the lowest level of abstraction over the hardware; everything else is fairly interchangeable. Apache running on Red Hat will make the same system calls as Apache running on Ubuntu. In theory, as long as shared library paths are managed correctly and the package managers don't trample on each other, you can have multiple distributions &quot;running&quot; under one kernel, no virtual machine needed.&lt;/p&gt;

&lt;p&gt;Unix provides the &lt;code&gt;chroot&lt;/code&gt; mechanism to keep all of the distribution files in order. There are some tools that build on &lt;code&gt;chroot&lt;/code&gt; to support these virtual environments so that you don't have to do any bookkeeping or stupid shit with &lt;code&gt;/dev&lt;/code&gt; or &lt;code&gt;/proc&lt;/code&gt;. I am using Ubuntu as my &quot;host&quot; operating system, and a Debian tool called &lt;code&gt;schroot&lt;/code&gt; to manage it all.&lt;/p&gt;

&lt;h3&gt;Ubuntu In Ubuntu&lt;/h3&gt;
&lt;p&gt;I use Ubuntu Maverick as my desktop operating system, while production uses either Hardy or Lucid because they are long-term support releases. Here's how you install one Ubuntu distribution within another:&lt;/p&gt;

&lt;p&gt;
  &lt;ol&gt;
  &lt;li&gt;&lt;code&gt;sudo apt-get install debootstrap schroot&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;sudo mkdir -p /var/chroot/lucid&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;sudo debootstrap --variant=buildd --arch [amd64|i386] lucid /var/chroot/lucid http://archive.ubuntu.com/ubuntu/&lt;/code&gt;&lt;/li&gt;
  &lt;/ol&gt;

Now open up &lt;code&gt;/etc/schroot/schroot.conf&lt;/code&gt; in a text editor and add a configuration stanza that looks like this:

&lt;blockquote&gt;
&lt;pre&gt;
[lucid]
directory=/var/chroot/lucid
description=Ubuntu Lucid
root-users=root,ted
users=ted
type=directory
&lt;/pre&gt;
&lt;/blockquote&gt;

In practice this can be any Ubuntu release you want, just change the argument to &lt;code&gt;debootstrap&lt;/code&gt; and your configurations accordingly. I have one stanza for Lucid and one for Hardy, but you can have as many as you've got disk space for.&lt;/p&gt;
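
&lt;p&gt;As a sketch of what &quot;accordingly&quot; means, a Hardy stanza would follow the same pattern. This assumes you ran &lt;code&gt;debootstrap&lt;/code&gt; with &lt;code&gt;hardy&lt;/code&gt; as the release argument and &lt;code&gt;/var/chroot/hardy&lt;/code&gt; as the target; that path is picked here for illustration and is not a requirement:&lt;/p&gt;

&lt;blockquote&gt;
&lt;pre&gt;
[hardy]
directory=/var/chroot/hardy
description=Ubuntu Hardy
root-users=root,ted
users=ted
type=directory
&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;Only the section name, directory, and description change; &lt;code&gt;schroot -c hardy&lt;/code&gt; then selects it by name.&lt;/p&gt;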

&lt;p&gt;Now you can use your virtual environment with &lt;code&gt;schroot -c lucid&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;
&lt;blockquote&gt;
&lt;pre&gt;
ted@duke:~$ lsb_release -dc
Description:	Ubuntu 10.10
Codename:	maverick
ted@duke:~$ schroot -c lucid
(lucid)ted@duke:~$ lsb_release -dc
Description:	Ubuntu 10.04 LTS
Codename:	lucid
(lucid)ted@duke:~$
&lt;/pre&gt;
&lt;/blockquote&gt;

At this point you should probably &lt;code&gt;apt-get install ubuntu-minimal&lt;/code&gt; as &lt;code&gt;debootstrap&lt;/code&gt; doesn't install a whole lot.
&lt;/p&gt;

&lt;p&gt;Neat.&lt;/p&gt;

&lt;h3&gt;Red Hat in Ubuntu&lt;/h3&gt;

&lt;p&gt;Now that we've got ourselves bothered with &lt;code&gt;schroot&lt;/code&gt;, let's make it jump. Installing Red Hat in a &lt;code&gt;chroot&lt;/code&gt; environment is going to be trickier because we don't have &lt;code&gt;debootstrap&lt;/code&gt;, which is a Debian tool for bootstrapping an installation. Fortunately, some enterprising programmer has unfucked the Red Hat bootstrap system with a utility called &lt;code&gt;rinse&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;
&lt;ol&gt;
  &lt;li&gt;&lt;code&gt;sudo apt-get install rinse&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;sudo mkdir -p /var/chroot/centos&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;sudo rinse --distribution centos-5 --arch [amd64|i386] --directory /var/chroot/centos&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

If CentOS isn't your game, there are others to choose from. Run &lt;code&gt;rinse --list-distributions&lt;/code&gt; to see what is available.
&lt;/p&gt;

&lt;p&gt;Add a &lt;code&gt;schroot.conf&lt;/code&gt; stanza for Red Hat like so:

&lt;blockquote&gt;
&lt;pre&gt;
[centos]
description=CentOS 5 amd64
directory=/var/chroot/centos
root-users=root,ted
users=ted
type=directory
&lt;/pre&gt;
&lt;/blockquote&gt;

And now I can jump into CentOS 5 with &lt;code&gt;schroot -c centos&lt;/code&gt;, and I'll prove it, too:

&lt;blockquote&gt;
&lt;pre&gt;
ted@duke:~$ schroot -c centos
(centos)ted@duke:~$ lsb_release -dc
Description:	CentOS release 5.5 (Final)
Codename:	Final
&lt;/pre&gt;
&lt;/blockquote&gt;

Of course, CentOS uses RPM and some derelict dependency manager called yum. You will have to &lt;code&gt;yum install lsb&lt;/code&gt; to get the Linux Standard Base.
&lt;/p&gt;

&lt;h3&gt;This is Not Virtualization&lt;/h3&gt;

&lt;p&gt;If you're asking yourself how you can get a private network interface for a chroot environment, then you've got to take a little more time to understand the problem we are solving. When I first started digging into this, my goal was to be able to run flymake in a hermetic environment. Just the tip, just for a little bit, just to see how it feels. And look where it led.&lt;/p&gt;

&lt;p&gt;This doesn't solve a whole lot of production problems, in fact, it will probably create more than it solves. If you're interested in chrooting your way out of a jam in production, check out FreeBSD jails. This trick is useful for development, when you need to target more than one Linux distribution, or you want a clean environment to test in without booting up a VM.&lt;/p&gt;

&lt;p&gt;Ain't Unix grand?&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>The 3 Basic Tools of Systems Engineering</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/12/the-3-basic-tools-of-systems-engineering.html"/>
   <updated>2010-12-07T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/12/the-3-basic-tools-of-systems-engineering</id>
   <content type="html">&lt;p&gt;&lt;img class=&quot;post-lead-image&quot; src=&quot;/teddziuba/images/if-you-ever-want-your-wife-to-come-to-a-room-turn-on-the-xbox-she-will-be-there-within-minutes-to-tell-you-to-turn-it-off.jpg&quot;&gt;One of the most important things I learned when programming for a startup is how to design reliable systems. A startup programmer needs to understand the business economics of systems design: that the goal is to create the desired functionality, &lt;em&gt;not&lt;/em&gt; to write code. Code is only incidental, and it should be the last tool you use to solve a problem.&lt;/p&gt;

&lt;p&gt;There are three basic tools you can use to solve a technical problem: money, time, and code. This seems obvious, but the critical point is that &lt;strong&gt;you must try them in that order.&lt;/strong&gt; Out-of-order execution of these tools leads to Very Bad Things, which we will discuss later.&lt;/p&gt;


&lt;h3&gt;Money&lt;/h3&gt;
&lt;p&gt;Money is by far the best way to solve a problem because it saves time and helps you avoid writing code. You can usually use money to solve performance and scalability problems, either by buying more hardware or faster hardware. My favorite example is how solid state disk drives make disk I/O problems go away because there is no penalty for disk seek.&lt;/p&gt;

&lt;p&gt;At Milo, we did this when we had a database performance problem: read queries were running slowly, so we spent the money to buy a really powerful server for our database: 24 cores, 64 gigabytes of RAM, and solid state disk drives. This solved the problem for the life of the company until we were acquired. It was absolutely worth the money because we then had more time to spend building the product, and no liabilities that would come from re-architecting the data model.&lt;/p&gt;

&lt;p&gt;It is rare that money can completely solve the problem (or that you have enough money), but it is an easy tool to try first.&lt;/p&gt;

&lt;h3&gt;Time&lt;/h3&gt;
&lt;p&gt;If money doesn't work, invest time to research existing pieces of functionality that do. As I have said before, basic Unix literacy can help you know what tools are available to solve a given problem. For systems design, it helps to know what larger services are available for different classes of problems. To name a few:

&lt;ul&gt;
  &lt;li&gt;Load balancing/redundancy: HAProxy&lt;/li&gt;
  &lt;li&gt;Caching: Squid, Varnish (not Memcache because it forces you to write too much code)&lt;/li&gt;
  &lt;li&gt;Database: PostgreSQL or Oracle if you can afford it. If you're using a NoSQL database, you fucked up somewhere.&lt;/li&gt;
  &lt;li&gt;Database replication: Slony-I&lt;/li&gt;
  &lt;li&gt;Full-text search: PostgreSQL, Solr (warning: if you use Solr the way I think you will, you will have multiple points of truth in your system)&lt;/li&gt;
  &lt;li&gt;Queueing: if you're using a queue, again, you fucked up somewhere.&lt;/li&gt;
  &lt;li&gt;Logging: syslog, and nothing else. Ever.&lt;/li&gt;
&lt;/ul&gt;
&lt;/p&gt;

&lt;p&gt;It seems obvious, but sometimes it needs to be stated: use other people's work to accomplish your goals. Well-known open source packages are very high quality, and are far more reliable than anything you could build yourself. Even a 90% solution off the shelf is worth it because of the time you save in maintenance.&lt;/p&gt;

&lt;h3&gt;Code&lt;/h3&gt;
&lt;p&gt;Writing code is the last resort for solving a problem. Code is a versatile enough tool that you can make it solve just about any problem, but every line is a liability. It's design, future maintenance, monitoring, testing and profiling. Write code only when you have proven categorically that money and third party software don't work.&lt;/p&gt;

&lt;p&gt;As a side note, when you are testing the code, I have found that unit testing is a losing investment. Acceptance tests, however, are the most cost effective way to manage the risk that new code introduces, in terms of time spent developing.&lt;/p&gt;

&lt;h3&gt;Using the Tools Out of Order&lt;/h3&gt;
&lt;p&gt;The worst thing you can do is to try the &lt;em&gt;code&lt;/em&gt; tool first, without considering money or time. This is called &lt;strong&gt;technical masturbation&lt;/strong&gt; and it can sink a project in a hurry.&lt;/p&gt;

&lt;p&gt;When the first thing you do is dive into code, you are dooming yourself either to designing an unmaintainable system or to reinventing existing tools poorly. This may be acceptable in an academic or research setting, but in a startup, it's downright foolish. You may be able to deploy your system faster if you code it all yourself, but it will be a monkey on your back for its entire lifetime. PostgreSQL has never woken me up in the middle of the night with a segmentation fault or NullPointerException, but databases I've written myself have.&lt;/p&gt;

&lt;p&gt;Functionality is an asset, but code is a liability. I will say this until you like it.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Taco Bell Programming</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/10/taco-bell-programming.html"/>
   <updated>2010-10-21T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/10/taco-bell-programming</id>
   <content type="html">&lt;p&gt;&lt;img class=&quot;post-lead-image&quot; src=&quot;/teddziuba/images/a-sysadmin-is-like-a-cocktail-waitress-at-a-strip-club-underappreciated-but-critical-to-business-success.jpg&quot;&gt;Every item on the menu at Taco Bell is just a different configuration of roughly eight ingredients.  With this simple periodic table of meat and produce, the company pulled down $1.9 billion last year.&lt;/p&gt;

&lt;p&gt;The more I write code and design systems, the more I understand that many times, you can achieve the desired functionality simply with clever reconfigurations of the basic Unix tool set. After all, &lt;em&gt;functionality is an asset, but code is a liability&lt;/em&gt;.  This is the opposite of a trend of nonsense called DevOps, where system administrators start writing unit tests and other things to help the developers warm up to them - Taco Bell Programming is about developers knowing enough about Ops (and Unix in general) so that they don't overthink things, and arrive at simple, scalable solutions.&lt;/p&gt;

&lt;p&gt;Here's a concrete example: suppose you have millions of web pages that you want to download and save to disk for later processing.  How do you do it?  The cool-kids answer is to write a distributed crawler in Clojure and run it on EC2, handing out jobs with a message queue like SQS or ZeroMQ.&lt;/p&gt;

&lt;p&gt;The Taco Bell answer? &lt;code&gt;xargs&lt;/code&gt; and &lt;code&gt;wget&lt;/code&gt;. In the rare case that you saturate the network connection, add some &lt;code&gt;split&lt;/code&gt; and &lt;code&gt;rsync&lt;/code&gt;. A &quot;distributed crawler&quot; is really only like 10 lines of shell script.&lt;/p&gt;
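
&lt;p&gt;A minimal sketch of that &lt;code&gt;xargs&lt;/code&gt; and &lt;code&gt;wget&lt;/code&gt; crawler, under stated assumptions: the URL list is a plain text file with one URL per line, and the function name &lt;code&gt;crawl&lt;/code&gt;, the file names, and the default of 16 concurrent fetchers are all made up for illustration.&lt;/p&gt;

&lt;pre&gt;
#!/bin/sh
# Sketch only: fetch every URL in a list, several at a time.
# crawl, urls.txt, pages/ and the default of 16 jobs are illustrative.
crawl() {
    url_file=$1
    out_dir=$2
    jobs=${3:-16}
    mkdir -p "$out_dir"
    # xargs -n1 hands each wget one URL; -P runs that many wgets at once.
    # wget -q is quiet, -x recreates host/path directories so pages do
    # not overwrite each other, -P sets the directory to save under.
    cat "$url_file" | xargs -n1 -P"$jobs" wget -q -x -P "$out_dir"
}

# Usage: crawl urls.txt pages 32
&lt;/pre&gt;

&lt;p&gt;The only tuning knob is the job count. Raise it until the network connection, not the script, is the limit.&lt;/p&gt;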

&lt;p&gt;Moving on, once you have these millions of pages (or even tens of millions), how do you process them? Surely, Hadoop MapReduce is necessary, after all, that's what Google uses to parse the web, right?&lt;/p&gt;

&lt;p&gt;Pfft, fuck that noise:&lt;/p&gt;

&lt;blockquote&gt;&lt;code&gt;find crawl_dir/ -type f -print0 | xargs -n1 -0 -P32 ./process&lt;/code&gt;&lt;/blockquote&gt;

&lt;p&gt;32 concurrent parallel parsing processes and zero bullshit to manage. Requirement satisfied.&lt;/p&gt;

&lt;p&gt;Every time you write code or introduce third-party services, you are introducing the possibility of failure into your system.  I have far more faith in &lt;code&gt;xargs&lt;/code&gt; than I do in Hadoop. Hell, I trust &lt;code&gt;xargs&lt;/code&gt; more than I trust myself to write a simple multithreaded processor. I trust syslog to handle asynchronous message recording far more than I trust a message queue service.&lt;/p&gt;

&lt;p&gt;Taco Bell programming is one of the steps on the path to Unix Zen. This is a path that I am personally just beginning, but it's already starting to pay dividends. To really get into it, you need to throw away a lot of your ideas about how systems are designed: I made most of a SOAP server using static files and Apache's &lt;code&gt;mod_rewrite&lt;/code&gt;. I could have done the whole thing Taco Bell style if I had only manned up and broken out &lt;code&gt;sed&lt;/code&gt;, but I pussied out and wrote some Python.&lt;/p&gt;

&lt;p&gt;If you don't want to think of it from a Zen perspective, be capitalist: you are writing software to put food on the table.  You can minimize risk by using the well-proven tool set, or you can step into the land of the unknown. It may not get you invited to speak at conferences, but it will get the job done, and help keep your pager from going off at night.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Programming Things I Wish I Knew Earlier</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/09/programming-things-i-wish-i-knew.html"/>
   <updated>2010-09-04T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/09/programming-things-i-wish-i-knew</id>
   <content type="html">&lt;img src=&quot;/teddziuba/images/racial-slurs-make-for-a-pretty-awesome-server-naming-convention.jpg&quot; class=&quot;post-lead-image&quot;&gt;
&lt;p&gt;Programming in a startup is much different than programming at a big company. At a startup, not only are you the developer, but you are also the systems administrator for the most part.  I've been startupping for three years now, and have had my ass kicked enough times to step back and think that maybe I should learn how to do things the right way rather than try to bludgeon my way through with raw intellect.&lt;/p&gt;

&lt;p&gt;These are the things I wish I had known in the beginning, or at least I wish I hadn't been too stubborn to learn.&lt;/p&gt;


&lt;h3&gt;How To Avoid Over Complicating&lt;/h3&gt;
&lt;p&gt;Like most software people, I have a natural tendency to over-engineer things. To help fight this urge, I've come up with two simple rules to help myself avoid most of it:&lt;/p&gt;

&lt;h4&gt;If you are writing a program that touches more than two persistent data stores, it is too complicated.&lt;/h4&gt;
&lt;p&gt;And the hard disk counts as a persistent data store. This is kind of a reinterpretation of the Unix way of having small tools with a single input and single output.  Tracking state across many different persistent stores and handling failure cases is just too much for one program to do.&lt;/p&gt;

&lt;h4&gt;If Linux can do it, you shouldn't.&lt;/h4&gt;
&lt;p&gt;Don't use Hadoop MapReduce until you have a solid reason why &lt;code&gt;xargs&lt;/code&gt; won't solve your problem. Don't implement your own lockservice when Linux's advisory file locking works just fine. Don't do image processing work with PIL unless you have proven that command-line ImageMagick won't do the job.  Modern Linux distributions are capable of a lot, and most hard problems are already solved for you. You just need to know where to look.&lt;/p&gt;
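
&lt;p&gt;To make the file locking point concrete, here is a small sketch using &lt;code&gt;flock(1)&lt;/code&gt; from util-linux, which takes an advisory lock on a file and runs a command while holding it. The helper name &lt;code&gt;run_exclusive&lt;/code&gt; and the lock path in the usage line are made up; treat this as one way to do it, not the only one.&lt;/p&gt;

&lt;pre&gt;
#!/bin/sh
# Sketch only: run a command while holding an advisory lock on a file.
# Assumes flock(1) from util-linux; run_exclusive and the lock path
# in the usage line are made-up names.
run_exclusive() {
    lock_file=$1
    shift
    # -n: give up right away (non-zero exit) if someone else holds the
    # lock, instead of queueing up behind them.
    flock -n "$lock_file" "$@"
}

# Usage: run_exclusive /tmp/nightly-import.lock ./nightly-import
&lt;/pre&gt;

&lt;p&gt;The kernel drops the lock when the process exits, even if it crashes, so there is no stale lock cleanup to write.&lt;/p&gt;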


&lt;h3&gt;Parallelize When You Have To, Not When You Want To&lt;/h3&gt;
&lt;p&gt;I know it seems obvious, but sometimes I need to tell myself explicitly: if the physical machine is not the bottleneck, do not split the work to multiple physical machines. It is usually pretty apparent when you have to parallelize a CPU bound job, but for I/O bound stuff, you have to do some more in-depth measurement.&lt;/p&gt;

&lt;p&gt;For example, if you are doing web crawling, and you have not saturated the pipe to the internet, then it is not worth your time to use more servers. &lt;a rel=&quot;nofollow&quot; href=&quot;http://measuringmeasures.com/blog/2010/8/16/clojure-workers-and-large-scale-http-fetching.html&quot;&gt;This guy&lt;/a&gt; got me thinking about it, he's doing &quot;Large-scale HTTP fetching&quot; in Clojure. He talks about parallelizing with some queueing silliness, but never mentions how much data is moving down the pipe on any one machine. If you have a 100 megabit connection to the internet, and your fetcher is using 700 kilobits, then figure out why your fetcher sucks. (As a side note, I was talking about that post with Milo's prolific systems administrator, and we could not figure out whether the author was an incredibly elaborate troll or just a run-of-the-mill idiot.)&lt;/p&gt;
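
&lt;p&gt;The measurement itself is cheap. As a rough sketch, assuming a Linux box where the kernel exposes per-interface byte counters under &lt;code&gt;/sys/class/net&lt;/code&gt;, you can sample the received-byte counter twice and compare against the link speed. The function names and the &lt;code&gt;eth0&lt;/code&gt; default are illustrative.&lt;/p&gt;

&lt;pre&gt;
#!/bin/sh
# Sketch only: what fraction of the pipe is a fetcher actually using?
# Function names and the eth0 default are illustrative.

# link_utilization START_BYTES END_BYTES SECONDS LINK_MBIT
# prints the percentage of link capacity used over the interval.
link_utilization() {
    awk -v b0="$1" -v b1="$2" -v secs="$3" -v mbit="$4" 'BEGIN {
        bits_per_sec = (b1 - b0) * 8 / secs
        printf "%.1f\n", bits_per_sec / (mbit * 1000000) * 100
    }'
}

# measure_iface IFACE SECONDS LINK_MBIT
# samples the kernel's received-byte counter for a Linux interface.
measure_iface() {
    counter=/sys/class/net/${1:-eth0}/statistics/rx_bytes
    start=$(cat "$counter")
    sleep "$2"
    end=$(cat "$counter")
    link_utilization "$start" "$end" "$2" "$3"
}

# Usage: measure_iface eth0 10 100
&lt;/pre&gt;

&lt;p&gt;For the numbers above, 700 kilobits per second on a 100 megabit link comes out to 0.7 percent of the pipe.&lt;/p&gt;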


&lt;p&gt;This tidbit also goes for data storage. I know Cassandra is all neat and whiz-bang, but I can pretty much guarantee that you don't need it. Multi-terabyte drives are cheap, and PostgreSQL is a known quantity. It's just not worth the risk.&lt;/p&gt;

&lt;h3&gt;How to Babysit a Process&lt;/h3&gt;
&lt;p&gt;This is one I only recently learned. If you have a process running and you want it to be restarted automatically if it crashes, use &lt;a href=&quot;http://upstart.ubuntu.com/&quot;&gt;Upstart&lt;/a&gt;. Upstart is a replacement for the init daemon that can do a lot of cool things, one of which is restart a process if it crashes. An example Upstart config to do this would look like this:&lt;/p&gt;

&lt;pre&gt;
respawn              # Respawn this process if it dies

respawn limit 10 600 # If you have to respawn 10 times 
                     # in 10 minutes, give up

exec python /path/to/my/program.py
&lt;/pre&gt;


&lt;h3&gt;NoSQL is NotWorthIt&lt;/h3&gt;
&lt;p&gt;I've &lt;a href=&quot;/2010/03/i-cant-wait-for-nosql-to-die.html&quot;&gt;talked a lot of shit on NoSQL&lt;/a&gt; in the past, but I recently decided to see if I had been living a lie. I tried to use Redis for some not-as-mission-critical systems at Milo - applications where data loss is not that big of a deal. Redis, even though it's an in-memory database, has a virtual memory feature, where you can cap the amount of RAM it uses and have it spill the data over to disk. So, I threw 75GB of data at it, giving it a healthy amount of physical memory to keep hot keys in.&lt;/p&gt;

&lt;p&gt;For the most part things went smoothly, until it hit the wall. The Redis server would hang, accepting new sockets, but not servicing any requests. Replication stopped working.  When re-starting the server, it would take forty-five minutes (!) from invocation time before it was ready to serve requests.&lt;/p&gt;

&lt;p&gt;To nobody's surprise, I was right. Redis was an unknown quantity, both in how much data it could store reliably and how performance degraded. Yes, maybe things could have been different if I used Cassandra or MongoDB, but the point is the same: &lt;em&gt;newfangled stuff is not worth the risk&lt;/em&gt;, especially if something like PostgreSQL can do the same job.&lt;/p&gt;


&lt;h3&gt;Event Loops are Just Okay&lt;/h3&gt;

&lt;p&gt;One of the most positively retarded things I've ever read comes out of the node.js home page, describing why nonblocking I/O is so great:&lt;/p&gt;

&lt;blockquote&gt;
Almost no function in Node directly performs I/O, so the process never blocks. Because nothing blocks, less-than-expert programmers are able to develop fast systems.
&lt;/blockquote&gt;

&lt;p&gt;Statements like this give me The Fear. Nothing ever blocks, huh? What about the callback that Node runs for new requests? If that does any CPU work, it sure as hell blocks. If each callback does 100 milliseconds of CPU work, then the Node server will only be able to handle 10 requests/sec as a theoretical maximum, because the event loop doesn't pick up new requests until the callback is complete.  Scalability indeed.&lt;/p&gt;
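
&lt;p&gt;The arithmetic generalizes to any callback cost. A back-of-the-envelope sketch, assuming a single event loop and a fixed amount of CPU time per request (&lt;code&gt;max_rps&lt;/code&gt; is a made-up helper name):&lt;/p&gt;

&lt;pre&gt;
#!/bin/sh
# Sketch only: ceiling on requests/sec for one event loop when every
# callback burns a fixed amount of CPU. max_rps is a made-up helper.
max_rps() {
    cpu_ms_per_request=$1
    # One loop can do at most 1000 ms of callback work per second.
    echo $((1000 / cpu_ms_per_request))
}

# Usage: max_rps 100   (the 100 millisecond case above)
&lt;/pre&gt;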


&lt;p&gt;Nothing is more dangerous than a programmer who doesn't know what he doesn't know. Event loops work well if your server is heavily I/O bound, whereas if the server needs to do some nontrivial CPU work, you may be better off with threads. Hell, you can even use both, like Nginx does (well, worker processes, at least), to hold lots of sockets open but still do CPU work asynchronously.&lt;/p&gt;

&lt;p&gt;The point is, evented I/O is not magic scalability pixie dust, and like anything, there is a tradeoff.&lt;/p&gt;


&lt;h3&gt;Hardware Matters&lt;/h3&gt;
&lt;p&gt;Cloud computing was built for suckers by hustlers. The physical machine your programs run on can make all the difference in the world when it comes to performance and reliability. I am not talking about using an Extra Large EC2 instance for your database because it's &quot;beefier&quot;, I am talking about understanding down-to-the-metal, what the performance characteristics of a system are.&lt;/p&gt;

&lt;p&gt;For example, in a heavy write throughput application, you want &lt;code&gt;fsync()&lt;/code&gt; to return as quickly as possible. (For those of you using MongoDB, &lt;code&gt;fsync()&lt;/code&gt; is the system call a program makes to synchronize writes to disk). To the software, &lt;code&gt;fsync&lt;/code&gt; is a black box that you can't muddle with, that is, unless you have your shit together in the hardware. If you spend the extra money on a battery-backed RAID controller, &lt;code&gt;fsync&lt;/code&gt; can return almost immediately, because the controller can hold writes in battery-backed memory and guarantee that they will be flushed to disk, even in the event of a power failure. On a database machine, you will see a significant write performance increase if you use the correct hardware.&lt;/p&gt;
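
&lt;p&gt;One rough way to see what synchronous writes cost on a given box is GNU &lt;code&gt;dd&lt;/code&gt;. This is a sketch under assumptions: &lt;code&gt;oflag=dsync&lt;/code&gt; makes every write wait for the data to reach the device, which stands in for a write followed by &lt;code&gt;fsync()&lt;/code&gt; (close, but not identical), and &lt;code&gt;sync_write_test&lt;/code&gt; and the scratch path are made-up names.&lt;/p&gt;

&lt;pre&gt;
#!/bin/sh
# Sketch only: time small synchronous writes with GNU dd.
# sync_write_test is a made-up name; point it at a scratch file on
# the disk you care about.
sync_write_test() {
    target=$1
    count=${2:-1000}
    # oflag=dsync opens the file with O_DSYNC, so each 4k write waits
    # until the data is on stable storage (or in the controller's
    # battery-backed cache). dd prints elapsed time and throughput
    # on stderr when it finishes.
    dd if=/dev/zero of="$target" bs=4k count="$count" oflag=dsync
}

# Usage: sync_write_test /var/tmp/fsync-scratch 1000
&lt;/pre&gt;

&lt;p&gt;Run it on the hardware you are considering and compare the elapsed times; the gap between a plain disk and a battery-backed controller should show up here before it shows up in your database.&lt;/p&gt;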

&lt;p&gt;Suppose again that you are serving data from disk in a heavy read throughput application. Data is accessed randomly, so to service those requests, the disk's read heads are scurrying about the platters constantly.  With EC2, when Amazon says &quot;I/O performance: High&quot;, what does that even mean? Is that suitable for a heavy random read scenario?  Again, knowing your shit when it comes to hardware is valuable here. Solid-state hard disks, while expensive, have unbelievable random read performance. (Their sequential read performance isn't amazingly better than spinning disks, though.) Spending the extra money on SSD drives is almost always a win if you have a disk-bound problem.&lt;/p&gt;

&lt;p&gt;Hardware is one of the best places to put capital to work. It is far more efficient to buy your way out of a performance problem than it is to rewrite software. When running your app on commodity hardware, don't expect anything better than commodity performance.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>Too Smart for Git</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/08/too-smart-for-git.html"/>
   <updated>2010-08-01T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/08/too-smart-for-git</id>
   <content type="html">&lt;img class=&quot;post-lead-image&quot; src=&quot;/teddziuba/images/the-fundamental-thought-that-drives-a-womans-every-action-is-that-somewhere-a-man-is-relaxing.jpg&quot;&gt;

&lt;p&gt;The worst part about Git is the &quot;Git-SVN Crash Course&quot; tutorial, because it will convince a newcomer that their understanding of how source control works transfers smoothly to Git.  If you combine that with the fact that most smart people simply refuse to read documentation and will just mash on the keyboard until it &lt;em&gt;appears&lt;/em&gt; to work, it means that teams who are starting out using Git are on the yellow brick road to a world of hurt.&lt;/p&gt;

&lt;p&gt;Git follows Linux's philosophy of refusing to protect you from yourself. Much like Linux, Git will sit back and watch you fuck your shit right up, and then laugh at you as you try to get your world back to a state where up is up and down is down. As far as source control goes, not a lot of people are used to this kind of free love.&lt;/p&gt;

&lt;p&gt;Milo has been using Git from the get-go, and the git pull/git push workflow with a central repository works well enough when you only have a handful of developers.&lt;/p&gt;

&lt;img class=&quot;mt-image-center&quot; src=&quot;/teddziuba/images/good-luck-untangling-that-commit-history.png&quot;&gt;

&lt;p&gt;This fits with the Subversion worldview that a benevolent central server will never lead you astray.  The first time you get some error about not being able to push non-fast-forward commits to origin, you kind of gloss over it, and a &lt;code&gt;git pull&lt;/code&gt; fixes you right up. If you understand Git as you understand SVN, you can easily mistake this error for something like &quot;oh, I just have to do a merge because someone else's commits beat me to it&quot;. Well, technically that's right, but for the wrong reasons.  Using this model, if you think of the developers as different threads sharing the origin repository resource, Git is putting a lock on the &lt;em&gt;whole repository&lt;/em&gt;, as opposed to individual files. With a lot of threads sharing a coarsely-locked resource, there's always going to be contention. However, in a small enough team, it's workable to think of it this way.&lt;/p&gt;
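
&lt;p&gt;For reference, here is a sketch of one common way out of a rejected push that does not leave a merge commit behind: fetch, replay your local commits on top of the remote branch, then push. It is an alternative to the plain &lt;code&gt;git pull&lt;/code&gt; above, which merges instead. It assumes the remote is named &lt;code&gt;origin&lt;/code&gt; and that the local branch has the same name as the remote one; &lt;code&gt;sync_and_push&lt;/code&gt; is a made-up helper name.&lt;/p&gt;

&lt;pre&gt;
#!/bin/sh
# Sketch only: after a non-fast-forward rejection, rebase onto the
# remote branch and push again. Assumes a remote named origin and a
# local branch named the same as its remote counterpart.
sync_and_push() {
    branch=$(git symbolic-ref --short HEAD) || return 1
    # Download what the others pushed, without touching local work.
    git fetch origin || return 1
    # Replay local commits on top of the remote tip: no merge commit.
    git rebase "origin/$branch" || return 1
    # Local history now extends the remote's, so this is a fast-forward.
    git push origin "$branch"
}
&lt;/pre&gt;

&lt;p&gt;If the rebase hits a conflict, the function stops there and leaves you in the middle of the rebase to sort it out by hand.&lt;/p&gt;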

&lt;p&gt;This is where smart people run into trouble.  We know in the backs of our minds that something is just a little off about that non-fast-forward-commit error message, but everything looks fine, so why go read about what's going on? I am generally not a fan of reading documentation until I am debugging a problem, because I expect software to behave as it would if I wrote it. Good documentation is condescendingly terse, and Git's documentation is like an art critic who giggles at you. (&lt;em&gt;The description of git-rebase is &quot;Forward-port local commits to the updated upstream head&quot;. Oh, fuck off.&lt;/em&gt;)&lt;/p&gt;

&lt;p&gt;In the last couple of weeks, we started using Gerrit for code reviews, and anyone who was using Git like a translation table for SVN commands just heard the bell ring at the School of Hard Knocks.&lt;/p&gt;

&lt;img class=&quot;mt-image-center&quot; src=&quot;/teddziuba/images/rebase-ooh-that-sounds-fun.png&quot;&gt;

&lt;p&gt;Gerrit acts as an intermediary between developers and the origin repository. You send commits to Gerrit and it holds them in purgatory until they are signed off on by another developer. Then, if the commit applies cleanly to the branch, Gerrit applies it. Otherwise, it asks you to upload a merge commit, which is where the fun &lt;em&gt;really&lt;/em&gt; starts.&lt;/p&gt;

&lt;p&gt;The upside to this system is that it prevents other people's &lt;code&gt;git-fuckups&lt;/code&gt; from becoming your &lt;code&gt;git-fuckups&lt;/code&gt;. The downside is that because commits are held in this transient state, it's very easy to lay a path of destruction through your local repository with &lt;code&gt;rebase&lt;/code&gt; and &lt;code&gt;merge&lt;/code&gt; if you don't completely grok what's going on. Like old skid marks on a highway, our codebase is peppered with merge commits and cherry-picks that are the only living records of a series of oh-shit moments.&lt;/p&gt;

&lt;p&gt;The problem isn't that Git is too hard, it's that smart developers are impatient and have exactly zero tolerance for unexpected behavior in their tools. While Git is the trendy thing right now, perhaps some day you will come across a grizzled developer who is using SVN, and when you ask him why, his answer won't make sense, because it's a Zen thing.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>SEO Is Mostly Quack Science</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/06/seo-is-mostly-quack-science.html"/>
   <updated>2010-06-12T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/06/seo-is-mostly-quack-science</id>
   <content type="html">&lt;img src=&quot;/teddziuba/images/set-phasers-to-troll.jpg&quot; class=&quot;post-lead-image&quot;&gt;&lt;p&gt;Most college mathematics or computer science departments have a
&quot;crank bin&quot;: a box with collected papers that people have sent in for
review. There are all sorts of gems in there: a two-page proof of the
Riemann hypothesis, a drawing that demonstrates P = NP, and of course,
a draft of a patent application for a free energy machine.  Many
professors just throw this crap out, but some collect it because it
makes for a good read when you're feeling discouraged by a hard
problem.&lt;/p&gt;

&lt;p&gt;Me, whenever I need a pick me up, I go read some of the latest new
techniques for SEO.  There are a handful of fundamentals about
page design and other nitty things like URL structure that are
generally accepted as &lt;em&gt;good SEO&lt;/em&gt;, and you can derive all of
this from the principles of not completely failing at web design.
Non-brain-damaged web design and link building are 100% of SEO.&lt;/p&gt;

&lt;p&gt;Anyone who tells you different is a quack that is only trying to separate you
from your money.&lt;/p&gt;

&lt;p&gt;Quackery in medicine is pretty easy to spot, and quackery in
computing is pretty similar:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&quot;Research&quot; performed by people with slim technical
  backgrounds&lt;/li&gt;
  &lt;li&gt;Suspect experimental controls or no experimental controls&lt;/li&gt;
  &lt;li&gt;Little investigation of alternative explanations for
  phenomena&lt;/li&gt;
  &lt;li&gt;Little to no data reported from findings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let's look at my favorite example: SEOMoz. It's a wealth of
collected information about SEO, almost completely anecdotal, and of
course you can get access to more &quot;professional&quot; information for a
fee. There is probably a place on the site where you can buy Acai
berries, but I haven't found it yet.&lt;/p&gt;

&lt;p&gt;SEOMoz recently &lt;a rel=&quot;nofollow&quot;
href=&quot;http://www.seomoz.org/blog/google-vs-bing-correlation-analysis-of-ranking-elements&quot;&gt;published&lt;/a&gt;
a correlation analysis of ranking factors for Google and Bing. First
of all, out of all the factors they measured ranking correlation for,
nothing was correlated above .35. In most science, correlations this
low are not even worth publishing. As a warm-up, the author explains a
graph that shows &lt;em&gt;negative&lt;/em&gt; correlation with rank for URL
length and .com TLD extension, meaning that longer URLs ranked lower
in the search results, as did URLs that came from .com
domains:&lt;/p&gt;

&lt;img class=&quot;mt-image-center&quot; src=&quot;/teddziuba/images/bing-v-google-negative-corr.gif&quot;&gt;

&lt;p&gt;This is the explanation, verbatim:&lt;/p&gt;

&lt;blockquote&gt;The data for URL length shows that longer URLs are
negatively correlated with ranking well. This isn't particularly
shocking, and it probably is wise to limit the length of our URLs if we
want to perform well in the engines. However, the second data point on
.com TLD extensions shouldn't necessarily suggest that using .com as
your top-level domain extension will actually negatively affect your
rankings, but merely that all other things being equal, .com domains
didn't perform as well in the dataset we observed as other domain
extensions.&lt;/blockquote&gt;

&lt;p&gt;That is not how science works. You can't discount data just because
you feel like it.  Also notice that the most negative correlation metric they
found was -.18. A correlation of zero means that the two variables
have no linear relationship to one another. Such a small correlation on
such a small data set, again, is not even worth publishing.&lt;/p&gt;

&lt;p&gt;There is no hypothesis being tested here. It's just graphs, and
misleading graphs at that. The sad part is, SEOMoz is as close as the
SEO industry comes to real science. They may be presenting specious
results in hopes of looking like they know what they're talking about,
but at least they are collecting some sort of data.&lt;/p&gt;

&lt;p&gt;Everything else in the field is either anecdotal hocus-pocus or a
decree from Matt Cutts. When you hire an SEO consultant, what you are
really paying for is domain experience in the
not-failing-at-web-design field. It's fine to pay for this kind of
service, but beware of anyone who claims to have studied the effects
of different techniques. They might give you skin failure.&lt;/p&gt;

&lt;p&gt;Update: It looks like I am not the first one to notice this. Here is a good article with more &lt;a href=&quot;http://irthoughts.wordpress.com/2010/04/23/beware-of-seo-statistical-studies/&quot;&gt;statistical formalism&lt;/a&gt; on SEOMoz quackery.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>The Future of Apple's Curated Computing</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/05/the-future-of-apples-curated-computing.html"/>
   <updated>2010-05-15T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/05/the-future-of-apples-curated-computing</id>
   <content type="html">&lt;img src=&quot;/teddziuba/images/step-1-load-the-gun-step-2-kill-yourself-step-3-theres-no-step-3.jpg&quot; class=&quot;post-lead-image&quot;&gt;

&lt;p&gt;
&lt;em&gt;Here’s to the crazy ones. The misfits. The rebels. The troublemakers. The round pegs in the square holes. The ones who see things differently. They’re not fond of rules. And they have no respect for the status quo. You can praise them, disagree with them, quote them, disbelieve them, glorify or vilify them. About the only thing you can’t do is ignore them. Because they change things. They invent. They imagine. They heal. They explore. They create. They inspire. They push the human race forward.&lt;/em&gt; -- Apple ad, 1997.
&lt;/p&gt;

&lt;p&gt;My first computer was a 33MHz Macintosh Performa 637-CD. It had 8
megabytes of memory. It was one of the newer Macs that featured a
CD-ROM drive. We bought Apple's 2400 baud modem as an accessory, and
signed up for eWorld.&lt;/p&gt;

&lt;p&gt;The one thing that got me into exploring computers in the first
place was learning how to hack the game Escape Velocity using ResEdit,
Apple's pistol-without-a-safety tool that was supposed to be for
developers but was available to anyone. (I became an expert at
fresh-installing MacOS 7.5.)&lt;/p&gt;

&lt;p&gt;At the time, I was about as fanboy as they came. This was
the era when Apple was considered the walking dead, only kept
standing by the mercy of Adobe continuing to release Photoshop, and
Microsoft, possibly to keep the antitrust litigators out of their ass,
publishing Office for the Mac.  Being a Mac user in the late nineties,
you had this feeling that you were on the side of right &amp;mdash; that
the competition wasn't about megahertz or gigabytes, but that the
counterculture was the spark that would give way to the natural order
of things.  We thought we were on the ground floor of the inevitable.&lt;/p&gt;

&lt;p&gt;Apple now stands as a monument to the failure of that free-thinking
counterculture.  We thought that freedom from the tie-wearing,
meeting-holding, memo-dictating corporate world was going to be the
catalyst for utopian computing, and that Steve Jobs had the vision for
how it was all going to work. Maybe he did, maybe he didn't, but that
utopian endgame is quickly devolving into a dictatorship.&lt;/p&gt;

&lt;p&gt;It started with Apple's tight control on the iPhone app market, the
approvals process, and the well-manicured app store. Now, Apple is not
only dictating what applications may or may not run on the iPhone or
iPad, but they are also dictating &lt;em&gt;the language in which apps must
be written&lt;/em&gt;. Their justification for all of this is &quot;for the good
of the user&quot;, but it might just be the capstone delusion of an aging
hippie who never got a chance to run for Congress. I predict that
within five years, Apple will begin &lt;em&gt;telling&lt;/em&gt; development shops
what kinds of apps they should make. Why? Because it will be &quot;good for
the user&quot;, and you know, Mr. iPad developer, apps that are good for
users usually sail right through the approvals process. Apple's
iPhone/iPad department will be renamed Central Planning, and may God
help you if you cross them.&lt;/p&gt;

&lt;p&gt;I could be wrong, though. The backlash has been pretty severe, to
the point where it may be getting to Steve Jobs. Take, for example, a
recent e-mail exchange he had with a Gawker reporter, in which Jobs
took a shot:&lt;/p&gt;

&lt;blockquote&gt;By the way, what have you done that’s so great? Do you
create anything, or just criticize others work and belittle their
motivations?&lt;/blockquote&gt;

&lt;p&gt;Back when I was writing Uncov, I would see this particular flavor
of ballache pretty frequently, and it was a very good indicator of
the person's nerves. The thing is, being a CEO, you need to be able to
let the critics roll off your back. We talk shit, it's our job, and
the bigger the shit we talk, the more we get paid. Most executives
know this, and don't respond to us trolls. It's only when they're
starting to wear down that they bust out the Teddy Roosevelt
man-in-the-arena speech. (By the way, Theodore Roosevelt was shot in
the chest once, and proceeded to deliver a speech with the bullet
still in him. He left the bullet in his body until his death seven
years later. Executives: That shit's hard core...you are not TR.)&lt;/p&gt;

&lt;p&gt;I will still probably buy an iPhone some day because they are very
cool. However, I will never develop for it, because I'm a crazy one. A
misfit. And I'm not fond of rules.&lt;/p&gt;

&lt;p&gt;Now where did I pick up that idea?&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Why Engineers Hop Jobs</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/05/why-engineers-hop-jobs.html"/>
   <updated>2010-05-01T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/05/why-engineers-hop-jobs</id>
   <content type="html">&lt;p&gt;&lt;img
src=&quot;/teddziuba/images/why-is-it-that-militant-atheists-never-say-anything-bad-about-jews.jpg&quot;
class=&quot;post-lead-image&quot;&gt;What's with all the hate on my generation? It started when somebody
 quit Jason Calacanis's industrial web spam startup, Mahalo, for a higher
paying position at a competitor. Predictably, Calacanis went apeshit on
the poor guy in a very public way, and this started a cascade of blogosphere
butthurt about people in software under thirty: that we're unreliable,
that we're lazy, that we're entitled.&lt;/p&gt;

&lt;p&gt;Well I'm as unreliable, lazy, and entitled as the next guy, but that's not
why I've hopped jobs in the past. People in my generation have a very
low tolerance for bullshit, and software engineering, in general, is a
very high bullshit career. If you couple that with the standard load
of bullshit you would get from a non-technical Harvard MBA type boss &amp;mdash;
like many CEOs that you find trying to get rich in Silicon Valley by
hiring some engineers to &quot;code up this idea real quick&quot; &amp;mdash; it's no
wonder that a good engineer will walk off the job after his one-year
vesting cliff.&lt;/p&gt;

&lt;p&gt;As an engineer, you are told that you're &quot;lucky to have a job&quot;, because there are &quot;a hundred people lined up
outside, ready to take it&quot;. (As chance would have it, there are at
least a thousand lined up to take the job of &lt;em&gt;rich prick who tells
people what to do&lt;/em&gt;). This backlash is the product of diseased
thinking. A CEO who makes an engineer work 80 hours a week is a driven
entrepreneur, but an engineer asking for a comfy chair is a prima
donna. So, when we are up to our knees in golf-course, martini-lunch
bullshit, don't be surprised when we jump ship for a higher
salary.&lt;/p&gt;

&lt;p&gt;I recognize the value of business people and
management. Somebody has to sell the code that I write, which in turn
puts food on my table. Since I &lt;em&gt;am&lt;/em&gt; an engineer, I like
iterative optimization. Every time I have left a job, I have
further refined the requirements that a person must fill before I agree to work for him. After every job, I add one or two requirements to the list, and
I have found that my happiness at work improves dramatically with
every step.&lt;/p&gt;

&lt;p&gt;This is my current list:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The organization must need me at least as much as I need it.&lt;/li&gt;
  &lt;li&gt;My direct manager must have a technical background &amp;mdash; enough to understand why programming is hard.&lt;/li&gt;
  &lt;li&gt;My direct manager must have enough experience or raw intelligence such that I can trust him/her to make decisions, even though I may not understand the reasoning.&lt;/li&gt;
  &lt;li&gt;I must have absolute faith in the business plan.&lt;/li&gt;
  &lt;li&gt;I must have absolute faith in &quot;the business side&quot; to execute that plan.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, Jason, when that fellow quit Mahalo, he didn't just leave you
in the lurch. He added something to his list. Maybe you should find
out what that is.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Blog Upgrade</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/04/blog-upgrade.html"/>
   <updated>2010-04-04T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/04/blog-upgrade</id>
   <content type="html">&lt;img src=&quot;/teddziuba/images/government-bailouts-prove-that-investment-banking-is-now-graded-on-a-curve.jpg&quot; class=&quot;post-lead-image&quot;&gt;

&lt;p&gt;So that's it, I'm finally done with Movable Type. I had upgraded my
  4.x install to the newest 5.x, and the process was nothing but a
  colossal fuckup. After the fact, my site was compromised 4 times by
  my last count &amp;mdash; some bot set up a handful of phishing pages
  here. RSA Security caught it, notified Dreamhost, and they shut me down a
  couple of times.&lt;/p&gt;

&lt;p&gt;Anyhow, rather than figure out the attack vector with Movable Type,
  I decided to scrap it and
  use &lt;a href=&quot;http://github.com/mojombo/jekyll&quot; rel=&quot;nofollow&quot;&gt;Jekyll&lt;/a&gt;. It only took me a day and a
  couple of angry Python scripts to migrate all my shit to Jekyll from
  MT. Comments are still off because I don't care in the slightest
  what people have to say, and certainly not enough to slow my stuff
  down with JavaScript.&lt;/p&gt;

&lt;p&gt;One of the reasons I used MT in the first place was that it
  generated static HTML pages for all posts, instead of doing
  something silly like querying a database to generate what amounts to
  static content. Because of this, my pages load (first request to
  final render) in about 300
  milliseconds on my home connection. For comparison, techcrunch.com
  can take upwards of 1 minute from first request to final render.&lt;/p&gt;

&lt;p&gt;In this regard, Jekyll feels right. I can keep everything under
  version control, the templating is only marginally braindead, and
  the publishing step is rsync. After using Jekyll, I feel like every
  other blogging engine out there is telling me, &lt;em&gt;&quot;You'll shoot
  your eye out, kid!&quot;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It's also come to my attention that the re-do with Jekyll has
  caused my posts to show up afresh in all of your Google Reader
  accounts. This was unintentional, but a nice benefit. It's true,
  this is the greatest web site on the internet, and everything you
  need to know, you can find out here.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>I Can't Wait for NoSQL to Die</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/03/i-cant-wait-for-nosql-to-die.html"/>
   <updated>2010-03-04T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/03/i-cant-wait-for-nosql-to-die</id>
   <content type="html">&lt;img alt=&quot;trolling-chatroulette-with-pictures-of-suicide-scenes-will-never-stop-being-funny.jpg&quot; src=&quot;/teddziuba/images/trolling-chatroulette-with-pictures-of-suicide-scenes-will-never-stop-being-funny.jpg&quot; width=&quot;335&quot; height=&quot;224&quot; class=&quot;post-lead-image&quot; style=&quot;float: right; margin: 0 0 20px 20px;&quot; /&gt;They don't teach you this in college, but the fundamental theorem of the software industry is the idea that everything needs to be rewritten all the time. &amp;nbsp;As a corollary, web startup engineers believe that there is no problem but scalability, &amp;nbsp;and architecture is its solution. And thus, the NoSQL movement was born.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The idea is that relational databases like MySQL and PostgreSQL have outlived their useful lifetimes, and that document-based or schemaless databases are the wave of the future. Never mind of course that MySQL was the perfect solution to everything a few years ago when Ruby on Rails was flashing in the pan. Never mind that &lt;i&gt;real&lt;/i&gt;&amp;nbsp;businesses track all of their data in SQL databases that scale just fine. (For Silicon Valley readers, Walmart is a &lt;i&gt;real business&lt;/i&gt;, Twitter is not.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Invariably, all web projects start off with something like Rails or Django, most likely backed by MySQL. The data relationships are easy to model, and the application works well. &amp;nbsp;If you are lucky enough that people actually &lt;i&gt;use&lt;/i&gt;&amp;nbsp;your application, eventually you will start to see some performance issues. At this point, a developer who values technological purity over gettin' shit done will advocate &quot;rewriting the whole thing in a weekend using Cassandra&quot;. &amp;nbsp;And if he's smart enough, he might just pull it off. 
(Of course, said developer has only migrated the &lt;i&gt;app&lt;/i&gt;&amp;nbsp;to use a different data store - all of the ancillary support code was conveniently ignored)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So you've magically changed your backend from MySQL to Cassandra. Stuff will just work now, right? Well, no. Did you know that Cassandra requires a restart when you change the column family definition? Yeah, the MySQL developers actually had to think out how ALTER TABLE works, but according to Cassandra, that's a hard problem that has very little business value. Right.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'm not just singling out Cassandra - by replacing MySQL or Postgres with a different, new data store, you have traded a well-enumerated list of limitations and warts for a newer, poorly understood list of limitations and warts, and &lt;i&gt;that&lt;/i&gt;&amp;nbsp;is a huge business risk.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;img alt=&quot;one-skill-all-junior-developers-lack-is-knowing-how-to-tell-your-boss-to-fuck-off.jpg&quot; src=&quot;/teddziuba/images/one-skill-all-junior-developers-lack-is-knowing-how-to-tell-your-boss-to-fuck-off.jpg&quot; width=&quot;250&quot; height=&quot;287&quot; class=&quot;mt-image-left&quot; style=&quot;float: left; margin: 0 20px 20px 0;&quot; /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;You Are Not Google&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;The sooner your company admits this, the sooner you can get down to some real work. &amp;nbsp;Developing the app for Google-sized scale is a waste of your time, plus, there is no way you will get it right. Absolutely none. 
It's not that you're not smart enough, it's that you do not have the experience to know what problems you will see at scale.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Besides, did you know that Google Adwords is &lt;a href=&quot;http://en.wikipedia.org/wiki/AdWords#Technology&quot;&gt;implemented on top of MySQL&lt;/a&gt;? &amp;nbsp;What, that business critical code that operates at massive scale doesn't use BigTable? No, in fact there is such enormous value in sticking with what works that Google identifies problems with InnoDB at scale and submits patches, instead of saying &quot;MySQL doesn't scale, let's dump it for something else&quot;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;NoSQL will never die, but it will eventually get marginalized, like how Rails was marginalized by NoSQL. &amp;nbsp;In the meantime, DBAs should not be worried, because any company that has the resources to hire a DBA likely has decision makers who understand business reality.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style=&quot;font-size: 0.8em; &quot;&gt;Top photo credit &lt;a rel=&quot;nofollow&quot; href=&quot;http://www.paulrussell.info/&quot;&gt;Paul Russell&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;
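&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As a small illustration of the ALTER TABLE point above, here is a minimal sketch of a schema change on a live connection, with no restart and with existing rows left intact. It uses Python's bundled SQLite purely as a stand-in for MySQL, which is an assumption made to keep the example self-contained; the &lt;code&gt;users&lt;/code&gt; table and its columns are hypothetical.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;
```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

# Change the schema on the open connection: no restart, existing rows survive.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
conn.execute("INSERT INTO users (name, email) VALUES ('bob', 'bob@example.com')")

# Rows written before the change simply report NULL (None) for the new column.
rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
print(rows)
conn.close()
```
&lt;/pre&gt;&lt;/div&gt;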
</content>
 </entry>
 
 <entry>
   <title>Eventlet: Asynchronous I/O for Grownups</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/02/eventlet-asynchronous-io-for-g.html"/>
   <updated>2010-02-11T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/02/eventlet-asynchronous-io-for-g</id>
   <content type="html">&lt;img alt=&quot;lose-an-argument-like-a-man-say--well-i-guess-ill-just-go-fuck-myself-then.jpg&quot; src=&quot;/teddziuba/images/lose-an-argument-like-a-man-say--well-i-guess-ill-just-go-fuck-myself-then.jpg&quot; width=&quot;250&quot; height=&quot;188&quot; class=&quot;post-lead-image&quot; style=&quot;float: right; margin: 0 0 20px 20px;&quot; /&gt;Event-driven asynchronous I/O is the newest chatter at the Silicon Valley High Abercrombie table. &amp;nbsp;Threading, the mode of parallelism we all thought we were so smart for understanding, isn't cool anymore. Everybody who is anybody is using asynchronous I/O, and of course, there are different opinions on how it should be done. This being the software world, you can count on those opinions being vehement.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you look at the benchmarks, all of the major async libraries for Python are basically on the same operating plane. There's Twisted, Tornado, gevent, and a handful of others, but the one that really stands out in the group is &lt;a href=&quot;http://eventlet.net/&quot;&gt;Eventlet&lt;/a&gt;. Why is that? Two reasons:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1. &lt;b&gt;You don't need to get balls deep in theory to be productive with Eventlet.&lt;/b&gt;&lt;/div&gt;&lt;div&gt;2. &lt;b&gt;You need to modify very little pre-existing code to adapt a program to be event-driven.&lt;br /&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;Eventlet's approach is that &lt;i&gt;asynchronous code should look like synchronous code&lt;/i&gt;. Why? Because it's easy for people to understand synchronous code. &amp;nbsp;Thinking about callbacks and schedulers is unnecessary, after all, we have work to do. 
What's more, not only does asynchronous code with Eventlet &lt;i&gt;look&lt;/i&gt; synchronous, it can also &lt;i&gt;run&lt;/i&gt; synchronously.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Look at this Python snippet:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fetch_and_parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;contents&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urllib2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;urlopen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;tree&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lxml&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;html&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fromstring&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;contents&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# Do some parsing on the ElementTree&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

It looks like regular synchronous code, and ostensibly it is. The output of the URL fetch is the input to the HTML parser. However, if you have a ton of URLs to do this to, how would you parallelize it? Threads are an option, but so is Eventlet:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;eventlet&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;eventlet.green&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urllib2&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;urls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# urls: an iterable of URL strings to fetch concurrently&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;green_pool&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eventlet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GreenPool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;green_pool&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;imap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fetch_and_parse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;urls&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;

This is interesting because all I've done to make a seemingly synchronous piece of code run asynchronously is to patch the library it needs for I/O and give it a driver function. That driver could have easily been a series of threads all reading from a Queue, and importing the standard library's version of urllib2.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now hold on a second. This is a painfully contrived example, but it's such a key point: The asynchronous code looks synchronous. It can even function synchronously. All I did to make it use event-driven I/O is &lt;b&gt;change the driver and patch a library&lt;/b&gt;. Now this is podracing!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That sort of integration has such a massive business value that I will easily disregard any pissing-contest performance gains that Twisted or Tornado may offer. I know that when you have code written in the &quot;old&quot; style, and the powers that be hand down the &quot;new&quot; style, there is an itch to re-write it, but rewriting known-working code is the worst thing you can do for your project.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The Eventlet developers have gone further than this, providing a facility to monkey-patch the existing system libraries at invocation time. For example, let's say you have a web app that does some Memcached I/O and some database I/O.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;eventlet&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;patcher&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;patcher&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;monkey_patch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;all&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

Oh look. Your application is now using asynchronous I/O. This call patches Python's socket module and a few others to make it all &quot;just work&quot; with Eventlet's internal coroutine switching mechanism. (Caveat: MySQLdb, which uses C-land sockets, needs a little bit of extra treatment, but it's only a couple of lines.)&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This all sounds great in theory, but I have actually made a large I/O bound program work using monkey patching and changing the driver. It is a piece of software that reads jobs from a queue and processes them, putting the result in memcached. For esoteric reasons I will not go into, the job processors could not thread the work; they had to fork. Using this setup, one production box with 8GB of RAM was consistently 7.5GB full. After a code change of fewer than 5 lines to the driver, that same production box uses only around 1GB of RAM consistently, and can handle 5 to 10x the throughput of the old system.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now compare this to Twisted or Tornado. Twisted tries so damn hard to be Java that it really offends me personally. Those developers strike me as the alpha-programmer types who see no reason &lt;i&gt;not&lt;/i&gt; to rewrite an existing codebase for a 20% performance gain. &amp;nbsp;Tornado on the other hand is significantly less Jersey Shore douchebaggy, but they still miss the point: we are programmers who need to get stuff done. Inventing your own HTTP client class when Python's builtin works just fine, if not better, is the type of hubris that gets hotshot programmers fired in their first month.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There's also gevent, which appears to be a fork of Eventlet, but is not as well documented. Partial credit.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's hard to find a performance or scaling related open source library that values my time. 
Eventlet is one of those rare few.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href=&quot;http://eventlet.net/&quot;&gt;http://eventlet.net&lt;/a&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt; 
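&lt;div&gt;For comparison, the thread-based driver mentioned earlier (a series of threads all reading from a Queue) might look roughly like the sketch below. It is written for Python 3, where the module is named &lt;code&gt;queue&lt;/code&gt;, and the &lt;code&gt;fetch_and_parse&lt;/code&gt; here is a hypothetical stand-in that does no network I/O, so the example runs anywhere. The point is only that the worker function stays the same and the driver around it changes.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;
```python
import queue
import threading

def fetch_and_parse(url):
    # Hypothetical stand-in for the real function: no network, just a fake "parse".
    return len(url)

def worker(jobs, results):
    # Each thread pulls URLs off the shared queue until it sees the None sentinel.
    while True:
        url = jobs.get()
        if url is None:
            break
        results.put((url, fetch_and_parse(url)))

def main(urls, num_threads=10):
    urls = list(urls)
    jobs, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for url in urls:
        jobs.put(url)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker so every thread exits
    for t in threads:
        t.join()
    # Every job has finished by now, so one result per URL is waiting.
    return dict(results.get() for _ in urls)

print(main(["http://a.example/", "http://bb.example/"], num_threads=2))
```
&lt;/pre&gt;&lt;/div&gt;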
</content>
 </entry>
 
 <entry>
   <title>Break My Concentration and I Break Your Kneecaps</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/01/break-my-concentration-and-i-b.html"/>
   <updated>2010-01-24T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/01/break-my-concentration-and-i-b</id>
   <content type="html">&lt;img alt=&quot;a-handgun-is-like-an-atm-machine-and-convincing-argument-all-in-one.jpg&quot; src=&quot;/teddziuba/images/a-handgun-is-like-an-atm-machine-and-convincing-argument-all-in-one.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; width=&quot;279&quot; height=&quot;279&quot; /&gt; &lt;div&gt;I own a good set of headphones that fully enclose my ears. I am not an audiophile, I just don't like to hear other people talk at me. &amp;nbsp;When I am staring at my Emacs windows with headphones on, it generally isn't a physical cue that I am looking for conversation. In fact, when I am that deep into thinking out a problem and I get interrupted, I think about the anti-workplace-violence clause in the employee handbook, and how a poorly lit parking lot probably doesn't qualify as &quot;company property&quot;.&lt;br /&gt;&lt;br /&gt;Interrupting a thinking programmer is a sucker punch to productivity's kidney. Of course it's still important to keep open communication channels, especially in a small team. I don't mind answering questions and helping out, so long as it's not an immediate context switch for me, i.e. I'll help you if I don't have to speak.&lt;br /&gt;&lt;br /&gt;Instant messaging is a decent first attempt, but it's only person-to-person communication. (And no, group-IM &lt;i&gt;never&lt;/i&gt; fucking works right.) Programming teams need group chat.&amp;nbsp; White-label Twitter clones like Yammer are okay, but I feel icky using a product that is hailed as a technological advance for supporting the ability to identify topics by prefixing a word with a pound sign. That, and I want to keep an eye on the conversation as I work, and my attention isn't on my IM client or browser when I'm coding. It's on Emacs. &lt;br /&gt;&lt;br /&gt;The answer, of course, is IRC.&lt;br /&gt;&lt;br /&gt;My team recently grew, and four of us need to communicate constantly. 
I set up an IRC server and brought people in. One non-programmer who needed to be in the loop had never used IRC, but caught on quickly. Productivity is up, as is communication. The developer chat channel is right in front of me as I work, as a window in Emacs:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;at-the-crunchies-i-got-drunk-and-started-heckling-people-who-used-to-be-important.png&quot; src=&quot;/teddziuba/images/at-the-crunchies-i-got-drunk-and-started-heckling-people-who-used-to-be-important.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; width=&quot;576&quot; height=&quot;317&quot; /&gt;Think of developer communication like I/O. There's blocking and nonblocking. When somebody talks to me as I work, my programming train of thought needs to block. With inline chat like you see above, I can answer questions when I have spare cycles. Since the conversation is integrated into my development environment, I don't need to look around at other applications, and there's no popup notification bouncing around like a Jack Russell terrier who got into my Adderall supply. Also since it's Emacs, it's not vim. If you use vim, /quit #life.&lt;br /&gt;&lt;br /&gt;Collaboration technology doesn't need to be re-invented every six years. The stuff we had in the eighties works just fine.&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Options for Parallel Compression</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/01/options-for-parallel-compressi.html"/>
   <updated>2010-01-15T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/01/options-for-parallel-compressi</id>
   <content type="html">&lt;img alt=&quot;when-a-couple-gets-a-dog-its-like-saying-we-want-a-baby-but-dont-want-to-go-to-jail-if-it-dies-by-accident.jpg&quot; src=&quot;/teddziuba/images/when-a-couple-gets-a-dog-its-like-saying-we-want-a-baby-but-dont-want-to-go-to-jail-if-it-dies-by-accident.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;255&quot; width=&quot;288&quot; /&gt;At Milo, I pretty frequently need to pull data down from production to my workstation to test some new code. That's what happens when you raise a Series A round - you can't live-edit production data anymore. I think it's in the term sheet somewhere.&lt;br /&gt;&lt;br /&gt;Anyhow, I was pulling down a 14GB MySQL database dump today. Trying to compress it through plain Jane gzip was pretty slow, so I looked for some parallel options. The server I was pulling from has 16 cores, so I figured I could make use of them.&amp;nbsp; Anyhow, here's what I found:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;http://compression.ca/pbzip2/&quot;&gt;pbzip2 - Parallel BZIP2&lt;/a&gt;: Parallel implementation of BZIP2. BZIP2 is well known for being balls slow, so speed it up using multiple CPUs.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.zlib.net/pigz/&quot;&gt;pigz - Parallel GZIP&lt;/a&gt;: Parallel implementation of GZIP written by Mark Adler (guy who co-authored zlib and gzip, so you can be reasonably confident he has his shit together).&lt;/li&gt;&lt;/ul&gt;On the 14GB database dump, both are faster than vanilla GZIP. Because Hacker News and Reddit both love this shit, here are the timing stats:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Plain gzip, default compression level: 11 minutes, 58 seconds. Resultant file is 2.3GB.&lt;/li&gt;&lt;li&gt;pbzip2, default compression level: 8 minutes, 48 seconds. Resultant file is 1.7GB.&lt;/li&gt;&lt;li&gt;pigz, default compression level: 1 minute, 33 seconds. 
Resultant file is 2.3GB.&lt;/li&gt;&lt;/ul&gt;Again this was on a 14GB database dump file, on a 16-core machine, with Intel solid state disks.&lt;br /&gt;&lt;br /&gt;If any readers know of other parallel compression schemes I can try, e-mail me and let me know. I will post stats here.&lt;br /&gt; &lt;div&gt;&lt;br /&gt;&lt;/div&gt; 
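&lt;p&gt;For anyone who wants the numbers above as ratios instead of stopwatch readings, here is a rough sketch of the arithmetic. The timings and file sizes are the ones reported in this post; the names (results, summarize) are made up for illustration, and treating 1GB as 1024MB is an assumption, so read the throughput figures as ballpark only.&lt;/p&gt;

```python
# Timings and output sizes exactly as reported in the post
# (14GB MySQL dump, 16-core machine). Everything else is illustrative.
INPUT_GB = 14.0

results = {
    "gzip": {"seconds": 11 * 60 + 58, "output_gb": 2.3},
    "pbzip2": {"seconds": 8 * 60 + 48, "output_gb": 1.7},
    "pigz": {"seconds": 1 * 60 + 33, "output_gb": 2.3},
}


def summarize(results, baseline="gzip"):
    """Return speedup vs. the baseline, compression ratio, and throughput."""
    base_seconds = results[baseline]["seconds"]
    summary = {}
    for name, r in results.items():
        summary[name] = {
            # How many times faster than plain gzip.
            "speedup": base_seconds / r["seconds"],
            # Input size divided by output size.
            "ratio": INPUT_GB / r["output_gb"],
            # Assumption: 1GB = 1024MB.
            "mb_per_s": INPUT_GB * 1024 / r["seconds"],
        }
    return summary


summary = summarize(results)
for name, row in summary.items():
    print(
        f"{name}: {row['speedup']:.1f}x gzip, "
        f"{row['ratio']:.1f}:1 compression, "
        f"{row['mb_per_s']:.0f} MB/s"
    )
```

&lt;p&gt;The short version: pigz gets the same output size as gzip in a fraction of the time, while pbzip2 buys a smaller file for a much more modest speedup.&lt;/p&gt;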
</content>
 </entry>
 
 <entry>
   <title>I Love the GPL (Except When it Applies to Me)</title>
   <link href="http://widgetsandshit.com/teddziuba/2010/01/i-love-the-gpl-except-when-it.html"/>
   <updated>2010-01-02T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2010/01/i-love-the-gpl-except-when-it</id>
   <content type="html">&lt;img alt=&quot;if-red-wine-and-hybrid-cars-were-made-from-animals-there-would-be-no-more-vegans.jpg&quot; src=&quot;/teddziuba/images/if-red-wine-and-hybrid-cars-were-made-from-animals-there-would-be-no-more-vegans.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;360&quot; width=&quot;263&quot; /&gt; &lt;div&gt;Boy do I love free software. It is usually pretty high quality, I don't have to pay for it, and I feel completely justified in criticizing the maintainers on public mailing lists for not supporting the exact features I need.&amp;nbsp; Of course I'm not going to send patches back, because it's just way easier to bitch and moan. &lt;br /&gt;&lt;br /&gt;Also, since my software product is a web service, I have exactly zero obligation to contribute anything back to the community, ever. Sure, I may use some GPLed software, but shit, actually following the spirit of the copyleft? Don't they know this is a business, not a charity? Fuck that noise.&lt;br /&gt;&lt;br /&gt;I came up in the salad days of Slashdot, when the cast of villains and henchmen included Microsoft, SCO, and anyone else who wanted to turn a dime from software. We believed in the GPL, that a viral copyleft clause was good for humanity. 
That is, until we left academia and had to pay the rent.&lt;br /&gt;&lt;br /&gt;Since the world appears to be moving toward software as a service (against my sage advice, mind you), it is blisteringly easy to be a champion of the ideals behind open source and free software, but still pussyfoot around when it comes to execution.&amp;nbsp; What I'm talking about is the loophole in the GPL that exempts application service providers from having to release their derivative works under the same license as the libraries.&lt;br /&gt;&lt;br /&gt;The pedantic reader who is going to talk shit will point out the difference between &lt;i&gt;open source&lt;/i&gt; and &lt;i&gt;free&lt;/i&gt; software. So, before you write a blog post that nobody's going to read, allow me to demonstrate.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Open Source&lt;/b&gt;: I want to let others use my code in whatever manner they please, and not be bound by an anti-commercial license.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Free Software&lt;/b&gt;: I found a loophole in my student loan documentation that lets me defer payments for decades, so long as I stay in the Ph.D. program!&lt;br /&gt;&lt;br /&gt;If anything good comes out of Web 2.0, it's the malignant tumor on the GPL's kidney, still wrongly diagnosed as a urinary tract infection.&lt;br /&gt;&lt;br /&gt;Back in the Slashdot days, we all thought that the fate of free software would be decided by a landmark court decision, that if the ideals of the GPL were to die, they would wind up meeting a ceremonious end like the cabinet members of a government overthrown in a military coup. 
But no - the free software ideal will die by the hands of a thousand poseurs, all who want the notoriety of contributing to open source, but none who are convicted enough to release any of their business's core code under a free license.&lt;br /&gt;&lt;br /&gt;The copyleft will share the same fate as the hippie movement, now only a shell of its former self supported by college age kids who hang out in the Haight-Ashbury and smoke pot all day, and at night, drive their Lexuses over the Golden Gate, back to Marin County. But you will take off that damn Che Guevara shirt before you come back into my house, young man.&lt;br /&gt;&lt;br /&gt;Look at all of the open source software in modern use. The vast majority of it is licensed under terms without a copyleft clause. The BSD license, Apache license, MIT license, and a handful of others are the most prevalent. In some places, the GPL still kicks around, but since we are application service providers, we are all free to ignore it. &lt;br /&gt;&lt;br /&gt;The Affero General Public License, a version of the GPL that closes the service-provider loophole, is almost nowhere to be found. The only new-hotness software I know of that is licensed under Affero is MongoDB, and even they have a chickenshit implementation - they have structured the code such that the 99% case of a web application using Mongo is effectively bound by the Apache license.&lt;br /&gt;&lt;br /&gt;Affero-licensing your project is a fatal defect if you want it to be used. Since the current flow of the software industry has effectively neutered the GPL, the only serious chance the copyleft has is the Affero license, and that sure-as-shit ain't gonna happen.&lt;br /&gt;&lt;br /&gt;The toll on the Golden Gate Bridge is now six dollars.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>How I Spot Valuable Engineers</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/12/how-i-spot-valuable-engineers.html"/>
   <updated>2009-12-14T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/12/how-i-spot-valuable-engineers</id>
   <content type="html">&lt;img alt=&quot;hire-women-at-a-startup-because-an-office-full-of-young-men-will-live-in-their-own-filth-until-an-investor-shows-up-for-a-tour.jpg&quot; src=&quot;/teddziuba/images/hire-women-at-a-startup-because-an-office-full-of-young-men-will-live-in-their-own-filth-until-an-investor-shows-up-for-a-tour.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;379&quot; width=&quot;325&quot; /&gt; &lt;div&gt;Goofy stuff happens when your company announces a funding round. We've gotten walk-in solicitors who try to sell us networking equipment, pitches for I-can't-quite-see-the-scam-but-I'm-sure-it's-there-somewhere stock exchange programs, and phone calls from slick Oracle salesmen who have their get-past-the-secretary sneak perfected so well that they could probably make better livings as industrial spies.&lt;br /&gt;&lt;br /&gt;But most frequently, there are résumés that land in my inbox. Yes, Milo is hiring, and a lot of people contact me directly instead of the &quot;jobs&quot; address, which I can sympathize with because I've always had this feeling that &quot;jobs@&quot; e-mail addresses are black holes where career dreams get sent to die.&lt;br /&gt;&lt;br /&gt;Our general workflow for hiring engineers is to send the person our &quot;engineering challenge&quot; programming question and see how they do on it. If that looks good, they come in for interviews. I don't like doing interviews because I've always got enough stuff to do, but sometimes it's a good break. Necessary evil, I guess. Like Katy Perry. Have you &lt;i&gt;heard&lt;/i&gt; a live performance? Ph33r. 
&lt;br /&gt;&lt;br /&gt;Anyhow, when I interview a candidate, I'm trying to determine how &lt;b&gt;valuable&lt;/b&gt; the candidate is, not just how smart he or she is.&amp;nbsp; Because I love English semantics:&lt;br /&gt;&lt;br /&gt;A &lt;b&gt;smart&lt;/b&gt; candidate will do well on the engineering challenge problem.&lt;br /&gt;A &lt;b&gt;productive&lt;/b&gt; candidate will be able to explain past projects in detail.&lt;br /&gt;A &lt;b&gt;valuable&lt;/b&gt; candidate is smart and productive, but also has useful knowledge gained from experience.&lt;br /&gt;&lt;br /&gt;To tell if a candidate is valuable, you need to piss them off. (By the way, does it make you feel icky that &lt;i&gt;they&lt;/i&gt; can be used with a singular antecedent? This derelict language is put together with duct tape and baling wire, I swear.)&amp;nbsp; A valuable candidate will likely have been personally offended by some sequence of bullshit thrown from a programming language, tool, library, or problem in past work. This is the kind of bullshit-train I'm talking about.&lt;br /&gt;&lt;br /&gt;Need to parse XML with Python → SGMLlib feels like a kid's toy → Implement it with BeautifulSoup → Fuck me, Soup is too slow → Re-implement with LXML → LXML works great for months → LXML segfaults the Python interpreter when used in a threaded environment under heavy load → &lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;hey-look-another-4chan-meme-that-reddit-has-bludgeoned-with-the-bat-of-unoriginality.JPG&quot; src=&quot;/teddziuba/images/hey-look-another-4chan-meme-that-reddit-has-bludgeoned-with-the-bat-of-unoriginality.JPG&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;116&quot; width=&quot;154&quot; /&gt;&lt;br /&gt;&lt;/div&gt;
Any developer who has been around enough to accumulate valuable experience will have his personal collection of stories that have made him rage. I have been burned by bugs in programming language implementations, bugs I call &quot;coding slurs&quot;. I have gotten the shaft more times than I can count from pathological character set issues that make me want to run for Congress on the platform of requiring licenses before people are allowed to use computers.&amp;nbsp; If you really want to find the value in a job candidate - find out what pisses him off.&lt;br /&gt;&lt;br /&gt;The easiest way I have found of doing this is to ask a candidate &lt;i&gt;&quot;what don't you like about your favorite programming language?&quot;&lt;/i&gt; You can grade their experience with the response. For example:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What don't you like about Java?&lt;/i&gt;&lt;br /&gt;&lt;b&gt;Out-of-college answer:&lt;/b&gt; &quot;Java is too verbose&quot;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Battle-hardened developer answer:&lt;/b&gt; &quot;Object storage is aligned on a 64-bit boundary, at least in Sun's JVM, so if you need to allocate a lot of small storage, you really need to know JVM internals so you don't run out of memory.&quot;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What don't you like about Python?&lt;/i&gt;&lt;br /&gt;&lt;b&gt;Answer from a candidate who will write frameworks for solving problems instead of getting shit done:&lt;/b&gt; &quot;Dynamic typing means you need to rely more on your tests and less on the interpreter to make sure your code is correct&quot;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Answer from Gunnery Sergeant Hartman, your senior drill instructor:&lt;/b&gt; &quot;Independent object cycles where one of the objects has a __del__ method don't get garbage collected.&quot;&lt;br /&gt;&lt;br /&gt;People think I hate programming. Nope. What I hate is fording endless rivers of horseshit that are in the way of seemingly simple tasks. 
And I hate it even more when I have to explain to a non-programmer what I am doing, &quot;building LXML against a different version of libiconv because I think it might be the source of a crash&quot;. &lt;br /&gt;&lt;br /&gt;&quot;But all I asked you to do was parse some documents.&quot;&lt;br /&gt;&lt;br /&gt;Good times. &lt;br /&gt; 
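&lt;p&gt;A footnote on the Gunnery Sergeant's Python answer above: it describes CPython as it behaved when this was written. Since Python 3.4 (PEP 442), the cycle collector can reclaim cycles whose objects define __del__. The sketch below is only an illustration of that, with made-up names (Node, make_cycle, finalized); run it on a current interpreter and both finalizers should fire.&lt;/p&gt;

```python
import gc

# Names of objects whose __del__ has run (illustrative bookkeeping).
finalized = []


class Node:
    """Hypothetical object with a finalizer, used to build a cycle."""

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


def make_cycle():
    """Create two Nodes that reference each other, then drop them."""
    a = Node("a")
    b = Node("b")
    a.other = b
    b.other = a
    # a and b go out of scope here, but the cycle keeps their
    # reference counts above zero until the collector runs.


gc.disable()      # keep automatic collection from running mid-demo
make_cycle()
gc.collect()      # an explicit collection still works while disabled
gc.enable()

print("finalized:", sorted(finalized))
print("uncollectable:", gc.garbage)
```

&lt;p&gt;On the Python 2 interpreters of the day, the same cycle would have been parked in gc.garbage instead of being freed, which is exactly the kind of thing you only learn by getting burned.&lt;/p&gt;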
</content>
 </entry>
 
 <entry>
   <title>Introducing Milo</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/11/introducing-milo.html"/>
   <updated>2009-11-24T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/11/introducing-milo</id>
   <content type="html">&lt;img alt=&quot;oh-good-lord-i-hope-the-servers-stay-up-today.jpg&quot; src=&quot;/teddziuba/images/oh-good-lord-i-hope-the-servers-stay-up-today.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;301&quot; width=&quot;394&quot; /&gt;I had mentioned a couple of months ago that I had my head down into a new project. It's been an open secret that the project is &lt;a href=&quot;http://milo.com/&quot;&gt;Milo.com&lt;/a&gt;, an online local comparison shopping engine.&amp;nbsp; We index the inventory of stores nationwide and show you real-time, what is available around you. From an engineering perspective it's a cool problem because there is a lot of data to store and manage, as well as a lot of integration work to deal with the particular temperament of various retailers' inventory systems. Of course if it were easy, someone would have done it already.&lt;br /&gt;&lt;br /&gt;From a business perspective, I like it a lot. The online comparison shopping world is very crowded, and we didn't want to be just another me-too AdWords arbitrage/affiliate marketing site. When I got into this business, I thought that online shopping was like the Stairway to Heaven of Internet business, but with the local inventory lookup, I think we really have distinguished ourselves from the others out there.&lt;br /&gt;&lt;br /&gt;Today I'm happy to announce that we've closed a $4 million Series A investment round, led by True Ventures, with other investors such as Ron Conway, Aaron Patzer, and Jeff Clavier also participating.&amp;nbsp; As a side note, I was really impressed by the True team, and am happy to be working with them. There were ups and downs to the Series A process, and I have to say that pitching the True partners was a definite up.&lt;br /&gt;&lt;br /&gt;Oh, right. We also have a mascot. His name is Milo, of course. 
Here he is attacking me at my desk:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;what-is-it-about-dogs-and-face-licking.jpg&quot; src=&quot;/teddziuba/images/what-is-it-about-dogs-and-face-licking.jpg&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;320&quot; width=&quot;240&quot; /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Hey Lets Bitch About SEO Again</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/10/hey-lets-bitch-about-seo-again.html"/>
   <updated>2009-10-13T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/10/hey-lets-bitch-about-seo-again</id>
   <content type="html">&lt;img alt=&quot;cant-we-go-back-to-debating-if-google-is-evil-for-doing-business-in-china.jpg&quot; src=&quot;/teddziuba/images/cant-we-go-back-to-debating-if-google-is-evil-for-doing-business-in-china.jpg&quot; width=&quot;281&quot; height=&quot;375&quot; class=&quot;post-lead-image&quot; style=&quot;float: right; margin: 0 0 20px 20px;&quot; /&gt; &lt;div&gt;Hey I have an awesome idea. Let's take a field of business that many people work in to make a legitimate living, and tear it down for being immoral and accuse it of fraud. &amp;nbsp;And when it comes to solving the actual problem that this business works on, apply a nice helping of sunshine-up-your-ass, and everything's just fine.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Better yet, let's do this every six to eight months, because collectively we have the attention span of a fruit fly.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Well, it's been a solid eight months, and somebody kicked the hornet's nest. Is SEO good or evil? &amp;nbsp;It's good. It's great. I &amp;lt;3 SEO.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;When you hire a legitimate white hat SEO, you are paying for domain knowledge. Is it better to use dashes or underscores to separate keywords in a URL? I know the answer, but I've spent some time researching SEO. If I were, say, an online publisher, it would be worth money to hire somebody who knows the answer to this question and a pop-quiz full of other questions that isn't in your average web developer's job description.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Every hit on SEO eventually ends with the same solution. &quot;Just write good content or make a good web app, and the traffic will come.&quot; &amp;nbsp;Oh really, it's just that simple, eh? How many unpublished novelists are there out there? How many film students whose reels go unwatched? 
Google is the greatest media distribution channel that there has ever been, and you expect people not to look for every advantage they can get?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the failure with the &quot;make a good app, people will come&quot; argument. Let's say you are making an application whose target market is one person in ten. That's a&amp;nbsp;respectably sized market. &amp;nbsp;You tell your friends, your family, people you know through the internet. You write on your personal blog about it. &amp;nbsp;Let's say you reach 1,000 people, generously. &amp;nbsp;If your hit rate within that market is 50%, that's 50 people you've got who haven't immediately dumped your app. Do they care enough about it to do your marketing for you? &amp;nbsp;With that small of a user base, you don't have statistically significant feedback to improve the site, you've got to gun it on intuition, which is frequently wrong.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So there you are, with your 50 users, and since you don't have to spend any time or money on distributing your app (remember these 50 people will do it for you), then you can continue to develop the app, making it &quot;better&quot;, as you see it, in a vacuum. &amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And let's just count on those 50 people bringing in 10 million of their closest friends in the next month or so.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Hell no. This is the internet, son. Kill or be killed. &amp;nbsp;If you can spend some money on a good SEO who will bring a steady flow of traffic to your site, then you have a way better chance than with that initial set of 50. &amp;nbsp;With search engine traffic, even if you're only getting a handful of traffic every day, it's a different handful. 
&amp;nbsp;If you have built something of value, some percentage of users will recognize this, and maybe tell a friend, maybe they'll come back to your site, and maybe they'll link to you, but you have a continuous stream of people to try it out on.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Obviously there are shysters in SEO. &amp;nbsp;Going to an SEO who guarantees that you'll rank in the top 10 for mesothelioma is like taking your car to the dealership to get fixed. Of &lt;i&gt;course&lt;/i&gt;&amp;nbsp;you're going to get scammed. Buyer beware, and all that.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, keep debating on whether or not SEO is evil. The rest of us have to find ways to handle our traffic growth.&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>I Don't Code in my Free Time</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/10/i-dont-code-in-my-free-time.html"/>
   <updated>2009-10-10T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/10/i-dont-code-in-my-free-time</id>
   <content type="html">&lt;img alt=&quot;obama-winning-the-nobel-proves-that-white-guilt-is-one-of-the-most-awesome-powers-on-earth.jpg&quot; src=&quot;/teddziuba/images/obama-winning-the-nobel-proves-that-white-guilt-is-one-of-the-most-awesome-powers-on-earth.jpg&quot; width=&quot;210&quot; height=&quot;840&quot; class=&quot;post-lead-image&quot; style=&quot;float: right; margin: 0 0 20px 20px;&quot; /&gt; &lt;div&gt;Why would you ever hire a programmer who doesn't program in his free time? &amp;nbsp;I mean, a person who doesn't compile recreationally is probably useless on the job. &amp;nbsp;You might as well hire somebody ... &lt;i&gt;old&lt;/i&gt;. And who wants a bunch of people around the office who whine about things like &lt;i&gt;healthcare benefits&lt;/i&gt;? Just don't get sick, duh.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I love it when twenty-something engineers take such a hard-line position on something they have so little experience with, like hiring. &amp;nbsp;Saying that you wouldn't hire somebody for a programming job because they don't program in their spare time is blissfully naive. Yeah, I remember the days when my greatest responsibility to another human being was making rent on the first of the month.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If I am going to hire somebody for a programming job, I don't really give a shit &lt;i&gt;what&lt;/i&gt; they do in their spare time, so long as that person is very good at the task at hand. &amp;nbsp;I don't ask questions about what a person does in their free time in job interviews because I don't care, and because that can sometimes open the door to an illegal conversation. (What's that? There are laws about what you can ask somebody in a job interview? 
Who thought that up, Republicans?)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class=&quot;Apple-style-span&quot; style=&quot;background-color: rgb(255, 255, 255); &quot;&gt;Me, I can count on one hand the number of times I've programmed outside of work or a class. &amp;nbsp;There was only once when I actually enjoyed it, though. I was in college, and shared a common wall with a girl from Spain who was painfully unaware that her computer had a volume control knob. She would stay up late on AOL instant messenger, and I couldn't sleep. &amp;nbsp;So, I rigged up a Python script to play AOL instant messenger sounds randomly every 5 to 10 seconds, turned up my speakers, pointed them at the wall, and went on vacation for a week. &amp;nbsp;And thus, the asshole you all know and love is born.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I don't enjoy programming so much as I enjoy the satisfaction I get from cracking hard problems.&lt;span class=&quot;Apple-style-span&quot; style=&quot;background-color: rgb(255, 255, 255); &quot;&gt;&amp;nbsp;In that case, computer code is a means to an end, but so is my Craftsman socket set. &amp;nbsp;I like to spend free time wrenching on a car or a bike, but I don't set out on Saturday morning and say &quot;I'm going to learn how to use a torque wrench today, because those things are the future of tools&quot;. &amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I would not want to work for a company that wouldn't hire me because I don't code in my spare time. Professional development? Working at a startup, I get a heaping helping of that on the job. &amp;nbsp;Keeping up with new technology? Yeah, I read reddit, and again, startup. &amp;nbsp;You know what's more awesome than spending my Saturday afternoon learning Haskell by hacking away at a few Project Euler problems? 
Fuck, ANYTHING.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Really, why should I bother spending time with my family and taking an active role in my kids' development when there's a dead-beaten math puzzle that doesn't have a good answer in Clojure? &amp;nbsp;&quot;I won't hire someone who doesn't code in their free time&quot; is Siliconvallese for &quot;I don't want to hire any grownups because they remind me of my parents&quot;.&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Twisted vs. Tornado: You're Both Idiots</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/09/twisted-vs-tornado-youre-both.html"/>
   <updated>2009-09-18T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/09/twisted-vs-tornado-youre-both</id>
   <content type="html">&lt;img alt=&quot;plucking-you-unibrow-is-the-most-undignified-type-of-grooming.jpg&quot; src=&quot;/teddziuba/images/plucking-you-unibrow-is-the-most-undignified-type-of-grooming.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;268&quot; width=&quot;390&quot; /&gt; &lt;div&gt;First, a message to bloggers. If you've got the bright idea to try some new kind of benchmark that pits Twisted against Tornado, take pause. Turn off your computer, step into a public area, and reconsider your life's goals. The internet does not need another pointless network performance graph.&lt;br /&gt;&lt;br /&gt;With that out of the way, it's become clear that the Pissing Contest of the Day, Twisted.web vs. Friendfeed's Tornado web framework, reveals that neither side of the argument is particularly right, but both sides are particularly stupid.&lt;br /&gt;&lt;br /&gt;First, Twisted. Now, my company uses Twisted for a small piece of functionality because it was the easiest way that we found to send traffic over different network interfaces on a Linux machine. We never have any problems with it. The only reason I ever need to touch it is to see how something works.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;However, Twisted is probably the douchiest programming library out there. Every time I open up that code, I feel like I've wandered into a late-night bar on the Jersey Shore where everybody's drinking Jager-bombs, and nobody is wearing a shirt.&amp;nbsp; Twisted is a cool network library, but not cool enough to be named &quot;Twisted&quot;.&amp;nbsp; It's the Python programmer's version of Ed Hardy clothing and a baseball cap with the tag still hanging off the side.&amp;nbsp; When I'm digging around in this code and my co-workers ask me what's up, the only appropriate response is &quot;NOT NOW CHIEF. 
I'M STARTIN' THE FUCKIN' REACTOR.&quot;&lt;br /&gt;&lt;br /&gt;Now you can see why there's so much buttsore over Tornado.&lt;br /&gt;&lt;br /&gt;Even though I &lt;a href=&quot;http://widgetsandshit.com/teddziuba/2009/06/startups-keep-it-in-your-pants.html&quot;&gt;advised&lt;/a&gt; &lt;a href=&quot;http://widgetsandshit.com/teddziuba/2008/04/im-going-to-scale-my-foot-up-y.html&quot;&gt;against&lt;/a&gt; things like Tornado, Friendfeed still built it. From the graphs I've seen, Tornado is just marginally faster than Twisted at serving concurrent requests. Marginally. Evidently Friendfeed figured that tiny margin was enough justification to waste their time writing something that's been re-written by every developer that gets bored on the job.&amp;nbsp; A Python web framework? My mercy, how original. I think that's one of the ending exercises of &quot;Learn Python in 24 Hours&quot;.&lt;br /&gt;&lt;br /&gt;Friendfeed spent a lot of time trying to optimize the queries per second graph, but maybe they should have spent more time optimizing this graph instead:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;intuit-didnt-buy-mint-they-bought-a-license-to-stagnate.png&quot; src=&quot;/teddziuba/images/intuit-didnt-buy-mint-they-bought-a-license-to-stagnate.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;186&quot; width=&quot;561&quot; /&gt;Anyway, when it comes to Twisted vs. Tornado for a Python web framework, I use Django. Why? Because it works, and my time is valuable.&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>30 Helens Agree: You Can't Win Without Failing</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/09/i-read-fred-wilsons-blog.html"/>
   <updated>2009-09-09T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/09/i-read-fred-wilsons-blog</id>
   <content type="html">&lt;img alt=&quot;an-infant-is-a-function-whose-inputs-are-sight-sound-smell-touch-and-taste-and-whose-outputs-are-bodily-fluids.jpg&quot; src=&quot;/teddziuba/images/an-infant-is-a-function-whose-inputs-are-sight-sound-smell-touch-and-taste-and-whose-outputs-are-bodily-fluids.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;270&quot; width=&quot;360&quot; /&gt;I read &lt;a href=&quot;http://www.avc.com/a_vc/2009/09/failure.html&quot;&gt;Fred Wilson's blog post on failure&lt;/a&gt; today, and after I was finished being impressed by his three letter domain name, it really made me think about what I learned from my last failed startup.&lt;br /&gt;&lt;br /&gt;&amp;nbsp;There's the usual Reddit material: don't write your own database, concentrate on the UI, put your users first, other such horse-beaten realities that green engineers understand after being in the field for a few years.&amp;nbsp; A true failure is one that changes your life's philosophy, not one that changes your unit testing strategy.&lt;br /&gt;&lt;br /&gt;What I really learned from the fall of Pressflip is that &lt;b&gt;arrogance is more dangerous than incompetence&lt;/b&gt;.&amp;nbsp; I believed that raw engineering prowess could make up for the complete lack of business experience, a product that really only appealed to the people who build the technology behind it, and an addressable market that could easily be mistaken for roundoff error. Couple that with the youthfully cute thought that Silicon Valley is a meritocracy, and it was only a matter of time. We had built some neat technology behind the scenes, and I was very proud of a few key parts of the system, but in the end, the users just did not come.&lt;br /&gt;&lt;br /&gt;The trouble with this lesson is that it can only be learned the hard way. 
Arrogant people don't listen to criticism, they just run themselves into the wall.&amp;nbsp; Incompetent people can usually be led in the right direction, even though they may execute their way into the dirt.&amp;nbsp; Arrogance doesn't listen to reason, it only listens to itself.&lt;br /&gt;&lt;br /&gt;For example, an arrogant motorcyclist will ride on the highway at twice the speed of traffic, and no matter how many times he gets pulled over, he'll keep doing it until he crashes.&amp;nbsp; An incompetent motorcyclist will drop his bike in a U-turn in front of his house, cracking a mirror.&lt;br /&gt;&lt;br /&gt;This failure made me saltier. I now understand why old men have no patience for the modern world.&amp;nbsp; However, it did not let me keep thinking that superior code is the solution to any conceivable problem. I've hunkered down a bit, concentrating on a new project that I really believe will be a winner, and started learning the business realities of a cruel Valley.&lt;br /&gt;&lt;br /&gt;So now, if an investor asks me what I learned from past failures, I won't put him to sleep talking about schema-less versus SQL databases. Instead, I've got a good answer.&lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>A Happy Life Without the Whining</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/08/a-happy-life-without-the-whini.html"/>
   <updated>2009-08-28T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/08/a-happy-life-without-the-whini</id>
   <content type="html">&lt;img alt=&quot;is-there-a-word-for-the-feeling-you-get-when-you-realize-three-quarters-of-your-twitter-followers-are-spammers.gif&quot; src=&quot;/teddziuba/images/is-there-a-word-for-the-feeling-you-get-when-you-realize-three-quarters-of-your-twitter-followers-are-spammers.gif&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;311&quot; width=&quot;208&quot; /&gt; &lt;div&gt;There's no better way to waste your time than to talk about politics.&amp;nbsp; For as good as educated people are at acting intellectual, we love to bitch and moan about one side versus the other. &lt;br /&gt;&lt;br /&gt;Politics are potato chips for the enlightened mind.&lt;br /&gt;&lt;br /&gt;I grew up in Connecticut. New Englanders are generally pretty educated people. We keep to ourselves. We vote. We donate money to causes, and for the most part, &lt;i&gt;we shut the fuck up&lt;/i&gt;.&amp;nbsp; Personally, I'm registered to one of the two major parties in the US. (If you tried to guess, you'd probably guess wrong.)&amp;nbsp; I don't get into political arguments because I've got better shit to do. I don't blog about politics because I know that nobody cares what I think. You know what? It's a good life.&lt;br /&gt;&lt;br /&gt;Lately, I've been hearing a lot about this Glenn Beck fellow. I don't know who he is or what he said to get everyone so sore-assed, but I sure as shit don't care. I don't watch CNN or Fox News. I don't have cable TV. I get all my news from my local news channel over the air. No talking heads, no shouting matches, no six-second-attention-span scrolling tickers on the bottom of the screen. In 30 morning minutes, I get a brief summary of what the president said at such and such a meeting the other day, a look at the traffic and weather for the day, and some feel-good community segment. 
&lt;br /&gt;&lt;br /&gt;The last thing I need on a 40 mile motorcycle ride to work is a head full of piss, thanks to Bill O'Reilly or Keith Olbermann.&lt;br /&gt;&lt;br /&gt;But I can tell you that from the inside, generating butthurt is big business. Every time I've knocked an article out of the park for The Register, there's been a decent troll element to it. Not all trolls succeed, but the ones that hit a nerve really bring in the page views and comments. That's just the IT world. If I could get a job trolling politics, I'd be damn sure to demand a page view bonus. I can't knock the hustle.&lt;br /&gt;&lt;br /&gt;The news networks aren't stupid. They know that viewership increases when people are pissed off.&amp;nbsp; Walter Cronkite delivered facts, but was a crusty old book report of a man for it.&amp;nbsp; I'm sure that all else equal, if national media never figured out how much fucking money there is to be made in keeping people salty, the news would still be a puff of dry air.&lt;br /&gt;&lt;br /&gt;So I don't watch network TV. I don't blog about politics. It's a calm life. I have informed opinions on most issues, but I know that nobody cares what I think, so I keep to myself.&amp;nbsp; Maybe that's why I still have trouble &quot;getting&quot; Twitter.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Stop Using the Word 'We'</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/08/stop-using-the-word-we.html"/>
   <updated>2009-08-20T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/08/stop-using-the-word-we</id>
   <content type="html">&lt;img alt=&quot;theres-something-about-lane-splitting-through-marijuana-smoke-on-a-motorcycle-thats-so-unsettling.jpg&quot; src=&quot;/teddziuba/images/theres-something-about-lane-splitting-through-marijuana-smoke-on-a-motorcycle-thats-so-unsettling.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;240&quot; width=&quot;320&quot; /&gt; &lt;div&gt;Yesterday, I spearheaded a new movement at the office. I stopped using the word &quot;we&quot;, and started to say what I really meant to say.&amp;nbsp; For example, instead of &quot;&lt;i&gt;We&lt;/i&gt; should fix that bug&quot;, I say, &quot;&lt;i&gt;You&lt;/i&gt; should fix that bug&quot;, and good God is it satisfying.&lt;br /&gt;&lt;br /&gt;There are a couple of motivations for this. Firstly, one of the key things I've learned being a for-pay writer is to show some conviction. Secondly, the passive discussions about defects and delegation and responsibility really started to irritate me. Why not just tell it like it is?&lt;br /&gt;&lt;br /&gt;When I worked at Google, I picked up on a really annoying trend in the software industry (or maybe just in Silicon Valley) that I call &quot;fuck-you with a smile&quot;.&amp;nbsp; You never want to outright blame somebody or something, rather, it's best to state the existence of an issue and then ask &quot;the team&quot; to fix it.&amp;nbsp; We should really move that icon ten pixels to the left. We definitely need to fix that concurrency bug. We should probably have that all done before lunch.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Well then, Mr. 
Manager, you had better get cracking, because I've got some YouTube videos to watch.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I learned that the goal of institutional business is to keep from angrying up the blood at all costs.&amp;nbsp; A productive employee is one whose personality has been bleached out to a yellow tinge.&amp;nbsp; Always non-confrontational, never suggesting that any one person fucked up.&lt;br /&gt;&lt;br /&gt;The best part about working at a startup is that I'm free to suggest that yes, you fucked this up. Yes, it's your fault, and yes, you need to fix it. Delegate! Don't waste time listing out action items, spend time telling people what to do. Everyone you work with should be a grown-up, and can handle it. The other side of that is owning up to your mistakes. Instead of &quot;There is a memory leak in the code, we should prioritize it over other defects&quot;, say, &quot;I introduced a memory leak in the code. I am going to fix it as soon as possible.&quot;&lt;br /&gt;&lt;br /&gt;Anyway, I'm going to keep this up until somebody openly calls me an asshole. You should try it too.&amp;nbsp; You don't have to be a prick about it, just be assertive. Your co-workers will be impressed by your newfound confidence. It might even get you laid.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Well, probably not, but you won't be wondering when a meeting is going to end if you grab it by the balls.&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Context Switches are Bad, but Stack Traces are Worse</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/08/context-switches-are-bad-but-s.html"/>
   <updated>2009-08-17T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/08/context-switches-are-bad-but-s</id>
   <content type="html">&lt;img alt=&quot;never-trust-a-person-who-wears-a-tie-who-asks-you-how-to-query-the-database.png&quot; src=&quot;/teddziuba/images/never-trust-a-person-who-wears-a-tie-who-asks-you-how-to-query-the-database.png&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;317&quot; width=&quot;290&quot; /&gt; &lt;div&gt;Every programmer works in silent fear of a manager sneaking up on him and asking him to drop everything he is doing and work on an unrelated task. Context switches like this cost us time and energy, but managers are beginning to figure out that a programmer isn't a machine that can be switched on and off, no, they're understanding that a programmer is a machine that needs a warm up phase. &lt;br /&gt;&lt;br /&gt;I guess you could call that progress.&lt;br /&gt;&lt;br /&gt;I'm fortunate enough to work with a company that understands engineering, but I have worked with my fair share of nontechnical managers, and I have to say categorically that the most expensive question a manager can ask is &quot;What are you working on?&quot;&lt;br /&gt;&lt;br /&gt;The danger here is when you're six or seven levels deep into yak-shaving, and your manager wants to know what you're doing and why. You need to give the manager a complete stack trace from your current frame all the way up to the original task. Each jump up the call stack is a context switch of its own, where you need to remember exactly why you made the decision that you did, and justify it as the best course of action. &lt;br /&gt;&lt;br /&gt;&quot;I'm compiling a new version of libxml, so I can get the Python parser working.&amp;nbsp; I need to do that because LXML, the Python binding, would crash under heavy load when I used the system default version of libxml. I am using LXML because BeautifulSoup doesn't have support for XPath. 
I need to do XPath transforms against the input because the legacy system we interact with doesn't send well-formed XML. We tried to get the vendor to fix it but they said 6 to 8 weeks for a patch, and our project deadline is sooner than that. I need to interface with the legacy system because even though the DBAs have ported things over to Oracle, they're still sorting things out and it's not reliable enough for me to make any meaningful progress. Fortunately, I've thought this through and abstracted the data subsystem well enough that I can drop-in replace Oracle when it's ready, so long as the database sends some decent form of XML. Once I get this data subsystem done, I can finish the business logic, which attaches to the nonfunctional demo you love so much.&lt;br /&gt;&lt;br /&gt;So yeah, I'm working on the new asset tracking system.&quot;&lt;br /&gt;&lt;br /&gt;Stack trace, all the way up to the main method.&amp;nbsp; It's not always that simple. I know I have trouble keeping that many stack frames in my head. I can't remember why I chose to go down the path that I did, other than &quot;there was a good reason for it&quot;. After all, I'm not slogging my way through compilation bullshit for my health. When a manager demands a full stack trace like this, it sets your progress back, because you need to go over decisions that you already made, examine the circumstances, and make the same decisions again. You lose your original frame of reference, and your manager thinks you're just fiddledicking around instead of doing work.&lt;br /&gt;&lt;br /&gt;So what's there to do? If you're awesome like me and work for a manager who understands why programming is hard, chances are you can just leave the answer to &quot;what are you doing?&quot; at the innermost stack frame.&amp;nbsp; Everybody wins. However, if your manager is nontechnical, your goal is to get him off your ass as soon as possible, because you want to minimize the damage he does to your productivity. 
My recommended course of action, when asked &quot;What are you working on?&quot; is to slap the manager in the face and yell &quot;&lt;i&gt;YOU DON'T END A SENTENCE WITH A PREPOSITION UP IN THIS BITCH. THIS IS MY HOUSE.&quot;&lt;/i&gt; Thump your chest with a clenched fist and say &lt;i&gt;&quot;yeah what's now, fool&quot;&lt;/i&gt; under your breath.&lt;br /&gt;&lt;br /&gt;Failing the battery charge, the key phrase is &quot;I explored every option&quot;.&amp;nbsp; Beyond this, there's really no way out, because your manager doesn't trust you. You're basically fucked. Quit your job and come work with me.&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>This Is America, Take Your Unicode Somewhere Else</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/07/this-is-america-take-your-unic.html"/>
   <updated>2009-07-04T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/07/this-is-america-take-your-unic</id>
   <content type="html">&lt;img alt=&quot;i-only-listen-to-NPR-so-i-can-keep-an-eye-on-what-educated-people-are-up-to-its-merely-an-early-warning-system.jpg&quot; src=&quot;/teddziuba/images/i-only-listen-to-NPR-so-i-can-keep-an-eye-on-what-educated-people-are-up-to-its-merely-an-early-warning-system.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;245&quot; width=&quot;320&quot; /&gt;There's a question that comes up on Stack Overflow every couple of months: &quot;How do I strip diacritic marks from Unicode characters?&quot;.&amp;nbsp; Popular variants include &quot;How do I remove special characters&quot; and &quot;How do I convert Unicode to ASCII&quot;, but the underlying motivation is the same: characters that don't have their own key on an American keyboard have no place in modern web software. &lt;div&gt;&lt;br /&gt;Before you go all apeshit on me and call me a bigot and whatnot, read my story.&amp;nbsp; When I was in college, Google hired me for a summer internship.&amp;nbsp; One of my projects that summer was to write Google's employee directory search.&amp;nbsp; Google, as I'm sure you could imagine, is a very multicultural employer.&amp;nbsp; Googlers in general are very accepting of different cultures, customs, and languages.&amp;nbsp; (Well, sort of.&amp;nbsp; Googlers are accepting of multicultural differences like sushi, Diwali parties, and the word &lt;i&gt;namaste&lt;/i&gt;.&amp;nbsp; They're not accepting of cultural differences like Old English 800, 22 inch rims, and the word &lt;i&gt;juicy&lt;/i&gt;. 
The general rule I figured out as a Googler is that you should welcome diversity so long as it doesn't make you feel guilty for making ten times as much money.)&lt;br /&gt;&lt;br /&gt;Anyhow, as a result of pulling in a lot of foreign talent, my employee directory search had to handle UTF-8 properly.&amp;nbsp; A lot of people's names had umlauts, tildes, and other such little nuggets that love to appear as diamonds with question marks in them. I figured, just make the database UTF-8, page encoding UTF-8, and everything should work fine, right?&amp;nbsp; Well it did, in theory.&amp;nbsp; But when the first super-tolerant Googler typed his colleague's name into my search engine, it didn't come up.&amp;nbsp; There was an o with an umlaut in the name, but our hero of race relations simply typed &quot;o&quot;.&lt;br /&gt;&lt;br /&gt;And that came through to me as a bug report.&amp;nbsp; &quot;Strip funny characters.&quot; So I did, and how the searches flowed.&amp;nbsp; See if you can guess how many people would input diacritic marks into the search box.&lt;br /&gt;&lt;br /&gt;Googlers are some of the most understanding people out there, and if they can't be bothered to type Alt-148 for an o with an umlaut, then what hope does the rest of the software industry have?&amp;nbsp; None.&amp;nbsp; That's why I want to systematically dismantle Unicode, and have a good answer to the question &quot;How do I strip diacritic marks?&quot;.&amp;nbsp; Not because handling multibyte character sets is too hard (although that asspain is what prompted me to think about this in the first place), but rather because only a small minority of people actually care about it, and an even smaller minority will whine when their umlauts disappear.&lt;br /&gt;&lt;br /&gt;(To satisfy the pedants, clearly if you're writing software whose job it is to handle and store UTF-8, this advice isn't for you.&amp;nbsp; I'm talking about web services with user input here.)&lt;br /&gt;&lt;br /&gt;Now, you can 
feel free to take an idealist approach to this problem.&amp;nbsp; Yes, Americans should be more accepting of other cultures and not passively destroy intricate details of pronunciation.&amp;nbsp; Well, feel free to enjoy your floating-point market share.&amp;nbsp; Nobody cares but you.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;End Note&lt;/b&gt;&lt;br /&gt;I found two decent implementations of Unicode transliteration, one in Python and one in Perl. If you know of good implementations in other languages, e-mail me and I'll add them to this list, with SEO-friendly anchor text goodness.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;http://www.tablix.org/%7Eavian/blog/archives/2009/01/unicode_transliteration_in_python/&quot;&gt;Strip diacritic marks in Python&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://search.cpan.org/%7Esburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm&quot;&gt;Strip diacritic marks in Perl&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.rgagnon.com/javadetails/java-0456.html&quot;&gt;Strip diacritic marks in Java&lt;/a&gt; (thanks to Simon Lieschke)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://dev.alt.textdrive.com/browser/HTTP/Unidecode.lua&quot;&gt;Strip diacritic marks in Lua&lt;/a&gt; (for all 8 of you who use it. Thanks to Petite Abeille)&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://blargh.tommymontgomery.com/2009/08/transliteration-in-php/&quot;&gt;Strip diacritic marks in PHP&lt;/a&gt; (thanks to Tommy Montgomery)&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt; 
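&lt;div&gt;&lt;br /&gt;For a rough idea of what &quot;strip funny characters&quot; can look like, here is a minimal Python sketch using only the standard library. It is not the code from the Google directory search or from any of the libraries linked above; the function names &lt;i&gt;strip_diacritics&lt;/i&gt; and &lt;i&gt;search_key&lt;/i&gt; are made up for illustration. It assumes that decomposing each character and dropping the combining marks is good enough, which covers umlauts and tildes but not letters with no decomposition, such as &quot;ø&quot;.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;
&lt;pre&gt;
```python
import unicodedata


def strip_diacritics(text):
    """Return text with combining marks removed, so 'ö' becomes 'o'."""
    # NFKD splits an accented character into its base letter followed by
    # separate combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep everything except the combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_key(name):
    """Hypothetical helper: fold a name into a lowercase, accent-free key."""
    return strip_diacritics(name).casefold()


if __name__ == "__main__":
    # Apply the same folding to the stored name and to the typed query.
    print(search_key("Jörg Nuñez") == search_key("jorg nunez"))
```
&lt;/pre&gt;
&lt;div&gt;The point of folding both sides is that the database can keep the original spelling while the search box matches whatever people actually type. Full transliteration to ASCII, which is what the linked libraries aim for, needs a lookup table on top of this.&lt;br /&gt;&lt;/div&gt;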
</content>
 </entry>
 
 <entry>
   <title>Print Isn't Dying, Serious Journalism Is</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/06/print-isnt-dying-serious-journ.html"/>
   <updated>2009-06-22T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/06/print-isnt-dying-serious-journ</id>
   <content type="html">&lt;img alt=&quot;when-techcrunch-pays-writers-six-figures-then-arrington-can-talk-about-success.jpg&quot; src=&quot;/teddziuba/images/when-techcrunch-pays-writers-six-figures-then-arrington-can-talk-about-success.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;278&quot; width=&quot;200&quot; /&gt; &lt;div&gt;It's a tired Silicon Valley drum beat: print is dying, blogs and Twitter are the future of news.&amp;nbsp; Many in the business of blogging like to think that print ad revenues are declining and subscriber bases are shrinking because online media is vastly superior to those dinosaurs.&amp;nbsp; This is one area where the evidence actually seems to suggest that the bloggers are justified.&lt;br /&gt;&lt;br /&gt;However, if you're not so full of yourself that &quot;citizen journalism&quot; seems like a revolution, you can understand the real reason that print is dying: &lt;i&gt;newspapers' shit is all retarded&lt;/i&gt;.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Too many big words, articles that are way too long, and boring stuff like researched facts.&amp;nbsp; Fuck all that shit, I want my news as it happens, and I don't care how true it is.&amp;nbsp; Bloggers call this process journalism...whatever.&amp;nbsp; That's just writers trying to convince themselves that they're serious when they know deep down that their readership is only interested in sensational titles and text no longer than 300 words.&amp;nbsp; Any more than that, well, shit's all retarded.&lt;br /&gt;&lt;br /&gt;The only satisfying part of journalism turning into shinythings.com is watching intellectuals whine about it.&amp;nbsp; See, I probably should be an intellectual.&amp;nbsp; I've got a degree in mathematics, I'm a computer programmer by trade, but every time I've knocked an article out of the park for The Register, it's been a great troll.&amp;nbsp; That's the only way to get by in online media, and even 
the New York Times knows this.&lt;br /&gt;&lt;br /&gt;Take for example, NYT columnist Paul Krugman.&amp;nbsp; He won a Nobel Prize in economics, and has been writing the same op-ed column for NYT for the past 8 years: &quot;Republicans are the cause of all the world's ills.&quot;&amp;nbsp; Someone whose shit is arguably all retarded has been reduced to trolling to get page views.&amp;nbsp; And it really works.&lt;br /&gt;&lt;br /&gt;If, as a blogger, you're above trolling, then the only other way to be popular is by printing blatant falsehoods.&amp;nbsp; In 2008, people actually started to pay attention to CNN's iReport because somebody wrote that Steve Jobs had a heart attack. Apple lost 10% of its market capitalization in 10 minutes.&amp;nbsp; Now &lt;i&gt;that's&lt;/i&gt; fucking power.&amp;nbsp; TechCrunch's Michael Arrington, showing an obvious tell of a manic depressive, keeps going off on Last.FM with lies about them giving data away to the recording industry.&amp;nbsp; None of it is true, but it brings readers.&lt;br /&gt;&lt;br /&gt;It certainly doesn't hurt that TechCrunch shies away from words longer than eight letters.&lt;br /&gt;&lt;br /&gt;Print media isn't hurting because it's an outdated business model, print media is hurting because it's boring.&amp;nbsp; Blogs and Twitter are succeeding because their shit is clearly not retarded.&amp;nbsp; And you know what?&amp;nbsp; I love it.&amp;nbsp; Intellectualism is dying, and the news is now anything we want it to be. &lt;br /&gt;&lt;br /&gt;I just can't wait until 4chan figures that out.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Startups: Keep It In Your Pants</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/06/startups-keep-it-in-your-pants.html"/>
   <updated>2009-06-09T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/06/startups-keep-it-in-your-pants</id>
   <content type="html">&lt;img alt=&quot;if-you-read-mike-arringtons-posts-closely-you-see-that-he-has-major-depression-issues.jpg&quot; src=&quot;/teddziuba/images/if-you-read-mike-arringtons-posts-closely-you-see-that-he-has-major-depression-issues.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;300&quot; width=&quot;202&quot; /&gt; &lt;div&gt;I've worked with a lot of engineers around the Valley, some who are genuinely competent and some who can fake it pretty well.&amp;nbsp; One trend that I've noticed with a lot of really good engineers is that they like to swing their dicks around when it comes to implementation.&lt;br /&gt;&lt;br /&gt;You start a project with one of these guys, and the first thing to come up is how MySQL isn't going to scale, and how you're going to have to write your own data store.&amp;nbsp; With that settled, you'll also need your own object-relational mapper, and you might as well make your own web templating language because well, it will fit in better with the architecture.&lt;br /&gt;&lt;br /&gt;This, gentlemen, is dick-swinging, and it is the most colossal waste of time for a startup.&lt;br /&gt;&lt;br /&gt;Now, it's a well known fact around Northern California that I'm the greatest programmer who ever lived, and I even fell victim to this.&amp;nbsp; At my last startup, we were absolutely convinced that we were building ourselves into a corner by using MySQL, so we wrote our own data store.&amp;nbsp; It started off as an RPC wrapper around some magical key/value store in Erlang (parallelism, fuck yeah), and ended up as a different RPC wrapper around BerkeleyDB.&amp;nbsp; All in all, it went through three major rewrites, and the end product was something that took months to develop and would crash under moderate load.&lt;br /&gt;&lt;br /&gt;But hey, it was a cool architecture.&lt;br /&gt;&lt;br /&gt;As another small example, again at the last startup I spent a few 
hours one day writing a feedforward neural network implementation in Java, just to try my hand at implementing an algorithm.&amp;nbsp; Again, a small waste of time, but it was my attitude toward it that signaled a larger problem: I wanted to see how awesome I really was (answer: pretty fuckin' awesome).&lt;br /&gt;&lt;br /&gt;It's not just apartment-bound startups that fall victim to this, either.&amp;nbsp; Kosmix, which is a well funded science project that's fooled itself into thinking it can be a major player in search, wrote its own data store in C++.&amp;nbsp; It's basically a clone of Google's GFS because hey, if Google's doing it, then we should too, right?&amp;nbsp; Who knows how much time, energy, and money was wasted on this thing, but that's all time, money, and energy that could go into making their final product not such a joke.&lt;br /&gt;&lt;br /&gt;Kosmix falls to a different sword: they are well funded and assume they have all the time in the world.&amp;nbsp; Maybe a serious venture round buys you time, but when you spend it all writing a file system that's not core to your product, you start talking Series C, Series D, and so on.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Fortunately, trench-level engineers aren't concerned with dilution.&amp;nbsp; Oops.&lt;br /&gt;&lt;br /&gt;At my current startup, we've got business-focused leadership.&amp;nbsp; We have a good engineering team, and we don't let our hubris get the best of us.&amp;nbsp; There are so few instances where a startup will need to write something like a file system, and we're not one of them.&lt;br /&gt;&lt;br /&gt;As an entrepreneur, you should be proud of your idea, not how big you think your compiler-cock is.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>My Twitters: Let Me Show You Them</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/06/my-twitters-let-me-show-you-th.html"/>
   <updated>2009-06-02T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/06/my-twitters-let-me-show-you-th</id>
   <content type="html">&lt;img alt=&quot;federal-assault-shark-ban.jpg&quot; src=&quot;/teddziuba/images/federal-assault-shark-ban.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;262&quot; width=&quot;350&quot; /&gt;I signed up for Twitter.&amp;nbsp; Do you people have any idea how fucking important I am?&amp;nbsp; It's a good thing I'm benevolent enough to clue you people into the glory of my day to day operations.&lt;br /&gt;&lt;br /&gt;You should consider it a fucking honor to read my Twitters.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://twitter.com/dozba&quot;&gt;http://twitter.com/dozba&lt;/a&gt;&lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>Hacking Domains by Proxy</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/06/hacking-domains-by-proxy.html"/>
   <updated>2009-06-02T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/06/hacking-domains-by-proxy</id>
   <content type="html">&lt;img alt=&quot;passive-aggressive-and-gullible-is-no-way-to-go-through-life-son.jpg&quot; src=&quot;/teddziuba/images/passive-aggressive-and-gullible-is-no-way-to-go-through-life-son.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;200&quot; width=&quot;200&quot; /&gt;Remember how Uncov.com lapsed registration, and somebody bought it with Domains by Proxy?&amp;nbsp; I'm sure other people have faced this problem: how do you find out who owns a proxy domain?&amp;nbsp; Well, I successfully hacked the system.&lt;br /&gt;&lt;br /&gt;Here's how it works.&amp;nbsp; When someone registers a domain with Domains by Proxy, the e-mail addresses listed in the WHOIS record for the administrative and technical contacts proxy through to the person who actually registered it.&amp;nbsp; If that person directly replies to an e-mail, you can see who actually owns the domain.&lt;br /&gt;&lt;br /&gt;As usual with anything technical, the weakest link is the human.&amp;nbsp; The KGB used to say &quot;it's easier to break fingers than it is to break codes&quot;.&amp;nbsp; And it's easier to exploit greed than it is to subpoena Domains by Proxy or hack their computers.&lt;br /&gt;&lt;br /&gt;Check this shit out:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;who-the-fuck-you-think-you-fuckin-with-im-the-fuckin-boss.png&quot; src=&quot;/teddziuba/images/who-the-fuck-you-think-you-fuckin-with-im-the-fuckin-boss.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;302&quot; width=&quot;672&quot; /&gt;&lt;br /&gt;&lt;br /&gt;Names hidden to protect the douchey, but if you've got ten thousand extra dollars hanging around, you can have uncov.com all for yourself.&lt;br /&gt; &lt;div&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Don't be a Menace to South Central</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/05/dont-be-a-menace-to-south-cent.html"/>
   <updated>2009-05-25T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/05/dont-be-a-menace-to-south-cent</id>
   <content type="html">&lt;img alt=&quot;mapreduce-reduces-the-map-of-the-web.jpg&quot; src=&quot;/teddziuba/images/mapreduce-reduces-the-map-of-the-web.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;330&quot; width=&quot;240&quot; /&gt;The internet has this nasty habit of violently breaking down market inefficiencies.&amp;nbsp; For example, the music market was inefficient because driving to a store and exchanging money for a record was too much effort.&amp;nbsp; Downloading an album for free is a virtually frictionless process, so Napster, KaZaA and others thrived.&lt;br /&gt;&lt;br /&gt;Now, I think the word &quot;disruption&quot; is mostly used by half-wits in the media to pimp something.&amp;nbsp; When you read that something is &lt;i&gt;disruptive&lt;/i&gt;, that's an obvious tell that the journalist doesn't have a fucking clue about the technology, but is pretending that he does.&amp;nbsp; Journalists love that shit, and so do editors.&amp;nbsp; However, as an entrepreneur, you haven't created a disruption until a group of powerful old men convene in a board room to figure out how to shut you down.&amp;nbsp; You haven't created a disruption until the government is trying to regulate you.&amp;nbsp; You haven't created a disruption until there's a media campaign &lt;i&gt;against&lt;/i&gt; you.&lt;br /&gt;&lt;br /&gt;That being said, I love the idea of paid posting and sponsored conversations: companies paying bloggers to talk about their shit.&amp;nbsp; Why? 
Because it's really pissing off people who make a living out of public relations.&lt;br /&gt;&lt;br /&gt;When I was writing Uncov, I would get several e-mails daily from PR agencies, pitching me a story on such and such a shitty startup.&amp;nbsp; This is how it works: your company pays a PR agency for the size of their Rolodex.&amp;nbsp; The PR agency spams the publications with your press release in hopes that the story gets picked up.&amp;nbsp; In the tech media, the hit rate for PR isn't terribly high, so you end up spending upwards of $10,000 per month on PR that only gets your company a few writeups.&amp;nbsp; It's a scam.&lt;br /&gt;&lt;br /&gt;A company like PayPerPost, now Izea, has removed that market inefficiency.&amp;nbsp; You can simply pay bloggers directly to write about you.&amp;nbsp;&amp;nbsp; Whether or not they disclose that they're being paid, well, who gives a shit? You get the Google link juice, you get the attention, and if it comes out that you paid for it, the internet has the attention span of a fruit fly, so everyone will forget about it in 24 hours.&lt;br /&gt;&lt;br /&gt;Any blogger that takes a stand against paid posting is delusionally self-important.&amp;nbsp; There is no morality to &quot;citizen journalism&quot; by definition.&amp;nbsp; The idea is that the traditional media will die in favor of hundreds of thousands of individual reporters, all working for themselves.&amp;nbsp; The good ones will rise to the top, but everybody will keep talking.&amp;nbsp; In this type of scheme, there is no force whatsoever that can stop paid placement.&amp;nbsp; With a few large media outlets, like The New York Times, The Washington Post, and other newspapers, paid placement isn't an issue because so much credibility is at stake.&lt;br /&gt;&lt;br /&gt;Not so on the internet.&amp;nbsp; Complain about it all you want, but paid placement is a necessary side effect of user-generated media.&amp;nbsp; Regulations against it are like speed limits: there's 
bound to be some marginal enforcement, but by and large, they do nothing to prevent it.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;It's what we wanted, now it's what we've got.&lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>Startup Dad</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/05/-the-first-time-you.html"/>
   <updated>2009-05-21T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/05/-the-first-time-you</id>
   <content type="html">

&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	&lt;/p&gt;&lt;img alt=&quot;babbyform.JPG&quot; src=&quot;/teddziuba/images/babbyform.JPG&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;274&quot; width=&quot;365&quot; /&gt;&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;The first time you hold somebody
else's screaming baby, you understand immediately why prostitution is
the world's oldest profession.  The first time you hold your own
screaming baby, you understand immediately why a federal prisoner who
gunned down three police officers needs no moral justification for
sticking the sharpened end of a toothbrush into a freshly jailed
child molester's kidney. &lt;br /&gt;&lt;/p&gt;&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style=&quot;margin-bottom: 0in;&quot;&gt; It's that first revelation that scares many
first time fathers into escaping responsibility like a jackrabbit
from a coyote.  The difference between the ones who run and the ones
who stay, really, is how the news was broken to them:  a great
comedian will tell you that the key to any good joke is the
delivery.  
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	A runner was out on a Tuesday night
with some of his friends at a bar, somewhere between his third drink
and fourth cigarette, when the girl he'd been fucking calls him up
and says that she's pregnant.  A dad who sticks around is
concentrating on some manly order of housework, like changing the oil
in a car, when his wife or girlfriend calls him in to make sure that
the little blue plus sign actually does mean &amp;#8220;pregnant&amp;#8221;, and that
she's not just misreading it.&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	With a baby in the house, the bedroom
is going to lose its understated but victorious smell of Astroglide
and unwashed sheets in favor of a strong presentation of rancid
breast milk.  When there's a child to take care of, getting
falling-down drunk to the point where you're willing to argue with a street
vendor over the price of a 2AM hot dog isn't really an option in the
list of things to do this weekend.  With a baby, all of the money you
used to spend on video games and car accessories is going to be
repurposed for child care.  
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	When a man runs from fatherhood, he's
not really running from responsibility, he's running from the guilt
of a mediocre life.  
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	Without the responsibility of a baby,
there's still time to salvage it.  A month after disappearing,
though, a runner realizes the vicious truth: that no amount of time
or things-not-to-be-responsible-for will turn an unaccomplished life
into one your eventual children will look up to.  Fleeing your responsibility
and making that New Year's resolution to get your life on track is as
effective as telling yourself that you have the courage to ask out a
girl as you masturbate.  No number of promises will ever amount to
motivation.&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;	For those of us that stay in the
picture, ambition has a new meaning. I'm twenty-five years old, a
software engineer on the startup circuit in Silicon Valley.  I'm not
in the business so that I get invited to speak at conferences.  I'm
not an entrepreneur because I want to feel important.  I'm in this
game now to provide for my family.  At first, I thought that a
startup was the only part of my youth left breathing, but now, I know
that having a picture of my daughter stuck to my monitor is the best
motivation there is.  If you're the type to man up to what's demanded
of you, a baby won't throw your entrepreneurship game off.  
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;&lt;br /&gt;
&lt;/p&gt;
&lt;p style=&quot;margin-bottom: 0in;&quot;&gt;Just make sure you're funded.&lt;/p&gt;
  
</content>
 </entry>
 
 <entry>
   <title>Disable The Annoying Thing In Ubuntu Jaunty</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/04/disable-the-annoying-thing-in.html"/>
   <updated>2009-04-25T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/04/disable-the-annoying-thing-in</id>
   <content type="html">&lt;img alt=&quot;at-least-they-didnt-fuck-up-my-nvidia-drivers-this-time.jpg&quot; src=&quot;/teddziuba/images/at-least-they-didnt-fuck-up-my-nvidia-drivers-this-time.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;230&quot; width=&quot;160&quot; /&gt;I upgraded to Ubuntu Jaunty Jackalope today.&amp;nbsp; It was a positive experience until I opened Pidgin and Evolution and started using my computer.&amp;nbsp; Jaunty has this new feature called NotifyOSD that application makers can use to bug the shit out of you at every possible moment.&amp;nbsp; Someone signed online? Bug the shit out of the user.&amp;nbsp; Received an e-mail, bug the shit out of the user.&amp;nbsp; Joined a wireless network?&amp;nbsp; You guessed it.&amp;nbsp; Let's bug the shit out of the user.&lt;br /&gt;&lt;br /&gt;The old notifier used to stay out of your way.&amp;nbsp; Get a little message or whatnot when you got a new e-mail.&amp;nbsp; It was unobtrusive and didn't distract you while you're trying to figure out why some little bit of SQLAlchemy code is making too many calls to a database.&amp;nbsp; But now, Canonical has found it necessary to make sure you're abundantly aware of every excruciating detail of your computer's operation.&amp;nbsp; Productivity be damned.&lt;br /&gt;&lt;br /&gt;I don't know whose bright idea this feature was, but whoever it is is trying to spread their terminal case of attention deficit disorder to the rest of the world.&amp;nbsp; Fuck you.&amp;nbsp; Grind up your Adderall pills and snort them until your heart shits out like a Chevy.&lt;br /&gt;&lt;br /&gt;Anyway, if you don't know what I'm talking about, this is the offending window:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;i-hope-youre-proud-of-yourself.png&quot; src=&quot;/teddziuba/images/i-hope-youre-proud-of-yourself.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; 
display: block;&quot; height=&quot;476&quot; width=&quot;733&quot; /&gt;&lt;b&gt;How To Turn This Thing Off&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Open up a command line.&amp;nbsp; Type this:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;sudo mv /usr/share/dbus-1/services/org.freedesktop.Notifications.service /usr/share/dbus-1/services/org.freedesktop.Notifications.service.disabled&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;And restart your computer.&amp;nbsp; You could probably restart the dbus daemon, but that makes a lot of things go ill on your machine.&lt;br /&gt;&lt;br /&gt;This disables the notifier for good.&amp;nbsp; Now you can get back to work.&lt;br /&gt; 
</content>
 </entry>
 
 <entry>
   <title>DiggBar is a Howl of Desperation</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/04/diggbar-is-a-howl-of-desperati.html"/>
   <updated>2009-04-10T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/04/diggbar-is-a-howl-of-desperati</id>
   <content type="html">&lt;img alt=&quot;silicon-valley-is-decadent-and-depraved.jpg&quot; src=&quot;/teddziuba/images/silicon-valley-is-decadent-and-depraved.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;270&quot; width=&quot;280&quot; /&gt; &lt;div&gt;Since the most recent death of Uncov, I've tried to lay off the Web 2.0 shit.&amp;nbsp; However, as a connoisseur of fail, I thought the DiggBar was worth an examination.&lt;br /&gt;&lt;br /&gt;DiggBar is a URL shortening service from Digg, the internet's largest community of whiners, armchair political activists, inconsolable Book-of-Steve-Jobs bible beaters, and automatic voting bots.&amp;nbsp; The long and short of it is this: you can put any address into it, and it will give you a way to view that URL through Digg.com.&amp;nbsp; For example, &lt;a href=&quot;http://digg.com/u1hrO&quot;&gt;http://digg.com/u1hrO&lt;/a&gt; brings you back here, except with a Digg toolbar at the top.&lt;br /&gt;&lt;br /&gt;There's been a small wave of butthurt over this little scheme, because every link on the front page of Digg.com now leads you to one of these toolbars instead of to the actual content.&amp;nbsp; So, when a Digg user clicks through, he never actually leaves Digg.com.&amp;nbsp; They've done some of the stuff necessary so that publishers don't get shafted on the traffic or the PageRank (and still managed to fuck that up), but still, that little bar adds little to no value to the user.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Then why did they do it?&lt;br /&gt;&lt;br /&gt;When an entrepreneur raises any useful amount of money from an investor, he needs to answer to that investor.&amp;nbsp; The CEO of a company reports to the board of directors, which usually includes the investors.&amp;nbsp; Every quarter, the CEO must lay out goals and objectives, and for internet companies, these goals and objectives always, &lt;b&gt;&lt;i&gt;always&lt;/i&gt;&lt;/b&gt; 
include traffic growth.&lt;br /&gt;&lt;br /&gt;Digg has raised 40 million dollars to date.&amp;nbsp; With that kind of money, investors demand explosive growth.&amp;nbsp; Since the economy has gone to shit, there's a very slim chance that Digg will see a sale before it needs to raise more money.&amp;nbsp; As a small company, Digg could have been a very profitable business, but instead they took too much money and set too many expectations for themselves.&amp;nbsp; I can guarantee you that Jay Adelson (CEO) and Kevin Rose have some demanding goals to meet, and lately, they haven't been meeting them.&lt;br /&gt;&lt;br /&gt;Hence the introduction of this DiggBar business.&amp;nbsp; When a link makes its way to the top of Digg, it gets republished quite a bit.&amp;nbsp; Now that all these links will land a user at Digg.com, Digg collects the unique users from this collateral linkage.&amp;nbsp; And it's working, too.&amp;nbsp; In a recent interview, VP John Quinn of Digg said that the DiggBar has given them a 20% boost in unique visitors.&lt;br /&gt;&lt;br /&gt;This move shows that not only is Digg willing to pull some sleazy shit to increase their unique visitors, but that they also &lt;i&gt;need&lt;/i&gt; to pull this sleazy shit, because they need more unique visitors.&lt;br /&gt;&lt;br /&gt;Damn.&amp;nbsp; And I would have gotten away with it too, if it weren't for you meddling kids.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Footnotes.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Uncov.com died again.&amp;nbsp; I never owned the domain name, one of my business partners did.&amp;nbsp; The registration on it recently lapsed, and somebody picked it up.&amp;nbsp; It now redirects to a Twitter search for &quot;kevinrose&quot;.&amp;nbsp; If you're the one who bought it, good show.&amp;nbsp; Thanks for not being a spammer.&amp;nbsp; I'm not willing to buy it back from you, but if you want to give it to me, I will take you out for a beer to congratulate your 
achievement.&lt;/li&gt;&lt;li&gt;No, I did not get fired from The Register.&amp;nbsp; My wife and I had a daughter last week, and I am taking some time off.&amp;nbsp; Not sure when I'll return yet, but I will.&amp;nbsp; I just need to get my sleeping back on schedule.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Be a Better Blogger. Stop Reading Blogs.</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/03/be-a-better-blogger-stop-readi.html"/>
   <updated>2009-03-08T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/03/be-a-better-blogger-stop-readi</id>
   <content type="html">&lt;img alt=&quot;three-days-hike-to-the-douchebag-dharma-station.jpg&quot; src=&quot;/teddziuba/images/douchebag-dharma-station.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;478&quot; width=&quot;206&quot; /&gt; &lt;div&gt;The greatest hope of the internet generation is that you can share your thoughts with everybody in the world.&amp;nbsp; The greatest letdown of the same generation is that nobody cares.&amp;nbsp; Still, that doesn't keep us from trying.&lt;br /&gt;&lt;br /&gt;Bloggers are good people, generally.&amp;nbsp; We're self-confident in a passive-aggressive sort of way, we're opinionated, and best of all, we can type quickly.&amp;nbsp; But what separates bloggers from each other?&amp;nbsp; Those who can break exclusive news usually have a good following, but what about the rest of us?&amp;nbsp; How do you actually get better at blogging?&lt;br /&gt;&lt;br /&gt;Not that you asked for it, but this is my advice: read more, write less.&amp;nbsp; By &quot;read more&quot;, I mean books.&amp;nbsp; Newspapers are useless, because the job of every newspaper editor is to remove any semblance of personality from all of the text.&amp;nbsp; It's just facts, and facts are fuckin' &lt;i&gt;boring&lt;/i&gt;.&amp;nbsp; If you want to cruise programming.reddit a few times a day and write reactionary articles, fine, live with that crowd.&amp;nbsp; It's not an interesting place to be, though.&amp;nbsp; Telling the world why you think DHH is wrong about some programming methodology isn't going to get you a column at Rolling Stone.&lt;br /&gt;&lt;br /&gt;Getting on the front page of Digg is not an accomplishment.&lt;br /&gt;&lt;br /&gt;Blogarrhea begets blogarrhea.&amp;nbsp; There's a continuous global discussion on the internet, and if you're not the one who started it, you're just background noise.&amp;nbsp; People like Paul Graham and Dave Winer never really say anything original, 
they just enjoy the act of typing.&amp;nbsp; Graham has been re-writing the same three essays for almost a decade, and Winer, well, Winer doesn't have much to do during the day, and at least blogging keeps him away from drugs and rap music. I guess it's a positive influence.&lt;br /&gt;&lt;br /&gt;If you want to keep that company, do so, but like programming, writing is so much better when you value elegance as well as functionality.&lt;br /&gt;&lt;br /&gt;Which brings me to my second point.&amp;nbsp; Write less.&lt;br /&gt;&lt;br /&gt;For the last two months, I have been working my way through a pile of books: everything ever published by &lt;a href=&quot;http://en.wikipedia.org/wiki/Chuck_palahniuk&quot;&gt;Chuck Palahniuk&lt;/a&gt; (tl;dr: the guy who wrote Fight Club).&amp;nbsp; I'm almost done, a book and a half to go.&amp;nbsp; Chuck likes to do these writers' workshops, and somebody once asked him what he does when he's stuck.&amp;nbsp; He knows where the story needs to go, but just doesn't know how to get it there.&lt;br /&gt;&lt;br /&gt;Chuck's response: &quot;Did you ever go into the bathroom and try and take a shit when you didn't have to go?&quot;&lt;br /&gt;&lt;br /&gt;Whenever I sit down to write a post here, it's because I really have to take a dump.&amp;nbsp; Incidentally, sometimes when I write for The Register, it feels like I'm really trying to squeeze one out.&amp;nbsp; If I end up dead from an aneurysm, that's what happened.  Setting a post-per-week quota for yourself is like setting a lines-of-code quota at work.&lt;br /&gt;&lt;br /&gt;Don't write just because you want to spend some time on the pot.&amp;nbsp; Do it because you really have to go.&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Effective Vices for the IT Professional</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/02/effective-vices-for-the-it-pro.html"/>
   <updated>2009-02-08T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/02/effective-vices-for-the-it-pro</id>
   <content type="html">&lt;img alt=&quot;practicing-depravity-makes-you-better-at-it.jpg&quot; src=&quot;/teddziuba/images/practicing-depravity-makes-you-better-at-it.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;240&quot; width=&quot;320&quot; /&gt; &lt;div&gt;There's a blog post that snakes through the programming community every three months: the one about only hiring programmers who program in their spare time.&amp;nbsp; It's always the same person who writes it, too.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;He's a sequentially numbered employee at a company with a well-tracked ticker symbol, and his only outlet of authority is sweating down some poor sod in a windowless interview room, asking questions about sorting integers in linear time.&lt;br /&gt;&lt;br /&gt;The truth of it is, after a day of writing JUnit tests to achieve the corporate-policy-mandated code coverage metric, you don't need to go home to a Haskell compiler.&amp;nbsp; You need to go home to a tall drink and a depraved presentation of human sexuality.&amp;nbsp; Corporate coding sucks, and if there's no vice to counteract it, you'll be dead of an aneurysm by age forty.&amp;nbsp; They'll find you on the toilet, pants down, your copy of Design Patterns unceremoniously splayed open on the floor.&lt;br /&gt;&lt;br /&gt;Programming isn't a glamorous job, and pretending that it is won't make you any better at it.&lt;br /&gt;&lt;br /&gt;I've been studying some techniques for decompressing the tension built up by JBoss and WebSphere in my personal lab for quite some time now.&amp;nbsp; I'm not a corporate coder anymore, but when I was, I studied ways to make it easier on the head.&amp;nbsp; I'm now ready to share my results with the scientific community.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Drinking&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Alcohol is the most obvious medication because it's cheap and readily available.&amp;nbsp; Parked on the couch, 
racing your way to the bottom of a highball glass of Chivas Regal is a fantastic way to forget that the hour you spent in a meeting watching two type-A personalities fiercely debate Scrum versus XP is one hour less of the life you wanted.&amp;nbsp; The downside is that one drink usually leads to three or four, and you waste the drunkenness on an early sleep because you need to get up early the next day and do it all over again.&lt;br /&gt;&lt;br /&gt;Alcohol interrupts your sleep, and if you're going to stay sharp at work, you need a good rest.&amp;nbsp; If you're one of the damned souls like me, you get vicious hangovers, to the point where swimming in drink for a night isn't even worth it, if you're going to spend the next day wishing that you'd died of alcohol poisoning.&lt;br /&gt;&lt;br /&gt;That being said, at some time in your programming career you need to go to work with a severe hangover, out of sticking it to the man by way of martyrdom.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Drugs&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Here's where the process gets dicey.&amp;nbsp; When you wear a button-down shirt to work, you're not the usual type of person who ends up in county lock-up on a possession charge.&amp;nbsp; Marijuana is certainly a better option than alcohol in every way, shape, and form, but like it or not, it is illegal.&amp;nbsp; There are some exceptions here in California, but you're still rolling the dice.&amp;nbsp; Getting pinched could mean getting fired.&lt;br /&gt;&lt;br /&gt;The scary shit comes as rocks or powders.&amp;nbsp; Again, being the khakis-and-necktie crowd, nobody really expects you to be shooting black-horse heroin in the shower.&lt;br /&gt;&lt;br /&gt;There is a convenient edge case when it comes to drugs, though.&amp;nbsp; Prescription painkillers, when used appropriately, really take the edge off of reality.&amp;nbsp; Again, you run the risk of upsetting John Q. 
Law, so make sure it's legit.&amp;nbsp; While they make for good entertainment in the evening, there's a real possibility that you can get addicted, and once a vice starts interfering with your work, then you're fucked.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Strippers&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you're the type that easily takes to strippers, you had better come ready to peel off the cabbage: this vice doesn't come cheap.&amp;nbsp; There's also a simple but unrelenting set of rules you need to learn to keep from getting your ass kicked by a bouncer.&amp;nbsp; It's the kind of thing you'll pick up as you go.&lt;br /&gt;&lt;br /&gt;For the programmer or IT professional, strippers are an excellent choice.&amp;nbsp; You usually show up to the gentleman's establishment with a bit more money than any of the other clients, so you'll be Mr. Popular.&amp;nbsp; Just be respectful of what's going on: it's not so much a smut show as it is a first hand demonstration in a loosely regulated free market.&amp;nbsp; The dancers are there to make a buck, and don't you forget it.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Tobacco&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Tobacco is a great vice for the programmer because it's a performance enhancing drug as well as an escape.&amp;nbsp; I recommend either cigars or smokeless tobacco to avoid the growing anti-cigarette movement.&amp;nbsp; A cigar is gangster, and chewing tobacco is stealth.&lt;br /&gt;&lt;br /&gt;After putting down a nice stogie wrapped in Connecticut shade, you'll feel like it's time for action.&amp;nbsp; Nicotine is a fantastic stimulant - better than caffeine.&amp;nbsp; If you're the work-from-home type, smoking a cigar twenty minutes before you start will send you on your way in a hurry.&amp;nbsp; Avoid the dregs, though: don't buy a cigar in any place that sells gasoline.&lt;br /&gt;&lt;br /&gt;Chewing tobacco is often overlooked.&amp;nbsp; Yeah, you say it's more of a staple with the Nascar crowd, but that's really just a stereotype 
invented by the Nascar crowd, designed to keep you damned hoity-toity folks from driving up the cost of a can of chaw.&lt;br /&gt;&lt;br /&gt;The key part about dip is that you can do it at your desk.&amp;nbsp; Spit into an empty Coke bottle.&amp;nbsp; Nobody will come by to bother you.&amp;nbsp; Plus, think of how authoritative you're going to be at a meeting when you start it off by lipping a fat digger out of a tin of Skoal.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Just Keep It Within Reason&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;You can judge any vice on two dimensions: how good is it, and how likely is it to interfere with your work.&amp;nbsp; Once a vice becomes more than a vice, you're going to &lt;i&gt;wish&lt;/i&gt; you were that guy who goes home to code Haskell.&lt;br /&gt;&lt;br /&gt;However, there is a convenient side-effect to the addictiveness.&amp;nbsp; If you are aware enough to see your vice getting out of hand, it's probably time to quit your job.&lt;br /&gt;&lt;br /&gt;Just don't do anything illegal.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Corporate Blogs: It's The PageRank, Stupid</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/01/corporate-blogs-its-the-pagera.html"/>
   <updated>2009-01-19T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/01/corporate-blogs-its-the-pagera</id>
   <content type="html">&lt;img alt=&quot;still-not-giving-mint-my-banking-information.jpg&quot; src=&quot;/teddziuba/images/still-not-giving-mint-my-banking-information.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;256&quot; width=&quot;273&quot; /&gt; &lt;div&gt;If you're running an online business and have hired a consultant who tells you that you should have a corporate blog to &quot;better connect with the community&quot;, fire that consultant.&lt;br /&gt;&lt;br /&gt;If you have a corporate blog that is only marginally more interesting than a press release wire, you're wasting your time.&lt;br /&gt;&lt;br /&gt;A corporate blog should serve only one primary purpose: distribution.&amp;nbsp; And I'm not talking about building brand recognition by getting people to read your blog.&amp;nbsp; Nine times out of ten, the text on your corporate blog is a chore to read.&amp;nbsp; Even Google fails this - their pathological cuteness and lame humor come off as contrived.&amp;nbsp; It's not funny.&amp;nbsp; It's irritating.&lt;br /&gt;&lt;br /&gt;Anyway, how does a blog get you distribution if you're not concentrating on branding?&amp;nbsp; PageRank.&amp;nbsp; You can and should use your blog for link-building and search engine optimization.&lt;br /&gt;&lt;br /&gt;A great example of this is &lt;a href=&quot;http://www.mint.com/blog/&quot;&gt;Mint.com's blog&lt;/a&gt;.&amp;nbsp; Mint is a personal finance web product that competes with desktop apps like Quicken.&amp;nbsp; Mint publishes longer articles about personal finance to their blog, and has several thousand readers.&amp;nbsp; That alone is interesting, but not mind-blowing.&amp;nbsp; The trick is that their content is &lt;i&gt;useful&lt;/i&gt;.&amp;nbsp; It's basically a magazine about personal finance without the advertisements.&amp;nbsp; Social media picks up on Mint's content, and it gets a lot of inbound links.&lt;br /&gt;&lt;br /&gt;Mint 
takes gross advantage of those inbound links.&amp;nbsp; That's the whole point.&amp;nbsp; At the bottom of every blog post is this little nugget:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;mint-screenshot.png&quot; src=&quot;/teddziuba/images/mint-screenshot.png&quot; class=&quot;mt-image-center&quot; style=&quot;border: 1px solid black; margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;182&quot; width=&quot;654&quot; /&gt;&lt;br /&gt;A-ha, I see what you're doing there.&amp;nbsp; Mint is juicing their PageRank with the popularity of the blog.&amp;nbsp; If you're a personal finance website, chances are you want to optimize for some of these keywords.&amp;nbsp; And it's really working for them.&lt;br /&gt;&lt;br /&gt;If you use Google's Keyword Tool to estimate the traffic for these keywords, find Mint's rank in the result page for each of them, and then multiply keyword traffic by the distribution of clicks for the top results in Google, you'll see that Mint is raking in at least 100,000 uniques per month from Google for these keywords.&lt;br /&gt;&lt;br /&gt;If you hire a writer to post on your corporate blog, you could be seeing this kind of traffic, too.&amp;nbsp; By &quot;writer&quot;, I don't mean &quot;Peggy in accounts receivable who majored in English thirty years ago&quot;.&amp;nbsp; No, I mean someone whose words are worth reading.&amp;nbsp; A decent freelancer will run you 50 cents per word.&amp;nbsp; A good length blog post is 1,000 words, and you should publish at least once per week.&amp;nbsp; 5 posts like this per month will cost $2,500.&lt;br /&gt;&lt;br /&gt;Now let's compare that to buying traffic from Google by bidding on these keywords.&amp;nbsp; A really, &lt;i&gt;really&lt;/i&gt; conservative estimate of a bid price for keywords like this is 10 cents (but good luck ranking with that bid, cheapskate).&amp;nbsp; To buy 100,000 uniques would therefore cost you $10,000 per month, &lt;i&gt;and&lt;/i&gt; you don't get the 
PageRank.&lt;br /&gt;&lt;br /&gt;Of course, the success of this strategy isn't as quantifiable as buying ads, but eventually you'll see traffic throughput.&amp;nbsp; Any writer worth his salt will be able to game social media sites like Digg and Reddit, which will bring in the backlinks.&amp;nbsp; All you need to do is figure out what keywords to optimize for, and put them in the blog template.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Every day I'm hustlin'&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt; 
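&lt;p&gt;A rough sketch of the cost math above, using only the figures quoted in this post (50 cents per word, 1,000 words per post, 5 posts per month, 100,000 uniques, a 10 cent bid).  The function and constant names are made up for illustration, and the sketch assumes one paid click per unique visitor.  Amounts are kept in whole cents to avoid floating point noise.&lt;/p&gt;

```python
# Back-of-the-envelope comparison: paying a writer vs. buying the same traffic.
# Every number below is quoted in the post; the names are hypothetical.

WORD_RATE_CENTS = 50          # freelancer rate per word
WORDS_PER_POST = 1000         # "a good length blog post"
POSTS_PER_MONTH = 5           # posts per month used in the post's estimate
UNIQUES_PER_MONTH = 100_000   # the post's estimate of Mint's keyword traffic
BID_PER_CLICK_CENTS = 10      # the post's "really conservative" keyword bid


def monthly_writer_cost_cents(word_rate, words_per_post, posts_per_month):
    """Cost of hiring a writer for a month, in cents."""
    return word_rate * words_per_post * posts_per_month


def monthly_ad_cost_cents(uniques, bid_per_click):
    """Cost of buying the same visitors, assuming one paid click per unique."""
    return uniques * bid_per_click


writer_cents = monthly_writer_cost_cents(
    WORD_RATE_CENTS, WORDS_PER_POST, POSTS_PER_MONTH
)
ads_cents = monthly_ad_cost_cents(UNIQUES_PER_MONTH, BID_PER_CLICK_CENTS)

print(f"writer: ${writer_cents // 100:,} per month")
print(f"ads:    ${ads_cents // 100:,} per month")
print(f"ads cost {ads_cents / writer_cents:.0f}x as much")
```

&lt;p&gt;Swap in your own word rate, posting schedule, and bid price to see where the break-even sits for your keywords.&lt;/p&gt;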
</content>
 </entry>
 
 <entry>
   <title>Advice to Old Men from a Young Man</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/01/advice-to-old-men-from-a-young.html"/>
   <updated>2009-01-17T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/01/advice-to-old-men-from-a-young</id>
   <content type="html">&lt;img alt=&quot;billy-mays-is-still-cooler.jpg&quot; src=&quot;/teddziuba/images/billy-mays-is-still-cooler.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;250&quot; width=&quot;244&quot; /&gt;1. Unless you were shooting Kennedy, nobody cares where you were when Kennedy was shot.&lt;br /&gt;&lt;br /&gt;2. The left lane on the freeway is a young man's game.&lt;br /&gt;&lt;br /&gt;3. Things will always get more expensive.&amp;nbsp; Bitching about the cost of gasoline isn't going to make it any cheaper.&amp;nbsp; Corollary: nobody cares that a gallon of gasoline used to cost a nickel.&lt;br /&gt;&lt;br /&gt;4. War stories: keep them coming.&lt;br /&gt;&lt;br /&gt;5. If you have a prosthetic hook-arm, it's your duty to use it to scare children.&amp;nbsp;&amp;nbsp; Corollary to #4, your prosthetic hook-arm makes a war story way better.&amp;nbsp; If you didn't lose your arm in a war, make up a good war story to explain it.&amp;nbsp; Nobody will know the difference.&lt;br /&gt;&lt;br /&gt;6. The world doesn't owe you anything.&lt;br /&gt;&lt;br /&gt;7. Respect your youngers.&amp;nbsp; We're the ones who will pay your Social Security and take care of you when you're enfeebled.&lt;br /&gt;&lt;br /&gt;8. Advice you offer to young men should fall into one of these three categories:&lt;br /&gt;&amp;nbsp;&amp;nbsp; A. The finer points of tolerable behavior when it comes to strippers&lt;br /&gt;&amp;nbsp;&amp;nbsp; B. Recommendations on quality whiskeys&lt;br /&gt;&amp;nbsp;&amp;nbsp; C. Sticking it to the man&lt;br /&gt;&lt;br /&gt;9. If you're past the point where people depend on you, eat, smoke, drink, and gamble.&amp;nbsp; We young men must control our vices, but you've earned the right to indulge with reckless abandon.&amp;nbsp; Show us what we have to look forward to.&lt;br /&gt;&lt;br /&gt;10. 
You keep getting older, but they stay the same age.&amp;nbsp; From a young man's perspective, a 65 year old man with a 23 year old woman isn't a shame, it's a victory.&lt;br /&gt; 
</content>
 </entry>
 
 <entry>
   <title>Buying Sea Salt?  You Might Be a Sucker.</title>
   <link href="http://widgetsandshit.com/teddziuba/2009/01/buying-sea-salt-you-might-be-a.html"/>
   <updated>2009-01-11T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2009/01/buying-sea-salt-you-might-be-a</id>
   <content type="html">&lt;img alt=&quot;see-also-hypertension.jpg&quot; src=&quot;/teddziuba/images/see-also-hypertension.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;240&quot; width=&quot;320&quot; /&gt;If there's one thing I have a lot of contempt for, it's neo-hippie bullshit.&amp;nbsp; However, my appreciation for fresh produce just barely overthrows this contempt, so sometimes I go shopping at &lt;a href=&quot;http://www.berkeleybowl.com/&quot;&gt;The Berkeley Bowl&lt;/a&gt; for all kinds of fruits and vegetables that I've never heard of.&amp;nbsp; Really, they have some wonky shit there.&amp;nbsp; Ever see a &lt;a href=&quot;http://en.wikipedia.org/wiki/Buddha%27s_hand&quot;&gt;Buddha's Hand&lt;/a&gt;?&lt;br /&gt;&lt;br /&gt;Anyhoo, they sell sea salt there.&amp;nbsp; Salt, like 'out the ocean.&amp;nbsp; And people buy it.&amp;nbsp; And those people are morons.&lt;br /&gt;&lt;br /&gt;If you buy sea salt, you're paying a premium for the luxury of being a douchebag.&amp;nbsp; It's salt.&amp;nbsp; It has no discernible flavor other than &lt;i&gt;salty&lt;/i&gt;, it has no metric of quality other than &lt;i&gt;not mixed with dirt and glass shards&lt;/i&gt;, and it should have no variation in price other than &lt;i&gt;cheap&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;You can buy 4 pounds of standard issue table salt for $5.37 on the internet.&amp;nbsp; Alternatively, I've seen 4 ounces of sea salt for sale for $2.39.&amp;nbsp; Per ounce, the sea salt costs roughly 712% of what the table salt does, a markup of about 612%.&amp;nbsp; It's a pretty good business if you're selling salt.&lt;br /&gt;&lt;br /&gt;In fact, sea salt might even be bad for you.&amp;nbsp; Regular salt has been used for years as a vehicle for iodine, a chemical your body needs to keep you from becoming a retard.&amp;nbsp; No bullshit, iodine deficiency can cause mental retardation.&amp;nbsp; It only costs a dollar or so to iodize a ton of salt, so it really is ideal.&amp;nbsp; Most sea salt 
isn't iodized, because it's sold as &quot;natural&quot;.&amp;nbsp; Boy, a lesser product for way more money?&amp;nbsp; Where do I sign up?&lt;br /&gt;&lt;br /&gt;Some people claim to be able to distinguish the &quot;superior flavor&quot; of sea salt.&amp;nbsp; These are the same kinds of people who keep a fridge stocked with gallons of bottled water and don't use the tap for anything but watering a house cactus.&amp;nbsp; If you are one of these people, you should kill yourself as a public service.&amp;nbsp; The only real difference between sea salt and table salt you'll feel when you eat it is the coarseness of sea salt.&amp;nbsp; That's it.&amp;nbsp; And coarse salt isn't worth fucking ten dollars a pound.&lt;br /&gt;&lt;br /&gt;tl;dr if you're buying sea salt, consider yourself successfully marketed to.&amp;nbsp; It's like Fiji water.&amp;nbsp; You got hustled.&lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>There Will Be No Web 3.0</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/12/there-will-be-no-web-30.html"/>
   <updated>2008-12-21T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/12/there-will-be-no-web-30</id>
   <content type="html">&lt;img alt=&quot;husslin-vs-ballin-the-eternal-struggle.jpg&quot; src=&quot;/teddziuba/images/husslin-vs-ballin-the-eternal-struggle.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;284&quot; width=&quot;250&quot; /&gt;The recession reached its hand into Silicon Valley's now lukewarm tub and yanked the plug.&amp;nbsp; It's still draining out, and I wish it would go faster, because there are just too many fucking people in the San Francisco Bay Area.&amp;nbsp; I'm talking about you, guy in your Prius taking the left hand turn on to Middlefield Road too slowly.&amp;nbsp; Leave, now.&amp;nbsp; And don't come back.&amp;nbsp; Bonus points for wrapping your expression of environmental consciousness around a tree.&amp;nbsp; Be one with nature.&lt;br /&gt;&lt;br /&gt;The guy who drives the Prius likely works at a Web 2.0 company that's burning its way through the $4 million it raised from Me2 Ventures, one of the many sheep-funds in the Valley who follow the trends of top-tier investors like Sequoia or DFJ but don't have the connections to pull liquidity out of hype.&lt;br /&gt;&lt;br /&gt;In two years, this guy's company will finally run out of money, having failed to raise another round because investors are too busy conjuring up the next bubble.&amp;nbsp;&amp;nbsp; The failure of Web 2.0 was a live demonstration in I-Told-You-So, as was the first bubble.&amp;nbsp; Both times, the world looked on and thought &quot;what the fuck are you doing?&quot;, and Silicon Valley replied &quot;shut up and bring me my Vaseline&quot;.&amp;nbsp; We went from bad business plans to no business plans, and saw much less liquidity this time.&amp;nbsp; The big bang was YouTube, and it was all down hill from there.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;br /&gt;The Only Easier Money is Marijuana&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;So what will the next bubble be?&amp;nbsp; Green technology.&amp;nbsp; Green 
energy.&amp;nbsp; Green computers.&amp;nbsp; Green pants.&amp;nbsp; Green vomit after an Absinthe adventure.&lt;br /&gt;&lt;br /&gt;Al Gore did a wonderful job creating awareness of global warming.&amp;nbsp; Awareness isn't the right word, but neither is hysteria.&amp;nbsp; Both are close enough.&lt;br /&gt;&lt;br /&gt;San Franciscans were more motivated than usual by this cause, and have begun to care about their carbon footprints or other such nonsense.&amp;nbsp; Making a San Franciscan feel like he alone can make a difference is the best way to control his actions.&amp;nbsp; See also: spending habits.&amp;nbsp; Al Gore, with his nonthreatening voice and relentless assault of data has the power to cultivate the same feeling in stay-at-home-moms and college students.&lt;br /&gt;&lt;br /&gt;Unfortunately, the average American mind can only be concerned with one crisis at a time.&amp;nbsp; Purveyors of fine doom-and-gloom are continuously vying for this spot.&amp;nbsp; Presently, it's the economy.&amp;nbsp; Foreclosures.&amp;nbsp; You're going to lose your house.&amp;nbsp; Oh fuck, you'll lose your house, your family, your car, and did we mention that you'll be living on the street?&amp;nbsp; Fear not.&amp;nbsp; Here's some shit you can buy to make it all better.&amp;nbsp; Here's a politician you can vote for who will fix everything.&lt;br /&gt;&lt;br /&gt;Fear cycles last a few years.&amp;nbsp; Remember when we were afraid of terrorism?&amp;nbsp; What about peak oil?&amp;nbsp; Global &lt;i&gt;cooling&lt;/i&gt; anyone?&amp;nbsp; When money comes back to the Valley, it's going to be aligned perfectly with the beginning of the next fear cycle, and the next fear cycle is going to be global warming.&amp;nbsp; Or climate change.&amp;nbsp; Or polar bear rescue.&amp;nbsp; You can call it whatever you like, as long as you spend money to fix it.&amp;nbsp; Do your part.&amp;nbsp; It's your obligation as a citizen of the earth.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Still 
Waiting For That Twitter Business Plan&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Green tech hasn't taken off yet because liberal guilt can't support a very big market.&amp;nbsp; What you need is government collusion.&amp;nbsp; You need somebody with a gun to step in and say that if you emit more than 100 tons of carbon per year, you need to pay.&amp;nbsp; You need that same person with a gun to say that these carbon emission credits have value, and can be traded.&amp;nbsp; It helps if your typical Silicon Valley entrepreneur or investor believes the call to action.&lt;br /&gt;&lt;br /&gt;That last part is easy.&amp;nbsp; Web 2.0 was all about San Francisco values.&amp;nbsp; Sharing.&amp;nbsp; Caring.&amp;nbsp; Understanding.&amp;nbsp; What would Web 3.0 be about? Many say it's some semantic bullshit.&amp;nbsp; Those are the same people who have figured out what &lt;a href=&quot;http://www.twine.com/&quot;&gt;Twine&lt;/a&gt; does (any hints?).&amp;nbsp; Whatever we can dream up to do over the internet won't draw any money; investors will be bored with web companies after this debacle.&amp;nbsp; The money will go to green tech, because there will be an obvious business plan, popular support, and a government mandate.&amp;nbsp; How can you lose?&amp;nbsp; &lt;br /&gt;&lt;br /&gt;The entrepreneurs will follow suit.&amp;nbsp; Silicon Valley types love to feel like they're making a difference, and green tech will practically let them fellate themselves. 
(In Web 2.0 the Silicon Valley types fellated one another, so this is the natural extension)&amp;nbsp; It will be different people, as an extensive knowledge of Python doesn't give you much insight into solar panel construction, but the same kind of people.&lt;br /&gt;&lt;br /&gt;I believe this because it's satisfying.&amp;nbsp; No more &quot;get users, do something, get bought out&quot;.&amp;nbsp; This time, it's &quot;invent something, build it, sell it&quot;.&amp;nbsp; Sure, we'll be turning a profit by taking sick advantage of alarmism, but it's a business.&amp;nbsp; &lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>Shut Your Face, Commons Httpclient</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/12/shut-your-face-commons-httpcli.html"/>
   <updated>2008-12-18T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/12/shut-your-face-commons-httpcli</id>
   <content type="html">If you're like me and every other user on the planet, you don't give a shit when an SSL certificate doesn't validate.&amp;nbsp; Unfortunately, commons-httpclient was written by some pedantic fucknozzles who have never tried to fetch real-world webpages. &lt;br /&gt;&lt;br /&gt;If you want to turn off SSL certificate validation in httpclient, do this:&lt;br /&gt;&lt;br /&gt;1. Put &lt;a href=&quot;http://juliusdavies.ca/commons-ssl/download.html&quot;&gt;not-yet-commons-ssl.jar&lt;/a&gt; on your classpath.&lt;br /&gt;2. Execute the following method before you start any SSL connections:&lt;br /&gt;&lt;br /&gt; 

&lt;pre&gt;&lt;code&gt;
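// Registers a trust-everything socket factory as the handler for https URLs.
public static void trustAllCerts() throws GeneralSecurityException, IOException {
	// EasySSLProtocolSocketFactory comes from not-yet-commons-ssl.jar.
	ProtocolSocketFactory sf = new EasySSLProtocolSocketFactory();
	Protocol p = new Protocol(&quot;https&quot;, sf, 443);
	// Protocol's registry is static, so this applies to the whole JVM.
	Protocol.registerProtocol(&quot;https&quot;, p);
}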
&lt;/code&gt;&lt;/pre&gt;

This essentially makes commons-httpclient accept every SSL certificate it gets.&amp;nbsp; Yeah, that's what I thought.  Who's bitching now? 
</content>
 </entry>
 
 <entry>
   <title>Python Makes Me Nervous</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/12/python-makes-me-nervous.html"/>
   <updated>2008-12-06T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/12/python-makes-me-nervous</id>
   <content type="html">&lt;img alt=&quot;wait-til-you-see-those-goddamn-bats.jpg&quot; src=&quot;/teddziuba/images/wait-til-you-see-those-goddamn-bats.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;238&quot; width=&quot;274&quot; /&gt; &lt;div&gt;The amount of time saved by using Python as opposed to something like Java is inversely proportional to the number of people working on the project.&lt;br /&gt;&lt;br /&gt;As a programmer in a team, you need rules.&amp;nbsp; You need structure.&amp;nbsp; You need order.&amp;nbsp; Freewheeling your way around a software project is going to create more problems than it solves.&lt;br /&gt;&lt;br /&gt;What I'm butthurting about here is Python's duck typing.&amp;nbsp; It's cute when you're a lone wolf working on a simple Django application, but add a few more people to the project and it quickly becomes unmanageable.&amp;nbsp; Why?&amp;nbsp; Because with duck typing, you need to keep &lt;b&gt;a lot&lt;/b&gt; more state in your head to interact with other peoples' code.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Pydev for Eclipse Sucks Too&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Method signatures are virtually useless in Python.&amp;nbsp; In Java, static typing makes the method signature into a recipe: it's all the shit you need to make this method work. Not so in Python.&amp;nbsp; Here, a method signature will only tell you one thing: how many arguments you need to make it work.&amp;nbsp; Sometimes, it won't even do that, if you start fucking around with **kwargs.&lt;br /&gt;&lt;br /&gt;Calling a colleague's method isn't as easy as looking at the signature.&amp;nbsp; You need to look at the method definition itself to see what it does with its input.&lt;br /&gt;&lt;br /&gt;Let's look at an example from Thrift, Facebook's open source RPC server.&amp;nbsp; Here's the signature to a TServer constructor in Java:&lt;br /&gt;

&lt;pre&gt;&lt;code&gt;
protected TServer(TProcessorFactory processorFactory, TServerTransport serverTransport)
&lt;/code&gt;&lt;/pre&gt;

And there are a few other constructors that take different args.&amp;nbsp; Pretty straightforward: look at this and you know what you need to instantiate to get your TServer up and running.&amp;nbsp; Now let's look at the Python version:&lt;br /&gt;
&lt;pre&gt;&lt;code&gt;
def __init__(self, *args):
&lt;/code&gt;&lt;/pre&gt;

So, how do you use it?&amp;nbsp; Big fuckin' mystery!&amp;nbsp; You can't overload constructors in Python, so they had to mash the several different constructors into one.&amp;nbsp; To figure out how to instantiate a TServer, you need to look at the constructor implementation.&amp;nbsp; &lt;i&gt;As a user of the library, the implementation is none of my concern, unless I'm programming in Python.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;What a waste of time.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Whatever You Do, Don't Do It Wrong&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;What about errors?&amp;nbsp; Python exceptions are what really make me nervous.&amp;nbsp; Your code can run fine for the longest time then shit out with a runtime exception.&amp;nbsp; How do you know what exceptions a method can throw?&amp;nbsp; Well, you don't, unless you look at the method definition.&amp;nbsp; Fantastic.&lt;br /&gt;&lt;br /&gt;Java has a well thought out hierarchy of checked and runtime exceptions.&amp;nbsp; Sure, handling checked exceptions means you need to write a bit more code, but it's better to spend the time in development than in debugging at 4am.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Another example is in order.&amp;nbsp; In Java, the constructor to FileInputStream throws a FileNotFoundException if something goes wrong.&amp;nbsp; Since it's a checked exception, you need to deal with it somehow.&amp;nbsp; The fact that this exception is thrown is made obvious in the documentation, and your code won't compile if you ignore it.&lt;br /&gt;&lt;br /&gt;Python, on the other hand, prefers to leave things up to chance.&amp;nbsp; This is the documentation for the open() builtin, that opens a file:&lt;br /&gt;&lt;br /&gt;
&lt;pre&gt;Help on built-in function open in module __builtin__:

open(...)
    open(name[, mode[, buffering]]) -&amp;gt; file object
    
    Open a file using the file() type, returns a file object.
&lt;/pre&gt;

How does this function handle a failure?&amp;nbsp; Does it raise an Exception?&amp;nbsp; Does it return a special value?&amp;nbsp; Nobody seems to know!&amp;nbsp; Ah, fuck it, that's a runtime problem, right?&lt;br /&gt;&lt;br /&gt;Sure, runtime exceptions happen in Java, but they are usually things that are indicative of a &lt;b&gt;big&lt;/b&gt; fuckup like a NullPointerException, not something stupid like a file not being found.&lt;br /&gt;&lt;br /&gt;Programming a large project in Python makes me uneasy.&amp;nbsp; Perhaps I'm just doing it wrong?&amp;nbsp; Do other Pythonistas drop a Valium before they begin the day?&lt;br /&gt;&lt;/div&gt; 
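&lt;br /&gt;&lt;br /&gt;For what it's worth, the failure mode is knowable even though the help text doesn't spell it out: &lt;code&gt;open()&lt;/code&gt; raises an exception, it does not return a special value.&amp;nbsp; A minimal sketch, assuming Python 3, where a missing file raises &lt;code&gt;FileNotFoundError&lt;/code&gt; (a subclass of &lt;code&gt;OSError&lt;/code&gt;).&amp;nbsp; The helper &lt;code&gt;read_or_default&lt;/code&gt; and the file name are made up for illustration:&lt;br /&gt;&lt;br /&gt;

```python
import os
import tempfile


def read_or_default(path, default=""):
    """Return the file's text, or `default` if the file does not exist.

    read_or_default is a hypothetical helper, not part of any library.
    """
    try:
        with open(path) as handle:
            return handle.read()
    except FileNotFoundError:
        # open() signals a missing file by raising, not by returning None.
        return default


if __name__ == "__main__":
    # Hypothetical path that should not exist on a normal machine.
    missing = os.path.join(tempfile.gettempdir(), "no-such-file-for-this-demo.txt")
    print(repr(read_or_default(missing, default="fallback")))
```

&lt;br /&gt;Nothing forces the caller to write that &lt;code&gt;except&lt;/code&gt; clause, which is the complaint above: the exception exists, but you only learn about it from the docs or from production.&lt;br /&gt;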
</content>
 </entry>
 
 <entry>
   <title>Analog Debugging is Hard</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/11/analog-debugging-is-hard.html"/>
   <updated>2008-11-24T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/11/analog-debugging-is-hard</id>
   <content type="html">I took a new job at a software company in Palo Alto, which means an 80 mile commute every day through Bay Area combat traffic.&amp;nbsp; The first two weeks wore hard on my motorcycle - a 14 year old Ninja 500.&amp;nbsp; Last week on my ride home, the left turn signal stopped working.&amp;nbsp; Fuck.&lt;br /&gt;&lt;br /&gt;If you thought debugging a software problem was hard, try debugging a hardware problem.&amp;nbsp; There are some salient facts about hardware problems that make them a real bitch:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;You need to buy a lot of tools.&lt;/li&gt;&lt;li&gt;There's a real possibility that you will fuck something up beyond repair.&lt;/li&gt;&lt;li&gt;There's a real possibility that you will injure yourself.&lt;/li&gt;&lt;li&gt;If it's your primary vehicle, you need to have it up and running on Monday.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;A blinker light stops working, which means electricity isn't flowing.&amp;nbsp; Sounds easy, but to get access to the wires, I needed to take the whole damn thing apart:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;close_up_small.JPG&quot; src=&quot;/teddziuba/images/close_up_small.JPG&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;410&quot; width=&quot;547&quot; /&gt;&lt;br /&gt; &lt;div&gt;You know how when you're writing software for a client, and they completely underestimate the amount of time and effort required to build something?&amp;nbsp; Yeah, the same goes for auto mechanics.&amp;nbsp; Don't bitch about a shop's $75/hr labor rate or their diagnosis fee.&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Avoiding NLP At All Costs</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/11/avoiding-nlp-at-all-costs.html"/>
   <updated>2008-11-13T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/11/avoiding-nlp-at-all-costs</id>
   <content type="html">&lt;img alt=&quot;hurrdurr.gif&quot; src=&quot;/teddziuba/images/hurrdurr.gif&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;286&quot; width=&quot;317&quot; /&gt;I'm working with a startup now on a text summarization project.&amp;nbsp; The requirements are fairly loose: &quot;take all this text and make it smaller&quot;, solving the tl;dr problem (too long; didn't read).&amp;nbsp; There are a couple of critical details, namely identifying the sentiment of the text, and a few others that are excruciatingly domain-specific.&lt;br /&gt;&lt;br /&gt;At first glance, this seems approachable with some natural language processing libraries.&amp;nbsp; Oh no.&amp;nbsp; There be dragons.&amp;nbsp; At Pressflip, I got myself into a few NLP libraries, and the only takeaway I got from all that experience was &lt;i&gt;&quot;Don't use NLP.&amp;nbsp; Ever.&quot;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Why?&amp;nbsp; NLP is yucky.&amp;nbsp; It's complicated, the field is rife with academic shitheaddery, there are some major-asspain licensing issues with a couple of software packages and best of all, it's balls slow.&amp;nbsp; Plus, if you venture down the road of natural language processing, the law of diminishing returns will pull you into a dark alley, pummel you with a tire iron, take your wallet, and then just to be a prick, steal your shoes so you need to walk home barefoot.&lt;br /&gt;&lt;br /&gt;My point is, for 99 practical projects out of 100, you can cheat your way out of NLP.&amp;nbsp; Cook up some fancy shit with word frequencies and logarithms.&amp;nbsp; Reach back into your information retrieval notes for inspiration.&amp;nbsp; TF*IDF can take you a long way if you know how to use it.&lt;br /&gt;&lt;br /&gt;When I was brainstorming the project I'm working on, my first thought was some hand-waving business about a part-of-speech tagger and a Markov Chain to figure out probabilities of 
part-of-speech transitions and all that fancy shit.&amp;nbsp; Factor in a little bit of sentiment detection from God-knows-where and that was my sketch.&amp;nbsp; Then practicality set in: how much time do you want to spend on this?&amp;nbsp;&lt;b&gt; If you are considering NLP as the answer to a real problem, it's virtually certain that you're overthinking it&lt;/b&gt;. &lt;br /&gt;&lt;br /&gt;That being said, NLP does have its place: &lt;a href=&quot;http://www.powerset.com/&quot;&gt;making the best fucking Wikipedia search engine there ever was with technology licensed from Xerox and then selling yourself to Microsoft&lt;/a&gt;.&lt;br /&gt;  
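&lt;br /&gt;&lt;br /&gt;To make the &quot;word frequencies and logarithms&quot; idea concrete, here is a rough sketch of an extractive summarizer that ranks sentences by TF*IDF and keeps the top few.&amp;nbsp; It is not the code from the project.&amp;nbsp; The names &lt;code&gt;tokenize&lt;/code&gt; and &lt;code&gt;summarize&lt;/code&gt; are invented here, and treating each sentence as its own &quot;document&quot; for the IDF part is an assumption of the sketch:&lt;br /&gt;&lt;br /&gt;

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def summarize(sentences, keep=1):
    """Return the `keep` highest-scoring sentences, in their original order."""
    token_lists = [tokenize(sentence) for sentence in sentences]
    total = len(sentences)

    # Document frequency: in how many sentences does each word appear?
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    def score(tokens):
        if not tokens:
            return 0.0
        counts = Counter(tokens)
        # TF is the word's share of the sentence; IDF is log(N / df).
        # Words that show up in every sentence get an IDF of zero.
        return sum(
            (count / len(tokens)) * math.log(total / doc_freq[word])
            for word, count in counts.items()
        )

    ranked = sorted(
        range(total),
        key=lambda i: score(token_lists[i]),
        reverse=True,
    )[:keep]
    # Put the winners back in reading order.
    return [sentences[i] for i in sorted(ranked)]


if __name__ == "__main__":
    text = [
        "the cat sat",
        "the cat ran",
        "the quarterly revenue exploded",
    ]
    print(summarize(text, keep=1))
```

&lt;br /&gt;No tagger, no Markov chain, no licensing headaches, and it is a couple dozen lines.&amp;nbsp; Whether the output is good enough is a separate question, but you find that out in an afternoon.&lt;br /&gt;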
</content>
 </entry>
 
 <entry>
   <title>I Have Never Seen Ubuntu Get Upgrades Right</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/11/i-have-never-seen-ubuntu-get-u.html"/>
   <updated>2008-11-01T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/11/i-have-never-seen-ubuntu-get-u</id>
   <content type="html">Ubuntu upgrades always, always, &lt;i&gt;always&lt;/i&gt; fuck up the same things: network connections and graphics drivers.&amp;nbsp; Without fail, if you upgrade, your wireless connection won't work and any closed-source video card drivers you need will get ill.&amp;nbsp; 8.10 Intrepid Ibex is no exception.&lt;br /&gt;&lt;br /&gt;I figured out how to get my wireless connection back, but the NVidia drivers are still a mystery.&amp;nbsp; I don't care about free software idealism, I care that my shit &lt;i&gt;works&lt;/i&gt;.&amp;nbsp; I'm willing to jump through minor hoops to make it work, like Ubuntu's &quot;restricted drivers&quot; lecture, but now that doesn't even work:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;intrepid-fucked.png&quot; src=&quot;/teddziuba/images/intrepid-fucked.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;573&quot; width=&quot;505&quot; /&gt;&lt;br /&gt;&lt;br /&gt;Why is this such a pain in the balls?&amp;nbsp; I press the activate button and nothing happens. AWESOME. Just because I'm used to wasting hours fixing things like this doesn't mean I enjoy it.&lt;br /&gt;&lt;br /&gt;NVidia has 70-ish percent market share of all GPUs.&amp;nbsp; If your shit doesn't work out of the box on 70% of the graphics cards out there, who has failed?&lt;br /&gt;&lt;br /&gt;&lt;b&gt;[Update, the next day]:&lt;/b&gt;&amp;nbsp; I found the fix.&amp;nbsp; Navigate to System &amp;gt; Administration &amp;gt; Synaptic Package Manager.&amp;nbsp; From there, go to Settings &amp;gt; Repositories.&amp;nbsp; In the &quot;Ubuntu Software&quot; tab, check the &quot;Proprietary device drivers&quot; box.&amp;nbsp; Or edit /etc/apt/sources.list if you want to show your chest hair.&lt;br /&gt;&lt;br /&gt;I'm glad to see that passive-aggressive Debian superiority is alive and well.&amp;nbsp; &lt;br /&gt; 
</content>
 </entry>
 
 <entry>
   <title>Auto Mechanics: A Good Hobby for Programmers</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/10/auto-mechanics-a-good-hobby-fo.html"/>
   <updated>2008-10-19T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/10/auto-mechanics-a-good-hobby-fo</id>
   <content type="html">&lt;img alt=&quot;i-disagree.jpg&quot; src=&quot;/teddziuba/images/i-disagree.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;230&quot; width=&quot;307&quot; /&gt; &lt;div&gt;When I was a kid, I used to help my Dad work on the family cars.&amp;nbsp; We changed the oil, brake pads, repaired a broken hydraulic line, and fixed faulty air conditioning.&lt;br /&gt;&lt;br /&gt;It wasn't too long before I was able to do most basic repair and maintenance myself.&amp;nbsp; In college, I spent hours fixing an electrical problem that caused my rear turn signals to go out.&lt;br /&gt;&lt;br /&gt;Recently, my car started to make a weird &quot;coughing&quot; noise from the muffler, and I fixed that, too.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;This Is Going Somewhere I Swear&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Why is this relevant to programming?&amp;nbsp; Well, one of my favorite parts about programming production services is forensic debugging.&amp;nbsp; Some process crashes in the middle of the night and all you're left with is a beeper going apeshit and a stack trace.&amp;nbsp; What went wrong?&amp;nbsp; How do you debug that?&lt;br /&gt;&lt;br /&gt;Debugging a car is the same thing, but a lot harder.&amp;nbsp; With a car, the symptoms of the bug aren't usually very concrete: a funny noise, a bad smell, a jittery feeling.&amp;nbsp; Compared to a computer, a car is a very simple machine, but because it's so simple it's much harder to debug.&amp;nbsp; Newer cars have an electronic interface to tell you what sensors are indicating faults, but that doesn't always solve the problem.&amp;nbsp; Plus, the sensor readers are like $300.&lt;br /&gt;&lt;br /&gt;The more you work on a car, the more you develop an intuition about it.&amp;nbsp; In code, you can narrow your bug down and fix it.&amp;nbsp; With a car, you narrow the fault down to a couple of suspect parts and start by replacing the cheapest 
one.&amp;nbsp; For me, fixing a car problem is much more gratifying than fixing a code problem because of the tangibility of it.&lt;br /&gt;&lt;br /&gt;So, if you enjoy debugging and problem solving, you'd probably like auto mechanics.&amp;nbsp; There are a couple of collateral upshots to it: you save some money, you can give your friends car advice, and you get to buy a bunch of really awesome tools.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;No one can pull your man card when you have a specialized wrench for an O2 sensor.&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Most Programming Interviews are a Waste of Time</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/10/most-programming-interviews-ar.html"/>
   <updated>2008-10-07T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/10/most-programming-interviews-ar</id>
   <content type="html">&lt;img alt=&quot;dr-seuss-wtf-is-this-shit.jpg&quot; src=&quot;/teddziuba/images/dr-seuss-wtf-is-this-shit.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;266&quot; width=&quot;191&quot; /&gt; &lt;div&gt;Interviewing a candidate is so much fun because you get to passively assert your superiority &lt;i&gt;and&lt;/i&gt; be professorial enough that you can justify those nine years you spent in graduate school studying compiler optimizations only to get a job maintaining a failure-prone database driven web app.&lt;br /&gt;&lt;br /&gt;Interviewers spend almost as much time Googling for interview questions as candidates do.&lt;br /&gt;&lt;br /&gt;I've been on both sides of the interview, and I'm here to dump a big load of truth on you about what interviewers really think.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Technical Question&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Says:&lt;/i&gt; How would you find a cycle in a singly linked list?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Thinks:&lt;/i&gt; This job has nothing to do with linked lists.&amp;nbsp; In fact, I don't think anyone has used a singly linked list since the seventies.&amp;nbsp; I wonder if you're good at PHP and MySQL, because that's what all the work is here, but I'm not going to ask you anything about actual job requirements, because that doesn't afford me the opportunity to be pathologically pedantic.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;The Follow-Up&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Says:&lt;/i&gt; And how would you refine your solution to use O(n) time and O(1) space?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Thinks:&lt;/i&gt; I haven't actually solved this problem myself, nor have I formally proven the &quot;right&quot; answer to be correct. 
I have done no preparation for this interview beyond looking on the internet for programming interview questions.&amp;nbsp; I'm basically dead wood in this organization, and I pray every day that nobody figures this out.&amp;nbsp; As such, if you come up with the answer quickly, I'll either think you cheated and looked up programming interview questions on the internet, &lt;i&gt;or&lt;/i&gt; you're genuinely smart enough to expose my own uselessness should you get hired.&amp;nbsp; In either case, your best course of action here is to pretend like you don't know and let me explain the correct answer with a shit-eating grin on my face.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;The Bullshit&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Says:&lt;/i&gt; Where do you see yourself in five years?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Thinks:&lt;/i&gt; I have no idea what I'm doing.&amp;nbsp; I really need to keep this room from going dead-air, so I'll give you something to talk about.&amp;nbsp; Just start talking.&amp;nbsp; I really don't care, say anything.&amp;nbsp; I'm not listening.&amp;nbsp; I'm using this moment to think about the woman working in HR that I want to bone, but wants nothing to do with me because I'm an introverted nerd who will never work up the sack to ask her out.&amp;nbsp; Fuck my life.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;Finally It's Over&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Says:&lt;/i&gt; Do you have any questions for me?&lt;br /&gt;&lt;br /&gt;&lt;i&gt;What The Interviewer Thinks:&lt;/i&gt; We're 35 minutes through a 45 minute interview.&amp;nbsp; If this doesn't take up ten minutes, I can blame ending the interview early on my clock being fast.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;So What Is A Good Interview?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;For once, I'd like to have a 45 minute candid conversation with an interviewee.&amp;nbsp; Talk about interests, shoot the breeze, and get a general idea of the guy's 
technical aptitude and fit with the company.&amp;nbsp; Talk about projects, see if the guy gets animated.&amp;nbsp; Combine that with some pre-submitted code samples, and you can get a genuine idea of how suited the candidate is.&lt;br /&gt;&lt;br /&gt;I don't know about you, but I would not want to work for or with somebody who is this passive-aggressive.&lt;br /&gt;&lt;br /&gt;Asking pedantic and useless questions like this is just a waste of everyone's time.&lt;br /&gt;&lt;/div&gt; 
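&lt;br /&gt;&lt;br /&gt;For reference, the stock answer to the linked-list question above is Floyd's tortoise-and-hare algorithm: walk two pointers at different speeds and see whether they ever meet.&amp;nbsp; A minimal sketch in Python.&amp;nbsp; &lt;code&gt;Node&lt;/code&gt; and &lt;code&gt;has_cycle&lt;/code&gt; are names invented for this example:&lt;br /&gt;&lt;br /&gt;

```python
class Node:
    """A singly linked list node; `next` is None at the end of the list."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def has_cycle(head):
    """Floyd's tortoise-and-hare check: O(n) time, O(1) extra space."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next          # one step
        fast = fast.next.next     # two steps
        if slow is fast:
            # The fast pointer can only catch the slow one inside a loop.
            return True
    # The fast pointer fell off the end, so the list terminates.
    return False


if __name__ == "__main__":
    a, b, c = Node(1), Node(2), Node(3)
    a.next, b.next = b, c
    print(has_cycle(a))
    c.next = b  # close the loop: 3 points back to 2
    print(has_cycle(a))
```

&lt;br /&gt;It fits on an index card, which is roughly how much it tells you about whether someone can keep a PHP and MySQL app alive.&lt;br /&gt;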
</content>
 </entry>
 
 <entry>
   <title>Java subList Gotcha</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/09/java-sublist-gotcha.html"/>
   <updated>2008-09-13T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/09/java-sublist-gotcha</id>
   <content type="html">&lt;img alt=&quot;yogi.jpg&quot; src=&quot;/teddziuba/images/yogi.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;240&quot; width=&quot;320&quot; /&gt; &lt;div&gt;Don't ever try to do anything sneaky when you're programming.&amp;nbsp; It will always bite you in the ass.&amp;nbsp; If you still want to be sneaky, read the documentation.&lt;br /&gt;&lt;br /&gt;Last week, we had a problem with one of our processes hanging and burning 100% CPU.&amp;nbsp; The first time it happened we chalked it up to mysteries of the universe and restarted the process (a time-honored startup tradition), but the second time, I actually got off my ass and investigated.&lt;br /&gt;&lt;br /&gt;Through the miracle of &lt;code&gt;jstack&lt;/code&gt;, I could look at the stack trace of a currently running Java process.&amp;nbsp; This is what I found:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;

&lt;pre&gt;&quot;main&quot; prio=10 tid=0x0a030800 nid=0x10a6 runnable [0xb7d7d000..0xb7d82218]
   java.lang.Thread.State: RUNNABLE
     at java.util.SubList$1.nextIndex(AbstractList.java:713)
     at java.util.SubList$1.nextIndex(AbstractList.java:713)

...snip... about 100 lines

     at java.util.SubList$1.hasNext(AbstractList.java:691)
     at java.util.SubList$1.next(AbstractList.java:695)
     at java.util.SubList$1.next(AbstractList.java:696)

...snip... about 100 lines

     at com.pressflip.pipeline.standard.deduper.ShingleDupeDetector.&lt;br /&gt;        dedupBatch(ShingleDupeDetector.java:139)
     at com.pressflip.pipeline.standard.deduper.DeduperPipelineStep.&lt;br /&gt;        innerProcess(DeduperPipelineStep.java:115)

... and right down to the main() from here.&lt;br /&gt;&lt;/pre&gt;

&lt;p&gt;The suspect line in all this is ShingleDupeDetector.java:139, which is one of those &lt;i&gt;how-the-hell-are-you-hanging-on-this&lt;/i&gt; lines:&lt;/p&gt;

&lt;pre&gt;for (Integer x : someCollectionOfIntegers) {
&lt;/pre&gt;

&lt;p&gt;So what the shit, right?&lt;/p&gt;&lt;p&gt;I was using this collection as a cache of sorts, where on every run, I chopped some data off the front of it and added some data to the back, keeping the collection size constant.&amp;nbsp; To accomplish this, I used the &lt;code&gt;subList&lt;/code&gt; method on &lt;code&gt;java.util.List&lt;/code&gt;, something like this:&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
&lt;pre&gt;someCollectionOfIntegers = someCollectionOfIntegers.subList(fromIndex,
                                                    someCollectionOfIntegers.size());
someCollectionOfIntegers.addAll(incoming);
&lt;/pre&gt;

&lt;p&gt;Well it turns out that &lt;code&gt;subList&lt;/code&gt; didn't do what I thought it did.&amp;nbsp; I assumed that I just got a new &lt;code&gt;List&lt;/code&gt; that contained the elements in the given range of the original.&amp;nbsp; Oh no, &lt;code&gt;subList&lt;/code&gt; returns a &lt;i&gt;view&lt;/i&gt; of the original list where only elements in the given range are addressable.&amp;nbsp; A look at &lt;code&gt;AbstractList.java&lt;/code&gt;'s source reveals this:&lt;/p&gt;

&lt;pre&gt; 
public List&amp;lt;E&amp;gt; subList(int fromIndex, int toIndex) {
        return new SubList&amp;lt;E&amp;gt;(this, fromIndex, toIndex);
}
&lt;/pre&gt;

&lt;p&gt;And the &lt;code&gt;SubList&lt;/code&gt; object keeps a reference to &lt;code&gt;this&lt;/code&gt;, as well as an offset to know where iteration starts.&amp;nbsp; Every time I updated the &quot;cache&quot;, I wrapped the previous &lt;code&gt;SubList&lt;/code&gt; in a new one, so iterating over it meant recursing through every earlier view, one more layer of calls per update.&amp;nbsp; Oh, balls.&amp;nbsp; That's why it's running slow.&lt;br /&gt;&lt;/p&gt; 
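&lt;p&gt;A toy model of the mechanism, sketched in Python rather than Java.&amp;nbsp; &lt;code&gt;BackingList&lt;/code&gt;, &lt;code&gt;SubListView&lt;/code&gt; and the two &lt;code&gt;rotate_*&lt;/code&gt; helpers are invented for this illustration and are not the JDK's classes.&amp;nbsp; The model assumes, as in the snippet above, that every view runs to the end of its parent:&lt;/p&gt;

```python
class BackingList:
    """Plays the part of the original ArrayList."""

    def __init__(self, items):
        self.items = list(items)

    def get(self, index):
        return self.items[index]

    def size(self):
        return len(self.items)

    def append(self, value):
        self.items.append(value)

    def depth(self):
        return 0


class SubListView:
    """Toy model of a subList view that runs to the end of its parent."""

    def __init__(self, parent, offset):
        self.parent = parent    # the list, or an earlier view, underneath
        self.offset = offset    # where this window starts in the parent

    def get(self, index):
        # Each read is forwarded down one layer.
        return self.parent.get(index + self.offset)

    def size(self):
        return self.parent.size() - self.offset

    def append(self, value):
        # Writes go through to the parent as well.
        self.parent.append(value)

    def depth(self):
        return 1 + self.parent.depth()


def rotate_with_view(cache, drop, incoming):
    """What the buggy code did: keep the view and write through it."""
    view = SubListView(cache, drop)
    for value in incoming:
        view.append(value)
    return view


def rotate_with_copy(cache, drop, incoming):
    """The fix: copy the surviving elements into a fresh list."""
    kept = [cache.get(i) for i in range(drop, cache.size())]
    return BackingList(kept + list(incoming))


if __name__ == "__main__":
    viewed = BackingList([1, 2, 3])
    copied = BackingList([1, 2, 3])
    for n in range(4, 104):
        viewed = rotate_with_view(viewed, 1, [n])
        copied = rotate_with_copy(copied, 1, [n])
    # Layers of indirection a single get() has to cross in each version.
    print(viewed.depth(), copied.depth())
```

&lt;p&gt;Both versions hold the same three elements after every run, but the view version adds a layer each time and never lets go of the list at the bottom.&amp;nbsp; In Java, the equivalent of &lt;code&gt;rotate_with_copy&lt;/code&gt; is to pass the result of &lt;code&gt;subList&lt;/code&gt; to the &lt;code&gt;ArrayList&lt;/code&gt; copy constructor before adding anything to it.&lt;/p&gt;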
</content>
 </entry>
 
 <entry>
   <title>A Web OS?  Are You Dense?</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/09/a-web-os-are-you-dense.html"/>
   <updated>2008-09-06T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/09/a-web-os-are-you-dense</id>
   <content type="html">People are calling Google Chrome a &quot;Web Operating System&quot; and a &quot;Cloud Operating System&quot;.&amp;nbsp; Some are even calling it a Windows killer.&lt;br /&gt;&lt;br /&gt;I think it's time to nip this horseshit in the bud, before it gets out of hand.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;How Does Arringtons Know What Operating Systems Is?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;He doesn't.&amp;nbsp; It is TechCrunch's official position that Google Chrome will compete full on with Microsoft Windows, and computers will be sold with Chrome only, having the Windows layer &quot;stripped out&quot;.&amp;nbsp; I am not shitting you, &lt;a href=&quot;http://www.techcrunch.com/2008/09/01/meet-chrome-googles-windows-killer/&quot;&gt;he actually said that&lt;/a&gt;.&amp;nbsp; Yeah, I get where the argument is going about web apps being more dominant than desktop apps.&amp;nbsp; That prediction is a crock of shit.&amp;nbsp; A &lt;a href=&quot;http://www.techcrunch.com/2007/12/18/majority-of-americans-on-google-docs-what-you-talkin-bout-willis/&quot;&gt;2007 survey&lt;/a&gt; found that 73% of Americans have never even &lt;i&gt;heard&lt;/i&gt; of Google Docs, and 94% have never tried an online office suite.&amp;nbsp; Yeah, desktop apps aren't going anywhere.&lt;br /&gt;&lt;br /&gt;But I'm not here to talk shit on Web 2.0 today.&amp;nbsp; I'm going to present a glimpse of the hole that the incompetent programmers are digging for us.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;When Times Were Simple&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Let's have a look at the application stack that we all know and love: programs compiled to run in an environment with a C library.&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;normal-cropped.gif&quot; src=&quot;/teddziuba/images/normal-cropped.gif&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;245&quot; width=&quot;520&quot; /&gt;&lt;br /&gt;&lt;br /&gt;Fuck me, 
life is good.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Making It Easier On Programmers&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;I first learned to program in C++ and then later on I learned Java in college.&amp;nbsp; I thought the whole Java Runtime Environment thing was kind of weak, but if it means I don't have to manage memory, that's cool.&amp;nbsp; Same goes for Python, Ruby, and whatever else has its own VM or interpreter.&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;runtime-cropped.gif&quot; src=&quot;/teddziuba/images/runtime-cropped.gif&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;290&quot; width=&quot;527&quot; /&gt;&lt;br /&gt;&lt;br /&gt;This situation is pretty agreeable, and lets us prototype applications rapidly.&amp;nbsp; Sure, there's a small trade-off with execution speed, but they have multi-gigahertz processors nowadays.&amp;nbsp; No big deal.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Making It Easier On Idiots&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;After a while, everybody wanted to be a programmer.&amp;nbsp; Since programming is actually kind of hard, many of these folk landed in PHP and HTML, hence the explosion of webapps.&amp;nbsp; As such, the browser became a feeble example of a &quot;runtime&quot;.&lt;br /&gt;&lt;br /&gt;Now, with Google Chrome being lauded as a Web Operating System, the stack gets way bigger.&amp;nbsp; This is what it looks like on my computer, considering I run Linux and Google hasn't released their Operating System for the Linux Operating System (that makes sense, doesn't it?)&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;chrome-os2.gif&quot; src=&quot;/teddziuba/images/chrome-os2.gif&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;480&quot; width=&quot;640&quot; /&gt;&lt;br /&gt;&lt;br /&gt;Users have pretty basic needs when it comes to computers.&amp;nbsp; They want word processing, 
spreadsheets, communications, and games.&amp;nbsp; These needs have not changed much since the advent of the personal computer.&amp;nbsp; So, when your aunt asks why her 1.2GHz computer isn't fast enough to run an online word processor that has the same fucking features as the 1987 version of WordPerfect, you don't have an answer for her.&amp;nbsp; There is no justification.&lt;br /&gt;&lt;br /&gt;The &quot;Web Operating System&quot; just highlights how much journalists don't know about computers. 
</content>
 </entry>
 
 <entry>
   <title>Porter Stemming Makes Me Rage</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/07/porter-stemming-makes-me-rage.html"/>
   <updated>2008-07-23T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/07/porter-stemming-makes-me-rage</id>
   <content type="html">&lt;img alt=&quot;not_illegal_in_thailand.jpg&quot; src=&quot;/teddziuba/images/not_illegal_in_thailand.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;267&quot; width=&quot;334&quot; /&gt; &lt;div&gt;I have no formal training in natural language processing.&amp;nbsp; As such, I figure out a lot of this shit on my own.&lt;br /&gt;&lt;br /&gt;One of the simplest concepts in NLP/text mining is stemming.&amp;nbsp; If you're not in the know, to stem a word is to remove all the unnecessary shit after its root.&lt;br /&gt;&lt;br /&gt;For example, &quot;computer&quot;, &quot;computing&quot; and &quot;compute&quot; all stem to &quot;comput&quot;.&amp;nbsp; Same root, virtually the same meaning.&lt;br /&gt;&lt;br /&gt;Something like this is clearly useful in a search engine like Pressflip, because if somebody searches for &quot;iphone&quot; (and a &lt;i&gt;lot&lt;/i&gt; of you people are), the engine should pull up documents that contain the plural (iphones) of the word.&lt;br /&gt;&lt;br /&gt;The canonical algorithm for doing this sort of thing is called the Porter Stemming Algorithm, which considers each word on its own.&amp;nbsp; Porter works great 99% of the time, but when it fails, it fucks you &lt;i&gt;hard&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Why You Keep Tryin To Say That Word?&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A good example of this comes from the Pressflip query logs.&amp;nbsp; A user searched for &quot;marketing&quot;.&amp;nbsp; Perfectly reasonable.&amp;nbsp; Porter stemmed that to &quot;market&quot;, which returned a bunch of search results about the Dow Jones and Nasdaq.&amp;nbsp; Ouch. 
Right in the butt.&lt;br /&gt;&lt;br /&gt;What went wrong?&amp;nbsp; In smart-talk, the bare infinitive that corresponds to the gerund has a different meaning than the gerund.&amp;nbsp; Again, I know dick-shit about NLP, so maybe you guys have a serious-business name for this sort of thing.&lt;br /&gt;&lt;br /&gt;So yeah, gerunds make Porter suck sometimes.&lt;br /&gt;&lt;br /&gt;There are some other failure cases I've discovered.&amp;nbsp; Proper nouns will give it to you Clydesdale-style, too.&amp;nbsp; More specifically, proper nouns that don't stem to themselves.&amp;nbsp; Example: &quot;Mariners&quot; and &quot;Marin&quot; share the same stem.&amp;nbsp; So potentially, someone searching for the baseball team from Seattle will come up with news about the hoity-toity county across the Golden Gate Bridge from San Francisco.&lt;br /&gt;&lt;br /&gt;What's the answer to this?&amp;nbsp; If you're a company with millions in VC lottery winnings, you can pay Basistech $100,000 for a 3-year license of their context-sensitive stemmer.&amp;nbsp; If you're me, though, you make exclusion lists.&amp;nbsp; Big ones.&lt;br /&gt;&lt;br /&gt;That being said, after a large re-processing this weekend, Pressflip search quality is going to improve.&lt;br /&gt;&lt;/div&gt; 
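&lt;div&gt;&lt;br /&gt;An exclusion list can be as simple as a set lookup in front of the stemmer.&amp;nbsp; The following is a rough sketch under assumptions, not the Pressflip code: &lt;code&gt;ExclusionStemmer&lt;/code&gt; is a made-up name, and the stemmer is passed in as a function because no real Porter implementation is shown here.&amp;nbsp; The one in &lt;code&gt;main&lt;/code&gt; is a toy suffix stripper, not Porter.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;pre&gt;import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

public class ExclusionStemmer {
    private final Set&lt;String&gt; exclusions;
    private final UnaryOperator&lt;String&gt; stemmer;

    public ExclusionStemmer(Set&lt;String&gt; exclusions, UnaryOperator&lt;String&gt; stemmer) {
        this.exclusions = exclusions;
        this.stemmer = stemmer;
    }

    // Words on the exclusion list come back lowercased but unstemmed;
    // everything else goes through the wrapped stemmer.
    public String stem(String word) {
        String key = word.toLowerCase(Locale.ROOT);
        return exclusions.contains(key) ? key : stemmer.apply(key);
    }

    public static void main(String[] args) {
        // Toy stand-in that strips a trailing &quot;ing&quot; or &quot;ers&quot;. NOT the Porter algorithm.
        UnaryOperator&lt;String&gt; toy = w -&gt; w.replaceAll(&quot;(ing|ers)$&quot;, &quot;&quot;);
        Set&lt;String&gt; exclusions = new HashSet&lt;&gt;(Arrays.asList(&quot;marketing&quot;, &quot;mariners&quot;));
        ExclusionStemmer s = new ExclusionStemmer(exclusions, toy);
        System.out.println(s.stem(&quot;marketing&quot;)); // prints marketing
        System.out.println(s.stem(&quot;computing&quot;)); // prints comput
    }
}
&lt;/pre&gt;&lt;div&gt;The same list has to be applied at index time and at query time, otherwise excluded words in queries will never match the stemmed terms in the index.&lt;br /&gt;&lt;/div&gt; 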
</content>
 </entry>
 
 <entry>
   <title>Build Google Protocol Buffers Without Maven</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/07/build-google-protocol-buffers.html"/>
   <updated>2008-07-07T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/07/build-google-protocol-buffers</id>
   <content type="html">&lt;img alt=&quot;trippin_balls.jpg&quot; src=&quot;/teddziuba/images/trippin_balls.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;208&quot; width=&quot;225&quot; /&gt;Google released &lt;a href=&quot;http://code.google.com/p/protobuf/&quot;&gt;protocol buffers&lt;/a&gt; as open source, which, with a proper transport, will give both XML-RPC and &lt;a href=&quot;http://developers.facebook.com/thrift/&quot;&gt;Thrift&lt;/a&gt; a run for their money.&lt;br /&gt;&lt;br /&gt;Anyway, it's kind of a pain in the balls to build the Java version.&amp;nbsp; If you import the Java source into Eclipse, it's got all sorts of build errors, all stemming from a missing file: &lt;code&gt;DescriptorProtos.java&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;If you've got Maven installed, it will make &lt;code&gt;DescriptorProtos.java&lt;/code&gt; for you (this file is generated via &lt;code&gt;protoc&lt;/code&gt;).&amp;nbsp; But Maven is stupid, because it didn't work immediately after &lt;code&gt;apt-get install&lt;/code&gt; and I couldn't figure out how to fix it within 30 seconds.&amp;nbsp; I have no patience for this kind of bullshit.&lt;br /&gt;&lt;br /&gt;So, to build &lt;code&gt;DescriptorProtos.java&lt;/code&gt; without Maven, you make it by hand:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;protoc --java_out=/home/ted/some_directory \&lt;br /&gt;/path/to/protobufsrc/src/google/protobuf/descriptor.proto&lt;/pre&gt;(You already compiled &lt;code&gt;protoc&lt;/code&gt;, didn't you?)&lt;br /&gt;&lt;br /&gt;Drop the output file into Eclipse and protocol buffers will build.&amp;nbsp; There are still a bunch of compilation warnings, but only chumps listen to those.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Corporate Competence</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/07/corporate-competence.html"/>
   <updated>2008-07-04T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/07/corporate-competence</id>
   <content type="html">&lt;img alt=&quot;1213940897512.jpg&quot; src=&quot;/teddziuba/images/1213940897512.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;360&quot; width=&quot;328&quot; /&gt; &lt;div&gt;I really love it when people &lt;i&gt;just do their jobs&lt;/i&gt;.&amp;nbsp; I feel gifted whenever I call a company and get a customer support representative who knows what they are doing and actually cares about me.&lt;br /&gt;&lt;br /&gt;It's rare, but it happens.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;Worst ISP Ever&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;For a while, I had Comcast's cable internet service.&amp;nbsp; It was clear after two years of putting up with their horseshit that they don't care about customers at all.&lt;br /&gt;&lt;br /&gt;Oh, wait, they set up a &lt;a href=&quot;http://twitter.com/comcastcares&quot;&gt;Twitter account&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Fantastic, but my BitTorrent shit still didn't work on their network.&amp;nbsp; Their installation staff is rude and has questionable hygiene, and their customer support representatives are downright lazy.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Switch to AT&amp;amp;T Now&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;When I moved, my first order of business was to call Comcast and tell them it's over.&amp;nbsp; They said my service wouldn't end until I brought back my cable modem, and of course, the place I needed to bring it back to is only open during working hours.&lt;br /&gt;&lt;br /&gt;I took off work early to get this little brick of dissatisfaction back to its rightful owner, because fuck them.&lt;br /&gt;&lt;br /&gt;At the same time, I was waiting for AT&amp;amp;T to show up and install U-Verse internet service.&amp;nbsp; They did, and shit was &lt;i&gt;impressive&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;They told me the tech would be at my house any time from noon to 2pm on a Sunday.&amp;nbsp; The tech showed up at noon on the 
dot.&lt;/li&gt;&lt;li&gt;It took him about an hour to set up the service.&amp;nbsp; When he left, he gave me a card with his direct cell phone number.&amp;nbsp; If I had any problem in the next ten days, I could call him directly and he would come fix it.&lt;/li&gt;&lt;li&gt;An hour after he left, the service went out.&amp;nbsp; I called him, and he was back at my house within 30 minutes.&amp;nbsp; It turns out there was something wrong with the line from the street to my house, and he had to get &lt;i&gt;another&lt;/i&gt; tech out to fix it.&amp;nbsp; That guy showed up, fixed the problem, and was on his way.&amp;nbsp; The two of them stayed at my place until 8pm on a Sunday to get the job done right.&lt;/li&gt;&lt;/ul&gt;I've been using the service for almost a week now and it's great.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;No BitTorrent fuckery.&amp;nbsp; All my torrents work great, and I can seed.&lt;/li&gt;&lt;li&gt;10 Mbps downstream, 1.5 Mbps upstream.&lt;/li&gt;&lt;/ul&gt;Great job, AT&amp;amp;T, you actually care about the people paying your salaries.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Practical Unique Identifiers</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/07/practical-unique-identifiers.html"/>
   <updated>2008-07-01T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/07/practical-unique-identifiers</id>
   <content type="html">&lt;img alt=&quot;dogs_love_md5.jpg&quot; src=&quot;/teddziuba/images/dogs_love_md5.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;230&quot; width=&quot;307&quot; /&gt;There have been a handful of places within the Persai pipeline where I have needed unique identifiers of varying length.&amp;nbsp; 64 bits here, 32 bits there.&amp;nbsp; I'm not the only one to ever have to solve this problem, but I could never find a concise toolbox of information on it.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;Automatic Increment or Not&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;MySQL has the AUTO_INCREMENT modifier for integral record keys.&amp;nbsp; That's great, if you're using MySQL.&amp;nbsp; In general, prefer a non-automatically increasing record identifier, unless you have a specific reason.&amp;nbsp; Here's why:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;You may actually have to think about thread synchronization at some point when creating records.&lt;/li&gt;&lt;li&gt;If these identifiers become publicly visible, they can leak information about how many records are in your database.&lt;/li&gt;&lt;li&gt;If you make identifiers out of other pieces of data (say URLs), then you can't get the identifier value of a given datum without a table lookup.&amp;nbsp; And even then, you'll need another index on &lt;i&gt;that&lt;/i&gt; field.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;There are a few cases where automatic increment identifiers are good, though:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;You are using a MySQL database and are setting up a simple structure of tables. (i.e. 
MySQL handles synchronization for you and it's actually harder to &lt;i&gt;not&lt;/i&gt; use automatic increment)&lt;/li&gt;&lt;li&gt;The creation order of records is really important to you, but not important enough to store a timestamp field.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;b&gt;Making an Identifier Out Of Arbitrary Data&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Easy, right?&amp;nbsp; Just hash whatever data you've got.&amp;nbsp; It's not reversible and it's spread uniformly over the identifier space.&amp;nbsp; However, many times the output of a standard hashing algorithm is too big.&amp;nbsp; SHA-1, for example, is 160 bits wide.&amp;nbsp; Way too long for most purposes.&lt;br /&gt;&lt;br /&gt;In this case, I truncate the output.&amp;nbsp; Yes, this is mathematically valid, because any good hashing algorithm's output will be uniform over the range of the function.&amp;nbsp; And by uniform, I mean really uniform.&amp;nbsp; For example, if you take the first 64 bits of a 160-bit SHA-1 hash and call that your unique identifier, the identifiers are going to be spread uniformly over the space of all 64-bit numbers, so collisions are no more likely than they would be for random 64-bit values.&amp;nbsp; If they weren't (i.e. the first 64 bits of a SHA-1 hash were distributed, say, normally), then the hash function would be cryptographically insecure.&lt;br /&gt;&lt;br /&gt;Don't try to swing your dick around and come up with your own hash function.&amp;nbsp; You'll screw it up.&amp;nbsp; I know I have.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;
GUIDs&lt;/b&gt;&lt;br /&gt;
&lt;br /&gt;
I got an e-mail from a reader about using GUIDs for unique identifiers.&amp;nbsp; This fits with the hashing scheme, but for the most part, I think GUIDs are far too large, especially if you are storing a lot of records.&amp;nbsp; GUIDs are 128 bits wide, so if you have a hundred million records, that's about 1.6GB worth of identifiers (a hundred million times 16 bytes).&amp;nbsp; Use a 64-bit identifier, and your space is halved, without a significant increase in collision probability: by the birthday bound, a hundred million random 64-bit identifiers have roughly a 0.03% chance of containing any collision at all.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Making An Identifier Easier On The Eyes&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;If you need to put a unique identifier in a URL, it can't look too nerdy.&amp;nbsp; For example, this URL looks like shit:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;code&gt;http://www.website.com/document?id=1b25a53bf21d0206&lt;/code&gt;&lt;br /&gt;&lt;/blockquote&gt;Too many numbers.&amp;nbsp; So, to make it look better, Base-64 encode it.&amp;nbsp; It will lengthen the code a little, but it's much easier to look at:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;code&gt;http://www.website.com/document?id=ZnJvc3RlZCBidXR0cw==&lt;/code&gt;&lt;br /&gt;&lt;/blockquote&gt;Eh, well it looks better to me.&amp;nbsp; Personal taste, I guess.&lt;br /&gt;&lt;br /&gt;You'll need to make sure that your Base-64 alphabet doesn't include the + and / characters: they aren't URL safe.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Sort Orderings&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Don't worry about sort ordering unless you have to worry about sort ordering.&amp;nbsp; Duh.&amp;nbsp; The vast majority of Persai's data is stored simply as files, and for most purposes we don't have to care about the processing order.&amp;nbsp; We're fortunate in that regard (well maybe not &lt;i&gt;fortunate&lt;/i&gt;, I mean that's like saying you're &lt;i&gt;fortunate&lt;/i&gt; that you're not fat because you exercise and eat sensibly).&lt;br /&gt;&lt;br /&gt;Anyway, there are a couple of places in Persai where sort order matters.&amp;nbsp; The ordering of 
recommendations, for example.&amp;nbsp; There, though, we're just ordering by time, and we need to display the exact time, not just the relative times of the recommendations, so we store a date field and order data by it in the store.&lt;br /&gt;&lt;br /&gt;This drives one of my earlier points home: &lt;i&gt;if you need ordering by time, don't count on an automatic increment unique identifier to do it&lt;/i&gt;.&amp;nbsp; It's much more robust to store a timestamp.&lt;br /&gt;&lt;br /&gt;In fact this point goes deeper.&amp;nbsp; Very rarely do you actually need records sorted by record identifier.&amp;nbsp; What you need is the records sorted by some other value that happens to be reflected in the record identifier by virtue of automatic increment and the insertion order.&amp;nbsp; It's always more robust to store the actual value you need to sort by.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;I'm Not Going To Tell You How To Write Code&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Because I don't really care.&amp;nbsp; This is how I do it, though.&lt;br /&gt; &lt;div&gt;&lt;br /&gt;&lt;/div&gt; 
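&lt;div&gt;&lt;br /&gt;For anyone who wants something concrete, here is a minimal sketch of the truncate-and-encode idea in Java.&amp;nbsp; It assumes Java 8 or later (for &lt;code&gt;java.util.Base64&lt;/code&gt;), and &lt;code&gt;ShortId&lt;/code&gt;, &lt;code&gt;id64&lt;/code&gt; and &lt;code&gt;urlSafe&lt;/code&gt; are illustrative names, not anything from the Persai code.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;pre&gt;import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;

public class ShortId {
    // First 8 bytes (64 bits) of the SHA-1 digest of the input string.
    public static byte[] id64(String data) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance(&quot;SHA-1&quot;);
            byte[] digest = sha1.digest(data.getBytes(StandardCharsets.UTF_8));
            return Arrays.copyOf(digest, 8);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException(e);
        }
    }

    // URL-safe Base64: '-' and '_' replace '+' and '/', and '=' padding is dropped.
    public static String urlSafe(byte[] id) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(id);
    }

    public static void main(String[] args) {
        byte[] id = id64(&quot;http://www.website.com/some/document&quot;);
        System.out.println(urlSafe(id));
    }
}
&lt;/pre&gt;&lt;div&gt;A 64-bit identifier comes out as 11 characters this way, and the same input always produces the same identifier, so no table lookup is needed to find the key for a given URL.&lt;br /&gt;&lt;/div&gt; 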
</content>
 </entry>
 
 <entry>
   <title>An Engineer's Guide To Weight Loss</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/05/an-engineers-guide-to-weight-l.html"/>
   <updated>2008-05-20T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/05/an-engineers-guide-to-weight-l</id>
   <content type="html">&lt;img alt=&quot;1208418061380.jpg&quot; src=&quot;/teddziuba/images/1208418061380.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;249&quot; width=&quot;200&quot; /&gt;After a year of the Google cafeteria and another year of eating low-cost Startup FounderChow, I put on a few pounds.&amp;nbsp; Now, I'm starting to shed them, and I'll tell you how.&lt;br /&gt;&lt;br /&gt;Before I get into it, I want to lay down a few prerequisites.&amp;nbsp; There are a lot of diet guides out there that will bullshit you into thinking that the process is easy.&amp;nbsp; This is a lie.&amp;nbsp; &lt;b&gt;Dieting and exercising suck.&amp;nbsp; This is possibly the most miserable thing you can do to yourself.&lt;/b&gt;&amp;nbsp; You are not going to have fun.&lt;br /&gt;&lt;br /&gt;To that end, if you are more than 50 pounds overweight, are unmarried, have no children, and your only reason to get up in the morning is your shitty software job, the healthy lifestyle is not for you.&amp;nbsp; You are better off eating yourself to the grave: you will get much more satisfaction out of life by eating cheeseburgers than you will by torturing the pounds of fat off your gut.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Simple I/O Operation&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The science of weight loss is simple: eat fewer calories than you burn.&amp;nbsp; You have heard this before, I trust.&amp;nbsp; To follow this principle, you need to start quantifying.&amp;nbsp; I use a web service called &lt;a href=&quot;http://www.fitday.com/&quot;&gt;FitDay&lt;/a&gt; to track the calories I eat versus the calories I burn.&lt;br /&gt;&lt;br /&gt;Start by running a 1,000 calorie per day deficit.&amp;nbsp; To lose a pound of fat, you need to burn around 3,500 calories, so you'll lose two pounds in a week.&amp;nbsp; Just to be clear, &lt;b&gt;doing this sucks ass&lt;/b&gt;.&amp;nbsp; However, there are a few easy ways to 
trim calories here and there.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Go Easy On The Drinking&lt;/i&gt;&lt;br /&gt;When I go out on a Friday night to have a few beers, it's not hard for me to consume 800 calories worth of booze.&amp;nbsp; Yes, liquor helps to numb the pain of writing XML parsers all day, but it comes at an expense.&amp;nbsp; To compensate, take up smoking.&amp;nbsp; I smoke more cigars now: it's a good zero-calorie alternative.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Eat One Serving&lt;/i&gt;&lt;br /&gt;There are eight servings in a box of Barilla pasta.&amp;nbsp; I used to eat half a box of pasta in a single sitting, 4 servings worth.&amp;nbsp; Since you're counting your calories anyhow, you'll already be monitoring servings.&amp;nbsp; You will also spend less money this way: since I started counting my calories, I've been spending 50% less per week on food.&amp;nbsp; More money for cigars.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Drink More Coffee&lt;/i&gt;&lt;br /&gt;Caffeine is an appetite suppressant.&amp;nbsp; In large enough quantities, it can be used as an amphetamine.&amp;nbsp; Drink up.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;Ice Cream Keeps You Sane&lt;/i&gt;&lt;br /&gt;Low-fat ice cream has around 120 calories per half cup.&amp;nbsp; Fat calories keep you feeling satiated for longer.&amp;nbsp; The Dreyer's brand (sold as Edy's on the east coast) doesn't suck that much.&lt;br /&gt;&lt;br /&gt;After you have been limiting your calorie intake for two weeks, your stomach will shrink enough that it takes significantly less food to satisfy you.&amp;nbsp; So that's step one: stop eating so damn much.&amp;nbsp; Step two is exercise.&amp;nbsp; And yes, it's awful.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;See What Condition My Condition Was In&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The only real &lt;i&gt;benefit&lt;/i&gt; to exercise is being able to hold your nose over people who don't exercise.&amp;nbsp; That's pretty cool if you're looking to take your misery out on your 
co-workers.&amp;nbsp; Protip: it's better for your general well-being to be a prick to your colleagues than to your family.&lt;br /&gt;&lt;br /&gt;You will lose more weight by dieting than by exercise if you are eating 1,000 fewer calories per day than you burn by doing nothing, so use exercise only as a supplement to your calorie loss.&lt;br /&gt;&lt;br /&gt;If you're going to exercise, use an elliptical machine.&amp;nbsp; Treadmills are terrible: they make you run.&amp;nbsp; If you're like me, you have horrific flashbacks of being 10 years old, sucking wind, being the one that got nailed by the cops because your friends were all physically fit and managed to get away.&amp;nbsp; Failure.&lt;br /&gt;&lt;br /&gt;If you're going to pussyfoot around and work out for 30 minutes in your &quot;fatburn&quot; zone three times a week, don't even bother.&amp;nbsp; You're just wasting your time.&amp;nbsp; One hour per day, hard.&amp;nbsp; You should be close to vomiting by the end of that hour.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Well, That's It&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Easy, huh.&amp;nbsp; Stop eating so damn much and get off your fat lazy ass.&lt;br /&gt; 
</content>
 </entry>
 
 <entry>
   <title>Machine Learning Is Not As Cool As It Sounds</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/05/machine-learning-is-not-as-coo.html"/>
   <updated>2008-05-14T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/05/machine-learning-is-not-as-coo</id>
   <content type="html">I don't like to talk about my job.&amp;nbsp; Don't get me wrong, I like what I do, I just don't like having to explain things to people who are feigning interest.&amp;nbsp; It's a waste of everyone's time.&lt;br /&gt;&lt;br /&gt;When I do have to go into more detail than &quot;I write software&quot;, I sex it up by saying &quot;I write artificial intelligence software for recommendation systems&quot;.&amp;nbsp; Sounds pretty awesome when you say it like that, huh?&amp;nbsp; &lt;br /&gt;&lt;br /&gt;Truthfully, that's like describing a summer job at Burger King as &quot;caloric energy distribution engineer&quot;.&lt;br /&gt;&lt;br /&gt;Yes, one of the things I do is implement machine learning methods for a news recommendation system.&amp;nbsp; The prerequisite amount of pain-in-the-ass, why-did-I-go-to-college-for-this work, though, dominates the cool stuff.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Vector Space Model AI... Sounds Hot&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The idea here is that you turn your data into N-dimensional vectors and let loose a bunch of linear algebra on that shit.&amp;nbsp; In return, you get stuff like classification and clustering.&amp;nbsp; If you want to sound like you know what you're talking about here, you can mention stuff like &lt;i&gt;separating hyperplane&lt;/i&gt;, &lt;i&gt;sigmoid kernel function&lt;/i&gt;, or &lt;i&gt;k-means++&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;I deal mostly in the vector space model.&amp;nbsp; As awesome as all of this sounds, most of the work is a real pain in the balls.&amp;nbsp; Writing a sequential minimal optimization routine for a support vector machine is a good exercise, but it's not useful in practice.&amp;nbsp; Somebody else has already written it for me, and besides, that's not the problem I need to be concentrating on.&lt;br /&gt;&lt;br /&gt;Most of the methods that deal with VSM machine learning are well defined and fairly easy to implement.&amp;nbsp; What remains a mystery, though, is 
the generation of the input.&amp;nbsp; How you translate your data into vectors is &lt;b&gt;the most important problem to solve&lt;/b&gt;.&amp;nbsp; It's also the most boring. After that, you can worry about shaving 3 nanoseconds off of your dotproduct routine.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Then You Need To Deal With The Academics&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Publish or perish.&amp;nbsp; Yeah, that's cute, but in the real world it's profit or perish, and that means getting useful results.&amp;nbsp; Academics love machine learning because it affords them the opportunity to make slight variations to undergraduate level mathematical procedures, quantify the result, and write it up in LaTeX with graphs and shit.&lt;br /&gt;&lt;br /&gt;It's easy to write a paper showing the effects on precision and recall for a perceptron classifier on the Reuters corpus using normalized vs. non-normalized vectors.&amp;nbsp; It's not easy to generate data as clean as the Reuters corpus from a web crawl.&amp;nbsp; Not only is this task hard, it's about as much fun as chemotherapy.&amp;nbsp; As such, there are no useful papers coming out of academia about how to parse HTML.&amp;nbsp; Unfortunately, problems like these are the ones that need solving.&lt;br /&gt;&lt;br /&gt;When I talked earlier about all of the prerequisite bullshit, this is what I meant.&amp;nbsp; You get the most testicular pain when dealing in text content, and the real deep-rooted ball ache comes from web content.&amp;nbsp; We put a ton of effort into our HTML parsing routines, and it has paid off.&lt;br /&gt;&lt;br /&gt;For reference, altering a method that helped with removing boilerplate content from a web page (&lt;i&gt;boring&lt;/i&gt;) had a greater benefit to the accuracy of our classifier than did dimensionality reduction and normalization combined (&lt;i&gt;sexy&lt;/i&gt;).&amp;nbsp; If you're not picking up what I'm putting down here, I'm saying that the really hard and less &lt;i&gt;science-y&lt;/i&gt; 
improvements made the machine learning better than any of the shit you would read about in an ACM journal.&lt;br /&gt;&lt;b&gt;&lt;br /&gt;This Is Going Somewhere I Promise&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This really is just a buttsore blog post, but I'm on a roll now.&lt;br /&gt;&lt;br /&gt;When I am working with the machine learning part of my job, I am rarely working in my development environment.&amp;nbsp; Most of the real stuff is done in Excel.&amp;nbsp; Well, at least it used to be, until I figured out that &lt;a href=&quot;http://www.r-project.org/&quot;&gt;GNU R&lt;/a&gt; is so awesome it makes me want to fuck myself up with a chainsaw.&lt;br /&gt;&lt;br /&gt;When I make a change to the inputs of a machine learning method (support vector machine in this case), I need to verify that the change I just made was actually positive.&amp;nbsp; And since that can't be done with a JUnit test, I have to get all scientific-method on that shit.&amp;nbsp; Remember in college when you snoozed through advanced statistics because it sucked?&amp;nbsp; Yeah, me too.&amp;nbsp; Good thing I kept the book.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>Weekend Science Project</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/05/weekend-science-project.html"/>
   <updated>2008-05-10T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/05/weekend-science-project</id>
   <content type="html">I built an HD antenna today.&amp;nbsp; Using &lt;a href=&quot;http://www.metacafe.com/watch/762088/coat_hanger_hdtv_antenna_better_than_store_bought_amazing/&quot;&gt;these instructions&lt;/a&gt; and less than $15 worth of materials, I can get a few local channels in HD over the air.&amp;nbsp; Check this sucker out:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;antenna.JPG&quot; src=&quot;/teddziuba/images/antenna.JPG&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;346&quot; width=&quot;461&quot; /&gt;&lt;br /&gt;&lt;div&gt;Ugly as hell, but I can watch Lost in HD without paying Comcast an extra dime.&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>JavaFX Native Look and Feel</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/05/javafx-native-look-and-feel.html"/>
   <updated>2008-05-10T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/05/javafx-native-look-and-feel</id>
   <content type="html">I have been toying around with JavaFX, Sun's answer to Adobe AIR and Microsoft Silverlight.&amp;nbsp; Since JavaFX is pretty much an easy way to do Swing, you can get Swing's pluggable look and feel in JavaFX programs.&amp;nbsp; Thank Christ, because the Swing UI components look like shit:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;demo-swing.png&quot; src=&quot;/teddziuba/images/demo-swing.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;90&quot; width=&quot;512&quot; /&gt;&lt;br /&gt; &lt;div&gt;Versus the GTK look and feel:&lt;br /&gt;&lt;br /&gt;&lt;img alt=&quot;demo-gtk.png&quot; src=&quot;/teddziuba/images/demo-gtk.png&quot; class=&quot;mt-image-center&quot; style=&quot;margin: 0pt auto 20px; text-align: center; display: block;&quot; height=&quot;94&quot; width=&quot;477&quot; /&gt;&lt;br /&gt;This was done by adding the following snippet to my JavaFX code:&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;pre&gt;&lt;code&gt;
import javax.swing.UIManager;
 
 UIManager.setLookAndFeel(
            UIManager.getSystemLookAndFeelClassName()); &lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;Oh neat, native UI components. 
</content>
 </entry>
 
 <entry>
   <title>Eclipse Crashes in Ubuntu Hardy Heron</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/04/eclipse-crashes-in-ubuntu-hard.html"/>
   <updated>2008-04-26T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/04/eclipse-crashes-in-ubuntu-hard</id>
   <content type="html">I just upgraded my workstation to Hardy and got a SIGSEGV when starting Eclipse.&amp;nbsp; It appears to be a bug in the Sun JVM that ships with Hardy, and it only happens on AMD64.&lt;br /&gt;&lt;br /&gt;If you're set on using this runtime, the fix is to disable the JIT compiler by launching Eclipse with -Xint, but that comes with a severe performance penalty.&lt;br /&gt;&lt;br /&gt;The fix I used was to simply downgrade the Hardy JRE (6-06-0ubuntu1) to the Gutsy version (6-03-0ubuntu2).&amp;nbsp; You'll have to edit /etc/apt/sources.list to add a Gutsy repository.&lt;br /&gt;&lt;br /&gt;I'm pretty sure this is the Sun bug: &lt;a href=&quot;http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6614100&quot;&gt;http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6614100&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And this is the Ubuntu bug: &lt;a href=&quot;https://bugs.launchpad.net/ubuntu/+source/eclipse/+bug/174759&quot;&gt;https://bugs.launchpad.net/ubuntu/+source/eclipse/+bug/174759&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>I'm Going To Scale My Foot Up Your Ass</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/04/im-going-to-scale-my-foot-up-y.html"/>
   <updated>2008-04-24T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/04/im-going-to-scale-my-foot-up-y</id>
   <content type="html">&lt;img alt=&quot;1205210029413.jpg&quot; src=&quot;/teddziuba/images/1205210029413.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;166&quot; width=&quot;399&quot; /&gt; &lt;div&gt;Engineers love to talk about scalability.&amp;nbsp; It makes us feel like the bad ass, dick-swingin' motherfuckers that we wish we could be.&lt;br /&gt;&lt;br /&gt;After we talk about scalability with our co-workers (&lt;i&gt;Yeah, Rails doesn't scale!&lt;/i&gt;), we flex our true engineering prowess by writing a post about it on our blog.&amp;nbsp; Once that post hits Reddit, son, everyone will know &lt;i&gt;how hardcore&lt;/i&gt; you really are.&amp;nbsp; Respect.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;People Who Talk Big About Scalability Don't Need To Worry About It&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Fact:&amp;nbsp; every chest-thumping blog post I have seen written about scalability is either about architecture, Memcached, or both.&amp;nbsp; Some asshole who writes shitty code starts pontificating about &lt;i&gt;&quot;scalable architecture&quot;&lt;/i&gt; with data storage, web frontends, whatever-the-fuck.&amp;nbsp; Dude, your app isn't having scalability problems because of the &lt;i&gt;architecture&lt;/i&gt;.&amp;nbsp; It's having scalability problems because you coded a ton of N^2 loops into it and you're too self-important to get peer reviews on your commits.&lt;br /&gt;&lt;br /&gt;And let's not forget the tools who discover Memcached for the first time, install it on a web server, and notice how fast their app runs now.&amp;nbsp; Yeah, welcome to the modern age.&amp;nbsp; Hope you know what a cache expiry policy is.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;If You Haven't Discussed Capacity Planning, You Can't Discuss Scalability&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;You don't need to worry about scalability on your Rails-over-Mysql application because nobody is going to use it.&amp;nbsp; Really.&amp;nbsp; 
Believe me.&amp;nbsp; You're going to get, at most, 1,000 people on your app, and maybe 1% of them will be 7-day active.&amp;nbsp; Scalability is not your problem, getting people to give a shit is.&lt;br /&gt;&lt;br /&gt;Unless you know what you need to scale &lt;i&gt;to&lt;/i&gt;, you can't even begin to talk about scalability.&amp;nbsp; How many users do you want your system to handle? A thousand?&amp;nbsp; Hundred thousand? Ten million?&amp;nbsp; Here's a hint: the system you design to handle a quarter million users is going to be different from the system you design to handle ten million users.&lt;br /&gt;&lt;br /&gt;Of course you'll point to the engineer's wet dream: linear scalability.&amp;nbsp; &lt;i&gt;Lulz but when we get more users we just add more machines you are so stupid ted. uncov sucks.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Yeah, great, well it doesn't exist.&amp;nbsp; Oh no, go ahead and try out Amazon SimpleDB and think to yourself that it will scale linearly.&amp;nbsp; Then, when you get enough users that the latency becomes a problem, blame it on &quot;those shitty Amazon datacenters&quot;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Choosing Technology Don't Mean Shit If You Don't Know How To Use It&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The most common butthurt about scalability is this:&amp;nbsp; choose a technology.&amp;nbsp; If you like the technology, claim &lt;i&gt;&quot;technology X scales better!&quot;&lt;/i&gt; If you don't like it, claim &lt;i&gt;&quot;technology X doesn't scale!&quot;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Saying &quot;Rails doesn't scale&quot; is like saying &quot;my car doesn't go infinitely fast&quot;.&amp;nbsp; Alternatively, saying &quot;We'll have no problems scaling because we're using Django&quot; is like saying &quot;I will win every race because my car is the most powerful&quot;.&amp;nbsp; Maybe so, but you suck at driving, and you're up against professionals.&lt;br /&gt;&lt;br /&gt;If you're having scalability problems and blaming it 
on a single technology, chances are, you're doing it wrong.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;tl;dr&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Shut up about scalability, no one is using your app anyway.&lt;br /&gt;&lt;/div&gt; 
</content>
 </entry>
 
 <entry>
   <title>Don't Serialize Java Objects In Hadoop SequenceFiles</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/04/dont-serialize-java-object-in.html"/>
   <updated>2008-04-08T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/04/dont-serialize-java-object-in</id>
   <content type="html">Not if you can avoid it, at least.&lt;br /&gt;&lt;br /&gt;Hadoop provides you with the &lt;code&gt;Writable&lt;/code&gt; interface if you want to write your object to a &lt;code&gt;SequenceFile&lt;/code&gt;.&amp;nbsp; It's up to you to implement the &lt;code&gt;write()&lt;/code&gt; and &lt;code&gt;readFields()&lt;/code&gt; methods for your object.&amp;nbsp; It's easy if your object is simple: just write each of your instance variables to a &lt;code&gt;DataOutput&lt;/code&gt; and read them back in the same order from a &lt;code&gt;DataInput&lt;/code&gt;.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Don't Write Your Object As A Serialized Byte Array&lt;br /&gt;&lt;/b&gt;&lt;br /&gt;I got lazy when I was implementing the Writable interface with one of our classes because it had a ton of instance variables.&amp;nbsp; I figured I'd just serialize it to a byte array, then write the array length and the whole array to the DataOutput.&amp;nbsp; And on the read, well, just unserialize the object from the byte array.&amp;nbsp;&amp;nbsp; This was my &lt;code&gt;write()&lt;/code&gt;:&lt;br /&gt; 

&lt;pre&gt;&lt;code&gt;
@Override
public void write(DataOutput out) throws IOException {
	ByteArrayOutputStream byteOutStream = new ByteArrayOutputStream();
	ObjectOutputStream objectOut = new ObjectOutputStream(byteOutStream);
		
	objectOut.writeObject(getContainedObject());
	objectOut.close();
		
	byte[] serializedObject = byteOutStream.toByteArray();
		
	out.writeInt(serializedObject.length);
	out.write(serializedObject);

}
&lt;/code&gt;&lt;/pre&gt;

Naw, dude.  Bad idea.&lt;br /&gt;&lt;br /&gt;I knew that I'd be paying some overhead in both space and time for this little scheme, but I didn't know how much.&amp;nbsp; It was just a little bit per object, but when we started seeing MapReductions take way too much time in I/O, it was time to revisit this.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;What This Cost In Space And Time&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;First, the Java serialization space overhead.&amp;nbsp; On a toy example of this object, serialization to a byte array used 953 bytes.&amp;nbsp; Properly writing out the instance variables consumed 296 bytes.&amp;nbsp; In production, doing it the right way shrank a 1,600-record &lt;code&gt;SequenceFile&lt;/code&gt; from 1.4GB to 825MB.&lt;br /&gt;&lt;br /&gt;Time savings were great, too.&amp;nbsp; In the same toy example, it took my JVM 7.2 milliseconds to serialize the object and 1.7 milliseconds to unserialize.&amp;nbsp; Doing it with stream I/O only took 76,000 nanoseconds (0.076 milliseconds) to serialize and 58,000 nanoseconds (0.058 milliseconds) to unserialize.&lt;br /&gt;&lt;br /&gt;I love order-of-magnitude improvements.&lt;br /&gt;&lt;br /&gt;Lesson learned: get off your lazy ass and do it right.&lt;br /&gt; 
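&lt;br /&gt;For anyone who wants to see the difference without a Hadoop cluster, here is a rough sketch using only &lt;code&gt;java.io&lt;/code&gt;.&amp;nbsp; The &lt;code&gt;Sample&lt;/code&gt; class, its three fields, and the &lt;code&gt;SerializationSizes&lt;/code&gt; helper are made up for illustration; they are not the class from our codebase, and nothing here touches the real &lt;code&gt;Writable&lt;/code&gt; interface.&amp;nbsp; It only mimics the shape of &lt;code&gt;write()&lt;/code&gt; and &lt;code&gt;readFields()&lt;/code&gt; against &lt;code&gt;DataOutput&lt;/code&gt; and &lt;code&gt;DataInput&lt;/code&gt;, then compares the byte count with stock Java serialization.&lt;br /&gt;&lt;br /&gt;

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical record; the field names are made up for illustration.
class Sample implements Serializable {
    int id;
    double weight;
    String label;

    Sample(int id, double weight, String label) {
        this.id = id;
        this.weight = weight;
        this.label = label;
    }

    // Same shape as a Writable write(): one call per instance variable.
    void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeDouble(weight);
        out.writeUTF(label);
    }

    // Same shape as a Writable readFields(): read back in the order written.
    void readFields(DataInput in) throws IOException {
        id = in.readInt();
        weight = in.readDouble();
        label = in.readUTF();
    }
}

class SerializationSizes {
    // Bytes produced by writing the fields one at a time.
    static byte[] fieldByField(Sample s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            s.write(out);
            out.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Bytes produced by stock Java serialization of the whole object.
    static byte[] javaSerialized(Sample s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(s);
            out.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuild a Sample from the field-by-field bytes.
    static Sample readBack(byte[] data) {
        try {
            Sample s = new Sample(0, 0.0, "");
            s.readFields(new DataInputStream(new ByteArrayInputStream(data)));
            return s;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

&lt;br /&gt;Call &lt;code&gt;fieldByField()&lt;/code&gt; and &lt;code&gt;javaSerialized()&lt;/code&gt; on the same object and compare the array lengths.&amp;nbsp; The exact gap depends on the class, so treat this as a way to measure your own objects, not as a reproduction of the numbers above.&lt;br /&gt;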
</content>
 </entry>
 
 <entry>
   <title>Plugged in the New York Times</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/03/plugged-in-the-new-york-times.html"/>
   <updated>2008-03-20T00:00:00-07:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/03/plugged-in-the-new-york-times</id>
   <content type="html">Me, Persai, and Uncov got a &lt;a href=&quot;http://www.nytimes.com/2008/03/20/technology/personaltech/20basics.html&quot;&gt;plug in the New York Times&lt;/a&gt; today.&amp;nbsp; We've been in VentureBeat, Slate, and now NYT, but not TechCrunch.&amp;nbsp; Something tells me that's not an accident.  
</content>
 </entry>
 
 <entry>
   <title>A Magic Elixir</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/02/a-magic-elixir.html"/>
   <updated>2008-02-27T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/02/a-magic-elixir</id>
   <content type="html">&lt;img alt=&quot;piglet.jpg&quot; src=&quot;/teddziuba/images/piglet.jpg&quot; class=&quot;post-lead-image&quot; style=&quot;margin: 0pt 0pt 20px 20px; float: right;&quot; height=&quot;253&quot; width=&quot;338&quot; /&gt;In software, there are no silver bullets.&amp;nbsp; In internal combustion engine mechanics, however, there are plenty. And I just discovered one.&lt;br /&gt;&lt;br /&gt;It's called &lt;a href=&quot;http://www.amazon.com/Sea-Foam-Marine-Motor-Treatment/dp/B0002ZVMQO&quot;&gt;Sea Foam&lt;/a&gt;, and it will cure what ails 'ya.&lt;br /&gt;&lt;br /&gt;My wife's first motorcycle was a Honda Rebel 250.&amp;nbsp; She upgraded too late in the season and couldn't sell the starter before winter showed up.&amp;nbsp; Winter time is a dead zone for the used motorcycle market in the San Francisco Bay Area, so the Rebel sat in the parking garage for 5 months.&amp;nbsp; Being lazy, I didn't properly store it.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;We went to fire it up yesterday to prepare it for its 15 minutes of Craigslist fame, and it wouldn't turn over.&amp;nbsp; Gasoline, if left long enough, will degrade into a mucky varnish that cakes the inside of your carburetors.&lt;br /&gt;&lt;br /&gt;I poured half a can of Sea Foam into the tank and let it sit for a few minutes.&amp;nbsp; I cranked it again and it made a few pathetic putts.&amp;nbsp; A few more cranks, a few more putts, but after about 5 tries, the Rebel roared to life.&lt;br /&gt;&lt;br /&gt;A six dollar bottle of some petroleum distillate has the same end effect as a three hundred dollar carburetor job.&lt;br /&gt;&lt;br /&gt;I am detecting much win in this sector.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>Core Dumps Disabled By Default In Ubuntu</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/02/core-dumps-disabled-by-default.html"/>
   <updated>2008-02-19T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/02/core-dumps-disabled-by-default</id>
   <content type="html">Enable them with this command:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;ulimit -c unlimited&lt;/code&gt;&lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>The Road To Hell Is 64 Bits Wide</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/02/the-road-to-hell-is-64-bits-wi.html"/>
   <updated>2008-02-14T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/02/the-road-to-hell-is-64-bits-wi</id>
   <content type="html">Java is an awesome language because you get to ignore hard stuff like memory allocation.&amp;nbsp; Write once, run anywhere.&amp;nbsp; Sweet, where do I sign up?&lt;br /&gt;&lt;br /&gt;The privilege of not having to manage memory comes at a cost: you aren't allowed to question how the JVM works.&amp;nbsp; Move along, coder.&amp;nbsp; Keep making those objects.&amp;nbsp; Don't ask how much memory things take up. In fact, to keep you from getting curious, we're not even going to have a &lt;code&gt;sizeof&lt;/code&gt; function.&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;&lt;br /&gt;How My Complacency Made Me Fail&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;When you're coding in Java, it's easy to buy into this mentality.&amp;nbsp; You never really have to worry about how much space anything takes up, and if you get an &lt;code&gt;OutOfMemoryError&lt;/code&gt;, just give the JVM more memory.&amp;nbsp; Problem solved.&lt;br /&gt;&lt;br /&gt;But there are times when you need to be conscious of how the JVM actually works.&amp;nbsp; For example, when you're trying to squeeze every bit of performance out of the crappiest of machines, such as the small Amazon EC2 instances (where Persai is hosted).&lt;br /&gt;&lt;br /&gt;Here's a real world example of how I got burned:&lt;br /&gt;&lt;br /&gt;Persai does a lot of work with high-dimensionality, sparse vectors.&amp;nbsp; To save space, we compact the vectors.&amp;nbsp; Since most of the values in the vectors are zero, we simply do not store them.&amp;nbsp; What we store amounts to a list of the nonzero element indices and the corresponding values.&amp;nbsp; This is our basic data structure:&lt;br /&gt;&lt;br /&gt; 
&lt;pre&gt;class sparseNode {
    public int index;
    public double value;
}
&lt;/pre&gt;
So a vector is an array of &lt;code&gt;sparseNode&lt;/code&gt; objects. Sounds easy enough, and for a while, it was.&amp;nbsp; That is, until I was tasked with storing as many of these in memory as I could.&lt;br /&gt;&lt;br /&gt;Where's the fail here?&amp;nbsp; How big is an &lt;code&gt;int&lt;/code&gt; primitive in Java? 32 bits, right?&amp;nbsp; Sort of.&amp;nbsp; The Java specification says that an implementation must provide 32 bits of workable space for the programmer using an &lt;code&gt;int&lt;/code&gt;, but makes no mention of how the virtual machine must store this variable.&lt;br /&gt;&lt;br /&gt;In Sun's HotSpot JVM, object storage is aligned to the nearest 64-bit boundary.&amp;nbsp; On top of this, every object has a 2-word header in memory.&amp;nbsp; The JVM's word size is usually the platform's native pointer size.&amp;nbsp; Alright, two words for the object header, one word for the &lt;code&gt;int&lt;/code&gt;, two words for the double.&amp;nbsp; That's 5 words, or 160 bits with 32-bit words.&amp;nbsp; Because of the alignment, this object will occupy 192 bits of memory.&amp;nbsp; Effectively, the &lt;code&gt;int&lt;/code&gt; value is taking 64 bits!&amp;nbsp; In an array of these things, I've wasted N times 32 bits.&amp;nbsp; Figure, a typical vector is about 200 elements long, so that's 800 bytes out the window for each one.&lt;br /&gt;&lt;br /&gt;This fun fact would have been good to know when I was doing my initial back of the envelope calculation of how many vectors I can fit in a gigabyte of memory.&lt;br /&gt;&lt;br /&gt;Yes, I know I should be complaining about the same thing when using C structs.&amp;nbsp; But you know what?&amp;nbsp; When you learn C, you are introduced to many harsh realities.&amp;nbsp; When you learn Java, you are introduced to XML.&amp;nbsp; They protect you from the hard things.&amp;nbsp; Live and learn, I guess.&lt;br /&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;The Fix&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;
&lt;br /&gt;Once I knew this, it was easy to rescue myself.&amp;nbsp; Java primitive arrays are subject to the same alignment issue, but in our case, they can help solve the problem.&amp;nbsp; Instead of representing a vector as an &lt;i&gt;array of objects&lt;/i&gt;, we'll represent a vector as an &lt;i&gt;object of arrays&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;

&lt;pre&gt;class sparseVector {
  public int[] indices;
  public double[] values;
}
&lt;/pre&gt;

This way, we're going to lose at most 4 bytes per vector with the alignment of the &lt;code&gt;int[]&lt;/code&gt; array.&amp;nbsp; This sure beats the ~800 byte loss with the other solution.&lt;br /&gt; 
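&lt;br /&gt;The back-of-the-envelope math above is easy to turn into a few lines of code.&amp;nbsp; This is only a sketch of the arithmetic in this post: the 8-byte header (two 32-bit words) and the 8-byte alignment are the figures quoted above, not something the code measures, and the class and method names are made up.&amp;nbsp; Real object layouts vary by JVM and platform.&lt;br /&gt;&lt;br /&gt;

```java
class ObjectSizeEstimate {
    // Round a byte count up to the next multiple of the alignment.
    static long alignUp(long bytes, long alignment) {
        return ((bytes + alignment - 1) / alignment) * alignment;
    }

    // Bytes lost to padding in one sparseNode:
    // header + 4-byte int + 8-byte double, rounded up to 8 bytes.
    static long paddingPerNode(long headerBytes) {
        long used = headerBytes + 4 + 8;
        return alignUp(used, 8) - used;
    }

    // Padding summed over every node in one vector.
    static long paddingPerVector(long elements, long headerBytes) {
        return elements * paddingPerNode(headerBytes);
    }
}
```

&lt;br /&gt;With the numbers from this post, &lt;code&gt;paddingPerVector(200, 8)&lt;/code&gt; comes out to 800 bytes, which matches the loss figured above.&lt;br /&gt;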
</content>
 </entry>
 
 <entry>
   <title>Why Machine Learning Isn't Getting Any Easier</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/02/why-machine-learning-isnt-gett.html"/>
   <updated>2008-02-02T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/02/why-machine-learning-isnt-gett</id>
   <content type="html">I learned the basics of machine learning in college.&amp;nbsp; Classifiers, clustering, all that jazz.&amp;nbsp; Every undergrad computer science major loves this crap because they can understand it with a passing knowledge of linear algebra.&amp;nbsp; &lt;i&gt;Machine learning&lt;/i&gt;.&amp;nbsp; Sounds pretty sexy.&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;&lt;br /&gt;In Practice, It's A Lot Harder Than What You Did In College&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;Up front, let me say that I implemented the vast majority of the machine learning technology behind &lt;a href=&quot;http://www.persai.com/&quot;&gt;Persai&lt;/a&gt;.&amp;nbsp; In the beginning, I thought it was going to be a breeze.&amp;nbsp; Take some documents, turn them into features, train a classifier, and off you go.&amp;nbsp; The harsh reality is that less than one percent of the time I spent on this system went into the &quot;sexy part&quot; of machine learning, and most of &lt;i&gt;that&lt;/i&gt; was done by the guy who wrote the SVM library we use!&lt;br /&gt;&lt;br /&gt;The lion's share of time, and the source of most of the hair-pulling, was spent dealing with the data.&amp;nbsp; Data coming in off of the open internet is &lt;i&gt;dirty&lt;/i&gt;.&amp;nbsp; Conflicting character set declarations, boilerplate removal, duplicate detection: these things will drive you to insanity and back.&lt;br /&gt;&lt;br /&gt;I was fooled by the simplicity when I first learned this stuff.&amp;nbsp; This is what they teach you about vector space model based classifiers:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Turn your data into vectors.&lt;/li&gt;&lt;li&gt;Specify the positive and negative samples.&lt;/li&gt;&lt;li&gt;Train your classifier.&lt;/li&gt;&lt;li&gt;Tune your vectorization scheme and classifier parameters until the classifier is good.&lt;/li&gt;&lt;/ol&gt;What they don't teach you is this: &lt;b&gt;Step 1 is a 
&lt;i&gt;bitch&lt;/i&gt;&lt;/b&gt;.&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;&lt;br /&gt;Publish Or Perish&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;If turning data into vectors is such a hard problem, why aren't the academics churning out papers about it?&amp;nbsp; Because it's not sexy.&amp;nbsp; There are no numerical nuances to deciding how to handle a document whose declared character set is ISO-8859-1 but is actually encoded in UTF-8.&amp;nbsp; There's no Turing award coming your way for finding a way to make reasonable text out of horrifically malformed HTML that makes you curse Firefox and Internet Explorer for accepting as renderable.&lt;br /&gt;&lt;br /&gt;When I started Persai, I admitted that somebody else has already done the mathematical programming better than I ever could.&amp;nbsp; I didn't spend years of my life studying numerical analysis, so chances are, if I attempted to write my own SVM library, I would fail.&amp;nbsp; So, in the interest of success and avoiding Not-Invented-Here syndrome, I used somebody else's library.&lt;br /&gt;&lt;br /&gt;People have busted my chops for this, too, as if I am somehow less of an engineer if I use a third-party library.&amp;nbsp; However, one thing has become painfully obvious: &lt;i&gt;the quality of a classifier depends much, much more on your ability to sanitize data than on the algorithm you use&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;  
</content>
 </entry>
 
 <entry>
   <title>Hermetic RPC Unit Testing With Thrift and jMock</title>
   <link href="http://widgetsandshit.com/teddziuba/2008/02/hermetic-rpc-unit-testing-with.html"/>
   <updated>2008-02-01T00:00:00-08:00</updated>
   <id>http://widgetsandshit.com/teddziuba/2008/02/hermetic-rpc-unit-testing-with</id>
   <content type="html">Unit testing is a pain in the ass.&amp;nbsp; I will admit it, I hate doing it.&amp;nbsp; More often than not, you just write a few obvious JUnit tests that you know will pass and say you're finished.&lt;br /&gt;&lt;br /&gt;Testing code that makes RPC calls is especially discouraging.&amp;nbsp; You'll say &lt;i&gt;&quot;I can't unit test it, it needs to set up an RPC server and that's too complicated for JUnit&quot;&lt;/i&gt;, or, if you're like me, you won't even make up an excuse.&lt;br /&gt;&lt;br /&gt;Of course, this laziness comes back to bite you when the code goes into production, the RPC server throws a one-in-a-million exception, and your entire service bites the dust because you never tested that execution path.&lt;br /&gt;&lt;br /&gt;So, given that you don't like to be woken up at 3AM by sysops when you have been out drinking all night, let's unit test our RPC clients.&amp;nbsp; Let's do it without having to start up an RPC server when the test runs, and it would be nice to be able to have fine-grained control over the RPC methods.&lt;br /&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;&lt;font style=&quot;font-size: 1em;&quot;&gt;She's Thrifty - She's Just My Type&lt;/font&gt;&lt;br /&gt;&lt;/b&gt;&lt;/font&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 0.8em;&quot;&gt;This is the Thrift RPC definition we will be using for this example:&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;&lt;/font&gt;
&lt;pre&gt;service MyRPCService {
  i64 getDocidForUrl(1: string url),
}&lt;/pre&gt;

Simple.  We'll be looking up a 64-bit integral document identifier for a given URL.  Our client code will make a decision about the state of the document given that identifier.&lt;br /&gt;&lt;br /&gt;This is the class we will be testing:&lt;br /&gt;&lt;br /&gt;

&lt;pre&gt;public class ProgramToTest {

	// class constants
	private static final int RPC_SERVER_PORT = 3141;
	private static final String RPC_SERVER_HOST = &quot;rpcserver.widgetsandshit.com/teddziuba&quot;;
	private static final long DOCID_IS_OLD_IF_LESS_THAN = 1000;
	public static enum DocumentStatus { OLD, NEW, UNKNOWN };
	
	// instance variables
	private MyRPCService.Iface myRpc;
	private TSocket socket;
	
	public ProgramToTest() {}
	
	public void init() throws TTransportException {
		socket = new TSocket(RPC_SERVER_HOST, RPC_SERVER_PORT);
		TProtocol protocol = new TBinaryProtocol(socket, true, true);
		myRpc = new MyRPCService.Client(protocol);
		socket.open();
	}
	
	public Enum&amp;lt;DocumentStatus&amp;gt; getDocumentStatus(String documentUrl) {
		try {
			long docId = myRpc.getDocidForUrl(documentUrl);
			if (docId &amp;lt; DOCID_IS_OLD_IF_LESS_THAN) {
				return DocumentStatus.OLD;
			}
			return DocumentStatus.NEW;
		} catch (TException e) {
			return DocumentStatus.UNKNOWN;
		}
	}
	
	public void finished() {
		socket.close();
	}
	
}
&lt;/pre&gt;

If I were still in CS class in college, I would get dinged for having multiple &lt;code&gt;return&lt;/code&gt; statements, but the best part about being a grown up is that when I want a cookie, I can have a cookie.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;

The &lt;code&gt;getDocumentStatus&lt;/code&gt; method is really the only thing we need to test, as clients of this class will be responsible for dealing with a &lt;code&gt;TTransportException&lt;/code&gt; if the socket initialization fails.  The unfortunate part about testing that method is that it makes an RPC call. &lt;i&gt;Sockets. Exceptions. Icky.&lt;/i&gt;  Even though it's easier to say screw it and go have a beer, remember: &lt;b&gt;&lt;i&gt;you gotta do what you gotta do&lt;/i&gt;&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;Making a Mockery&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.jmock.org/&quot;&gt;JMock&lt;/a&gt; is a clever unit testing library that makes mock objects really easy.&amp;nbsp; If you're new to mock objects, read more about them &lt;a href=&quot;http://www.mockobjects.com/&quot;&gt;here&lt;/a&gt;.&amp;nbsp; The basic idea is that we will make an object that &quot;mocks&quot; the behavior of the RPC server, but without doing any I/O.&amp;nbsp; That way, we have complete control over the operations of the server, and can actually test how our client code interacts with that one-in-a-million exception.&lt;br /&gt;&lt;br /&gt;We'll be mocking out the &lt;code&gt;MyRPCService.Iface&lt;/code&gt; interface that is autogenerated by Thrift, and defining our own behavior for it.  If you've got some experience with JMock, this should be pretty straightforward, and if not, then you'll catch on quick.  JMock's syntax focuses on making the testing conditions human readable.&lt;br /&gt;&lt;br /&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;Prepare The Class For Testing&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;

Since we will be providing the &lt;code&gt;ProgramToTest&lt;/code&gt; class with a mocked version of this interface, we need to add a constructor to the class for testing only:&lt;br /&gt;&lt;br /&gt;

&lt;pre&gt;public ProgramToTest(MyRPCService.Iface testOnlyIface) {
	this.myRpc = testOnlyIface;
}
&lt;/pre&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;

JUnit.&amp;nbsp; We In It.&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;We'll test the low-hanging fruit first.&amp;nbsp; Using our mock to control the return value of the RPC call, we can make sure the logic works:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;

&lt;pre&gt;@Test
public void testHandlesOldDocid() throws TException {
	final MyRPCService.Iface mockedRpc = context.mock(MyRPCService.Iface.class);
	ProgramToTest testObject = new ProgramToTest(mockedRpc);
		
	final long rpcCallReturnValue = 100L;
	final String testUrl = &quot;http://www.widgetsandshit.com/teddziuba/&quot;;
		
	context.checking(new Expectations() {
		{
			one(mockedRpc).getDocidForUrl(with(equal(testUrl)));
			  will(returnValue(rpcCallReturnValue));
		}
	});
				
	assertEquals(ProgramToTest.DocumentStatus.OLD,
                               testObject.getDocumentStatus(testUrl));
}
&lt;/pre&gt;

That is pretty cool.&amp;nbsp; Without a whole lot of effort, we've managed to make a unit test for a method that depends on an RPC server.&amp;nbsp; This test does not require any network I/O and runs very quickly.&amp;nbsp; It can be run in a self-contained environment, like an automated test server.&amp;nbsp; I call this kind of test &lt;i&gt;hermetic&lt;/i&gt;, because nothing outside of the test code can affect its outcome.&lt;br /&gt;&lt;br /&gt;We can also use JMock to test what happens when an exception is thrown.&amp;nbsp; If a Thrift RPC server throws an exception somewhere in its handler method and that exception is not caught server-side, it will be thrown up to the client as a &lt;code&gt;TException&lt;/code&gt;.&amp;nbsp; To simulate this, we simply change one line of the test expectations:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;

&lt;pre&gt;@Test
public void testHandlesException() throws TException {
	final MyRPCService.Iface mockedRpc = context.mock(MyRPCService.Iface.class);
	ProgramToTest testObject = new ProgramToTest(mockedRpc);
		
	final TException rpcException = new TException(&quot;something awful has happened.&quot;);
	final String testUrl = &quot;http://www.widgetsandshit.com/teddziuba/&quot;;
	
	context.checking(new Expectations() {
		{
			one(mockedRpc).getDocidForUrl(with(equal(testUrl)));
			  will(throwException(rpcException));
		}
	});
				
	assertEquals(ProgramToTest.DocumentStatus.UNKNOWN,
                                testObject.getDocumentStatus(testUrl));
}
&lt;/pre&gt;&lt;font style=&quot;font-size: 1.25em;&quot;&gt;&lt;b&gt;

Go And Do Likewise&lt;/b&gt;&lt;/font&gt;&lt;br /&gt;&lt;br /&gt;JMock is an incredibly useful library.&amp;nbsp; If you're a lazy tester like me, it beats the pants off of subclassing.&amp;nbsp; Now, you have no excuse for leaving RPC calls untested.&lt;br /&gt; 
&lt;br /&gt;
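&lt;br /&gt;For contrast, this is roughly what the hand-rolled version of the same two tests looks like with no mocking library at all.&amp;nbsp; Everything in it is a stand-in invented for this sketch: &lt;code&gt;DocidLookup&lt;/code&gt; plays the part of the Thrift-generated &lt;code&gt;MyRPCService.Iface&lt;/code&gt;, &lt;code&gt;LookupException&lt;/code&gt; plays &lt;code&gt;TException&lt;/code&gt;, and &lt;code&gt;StatusChecker&lt;/code&gt; plays &lt;code&gt;ProgramToTest&lt;/code&gt;.&amp;nbsp; None of it is real Thrift or JMock API.&amp;nbsp; It is still hermetic, you just have to write one stub class per behavior.&lt;br /&gt;&lt;br /&gt;

```java
// Hypothetical stand-ins for the Thrift-generated types; not the real Thrift API.
class LookupException extends Exception {
    LookupException(String message) {
        super(message);
    }
}

interface DocidLookup {
    long getDocidForUrl(String url) throws LookupException;
}

enum Status { OLD, NEW, UNKNOWN }

// Same decision logic as getDocumentStatus, with the lookup injected.
class StatusChecker {
    private static final long DOCID_IS_OLD_IF_LESS_THAN = 1000;
    private final DocidLookup lookup;

    StatusChecker(DocidLookup lookup) {
        this.lookup = lookup;
    }

    Status getDocumentStatus(String url) {
        try {
            long docId = lookup.getDocidForUrl(url);
            // Operands flipped; this is the same check as "docId is below the threshold".
            return DOCID_IS_OLD_IF_LESS_THAN > docId ? Status.OLD : Status.NEW;
        } catch (LookupException e) {
            return Status.UNKNOWN;
        }
    }
}

// Stub that always answers with a fixed document id, no I/O involved.
class FixedLookup implements DocidLookup {
    private final long id;

    FixedLookup(long id) {
        this.id = id;
    }

    public long getDocidForUrl(String url) {
        return id;
    }
}

// Stub that always fails, standing in for the one-in-a-million server error.
class FailingLookup implements DocidLookup {
    public long getDocidForUrl(String url) throws LookupException {
        throw new LookupException("something awful has happened.");
    }
}
```

&lt;br /&gt;Hand &lt;code&gt;StatusChecker&lt;/code&gt; a &lt;code&gt;FixedLookup&lt;/code&gt; or a &lt;code&gt;FailingLookup&lt;/code&gt; and assert on the result, same as the JMock tests above.&amp;nbsp; It works, but every new behavior costs you another class, which is the part JMock saves you from.&lt;br /&gt;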
</content>
 </entry>
 
 
</feed>