Tag Archives: geek

Open source stats – but what do the numbers *mean*?

I recently sent a report to project management containing some numbers that purport to describe the status of the RDO project.

I got a long and thoughtful response from one of the managers – we’ll call him Mark – and it seems worthwhile sharing some of his insights. To summarize, what he said was, don’t bother collecting stats if they don’t tell a story.

1. Focus on the goals

Listing a bunch of numbers without context – even with pretty graphs – doesn’t tell us anything unless you relate them to goals that we’re trying to achieve.

Several weeks ago I presented a “stakeholder review” to this same audience. Any statistics that I present in the future should relate directly to a goal in that review. Otherwise they are just meaningless numbers, possibly a distraction, and, worse still, might cause people to work towards growing the wrong metric. (Google for “be careful what you measure” and read any of those articles for more commentary on this point.)

2. Focus on the people

One of the stats that I provided was about how certain words and phrases feature in the questions on ask.openstack.org. Mark looked beyond the numbers and saw three people who are very active on that website, two of whom are not obviously engaged in the RDO community itself. Why not? How can we help them? How can they help us? What’s their story? Why are we ignoring them?

3. Focus on the blips

In February, our Twitter mentions, retweets, visits, and so on, went through the roof. Why? And why didn’t we do that same thing again in March?

As it turns out, in February there were two conferences that contributed to this. But, specifically, we captured a lot of video at those events, and the Twitter traffic was all around those videos. So clearly we should be doing more of that kind of content, right?

4. Ignore the stuff that doesn’t seem to mean anything

We track “downloads” of RDO, which roughly speaking means every time someone runs the quickstart and it grabs the RPM. Except RDO is on a mirror network, so that number is false – or, at best, it reflects what the trends might be across the rest of the mirror network. So we have no idea what this metric means. So why are we bothering to track it? Just stop.

5. Ask not-the-usual-suspects

This last one wasn’t one of Mark’s observations, but is what I’m taking from this interaction. We tend to ask the same people the same questions year after year, and then are surprised that we get the same answers.

By taking this data to a new audience, I got new answers. Seems obvious, right? But it’s the kind of obvious thing we overlook all the time. Mark provided insight that I’ve been overlooking because I’m staring so hard at the same things every day.

By the way, I’ve presented Mark’s insight very bluntly here, because it’s important to be clear and honest about the places where we’re not doing our job as well as we can be. Mark’s actual response was much kinder and less judgmental, because Mark is always kind and supportive.


Reposting from an email I sent a while back:

As several people have asked about my todo list within the last 2 weeks, I thought I’d share the goodness with everyone.

I’ve been using todo.txt for about a year now. http://todotxt.com/

Don’t let the website fool you. todo.txt isn’t (primarily) a GUI app, or a phone app. The todo list is in a plain text file. There are a dozen different tools that you can use to manage it, but I just use the command line:

t ls – what’s in my list?
t add ITEM – Adds ITEM to my todo list
t pri ## A – Makes item ## priority A
t do ## – Marks item ## as done, moves it to DONE list for later reference
t ls blarg – Lists todo items that match ‘blarg’
t lsp A – Show me all the things that are priority A
done – An alias to ‘cat ~/Dropbox/todo/done.txt’ which shows me what I’ve done most recently
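
Because the list is just a plain text file, ordinary shell tools can read it too. The following is a minimal sketch, not how the todo.txt tools themselves are implemented: the sample items and the temp file are made up for illustration, and the only thing it relies on is the todo.txt convention that a prioritized item starts with its priority in parentheses, such as “(A) ”.

```shell
# Hypothetical sample list in a temp file; a real list lives wherever you keep todo.txt.
todo_file="$(mktemp)"
cat > "$todo_file" <<'EOF'
(A) Send stats report
(B) Edit interview videos
Renew conference tickets
EOF

# Roughly what 't lsp A' shows: numbered lines that start with the "(A) " marker.
priority_a="$(grep -n '^(A) ' "$todo_file")"
echo "$priority_a"

# Roughly what 't ls videos' shows: numbered lines matching a search term.
matches="$(grep -in 'videos' "$todo_file")"
echo "$matches"

# Clean up the sample file.
rm -f "$todo_file"
```

This is also why syncing the file through something like Dropbox works: every tool is just reading and writing the same lines of text.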

If you happen to store your todo list in your Dropbox directory, you can then also use the free Android app to manage your todo list from your phone. (I’ve heard it also works with Google Drive, or ownCloud, or a variety of other things.)

For someone who has used every possible todo list out there, including a dozen issue trackers, and has written a few different todo list webapps, sticking with a single tool for a whole year is unprecedented. Being able to work from the command line made all the difference for me, since that’s where I always am anyways.

OpenStack PTG, trip report

Last week, I attended the OpenStack PTG (Project Teams Gathering) in Atlanta.

Even more in depth: PTG info at https://www.openstack.org/blog/2016/05/faq-evolving-the-openstack-design-summit/


1) This is a hugely productive event, with project teams getting an enormous amount of work done without the distractions that are usually present at a conference.

2) I remain very concerned about how this event will affect the character of OpenStack Summit – removing the bulk of the engineers from that event, and making it more product/marketing/sales focused. Time will tell.

At the gathering, I did 23 interviews with Red Hat engineers about what they did in the Ocata release. You can see some of those interviews on the RDO YouTube Channel. I’m not done editing them all yet, but they will appear over the coming weeks as part of various blog posts, as well as all of them appearing in that YouTube playlist.

I am constantly blown away by the passion, expertise, and professionalism of the folks I get to work with. Wow.

Anyways, more about the PTG.

I was (and, really, still am) very skeptical about this new format. Splitting OpenStack Summit into four events rather than two has already had a significant impact on travel budgets, not just at Red Hat, but also at other companies involved in OpenStack. A lot of companies, for example, didn’t send anyone to FOSDEM, and we had a hard time staffing the OpenStack table there. Usually people work one shift at the table, but this year several of us worked 4 and 5 shifts to cover all the slots.

I am concerned that splitting the engineers off into their own event will significantly change the character of OpenStack Summit from being a community-centric, tech-centric event, to more of a sales and marketing event, light on technical depth.

But this event, for what it was intended to do, has already been amazing. Everyone is commenting on how much is getting done, how much less distracted the team meetings are, how much better the teams are gelling than they have at any previous event. This is a working event, and people are here to get work done. They are meeting all day, every day, working on plans and blueprints, and fighting out agreements on things that would take weeks in email, and everyone seems VERY pleased with outcomes.

So, perhaps the trade off will be worth it. Time will tell. Regardless, Erin Disney and her team put on an amazing event that fulfilled, and exceeded, its goals.

On Wednesday night, everyone who has ever contributed a patch to RDO was invited for drinks and hors d’oeuvres at the SideBar, and while there the RDO Ocata release announcement was sent out.

We had about 50 people in attendance, who ate and drank up all of my budget in about 2 hours.

Here’s some pictures.


Features, not lies

A colleague is attending the nginx conference in Austin this week, and shared with me several anecdotes in which a speaker preached misinformation – or if I want to be generous, grievously outdated information – about Apache httpd, to support the notion that nginx is better.

This led to the following:


Each time I have encountered nginx people at conferences, and attended their talks, they have compared nginx to grossly misconfigured, 10 year old installations of Apache httpd 2.2 to support their claim that nginx is leaner, faster, and easier to administer.

Here’s the thing. nginx is a solid project. I have zero beef with the software itself. I have used it myself, when the need arose. What I object to is the habit of the fans of nginx to lie (or exaggerate, or just spout uninformed opinions) to make themselves look better. If you must compare, compare our latest to your latest, and have experts correctly configure each. That way, each will show where it shines, and where it doesn’t.

It is possible to configure ANY software badly. This is why it’s almost always a bad idea for an expert on SoftwareA, who knows little or nothing about SoftwareB, to compare them head to head – they’ll invariably be comparing a well-configured A to a less than optimally configured B. And in the case of nginx vs Apache httpd, these guys almost always use 2.2 or 1.3 as an example of … well, all of the things that 2.4 fixed. 5 years ago.

Any intro to marketing class will tell you that you need to talk about your own strengths more than you talk about the other guy’s weaknesses. This is a message that nginx and presidential candidates seem to have missed. And, in the case of software, it’s even more important, because whereas Donald Trump will always be a monster, every time you point out a legitimate shortcoming in Apache httpd, we fix it.

Cacti and the Asus RT-N66U

I discovered a few days ago, quite by accident, that the Cacti project is still quite alive and well. I don’t know why I thought it wasn’t. I thought it would be kind of cool to set it up to graph my network traffic here at home.

I have an Asus RT-N66U wireless router, which I’ve been very pleased with since I acquired it.

Step one was getting Cacti running, which has always been something of a challenge. The installation instructions, while extensive, miss a lot of prerequisites that you encounter along the way. (Install packages are not, apparently, available for CentOS 7.) Notably, you have to install Spine (the stats collection daemon) from source, and it requires the -devel version of several of the items that the docs mention. I needed things like php-devel, mysql-devel, and snmp-utils, none of which are mentioned in the installation instructions. No big deal, but finding and installing these prerequisites did make the process a little longer.

Step two was getting file permissions set up correctly in my Apache httpd document directory. This turned out to be a combination of missing directories (log/ and rra/ in Cacti’s home directory) and the fact that my vanilla installation of php had logging turned off, so everything just silently failed. Once those directories were created and their ownership changed to the newly created cactiuser user, Cacti itself started running. Awesome.
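
For the record, here is a sketch of that directory fix as a shell function. The function name is hypothetical, and the /var/www/html path in the usage comment is an assumption based on where poller.php lives in the crontab entry further down; adjust it to wherever your Cacti install actually is.

```shell
# Create the log/ and rra/ directories Cacti expects and hand them to the given user.
# $1 = Cacti home directory, $2 = owning user (cactiuser in my case).
prepare_cacti_dirs() {
    cacti_home="$1"
    owner="$2"
    mkdir -p "$cacti_home/log" "$cacti_home/rra" &&
        chown -R "$owner" "$cacti_home/log" "$cacti_home/rra"
}

# Example, run as root (path is an assumption, see above):
# prepare_cacti_dirs /var/www/html cactiuser
```

Changing ownership to another user needs root, which is why the example call is left commented out.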

Step three is enabling SNMP on the router, which isn’t hard, but is a little time consuming. Instructions for doing this may be found on the My gap in the void blog, and I will not copy them here.

Finally, there’s the step of getting Cacti to talk to the N66U. This turned out to be absurdly easy. Under ‘devices’, I clicked Add. I gave it the name and IP address of the router, and selected “Generic SNMP-enabled Host” from the Host Template dropdown, and pressed ‘Create’.

On the server, in the cactiuser’s crontab, add:

*/5 * * * * php /var/www/html/poller.php > /dev/null 2>&1

Then, click ‘New Graphs’, select the “select all” checkbox, and press ‘Create’.

Finally, under ‘Graph Management’, select the ‘select all’ checkbox, select ‘Place on a tree (default tree)’, and press go.

And, you’re done. Wait a few hours for data to accumulate.


Server Not Found

Up until about 9 years ago, I had a server rack in my home office. At one point, there were as many as 12 servers running on it, running websites, DNS services, email, NNTP, and a variety of other services.

I blogged about this a while back.


As time went on, I started to recognize the benefits of running services in the cloud. This meant everything from moving email to GMail, to running web and DNS on a hosted server at Rackspace (then known as SliceHost).

Eventually, I had no computers in the house at all, except for my work-issued laptop.

A little while ago, I started to miss my servers, for many reasons. I still run some mail services on my VPS at Rackspace, to do things like mail aliasing for a variety of domains, and of course I still run all my own web servers there. But there’s something about having a server that you have to physically maintain that keeps your skills going in ways that you just don’t have to when it’s out there somewhere.

A year or so ago, I found a refurb Dell machine for next to nothing, and put CentOS 7.1 on it. That was released in April, so it must have been about then. For a while I didn’t do much with it, other than test OpenStack installs. But after a while I brought up a Minecraft server on it, running the Bukkit distribution of Minecraft, to play with family members. Then I opened it up to a few friends.

This Christmas, one friend did a treasure hunt for her kids in the Minecraft world, which was incredibly cool. I’ve seen them on the server a number of times since then.

When I posted on Facebook about this a few days ago, someone from ArcLight responded, saying I should host my server there, which brought on this trip down memory lane. No, Seth, while I appreciate the offer, and I’m certain you’d do a better job than I, running it myself was the entire point. Thanks, though, and I wish you well with your business.

Raspberry Pi, episode 1

I got a Raspberry Pi for Christmas. I’ve been meaning to get one for some time, because I wanted to play with home automation (x10) stuff again. So here we go.

I’d rather not mess with getting usb/serial stuff working again. That was a pain. So I’m going to get a CM19a instead and see if I can get that working.

Should be fun.

So far, I have the Raspberry Pi booted, and I’ve got some basic stuff installed on it. It’s running Debian 6 (Raspbian) and I’ve got a 4G card in it, which, so far, looks like it’s going to be plenty.

Open Source and The Cloud

I had something of an epiphany in the shower this morning. I discovered that I actually agreed with Bradley Kuhn about something.

TLDR: Is “the cloud” a threat to Open Source? I stopped working on an Open Source calendaring project because of Google Calendars.

Several months ago I attended (part of) a talk by Bradley about how The Cloud (whatever that is) is a threat to Free Software. (Yes, I know what The Cloud is. Snarky remark in reference to all the different things The Cloud might mean to various people. See Simon Wardley’s wonderful talk about what the cloud is.)

His reasons struck me as so outside of my way of thinking about software that I ended up leaving the talk. Oh, also, Skippy wanted to go to lunch, and that sounded like a lot more fun. Nothing personal, Bradley. He was talking about how something like Google Calendar (actually his example was GMail, but hold on a minute) was a threat to Free Software because the code, even though it’s in JavaScript and right there in front of you, can’t really be inspected (i.e., you can’t learn from it) because it’s hugely obfuscated. Also, you can’t see the back end. So here’s a service you can use for “free”, but it’s not Free, because it’s in chains, metaphorically speaking.

Then, this morning, I was thinking about why people are involved in Free/Open Source software, but also why they stop being involved, and I realized something.

I used to have a web-based calendar thingy. It was written in Perl, and it was really very cool. In fact, it not only started my passion for Open Source (it was the first thing I ever had on CPAN, and it was the first software that I ever wrote which was featured in a book!) it also paid my mortgage for a few years. I used to write calendaring applications for the General Motors Desert Proving Grounds in Mesa, Arizona. Although that plant is long closed, their scheduling ran on my software. If you wanted to schedule a test on the dust track (tests a vehicle’s various rubber seals to make sure they keep out dust, as well as handling in those conditions) you used the web-based scheduling application, called D.U.S.T. (I forget what it stands for – Dusttrack Usage Scheduling Tool or something) and scheduled it. This worked better than grubby bits of paper, because it didn’t get lost, and you always could get to it without walking down the hallway.

Also, when I was at Databeam, back in the late 90s, I wrote a similar application for scheduling conference rooms (clever name: Conference Rooms). I went up to the front desk one day and stole the conference room scheduling book and hid it, forcing everyone to use the online scheduling app. Strangely, it worked, and I didn’t get fired.

Then, I got involved in a project called Reefknot, which was an implementation of various international calendaring standards, in Perl. That was humming along nicely. And I had a dozen different calendar modules on CPAN.

By the way, in case you don’t know, calendaring is hard. Sure, it looks easy, but then you get into things like “every other Monday at 10am, except during company vacations.” Or possibly “the last day of each month.” Think for a little while about how you’d implement that, and your brain will start to melt just a little. “every monday” suggests a simple solution, but as soon as you start having to deal with exceptions, things get very very complicated. And what with different length months and leap years … and don’t even get me started on time zones. *shudder*
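
Even the “last day of each month” case needs more than a lookup table, thanks to leap years. Here is a minimal sketch in shell, nowhere near a real recurrence engine: the function name is made up for illustration, and it assumes the Gregorian calendar and plain integer arguments with no leading zeros.

```shell
# Print the last day of the given month.
# $1 = year (e.g. 2024), $2 = month as an integer 1-12 (no leading zero).
last_day_of_month() {
    year="$1"
    month="$2"
    case "$month" in
        4|6|9|11)
            echo 30
            ;;
        2)
            # Leap year: divisible by 4, except centuries not divisible by 400.
            if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
                echo 29
            else
                echo 28
            fi
            ;;
        *)
            echo 31
            ;;
    esac
}

last_day_of_month 2024 2
last_day_of_month 1900 2
```

The two calls print 29 and 28 respectively. And that is the easy part: “every other Monday, except during company vacations” needs an exception list on top of the recurrence, and time zones are still waiting in the wings.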

Anyways, then something called Google Calendar came along. It worked with all of the various calendaring applications. It did the various calendaring specifications, including the long-elusive CalDAV. We were all very excited in the calendaring community, but then an odd thing happened. People stopped working on calendaring stuff. Because, you know, it’s already done.

So, I stopped working on an Open Source project because there was an implementation in the cloud. (ie, online somewhere.)

So, was Bradley right? Did Google Calendar kill the Reefknot project specifically because it’s closed source? Yes, in a sense. I don’t believe, as the FSF does, that closed source is intrinsically immoral. But there’s a direct correlation between the projects I no longer work on, and great cloud based implementations of the same functionality, where I don’t have access to the source to participate.

Furthermore, as my interaction with software is increasingly via a browser, and not via running software on my own computer, I have less and less incentive – and ability – to tinker with those things.

Now, I’m weird, I still run several of my own servers. Granted, those servers are “in the cloud”, meaning that I have no idea where they are physically located. But I have root on them. I build software from source on them, and tinker with that source from time to time. I tinker with the source code of my blog, even though there’s a good blogging platform “in the cloud”, but I also have several blogs on Blogger, simply because it’s simple and I don’t want to monkey with it.

So, although I disagree with Bradley philosophically, I find that he may be completely right for more pragmatic reasons.

But at the same time, Open Source has had something of a rebirth of late, and there continue to be ever more exciting projects out there. I’m much more concerned about my kids, and what they will find to hack on. My son is a hacker. He likes to build stuff, take stuff apart, break it and fix it, figure out how it works. I don’t know if I’m doing an adequate job of encouraging this. I really need to get him a subscription to Make magazine. I wonder, however, when he gets a little older, if he’ll be interested in programming. I think he’d be really good at it, but it would be a great shame if the removal of applications to The Cloud also results in a lack of opportunities to hack on code.

Review: Squid Proxy Server 3.1 Beginner’s Guide, by Kulbir Saini

I just got done reading “Squid Proxy Server 3.1 Beginner’s Guide” by Kulbir Saini, from Packt Publishing.

(Full disclosure: Packt sends me free books on the condition that I review them. However, I’m under no obligation to say nice things, and they keep sending me books even when I say harsh things.)

Quick version: The writing style is ideal for a beginner – clear descriptions of concepts, rich in step-by-step examples and section reviews/summaries to clarify why you just did what you did. Frequent exercises reinforce concepts. Saini writes like a teacher.

Saini’s writing style suggests to me that he’s been teaching this material for quite some time. Every concept has copious examples and suggested exercises to help you remember what you’ve learned. Concepts are clearly defined, clearly explained, and then illustrated before moving on to another concept. Previous concepts are reintroduced and incorporated into the new ones as you go along, so that idea builds upon idea.

This is a book to be read in order, but is also structured so that someone at an intermediate level can go back and use individual sections as reference.

I love reading technical books that aren’t dry and boring, and which inspire me to be a better writer. Oh, yes, and I learned a lot about Squid, too.

I presume that the content isn’t really quite so specific that it’s only for 3.1, but having not used Squid for an extended time, I can’t say for sure.

The book starts with clear descriptions of the general concepts of proxying, but very quickly gets hands-on and specific to Squid. This is how I like it, because I learn by examples and doing.

So, on the whole, I give this book a firm thumbs up. I’m always a little hesitant doing a review of a book on a technology I don’t know by an author I don’t know, because I make it a firm rule to say the book stinks if it stinks. So it’s always a joy to read a book like this that exceeds my expectations, and turns out to be not just ok, but actually really well written.

Tek11 just a week away

A quick reminder that Tek11 is just a week away, but there’s still time to get your tickets. I’ll be speaking twice – once about all of the things you didn’t know the Apache HTTP Server could do, and once about writing a better FM. And, of course, dozens of far-more-brilliant people will be there too, speaking about many PHP-related topics, and hanging out having fascinating conversations, and working on fascinating projects.

You can hardly afford to miss it. See you there.