Excel

Lies, damned lies, and huge Excel reports.

I was in a meeting the other day trying to figure out a particularly thorny issue. We had a group of smart, experienced, opinionated people in the room -- my kind of meeting.

About halfway through the meeting, one of the participants produced a huge, multi-page Excel print-out. Many of the pages had colored graphs, three or four to a page. A second set of pages contained an assortment of tables correlating variables against one another. An accompanying narrative outlined that there was a "significant" relationship between some of the variables because the correlation coefficient was over .5.

"As you can clearly see from this graph..."

We all sifted through the print-outs and gave them the old college try while the presenter tried to narrate. After a few minutes, it was clear that none of us, including the presenter, understood what the graphs meant. There was no description of the units or explanation of how they were derived. So everyone started using the graphs to support their own point of view: "What I think they mean is..."

It was humorous, really, and luckily we all noticed it and started laughing and threw the spreadsheets aside. At the same time, the experience was a good reminder of how easy it is to manipulate -- and be manipulated by -- numbers.

A few tactical takeaways:

  • Label graphs clearly for your audience, not for yourself. Provide notes if the graphs aren't clear -- but if the graphs aren't clear, rethink whether to use them at all.
  • Be careful when comparing correlations across a huge number of interconnected variables. In many systems (datasets), the variables are correlated with each other -- that might be why you are studying them in the first place. So comparing a correlation of .5 to a correlation of .6 and calling the latter "better" is more than a tad sloppy and isn't the whole story by a long shot. Variables interact. To study how a number of variables interact, look to regression analysis instead.
  • "Significance" means something very specific in statistics. It is not the same as "strongly correlated." When two things are correlated, it means they vary in relation to one another. That's it. A correlation does not answer any questions about the causes of the relationship. When a relationship is "significant," it means there is a very low probability that a relationship this strong would show up by chance alone if the variables were actually unrelated. Two variables could have a low correlation with high significance, or a high correlation with low significance. Think of it this way: You might have a great night on the town with someone you hardly know; in the same way you might have a really lousy day with your closest friend. How well the date went is not the same as how close your relationship is. (IMPORTANT NOTE: For a relationship of a given strength, significance increases with the sample size, so with large datasets you can find that all the relationships are significant mathematically even if they have no practical significance at all!)
  • And most of all, this hopefully (?) goes without saying, but don't base decisions on a report that no one can understand!
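
To make the correlation-versus-significance point concrete, here is a rough sketch in Python. The function name and the example numbers are mine, chosen purely for illustration, and the p-value uses the Fisher z-transformation, which is a normal approximation rather than the exact t-test a statistics package would run.

```python
import math
from statistics import NormalDist


def approx_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r from n pairs.

    Uses the Fisher z-transformation (a normal approximation). An exact
    test would use the t distribution instead.
    """
    if not -1 < r < 1 or n <= 3:
        raise ValueError("need -1 < r < 1 and n > 3")
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical cases: a weak correlation in a big dataset,
# and a strong correlation in a tiny one.
weak_but_large = approx_p_value(0.05, 10_000)
strong_but_small = approx_p_value(0.60, 8)

print(f"r = 0.05, n = 10000: p = {weak_but_large:.2g}")
print(f"r = 0.60, n = 8:     p = {strong_but_small:.2g}")
```

Under these assumptions, the weak correlation clears the usual .05 threshold easily while the strong one does not, which is exactly why "significant" and "strongly correlated" should never be used interchangeably.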

Getting Started with Analytics: Some Reading

Since returning home from last week’s 2011 NTEN Nonprofit Technology Conference, I’ve been asked about a half-dozen times for reading suggestions for fundraisers looking to learn more about statistics, and in particular, segmentation. 

I have a couple of suggestions to get you started, but I want to say that the best way to start to learn segmentation is to export some data from your database — say, the results of your most recent initiative — open it in Excel, and just look at what you see. Sort the list by donation size. How many large gifts are there? How many small gifts? Do you notice clumping around certain numbers? Look at the addresses of the donors — are more from certain places than others? These are basic questions, but they are the first step towards viewing your donors as individuals rather than as one anonymous whole. I’ll write more on that in coming weeks, but the message is: Don’t be afraid to play with your data! You won’t break anything, I promise.
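
If you would rather poke at the export with a few lines of code than in Excel, the same first questions can be sketched in Python. Everything here is made up for illustration: the rows, the city names, and the $500 cutoff for a "large" gift are assumptions, not real data or a recommended threshold.

```python
from collections import Counter

# Hypothetical export: (donor city, gift amount in dollars). Made-up rows.
gifts = [
    ("Chicago", 25), ("Chicago", 100), ("Boston", 25), ("Chicago", 1000),
    ("Denver", 50), ("Boston", 25), ("Chicago", 50), ("Denver", 500),
]

LARGE_GIFT = 500  # assumed cutoff; pick whatever makes sense for your program

# Sort the list by donation size, largest first.
by_size = sorted(gifts, key=lambda g: g[1], reverse=True)

# How many large gifts are there? How many small gifts?
large = [g for g in gifts if g[1] >= LARGE_GIFT]
small = [g for g in gifts if g[1] < LARGE_GIFT]

# Do gifts clump around certain amounts? Are more from certain places?
amount_counts = Counter(amount for _, amount in gifts)
city_counts = Counter(city for city, _ in gifts)

print("Largest gift:", by_size[0])
print("Large gifts:", len(large), "| Small gifts:", len(small))
print("Most common amounts:", amount_counts.most_common(3))
print("Gifts by city:", city_counts.most_common())
```

None of this is more sophisticated than sorting and counting, and that is the point: the first pass is about looking, not modeling.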

Now, as for the recommendations, I always start with two books. The first is the reassuringly titled Statistics Without Tears by Derek Rowntree. You’ll like this book immediately just by its size: unlike most statistics texts, you can carry it with one hand. It looks at you non-threateningly, as a small puppy might. It is a classic book, first published years ago, and there’s something comforting about the type and the graphs. It reminds me of cookies and tea at Grandma’s. More important than the appearance, though, is the content. You may sweat a bit in places, but there will be no crying, and you’ll come out the other side knowing a bit more about the things you know you should know but don’t (what is a median, and why does it matter; what does the standard deviation measure, and why shouldn’t you be afraid of the word “deviation”).

The second is the much more recent but excellent Fundraising Analytics: Using Data to Guide Strategy by Joshua Birkholz. Unlike Rowntree’s book, this book was written after the secret consortium of business publishers decreed that all business books must contain a colon in their title. (Have you noticed this? The same rule applies to movie sequels.) But more importantly, this is a much-needed addition to the vast number of fundraising books on the market, most of which lack any real specificity when it comes to collecting data and understanding it, and a few of which are patently banal. Birkholz walks through a number of basic and more advanced analytics issues, including a treatment of RFM analysis and an introduction to regression. It won’t make you a statistics hero, but it will go a long way towards improving your knowledge, particularly if you read it with an eye not only towards specific techniques, but towards how he approaches data and analysis more generally. Highly recommended.

Neither book is a cheap ticket, but both are worth it, and should get you started. Happy reading!

Analyze this!

Event 360 has launched a new webinar series, which gave me the fun opportunity this past week to talk for 90 minutes or so to well over 100 nonprofits about our first topic: the basics of event analytics. This is a subject near and dear to my heart, both because I’m a bit of a data geek and because so many of the groups I work with are great at tracking data but pretty poor at doing anything with it. 

The fact is, with an hour of time, a flat file of your fundraising data, and Microsoft Excel, you can get a far deeper understanding of what is actually powering (or holding back) your program. 

The webcast was recorded and archived; you can view it for free here.

Don’t be afraid of your data!