Seasonal Safety: Secco Recit's Spiffy Spreadsheet

We are right now in the middle of a wonderful holiday season. That is, of course, Season Season! 'Tis the time of year when opera companies around the world announce what operas they will be producing in the coming year, and audiences wait with bated breath to see if their local house will be performing their favorite Verdi for the umpteenth time, or if it's taking a chance on that obscure Auber opera they've been dying to see but that never gets done.

Season planning is a delicate balancing act. If you only perform operas no one's ever heard of, you're going to have a tough time turning a profit. At the same time, it's going to get awfully boring if you only perform the top ten highest-grossing operas year after year.

Opera snobs like me take great joy in criticizing or kvelling at a company's choice of programming, and a word that's often thrown around this time of year is "safe." A season is "safe" if it has fifteen performances of La Boheme, twelve performances of Die Zauberflote, eight performances of Rigoletto, and three performances of La Cenerentola. In contrast, a season is "exciting" and "daring" if it has a plethora of obscure operas by composers no one has ever heard of. But what's "safe" and what's "exciting," while somewhat directed, is still a little nebulous, especially when you're the one planning the season. If you work for an opera company, and it's your job to decide what operas you produce, and how many performances of each you put on, you want a clear and precise way to determine what sort of season is going to maximize ticket sales. As far as I know, no such method exists.

So I decided to create one, which you may view here.

I went to Operabase. Operabase has an extensive database of opera performances, and a statistics page where you can see what operas have been produced in the past few years, and how many productions and performances each of them got. At the time of writing, Operabase lists La Traviata as the most frequently performed opera in the world, with 869 productions and 4190 performances between 2011 and 2016. Now, as I went through this project, I came to realize that Operabase's data is not entirely complete. Some operas, like Bartok's Bluebeard's Castle, somehow fail to make an appearance. But because I could think of no other resource for getting such extensive numbers, I pulled the full spreadsheet of every opera performed in those five seasons. (Note: Operabase does not have a built-in easy way to access the data other than by reading it off the site by hand. To get the data in a usable form, I wrote a script to parse the HTML and transfer the data to a .csv file.)
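The script itself isn't reproduced in this post, but a minimal sketch of the idea in Python might look like the following. It assumes the statistics arrive as a plain HTML table. The sample markup, the class name, and the function name are all hypothetical, not Operabase's actual page structure; only La Traviata's counts come from the figures quoted above.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample markup (an assumption, not the real page).
SAMPLE_HTML = """
<table>
  <tr><th>Opera</th><th>Productions</th><th>Performances</th></tr>
  <tr><td>La Traviata</td><td>869</td><td>4190</td></tr>
</table>
"""

print(html_table_to_csv(SAMPLE_HTML))
```

A real page would almost certainly need more care (nested tags, pagination, odd characters in titles), so treat this as a starting point only.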

(Note: Since I made this, Operabase has updated their site to allow for much more detailed searching. I have not yet updated the spreadsheet to take advantage of it, so this is all based on old and hard-to-search data.)

The next step was to assign each opera a number, describing how "safe" that individual opera is. I gave this number in terms of portions of La Traviata, the most frequently produced. I divided the number of productions of a given opera by the number of productions of La Traviata, and that was the opera's "safety score." I also assigned each opera a second score using the number of performances rather than productions.
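For anyone who prefers code to spreadsheets, here is a minimal Python sketch of that arithmetic. The function and constant names are made up for illustration, and the sample counts of 100 productions and 500 performances are hypothetical; only La Traviata's 869 productions and 4190 performances come from the Operabase figures quoted above.

```python
# La Traviata's counts, as quoted from Operabase in this post.
TRAVIATA_PRODUCTIONS = 869
TRAVIATA_PERFORMANCES = 4190


def safety_scores(productions, performances):
    """Return (production score, performance score) as fractions of La Traviata."""
    return (
        productions / TRAVIATA_PRODUCTIONS,
        performances / TRAVIATA_PERFORMANCES,
    )


# Hypothetical opera with 100 productions and 500 performances.
prod_score, perf_score = safety_scores(100, 500)
print(f"production score:  {prod_score:.3f}")
print(f"performance score: {perf_score:.3f}")
```

By construction, La Traviata itself scores exactly 1 on both measures, and everything else lands somewhere between 0 and 1.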

Now it was simple to be able to input a given opera season, and the spreadsheet would pull from the data and assign the season two scores. A first score as an average of the scores of the individual operas in the season, and a second score weighted based on the number of performances of each opera. You can try this for yourself in the spreadsheet. The block on the right has spaces for you to write in the titles of operas, and the number of performances for each, and it will automatically calculate for you the season's safety score.



Note: You may have to manually search for some operas in the database. The titles have to be spelled exactly, and this can be difficult for Russian operas. Also, if two operas share the same title, and you mean the less popular of the two, you'll have to input the numbers for that opera manually. If an opera isn't in the database, I ended up simply substituting the numbers as if it had had one production and one performance. (Hey, I did this in one day; it's a proof of concept, not a rigorous professional tool.)
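The season-level calculation can be sketched in Python along these lines. This is an approximation of what the spreadsheet does, not a copy of it: I'm assuming here that the second score is a performance-weighted average of the per-opera performance scores, and the names `season_scores`, `catalog`, and "Some Obscure Auber Opera" are invented for the example. Operas missing from the catalog fall back to one production and one performance, as described in the note above.

```python
# La Traviata's (productions, performances), as quoted from Operabase.
REFERENCE = (869, 4190)
# Stand-in counts for operas that are missing from the data.
FALLBACK = (1, 1)


def season_scores(season, catalog):
    """Score a season.

    season  -- list of (title, performances planned this season)
    catalog -- dict mapping title to (productions, performances)
    Returns (plain average of production scores,
             performance-weighted average of performance scores).
    """
    production_scores = []
    weighted_sum = 0.0
    total_nights = 0
    for title, nights in season:
        productions, performances = catalog.get(title, FALLBACK)
        production_scores.append(productions / REFERENCE[0])
        weighted_sum += nights * performances / REFERENCE[1]
        total_nights += nights
    return (
        sum(production_scores) / len(production_scores),
        weighted_sum / total_nights,
    )


catalog = {"La Traviata": (869, 4190)}
season = [("La Traviata", 10), ("Some Obscure Auber Opera", 10)]
by_production, by_performance = season_scores(season, catalog)
print(f"{by_production:.3f} {by_performance:.3f}")
```

With equal numbers of performances per opera, the weighting does nothing special, which is why a house that gives every opera a similar run ends up with two very similar scores.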

This turned out not to make a huge difference when dealing with operas on the far extreme, because after the very most popular operas, the numbers drop rapidly. We're not even at Aida (the twelfth-most-frequently-produced) before the safety score drops below 0.5, and at number 52 (The Bartered Bride) the numbers drop below 0.1. Even the second opera on the list, Die Zauberflote, only scores 0.79 for performances, and 0.65 for productions, because La Traviata is just done that much more. This means that it's actually surprisingly difficult for an opera season of more than half a dozen operas to score above a 0.5 on a season-wide basis.

Operabase also has a second set of data, from an earlier five years. The arrangement of operas is similar, but notably, the top handful are much closer together, and while La Traviata still has the most productions, Die Zauberflote has more performances. In general, using the older data boosted most seasons' scores by about 0.02 to 0.05, but the relative relationships between the numbers were basically the same. I've included a second tab on the spreadsheet using the old data for reference. The advantage of using the old data is that, in a small season, it keeps La Traviata from absolutely dominating any season it appears in. The difference isn't as great in larger seasons. It's probably better to use the newer data, but it does bother me a little on an intuitive level, because I can't see why La Traviata should be so much more popular than the next half-dozen on the list, all of which are relatively close together. What happened in the past half-decade that gave La Traviata such a boost? If you know, please do tell me.

Now, this is crude and generally unhelpful except for the broadest overview of a season. It doesn't take into account star casting, particular artistic production, or audiences getting sick of La Traviata after the hundredth performance. I assume most major opera companies have a formula that tells them how many performances of La Boheme they can put on before returns start to diminish. But in a very general way, I've made up a simple model to quantify, in a real number between 0 and 1, how safe or daring an opera season is, which should be good enough for the average opera onlooker.

So I started plugging in some opera seasons. I've come to the conclusion that a score above 0.35 is a particularly safe season, and a score below 0.25 is a particularly daring season. Most of the professional seasons I plugged in fell in between. Of course, what you think is safe or daring is your own opinion, and your boundaries may be different from mine.
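Spelled out as a tiny Python sketch, those cutoffs look like this. The function name and labels are invented for the example, and the thresholds are just my own opinion as stated above, so change them to taste.

```python
# Personal cutoffs from the paragraph above; adjust to your own taste.
SAFE_ABOVE = 0.35
DARING_BELOW = 0.25


def describe_season(score):
    """Label a season-wide safety score as safe, daring, or in between."""
    if score > SAFE_ABOVE:
        return "safe"
    if score < DARING_BELOW:
        return "daring"
    return "in between"


for score in (0.20, 0.30, 0.40):
    print(score, describe_season(score))
```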

Some companies are more consistent than others. La Scala's score jumps around a lot from year to year, but the performance score is always very close to the production score, because La Scala almost always performs each opera in its season a pretty uniform number of times, usually about 7-9 performances per opera. Paris Opera is a bit more consistent, possibly because they perform so many more operas per season that the numbers average out better, but even they vary by more than a percent from season to season. The Met, meanwhile, scores much more consistently between seasons.

In fact, the Met was shockingly consistent. It was by far the most consistent company I looked at. Each of the past four seasons scored (for productions) approximately 0.30 to within a margin of a single percent, and the three seasons before that scored close to 0.25. Looking at the actual seasons, it can be intuitively seen that the most recent four were less adventurous than the prior three, and so a difference of 0.05 is pretty significant. The Met's plateaued numbers suddenly leaping up to a slightly higher plateau makes me wonder if they use a system vaguely resembling mine in their own season planning. It would not surprise me if any sufficiently large opera company used some sort of algorithm rather than intuition to determine what operas they should produce in a given season.

Then I made graphs of the past few Met seasons. Not to prove anything. I just like graphs.




The first graph shows the operas of the past four Met seasons overlaid on one another. They are sorted in increasing order of popularity (I used production numbers rather than performance numbers, somewhat arbitrarily), and they are evenly spaced between 0 and 1. (So if a season has 24 operas, the plot points are at X-values 1/24, 2/24, 3/24, and so on.) Then I trend-lined. A quadratic formula fit the graph well. I made the second graph to average out the seasons into one big 100+ opera season. You can see on the right-hand side where the popular operas repeat from year to year. The formula this one gave me was pretty close to an average of the first four graphs, which makes intuitive sense.

And so the formula for creating a Met season is thus: Take the graph of X² - 0.156X + 0.0316 over the domain [0,1]. Take 24-28 points on this graph, evenly spaced horizontally, and find operas whose safety scores are near the Y-values of those points.
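As a rough Python sketch, the recipe might go like this. The quadratic is the one given above; everything else is an assumption for illustration, including the function names, the greedy nearest-match rule, and the placeholder catalog of "Opera A" through "Opera C" with made-up scores.

```python
def met_curve(x):
    """The fitted quadratic from this post, for x in [0, 1]."""
    return x**2 - 0.156 * x + 0.0316


def target_scores(n_operas):
    """Y-values at the evenly spaced points 1/n, 2/n, ..., n/n."""
    return [met_curve(k / n_operas) for k in range(1, n_operas + 1)]


def pick_operas(targets, catalog_scores):
    """Greedily match each target to the closest unused opera.

    catalog_scores -- dict mapping title to its production safety score.
    The greedy rule is a simplification, not an optimal assignment.
    """
    unused = dict(catalog_scores)
    season = []
    for target in targets:
        title = min(unused, key=lambda t: abs(unused[t] - target))
        season.append(title)
        del unused[title]
    return season


# Placeholder catalog with made-up scores (hypothetical, not Operabase data).
catalog_scores = {"Opera A": 0.03, "Opera B": 0.5, "Opera C": 0.9}

print([round(y, 4) for y in target_scores(24)][:3])
print(pick_operas(target_scores(2), catalog_scores))
```

Note that the curve tops out at about 0.88 when X is 1, so even the "safest" slot in the season sits a little below La Traviata's score of 1.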

As I said, this math is not particularly scientific, and the data has holes, and there are just generally a lot of problems with this. Four seasons isn't a great sample set, and this trend of consistency only started in 2014. I'm not a statistician. I put this together hastily in a couple of hours because I was mildly curious, but not curious enough to apply serious scientific rigor. But it is just thorough enough to yield some interesting results. I enjoyed plugging in seasons and seeing which companies play things safer than others, and making up hypothetical seasons of my own. I hope you'll find it mildly amusing as well. The Met announces their new season tomorrow. By how closely it matches my graphs, we'll see if I'm on to Peter Gelb's secret formula, or if I just happen to have found a formula that, by coincidence, makes the past few Met seasons look more similar than they really are. Probably the latter.
