Posted in Uncategorized

## Hans Rosling

It was announced yesterday that the Swedish epidemiologist Hans Rosling has died of cancer. To give you a taster of his performance skills, here is a snippet of his work for “The Joy of Stats”, broadcast on BBC4.

A full version of this is on Vimeo:

Kate Allen at the Financial Times wrote an interesting piece about him in 2014, in which he spoke of being famous but having limited impact. The piece also told of the successful sale to Google of the software underpinning many of his charts, which created a funding source for the Gapminder foundation.

To my surprise, the announcement of his death and people’s responses to it were trending worldwide on Twitter. Considering my last few posts, I thought that it would be fitting to do a quick analysis of what was being said, looking at tweets between 2017-02-07 14:35:59 UTC and 2017-02-08 10:20:15 UTC. This required some extra processing, as the number of emojis keeps on increasing and R does not recognise them natively, so an external list of how R encodes each emoji is required. This is still not perfect, so some manual editing of the wordclouds is still needed to remove emojis that are represented as Japanese characters!

The sentiments expressed in the tweets were largely positive, and, considering the man in question, suitably full of joy and trust/hope rather than sadness.

The first of these represents the number of tweets that expressed each sentiment [note that a single tweet can express multiple sentiments]

While this image represents the strength of the sentiments:

Posted in politics, Uncategorized

## Trump’s tweets (and the Women’s March comparison)

So, I’ve been working on sentiment analysis again. What could be more topical than analysing the tweets from the period between Trump being elected and being sworn in?

Using the syuzhet package in R it is very simple to perform sentiment analysis; with some simple manipulation afterwards, we can see the profile of the different “Sentiment Scores”. A step-by-step guide is here.
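For readers who want to see the idea behind this kind of lexicon-based scoring, here is a minimal sketch in Python rather than R. It is not syuzhet's API, and the tiny word list is invented purely for illustration (the real lexicons are far larger). It produces the two summaries used in these posts: how many tweets express each sentiment, and the overall strength of each sentiment.

```python
from collections import Counter

# Toy word-to-emotion lexicon, invented for illustration only.
TOY_LEXICON = {
    "great": {"joy", "trust"},
    "win": {"joy", "anticipation"},
    "honest": {"trust"},
    "disaster": {"anger", "fear", "sadness"},
    "rigged": {"anger"},
}


def score_tweet(text):
    """Count how many words in one tweet map to each emotion."""
    scores = Counter()
    for word in text.lower().split():
        cleaned = word.strip(".,!?\"'")
        for emotion in TOY_LEXICON.get(cleaned, ()):
            scores[emotion] += 1
    return scores


def summarise(tweets):
    """Return (number of tweets expressing each emotion, total strength)."""
    tweets_expressing = Counter()
    strength = Counter()
    for tweet in tweets:
        scores = score_tweet(tweet)
        tweets_expressing.update(scores.keys())  # one per tweet per emotion
        strength.update(scores)                  # adds the word counts
    return tweets_expressing, strength


# Hypothetical example tweets.
tweets = [
    "Great win, great people!",
    "A rigged disaster.",
    "Honest and great.",
]
tweets_expressing, strength = summarise(tweets)
print(dict(tweets_expressing))
print(dict(strength))
```

A single tweet can count towards several sentiments, which is why the two summaries can differ.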

First look at the profile across all of the tweets:

But, things with Trump are not quite that simple.  It has been previously speculated that Trump’s tweets from Android devices are not written by the same person as Trump’s tweets from other sources. The general conclusion is that the tweets from Android are written by Trump himself, but tweets from other devices are written by staffers. The time of posting the tweets can be investigated to see if there are patterns:

It becomes very obvious that whoever is tweeting from the iPhone is primarily active during office hours – with a few evening [18] tweets that were either thank you tweets or mentioning events that evening or the following morning. So, let’s look at those tweets written on an Android:

In the following sequence of wordclouds, the word in the middle corresponds to the sentiment in the graph above.

So what words were said in tweets that the sentiment analysis deemed to be angry?

Compare this to the joyful tweets:

and the tweets expressing “trust”

This process is continued for the other sentiments…

Who might be interested in this type of analysis? It doesn’t just apply to political figures; companies may be interested in the sentiments being expressed about their brands / services, and in particular in the effects that changes have on what is being said online. Whether you are aware of it or not, this type of analysis is happening every day and is providing insight into how people think about a wide variety of terms.

There are limitations, of course.  These include the problem of sarcasm and emojis. Automatic sentiment analysis struggles to capture sarcasm.  Furthermore, emojis can be converted into text, but the additional meanings behind the emojis (think aubergine) are lost in this process!

Who knows what the future will bring as Donald J Trump has control over both the @POTUS and @realDonaldTrump accounts!

As a quick aside: here’s the sentiment captured between 21:57:57 and 22:40:31 UTC [about 5pm EST and 2pm PST] on Saturday January 21st under the hashtag #WomensMarch. This consisted of approximately 65 thousand tweets in total. I could have collected more data, but Twitter has a limit of 5000 tweets in a single download, so it’s quite a faff to collect more. Furthermore, I didn’t think that more tweets from earlier in the day would substantially change the pattern.

The names of the “sentiments” are fixed, sometimes not exactly to my preferred choice; tweets with a high “trust” sentiment are often quite hopeful for example… but that is a whole different problem (and is someone else’s problem to worry about!)

Definitely a more striking ratio of positive to negative tweets.

And the most commonly used words:

Finally, for those interested in this tweet:

It came from an Android phone on a Sunday (but not very early in the morning / late at night); so those speculating that it was not the man himself tweeting don’t have the obvious indication of it coming from an iPhone!

Posted in teaching, Uncategorized

## Student Projects: Spatial Statistics

At this time of the year, I once again start to think about how to create interesting, but feasible, projects for final year students.  Many times I find students have their own particular set of interests and I will try to work through a process with them to develop project ideas that will maintain their interest for an academic year.

Recently, I have been primarily focusing on projects with a spatial element, for a number of reasons.

1. Goes beyond what they are taught in any particular module on their degree programme
2. Lots of public/government data available have a spatial element
3. Encourages students to use R rather than SPSS/Minitab (the other statistics packages that we teach our students)
4. Looks good on a CV as it is unusual to see analysis and modelling of spatial data at an undergraduate level.

I mainly recommend a single textbook to students: Applied Spatial Data Analysis with R by Bivand R.S., Pebesma E. and Gómez-Rubio V. This is a great book for those learning spatial statistics.

As we mainly use Generalised Additive Models when analysing the data, the framework that I use for explaining the concepts tends to be:

• (Multiple) Linear Regression: response variable continuous, explanatory variable(s) continuous

$E[y|x]=\beta_0+\beta_{1}x_{1}+\cdots+\beta_{p}x_{p}$

• General Linear Models: response variable continuous, explanatory variable(s) may be categorical or continuous
• Additive Models: response variable continuous, model uses functions of explanatory variables

$E[y|x]=\beta_0 + f_{1}(x_{1})+\cdots+f_{p}(x_{p})$

• Generalised Linear Models: response variable not necessarily continuous (could be binomial or Poisson), explanatory variable(s) may be categorical or continuous

$g\left(E[y|x]\right)=\beta_0+\beta_{1}x_{1}+\cdots+\beta_{p}x_{p}$

• Generalised Additive Models: response variable not necessarily continuous (similar to Generalised Linear Models), model (may) use functions of (some of) the explanatory variables.

$g\left(E[y|x]\right)=\beta_0 + f_{1}(x_{1})+\cdots+f_{p}(x_{p})$

This talk gives a very quick overview of GLM / GAM.

This year, I have students looking at the US Primary election results on a county-by-county level (principally examining the within-party rather than the between-party distribution of votes) and also looking at cancer rates around Europe. Previous projects have looked at more economic data with a spatial element, but perhaps the future will involve more environmental applications.

Posted in Uncategorized

## Another morning after the night before

This time around, I didn’t wait for the election to be called. On the morning of June 24, I stayed up until the number of votes still to be declared was less than the margin at the time. This morning, I watched the results on a county-by-county level for states like Virginia (which did veer to Clinton at the end) and could see that Clinton’s Democratic “firewall” wasn’t as solid as people were predicting. Pennsylvania was the same on a county-by-county level; even prior to full reporting, it was obvious that Trump was more popular than the polls had anticipated.

So, once again, the polling agencies have to question themselves. While many states were within the usual margin of error of polls, the fact that the bigger errors were typically in the same direction (towards Trump) once again reveals that there are some structural issues with the performance of the opinion polls and with how the polling companies capture difficult-to-reach voters. An even trickier issue is understanding those who lie to pollsters about their voting intentions [or, to be more generous, change their mind at the last moment] for reasons of social acceptability.

At this stage, with 31 electoral college votes still available, but with only 3 states left to declare, Trump has exceeded the required 270 electoral college votes to win [even without Washington state’s faithless elector]; but with 99% reporting, according to AP, the current state of the popular vote is:

| Candidate | Party | Share | Votes |
|---|---|---|---|
| Donald Trump | Republican Party | 47% | 59,521,401 |
| Hillary Clinton | Democratic Party | 48% | 59,755,284 |
| Gary Johnson | Libertarian Party | 3% | 4,050,927 |
| Jill Stein | Green Party | 1% | 1,210,290 |
| Other candidates | | 0.7% | 798,952 |

Were any states sufficiently close as to be influenced by potential “lower order” candidates? Well, yes, quite a few, but most of those votes were for Gary Johnson, the Libertarian candidate, and they are perhaps more likely to have broken towards Trump (or, more likely, not to have been cast at all).

What happens if we consider Jill Stein’s voters? Suppose that she hadn’t been running, and, instead, her voters split 50/50 between Clinton [leftish] and Johnson [challenger / alternative party].

Michigan’s 16 electoral college votes [as of the current reported state of play] would be firmly in Clinton’s column, but no other states would have swung.

If, instead, the split had been 75/25 for Clinton [leftish] and Johnson [challenger / alternative party], then still only Michigan would have been in play for Clinton.

So, while the splitting of the left-wing vote may have cost Clinton Michigan [subject to the final votes being counted], it cannot be blamed for her losing the Electoral College.

Notes: data obtained via the AP feed (as Google reports it) and via http://edition.cnn.com/election/results/president

The CNN website is particularly useful as it gives the county-by-county breakdown across (almost) all states, rather than requiring you to go to individual states’ webpages.
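The “what if” above comes down to one check per state: add a share of Stein’s votes to Clinton’s total and see whether it overtakes Trump’s. A minimal Python sketch, in which the state names and vote counts are entirely made up (they are not the AP or CNN figures):

```python
def flips_to_clinton(trump, clinton, stein, share_to_clinton):
    """True if Clinton overtakes Trump once she receives the given
    share of Stein's votes (the remainder is assumed to go to Johnson)."""
    return clinton + stein * share_to_clinton > trump


# Hypothetical vote counts, for illustration only.
states = {
    "State A": {"trump": 1000, "clinton": 990, "stein": 30},
    "State B": {"trump": 1000, "clinton": 950, "stein": 30},
}

for share in (0.50, 0.75):
    flipped = [
        name for name, v in states.items()
        if flips_to_clinton(v["trump"], v["clinton"], v["stein"], share)
    ]
    print(f"{share:.0%} of Stein voters to Clinton flips: {flipped}")
```

With these invented numbers only the very close state changes hands at either split, which mirrors the pattern described above.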

ps: I have a student who is looking at the relationship between socio-demographic variables and the primary voting patterns in a selection of US states on a county level (or town where appropriate).  It will be interesting to see if anything crops up by applying similar analysis to the general election results to see if there were obvious trends [or if it is just a spatial thing] that had been overlooked during the campaign period.  But that is definitely work for another day, as she has until the end of the second semester to complete her work!

Posted in Uncategorized

## Clearing: Just what have I got myself into?

Back from my holidays, into the process of UCAS clearing. It is the first time I’ve experienced it, as I’m now a programme leader, which means that I now get some input into the relative rankings of different potential students.

Chatting with one of my colleagues at lunch today, we discussed the different systems as experienced by us.  In Belgium, some courses have an entrance examination, but otherwise, on successful completion of secondary level you are qualified to enter a degree course.   This slightly fills me with worry, as there must be some fierce logistical issues with having “no limits” to your potential student numbers, especially for some courses that require access to labs.

In Ireland, the system administered by the Central Applications Office [CAO] gives universities the ability to set minimum requirements in specific subjects and the total number of students that they want to recruit for a particular course. Once that process is complete, the universities essentially have no further input (there are some minor exceptions). Instead the magic of the Irish predilection for preference-based systems comes into its own. Just considering direct entry at degree level, Irish students get to rank their top 10 degree programmes. The setting of the tariff is completely out of the hands of the universities: it is purely supply and demand that governs the equivalent of the tariff attached to the courses. Provisional offers are only made for those on non-traditional entry routes; otherwise everything comes down to the students meeting the minimum requirements, the entrance examinations [for Medicine] and then allocating the number of spaces available (let’s call that X) in a manner like this:

• Look at all applicants who gave your course a rank.
• See who passes the minimum requirements.
• Offer a place to the X best students.  The lowest of them sets the “points” total.
• Some students may decline their offer (for any number of reasons). Even if you accept an offer, you may later receive an offer from higher up your original preference list, but not from lower down your preferences…
• Thus there can be many rounds of offers…
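The rounds described above can be sketched in a few lines of Python. This is a simplified illustration of the idea and not the CAO’s actual procedure: the applicant names, points, course names, places and minimum points are all invented, and the minimum subject requirements are reduced to a single points threshold.

```python
def run_offer_rounds(applicants, courses):
    """applicants: name -> (points, [courses in preference order])
    courses: course -> (places, minimum points)
    Returns name -> course currently held once no more offers change."""
    held = {}
    changed = True
    while changed:  # each pass is one round of offers
        changed = False
        for course, (places, minimum) in courses.items():
            # Applicants who ranked the course, meet the minimum, and do
            # not already hold an offer they prefer to this one.
            eligible = [
                name for name, (points, prefs) in applicants.items()
                if course in prefs and points >= minimum
                and (name not in held
                     or prefs.index(course) <= prefs.index(held[name]))
            ]
            eligible.sort(key=lambda name: applicants[name][0], reverse=True)
            for name in eligible[:places]:
                if held.get(name) != course:
                    held[name] = course  # moving up frees the old place
                    changed = True
    return held


# Hypothetical applicants and courses.
applicants = {
    "Aoife": (550, ["Medicine", "Maths"]),
    "Brian": (500, ["Maths", "Medicine"]),
    "Ciara": (450, ["Maths"]),
    "Dara": (400, ["Medicine", "Maths"]),
}
courses = {"Maths": (2, 300), "Medicine": (1, 480)}

held = run_offer_rounds(applicants, courses)

# The lowest-scoring holder on each course sets its "points" total.
cutoffs = {}
for name, course in held.items():
    points = applicants[name][0]
    cutoffs[course] = min(cutoffs.get(course, points), points)

print(held)
print(cutoffs)
```

In this toy run Aoife is first offered Maths, then moves up to Medicine, which frees a Maths place for Ciara in a later round.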

The important thing is that it is based on your results; not your personal statement, nor your personal reference, nor any subjective input from people like me…

Posted in juggling

## EJC workshop main calendar

Submit your workshop here and it will appear on this calendar. You can view the events either through the calendar below or via a version of the resulting spreadsheet here, which is edited to not show email addresses (so you can check to see what has already been entered). If you want access to the scripts used to create the calendar events, complete the email form at the end of this post.

Unless you are planning to sync the calendar to your own device, I would recommend using the “Agenda” view if you want to be able to read the different workshop / event names easily.

You can then sync this calendar to your own devices using the following link:

Posted in silly season

## Eurovision 2016: Was Australia Robbed?

With all the serious news around in the aftermath of the EU referendum, we thought that we would examine something less serious in the European context: the ramifications of changes to the voting system in the Eurovision Song Contest. This piece was originally written by Ben Derrick and me for submission to the Young Statisticians’ writing competition in Significance magazine.

Figure 1: Final Points Won in Eurovision Song Contest 2016 (Australia not to scale)

Eurovision: the drama, the excitement, the statistics. After an engaging climax to the reveal of the votes, the 2016 Eurovision Song Contest in Stockholm was won by Ukraine, with Australia finishing in second place. During the final on May 14th, Eurovision host Petra Mede stated that nothing has changed in the way you vote, it is simply the way they are presented that has changed. What Petra neglected to mention is that the way the winner is calculated has changed.

The organisers introduced a new voting system hoping that it would lead to a more exciting end to the night [1]. The process of revealing the results was tense, but the results were different to what they would have been under the old system.

There were twenty-six finalists in the 2016 Eurovision Song Contest. A total of forty-two countries participated in the voting process. No country may vote for itself. Points are allocated 1 – 8, 10 and 12. Each participating country has a televote and a jury vote, equally weighted.

The calculation underpinning the voting system has evolved over time. In the old system used from 2009-2015, these were combined prior to votes being cast; meaning that each country could give votes to ten other countries; the televotes were used to break any ties in ranks. In the new system, the results of each jury vote were presented – in the form of the points allocated to the ten countries; and then the combined votes of the televoting from all forty two countries were presented from lowest score to highest score. Overall ties in the final positions were then broken by first comparing the number of countries that voted for each of the finalists and then comparing the number of countries who awarded twelve points to the tied countries.

In the new system, first used in 2016, the votes were not combined, and nothing required both sets of points to go to the same ten countries. To illustrate the two systems, consider Albania’s votes in Table 1; they gave votes to fifteen different countries [2].

Albania’s jury and public voters were in agreement about their favourite song – the Australian entry. The jury placed France second (giving them 10 points), but the televoters did not give any points to France (because the French song was ranked 11th in the Albanian televoting process). The televoters ranked Ukraine 5th (thus allocating them six points) whereas the jury ranked the same song 12th – assigning them “null points”. If there is a disagreement between the ranks given by the jury and the phone-votes, a country may now allocate points to more than ten countries.

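Table 1: Albania’s votes under the new and old methods

| To Country | Jury Rank | Televote Rank | Jury Points | Televote Points | Points Given (New Method) | Sum of Ranks | Points Given (Old Method) |
|---|---|---|---|---|---|---|---|
| Australia (AUS) | 1 | 1 | 12 | 12 | 24 | 2 | 12 |
| Italy (ITA) | 3 | 2 | 8 | 10 | 18 | 5 | 10 |
| Russia (RUS) | 4 | 4 | 7 | 7 | 14 | 8 | 8 |
| Bulgaria (BGR) | 7 | 3 | 4 | 8 | 12 | 10 | 7 |
| France (FRA) | 2 | 11 | 10 | | 10 | 13 | 6 |
| Ukraine (UKR) | 12 | 5 | | 6 | 6 | 17 | 5 |
| Spain (ESP) | 5 | 23 | 6 | | 6 | 19 | 4 |
| Poland (POL) | 14 | 6 | | 5 | 5 | 20 | 3 |
| United Kingdom (GBR) | 6 | 18 | 5 | | 5 | 20 | 2 |
| Lithuania (LTU) | 20 | 7 | | 4 | 4 | 24 | 1 |
| Sweden (SWE) | 11 | 8 | | 3 | 3 | 24 | 0 |
| Armenia (ARM) | 15 | 9 | | 2 | 2 | 25 | 0 |
| Israel (ISR) | 8 | 17 | 3 | | 3 | 25 | 0 |
| Hungary (HUN) | 10 | 10 | 1 | 1 | 2 | 27 | 0 |
| Malta (MLT) | 9 | 16 | 2 | | 2 | 28 | 0 |
| Austria (AUT) | 16 | 13 | | | | 29 | |
| Azerbaijan (AZE) | 13 | 19 | | | | 32 | |
| Germany (DEU) | 18 | 15 | | | | 33 | |
| Cyprus (CYP) | 23 | 12 | | | | 35 | |
| Latvia (LVA) | 17 | 21 | | | | 38 | |
| Belgium (BEL) | 25 | 14 | | | | 39 | |
| Croatia (HRV) | 21 | 20 | | | | 41 | |
| Czech Rep. (CZE) | 22 | 22 | | | | 44 | |
| Serbia (SRB) | 19 | 26 | | | | 45 | |
| The Netherlands (NLD) | 24 | 24 | | | | 48 | |
| Georgia (GEO) | 26 | 25 | | | | 51 | |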
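The new-method arithmetic can be checked with a short Python sketch. The jury and televote ranks are the ones Albania gave in Table 1; the rank-to-points scale is the 12, 10, 8 to 1 allocation described earlier, and the function name is my own.

```python
# Points awarded for ranks 1 to 10; lower ranks receive nothing.
POINTS_BY_RANK = {1: 12, 2: 10, 3: 8, 4: 7, 5: 6, 6: 5, 7: 4, 8: 3, 9: 2, 10: 1}

# (jury rank, televote rank) given by Albania, taken from Table 1.
ALBANIA_RANKS = {
    "Australia": (1, 1), "Italy": (3, 2), "Russia": (4, 4),
    "Bulgaria": (7, 3), "France": (2, 11), "Ukraine": (12, 5),
    "Spain": (5, 23), "Poland": (14, 6), "United Kingdom": (6, 18),
    "Lithuania": (20, 7), "Sweden": (11, 8), "Armenia": (15, 9),
    "Israel": (8, 17), "Hungary": (10, 10), "Malta": (9, 16),
    "Austria": (16, 13), "Azerbaijan": (13, 19), "Germany": (18, 15),
    "Cyprus": (23, 12), "Latvia": (17, 21), "Belgium": (25, 14),
    "Croatia": (21, 20), "Czech Republic": (22, 22), "Serbia": (19, 26),
    "The Netherlands": (24, 24), "Georgia": (26, 25),
}


def new_method_points(jury_rank, televote_rank):
    """Jury points and televote points are awarded separately, then added."""
    return POINTS_BY_RANK.get(jury_rank, 0) + POINTS_BY_RANK.get(televote_rank, 0)


points = {
    country: new_method_points(jury, televote)
    for country, (jury, televote) in ALBANIA_RANKS.items()
}
scoring_countries = [country for country, p in points.items() if p > 0]

print(points["Australia"], points["France"], points["Ukraine"])
print(len(scoring_countries))
```

Because the two sets of points are awarded independently, the number of countries receiving something can exceed ten, as it does here.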

Under the old system, the sums of the ranks assigned are sorted from smallest to biggest, with the order for tied sums being decided in favour of the song that received more viewer votes. Under this system, Sweden just misses out on receiving a point because Lithuania received more viewer votes in Albania.

Taking this into account, what would have happened if no changes had been made to the presentation of the preferences of the 42 different countries and the same calculation method used last year had been used again?

Table 2 shows that under the old method the winners would be Australia, followed by Ukraine, with Russia in third place. The major winner under the new system (other than Ukraine of course) is Poland, moving from what would have been nineteenth place under the old system to eighth place under the new system.
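The old-method combination can be sketched in the same way. This is a minimal Python illustration of the rule as described here (sort by sum of ranks, break ties by the better televote rank, award points to the first ten); the twelve entries and their ranks are hypothetical and are not taken from any real vote.

```python
# Points for combined positions 1 to 10.
POINTS_SCALE = [12, 10, 8, 7, 6, 5, 4, 3, 2, 1]


def old_method_points(ranks):
    """ranks: entry -> (jury rank, televote rank).
    Entries are ordered by the sum of their two ranks; ties go to the
    entry with the better (lower) televote rank."""
    ordered = sorted(
        ranks,
        key=lambda entry: (ranks[entry][0] + ranks[entry][1], ranks[entry][1]),
    )
    points = {entry: 0 for entry in ranks}
    for entry, pts in zip(ordered, POINTS_SCALE):
        points[entry] = pts
    return points


# Hypothetical ranks for twelve entries, for illustration only.
toy_ranks = {
    "A": (1, 1), "B": (2, 3), "C": (3, 2), "D": (4, 4),
    "E": (5, 6), "F": (6, 5), "G": (7, 7), "H": (8, 8),
    "I": (9, 9), "J": (10, 12), "K": (12, 10), "L": (11, 11),
}

points = old_method_points(toy_ranks)
print(points)
```

Entries J, K and L all have a rank sum of 22, but only K takes the last point because it did best in the televote, which is the same kind of near miss as described above.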

Table 2: New (actual) results against Old results

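| Country | Points: New Method (2016) | Points: Old Method (2009-2015) | Rank: New Method (2016) | Rank: Old Method (2009-2015) |
|---|---|---|---|---|
| Ukraine | 534 | 288 | 1 | 2 |
| Australia | 511 | 333 | 2 | 1 |
| Russia | 491 | 242 | 3 | 3 |
| Bulgaria | 307 | 187 | 4 | 4 |
| Sweden | 261 | 164 | 5 | 6 |
| France | 257 | 171 | 6 | 5 |
| Armenia | 249 | 146 | 7 | 7 |
| Poland | 229 | 49 | 8 | 19 |
| Lithuania | 200 | 108 | 9 | 8 |
| Belgium | 181 | 95 | 10 | 9 |
| The Netherlands | 153 | 81 | 11 | 10 |
| Malta | 153 | 24 | 12 | 23 |
| Austria | 151 | 73 | 13 | 12 |
| Israel | 135 | 28 | 14 | 22 |
| Latvia | 132 | 81 | 15 | 11 |
| Italy | 124 | 70 | 16 | 13 |
| Azerbaijan | 117 | 55 | 17 | 17 |
| Serbia | 115 | 61 | 18 | 15 |
| Hungary | 108 | 63 | 19 | 14 |
| Georgia | 104 | 59 | 20 | 16 |
| Cyprus | 96 | 55 | 21 | 18 |
| Spain | 77 | 39 | 22 | 20 |
| Croatia | 73 | 24 | 23 | 24 |
| United Kingdom | 62 | 30 | 24 | 21 |
| Czech Republic | 41 | 2 | 25 | 26 |
| Germany | 11 | 8 | 26 | 25 |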

The different results between the two methods can be explained by considering the difference between the jury and telephone votes as per Table 3.

Table 3: Comparison of Jury and Televote points allocation

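| Country | Televote Points | Jury Points | Difference in Points |
|---|---|---|---|
| Ukraine | 323 | 211 | 112 |
| Australia | 191 | 320 | -129 |
| Russia | 361 | 130 | 231 |
| Bulgaria | 180 | 127 | 53 |
| Sweden | 139 | 122 | 17 |
| France | 109 | 148 | -39 |
| Armenia | 134 | 115 | 19 |
| Poland | 222 | 7 | 215 |
| Lithuania | 96 | 104 | -8 |
| Belgium | 51 | 130 | -79 |
| Malta | 16 | 137 | -121 |
| The Netherlands | 39 | 114 | -75 |
| Austria | 120 | 31 | 89 |
| Israel | 11 | 124 | -113 |
| Latvia | 63 | 69 | -6 |
| Italy | 34 | 90 | -56 |
| Azerbaijan | 73 | 44 | 29 |
| Serbia | 80 | 35 | 45 |
| Hungary | 56 | 52 | 4 |
| Georgia | 24 | 80 | -56 |
| Cyprus | 53 | 43 | 10 |
| Spain | 10 | 67 | -57 |
| Croatia | 33 | 40 | -7 |
| United Kingdom | 8 | 54 | -46 |
| Czech Republic | 0 | 41 | -41 |
| Germany | 10 | 1 | 9 |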

There is an apparent wide disparity between the total televote points and the total jury points. Poland received 222 of their 229 points from the televote. The poor performance of the United Kingdom in the televote is in line with previous years, but this was more clearly highlighted on the night by the new method of presenting the results.

The old calculation method allows an entry that is considered average by both the jury and the televote to receive some points, but an entry that is considered very poor by one or the other would be very unlikely to obtain points. The new calculation method allows an entry that is considered to be amongst the poorest by either the jury or the televote to receive substantial points from the other. The pronounced difference between the points allocated to Poland by the jury and by the televote is highlighted in Figure 2.

Figure 2: Top 8 finishers – ordered by final finishing position.

The Australian entry was a solo female singing a moderate tempo ballad. Traditionally this is considered a safe entry and similar entries have done very well in the past. Under the old method every country except for Montenegro would have awarded Australia some points, whereas Ukraine and Russia would each have failed to receive points from 5 countries.

Most of the juries rated Poland very low, but Poland amassed a large total due to the televote, receiving points from every country as illustrated in Figure 2. Under the old method, the televote and jury vote would have been averaged resulting in a more modest score from each country to Poland.

It is surprising to note that Poland have never won Eurovision. With a strong televote secured due to diaspora across Europe, all Poland has to do is provide a song that will also appease the jury vote, and they will certainly be a favourite for victory under this new system.

If the old calculation method were to be applied to the final, it would be applied to the semi-finals as well. If different countries were to qualify for the final, this would inevitably impact the final voting. The change in voting system did not affect which countries qualified from the semi-finals on this occasion, although it did make some minor differences in the order. Sorry Ireland (and Westlife) fans: you still would not have qualified!

The choice between the old calculation method and the new calculation method highlights the perils of applying rank based approaches. There are many occasions where a group of judges are asked to rank items, but there is no optimum solution for the combination of their ranks. Rank based approaches do not give an indication of the extent of the difference between any two consecutive ranks. In any event, the ranking applied by each judge or individual is fundamentally subjective.

When assessing whether two distributions are equal, the standard test when the observations are paired is the Wilcoxon Signed Rank test. In this example the observations are paired by country. The results of a Wilcoxon Signed Rank test comparing the distribution of the ranks for the old method with the distribution of the ranks for the new system show that the distributions for the two methods do not differ (Z=-1.500, p=0.134). Similarly, a Wilcoxon Signed Rank test comparing the distribution of the points awarded by the televote with the points awarded by the juries shows that the distributions do not differ (Z=-0.546, p=0.585).

These results run counter to the suggestion that the jury and televote opinions show a wide disparity. The standard test for comparing equality of distributions is therefore not beyond scrutiny. The mean (and median) rank from the jury is fixed by design to be equal to the mean (and median) rank of the televote. In addition, the mean (and median) number of points awarded by both the jury and the televote is also fixed by design. This highlights that the Wilcoxon Signed Rank test is less powerful when the measures of central location are equal for both groups. In fact, if the televote were to rank the countries in the complete opposite order to the jury vote, the test would result in a p-value of 1.000.

This peculiarity could be detected by also calculating the correlation between the two ranks – complete opposite ranks would have a correlation coefficient of -1.
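A small pure-Python sketch makes the point concrete. With a jury ranking of 1 to 26 and a televote ranking in exactly the opposite order (hypothetical rankings, chosen to be the extreme case), the positive and negative signed-rank sums of the paired differences balance exactly, so a paired Wilcoxon-type test sees no difference, while the rank correlation is -1. The helper functions are my own and the correlation formula assumes no tied ranks.

```python
def signed_rank_sums(x, y):
    """Sum of ranks of |x - y| for positive and for negative differences,
    using average ranks where absolute differences tie."""
    diffs = sorted((a - b for a, b in zip(x, y) if a != b), key=abs)
    w_plus = w_minus = 0.0
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for d in diffs[i:j]:
            if d > 0:
                w_plus += average_rank
            else:
                w_minus += average_rank
        i = j
    return w_plus, w_minus


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


jury = list(range(1, 27))      # ranks 1..26
televote = jury[::-1]          # complete opposite order

w_plus, w_minus = signed_rank_sums(jury, televote)
rho = spearman(jury, televote)
print(w_plus, w_minus, rho)
```

Equal signed-rank sums are the least extreme result the test can give, so the test alone cannot distinguish total disagreement from total agreement; the correlation can.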

It is difficult to say whether one calculation method is fairer than the other calculation method. The apparent reason behind the reintroduction of the jury vote for use within the old system was to try and nullify the effect of the diaspora and geographical block televote. However, the new method may not be as effective in achieving this goal.

The new method is more transparent: it is clear whether the votes are coming from the jury or the televote. In addition, a country is guaranteed to be rewarded if it is liked by either. Certainly the new format for the voting resulted in a dramatic finale, so the producers are likely to favour this method for future contests.

Even with clear instructions as to how to use the ranking system, problems such as those faced by Denmark’s “Juror B” may arise, where she ranked the countries in reverse order in error [3].

This could have been detected by noting the negative correlation between the ranks given by Juror B and those given by the other four Danish jurors. The juror repeated the same procedural error in both the semi-final voting stage and during the final. A simple check using correlations could have detected the problem after the semi-final and allowed a reminder to be given about the correct procedures, without disclosing ranks given by other jury members.
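A check of this kind might look like the following Python sketch. The five jurors and their ranks for six songs are invented for illustration (they are not the Danish jury’s ranks), and the rule of flagging any juror whose ranks correlate negatively with the average of the other jurors is my own assumption about how such a check could work.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def flag_reversed_jurors(jury_ranks):
    """Return jurors whose ranks correlate negatively with the mean
    ranks given by the remaining jurors."""
    flagged = []
    for juror, ranks in jury_ranks.items():
        others = [r for name, r in jury_ranks.items() if name != juror]
        mean_of_others = [sum(col) / len(col) for col in zip(*others)]
        if pearson(ranks, mean_of_others) < 0:
            flagged.append(juror)
    return flagged


# Hypothetical ranks for six songs; juror "B" has ranked in reverse.
jury_ranks = {
    "A": [1, 2, 3, 4, 5, 6],
    "B": [6, 5, 4, 3, 2, 1],
    "C": [2, 1, 3, 4, 6, 5],
    "D": [1, 3, 2, 4, 5, 6],
    "E": [1, 2, 4, 3, 5, 6],
}

flagged = flag_reversed_jurors(jury_ranks)
print(flagged)
```

Only the list of flagged jurors needs to be reported, so no juror sees the ranks given by anyone else.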

In the final, if Danish Juror B had specified her ranks correctly then Australia should have received 12 rather than 10 points from the Danish jury, whereas Ukraine would have received 0 points from the Danish jury (rather than 12 points). This would have made the final points tally even closer. Under the old method, Australia’s margin of victory would have increased.

The number 13 may also have proved to be unlucky for Australia: this position in the running order meant they performed in the first half of the contest. Ukraine and Russia both received positions in the second half of the running order. All but 4 winners in the 21st century have appeared in the second half of the contest. In fact, Australia beat Ukraine in their semi-final, in which they both performed in the second half.

Historically, analyses of the Eurovision song contest voting focus on largely geographical voting blocs. Ukraine is part of the established former Soviet bloc. As relative newcomers to the contest and located outside Europe, Australia are not yet part of an established bloc vote. Eastern Europe and Western Europe appear to be divided over which entry was their favourite, see Figure 3.

Figure 3: Points differential between the points awarded to Ukraine over Australia

Countries in the former Soviet bloc and in closest proximity to Ukraine generally gave more points to the entry from Ukraine than that of Australia. The Scandinavian bloc however was much more favourable to Australia than Ukraine. If Juror B had voted correctly Denmark would have been more obviously part of this bloc.

The Ukrainian entry was considered by some commentators to have a political tone with respect to the Russian and Ukrainian dispute over Crimea, possibly explaining why Russia awarded Ukraine a lower proportion of its total points than usual. However, with its clever staging similar to the 2015 winner, Russia was the pre-contest favourite to win. Political reasons may have encouraged many other countries to vote for Ukraine. This in turn may have further contributed to victory being snatched from Australia.

It appears that everything was conspiring against Australia, but is it fair to conclude that Australia was robbed? The aspects that went against Australia can be explained as factors that are to be expected in a subjective competition. Australia can claim to have been unfortunate in being randomly drawn in the first half of the running order. The new calculation method had been made available well in advance of the contest. Ultimately, the fact that some countries rated Ukraine more favourably than Australia may be down to cultural tastes.

References:

Posted in Brexit, politics

## A lucky immigrant? Immediate post Brexit thoughts

I’m a lucky immigrant. I’m skilled and my skills are transferable from one country to another.  However, the atmosphere in England towards immigrants has noticeably (for me) changed during the very long EU referendum campaign.  The focus on controlling immigration has made me feel uncomfortable in my immigrant status for the first time since I moved to the UK in 2009.  As an Irish citizen I had a vote in this referendum and I used it, to no avail.  I also, according to the majority of Leave campaigners, will not have any changes made to my right to live and work in the UK.

Furthermore, during the campaign Michael Gove actively promoted rejecting expert opinion. This anti-intellectualism is dangerous. Leave campaigners are now saying that we shouldn’t trigger Article 50 immediately and that “during the campaign we all said things…”. The consequences were known and they were highlighted by many, yet voters in England and Wales voted in sufficient numbers to override the decisions made by the majorities of the electorate in Scotland and Northern Ireland. I’m tempted to invoke a claim that all those areas that voted against the EU should be among the first to lose EU income so that they can see just how much more pro-active the EU is on allocating resources to regions. For too long, the UK government and press have used the EU as a lazy scapegoat rather than taking responsibility for their actions. However, if this rule were to be invoked, then people and worthwhile activities would be affected by unforeseen consequences, much like David Cameron not expecting to have a majority government and thus not being able to blame the lack of a referendum on a junior coalition partner. The reality that many people are only now waking up to is a bigger shock to the UK market than the 2008 financial crisis or the 1992 ERM crisis, and it is no outside entity’s fault.

Sitting MPs knowingly repeating lies despite the errors being pointed out should be treated as bringing the House into disrepute. For the first time since moving to the UK, I started looking for jobs outside the UK system when the results became obvious overnight. Today, and especially last night, has not been a good one for me.

Posted in Uncategorized

## More or Less

Today my first interview on BBC Radio 4 aired – as part of the Friday afternoon programme “More or Less”: available here

I became involved in this when a producer contacted the Royal Statistical Society, which passed the request on to the Statistical Ambassadors, looking for a volunteer.

What started out as a very simple listener question from a Scottish resident, “What’s my chance of being called to do jury service?”, threw up many different quirks of the Scottish system. In order to simplify the problem, I first decided to look at the probability of receiving a citation (the equivalent of a summons), because this part of the process could be treated as essentially random.

The Scottish Courts service helpfully provided data on the number of citations issued and also the number of jury trials in Scotland, leading to the first quirk of the Scottish system: they have 15 rather than 12 member juries.  From this we could work out the probability of being cited from the Scottish electoral register [which contains some people who are ineligible for service].

I used a Poisson distribution to model the probability of receiving 0 (zero) citations in a year.  For ease, I assumed that this rate was approximately constant over the range of years under investigation.  This may, in reality, be a bit of a stretch for the 53 years of typical eligibility, but the listener in question had only 9 more years before he could opt out for age reasons.  I also assumed that the chance of receiving a citation is independent year-on-year (although eligibility is definitely not independent).  I also assumed that the number of trials in a catchment area was approximately proportional to the number of people on the electoral register – again, this simplification had to be made as that was all of the available data – anything deeper would have been beyond the scope of a general audience radio programme.
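As a rough illustration of the calculation described above, here is a minimal Python sketch.  It is not the code used for the programme.  The function names are my own, and the citation and register counts in the usage lines are made-up placeholders rather than the Scottish Courts figures.  It assumes, as above, a constant annual rate and independence between years, so the probability of zero citations over n years is exp(-rate × n).

```python
import math


def citation_rate(citations_issued: float, register_size: float) -> float:
    """Expected citations per person per year, treating citations as
    spread at random over everyone on the electoral register."""
    if register_size <= 0:
        raise ValueError("register_size must be positive")
    if citations_issued < 0:
        raise ValueError("citations_issued must be non-negative")
    return citations_issued / register_size


def prob_no_citation(rate_per_year: float, years: int = 1) -> float:
    """Poisson probability of receiving zero citations over `years` years,
    assuming a constant rate and independent years."""
    if rate_per_year < 0 or years < 0:
        raise ValueError("rate_per_year and years must be non-negative")
    return math.exp(-rate_per_year * years)


def prob_at_least_one_citation(rate_per_year: float, years: int = 1) -> float:
    """Complement of prob_no_citation; expm1 keeps precision for small rates."""
    if rate_per_year < 0 or years < 0:
        raise ValueError("rate_per_year and years must be non-negative")
    return -math.expm1(-rate_per_year * years)


if __name__ == "__main__":
    # PLACEHOLDER inputs for illustration only: NOT the real Scottish data.
    rate = citation_rate(citations_issued=1_000, register_size=100_000)
    # 9 years (the listener's remaining window) and 53 years (typical
    # eligibility) are the two horizons mentioned in the text.
    for horizon in (1, 9, 53):
        print(horizon, prob_at_least_one_citation(rate, horizon))
```

Swapping in the real citation and register counts would give the per-year rate; the two horizon values then show how much the answer depends on how long you remain eligible.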

Last year, only 13% of those who were cited actually served on a jury in Scotland.

Because you are exempted from jury service for a period after being balloted for service (and for even longer if you actually serve on a jury), looking at the number of times a person can serve on a jury is far more complex.

As for the experience of recording the programme: all of my interactions were with a producer (who was lovely) – many emails, several phone calls and then a trip for me into BBC Bristol to record my thoughts on a decent ISDN line.  The recording took about 25 minutes in total, partially because some new questions popped up during the recording, meaning that I ran some calculations on the spot!  These were condensed into about a minute on the radio.  Because of the format of the show, and the fact that I was prerecorded, it wasn’t nearly as stressful as I’m sure other media experiences can be [I wasn’t there to argue or provide balance against another person].

Listening back to myself was a strange experience – I definitely moderated my voice to ensure that my accent was less pronounced, and I also spoke much more deliberately than usual.  Perhaps this was because I was conscious that More or Less goes out to quite an international audience (it is also broadcast on the World Service).

Posted in Uncategorized

## Final Count of Seats: who won what?

The Longford-Westmeath results are in, so we can now look at the geographic breakdown of the final votes and seats in the Irish General Election:

Beginning with the most important: who won seats where?

In the next two plots, the size of each pie chart is proportional to the number of seats being contested (so Dún Laoghaire was adjusted to only have 3 out of 4 seats contested as the Ceann Comhairle is returned automatically).  Renua won no seats (so the colour orange is slightly redundant in these legends!).

Dublin: Labour won no seats south of the Liffey, but the Greens won two.  Social Democrats won one seat in Dublin; Fine Gael won at least one seat in every Dublin constituency (and two in Dún Laoghaire and Dublin Bay South).  Fianna Fáil improved their lot over 2011 when they won a single seat in Dublin – now they have 6 seats in Dublin.  PBP-AAA won 5 seats in total in Dublin while Sinn Féin won 7.  Independents also won 7 seats in Dublin.

Interesting constituencies include Tipperary, where 3 out of 5 seats were won by Independent candidates, and Roscommon-Galway, where 2 out of 3 seats were won by Independents.

Fine Gael managed to return at least one TD in all of the other constituencies (barring the two with a majority of independent seats), with two returned in Wexford, Kilkenny-Carlow, Limerick County, Clare, Galway-West, Mayo, Wicklow, Meath-East and Louth.

Fianna Fáil returned at least one TD in all non-Dublin constituencies, with two in Cork South-Central, Cork North-West, Kilkenny-Carlow, Kildare South, Kildare North, Cavan-Monaghan, Sligo-Leitrim and Mayo.

Sinn Féin won two seats in Louth and one in each of Donegal, Sligo-Leitrim, Cavan-Monaghan, Meath West, Offaly, Laois, Wicklow, Kilkenny-Carlow, Limerick City, Kerry, Cork North-Central, Cork South-Central, Cork East and Waterford.