plotProportions: Plot Proportions of Tied Matches and Non-tied Matches Won

Description

Plot proportions of tied matches and non-tied matches won by the first player, within matches binned by the relative player ability, as expressed by the probability that the first player wins, given the match is not a tie. Add fitted lines for each set of matches, as given by the generalized Davidson model.

Usage

plotProportions(
  win,
  tie = NULL,
  loss,
  player1,
  player2,
  abilities = NULL,
  home.adv = NULL,
  tie.max = NULL,
  tie.scale = NULL,
  tie.mode = NULL,
  at.home1 = NULL,
  at.home2 = NULL,
  data = NULL,
  subset = NULL,
  bin.size = 20,
  xlab = "P(player1 wins | not a tie)",
  ylab = "Proportion",
  legend = NULL,
  col = 1:2,
  ...
)

Value

A list of data frames:

win: a data frame comprising prop.win, the proportion of non-tied matches won by the first player in each bin, and bin.win, the mid-point of each bin.
tie: (when ties are present) a data frame comprising prop.tie, the proportion of tied matches in each bin, and bin.tie, the mid-point of each bin.

Arguments

win: a logical vector: TRUE if player1 wins, FALSE otherwise.
tie: a logical vector: TRUE if the outcome is a tie, FALSE otherwise (NULL if there are no ties).
loss: a logical vector: TRUE if player1 loses, FALSE otherwise.
player1: an ID factor specifying the first player in each contest, with the same set of levels as player2.
player2: an ID factor specifying the second player in each contest, with the same set of levels as player1.
abilities: the fitted abilities from a generalized Davidson model (or a Bradley-Terry model).
home.adv: if applicable, the fitted home advantage parameter from a generalized Davidson model (or a Bradley-Terry model).
tie.max: the fitted parameter from a generalized Davidson model corresponding to the maximum tie probability.
tie.scale: if applicable, the fitted parameter from a generalized Davidson model corresponding to the scale of dependence of the tie probability on the probability that player1 wins, given the outcome is not a tie.
tie.mode: if applicable, the fitted parameter from a generalized Davidson model corresponding to the location of maximum tie probability, in terms of the probability that player1 wins, given the outcome is not a tie.
at.home1: a logical vector: TRUE if player1 is at home, FALSE otherwise.
at.home2: a logical vector: TRUE if player2 is at home, FALSE otherwise.
data: an optional data frame providing variables required by the model, with one observation per match.
subset: an optional logical or numeric vector specifying a subset of observations to include in the plot.
bin.size: the approximate number of matches in each bin.
xlab: the label to use for the x-axis.
ylab: the label to use for the y-axis.
legend: text to use for the legend.
col: a vector specifying colours to use for the proportion of non-tied matches won and the proportion of tied matches.
...: further arguments passed to plot.

Author

Heather Turner

Details

If home.adv is specified, the results are re-ordered if necessary so that the home player comes first; any matches played on neutral ground are omitted.

First, the probability that the first player wins, given that the match is not a tie, is computed as $$expit(home.adv + abilities[player1] - abilities[player2])$$ where home.adv and abilities are parameters from a generalized Davidson model that have been estimated on the log scale.
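As a minimal sketch of this first step, the conditional win probability can be reproduced in base R with plogis(), which is the inverse logit (expit). The abilities, home advantage and player factors below are hypothetical values chosen for illustration, not output from a fitted model.

```r
# Hypothetical log-scale abilities and home advantage (illustration only)
abilities <- c(A = 0.5, B = 0, C = -0.3)
home.adv <- 0.2

# Three hypothetical matches; both factors share the same levels
player1 <- factor(c("A", "B", "C"), levels = names(abilities))
player2 <- factor(c("B", "C", "A"), levels = names(abilities))

# P(player1 wins | not a tie) = expit(home.adv + ability1 - ability2)
p <- unname(plogis(home.adv +
                   abilities[as.character(player1)] -
                   abilities[as.character(player2)]))
print(p)
```

Indexing by as.character() looks abilities up by name, so the result does not depend on the order of the factor levels.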

The matches are then binned according to this probability, grouping together matches with similar relative ability between the first player and the second player. Within each bin, the proportion of tied matches is computed and these proportions are plotted against the mid-point of the bin. Then the bins are re-computed omitting the tied games and the proportion of non-tied matches won by the first player is found and plotted against the new mid-point.
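The binning step might look roughly like the sketch below. It assumes bins are formed from quantiles of the probability so that each holds about bin.size matches, and that the mid-point is the centre of each bin's range. The function's own binning rule may differ. The helper bin_props() and the simulated data are hypothetical.

```r
# Hypothetical helper: bin matches by p and compute the proportion of y per bin
bin_props <- function(p, y, bin.size = 20) {
  n.bins <- max(1, round(length(p) / bin.size))
  # Quantile-based breaks give roughly equal numbers of matches per bin
  breaks <- quantile(p, seq(0, 1, length.out = n.bins + 1), names = FALSE)
  bin <- cut(p, breaks, include.lowest = TRUE)
  data.frame(prop = as.vector(tapply(y, bin, mean)),
             mid = (head(breaks, -1) + tail(breaks, -1)) / 2)
}

# Simulated matches (illustration only)
set.seed(1)
n <- 200
p <- runif(n)              # P(player1 wins | not a tie)
tie <- runif(n) < 0.25     # TRUE if the match was tied
win <- !tie & runif(n) < p # TRUE if player1 won

# Proportion of ties per bin, using all matches
tie.df <- bin_props(p, tie)

# Bins re-computed on non-tied matches only, then proportion won
win.df <- bin_props(p[!tie], win[!tie])
```

Because the second call drops the tied matches before binning, win.df generally has fewer bins than tie.df, with different mid-points.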

Finally, curves are added for the probability of a tie and the conditional probability of a win given the match is not a tie, under a generalized Davidson model with parameters as specified by tie.max, tie.scale and tie.mode.
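To give a feel for the shape of these curves, the sketch below uses the standard Davidson model, not the generalized form. In that model the tie probability, written in terms of p = P(player1 wins | not a tie), is nu * sqrt(p(1 - p)) / (1 + nu * sqrt(p(1 - p))). The value of nu is hypothetical, and the mapping from tie.max, tie.scale and tie.mode to the generalized curve is not reproduced here.

```r
# Hypothetical tie parameter of a standard Davidson model (illustration only)
nu <- 0.8

# Grid of conditional win probabilities, P(player1 wins | not a tie)
p <- seq(0.01, 0.99, length.out = 99)

# Standard Davidson tie probability as a function of p
g <- sqrt(p * (1 - p))
p.tie <- nu * g / (1 + nu * g)

# The conditional win probability plotted against itself is the identity line
p.win <- p

# The tie curve peaks for evenly matched players (p = 0.5)
print(p[which.max(p.tie)])
print(max(p.tie))
```

Under this special case the peak tie probability is nu / (2 + nu) and always sits at p = 0.5. The extra parameters of the generalized model allow the peak to move and the curve to widen or narrow.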

The function can also be used to plot the proportions of wins along with the fitted probability of a win under the Bradley-Terry model.

Examples

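## load the package providing BTm, BTabilities, plotProportions and the example data
library(BradleyTerry2)

#### A Bradley-Terry example using icehockey data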

## Fit the standard Bradley-Terry model, ignoring home advantage
standardBT <- BTm(outcome = result,
                  player1 = visitor, player2 = opponent,
                  id = "team", data = icehockey)

## comparing teams on a "level playing field"
levelBT <- BTm(result,
               data.frame(team = visitor, home.ice = 0),
               data.frame(team = opponent, home.ice = home.ice),
               ~ team + home.ice,
               id = "team", data = icehockey)

## compare fit to observed proportion won
## exclude tied matches as not explicitly modelled here
par(mfrow = c(1, 2))
plotProportions(win = result == 1, loss = result == 0,
                player1 = visitor, player2 = opponent,
                abilities = BTabilities(standardBT)[,1],
                data = icehockey, subset = result != 0.5,
                main = "Without home advantage")

plotProportions(win = result == 1, loss = result == 0,
                player1 = visitor, player2 = opponent,
                home.adv = coef(levelBT)["home.ice"],
                at.home1 = 0, at.home2 = home.ice,
                abilities = BTabilities(levelBT)[,1],
                data = icehockey, subset = result != 0.5,
                main = "With home advantage")

#### A generalized Davidson example using football data
if (require(gnm)) {

    ## subset to first and last season for illustration
    football <- subset(football, season %in% c("2008-9", "2012-13"))

    ## convert to trinomial counts
    football.tri <- expandCategorical(football, "result", idvar = "match")

    ## add variable to indicate whether team playing at home
    football.tri$at.home <- !logical(nrow(football.tri))

    ## fit Davidson model
    Dav <- gnm(count ~ GenDavidson(result == 1, result == 0, result == -1,
                                   home:season, away:season, home.adv = ~1,
                                   tie.max = ~1,
                                   at.home1 = at.home,
                                   at.home2 = !at.home) - 1,
               eliminate = match, family = poisson, data = football.tri)

    ## fit shifted & scaled Davidson model
    shifScalDav <- gnm(count ~
        GenDavidson(result == 1, result == 0, result == -1,
                    home:season, away:season, home.adv = ~1,
                    tie.max = ~1, tie.scale = ~1, tie.mode = ~1,
                    at.home1 = at.home,
                    at.home2 = !at.home) - 1,
        eliminate = match, family = poisson, data = football.tri)

    ## diagnostic plots
    main <- c("Davidson", "Shifted & Scaled Davidson")
    mod <- list(Dav, shifScalDav)
    names(mod) <- main
    alpha <- names(coef(Dav)[-(1:2)])

    ## use football.tri data so that at.home can be found,
    ## but restrict to actual match results
    par(mfrow = c(1,2))
    for (i in 1:2) {
        coef <- parameters(mod[[i]])
        plotProportions(result == 1, result == 0, result == -1,
                        home:season, away:season,
                        abilities = coef[alpha],
                        home.adv = coef["home.adv"],
                        tie.max = coef["tie.max"],
                        tie.scale = coef["tie.scale"],
                        tie.mode = coef["tie.mode"],
                        at.home1 = at.home,
                        at.home2 = !at.home,
                        main = main[i],
                        data = football.tri, subset = count == 1)
    }
}