2017 EOMMAnEngagementOptimizedMatchm

(Chen, Xue, et al., 2017) ⇒ Zhengxing Chen, Su Xue, John Kolen, Navid Aghdaie, Kazi A. Zaman, Yizhou Sun, and Magy Seif El-Nasr. (2017). “EOMM: An Engagement Optimized Matchmaking Framework.” In: Proceedings of the 26th International Conference on World Wide Web. ISBN:978-1-4503-4913-0 doi:10.1145/3038912.3052559

Subject Headings: Video Game Matchmaking, Player Engagement Prediction, Engagement Optimized Matchmaking (EOMM), Game Outcome Prediction.

Notes

Cited By

Quotes

Abstract

Matchmaking connects multiple players to participate in online player-versus-player games. Current matchmaking systems depend on a single core strategy: create fair games at all times. These systems pair similarly skilled players on the assumption that a fair game is best player experience. We will demonstrate, however, that this intuitive assumption sometimes fails and that matchmaking based on fairness is not optimal for engagement.

In this paper, we propose an Engagement Optimized Matchmaking (EOMM) framework that maximizes overall player engagement. We prove that equal-skill based matchmaking is a special case of EOMM on a highly simplified assumption that rarely holds in reality. Our simulation on real data from a popular game made by Electronic Arts, Inc. (EA) supports our theoretical results, showing significant improvement in enhancing player engagement compared to existing matchmaking methods.

1. INTRODUCTION

Player-versus-Player (PvP) is a mode of video game in which multiple players directly engage in competition or combat. PvP games, which cover many popular genres, such as multiplayer on- line battle arena (MOBA), first-person shooting (FPS), and e-Sports, have increased worldwide popularity in recent years. For example, League of Legends, one of the most played MOBA games, has 90 million summoner names registered, 27 million unique daily players and 7.5 million concurrent users [30, 41]. As data released by [39] shows, e-Sports is estimated to have 188 million viewer- ship and 748 million dollar worth market in 2015 and the numbers are expected to grow continuously.

Matchmaking is the process that connects players to form PvP matches. In practice, a matchmaking system takes practical lim- itations, such as players’ geo-location and network latency, into consideration. For example, cross-ocean pairing is not good for player experience. Beyond technical constraints, the strategy var- ious matchmaking systems employ is creating fair games. This strategy relies on the assumption that matching closely skilled players tend to create competitive games which are desired by players [21]. In order to establish player skills, numerous models have been studied, such as Elo [16], Glicko [20] and TrueSkill [23].

Are fairly matched games always beneficial for player experi- ence? This fundamental, yet intuitive, assumption is worthy of deep investigation. We can challenge it with a few examples. Consider a cautious player who cares about protecting his rank among friends, and a risk taker who enjoys difficult matches. Pairing them with the similarly skilled opponents will affect these players very differ- ently. Even for the same player, their expectation on the coming match when they just lost three games in a row can be very differ- ent from that when they recently performed well. In Table 1, we show an example that churn risks vary drastically upon players’ re- cent match outcomes in a popular PvP game made by EA. These facts lead to two key insights: (1) the effectiveness of matchmaking needs to be measured quantitatively; and (2) matchmaking should depend on dynamic and individual player states.

In this paper, we propose a new matchmaking framework, En- gagement Optimized Matchmaking (EOMM). By formulating matchmaking into an optimization problem, we pair players in or- der to maximize the overall player engagement, or equivalently, minimize the overall player disengagement. First, we measure a player’s disengagement by their churn risk after each matchmaking decision. Here churn refers to no gameplay within a period of time, such as a week. Second, we model all players who wait in the matchmaking pool as a complete graph, where each player is a node, and an edge between two players is their sum churn risks if paired. The churn risk depends on individual player states at the moment of matchmaking. Last, we can achieve engagement opti- mized matchmaking efficiently by solving a minimum weight per- fect matching (MWPM) problem that finds non-overlapping pairs with the minimal sum of edge weights on a complete graph.

Table1: Anexampleabouttheimpactofplayerstatesontheir engagement. Data is from a popular PvP game made by EA. Average churn risks vary drastically upon players’ recent three match outcomes ((W)in, (L)ose or (D)raw). Churn risk is mea- sured by the ratio of the players who stop playing within a pe- riod time (7 days in this table) after a match. The churn risk of some states with repeated losses (5.1%) is almost twice as much as those of other "safer" states (2.6%-2.7%).

EOMM provides a solid theoretical framework for matchmaking analysis. With it, we prove that equal-skill based matchmaking is a special case of EOMM on a simplified and often inapplicable assumption about player states. The generic EOMM instead proves to be optimal across a wide range of contexts.

For system development, EOMM is both flexible and computa- tionally feasible. The optimization objective can be tuned for var- ious interests, e.g., in-game time, or even spending. Furthermore, EOMM consists of three components: a skill model, a churn pre- diction model and a graph matching model. All can be efficiently implemented and independently upgraded. We built a simulated system based on real data of a popular game made by Electronic Arts, Inc. (EA), showing significant improvement in enhancing player engagement by EOMM against equal-skill based and other matchmaking methods.

In sum, this paper contains the following contributions. First, we propose an engagement optimized matchmaking framework, i.e., EOMM, which solves matchmaking as an optimization problem of maximizing the overall player engagement. Second, we provide theoretical analysis about the optimality of EOMM and the condi- tions of the applicability of existing matchmaking methods. Last, we build a simulated system using real game data to show signif- icant advantages of EOMM in retaining players over the existing matchmaking methods.

The rest of this paper is organized as follows. After reviewing the related work, we will present the formulation of matchmaking as an optimization problem on a graph. Then we describe theoretical findings comparing EOMM and other matchmaking methods. We then show the case study applying EOMM on real data. Finally, we will conclude with a discussion of the results and future directions.

2. PREVIOUS WORK

2.1 Skill Modeling

The motivation behind skill rating is to rank players and to enable skill-based matchmaking. Dating back to 1952, the Bradley-Terry model [7] was developed to deal with repeated pairwise compar- isons among a group of subjects. In the Bradley-Terry model, a player i is assumed to have a fixed, positive skill scalar, ri, and the winning probability of player i against player j is the ratio of player i’s skill in the sum of skills of both players. In its original form, the Bradley-Terry model estimates player skills only after observing all pairwise comparisons. While feasible for small groups of players, requiring O(n2) matches is prohibitive for large player pools. One can show that the Bradley-Terry model is equivalent to a logistic regression model [2] in which each coefficient wi corresponds to log(ri ).

The Elo system [16] addresses the relative skill ratings in player- versus-player games, such as chess, with a probabilistic model. Elo captures player performance, pi, as a random variable following a one dimension Gaussian distribution with a mean, ri, and a fixed variance, β2, shared by all players. In the Elo system, ri gets up- dated depending on the extent of agreement between expected out- comes and real outcomes. For example, a low skill player beating a high skill player yields a large update in adjusting their skill means closer. Unlike the original Bradley-Terry model, ri can be updated at an ongoing basis, i.e., as soon as after every match of player i. The Glicko system [20], a Bayesian ranking rating system, was later introduced. Besides mean player skill, ri, it also models the belief about a player’s skill as RDi (rating deviation). As they play an increasingly number of games, the belief about their skills become stronger hence RDi decreases. However, RDi increases when a player ceases to play for long time. To achieve high effi- ciency, Glicko uses an approximation Bayesian algorithm to update ri and RDi.

Neither the Bradley-Terry model, the Elo system or the Glicko system was initially applicable to team-oriented games until works such as [23, 24, 29] to generalize these models. For example, TrueSkill system [23] extends the Elo system to games with flex- ible numbers of players and teams.

Researchers have proposed more advanced skill models to capture player skills in multiple facets. The works in [8, 38] model player skills in multi-dimensions such as offensive and defensive abilities. Delalleau et al. [10] proposed a neural network based skill model which learns latent skill embeddings of players and is claimed to outperform TrueSkill in a team based game. There are skill models proposed for specific game genres, such as [11] for chess, [9, 40] for MOBA games and for [3] RTS games.

In our paper, we will compare EOMM with skill based match- making methods which leverage skill models. Skill models can also facilitate EOMM in the decision of player assignment.

2.2 Matchmaking Strategies

There has been much research ensuring physical criteria of match- making services such as network connection quality [1, 27, 28]. Besides physical criteria, matchmaking can be seen as a player modeling technique [45] that extracts player information and delivers adaptive gaming experience [33]. A fair amount of matchmaking systems assume that skill balanced games are good for engage- ment [21] and hence resort to skill rating algorithms to identifying similarly skilled opponents. Mys ́lak and Deja [32] suggests addi- tional information about player preferences in in-game avatar roles can further improve fairness-based matchmaking systems. A few researchers have explored methods to improve player engagement through matchmaking. Delalleau et al. [10] proposed to train a neural network based architecture which predicts player enjoyment based on their historical statistics. They measured enjoyment by directly asking players for feedback after each match. Whether or not this method can effectively collect sufficient feedbacks has not been demonstrated with real data. Jiménez-Rodrıguez et al. [25] proposed that matchmaking could be based on preferred roles by players. They argue that a fun match should have players act in roles with perceivably joyful role distribution. However, it is still a conceptual, heuristic-based method without experiment showing that such matchmaking system indeed improves concrete engage- ment metrics. To our best knowledge, we have not seen any existing matchmaking method that formally treats matchmaking as an optimization problem to maximize player engagement.

 Last 3 Outcomes
 Churn Risk
DLW | LLW | LDW | DDD ...
WWW

...

DLL | LWL | LDL WWL
LLL
       2.6% - 2.7% ... 3.7%

... 4.6% - 4.7% 4.9% 5.1%

             2. 2.1

2.3 Player Engagement Prediction

Player engagement can be seen as an objective measurement of user experience in games [6]. Player engagement can be embodied by many specific metrics, such as time or money spent in the game, the number of matches played within a time window, or churn risk. We define churn risk as the proportion of total players stopping playing the game over a period of time. Churn prediction has been applied within various disciplines for decades, such as telecommunications [17], online advertisements [46] and insurance [31]. Video games have also sparked a num- ber of churn analysis studies. For instance, Weber et al. [43] built a regression model to predict the number of games played. They also used the model to aid game design by identifying the most in- fluential features on player retention. Hadiji et al. [22] established the fundamental in churn prediction in free-to-play (F2P) games by suggesting definitions of various churn behaviors, proposing universal behavioral telemetry, and comparing different machine learning models across five commercial F2P games. Runge et al. [37] not only trained a churn prediction model for a casual social game but also showed how the game can leverage the model to in- crease the effectiveness of promotions to players. In EOMM, we employ churn prediction as an important building component in engagement optimization.

2.4 Graph Matching

In a graph G = (V,E), a matching is a set of pairwise non- adjacent edges [44]; that is, no two edges share a common vertex. A perfect matching is a matching with every vertex in G incident on exactly one edge in the matching. In a weighted graph G, a mini- mum weight matching (MWM) is the matching with the lowest sum of edge weights. A minimum weight perfect matching (MWPM) is the perfect matching with the lowest sum of edge weights.

As will be shown in Section 3, the EOMM framework converts the problem of determining optimal match assignment to the prob- lem of seeking MWPM on a weighted graph. MWM/MWPM have broad applications in other fields, including creating pairs following specific rules in chess tournaments [34], schdeduling training sessions among NASA shuttle cockpit simulators [4] and transmit- ting images over networks [36]. In a similar spirit, Ólafsson [34] leverages MWPM algorithm to determine opponents. Their goal, however, was to create matches maximally adhering specific rules of chess tournament, which is different than ours to optimize for player engagement.

The first attempt to solve MWPM is the polynomial time blos- som algorithm proposed by Edmond [14, 15] in 1965. Since then, researchers have steadily improved upon this algorithm. We will compare and discuss those improved methods later when we introduce EOMM.

Player pool 𝒫
Complete graph 𝒢
           Engagement Optimized Matchmaking (EOMM)
ENGAGEMENT OPTIMIZED
MATCHMAKING

Figure 1: Model matchmaking on a complete graph. Each node represents a player, and every edge is associated with the sum engagement metric of two players if paired. EOMM amounts to finding an optimal pair assignment on G.

3.1 Optimization Objective

In practice, matchmaking is applied to a pool of players, P = {p1,··· ,pN}, who are waiting to start 1-vs-1 matches. We assume N to be an even number such that all players can be paired. The objective of EOMM is to maximize the overall player engage- ment, or equivalently, minimize the overall player disengagement. We use churn risk as a concrete metric of disengagement. The term “churn” is used by convention, which actually represents a status of disengagement, i.e., a player not playing any games within a subsequent time frame, not necessarily a permanent churn. We denote the churn risk of player pi after matchmaking with player pj as ci,j , which is a function of both players’ states, i.e., ci,j = Pr(pi churns|si , sj ) = c(si , sj ). A player state is a collection of features that profile an individual player, including but not lim- ited to install date, skill, play frequency, performance and etc. We will elaborate on learning ci,j in the subsequent sections. Note that ci,j ̸= cj,i since two players in a paired match may be impacted differently. We use a list of player tuples, M = {(pi , pj )}, to denote a matchmaking result, i.e., a pair assignment, in which all players in P are paired once and only once. Defining the overall player disengagement as the sum of individual churn risks, EOMM seeks for an optimal pair assignment M∗ such that: M∗ = argmin 􏰂 c(si,sj)+c(sj,si) (1) M (pi ,pj )∈M We construct a graph, G, to model this environment (see Fig- ure 1). Each player pi is a node of the graph, who has a player state, si, before matchmaking. The edge between two players pi and pj is associated with a weight ci,j + cj,i , which is the expected sum disengagement metric if they are paired. Note that G is a com- plete graph in that all pairs of players can be possibly connected. Once all ci,j are computed, finding M∗ in Eqn. 1 is converted to a minimum weight perfect matching problem, i.e., finding a pair assignment with the minimal sum weights of edges on graph G.

3.2 Predicting Churn Risks

We learn the function ci,j = c(si,sj) as a churn prediction problem. In its original form, the churn risk ci,j of player pi af- ter matchmaking depends on the states from both the player and their opponent. Unfortunately, the well-established churn predic- tion studies cannot be employed here because they only use features of players themselves without considering those of opponents. Also, naively feeding both player states as input will double the

In this section, we will introduce the EOMM framework which formulates matchmaking as an optimization problem. In contrast with the existing matchmaking methods that heuristically pair sim- ilarly skilled co-players, EOMM aims to match players in an opti- mal way that maximizes overall player engagement. Here we will describe the details of match assignment for 1-vs-1 games. We will discuss how EOMM can be extended to the matches with more players in the final section.

feature dimension, which makes the prediction unintelligible and harder since much more training data is needed. One way to simplify the prediction of ci,j is to base it only on player pi ’s own state, si , and the resulting match outcome, oi,j , from the view of pi. This works because the opponent’s state, sj, such as skill, play history and style, does not directly interact with player pi’s churn risk ci,j. It, however, influences the upcoming match outcome, which is directly perceivable by player pi and thus affects pi’s churn. Once the match outcome oi,j is known, ci,j be- comes conditionally independent to the opponent’s state, sj . For- mally, this property is represented as:

assignment, M∗ , on a complete graph, G , which has the minimal sum weights of edges. c(si, sj , oi,j ) = c(si, oi,j ) (3)

In this paper, we assume that game outcomes are sampled from a finiteset,O,suchasWin,LoseandDraw.Forexample,oi,j =W means that pi wins over pj, while oj,i = L represents the same outcome from the view of pj . To predict game outcomes, we em- ploy the standard skill models [16, 20] that are widely adopted in the video game industry. These models use both players’ skills, which are a proxy of their entire player states, as the input for the prediction. We denote player pi ’s skill representation as μi , which is, for example, Elo score [16] or Glicko mean and RD [20]. Note that μi is part of player state si. As a result, we have:

Pr(oi,j |si , sj ) ≈ Pr(oi,j |μi , μj ), (4) Putting them together, we can efficiently predict the churn risks of paired players in Eqn. 1: c(si, sj ) + c(sj , si) (5) = 􏰂 oi,j ∈O ≈ 􏰂 oi,j ∈O Pr(oi,j|si,sj)(c(si,sj,oi,j) + c(sj,si,oj,i)) (6) Pr(oi,j|μ ,μ )(c(si,oi,j) + c(sj,oj,i)), ij (7)

where the first equality is a marginalization on game outcome, oi,j . In the approximate equality, the conditional independence of ci,j on sj given oi,j (Eqn. 3) and the game outcome prediction (Eqn. 4) are used. Now c(si , oi,j ) can be efficiently learned based on any preferred churn prediction model. The input features are the updated player state based on the predicted game outcome of the hypothetical matchmaking, i.e., supdate ← si and oi,j . We can decompose i the original player state as si = [oKi ,sˆi], where oKi is a vector of the latest K game outcomes (for example, oKi = LWLDL when K = 5), and sˆi represent the rest of features in si. If pi is hypothetically matched with pj , si will be updated as:

supdate ←si and oi,j (8) i = [oKi , sˆi] and oi,j (9)

also updated after the new match. For example, the total number of games played increments by one. 3.3 Finding the Optimal Pair Assignment Given the predicted churn risks of each pair of players, i.e., the weight of every edge in G, EOMM reduces to a minimum weight perfect matching (MWPM) problem. The goal is to find a pair = [oK+1, sˆupdate] ii (10) We use sˆupdate to indicate that non-game-outcome features are i For a graph with N node, the brute-force way is to exhaustedly 􏰀􏰁N compare all N /2 2 possible pair assignments and find the best N/2

 Pr(pi churns|si , sj , oi,j ) = Pr(pi churns|si , oi,j ), which can be written in a concise form:

(2)

one, but the time complexity is too high to be feasible in practical systems. Fortunately, many polynomial time algorithms exist for the MWPM problem. For example, several algorithms can solve the problem in the worst time complexity O(N3) [18, 26]. If en- gagement measurements are pure integers, there exists a slightly faster algorithm [19] with running time O(N2 3 log K) where K is the largest magnitude of an edge weight. There also exist greedy algorithms, such as [12] and [13], with faster running time to find suboptimal solutions. Moreover, MWPM can be solved in parallel as proposed by [35].

4. THEORETICAL FINDINGS

Besides generating optimal matchmaking assignments, EOMM provides a framework to conduct theoretical analysis on other match- making related problems. We use this framework to compare EOMM with other matchmaking strategies under different hypo- thetical situations to obtain many insights. Without loss of gener- ality, we focus our discussion on 1-vs-1 games with possible game outcomes sampled from Win, Lose and Draw.

Using the same notation in Section 3, we investigate a pair of players pi,pj ∈ P,i ̸= j. When c(si,sj) = c(oi,j), i.e., a player’s churn risk only depends on the game outcome of the up- coming match, regardless of all other states. This simplification for Eqn. 8, where supdate only considers oi,j but ignores si, has i interesting implications.

• If c(W in) + c(Lose) > 2 · c(Draw), i.e., the sum churn risk of two matched players in a tied game is lower than that in a non-tied game. Under this circumstance, the equal-skill based matchmaking is equivalent to EOMM, as both strive to form matches with Draw outcomes as many as possible. This explains the intuition and popularity behind equal-skill matchmaking. But we should be very aware of its conditional applicability, while EOMM is instead always optimal.

• If c(Win) + c(Lose) < 2 · c(Draw), equal-skill based matchmaking is actually worst among all matchmaking schemes, as its goal to create close matches contrarily mini- mizes the overall player engagement. Although this situation contradicts with the common intuition that fair matches are good, it is possible for a real game. Therefore validating the assumptions with real game data is critical before applying an equal-skill based matchmaking algorithm.

When c(si , sj ) = c(si ), i.e., a player’s churn risk is determined by his state before matchmaking, then it does not matter whom they will play. In this case, EOMM can do no better than a ran- dom matchmaking. Random matchmaking, from this perspective, is not as trivial as we thought. It is a relative safe and stable base- line choice in lack of prior information. While equal-skill based method can perform the worst under certain conditions, random matchmaking will never fall into the worst case. The analysis above shows that the existing matchmaking meth- ods, such as equal-skill based and random matching, arise within the EOMM framework on different conditions. Practitioners can safely apply EOMM while gathering more information about their game and players.

Figure 2: Predicted win probability vs. real win probability. Real win probability is the ratio of matches, with similar pre- dicted winning probabilities, whose outcomes are real “Wins”.

5. CASE STUDY

To test the proposed matchmaking framework, we ran simulation which is configured based on the real data from a popular PvP game made by Electronic Arts Inc. (EA). In the simulation, we compared different matchmaking methods applied to the same player popu- lation. In the end, EOMM retained significantly higher number of players than other matchmaking methods.

5.1 Data Collection

We collected 1-vs-1 matches from a popular game made by EA. There are three possible match outcomes, namely Win, Lose and Draw. In total, we collected 36.9 million matches played by 1.68 million unique players in the first half of 2016.

5.2 Preparation

To create a realistic environment for simulation, the following models and functions are needed. We compute them based on real game data. Player Skills We need to establish a distribution of player skills for the population we simulate on. The distribution is learned from real game data. We sorted the collected real matches temporally and applied Glicko [20] to compute each player’s final skill. For each player i, the skill vector is represented by mean ri and vari- anceRDi,i.e.μi =(ri,RDi).Insimulation,weassumethatthe game and player skills are stationary. The population’s skill distri- bution is constant, where each player’s skill does not change any more over time. While Glicko scores can be used to estimate the winning prob- ability of player i over player j, Pr(i > j|μi,μj), they cannot provide the probability of draws. We defined a set of rules to allow the estimation of win/lose/draw probabilities from Glicko scores:

Pr∗(i = j) = 20% (11)
80% · Pr(i > j|μi,μj)
Pr(i > j|μi,μj) + Pr(j > i|μj,μi)

Figure 3: Predicted churn risk vs. real churn risk. Real churn risk is the ratio of matches, with similar predicted churn risks, which are indeed the last match before churn.

Basically, the draw probability (Eqn. 11) is set to 20% regardless of skill gaps. This is based on our findings that 1) draw outcomes only have −0.05 correlation with the difference of skill means in the collected game data; 2) around 20% matches are draws re- gardless of skill gaps. The win/lose probabilities are normalized such that the probabilities of win, lose and draw sum up to 1. Fig- ure 2 shows that the predicted win probabilities using Glicko scores based on our rules are well aligned with the real match outcomes. Churn Prediction Model We trained a logistic regression model for predicting whether a player will be an eight-hour churner after a match. The input features describe the upcoming match and the player’s 10 most recent matches. A player is labeled as an eight- hour churner if they do not play any 1-vs-1 match within the next eight hours after playing this match. As discussed in Section 3, the term of “churn” is used by convention. It represents “stopping playing” within a period of time, which is a metric of disengage- ment. We use Eqn. 7 to estimate c(si , sj ) + c(sj , si ). The model takes as input the player’s state si before matchmaking along with the upcoming match outcome oi,j .

Specifically, the input features consist of:

• Each of the player’s 10 most recent matches: win/lose/draw status, time passage since the previous match, time passage to the upcoming match, and goal difference against his opponent

• Upcomingmatch:one-hotencodingoftheupcomingmatch’s outcome win/lose/draw

• Other: the number of 1-vs-1 matches played in the last eight hours, one day, one week and one month.

We use 5-fold cross validation and grid search to determine the proper L2 regularization strength when training the model. The predicted probabilities are well aligned with the real churn prob- abilities, in particular when churn risk is less than 0.8, as shown in Figure 3. While the performance of the predictive model still has room to improve, the flexibility of EOMM allows one to easily refine or replace the model if better ones are found.

Pr∗(i > j) = Pr∗(i < j) = 1−Pr∗(i = j)−Pr∗(i > j) (13) (12)

Player States In simulation, each player’s state is sampled from a collection of states, which are established based on real players’ states in the collected data. We first randomly sample a subset of matches. Both players’ states in those matches are gathered to cre- ate this collection. A player state contains the needed features for churn prediction, as well as the player’s skill score.

5.3 Simulation Procedure

In the simulation, we compared EOMM with three matchmaking mechanisms: random matchmaking (RandomMM), which ran- domly pairs available players in the waiting pool, skill-based match- making (SkillMM), which pairs every two consecutive players after sorting them by skills, and worst matchmaking (WorstMM), which does the opposite of EOMM by minimizing the objective func- tion of EOMM. SkillMM always seeks “fair games”. We added WorstMM as a validation.

All methods are applied on the same population (waiting pool), where the same player skill distribution, churn model and player state distribution as described in Section 5.2 are used. EOMM fol- lows Eqn. 7 to estimate churn risk c(si , sj ) + c(sj , si ). We used the perfect matching algorithm [18, 26] implemented by an open- source library [42]. For each matchmaking method M, the procedure within each round of simulation is as follows:

. Create a waiting pool of P players, whose player states are sampled from the player state collection.
. Use M to determine the pair assignment (matchmaking).
. Simulate match outcomes according to the win/lose/draw probability predicted by the skill model
. For each player, simulate if he will churn according to the predicted churn probability by churn model.
. Record the number of retained players.

In experiments, we tested P = 100, 200, 300, 400 and 500. For each setting of P, we repeated the simulation by 10,000 rounds of matchmaking. We compare different matchmaking methods by the average number of their retained players per round, i.e., the players who continue playing in the next eight hours. In order to test statistical significance, we conducted Welch’s t-test between every pair of the matchmaking algorithms.

5.4 Results and Discussion

The results are shown in Table 2. All pairwise differences of re- tained players are statistically significant (p-value < 0.01) except EOMM vs. RandomMM (when P = 100) and SkillMM vs. Ran- domMM (when P = 400). In all other scenarios, EOMM outper- forms the other three matchmaking methods. The results prove the applicability of EOMM to act as an engagement optimizer. When P = 100, EOMM does not retain a significantly higher number of players than RandomMM, and even retains fewer players than SkillMM. It is possibly because that when P is small, the random- ness has higher impact, and also, the room for arranging opponents is smaller. More rounds of simulations might be needed to show significance in this case.

The improvement of EOMM over SkillMM, the most common matchmaking method, in terms of the average number of retained players are 0.3%, 0.9%, 1.1%, and 0.6% when P = 200, 300, 400 and 500 respectively. On average EOMM retains 0.7% more player compared with SkillMM after one round of matchmaking. No- tably the benefit of retention will accumulate over time in a con- stant population. For players who play 20 rounds of matchmaking

Table 2: Average number of retained players per round of matchmaking simulation. 10,000 rounds of matchmaking were simulated.

Method P=100 P=200 P=300 P=400

P=500 258.21 259.24 259.65 261.19

WorstMM 51.50 SkillMM 52.52 RandomMM 51.81 EOMM 51.90

103.39 154.57 206.65 103.96 156.05 207.43 103.97 156.09 207.09 104.24 157.50 209.37

   games within eight hours, there will be 15% more players retained (1.00720 ≈ 1.15) by EOMM over those by SkillMM. The more rounds of matchmaking are conducted, the more significant is the accumulative advantage of EOMM in engagement.

We did not find a consistent climb in retention boost as P in- creased. This may suggest that when the player pool reaches certain size, the choices of opponents are enough to rescue those players on the edge of churn. Beyond this size, a larger player pool may not bring in significantly extra benefits in engagement maximization.

As a validation, WorstMM consistently retains the fewest players in the pools of all sizes. This result verifies the optimum of EOMM from the opposite side. It is also interesting to note that SkillMM does not consistently outperform RandomMM, which is aligned with our discussion in the theoretical findings in Section 4, that is, balanced matches are not always optimal for engagement.

6. CONCLUSION

The paper presents a novel framework to achieve engagement optimized matchmaking (EOMM). It formulates matchmaking as a problem of maximizing the player engagement, and solves the optimization efficiently. EOMM employs three components, a skill model, an engagement predictive model and a minimum weight perfect matching algorithm, each of which can be tailored flexibly for specific applications. We ran simulations whose configurations were based on real data from an online PvP game. The results show that EOMM significantly outperforms all other methods in the number of retained players. EOMM also provides a theoretical framework to analyze various matchmaking algorithms. EOMM provides a measurable and flexible matchmaking frame- work. It has well-defined quantitative objectives that can be mon- itored, evaluated and improved. Within the EOMM framework, the core building components, skill model, churn model and graph pairing model, are uncoupled so that they can be tuned and replaced independently. Moreover, we can even change the objective func- tion to other core game metrics of interest, such as play time, re- tention, or spending. EOMM allows one to easily plug in different types of predictive models to achieve the optimization.

So far we have discussed EOMM in 1-vs-1 game scenarios. This framework also applies to PvP games that involve teams of players, where every component needs to be extended with additional care. The skill model can be simply applied to a team by adding up skills for all team members [23]. For churn prediction, we can use the same idea that one player’s churn risk is conditionally independent with other players’ states given that their influence on the player’s own state, such as the game outcome, is known. Last, the minimum weight perfect matching algorithms for pairs are no longer applica- ble. Instead of a pair assignment, we seek a grouping assignment on a complete graph. A related area to investigate is perfect matching in hypergraphs [5], where an edge can connect more than two vertices. Furthermore, EOMM is not even limited to games. In broad applications, such as friend connection in a social network and 1-on-1 tutoring in online education, EOMM’s formulation and optimization techniques still apply.

In the future, we expect EOMM equipped with more advanced models, such as skill model and churn model, can have higher op- timal bound. We will explore EOMM performance in more realis- tic situations, where practical restrictions are applied, such as net- work connectivity, regions and friend/black lists. More restrictions would result in fewer edges in the constructed graph of EOMM. Last, we will explore EOMM in multi-player games with more than two players involved and efficient algorithms analogous to perfect matching algorithms within hypergraphs.

References

1. Sharad Agarwal, Jacob R. Lorch, Matchmaking for Online Games and Other Latency-sensitive P2P Systems, ACM SIGCOMM Computer Communication Review, v.39 n.4, October 2009 doi:10.1145/1594977.1592605
2. A. Agresti and M. Kateri. Categorical Data Analysis. Springer, 2011.
3. Tetske Avontuur, Pieter Spronck, Menno Van Zaanen, Player Skill Modeling in Starcraft II, Proceedings of the Ninth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, October 14-18, 2013, Boston, MA, USA
4. C. E. Bell. Weighted Matching with Vertex Weights: An Application to Scheduling Training Sessions in NASA Space Shuttle Cockpit Simulators. European Journal of Operational Research, 73(3):443--449, 1994.
5. C. Berge. Hypergraphs: Combinatorics of Finite Sets, Volume 45. Elsevier, 1984.
6. Regina Bernhaupt, Evaluating User Experience in Games: Concepts and Methods, Springer Publishing Company, Incorporated, 2010
7. R. A. Bradley and M. E. Terry. Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons. Biometrika, 39(3/4):324--345, 1952.
8. Shuo Chen, Thorsten Joachims, Modeling Intransitivity in Matchup and Comparison Data, Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, February 22-25, 2016, San Francisco, California, USA doi:10.1145/2835776.2835787
9. Z. Chen, Y. Sun, M. Seif El-Nasr, and T.-H. D. Nguyen. Player Skill Decomposition in Multiplayer Online Battle Arenas. In Meaningful Play, 2016.
10. O. Delalleau, E. Contal, E. Thibodeau-Laufer, R. C. Ferrari, Y. Bengio, and F. Zhang. Beyond Skill Rating: Advanced Matchmaking in Ghost Recon Online. IEEE Transactions on Computational Intelligence and AI in Games, 4(3):167--177, Sep 2012.
11. G. Di Fatta, G. M. Haworth, and K. W. Regan. Skill Rating by Bayesian Inference. In IEEE Symposium on Computational Intelligence and Data Mining (CIDM), Pages 89--94. IEEE, 2009.
12. Doratha E. Drake, Stefan Hougardy, A Simple Approximation Algorithm for the Weighted Matching Problem, Information Processing Letters, v.85 n.4, p.211-213, 28 February 2003 doi:10.1016/S0020-0190(02)00393-9
13. Ran Duan, Seth Pettie, Linear-Time Approximation for Maximum Weight Matching, Journal of the ACM (JACM), v.61 n.1, p.1-23, January 2014 doi:10.1145/2529989
14. J. Edmonds. Maximum Matching and a Polyhedron with 0, L-vertices. Journal of Research of the National Bureau of Standards, 69(1965):125--130, 1965.
15. J. Edmonds. Paths, Trees, and Flowers. Canadian Journal of Mathematics, 17(3):449--467, 1965.
16. A. E. Elo. The Rating of Chessplayers, Past and Present. Arco Pub., 1978.
17. J. Ferreira, M. B. Vellasco, M. A. C. Pacheco, R. Carlos, and H. Barbosa. Data Mining Techniques on the Evaluation of Wireless Churn. In European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Pages 483--488, 2004.
18. Harold Neil Gabow, Implementation of Algorithms for Maximum Matching on Nonbipartite Graphs., Stanford University, Stanford, CA, 1974
19. Harold N. Gabow, A Scaling Algorithm for Weighted Matching on General Graphs, Proceedings of the 26th Annual Symposium on Foundations of Computer Science, p.90-100, October 21-23, 1985 doi:10.1109/SFCS.1985.3
20. M. E. Glickman. Parameter Estimation in Large Dynamic Paired Comparison Experiments. Applied Statistics, Pages 377--394, 1999.
21. T. Graepel and R. Herbrich. Ranking and Matchmaking. Game Developer Magazine, 25:34, 2006.
22. F. Hadiji, R. Sifa, A. Drachen, C. Thurau, K. Kersting, and C. Bauckhage. Predicting Player Churn in the Wild. In IEEE Conference on Computational Intelligence and Games (CIG), Pages 1--8. IEEE, 2014.
23. Ralf Herbrich, Tom Minka, Thore Graepel, TrueSkill™: A Bayesian Skill Rating System, Proceedings of the 19th International Conference on Neural Information Processing Systems, p.569-576, December 04-07, 2006, Canada
24. Tzu-Kuo Huang, Chih-Jen Lin, Ruby C. Weng, A Generalized Bradley-Terry Model: From Group Competition to Individual Skill, Proceedings of the 17th International Conference on Neural Information Processing Systems, p.601-608, December 01, 2004, Vancouver, British Columbia, Canada
25. J. Jiménez-Rodrıguez, G. Jiménez-Dıaz, and B. Dıaz-Agudo. Matchmaking and Case-based Recommendations. In 19th International Conference on Case Based Reasoning, 2011.
26. E. L. Lawler. Combinatorial Optimization: Networks and Matroids. Courier Corporation, 2001.
27. Youngki Lee, Sharad Agarwal, Chris Butcher, Jitu Padhye, Measurement and Estimation of Network QoS Among Peer Xbox 360 Game Players, Proceedings of the 9th International Conference on Passive and Active Network Measurement, April 29-30, 2008, Cleveland, OH, USA
28. Justin Manweiler, Sharad Agarwal, Ming Zhang, Romit Roy Choudhury, Paramvir Bahl, Switchboard: A Matchmaking System for Multiplayer Mobile Games, Proceedings of the 9th International Conference on Mobile Systems, Applications, and Services, June 28-July 01, 2011, Bethesda, Maryland, USA doi:10.1145/1999995.2000003
29. Joshua E. Menke, Tony R. Martinez, A Bradley–Terry Artificial Neural Network Model for Individual Ratings in Group Competitions, Neural Computing and Applications, v.17 n.2, p.175-186, February 2008 doi:10.1007/s00521-006-0080-8
30. M. Minotti. Comparing MOBAs: League of Legends Vs. Dota 2 Vs. Smite Vs. Heroes of the Storm. http://venturebeat.com/2015/07/15/comparing-mobas-league-of-legends-vs-dota-2-vs-smite-vs-heroes-of-the-storm/. Online; Accessed May, 2016.
31. Katharina Morik, Hanna Köpcke, Analysing Customer Churn in Insurance Data: A Case Study, Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, p.325-336, September 20-24, 2004, Pisa, Italy
32. M. Myslak and D. Deja. Developing Game-structure Sensitive Matchmaking System for Massive-multiplayer Online Games. In Social Informatics, Pages 200--208. Springer, 2014.
33. T.-H. D. Nguyen, Z. Chen, and M. S. El-Nasr. Analytics-based AI Techniques for Better Gaming Experience, Volume 2 of Game AI Pro. CRC Press, Boca Raton, Florida, 2015.
34. S. Ólafsson. Weighted Matching in Chess Tournaments. Journal of the Operational Research Society, 41(1):17--24, 1990.
35. Osiakwan, Akl, The Maximum Weight Perfect Matching Problem for Complete Weighted Graphs is in PC, Proceedings of the 1990 IEEE Second Symposium on Parallel and Distributed Processing, p.880-887, December 02-05, 1990 doi:10.1109/SPDp.1990.143503
36. E. A. Riskin, R. Ladner, Ren-Yuh Wang, L. E. Atlas, Index Assignment for Progressive Transmission of Full-search Vector Quantization, IEEE Transactions on Image Processing, v.3 n.3, p.307-312, May 1994 doi:10.1109/83.287025
37. J. Runge, P. Gao, F. Garcin, and B. Faltings. Churn Prediction for High-value Players in Casual Social Games. In IEEE Conference on Computational Intelligence and Games, Pages 1--8. IEEE, 2014.
38. M. Stanescu. Rating Systems with Multiple Factors. In Master Thesis. School of Informatics, University of Edinburgh, 2011.
39. SuperData. Esports Market Brief: US Accounts for Almost Half of Total Viewership. Https://www.superdataresearch.com/blog/esports-brief/. Online; Accessed Mar, 2016.
40. Mirko Suznjevic, Maja Matijasevic, Jelena Konfic, Application Context based Algorithm for Player Skill Evaluation in MOBA Games, Proceedings of the 2015 International Workshop on Network and Systems Support for Games, p.1-6, December 03-04, 2015, Zagreb, Croatia
41. P. Tassi. Riot's 'League of Legends' Reveals Astonishing 27 Million Daily Players, 67 Million Monthly. http://www.forbes.com/sites/insertcoin/2014/01/27/riots-league-of-legends-reveals-astonishing-27-million-daily-players-67-million-monthly/#26ff8e543511. Online; Accessed May, 2016.
42. J. Van Rantwijk. Maximum Weighted Matching. http://jorisvr.nl/article/maximum-matching. Online; Accessed May, 2016.
43. B. G. Weber, M. John, M. Mateas, and A. Jhala. Modeling Player Retention in Madden NFL 11. In Innovative Applications of Artificial Intelligence (IAAI), 2011.
44. D. B. West. Introduction to Graph Theory. Prentice Hall, 2001.
45. G. Yannakakis, P. Spronck, D. Loiacono, and E. André. Player Modeling. In Artificial and Computational Intelligence in Games, Pages 45--59. 2013.
46. S. Yoon, J. Koehler, and A. Ghobarah. Prediction of Advertiser Churn for Google AdWords. In the Joint Statistical Meetings (JSM) Proceedings, 2010.

}};

	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2017 EOMMAnEngagementOptimizedMatchm	Yizhou Sun Su Xue John Kolen Navid Aghdaie Kazi A. Zaman Zhengxing Chen Magy Seif El-Nasr			EOMM: An Engagement Optimized Matchmaking Framework				10.1145/3038912.3052559		2017