Can Machine Learning Increase Your Win Rate In SMITE?

NOTE: The images in this post come from an old version of an interactive Jupyter notebook I made using ipywidgets; a newer version is available here. Unfortunately, you have to jump through some hoops to view or interact with it. UPDATE: You can now interact with the recommendation system as a Shiny app here.

Also, this post tries to be somewhat interesting to MOBA players, data scientists, and a more general audience, so sorry if there’s too much or too little detail!

A Unique Smite Item Recommendation System

MOBAs like Smite, League of Legends, and Dota2 have many (many) statistics websites, with recommended builds for all kinds of playstyles and situations. There seems to be a lack of information about interactions between characters, however. What items should I build when I’m an Apollo (Greek God of Music) against Ra (Egyptian Sun God), or Ymir (Norse Father of the Frost Giants) paired with Ao Kuang (Chinese Dragon King of the Eastern Seas)? Using Smite API data, I tried to apply some simple machine learning techniques to answer these questions and more, with the aim of helping newer or less competitive players gain some insight into what items to build.

Inspired and supported by David Jones and his Dota2 and League of Legends recommender. Check it out! Also, check out these papers about other machine learning efforts in MOBA games.


Smite Basics

Smite is a multiplayer online battle arena (MOBA), similar to League of Legends or Dota2. It’s played from a third-person perspective, and all the characters are gods from various pantheons. If you’ve never played a MOBA before, there’s a lot to know, but for the purposes of this post here are a few important things. The main competitive game mode, Conquest, is a 5v5 mode with relatively rigid roles assigned to each player on a team. We will also consider the Duel game mode, a 1v1 game mode on a smaller map. Players start the game with very little gold, gaining more over time for things like kills and assists, and use that gold to buy items that boost various attributes like damage or movement speed. Choosing what items to buy (with only six available slots) and when to buy them is a crucial component of success for skilled players.

Can machine learning tell us what items to buy?

Specifically, we want to know which items synergize well with other teammates, or are best against specific enemies.

Data Analysis

As a first run, I was able to collect data on about 70,000 ranked conquest matches over the last month or so. That may seem like a lot, but for any particular recommendation, I only care about the games that my selected gods actually played in, which ends up being only in the hundreds or thousands. As of this writing, there are 82 available gods, which means there are over six thousand partner-specific recommendations to be made.
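
To make the arithmetic concrete, here is a small sketch of where “over six thousand” comes from, plus how the match pool for one pairing might be filtered. The record layout (`god`, `allies`, `enemies`) and the helper name are assumptions for illustration, not the actual Smite API schema.

```python
# Roster size quoted in the post; everything else here is illustrative.
NUM_GODS = 82

# Ordered (my god, other god) pairings, excluding a god paired with itself.
ordered_pairings = NUM_GODS * (NUM_GODS - 1)
print(ordered_pairings)  # 6642


def matches_for_pairing(matches, my_god, other_god, side):
    """Return the matches where my_god played and other_god was on the given side.

    `side` is "allies" or "enemies"; the record layout is hypothetical.
    """
    return [m for m in matches if m["god"] == my_god and other_god in m[side]]
```

Filtering like this is why 70,000 matches shrink to hundreds or thousands for any single pairing.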

Now that I have the data, what kind of machine learning can I use? I explored using Naive Bayes probabilities to create a ranking system, calculating the probability of winning a match given the win rate of each item. Unfortunately, the “naive” part comes from the (in this case, preposterous) assumption that each item pick is completely independent of other items, allied and enemy gods, etc.
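
For the curious, a naive Bayes score of this kind could look roughly like the sketch below. It is a simplified illustration, not the notebook’s code: the record layout (`items`, `won`) and the smoothing constant are assumptions, and only the items actually in the build contribute a likelihood ratio.

```python
import math


def naive_bayes_win_log_odds(matches, build, smoothing=1.0):
    """Log-odds of winning given a build, treating items as independent given the outcome."""
    wins = [m for m in matches if m["won"]]
    losses = [m for m in matches if not m["won"]]
    # Prior log-odds from the overall win/loss counts (smoothed to avoid log(0)).
    log_odds = math.log((len(wins) + smoothing) / (len(losses) + smoothing))
    for item in build:
        # Smoothed estimates of P(item | win) and P(item | loss).
        p_win = (sum(item in m["items"] for m in wins) + smoothing) / (len(wins) + 2 * smoothing)
        p_loss = (sum(item in m["items"] for m in losses) + smoothing) / (len(losses) + 2 * smoothing)
        # The "naive" step: each item's evidence is simply added, ignoring all other items and gods.
        log_odds += math.log(p_win / p_loss)
    return log_odds
```

Because each item’s contribution is added independently, two items that are almost always bought together get counted twice, which is exactly the problem with the independence assumption.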

Instead, I chose to use logistic regression, trained to classify whether my god won the match or not, given the items they chose to buy and the gods they played with and against. Logistic regression is nice, compared to opaque methods like random forests or support vector machines, because it is immediately interpretable: the coefficients the regression spits out can be interpreted as a measure of how much having a certain item increases the log-odds of winning the game. The larger the number, the more having that item increases the chance of winning (although it’s not a linear relationship). Since I don’t have too much data, I also used L2 (or ridge) regularization, which helps with overfitting without driving too many coefficients to zero.
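
Here is a minimal, dependency-free sketch of L2-regularized logistic regression fitted by gradient descent, just to show what the coefficients are. It is not the code behind the post: the toy data, feature names, and hyperparameters are all invented, and in practice a library implementation (for example scikit-learn’s `LogisticRegression`, which applies an L2 penalty by default) would do the fitting.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_ridge_logistic(X, y, l2=1.0, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent with an L2 penalty.

    X is a list of 0/1 feature rows, y a list of 0/1 outcomes (1 = win).
    Returns (weights, bias); the bias is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - target
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        # Log-loss gradient plus the ridge term l2 * w, both scaled by 1/n.
        w = [wj - lr * (gj + l2 * wj) / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data, invented for illustration. Columns: [item_a, item_b, boots].
X = [[1, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1],
     [0, 1, 1], [0, 0, 1], [1, 0, 1], [0, 1, 1]]
y = [1, 1, 1, 0, 0, 1, 0, 0]

weights, bias = fit_ridge_logistic(X, y)
for name, weight in zip(["item_a", "item_b", "boots"], weights):
    print(f"{name}: {100 * weight:+.1f}")  # rating = coefficient x 100
```

In this toy set, item_a shows up mostly in wins and gets a positive coefficient, item_b shows up mostly in losses and gets a negative one, and the item everybody owns ends up with a coefficient near zero.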

I could have included other variables, like gold per minute for each character, which greatly increases the model’s ability to accurately predict winning a match or not (having more gold means you are likely winning, who knew?), but I’m more interested in the relationship between the item choices and a winning outcome, so I didn’t include them. I did include allied and enemy gods as features, so that if, say, Anubis almost always loses against Anhur, regardless of what items Anubis chooses, the model separates out the effect of playing against Anhur from the effect of buying Rod of Tahuti. Ideally, I should include interaction terms to capture the effect of buying the Rod when playing against an Anhur, but with over 80 items and gods to track, it gets unwieldy real quick.
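
As a rough picture of the feature layout described above, each match for my god can be turned into one row of 0/1 indicators: one block for items, one for allied gods, one for enemy gods. The function name and record layout below are assumptions for illustration.

```python
def encode_match(match, items, gods):
    """One-hot encode one player's match as item, ally-god, and enemy-god indicators."""
    row = [1 if item in match["items"] else 0 for item in items]
    row += [1 if god in match["allies"] else 0 for god in gods]
    row += [1 if god in match["enemies"] else 0 for god in gods]
    return row


def interaction_term_count(num_items, num_gods):
    """Extra columns needed for every item x enemy-god and item x ally-god interaction."""
    return 2 * num_items * num_gods
```

With roughly 80 items and 82 gods, `interaction_term_count(80, 82)` is 13,120 extra columns, which is far more than a few hundred matches per pairing can support.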


Building a Recommendation

You might think that the six items that our model believes are best at predicting winning must be the best items. Unfortunately, that’s not the case, for a few reasons. Let’s look at an example*:

* Screenshots are from a simple interactive dashboard I made in a Jupyter notebook using ipywidgets. An updated version can be found here. A Shiny app that you can interact with without jumping through any hoops can be found here.



The first table shows item information. Bolded items are recommended items that were not already one of the six most popular items. 
The second table shows relic recommendations. Relics are a special type of item that is available for free: one at the beginning of the game, and a second after level 12.

Let’s break this down. In the games where Anhur (Egyptian Slayer of Enemies) fought against Xbalanque (Mayan Hidden Jaguar Sun), Warrior Tabi was the most popular item chosen. Titan’s Bane is the item our model recommends the most, even when you normalize it to the cost of the item. The two most popular relics were Purification and Sanctuary, but our model recommends replacing Sanctuary with Sprint. The rating is just the coefficient from the logistic regression, multiplied by 100 for readability. Our model predicts that having Titan’s Bane increases the log-odds of winning by 0.12. Notice that the items are ordered by rating or by popularity, so this model tells us nothing about the best order in which to buy these items.
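
If log-odds feel abstract, this short worked example converts a 0.12 coefficient (a rating of 12) into an odds ratio and a win probability. The baseline win probabilities of 50% and 80% are assumptions picked only to show that the effect is not linear.

```python
import math

rating = 12                 # rating as displayed (coefficient x 100)
coefficient = rating / 100  # change in the log-odds of winning

# Owning the item multiplies the odds of winning by exp(coefficient).
odds_ratio = math.exp(coefficient)


def win_probability(base_probability, log_odds_shift):
    """Shift a baseline win probability by a log-odds amount."""
    base_log_odds = math.log(base_probability / (1 - base_probability))
    return 1 / (1 + math.exp(-(base_log_odds + log_odds_shift)))


print(round(odds_ratio, 3))                         # 1.127
print(round(win_probability(0.5, coefficient), 3))  # 0.53
print(round(win_probability(0.8, coefficient), 3))  # 0.819
```

The same coefficient is worth about three percentage points from an even game but less than two when you are already at 80%, which is the “not a linear relationship” caveat in practice.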

As a first pass, the “recommendation” is simply the most popular items, with the two “new” items with the highest rating/cost value replacing the two worst popular items. In this case, our model finds Executioner and Death’s Toll to be the worst popular items, and replaces them with Titan’s Bane and Wind Demon. Is Wind Demon really a better crit item than Executioner/Deathbringer? Maybe, in this particular matchup.
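
The swap rule is simple enough to sketch in a few lines. This is an illustration under assumptions, not the notebook’s code: the function name, the dictionaries of ratings and costs, and the placeholder item names are all hypothetical.

```python
def recommend(popular, candidates, ratings, costs, swaps=2):
    """Replace the `swaps` worst-rated popular items with the best new items by rating per gold."""
    # Popular items from best to worst rating; drop the worst `swaps` of them.
    kept = sorted(popular, key=lambda item: ratings[item], reverse=True)
    kept = kept[:len(popular) - swaps]
    # "New" items are candidates that are not already popular, ranked by rating / cost.
    new_items = [item for item in candidates if item not in popular]
    new_items.sort(key=lambda item: ratings[item] / costs[item], reverse=True)
    return kept + new_items[:swaps]
```

Note that the result comes back ordered by rating, not by purchase order, so it says nothing about what to buy first.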

Problem: Boots

Notice that Warrior Tabi has the next worst rating. Does that mean we should toss it? Absolutely not!! That item is part of a special class of items: boots. Everyone gets boots, usually as their first fully built item. They are fundamentally necessary for moving around the map quickly. Unfortunately, if everyone gets them, then they aren’t very good at predicting winning. That is, if winners and losers both have the item all the time, then it’s useless as a tool to know who will win. So the rating makes perfect sense: the item predicts neither winning nor losing. This sort of null result can show up for other items too (so long as the item is carried about equally often by winners and losers). Regardless, any good recommendation must include boots. I’m going to tweak the model to force it to return one boots recommendation, but swapping out only the two worst items usually does a decent job.

Now let’s look at the recommendations for Xbalanque (yes, you are mispronouncing it) when he’s up against Anhur:
They get similar recommendations because they are both part of the Hunter class (ADC role). However, in this case, two types of boots are popular: Warrior Tabi and Ninja Tabi. Lucky for us, the chart makes it clear that winners are more likely to use Ninja Tabi. However, replacing Qin’s Sais with Ichaival might or might not be the best idea, as before cost normalizing, the model prefers Qin’s Sais.

Problem: Luxury Items

Now, predicting whether a match was won is different from predicting whether picking up a specific item helped a team to win. In Smite, some items, known as “luxury items”, are very powerful, but also very expensive. They can have the win-more problem, where they are very good at helping you win, but only if you are already winning. They show up a lot in winning games, but may be a terrible choice if you are in the middle of a losing game.

As described in this paper, there are ways to try to sift through these causal effects. For example, with propensity score matching, you can tease out the effect of “game state” information (e.g. whether you were behind or ahead at different stages of the game), so that the rating more accurately reflects the true general effect that having an item has on winning. While messing around with Dota2 data, I was able to make some headway with this approach. Unfortunately, the Smite API lacks any sort of data from the beginning or middle of the game, except first blood and relic information.

So did we buy, say, Mantle of Discord because it will help us win, or does winning allow us to buy it? It’s hard to say, but using the cost-normalized rating does help mitigate the problem.



When Kumbhakarna (Hindu Sleeping Giant) teams up with Cupid (Roman God of Love), the luxury item Mantle of Discord shows up as the third most recommended item. But we can see that it drops below Winged Blade on the cost-normalized recommendation, because its value at predicting winning per gold spent just isn’t good enough. We’ll see more examples of this below.

You may have noticed that our final recommendation has two pairs of boots: Shoes of Focus and Shoes of the Magi. Oops. It’s clear that the model thinks Shoes of the Magi is better when teamed up with Cupid, but my naive find-and-replace recommendation system doesn’t handle that case well.
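
One possible fix for the double-boots problem is a post-processing step like the sketch below, which keeps exactly one pair of boots. This is only an idea under stated assumptions, not something the current system does: the function name is hypothetical and the ratings in the example are made up.

```python
def enforce_single_boots(build, boots, ratings):
    """Keep exactly one boots item in the build: the best-rated one available.

    If the build has several boots, only the best-rated one stays. If it has
    none, the best-rated boots overall replaces the build's worst-rated item.
    """
    owned = [item for item in build if item in boots]
    if owned:
        best = max(owned, key=lambda item: ratings[item])
        return [item for item in build if item not in boots or item == best]
    best = max(boots, key=lambda item: ratings[item])
    worst = min(build, key=lambda item: ratings[item])
    return [best if item == worst else item for item in build]


# Ratings below are invented for illustration only.
example_ratings = {"Shoes of Focus": 1.0, "Shoes of the Magi": 3.0, "Sovereignty": 5.0}
example_boots = {"Shoes of Focus", "Shoes of the Magi"}
print(enforce_single_boots(
    ["Shoes of Focus", "Shoes of the Magi", "Sovereignty"], example_boots, example_ratings
))
```

When duplicates are dropped the build comes back one item short, so the freed slot would still need to be filled with the next-best non-boots candidate.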

Problem: Starter Items

There is yet another subset of items: starter items. They are very cheap, powerful early-game items that become pretty useless by the end of the game. They are meant to be sold and replaced by a more powerful item, but often enough, a losing team member still has her starter item. We already saw this with Death’s Toll in the first example, and with Watcher’s Gift above. Guardians like Kumbhakarna are often behind on gold, especially if they are losing, so it makes sense that the losers are more likely to never get enough gold to replace their starter item with something better. Because starter items often jump to the top in the cost-normalized rating, I removed them. However, this doesn’t mean you shouldn’t get a starter item (you have to), only that this model doesn’t value starter items meaningfully, and you should probably get whatever the most popular starter is.

God Synergy

The model is more likely to find success when there are meaningful interactions between the two chosen players during a match. If Kali (Hindu Goddess of Destruction) rarely interacts with Bastet (Egyptian Goddess of Cats) during a match, the item recommendations will probably devolve to her standard build, which isn’t very interesting. For our next example, let’s look at a Guardian (support) paired with two different Hunters (ADC). Because of their predefined game roles, they will spend enough of the game together that buying items based on synergy could be very important (the same could be true for the Hunter vs Hunter matchup at the beginning).


We’ll compare the Kumbhakarna + Cupid pairing from above (reproduced here) with a Kumbhakarna + Hou Yi (Chinese Defender of the Earth) pairing.



We can see that Kumbhakarna’s most popular items are exactly the same for both pairings, suggesting that 1) it’s probably a pretty decent build, and 2) there is possibly a competitive advantage from synergy that players are missing out on. Other than the starter item Watcher’s Gift, the model clearly dislikes one other item in each pairing, but it is not the same item in both. Heartward Amulet gives nearby allies magical protections and mana regeneration, but it appears as though Cupid finds them more useful than Hou Yi does. We also see that in the Hou Yi pairing, the luxury item Mantle of Discord stays valuable, even considering its cost. More importantly, Spirit Robe and Mantle of Discord both build from the same item tree, suggesting that building their predecessor, Cloak of Concentration, might be a good idea that is often missed.

Measuring Success

What’s the best way to measure if these recommendations are successful?  Normally you would measure how successful your model is by how accurate it is. Our model is built to predict winning, and does so fairly accurately, but that doesn’t measure how good our item recommendations are. We could also easily make our model more accurate by adding in how much gold each character got by the end of the game. It turns out that’s a good indicator of winning. But that would probably make the item recommendations worse.

We can use win rates as a useful indicator of success (which is very similar to a naive Bayesian approach), but it’s not perfect. Again, a gold-normalized win rate would probably be more meaningful, although I’ll just stick to win rates here. I don’t have enough data to compare win rates between entire builds, but for a given item or pairing of items, we can compare them.

For example, in the Kumbhakarna + Cupid pairing, the two highest recommended items, Sovereignty and Void Stone, have a win rate 15.6% higher than the two best-rated popular items, Sovereignty and Hide of the Urchin. That’s a huge difference, but again, item use correlating with a win does not necessarily mean that the item helped cause the win! Maybe Void Stone gets picked up for that final push in situations when Kumbhakarna and Cupid are already winning.
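
Comparing item combinations this way only needs a win-rate helper like the sketch below. The record layout (`items`, `won`) is an assumption for illustration.

```python
def win_rate(matches, required_items):
    """Share of wins among matches whose build contains every item in required_items."""
    required = set(required_items)
    relevant = [m for m in matches if required <= set(m["items"])]
    if not relevant:
        return None  # no matches contain this combination
    return sum(1 for m in relevant if m["won"]) / len(relevant)


def win_rate_gap(matches, combo_a, combo_b):
    """Difference in win rate (combo_a minus combo_b), or None if either combo is unseen."""
    rate_a, rate_b = win_rate(matches, combo_a), win_rate(matches, combo_b)
    if rate_a is None or rate_b is None:
        return None
    return rate_a - rate_b
```

The fewer matches that contain a given combination, the noisier the gap, so it is worth checking the match count alongside the rate.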

Although relics are very situational, there are often meaningful increases in win rate relative to the most popular relics. In the first example, Anhur’s recommended relics Sprint and Purification have a win rate 8% higher than the combination of the most popular relics, Purification and Sanctuary. Because the two gods would be fighting for the same lane, they are likely to fight each other often, implying that Sprint (which provides movement speed and cleanses Slows) might generally be better against Xbalanque than Sanctuary. Having Sanctuary (which provides 2 seconds of invulnerability) actually lowers your win rate by 9.8% relative to not picking it up, although this doesn’t necessarily mean that having that relic caused those losses. Maybe it’s necessary to get Sanctuary against really good enemy team compositions, and their team synergy leads to the lowered win rate.

Friends vs Foes

Let’s compare Freya (Norse Queen of the Valkyries) and Bellona (Roman Goddess of War) when they are friends or foes.



The most popular items in this case seem pretty solid. However, of ~1300 matches, Rod of Tahuti together with Spear of the Magus has a win rate of 71.8% compared to Rod of Tahuti and Demonic Grip’s 64.2%.




Rod of Tahuti, Spear of the Magus, and Demonic Grip still compete for the top, with our model showing Spear of the Magus with a higher win rate than Demonic Grip, but it appears as though Shoes of Focus might edge out Shoes of the Magi as boots of choice. To me, the presence of Teleport as a recommended relic suggests that it is useful as a win-more item, not that it should be preferred over Purification in the general case.

More Game Modes

While 5v5 Conquest is the standard MOBA mode, there are other popular modes, with different play styles that lead to different item choices.  The starkest contrast would be between 5v5 Conquest and the 1v1 Duel mode. In this mode, directly counterbuilding around your enemy’s strengths, weaknesses, and item choices is extremely important. We would expect our predictions to vary more from match up to match up, reflecting the wider variability and flexibility of the mode.



In ranked conquest, the Anubis (Egyptian God of the Dead) build against Bakasura (Hindu Great Devourer) seems pretty solid. It’s pretty much the standard most popular build for Anubis, and doesn’t cater particularly much toward Bakasura. Again, notice how the luxury item Spear of Desolation gets edged out by Obsidian Shard. The model suggests Blink instead of Purification, which I’m pretty skeptical of.

Now here’s the duel recommendation:


Since Duel is all about counterbuilding directly against your opponent, it makes sense that the most popular items would shift quite a lot.

Choosing the protection item Breastplate of Valor (which also reduces cooldown) doesn’t seem to work out well, but again, this could simply reflect that when Anubis is already losing against Bakasura, he buys Breastplate of Valor to try to turn the tide. It does appear as though Celestial Legion Helm might be a better choice (especially if Bakasura builds crit), but it’s hard to tell. Warlock’s Sash seems like a good call as an important sustain item, and Shoes of the Magi is clearly superior. The model also recommends the relics Frenzy and Meditation over Sprint and Sanctuary, which is... interesting. Frenzy does have a win rate of 62.5% compared to Sprint’s 37.7% and Sanctuary’s 35.0%.

Final Thoughts

  • While it could use a lot of improvement, the relatively simple model does seem to do a good job of pointing out interesting items that may perform a bit better or worse than the choices of the player base suggest.
  • A major drawback is that you really have to see all of the information and understand some subtleties of Smite item management to take advantage of the model’s insights. Blindly following the recommendations is probably a pretty bad idea.
  • Aside from the drawbacks and problems mentioned in the post, it is unfortunate that recommended items cannot be listed in the order they should be bought, and that the model can’t recommend a complete (and meaningful) set of items (starter item, boots, end game items, etc). This is partially because of the lack of information available from the SMITE API.
  • Because the interactions with your partner are so much stronger in those modes, the model seems better suited to Joust and Duel than to Conquest.
  • Replacing one or two of the most popular items does seem to consistently perform better than just using the most popular items, but I’m planning on working on a different system that can meaningfully measure the distance between entire sets of items. This would actually be a recommender system in a true machine learning sense.
  • Sometimes strange results show up, especially with relic recommendations. The fewer the matches, the less trustworthy the results! As I gather more matches, the recommendations should get better (until the next patch anyway).

Let me know what you think!

Data provided by Hi-Rez. © 2015 Hi-Rez Studios, Inc. All rights reserved.
