Warframe Expected & Nearly Guaranteed Numbers - An Explanation

Created: May 12th, 2018 | Hotfix 22.19.1 (2018-05-03)

Last Updated: December 22nd, 2021 | Hotfix 31.0.5 (2021-12-21)

New calculator using Markov chains

Hello and welcome to my discussion and explanation of the mysterious Warframe run numbers! It's come to my attention that there are some questions, and general disagreements, about these numbers, so I made this blog to hopefully satisfy all your mathematical concerns.

DEFINITIONS

First, before we get to the nitty-gritty stuff we should define some key terms to get rid of any confusion and dismiss some overly-literal misinterpretations.

1. "at least once": This sounds pretty straightforward, but I want to make sure the concept of receiving each component at least once is crystal clear. Let's take Excalibur's page as an example:

Item	Source	Chance	Expected	Nearly Guaranteed
Neuroptics Blueprint	Lieutenant Lech Kril Assassination	38.72%	~ 2 Kills	14 ± 4 Kills
Systems Blueprint	Lieutenant Lech Kril Assassination	22.56%	~ 4 Kills	27 ± 9 Kills
Chassis Blueprint	Lieutenant Lech Kril Assassination	38.72%	~ 2 Kills	14 ± 4 Kills

All drop rate data is obtained from DE's official drop tables. See Mission Rewards#Standard Missions for definitions of reward table rotations. For more detailed definitions and information, visit here.

In essence, what these values are saying is that in a total of 6-7 runs you are expected to receive at least one Neuroptics, at least one Chassis, and at least one Systems component. This is not saying that 6-7 runs are needed for each individual component; i.e., not 18-21 runs in total for all three components.

2. Expected: This means the average number of runs across the playerbase. In other words, the statistically average player will receive all Warframe components in question at least once in $x$ number of runs.

3. Nearly Guaranteed: This term is probably the one that needs the most clarification. What this describes is the number of runs a player would need to give a higher probability of receiving a component at least once. Notice the emphasis and use of "probability" as opposed to "drop chance". Increasing the number of runs does not increase the drop chance of a component, but rather, it increases the probability for a component to have dropped already.

The easiest analogy to this is thinking about flipping a coin. A coin always has a 50% chance to land on either heads or tails when flipping it, and much like drop chances, this 50% chance does not change no matter how many times the coin is flipped. Each flip will always have a 50% chance to be either heads or tails. That said, consider the question "If I flip a coin 2 times, what is the probability that it lands on heads at least once?" The chance per flip may be 50% for heads, but the probability that it lands on heads at least once within those two throws is not 50%; it is in fact 75%.

If you flip the coin thrice the probability for it to have landed on heads at least once within that time now goes up to 87.5%. Four times will result in a probability of 93.75%. 25 times will result in a probability of 99.99999702% (one might say it's nearly guaranteed at that point). Remember, the coin itself is not changing, each flip still has a 50% chance to be heads, but you have increasingly larger probabilities to have gotten heads after each flip.
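The coin numbers above can be reproduced with a few lines of Python. This is only an illustrative sketch; the function name `at_least_once` is made up for this example.

```python
def at_least_once(chance, tries):
    """Probability that an event with a fixed per-try chance happens at least once."""
    return 1.0 - (1.0 - chance) ** tries

# The per-flip chance stays at 50%; only the number of flips changes.
for flips in (1, 2, 3, 4, 25):
    print(f"{flips:>2} flips: {at_least_once(0.5, flips):.8%}")
```

Swapping 0.5 for a drop chance gives the same kind of curve for a single component.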

With this in mind, the purpose of "Nearly Guaranteed" is to tell you the number of runs a player needs for a 99% - 99.99% certainty of having received each component in question at least once. More specifically, in Excalibur's example above, 18 runs are needed for a 99% probability, 27 runs are needed for a 99.9% probability, and 36 runs are needed for a 99.99% probability, which is where the 27 ± 9 comes from.

Graphically, Excalibur's at-least-once distribution looks like this

MATH

Okay, now that the boring stuff is out of the way, we can get more into the actual math and statistics of these run stats. For this we will consider three different types of frames.

Contents
1 Type 1
2 Type 2
3 Type 3

Type 1

Type 1's are the frames that drop from a single source; e.g., each Excal component drops from Lieutenant Lech Kril on War, Mars. To calculate our numbers we will first find the average chance for each part.

For the following example:
Drop chance for Neuroptics = $\mathbf{P}_{\mathbf{1}}$
Drop chance for Chassis = $\mathbf{P}_{\mathbf{2}}$
Drop chance for Systems = $\mathbf{P}_{\mathbf{3}}$

You start out with zero Excal components.
The chance of you receiving a component you do not already own when killing Kril is:
$\mathbf{Average\ Chance\ (AC)}\,\!$ ${\displaystyle = \left[\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\right ]^{-1} \,\!}$ ${\displaystyle = \left[\frac{\mathbf{1}}{\mathbf{38.72}\%+\mathbf{38.72}\%+\mathbf{22.56}\%}\right ]^{-1} \,\!}$ $= \mathbf{100}\%$

You now have one Excal component, two remain.
The chance of you receiving a component you do not already own is:
$\mathbf{AC} \,\!$ ${\displaystyle { = \left[ \frac{\mathbf{P}_{\mathbf{1}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}} ~ + ~ \,\! \frac{\mathbf{P}_{\mathbf{2}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{3}}} ~ + ~ \,\! \frac{\mathbf{P}_{\mathbf{3}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}}\right ]^{-1} \,\! }}$ $= \mathbf{64.31}\%$

You now have two Excal components, one remains.
The chance of you receiving a component you do not already own is:
$\mathbf{AC} \,\!$ ${\displaystyle { = \left[ \frac{\mathbf{P}_{\mathbf{1}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{P}_{\mathbf{2}}}{\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{3}}} ~ + ~ \,\! \frac{\mathbf{P}_{\mathbf{1}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{P}_{\mathbf{3}}}{\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{2}}} ~ + ~ \,\! \frac{\mathbf{P}_{\mathbf{2}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{P}_{\mathbf{1}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{3}}} ~ + ~ \,\! \frac{\mathbf{P}_{\mathbf{2}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{P}_{\mathbf{3}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{1}}} ~ + ~ \,\! \frac{\mathbf{P}_{\mathbf{3}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{P}_{\mathbf{1}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{2}}} ~ + ~ \,\! \frac{\mathbf{P}_{\mathbf{3}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}+\mathbf{P}_{\mathbf{3}}}\times\frac{\mathbf{P}_{\mathbf{2}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{1}}}\right ]^{-1} \,\! }}$ $= \mathbf{28.67}\%$

Now that we have the average chances we can simply calculate the Expected number of runs with basic math.
$\mathbf{Expected} \,\!$ ${\displaystyle = \frac{\mathbf{1}}{\mathbf{100}\%} ~ + ~ \frac{\mathbf{1}}{\mathbf{64.31}\%} ~ + ~ \frac{\mathbf{1}}{\mathbf{28.67}\%} \,\!}$ $= \mathbf{6.0429\ (6-7)}$
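The three Average Chance equations can be checked numerically. The sketch below is in Python rather than MatLab and simply loops over every order in which the already-owned parts could have dropped, which is what each term in the equations represents. The names `average_chances` and `excalibur` are assumptions made for this example; it is not the Wiki's own implementation.

```python
from itertools import permutations

def average_chances(probs):
    """Average chance of a new part when 0, 1, ..., n-1 parts are already owned."""
    total = sum(probs)
    chances = []
    for owned in range(len(probs)):
        expected_wait = 0.0
        # Each ordered way of having collected `owned` parts is one term.
        for order in permutations(range(len(probs)), owned):
            weight = 1.0
            remaining = total
            for part in order:
                weight *= probs[part] / remaining  # chance this part came next
                remaining -= probs[part]
            expected_wait += weight / remaining    # wait for any missing part
        chances.append(1.0 / expected_wait)
    return chances

excalibur = [0.3872, 0.3872, 0.2256]
chances = average_chances(excalibur)
expected = sum(1.0 / chance for chance in chances)
print([f"{chance:.2%}" for chance in chances], round(expected, 4))
```

Because the loop visits every ordering, the work grows very quickly with the number of parts.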

This way of calculating the expected is quite long and tedious though. There is another way, which involves recursion and is what the Wiki uses to get the values you see on articles, but that's out of the scope of this blog. To learn more, see the function called "expected" in the Math module.

To estimate the Nearly Guaranteed (NG) number of runs we need some numerical methods. The equation to estimate the "at least once" probability for $N$ drop chances after $x$ number of runs is:
$\mathbf{1} - (\mathbf{1} - \mathbf{P1})^{x} - (\mathbf{1} - \mathbf{P2})^{x} - \cdots - (\mathbf{1} - \mathbf{PN})^{x} \,\!$ $= \mathbf{Desired\ Probability\ (DP)}$

Since we want to know $x$, we'll need to use numerical methods to solve for it, because we can't rearrange this transcendental equation to do so. For our purposes Inverse Quadratic Interpolation will work just fine, so we'll use that.

If you're unfamiliar with IQI, it's basically a method for finding the x-axis value where the root of a function exists. For example, say our equation was much simpler:
$\mathbf{2} * x = \mathbf{4}$

If we did not want to solve for $x$ algebraically we could use IQI to find it for us. First, we must rearrange the equation such that it is equal to zero, like so:
$\mathbf{2} * x - \mathbf{4} = 0$

Next, we will use the IQI function to find at what value of $x$ the equation is satisfied (in MatLab this function is called fzero(), which combines IQI with bisection and secant steps).

x = fzero( @(x) 2 * x - 4 , 0); %The ", 0);" at the end is the x value the function will start at when looking for the answer, it could be any number though.

Almost immediately fzero() will spit out the answer 2.

Similarly we will use this method to find the $x$ value in our previous equation.

First, we rearrange:
$\mathbf{1} - \mathbf{DP} - (\mathbf{1} - \mathbf{P1})^{x} - (\mathbf{1} - \mathbf{P2})^{x} - \cdots - (\mathbf{1} - \mathbf{PN})^{x} \,\!$ $= 0$

Next, we use IQI to find our $x$ :

x = fzero( @(x) 1 - DP - (1 - P1)^(x) - (1 - P2)^(x) - ... - (1 - PN)^(x) , 0);

We'll do this three times, once for each DP value of 99%, 99.9%, and 99.99%. For our Excalibur example above, the 99.9% case is:

x = fzero( @(x) 1 - 0.999 - 2*(1 - 0.3872)^(x) - (1 - 0.2256)^(x) , 0);

$\mathbf{NG} = \mathbf{27}\pm\mathbf{9}$

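Note that this is not the exact value; it is simply an estimate. The equation adds up the chance of still missing each part separately, so cases where two or more parts are missing get counted more than once. It does reasonably well when the probabilities are not high, or when there are several of them.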
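For readers without MatLab, the same estimate can be sketched in Python. Note the swap: this uses plain bisection instead of fzero(), purely so the example needs nothing beyond the standard library. The names `bisect_root` and `ng_estimate`, and the search bracket of 1 to 10,000 runs, are assumptions for this illustration.

```python
def bisect_root(f, lo, hi, tol=1e-9):
    """Find x in [lo, hi] where f(x) is zero; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid                # sign change is in the lower half
        else:
            lo, f_lo = mid, f_mid   # sign change is in the upper half
    return (lo + hi) / 2.0

def ng_estimate(probs, desired):
    """Solve 1 - DP - sum((1 - P)^x) = 0 for x, the rearranged equation above."""
    return bisect_root(
        lambda x: 1.0 - desired - sum((1.0 - p) ** x for p in probs),
        1.0, 10_000.0)

toy = bisect_root(lambda x: 2 * x - 4, 0.0, 10.0)   # the 2 * x = 4 example
excalibur = [0.3872, 0.3872, 0.2256]
estimates = [ng_estimate(excalibur, dp) for dp in (0.99, 0.999, 0.9999)]
print(toy, [round(x) for x in estimates])
```

Rounding the three solutions gives the middle value and the spread quoted as NG.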

Type 2

Type 2's are the frames that have components that drop from separate locations; e.g., Harrow's Chassis drops from Fissure Corrupted enemies, his Neuroptics drop from the Pago, Kuva Fortress Spy mission, and his Systems drop from Defection missions. Because these parts are not in the same drop pool as each other we only need to worry about the probabilities of the individual components, which makes these calculations fast and easy.

For the following example:
Drop chance for Neuroptics = 11.28%
Drop chance for Chassis = 3.00%
Drop chance for Systems in Tier I = 7.52%
Drop chance for Systems in Tier II & III = 11.28%

$\mathbf{Neuroptics\ Expected} \,\!$ $= \frac{\mathbf{1}}{\mathbf{11.28}\%} \,\!$ $= \mathbf{8.87\ (8-9)}$

$\mathbf{Chassis\ Expected} \,\!$ $= \frac{\mathbf{1}}{\mathbf{3.00}\%} \,\!$ $= \mathbf{33.}\overline{\mathbf{33}}~\mathbf{(33-34)}$

$\mathbf{T1\ Systems\ Expected} \,\!$ $= \frac{\mathbf{1}}{\mathbf{7.52}\%} \,\!$ $= \mathbf{13.3\ (13-14)}$

$\mathbf{T2\ \&\ T3\ Systems\ Expected} \,\!$ $= \frac{\mathbf{1}}{\mathbf{11.28}\%} \,\!$ $= \mathbf{8.87\ (8-9)}$

To calculate the Nearly Guaranteed (NG) number of runs we use the same equation as before. However, because each part is in its own separate location, the equation is no longer transcendental; i.e., $x$ can be solved for algebraically:
$\mathbf{1} - (\mathbf{1} - \mathbf{11.28}\%)^{x} \,\!$ $= \mathbf{99.9}\%$

$(\mathbf{1} - \mathbf{11.28}\%)^{x} \,\!$ $= (\mathbf{1} - \mathbf{99.9}\%)$

$\frac{\ln(\mathbf{1} - \mathbf{99.9}\%)}{\ln(\mathbf{1} - \mathbf{11.28}\%)} \,\!$ $= x$

$\log_{(\mathbf{1} - \mathbf{11.28}\%)} (\mathbf{1} - \mathbf{99.9}\%) \,\!$ $\approx \mathbf{57.72}$, which rounds up to $\mathbf{58}$

$\mathbf{NG\ (Neuroptics)} \,\!$ $= \mathbf{58}\pm\mathbf{20}$

For Type 2's the NG values are the exact values, not just estimates.
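Since both Type 2 formulas are closed-form, they fit in a couple of one-line functions. This Python sketch uses Harrow's Neuroptics and Chassis chances from the example; the function names are hypothetical.

```python
import math

def expected_runs_single(chance):
    """Average number of runs for one part with a fixed drop chance."""
    return 1.0 / chance

def runs_for_probability(chance, desired):
    """Solve 1 - (1 - chance)^x = desired for x using logarithms."""
    return math.log(1.0 - desired) / math.log(1.0 - chance)

neuroptics = 0.1128
print(round(expected_runs_single(neuroptics), 2))            # Neuroptics Expected
print(round(expected_runs_single(0.03), 2))                  # Chassis Expected
print(math.ceil(runs_for_probability(neuroptics, 0.999)))    # NG at 99.9%
```

Rounding up with `math.ceil` reflects that a player can only do whole runs.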

Type 3

Finally, the third type is really just Equinox. Equinox drops from a single location like Type 1 frames, but unlike any other frame, she has eight components to farm for rather than the traditional three or four. Ideally we'd use the same method as in the Type 1 explanation, with five more equations to account for all the parts; however, that turns out to be ridiculous.

The equations used for the type one frames are characterized by a pattern of increasing terms. That is, the first equation had just 1 term ([u]), the second had 3 terms ([u + v + w]), and the third had 6 terms ([u + v + w + x + y + z]). To better gauge the pattern we can look at the equations for two components:

$\left[\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}}\right]^{-1}$ ([u]), 1 term
${\displaystyle { \left[ \frac{\mathbf{P}_{\mathbf{1}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{2}}} ~ + ~ \frac{\mathbf{P}_{\mathbf{2}}}{\mathbf{P}_{\mathbf{1}}+\mathbf{P}_{\mathbf{2}}}\times\frac{\mathbf{1}}{\mathbf{P}_{\mathbf{1}}}\right ]^{-1} }}$ ([u + v]), 2 terms

And for four components the pattern would be:

[u] (1 term)
[u + v + w + x] (4 terms)
[u + v + w + x + y + z + u` + v` + w` + x` + y` + z`] (12 terms)
[u + v + w + x + y + z + u` + v` + w` + x` + y` + z` + u`` + v`` + w`` + x`` + y`` + z`` + u``` + v``` + w``` + x``` + y``` + z```] (24 terms)

In fact, the number of terms in each equation, for any number of components ( $n$ ), can be found using this, where $r$ is the number of components already owned:
$\frac{n!}{(n - r)!} ~ \text{or} ~ {}_{n}\mathbf{P}_{r}$

What this means is that if we wanted to come up with the 8 equations needed for Equinox, the number of terms in each equation would be:

1
8
56
336
1,680
6,720
20,160
40,320
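The term counts listed above follow directly from the permutation formula. A small Python check, assuming Python 3.8 or newer for `math.perm`; the helper name `terms_per_equation` is made up for this example:

```python
import math

def terms_per_equation(parts):
    """Terms in each equation, for 0 up to parts - 1 components already owned."""
    return [math.perm(parts, owned) for owned in range(parts)]

print(terms_per_equation(3))   # Type 1 frames
print(terms_per_equation(4))   # the four-component pattern
print(terms_per_equation(8))   # Equinox
```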

So if I wanted to calculate the average chance for just the last component needed, my equation would have 40,320 terms...I'm not doing that. We still have a few options though:

1) Instead, we can recognize that all 8 of Equinox's component drop chances are relatively similar (6 having 12.91% drop chances, the other 2 having 11.28%). Because they're so close to each other we can estimate the average chances by assuming each component has a drop chance of 1/8 (12.5%); that they're all evenly distributed. By doing this we can simply use logic to find the average chances:

You start out with zero Equinox components.
The chance of you receiving a component you do not already own when killing Tyl Regor is:
$\frac{\mathbf{8}}{\mathbf{8}}$

You now have one Equinox component.
The chance of you receiving a component you do not already own is now:
$\frac{\mathbf{7}}{\mathbf{8}}$

You now have two Equinox components.
The chance of you receiving a component you do not already own is now:
$\frac{\mathbf{6}}{\mathbf{8}}$

You now have three Equinox components.
The chance of you receiving a component you do not already own is now:
$\frac{\mathbf{5}}{\mathbf{8}}$

You now have four Equinox components.
The chance of you receiving a component you do not already own is now:
$\frac{\mathbf{4}}{\mathbf{8}}$

You now have five Equinox components.
The chance of you receiving a component you do not already own is now:
$\frac{\mathbf{3}}{\mathbf{8}}$

You now have six Equinox components.
The chance of you receiving a component you do not already own is now:
$\frac{\mathbf{2}}{\mathbf{8}}$

You now have seven Equinox components.
The chance of you receiving the last component you do not already own is:
$\frac{\mathbf{1}}{\mathbf{8}}$

Now that we have the estimated average chances we can simply calculate the Expected number of runs with basic math like before.
$\mathbf{Expected} \,\!$ $= \frac{\mathbf{8}}{\mathbf{8}} ~ + ~ \frac{\mathbf{8}}{\mathbf{7}} ~ + ~ \frac{\mathbf{8}}{\mathbf{6}} ~ + ~ \frac{\mathbf{8}}{\mathbf{5}} ~ + ~ \frac{\mathbf{8}}{\mathbf{4}} ~ + ~ \frac{\mathbf{8}}{\mathbf{3}} ~ + ~ \frac{\mathbf{8}}{\mathbf{2}} ~ + ~ \frac{\mathbf{8}}{\mathbf{1}} \,\!$ $= \mathbf{21.7429\ (21-22)}$

2) We can use Absorbing Markov Chains to calculate the exact answer using the real probabilities (21.874851038209).

3) Or, the recursive method mentioned in Type 1 works just as well, again without needing to simplify the probabilities.
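To give a feel for options 2 and 3, here is a minimal Python sketch that treats every set of owned parts as a state and works backwards from the fully collected (absorbing) state. It is one possible way to do it, not the Wiki's Math module or the Markov chain calculator linked at the top. One assumption to flag loudly: the listed chances sum to 100.02% because of rounding, so the sketch rescales them to sum to 100%.

```python
from functools import lru_cache

def expected_runs(probs):
    """Expected runs to own every part, with each set of owned parts as a state."""
    scale = sum(probs)
    chances = [p / scale for p in probs]   # assumption: rescale rounded values
    parts = len(chances)
    complete = (1 << parts) - 1            # bitmask with every part owned

    @lru_cache(maxsize=None)
    def runs_left(owned):
        if owned == complete:
            return 0.0
        missing = [i for i in range(parts) if not owned & (1 << i)]
        chance_new = sum(chances[i] for i in missing)
        # One run now, then carry on from whichever new state we land in;
        # dividing by chance_new accounts for runs that drop a duplicate.
        onward = sum(chances[i] * runs_left(owned | (1 << i)) for i in missing)
        return (1.0 + onward) / chance_new

    return runs_left(0)

equinox = [0.1291] * 6 + [0.1128] * 2
print(round(expected_runs([1 / 8] * 8), 4))    # the evenly distributed estimate
print(round(expected_runs(equinox), 4))        # using the listed chances
print(round(expected_runs([0.3872, 0.3872, 0.2256]), 4))   # Excalibur
```

With eight parts there are only 256 states, so this stays cheap where the 40,320-term equation does not.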

To estimate the Nearly Guaranteed (NG) number of runs we use the same methods as in Type 1:

x = fzero( @(x) 1 - 0.999 - 6*(1 - 0.1291)^(x) - 2*(1 - 0.1128)^(x) , 0);

$\mathbf{NG} = \mathbf{69}\pm\mathbf{18}$

VALIDITY

We have the numbers now, but are they actually correct? I cross-checked my work against some of the methods other people have used, looking at Reddit and /d/ posts claiming the stats are wrong and that this or that is the actual correct way. Every time someone made claims like this, though, the comments were full of disagreements with no conclusions. It seemed that no matter where I looked, no one could come to a consensus as to what a correct method really was. As such, I took things into my own hands and created a program to simulate drops.

In short, the program counts the number of "runs" it takes until all parts have been collected, stores that count in an array, then repeats the process. Each repetition of these steps represents a new player. Right now I have the program set to simulate 10,000,000 players, each one adding to the array that keeps track of how many players received all the parts in x number of runs.

After the program sees that all 10,000,000 samples have succeeded it will plot the data in the array and calculate the average, median, and mode number of runs. The average, of course, directly corresponds to the Expected number of runs we are calculating with our above methods and equations, and will confirm whether said methods are on the right track.
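The actual MatLab and R programs are linked at the end of this blog. As a rough idea of the approach, here is a much smaller Python sketch. Everything in it is an assumption for illustration: the function name, the fixed seed, and the reduced sample of 100,000 players (chosen so it finishes quickly), and it skips the plotting entirely.

```python
import math
import random
import statistics

def simulate_players(probs, players, seed=0):
    """Return how many runs each simulated player needed to own every part."""
    rng = random.Random(seed)   # fixed seed so the sketch is repeatable
    total = sum(probs)
    runs_needed = []
    for _ in range(players):
        owned = set()
        runs = 0
        while len(owned) < len(probs):
            runs += 1
            roll = rng.random() * total
            # Walk the drop table until the roll falls inside a part's slice.
            for part, chance in enumerate(probs):
                if roll < chance:
                    owned.add(part)
                    break
                roll -= chance
        runs_needed.append(runs)
    return runs_needed

excalibur = [0.3872, 0.3872, 0.2256]
runs = sorted(simulate_players(excalibur, players=100_000))
average = statistics.mean(runs)
median = statistics.median(runs)
mode = statistics.mode(runs)
p999 = runs[math.ceil(0.999 * len(runs)) - 1]   # runs covering 99.9% of players
print(average, median, mode, p999)
```

With far fewer samples than 10,000,000 the average will wobble in the second decimal place, so treat the output as a rough check only.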

For example, recall our Excalibur example and the numbers we calculated:
$\mathbf{Expected} = \mathbf{6.0429\ (6-7)}$

$\mathbf{NG} = \mathbf{27}\pm\mathbf{9}$

Using the simulation we are given plots such as this.

The simulated average (6.0429) is exactly the same as our calculated average, and NG values (28 ± 9) are not too far off from our estimates (27 ± 9).
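One way to look at the one-run gap between 27 and 28 is to compute the exact probability of owning every part after a whole number of runs, using inclusion-exclusion over the sets of parts that could still be missing. This Python sketch is my own illustration of that idea, with made-up function names; it assumes the NG value is taken as the smallest whole number of runs that reaches the desired probability.

```python
from itertools import combinations

def prob_all_parts(probs, runs):
    """Exact chance of owning every part after a whole number of runs."""
    total = 0.0
    for size in range(len(probs) + 1):
        sign = -1.0 if size % 2 else 1.0
        for missing in combinations(probs, size):
            # Chance that none of the parts in `missing` dropped in any run.
            total += sign * max(0.0, 1.0 - sum(missing)) ** runs
    return total

def runs_needed(probs, desired):
    """Smallest whole number of runs that reaches the desired probability."""
    runs = len(probs)
    while prob_all_parts(probs, runs) < desired:
        runs += 1
    return runs

excalibur = [0.3872, 0.3872, 0.2256]
exact = [runs_needed(excalibur, dp) for dp in (0.99, 0.999, 0.9999)]
print(exact)
```

Under that assumption the estimate of roughly 27.03 runs becomes 28 whole runs, since 27 runs falls just short of 99.9%.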

We can also do the same thing for Equinox:
$\mathbf{Expected} = \mathbf{21.7429\ (21-22)}$

$\mathbf{NG} = \mathbf{69}\pm\mathbf{18}$

Click to view Equinox simulation plot (for reference, Equinox's exact distributions look like this).

The simulated average (21.8749) and NG values (73 ± 18) are also not too far off from our estimates (21.7429 & 69 ± 18). This was simulated with the actual 12.91% - 11.28% values as well, not the simplified 1/8 distribution we used for our calculations.

Keep in mind that this is with 10,000,000 samples; for even more accuracy we could increase the sample size to 100,000,000 or even 1,000,000,000. Quite honestly, though, I don't think there is much benefit as the simulations are already pretty accurate. Increasing the sample size would only make the simulation take longer for just a minute increase in accuracy.

With that said, I believe the simulation and plots give more than adequate evidence for the validity of our methods above.

I'd love to read and discuss any of your feedback for how to better these sections. Also, please let me know if you have questions, comments, or concerns about any of my math or explanations throughout this blog. If there is something fundamental I missed, or you're simply confused on something I failed to fully elaborate on, let me know!

Code for the simulation program can be found below in both .m and .txt format for MatLab and R respectively. If you do not own MatLab I would recommend downloading R as it is free. The only downside is you can't make the plots as pretty as I have them here ¯\_(ツ)_/¯. You can also still try downloading the .m or .txt file and use a converter to change it to your desired language; however, I cannot guarantee the reliability or accuracy of said converters:

https://github.com/FINNSTAR7?tab=repositories

Thank you!
