Tally

class pydsol.core.statistics.Tally(name: str)[source]

Bases: StatisticsInterface

The Tally is a statistics object that calculates descriptive statistics for a number of observations, such as mean, variance, minimum, maximum, skewness, etc.

The initialize() method resets the statistics object. The initialize method can, for instance, be called when the warmup period of the simulation experiment has completed.

The mean of the Tally is calculated with the formula:

\[\mu = \sum_{i=1}^{n} {x_{i}} / n\]

where n is the number of observations and \(x_{i}\) are the observations.

Example

In discrete-event simulation, the Tally can be used to calculate statistical values for waiting times in queues, time in system of entities, processing times at a server, and throughput times of partial processes.

Attributes:
  • _name (str) – the name by which the statistics object can be identified

  • _n (int) – the number of observations

  • _sum (float) – the sum of the observation values

  • _min (float) – the lowest value in the current observations

  • _max (float) – the highest value in the current observations

  • _m1, _m2, _m3, _m4 (float) – the 1st to 4th moment of the observations

__init__(name: str)[source]

Construct a new Tally statistics object. The Tally is a statistics object that calculates descriptive statistics for a number of observations, such as mean, variance, minimum, maximum, skewness, etc.

Parameters:

name (str) – The name by which the statistics object can be identified.

Raises:

TypeError – when name is not a string

initialize()[source]

Initialize the statistics object, resetting all values to the state where no observations have been made. This method can, for instance, be called when the warmup period of the simulation experiment has completed.
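The warm-up use case can be sketched without pydsol. The `WarmupCounter` class below is a hypothetical stand-in, not the library's implementation; it tracks only a count and a sum to show how a reset discards the observations made during warm-up.

```python
import math


class WarmupCounter:
    """Hypothetical stand-in for a tally; tracks only count and sum."""

    def __init__(self, name: str):
        self.name = name
        self.initialize()

    def initialize(self) -> None:
        # Reset to the state where no observations have been made.
        self.count = 0
        self.total = 0.0

    def register(self, value: float) -> None:
        self.count += 1
        self.total += value

    def mean(self) -> float:
        return self.total / self.count if self.count > 0 else math.nan


waiting_time = WarmupCounter("waiting time")
for value in [9.0, 8.0, 7.0]:  # observations during the warm-up period
    waiting_time.register(value)
waiting_time.initialize()      # warm-up has completed: discard what was seen
for value in [2.0, 4.0]:       # observations after warm-up
    waiting_time.register(value)
steady_state_mean = waiting_time.mean()
```

After the reset, only the two later observations contribute to the mean.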

property name

Return the name of this statistics object.

Returns:

The name of this statistics object.

Return type:

str

register(value: float | int)[source]

Record one observation value, and update all statistics up to and including this value (mean, standard deviation, minimum, maximum, skewness, etc.).

Parameters:

value (float) – The value of the observation.

Raises:
  • TypeError – when value is not a number

  • ValueError – when value is NaN
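The documented error conditions can be illustrated with a small validation helper. `check_observation` is a hypothetical name used for this sketch only; it is not part of pydsol, and the exact checks in the library may differ.

```python
import math


def check_observation(value) -> float:
    """Hypothetical helper mirroring the documented error conditions."""
    if not isinstance(value, (int, float)):
        raise TypeError(f"value {value!r} is not a number")
    if math.isnan(value):
        raise ValueError("value is NaN")
    return float(value)
```

Rejecting NaN at registration time keeps a single bad observation from turning every later statistic into NaN.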

n() int[source]

Return the number of observations.

Returns:

The number of observations.

Return type:

int

min() float[source]

Return the observation with the lowest value. When no observations were registered, NaN is returned.

Returns:

The observation with the lowest value, or NaN when no observations were registered.

Return type:

float

max() float[source]

Return the observation with the highest value. When no observations were registered, NaN is returned.

Returns:

The observation with the highest value, or NaN when no observations were registered.

Return type:

float

sum() float[source]

Return the sum of all observations since the statistic initialization.

Returns:

The sum of the observations.

Return type:

float

mean() float[source]

Return the mean. When no observations were registered, NaN is returned.

The mean of the Tally is calculated with the formula:

\[\mu = \sum_{i=1}^{n} {x_{i}} / n\]

where n is the number of observations and \(x_{i}\) are the observations.

Returns:

The mean, or NaN when no observations were registered.

Return type:

float
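As a minimal sketch of the mean formula and its NaN convention, the function below works on a plain list rather than on a Tally. The name `mean_or_nan` is an assumption made for this illustration.

```python
import math


def mean_or_nan(observations):
    """Return sum(x_i) / n, or NaN when there are no observations."""
    n = len(observations)
    if n == 0:
        return math.nan
    return sum(observations) / n
```

For the observations 2, 4 and 9 the sum is 15 and n is 3, so the mean is 5.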

confidence_interval(alpha: float) Tuple[float][source]

Return the confidence interval around the mean with the provided alpha. When fewer than two observations were registered, (NaN, NaN) is returned.

Parameters:

alpha (float) – Alpha is the significance level used to compute the confidence level. The confidence level equals \(100 * (1 - alpha)\%\), or in other words, an alpha of 0.05 indicates a 95 percent confidence level.

Returns:

The confidence interval around the mean, or (NaN, NaN) when fewer than two observations were registered.

Return type:

(float, float)

Raises:
  • TypeError – when alpha is not a float

  • ValueError – when alpha is not between 0 and 1, inclusive
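The description above does not state which distribution is used for the interval. The sketch below assumes a normal approximation with the sample standard deviation, which is one common choice and not necessarily what pydsol does; `normal_confidence_interval` is a hypothetical name.

```python
import math
from statistics import NormalDist


def normal_confidence_interval(observations, alpha):
    """Sketch: mean +/- z * s / sqrt(n), with s the sample standard deviation."""
    if not isinstance(alpha, float):
        raise TypeError("alpha should be a float")
    if not 0.0 < alpha < 1.0:
        # The documented range includes 0 and 1; this sketch excludes both
        # because the normal quantile is not finite at those end points.
        raise ValueError("alpha should be strictly between 0 and 1 in this sketch")
    n = len(observations)
    if n < 2:
        return (math.nan, math.nan)
    mean = sum(observations) / n
    sample_var = sum((x - mean) ** 2 for x in observations) / (n - 1)
    # Two-sided interval: put alpha / 2 in each tail.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(sample_var / n)
    return (mean - half_width, mean + half_width)
```

With alpha = 0.05 the z-value is about 1.96, so the interval is roughly the mean plus or minus 1.96 standard errors.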

variance(biased: bool = True) float[source]

Return the variance of all observations since the statistic initialization. By default, the biased (population) variance is returned. The biased variance needs at least 1 observation, the unbiased variance needs at least 2.

The formula for the biased (or population) variance is:

\[\sigma^2 = { {\frac{1}{n}} \left( \sum{x_{i}^2} - \left( \sum{x_{i}} \right)^2 / n \right) }\]

The formula for the unbiased (or sample) variance is:

\[S^2 = { {\frac{1}{n-1}} \left( \sum{x_{i}^2} - \left( \sum{x_{i}} \right)^2 / n \right) }\]

Parameters:

biased (bool) – Whether to return the biased (population) variance or the unbiased (sample) variance. By default, biased is True and the population variance is returned.

Returns:

The biased or unbiased variance of all observations since the initialization, or NaN when too few observations were registered.

Return type:

float
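The two variance formulas differ only in the divisor. The sketch below applies them literally to a plain list; `variance_from_sums` is a hypothetical name, and the library itself may compute the value differently (the attribute list mentions running moments).

```python
import math


def variance_from_sums(observations, biased=True):
    """Variance from sum(x^2) - (sum(x))^2 / n, divided by n or by n - 1."""
    n = len(observations)
    min_n = 1 if biased else 2
    if n < min_n:
        return math.nan
    total = sum(observations)
    total_sq = sum(x * x for x in observations)
    centered = total_sq - total * total / n
    return centered / n if biased else centered / (n - 1)
```

For the observations 2, 4, 4, 4, 5, 5, 7, 9 the sum is 40 and the sum of squares is 232, so the centered sum is 232 - 1600 / 8 = 32. The biased variance is 32 / 8 = 4 and the unbiased variance is 32 / 7. Note that this sum-of-squares form can lose precision through cancellation when the mean is large relative to the spread.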

stdev(biased: bool = True) float[source]

Return the standard deviation of all observations since the initialization. The standard deviation is defined as the square root of the variance. By default, the biased (population) standard deviation is returned. The biased standard deviation needs at least 1 observation, the unbiased version needs at least 2.

The formula for the biased (population) standard deviation is:

\[\sigma = \sqrt{ {\frac{1}{n}} \left( \sum{x_{i}^2} - \left( \sum{x_{i}} \right)^2 / n \right) }\]

The formula for the unbiased (sample) standard deviation is:

\[S = \sqrt{ {\frac{1}{n - 1}} \left( \sum{x_{i}^2} - \left( \sum{x_{i}} \right)^2 / n \right) }\]

Parameters:

biased (bool) – Whether to return the biased (population) standard deviation or the unbiased (sample) standard deviation. By default, biased is True and the population standard deviation is returned.

Returns:

The biased or unbiased standard deviation of all observations since the initialization, or NaN when too few observations were registered.

Return type:

float

skewness(biased: bool = True) float[source]

Return the skewness of all observations since the statistic initialization. For the biased (population) skewness, at least two observations are needed; for the unbiased (sample) skewness, at least three observations are needed. If there are too few observations, NaN is returned. The method returns the biased (population) skewness as the default.

The formula for the biased (population) skewness is:

\[Skew_{biased} = \frac{ \sum{(x_{i} - \mu)^3} }{n . \sigma^3}\]

where \(\sigma^2\) is the biased (population) variance. So the denominator is equal to \(n . population\_var^{3/2}\).

There are different formulas to calculate the unbiased (sample) skewness from the biased (population) skewness. Minitab, for instance, calculates unbiased skewness as:

\[Skew_{unbiased} = Skew_{biased} {\left( \frac{n - 1}{n} \right)} ^{3/2}\]

whereas SAS, SPSS and Excel calculate it as:

\[Skew_{unbiased} = Skew_{biased} \frac{\sqrt{n (n - 1)}}{n - 2}\]

Here we follow the last mentioned formula. All formulas converge to the same value with larger n.

Parameters:

biased (bool) – Whether to return the biased (population) skewness or the unbiased (sample) skewness. By default, biased is True and the population skewness is returned.

Returns:

The skewness of all observations since the initialization, or NaN when too few observations were registered.

Return type:

float
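A minimal sketch of both skewness variants on a plain list follows. The sample correction multiplies the biased value by sqrt(n (n - 1)) / (n - 2), the adjusted Fisher-Pearson factor used by SAS, SPSS and Excel. The function name and the NaN result for constant data are assumptions of this sketch, not documented pydsol behavior.

```python
import math


def skewness_sketch(observations, biased=True):
    """Population skewness, optionally corrected to the sample skewness."""
    n = len(observations)
    min_n = 2 if biased else 3
    if n < min_n:
        return math.nan
    mean = sum(observations) / n
    pop_var = sum((x - mean) ** 2 for x in observations) / n
    if pop_var == 0.0:
        return math.nan  # assumption: undefined when all values are equal
    third = sum((x - mean) ** 3 for x in observations)
    skew_biased = third / (n * pop_var ** 1.5)
    if biased:
        return skew_biased
    return skew_biased * math.sqrt(n * (n - 1)) / (n - 2)
```

For the observations 1, 2, 3, 4, 10 the mean is 4, the population variance is 10 and the sum of cubed deviations is 180, giving a biased skewness of 180 / (5 * 10^1.5), about 1.138, and an unbiased skewness of about 1.697.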

kurtosis(biased: bool = True) float[source]

Return the kurtosis of all observations since the statistic initialization. The biased (population) kurtosis calculation needs three observations, and the unbiased (sample) calculation needs four observations. When too few observations were registered, NaN is returned.

The formula for the biased (population) kurtosis is:

\[kurt_{biased} = \frac{\sum{(x_{i} - \mu)^4}}{n.\sigma^4}\]

where \(\sigma^2\) is the population variance. So the denominator is equal to \(n . pop\_var^2\).

The formula for the unbiased (sample) kurtosis is:

\[kurt_{unbiased} = \frac{\sum{(x_{i} - \mu)^4}}{(n-1).S^4}\]

where \(S^2\) is the sample variance. So the denominator is equal to \((n - 1) . sample\_var^2\).

Parameters:

biased (bool) – Whether to return the biased (population) kurtosis or the unbiased (sample) kurtosis. By default, biased is True and the population kurtosis is returned.

Returns:

The kurtosis of all observations since the initialization, or NaN when too few observations were registered.

Return type:

float
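The two kurtosis formulas can be sketched on a plain list as follows. `kurtosis_sketch` is a hypothetical name, and returning NaN for constant data is an assumption of this sketch.

```python
import math


def kurtosis_sketch(observations, biased=True):
    """Fourth central moment sum over n * pop_var^2 or (n - 1) * sample_var^2."""
    n = len(observations)
    min_n = 3 if biased else 4
    if n < min_n:
        return math.nan
    mean = sum(observations) / n
    centered_sq = sum((x - mean) ** 2 for x in observations)
    if centered_sq == 0.0:
        return math.nan  # assumption: undefined when all values are equal
    fourth = sum((x - mean) ** 4 for x in observations)
    if biased:
        pop_var = centered_sq / n
        return fourth / (n * pop_var ** 2)
    sample_var = centered_sq / (n - 1)
    return fourth / ((n - 1) * sample_var ** 2)
```

For the observations 1, 2, 3, 4, 10 the sum of fourth-power deviations is 1394. The biased kurtosis is 1394 / (5 * 10^2) = 2.788, and the unbiased kurtosis is 1394 / (4 * 12.5^2) = 2.2304.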

excess_kurtosis(biased: bool = True) float[source]

Return the excess kurtosis of the registered data. The kurtosis value of the normal distribution is 3. The (biased) excess kurtosis is the kurtosis value shifted by -3 to be 0 for the normal distribution. The biased excess kurtosis needs three observations; if fewer observations were registered, NaN is returned.

The formula for the biased (population) excess kurtosis is:

\[ExcessKurt_{biased} = Kurt_{biased} - 3\]

The unbiased (sample) excess kurtosis is the sample-corrected value of the biased excess kurtosis. When fewer than four observations were registered, NaN is returned for the unbiased excess kurtosis. Several formulas exist to calculate the sample excess kurtosis from the biased excess kurtosis. Here we use:

\[ExcessKurt_{unbiased} = \frac{n - 1}{(n - 2) (n - 3)} \left( (n + 1) ExcessKurt_{biased} + 6 \right)\]

This is the excess kurtosis that is calculated by, for instance, SAS, SPSS and Excel.

Parameters:

biased (bool) – Whether to return the biased (population) excess kurtosis or the unbiased (sample) excess kurtosis. By default, biased is True and the population excess kurtosis is returned.

Returns:

The excess kurtosis of all observations since the initialization, or NaN when too few observations were registered.

Return type:

float
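The sample correction for the excess kurtosis can be sketched on a plain list. `excess_kurtosis_sketch` is a hypothetical name, and returning NaN for constant data is an assumption of this sketch.

```python
import math


def excess_kurtosis_sketch(observations, biased=True):
    """Biased excess kurtosis, optionally corrected to the sample value."""
    n = len(observations)
    min_n = 3 if biased else 4
    if n < min_n:
        return math.nan
    mean = sum(observations) / n
    pop_var = sum((x - mean) ** 2 for x in observations) / n
    if pop_var == 0.0:
        return math.nan  # assumption: undefined when all values are equal
    fourth = sum((x - mean) ** 4 for x in observations)
    excess_biased = fourth / (n * pop_var ** 2) - 3.0
    if biased:
        return excess_biased
    factor = (n - 1) / ((n - 2) * (n - 3))
    return factor * ((n + 1) * excess_biased + 6.0)
```

For the observations 1, 2, 3, 4, 10 the biased excess kurtosis is 2.788 - 3 = -0.212. The unbiased value is (4 / 6) * (6 * (-0.212) + 6) = 3.152, which shows how strongly the correction can act on small samples.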

classmethod report_header() str[source]

Return a string representing a header for a textual table with a monospaced font that can contain multiple tallies.

report_line() str[source]

Return a string representing a line with important statistics values for this tally, for a textual table with a monospaced font that can contain multiple tallies.

classmethod report_footer() str[source]

Return a string representing a footer for a textual table with a monospaced font that can contain multiple tallies.