
How to find a mode in probability theory. Mode and median. Quantiles. Moments of random variables. Skewness and kurtosis

Mode is the value in a set of observations that occurs most frequently.

Mo = x_Mo + h_Mo * (f_Mo − f_Mo−1) / ((f_Mo − f_Mo−1) + (f_Mo − f_Mo+1)),

where x_Mo is the left boundary of the modal interval, h_Mo is the length of the modal interval, f_Mo−1 is the frequency of the pre-modal interval, f_Mo is the frequency of the modal interval, and f_Mo+1 is the frequency of the post-modal interval.
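The formula can be illustrated with a short sketch; the interval boundaries and frequencies below are made-up example numbers, not data from the text:

```python
# Hypothetical illustration of the grouped-data mode formula.

def grouped_mode(x_mo, h_mo, f_prev, f_mo, f_next):
    """Mode of grouped data: x_mo is the left boundary of the modal
    interval, h_mo its length; f_prev, f_mo, f_next are the frequencies
    of the pre-modal, modal and post-modal intervals."""
    return x_mo + h_mo * (f_mo - f_prev) / ((f_mo - f_prev) + (f_mo - f_next))

# Example: modal interval [10, 20) with neighboring frequencies 4 and 6
# and modal frequency 10: Mo = 10 + 10 * 6 / (6 + 4) = 16.
print(grouped_mode(10, 10, 4, 10, 6))  # 16.0
```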

The mode of an absolutely continuous distribution is any point of local maximum of the distribution density. For discrete distributions, a mode is any value a_i whose probability p_i is greater than the probabilities of the neighboring values.

The median of a continuous random variable X is the value Me for which it is equally probable that the random variable will be less than or greater than Me, i.e.

P(X < Me) = P(X > Me) = 1/2.

(For a ranked series of n observations, the median occupies the position (n + 1)/2.)

Uniformly distributed continuous random variable

Uniform distribution. A continuous random variable X is called uniformly distributed on the segment [a, b] if its distribution density function (Fig. 1.6, a) has the form:

f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 otherwise.

Notation: X ~ U[a, b] means the random variable X is distributed uniformly on [a, b].

Accordingly, the distribution function on the segment [a, b] (Fig. 1.6, b) is F(x) = (x − a)/(b − a).

Fig. 1.6. Functions of a random variable distributed uniformly on [a, b]: (a) probability density f(x); (b) distribution function F(x)

The mathematical expectation and variance of this random variable are given by:

M(X) = (a + b)/2,   D(X) = (b − a)²/12.

Due to the symmetry of the density function, the mathematical expectation coincides with the median. The uniform distribution has no mode.

Example 4. The waiting time for an answer to a phone call is a random variable that obeys the uniform distribution law on the interval from 0 to 2 minutes. Find the integral and differential distribution functions of this random variable.
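A sketch of the solution to Example 4, assuming the uniform law on [0, 2] stated above: the differential function f(x) is constant inside the interval, and the integral function F(x) rises linearly from 0 to 1:

```python
# Uniform law on [a, b] = [0, 2] minutes.
a, b = 0.0, 2.0

def f(x):
    """Differential (density) function: 1/(b-a) inside [a, b], 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def F(x):
    """Integral (distribution) function: linear ramp from 0 to 1 on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

print(f(1.0))   # 0.5
print(F(1.5))   # 0.75: probability of waiting at most 1.5 min
```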

27. Normal law of probability distribution

A continuous random variable X has a normal distribution with parameters m and σ > 0 if the probability distribution density has the form:

f(x) = (1/(σ√(2π))) · exp(−(x − m)² / (2σ²)),

where m is the mathematical expectation and σ is the standard deviation.



The normal distribution is also called Gaussian, after the German mathematician Gauss. The fact that a random variable X has a normal distribution with parameters m and σ is denoted N(m, σ), where m = a = M[X].

Quite often in formulas the mathematical expectation is denoted by a. If a random variable is distributed according to the law N(0, 1), it is called a normalized or standardized normal variable. Its distribution function has the form:

Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^(−t²/2) dt.

The density graph of a normal distribution, which is called a normal curve or Gaussian curve, is shown in Fig. 5.4.

Fig. 5.4. Normal distribution density

Properties of a random variable having a normal distribution law.

1. If X ~ N(m, σ), then the probability of this variable falling into a given interval (x1; x2) is found by the formula:

P(x1 < X < x2) = Φ((x2 − m)/σ) − Φ((x1 − m)/σ).

2. The probability that the deviation of the random variable from its mathematical expectation will not exceed the value δ in absolute value is equal to

P(|X − m| < δ) = 2Φ(δ/σ) − 1.
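Both properties can be checked numerically. The sketch below assumes Φ is the distribution function of N(0, 1), computed via the error function; with the Laplace-function convention used in some texts, the deviation formula would read 2Φ0(δ/σ) instead:

```python
import math

def phi(z):
    """Standard normal distribution function Φ(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_interval(x1, x2, m, s):
    """P(x1 < X < x2) for X ~ N(m, s)."""
    return phi((x2 - m) / s) - phi((x1 - m) / s)

def prob_deviation(delta, s):
    """P(|X - m| < delta) = 2*Phi(delta/s) - 1 (CDF convention)."""
    return 2.0 * phi(delta / s) - 1.0

# The familiar three-sigma rule as a check:
print(round(prob_deviation(3.0, 1.0), 4))  # 0.9973
```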

Expected value. The mathematical expectation of a discrete random variable X, taking a finite number of values xi with probabilities pi, is the sum:

M(X) = Σ xi pi.     (5)

The mathematical expectation of a continuous random variable X is the integral of the product of its values x and the probability distribution density f(x):

M(X) = ∫_{−∞}^{+∞} x f(x) dx.     (6b)

The improper integral (6b) is assumed to be absolutely convergent (otherwise the mathematical expectation M(X) is said not to exist). The mathematical expectation characterizes the average value of the random variable X. Its dimension coincides with the dimension of the random variable.
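A small numerical sketch of both definitions: the discrete sum (5) and the integral (6b), the latter approximated here by a Riemann midpoint sum:

```python
# Discrete case: M(X) = sum of x_i * p_i.
def expectation_discrete(xs, ps):
    return sum(x * p for x, p in zip(xs, ps))

# Continuous case: numerical stand-in for the integral of x * f(x).
def expectation_continuous(f, lo, hi, n=100_000):
    dx = (hi - lo) / n
    return sum((lo + (i + 0.5) * dx) * f(lo + (i + 0.5) * dx) * dx
               for i in range(n))

# Fair die: M(X) = 3.5.
print(expectation_discrete([1, 2, 3, 4, 5, 6], [1/6] * 6))
# Uniform on [0, 2] with density 1/2: M(X) = (a + b)/2 = 1.
print(round(expectation_continuous(lambda x: 0.5, 0.0, 2.0), 6))
```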

Properties of mathematical expectation:

M(C) = C;  M(CX) = C·M(X);  M(X + Y) = M(X) + M(Y);  for independent X and Y, M(XY) = M(X)·M(Y).

Dispersion. The variance of a random variable X is the number:

D(X) = M[(X − M(X))²].     (8)

The variance is a characteristic of the scattering of the values of the random variable X about its average value M(X). The dimension of the variance is the square of the dimension of the random variable. Starting from the definitions of the variance (8) and of the mathematical expectation (5) for a discrete random variable and (6) for a continuous one, we obtain similar expressions for the variance:

D(X) = Σ (xi − m)² pi,   D(X) = ∫_{−∞}^{+∞} (x − m)² f(x) dx.     (9)

Here m = M(X).

Dispersion properties:

D(C) = 0;  D(CX) = C²·D(X);  for independent X and Y, D(X + Y) = D(X) + D(Y).

Standard deviation:

σ = √D(X).     (11)

Since the standard deviation has the same dimension as a random variable, it is more often used as a measure of dispersion than variance.
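Formulas (9) and (11) for a discrete random variable can be sketched as:

```python
import math

def variance_discrete(xs, ps):
    """D(X) by formula (9): sum of (x_i - m)^2 * p_i, with m = M(X)."""
    m = sum(x * p for x, p in zip(xs, ps))
    return sum((x - m) ** 2 * p for x, p in zip(xs, ps))

def std_dev(xs, ps):
    """Standard deviation, formula (11): the square root of D(X)."""
    return math.sqrt(variance_discrete(xs, ps))

# Fair die: D(X) = 35/12.
xs, ps = [1, 2, 3, 4, 5, 6], [1/6] * 6
print(round(variance_discrete(xs, ps), 4))  # 2.9167
```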

Moments of distribution. The concepts of mathematical expectation and dispersion are special cases of a more general concept for the numerical characteristics of random variables: the distribution moments. The moments of the distribution of a random variable are introduced as mathematical expectations of certain simple functions of the random variable. Thus, the moment of order k about the point x0 is the mathematical expectation M[(X − x0)^k]. Moments about the origin x0 = 0 are called initial moments and are denoted:

αk = M(X^k).     (12)

The initial moment of the first order is the center of the distribution of the random variable under consideration:

α1 = M(X) = m.     (13)

Moments about the center of the distribution x0 = m are called central moments and are denoted:

μk = M[(X − m)^k].     (14)

From (7) it follows that the first-order central moment is always equal to zero:

μ1 = M(X − m) = M(X) − m = 0.     (15)

The central moments do not depend on the origin chosen for the values of the random variable: when all values are shifted by a constant C, the distribution center shifts by the same C, and the deviation from the center does not change: X − m = (X − C) − (m − C).

Now it is obvious that the variance is the second-order central moment:

D(X) = μ2 = M[(X − m)²].     (16)

Asymmetry. The third-order central moment:

μ3 = M[(X − m)³]     (17)

serves to assess the asymmetry of the distribution. If the distribution is symmetric about the point x = m, the third-order central moment is equal to zero (as are all central moments of odd order). Therefore, if the third-order central moment differs from zero, the distribution cannot be symmetric. The magnitude of the asymmetry is assessed using the dimensionless asymmetry coefficient:

A = μ3 / σ³.     (18)

The sign of the asymmetry coefficient (18) indicates right-sided or left-sided asymmetry (Fig. 2).


Fig. 2. Types of distribution asymmetry.

Kurtosis (excess). The fourth-order central moment:

μ4 = M[(X − m)⁴]     (19)

serves to evaluate the so-called kurtosis, which determines the degree of steepness (peakedness) of the distribution curve near the center of the distribution relative to the normal distribution curve. Since for the normal distribution μ4/σ⁴ = 3, the value taken as the kurtosis is:

E = μ4 / σ⁴ − 3.     (20)

Fig. 3 shows examples of distribution curves with different kurtosis values. For the normal distribution E = 0. Curves more peaked than the normal one have positive kurtosis; more flat-topped ones have negative kurtosis.
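The central-moment definitions (17)-(20) translate directly into code; a sketch for a discrete distribution (the three-point distribution below is an illustrative example, not from the text):

```python
def central_moment(xs, ps, k):
    """k-th central moment mu_k = M[(X - m)^k], formula (14)."""
    m = sum(x * p for x, p in zip(xs, ps))
    return sum((x - m) ** k * p for x, p in zip(xs, ps))

def skewness(xs, ps):
    """Asymmetry coefficient A = mu3 / sigma^3, formula (18)."""
    mu2, mu3 = central_moment(xs, ps, 2), central_moment(xs, ps, 3)
    return mu3 / mu2 ** 1.5

def excess_kurtosis(xs, ps):
    """Kurtosis E = mu4 / sigma^4 - 3, formula (20)."""
    mu2, mu4 = central_moment(xs, ps, 2), central_moment(xs, ps, 4)
    return mu4 / mu2 ** 2 - 3.0

# A distribution symmetric about 0 has zero skewness:
print(skewness([-1, 0, 1], [0.25, 0.5, 0.25]))  # 0.0
```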


Fig. 3. Distribution curves with varying degrees of steepness (kurtosis).

Higher order moments are not usually used in engineering applications of mathematical statistics.

The mode of a discrete random variable is its most probable value. The mode of a continuous random variable is the value at which the probability density is maximal (Fig. 2). If the distribution curve has one maximum, the distribution is called unimodal; if it has more than one maximum, it is called multimodal. Sometimes there are distributions whose curves have a minimum rather than a maximum; such distributions are called anti-modal. In general, the mode and the mathematical expectation of a random variable do not coincide. In the special case of a modal (i.e., mode-possessing) symmetric distribution, and provided the mathematical expectation exists, the latter coincides with the mode and the center of symmetry of the distribution.

The median of a random variable X is the value Me for which the equality P(X < Me) = P(X > Me) holds, i.e. it is equally probable that the random variable X will be less than or greater than Me. Geometrically, the median is the abscissa of the point at which the area under the distribution curve is divided in half (Fig. 2). In the case of a symmetric modal distribution, the median, the mode and the mathematical expectation coincide.

The mode is the most probable value of a random variable. For a distribution symmetric about the mean, the mode coincides with the mathematical expectation. If the values of the random variable are not repeated, there is no mode.

The point on the x-axis corresponding to the maximum of the distribution density curve is called the mode; that is, the mode is the most probable value of the random variable. However, not all distributions have a mode; the uniform distribution is an example. In that case it is impossible to define the center of the distribution as a mode. The mode is usually denoted Mo.

There are the concepts of mode and median of a random variable.

Obviously, in the case of a symmetric distribution the median coincides with the mode and the mathematical expectation.

Since the mode is based not on single measurements but on a large volume of observations, it cannot be considered a random variable. The magnitude of the mode is not affected by various kinds of delays in the work or by loss of its normal pace.

Sometimes, when analyzing empirical distributions, the concepts of the mode and the median of a distribution are used: "...the mode is the most probable value of a random variable."

A broader probability-theoretic interpretation of the lottery phenomenon is the concept of the probability distribution of a random variable. With its help one determines the probabilities that the random variable will take one or another of its possible values. Denote the random variable by y and its possible values by y1, y2, y3, ..., yn. For a discrete random variable that can take these values, a convenient form of the probability distribution is the dependence P(y = yi), usually called a probability series or distribution series. In practice, for a quick generalized assessment of the probabilistic distribution of risk values, the so-called numerical characteristics of the distribution of random results are often used: mathematical expectation, variance, mean square (standard) deviation, coefficient of variation, mode, median, etc.

Based on data from the USSR State Statistics Committee on the distribution of the population by average per capita total income, we will try to compare the indicators of average, median and modal income (Table 1). The table shows that the average income exceeds the median and modal incomes in absolute value, and that its growth occurs mainly through an increase in the proportion of people with high incomes; that is, using the average income indicator significantly overestimates the income level of the bulk of the population and largely hides the process of their differentiation. Modal income values gravitate toward the lower groups of the distribution and deviate downward from the median income. However, the occurrence of the mode in one or another interval is often random: a small change in the distribution will cause the mode to appear in the adjacent interval. For example, in 1989 the most common income level was from 100 to 125 rubles (16.1% of the population received such income); however, due to minor shifts in income in 1989-1990, the most common interval became the next one (125-150 rubles), and the value of the mode itself increased by 15.6 rubles. In addition, the share of the population in the modal income range may exceed the other shares only slightly.

To characterize the center of the distribution of a logarithmically normal random variable a, one can use, along with the already calculated mathematical expectation Ma, the mode (the local maximum of the density f(a)): Mo(a) = exp(m − σ²).
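The log-normal mode formula can be checked numerically. The sketch below assumes a = exp(X) with X ~ N(m, s), so the density peak should sit at exp(m − s²); the parameter values are illustrative:

```python
import math

# Log-normal density for a = exp(X), X ~ N(m, s).
def lognorm_pdf(x, m, s):
    return math.exp(-(math.log(x) - m) ** 2 / (2 * s * s)) / (x * s * math.sqrt(2 * math.pi))

m, s = 0.5, 0.4
# Brute-force search for the density maximum on a fine grid (0, 6).
grid = [i / 10_000 for i in range(1, 60_000)]
numeric_mode = max(grid, key=lambda x: lognorm_pdf(x, m, s))
print(numeric_mode, math.exp(m - s * s))  # both ≈ 1.405
```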

Mode. The most probable value of a random variable.

MODE: the concept

Among the numerical characteristics of random variables, it is necessary first of all to note those that characterize the position of the random variable on the numerical axis, i.e. indicate some average, approximate value around which all the possible values of the random variable are grouped.

The average value of a random variable is a certain number that is, as it were, its "representative" and replaces it in rough approximate calculations. When we say "the average lamp operating time is 100 hours" or "the average point of impact is shifted relative to the target by 2 m to the right", we are indicating a certain numerical characteristic of a random variable that describes its location on the numerical axis, i.e. a "position characteristic".

Of the position characteristics in probability theory, the most important role is played by the mathematical expectation of a random variable, which is sometimes simply called the average value of the random variable.

Let us consider a discrete random variable X having possible values x1, x2, ..., xn with probabilities p1, p2, ..., pn. We need to characterize by some number the position of the values of the random variable on the x-axis, taking into account that these values have different probabilities. For this purpose it is natural to use the so-called "weighted average" of the values xi, where each value is weighted in proportion to its probability. We will call this the mean of the random variable X and denote it M[X]:

M[X] = (x1 p1 + x2 p2 + ... + xn pn) / (p1 + p2 + ... + pn), or, given that p1 + p2 + ... + pn = 1,

M[X] = Σ_{i=1}^{n} xi pi.     (5.6.1)

This weighted average is called the mathematical expectation of the random variable. Thus we have introduced one of the most important concepts of probability theory: the concept of mathematical expectation.

The mathematical expectation of a random variable is the sum of the products of all possible values of the random variable and the probabilities of these values.

Note that in the above formulation the definition of the mathematical expectation is valid, strictly speaking, only for discrete random variables; below we will generalize this concept to the case of continuous quantities.

To make the concept of mathematical expectation clearer, let us turn to the mechanical interpretation of the distribution of a discrete random variable. Let there be points on the abscissa axis with abscissas x1, x2, ..., xn, in which masses p1, p2, ..., pn are concentrated, with total mass 1. Then the mathematical expectation defined by formula (5.6.1) is nothing more than the abscissa of the center of gravity of this system of material points.

The mathematical expectation of a random variable is connected by a peculiar dependence with the arithmetic mean of the observed values of the random variable over a large number of experiments. This dependence is of the same type as that between frequency and probability: with a large number of experiments, the arithmetic mean of the observed values of a random variable approaches (converges in probability to) its mathematical expectation. From the connection between frequency and probability one can deduce, as a consequence, a similar connection between the arithmetic mean and the mathematical expectation.

Indeed, consider a discrete random variable X characterized by a distribution series with values x1, x2, ..., xn and probabilities p1, p2, ..., pn, where p1 + p2 + ... + pn = 1.

Let N independent experiments be carried out, in each of which the variable X takes a certain value. Suppose the value x1 appeared m1 times, the value x2 appeared m2 times, and, in general, the value xi appeared mi times. Obviously, m1 + m2 + ... + mn = N.

Let us calculate the arithmetic mean of the observed values of X, which, in contrast to the mathematical expectation M[X], we denote M*[X]:

M*[X] = (x1 m1 + x2 m2 + ... + xn mn) / N = Σ xi (mi / N).

But mi / N is nothing more than the frequency (or statistical probability) of the event X = xi; this frequency can be denoted pi*. Then

M*[X] = Σ xi pi*,

i.e. the arithmetic mean of the observed values of a random variable is equal to the sum of the products of all possible values of the random variable and the frequencies of these values.

As the number of experiments increases, the frequencies pi* will approach (converge in probability to) the corresponding probabilities. Consequently, the arithmetic mean M*[X] of the observed values of the random variable will approach (converge in probability to) its mathematical expectation as the number of experiments increases.
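This convergence is easy to observe in a simulation; a sketch with die rolls, whose mathematical expectation is 3.5 (the sample sizes are arbitrary):

```python
import random

# The arithmetic mean of N die rolls approaches M[X] = 3.5 as N grows.
random.seed(1)  # fixed seed so the run is reproducible

def sample_mean(n):
    return sum(random.randint(1, 6) for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))
```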

The connection between the arithmetic mean and the mathematical expectation formulated above constitutes the content of one of the forms of the law of large numbers. We will give a rigorous proof of this law in Chapter 13.

We already know that all forms of the law of large numbers state the fact that certain averages are stable over a large number of experiments. Here we are talking about the stability of the arithmetic mean of a series of observations of the same quantity. With a small number of experiments, the arithmetic mean of their results is random; with a sufficient increase in the number of experiments it becomes "almost non-random" and, stabilizing, approaches a constant value: the mathematical expectation.

The stability of averages over a large number of experiments is easy to verify experimentally. For example, when weighing a body on precise laboratory scales, we get a slightly different value each time; to reduce the observation error, we weigh the body several times and use the arithmetic mean of the obtained values. It is easy to see that with a further increase in the number of experiments (weighings), the arithmetic mean reacts to this increase less and less and, with a sufficiently large number of experiments, practically ceases to change.

Formula (5.6.1) for the mathematical expectation corresponds to the case of a discrete random variable. For a continuous quantity X, the mathematical expectation is naturally expressed not as a sum but as an integral:

M[X] = ∫_{−∞}^{+∞} x f(x) dx,     (5.6.2)

where f(x) is the distribution density of the quantity X.

Formula (5.6.2) is obtained from formula (5.6.1) by replacing the individual values xi with a continuously changing parameter x, the corresponding probabilities pi with the probability element f(x) dx, and the finite sum with an integral. In the future we will often use this method of extending formulas derived for discontinuous quantities to the case of continuous quantities.

In the mechanical interpretation, the mathematical expectation of a continuous random variable retains the same meaning: the abscissa of the center of gravity in the case when the mass is distributed along the abscissa axis continuously with density f(x). This interpretation often allows one to find the mathematical expectation without calculating the integral (5.6.2), from simple mechanical considerations.

Above we introduced the notation M[X] for the mathematical expectation of the quantity X. In a number of cases, when this quantity enters formulas as a specific number, it is more convenient to denote it by a single letter. In these cases we will denote the mathematical expectation of X by m_x.

The notations m_x and M[X] for the mathematical expectation will be used in parallel in what follows, depending on which is more convenient in a particular formula. Let us also agree, if necessary, to abbreviate the words "mathematical expectation" as m.e.

It should be noted that the most important position characteristic, the mathematical expectation, does not exist for all random variables. One can construct examples of random variables for which the mathematical expectation does not exist, since the corresponding sum or integral diverges.

Consider, for example, a discontinuous random variable with a distribution series:

It is easy to verify that Σ pi = 1, i.e. the distribution series makes sense; however, the sum Σ xi pi in this case diverges and, therefore, the mathematical expectation of the value does not exist. However, such cases are not of significant interest in practice: usually the random variables we deal with have a limited range of possible values and certainly have a mathematical expectation.

Above we gave formulas (5.6.1) and (5.6.2), which express the mathematical expectation for a discontinuous and a continuous random variable, respectively.

If the quantity X is of mixed type, its mathematical expectation is expressed by a formula of the form:

M[X] = Σ xi pi + ∫ x f(x) dx,     (5.6.3)

where the sum extends over all points xi at which the distribution function is discontinuous, and the integral extends over all intervals on which the distribution function is continuous.

In addition to the most important position characteristic, the mathematical expectation, other position characteristics are sometimes used in practice, in particular the mode and the median of a random variable.

The mode of a random variable is its most probable value. The term "most probable value", strictly speaking, applies only to discontinuous quantities; for a continuous quantity the mode is the value at which the probability density is maximal. Let us agree to denote the mode by Mo. Figures 5.6.1 and 5.6.2 show the mode for discontinuous and continuous random variables, respectively.

If the distribution polygon (distribution curve) has more than one maximum, the distribution is called “multimodal” (Fig. 5.6.3 and 5.6.4).

Sometimes there are distributions that have a minimum in the middle rather than a maximum (Fig. 5.6.5 and 5.6.6). Such distributions are called "anti-modal". An example of an anti-modal distribution is the distribution obtained in Example 5, n° 5.1.

In the general case, the mode and the mathematical expectation of a random variable do not coincide. In the particular case when the distribution is symmetric and modal (i.e., has a mode) and the mathematical expectation exists, the latter coincides with the mode and the center of symmetry of the distribution.

Another position characteristic is often used: the so-called median of a random variable. This characteristic is usually used only for continuous random variables, although it can be formally defined for a discontinuous variable as well.

The median of a random variable is its value for which

those. it is equally likely that the random variable will be less than or greater than . Geometrically, the median is the abscissa of the point at which the area limited by the distribution curve is divided in half (Fig. 5.6.7).
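The geometric definition suggests a simple numerical method: bisect until the distribution function reaches 1/2. A sketch using the exponential law as an illustrative example (its exact median is ln 2 / λ):

```python
import math

def median_from_cdf(F, lo, hi, tol=1e-10):
    """Find Me with F(Me) = 1/2 by bisection on a monotone CDF F."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

lam = 2.0
F_exp = lambda x: 1.0 - math.exp(-lam * x)  # exponential distribution function
print(median_from_cdf(F_exp, 0.0, 10.0))    # ≈ ln 2 / 2 ≈ 0.34657
```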