Mohammed Hussein Kadhim Al-Adal |
Using Some Methods for Estimating the Approximate Reliability Function for Contaminated Data | Master's in Statistical Sciences |
Abstract
In most natural phenomena, some observations deviate from the bulk of the observations recorded alongside them. These deviating observations are called outliers or contaminants. When they are present, the classical estimators, whose underlying assumptions are violated, fail to give accurate estimates of the parameters of the population from which the data were drawn. Using classical estimators on data that contain contaminated values is therefore a real problem, because these estimators become inefficient. The data must first be examined and cleaned of outliers (contaminants); estimation is then carried out to obtain efficient estimates of the parameters and, in turn, an efficient estimate of the reliability function.
In this thesis, the approximate reliability function of the Fréchet distribution is estimated when the data contain k contaminated values (outliers) that arise from the original values deviating from their primary distribution. Dixit's outlier model is used to find the joint distribution of the data when they contain outliers. The Fréchet distribution is taken as the original distribution of the data, and the data are contaminated with k values that follow the exponential distribution in the first case and k values that follow the Weibull distribution in the second case. In both cases, the parameters of the Fréchet distribution are estimated by the maximum likelihood method, the method of moments, and the L-moments (linear moments) method. The resulting estimates are substituted into the Fréchet reliability function to obtain the approximate reliability function, and the estimation methods are compared using the integrated mean squared error (IMSE) criterion. The best method for estimating the approximate reliability function in the presence of contaminated data is maximum likelihood, which was best in 34% of cases when the contaminating distribution is exponential and 34% when it is Weibull. It is followed by the method of moments, with 16% when the contaminating distribution is exponential and 16% when it is Weibull, and then by the L-moments method, with 0% for both distributions. The reliability estimated by maximum likelihood has a lower IMSE when the contaminating distribution is exponential than when it is Weibull, but the difference is very small in all simulation experiments.
The reliability estimated by maximum likelihood, R̂_MLE, is closer to the true reliability than the estimates from the method of moments, R̂_Mom, and the L-moments method, R̂_LMom, although the method of moments outperformed the other methods in some simulation experiments. Maximum likelihood is also the most suitable method for the real data, which are the failure times of a mammography device used to detect breast cancer. The approximate reliability of the mammography device increases as the contamination rate decreases, where the contaminated values are the deliberate stoppage times of the device. The curves of the approximate reliability function estimated by maximum likelihood for the real data approach the true reliability.
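The simulation comparison summarized in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's procedure, and several points are assumptions. The Fréchet distribution is taken in its two-parameter form F(x) = exp(-(x/scale)^(-shape)). The exponential contaminants have a hypothetical mean `exp_mean`. The fit is a plain Fréchet maximum likelihood fit that ignores the contamination, not the likelihood built from Dixit's outlier model. The IMSE is approximated as the squared error averaged over a time grid and over replications. All function and parameter names are illustrative.

```python
import math
import random


def frechet_reliability(t, shape, scale):
    """R(t) = 1 - exp(-(t/scale)^(-shape)) for t > 0 (assumed parametrization)."""
    z = -shape * math.log(t / scale)  # log of (t/scale)^(-shape)
    if z > 700:                       # avoid overflow in exp; R(t) is 1 here
        return 1.0
    return -math.expm1(-math.exp(z))


def sample_frechet(rng, shape, scale):
    """Inverse-CDF draw from the Frechet distribution."""
    u = rng.random()
    while u == 0.0:                   # log(0) is undefined
        u = rng.random()
    return scale * (-math.log(u)) ** (-1.0 / shape)


def contaminated_sample(rng, n, k, shape, scale, exp_mean):
    """n - k Frechet values plus k exponential contaminants (assumed mean)."""
    clean = [sample_frechet(rng, shape, scale) for _ in range(n - k)]
    dirty = [rng.expovariate(1.0 / exp_mean) for _ in range(k)]
    return clean + dirty


def fit_frechet_mle(data):
    """Plain Frechet MLE that ignores contamination (a simplification).

    If X is Frechet(shape, scale), then 1/X is Weibull(shape, 1/scale),
    so the Weibull profile-likelihood equation is solved by bisection.
    """
    logs = [-math.log(x) for x in data if x > 0]  # ln(1/x)
    n = len(logs)
    mean_log = sum(logs) / n
    top = max(logs)                               # for numerical stability

    def score(a):
        w = [math.exp(a * (v - top)) for v in logs]
        return sum(wi * v for wi, v in zip(w, logs)) / sum(w) - 1.0 / a - mean_log

    lo, hi = 1e-3, 100.0                          # assumed search bracket
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    # Log of the Weibull scale of 1/X; the Frechet scale is its reciprocal.
    mean_w = sum(math.exp(shape * (v - top)) for v in logs) / n
    log_lam = top + math.log(mean_w) / shape
    return shape, math.exp(-log_lam)


def imse(rng, n, k, shape, scale, exp_mean, reps, grid):
    """Grid-averaged squared error of the fitted reliability, averaged over reps."""
    total = 0.0
    for _ in range(reps):
        data = contaminated_sample(rng, n, k, shape, scale, exp_mean)
        a_hat, s_hat = fit_frechet_mle(data)
        total += sum(
            (frechet_reliability(t, a_hat, s_hat)
             - frechet_reliability(t, shape, scale)) ** 2
            for t in grid
        ) / len(grid)
    return total / reps


if __name__ == "__main__":
    rng = random.Random(1)
    grid = [0.5 * i for i in range(1, 11)]
    for k in (0, 5):
        print(k, imse(rng, 50, k, 2.0, 1.5, 1.0, 200, grid))
```

Comparing the printed values for different k gives a rough sense of how contamination affects the estimated reliability curve. The method of moments and the L-moments method could be compared in the same way by swapping in a different fitting function.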