Copyright © 2009. National Academy of Sciences. All rights reserved.
Below are the first 10 and last 10 pages of uncorrected machine-read text (when available) of this chapter, followed by the top 30 algorithmically extracted key phrases from the chapter as a whole.
Intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text on the opening pages of each chapter.
Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.
Do not use for reproduction, copying, pasting, or reading; exclusively for search engines.
OCR for page 8
II STATISTICAL MODELS AND ANALYSES IN AUDITING
1. The Beginnings
The field of accounting encompasses a number of subdisciplines.
Among these, two important ones are financial accounting and auditing.
Financial accounting is concerned with the collection of data about the
economic activities of a given firm and the summarizing and reporting of
them in the form of financial statements. Auditing, on the other hand,
refers to the independent verification of the fairness of these financial
statements. The auditor collects data that is useful for verification from
several sources and by different means. It is very evident that the
acquisition of reliable audit information at low cost is essential to
economical and efficient auditing.
There are two main types of audit tests for which the acquisition of
information can profitably make use of statistical sampling. Firstly, an
auditor may require evidence to verify that the accounting treatments of
numerous individual transactions comply with prescribed procedures for
internal control. Secondly, audit evidence may be required to verify that
reported monetary balances of large numbers of individual items are not
materially misstated. The first audit test, collecting data to determine the
rate of procedural errors of a population of transactions, is called a
compliance test. The second, collecting data for evaluating the aggregate
monetary error in the stated balance, is called a substantive test of details.
The auditor considers an error to be material if its magnitude "is such that
it is probable that the judgement of a reasonable person relying upon the
report would have been changed or influenced by the inclusion or
correction of the item". (Financial Accounting Standards Board, 1980)
Current auditing standards set by the American Institute of Certified
Public Accountants (AICPA) do not mandate the use of statistical
sampling when conducting audit tests (AICPA, 1981 & 1983). However,
the merits of random sampling as the means to obtain, at relatively low
cost, reliable approximations to the characteristics of a large group of
entries, were known to accountants as early as 1933 (Carmen, 1933). The
early applications were apparently limited to compliance tests (Neter,
1986). The statistical problems that arise when analyzing the type of
nonstandard mixture of distributions that is the focus of this report did not
surface in auditing until the late 1950s. At about that time, Kenneth
Stringer began to investigate the practicality of incorporating statistical
sampling into the audit practices of his firm, Deloitte, Haskins & Sells. It
was not until 1963 that some results of his studies were communicated to
the statistical profession. The occasion was a meeting of the American
Statistical Association (Stringer, 1963 & 1979).
Before summarizing Stringer's main conclusions, we describe the
context as follows. An item in an audit sample produces two pieces of
information, namely, the book (recorded) amount and the audited (correct)
amount. The difference between the two is called the error amount. The
percentage of items in error may be small in an accounting population. In
an audit sample, it is not uncommon to observe only a few items with
errors. An audit sample may not yield any non-zero error amounts. For
analyses of such data, in which most observations are zero, the classical
interval estimation of the total error amount based on the asymptotic
normality of the sampling distribution is not reliable. Also, when the
sample contains no items in error, the estimated standard deviation of the
estimator of the total error amount becomes zero. Alternatively, one could
use the sample mean of the audited amount to estimate the mean
audited amount for the population. The estimate of the mean is then
multiplied by the known number of items in the population to estimate the
population total. In the audit profession, this method is referred to as
mean-per-unit estimation (AICPA, 1983). Since observations are audited
amounts, the standard deviation of this estimator can be estimated even
when all items in the sample are error-free. However, because of the large
variance of the audited amount that may arise in simple random sampling,
the mean-per-unit estimation is imprecise. More fundamentally, however,
when the sample does not contain any item in error, the difference between
the estimate of the total audited amount and the book balance must be
interpreted as the sampling error. The auditor thus concludes that the book
amount does not contain any material error. This is an important point for
the auditor. To quote from Stringer (1963) concerning statistical estimates
("evaluations") of total error:
Assuming a population with no error in it, each of the possible
distinct samples of a given size that could be selected from it
would result in a different estimate and precision limit under this
approach; however, from the view point of the auditor, all
samples which include no errors should result in identical
evaluations.
Stringer then reported in the same presentation that he, in
collaboration with Frederick F. Stephan of Princeton University, had
developed a new statistical procedure for his firm's use in auditing that did
not depend on the normal approximation of the sampling distribution and
that could still provide a reasonable inference for the population error
amount when all items in the sample are error-free. This sampling plan is
apparently the original implementation of the now widely practised dollar
(or monetary) unit sampling and is one of the first workable solutions
proposed for the nonstandard mixtures problem in accounting. However,
as it is studied later in this report, the method assumes that errors are
overstatements with the maximum size of an error of an item equal to its
book amount. Another solution, using a similar procedure, was devised by
van Heerden (1961). His work, however, was slow to become known
within the American accounting profession.
In the public sector, statistical sampling has also become an integral
part of audit tools in the Internal Revenue Service (IRS) since the issuance
of the 1972 memo by their Chief Counsel (IRS, 1972 & 1975). In a tax
examination the audit agent uses statistical sampling of individual items to
estimate the adjustment, if necessary, for an aggregate expense reported in
the tax return. Statistical auditing may also be utilized by other
governmental agencies. For example, the Office of the Inspector General
of the Department of Health and Human Services investigates compliance
of the cost report of a state to the Medicaid policy by using statistical
sampling of items. In these cases, a large proportion of items in an audit
sample requires no adjustment, i.e., most sample items are allowable
deductions. Since an individual item adjustment is seldom negative, the
audit data for estimation of the total adjustment is a mixture of a large
percentage of zeros and a small percentage of positive numbers. Thus, the
mixture model and related statistical problems that are important to
accounting firms in auditing also arise in other auditing contexts such as
those associated with IRS tax examinations. Significant differences also
exist in these applications, however, and these will be stressed later.
For concise accounts of the problems of statistical auditing one is
referred to Knight (1979), Smith (1979) and Neter (1986); the last
reference also includes recent developments. Leslie, Teitlebaum and
Anderson (1980) also provide an annotated bibliography that portrays the
historical development of the subject through 1979. In the sections which
follow, however, we provide a comprehensive survey of the research
efforts that have contributed to the identification and better understanding
of problems in statistical auditing. We include brief descriptions of many
of the solutions that have been proposed for these problems along with
their limitations. It will be noticed that the solutions thus far proposed are
mainly directed toward the special need for good upper bounds on errors
when errors are overstatements. This is an important and common audit
problem for accounting firms but in the case of tax examinations, though
the mixture distribution is similar, the interest is in the study of lower
bounds. Thus in statistical auditing, whether in the private or public
sector, the investigator's interest is usually concerned with one-sided
problems, i.e., of an upper or a lower bound, rather than two-sided
problems as currently stressed in many texts.
The next section provides the definitions and notations that are used.
Then in Sections 3 through 7, we present various methodologies that have
been provided in the literature. A numerical example is given in the last
section to illustrate some of the alternative procedures.
2. Definitions and Notations
An account, such as accounts receivable or inventory, is a population
of individual accounts. To distinguish the use of the word 'account' in the
former sense from the latter, we define the constituent individual accounts,
when used as audit units, as line items. Let $Y_i$ and $X_i$, the latter not
usually known for all values of $i$, denote the book (recorded) amount and
the audited (correct) amount, respectively, for the $i$-th line item of an
account of $N$ line items. The book and audited balances of the account are
respectively
$$Y = \sum_{i=1}^{N} Y_i, \tag{2.1}$$
called the population book amount, and
$$X = \sum_{i=1}^{N} X_i, \tag{2.2}$$
called the population audited amount. The error amount of the $i$-th item
is defined to be
$$D_i = Y_i - X_i. \tag{2.3}$$
When $D_i > 0$, we call it an overstatement and, when $D_i$ [...]
follows:
$$d = \begin{cases} z & \text{with probability } p, \\ 0 & \text{with probability } (1-p), \end{cases} \tag{2.7}$$
where $p$ is the proportion of items with errors in the population and $z \neq 0$
is a random variable representing the error amount. $z$ may depend on the
book amount. The nonstandard mixture problem that is the focus of this
report is the problem of obtaining confidence bounds for the population
total error $D$ when sampling from the model (2.7).
A useful sampling design for statistical auditing is to select items
without replacement with probability proportional to book values. This
sampling design can be modeled in terms of use of individual dollars of
the total book amount as sampling units and is commonly referred to as
Dollar Unit Sampling (DUS) or Monetary Unit Sampling (MUS)
(Anderson and Teitlebaum, 1973; Roberts, 1978; Leslie, Teitlebaum and
Anderson, 1980). The book amounts of the $N$ items are successively
cumulated to a total of $Y$ dollars. One may then choose systematically $n$
dollar units at fixed intervals of $Y/n$ dollars. The items with book
amounts exceeding $Y/n$ dollars, and hence items that are certain to be
sampled, are separately examined. Items with a zero book amount should
also be examined separately as they will not be selected. If a selected
dollar unit falls in the $i$-th item, the tainting $T_i$ ($= D_i/Y_i$) of the item is
recorded. Namely, every dollar unit observation is the tainting of the item
that the unit falls in. The model (2.7) may then be applied for DUS by
considering $d$ as an independent observation of tainting of a dollar unit. $p$
is, then, the probability that a dollar unit is in error. Thus, (2.7) can be
used for sampling individual items or individual dollars. In the former, $d$
stands for the error amount of an item, and in the latter for the tainting of a
dollar unit.
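The systematic selection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `dus_select` and `taintings`, the five book and audited amounts, and the fixed starting dollar are all hypothetical and are not taken from the studies cited above. In practice the starting dollar would be drawn at random between 0 and Y/n.

```python
from itertools import accumulate

def dus_select(book_amounts, n, start):
    """Systematic dollar-unit selection over cumulated book amounts.

    `start` is the position of the first selected dollar, 0 <= start < Y/n
    (in practice drawn at random). Returns the item index hit by each of
    the n selected dollars, plus the items to examine separately.
    """
    total = sum(book_amounts)
    interval = total / n
    cumulative = list(accumulate(book_amounts))
    hits, j = [], 0
    for k in range(n):
        dollar = start + k * interval
        # Advance to the item whose cumulated range contains this dollar.
        # Zero-amount items add nothing to the running total and are skipped.
        while j < len(cumulative) - 1 and cumulative[j] <= dollar:
            j += 1
        hits.append(j)
    # Items larger than the interval are certain to be hit; zero items never are.
    certain = [i for i, y in enumerate(book_amounts) if y > interval]
    zero = [i for i, y in enumerate(book_amounts) if y == 0]
    return hits, certain, zero

def taintings(book_amounts, audited_amounts, hits):
    """Tainting (book - audited) / book of the item behind each selected dollar."""
    return [(book_amounts[i] - audited_amounts[i]) / book_amounts[i] for i in hits]

book = [100, 0, 250, 50, 600]      # hypothetical book amounts, Y = 1000
audited = [100, 0, 200, 50, 600]   # hypothetical audited amounts
hits, certain, zero = dus_select(book, n=4, start=10)
taints = taintings(book, audited, hits)
```

With these hypothetical figures the interval is 250 dollars, so the 600-dollar item is hit more than once and is flagged as a certainty item, while the zero-amount item can never be reached by a selected dollar.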
In the next section we present some results from several empirical
studies to illustrate values of p and the distribution of z, for both line item
sampling and DUS designs.
3. Error Distributions of Audit Populations - Empirical Evidence
Why do errors occur? Hylas and Ashton (1982) conducted a survey
of the audit practices of a large accounting firm in order to investigate the
kinds of accounts that are likely to show errors, to obtain alternative audit
leads for detection of these errors, and to attempt to identify their apparent
causes. Not surprisingly, their study shows that unintentional human error
is the most likely cause of recording errors. The remainder of this section
reports the results of several empirical studies about actual values of the
error rate $p$ and actual distributions of the non-zero error $z$ in the model
(2.7). The sample audit populations are from a few large accounting firms
and each contains a relatively large number of errors. Therefore, the
conclusions may not represent typical audit situations.
A) Line item errors. Data sets supplied by a large accounting firm
were studied by Ramage, Krieger and Spero (1979) and again by Johnson,
Leitch and Neter (1981). The orientations of the two studies differ in
some important respects. The latter provides more comprehensive
information about the error amount distributions of the given data sets. It
should be noted that the data sets are not chosen randomly. Instead, they
have been selected because each data set contains a large enough number of
errors to yield a reasonably smooth picture of the error distribution.
According to the study by Johnson et al. (1981), the median error
rate of 55 accounts receivable data sets is .024 (the quartiles are Q1 = .004 and
Q3 = .089). On the other hand, the median error rate of 26 inventory audits
is .154 (Q1 = .073 and Q3 = .399). Thus the error amount distribution of a
typical accounts receivable in their study has a mass .98 at zero. A
random sample of 100 items from such a distribution will then contain, on
the average, only two non-zero observations. On the other hand, the error
amount distribution of a typical inventory in their study has a mass .85 at
the origin and a sample of 100 items from such a distribution will contain,
on the average, 15 non-zero observations. The items with larger book
amounts are more likely to be in error than those with smaller book
amounts. The average error amount, however, does not appear to be
related to the book amount. On the other hand, the standard deviation of
the error amount tends to increase with book amount.
Ham, Losell and Smieliauskas (1985) conducted a similar study
using data sets provided by another accounting firm. Besides accounts
receivable and inventory, this study also included accounts payable,
purchases and sales. Four error rates are defined and reported for each
category of accounts. It should be noted that their study defines errors
broadly, since they include errors that do not accompany changes in
recorded amounts.
The distribution of non-zero error amounts again differs substantially
between receivables and inventory. The error amounts for receivables are
likely to be overstated and their distribution positively skewed. On the
other hand, errors for inventory include both overstatements and
understatements with about equal frequency. However, for both account
categories, the distributions contain outliers. Graphs in Figure 1 are taken
from Johnson et al. (1981) and illustrate forms of the error amount
distributions of typical receivables and inventory audit data. The figures
show the non-normality of error distributions.
Similar conclusions are also reached by Ham et al. (1985). Their
study also reports the distribution of error amounts for accounts payable
and purchases. The error amounts tend to be understatements for these
categories. Again, the shapes of the distributions are not normal.
B) Dollar unit taintings. When items are chosen with probability
proportional to book amounts, the relevant error amount distribution is the
distribution of taintings weighted by the book amount. Equivalently, it is
the distribution of dollar unit taintings. Table 1 tabulates an example.
Neter, Johnson and Leitch (1985) report the dollar unit tainting
distributions of the same audit data that they analyzed previously. The
median error rate of receivables is .040 for dollar units and is higher than
that of line items (.024). Similarly, the median dollar unit error rate for
inventory is .186 (.154 for line items). The reason is, they conclude, that
the line item error rate tends to be higher for items with larger book
amount for both categories. Since the average line item error amount is
not related to the book amount, the dollar unit tainting tends to be
smaller for items with larger book amounts. Consequently, the
distribution of dollar unit tainting tends to be concentrated around the
origin. Some accounts receivable have, however, a J-shaped dollar unit
taint distribution with negative skewness.
One significant characteristic of the dollar unit tainting distribution
that is common for many accounts receivable is the existence of a mass at
1.00, indicating that a significant proportion of these items has a 100%
overstatement error. Such an error could arise when, for example, an
account has been paid in full but the transaction has not been recorded. A
standard parametric distribution such as normal, exponential, gamma, beta
and so on, alone may not be satisfactory for modeling such a distribution.
Figure 2 gives the graphs of the dollar unit tainting distributions for the
same audit data used in Figure 1. Note that the distribution of taintings
can be skewed when that of error amounts is not. Note also the existence
of an appreciable mass at 1 in the accounts receivable example. The
situation here may be viewed as a nonstandard mixture in which the
discrete part has masses at two points.
Figure 1
Examples of Distribution of Error Amounts
(A) Accounts Receivable (106 observations)
(B) Inventory (1139 observations)
[Plots not reproduced; horizontal axis: Error Amounts.]
Source: Figures 1 and 2 of Johnson, Leitch and Neter (1981)
Figure 2
Examples of Distribution of Dollar Unit Tainting
(A) Accounts Receivable
[Plot not reproduced; horizontal axis: Dollar Unit Taint.]
and
$$E(p_i) = p_i, \tag{7.12a}$$
$$\mathrm{Var}(p_i) = \frac{p_i(1 - p_i)}{K + 1}. \tag{7.12b}$$
Let $w = (w_0, \ldots, w_{100})$ with $\sum w_i = n$ be the sample data of $n$ items. $w$ is
distributed as a multinomial distribution $(n, p)$ when sampling is with
replacement (if sampling is without replacement, approximately). Since
the Dirichlet prior distribution is conjugate with the multinomial sampling
model, the posterior distribution of $p$ is again a Dirichlet distribution with
the parameter $(Kp + w)$. We may define
$$K' = K + n, \tag{7.13a}$$
$$\hat{p}_i = \frac{w_i}{n}, \tag{7.13b}$$
and
$$p'_i = \frac{K p_i + n \hat{p}_i}{K'}, \quad i = 0, \ldots, 100. \tag{7.13c}$$
Then the posterior distribution of $p$ is Dirichlet $(K' p')$, where
$p' = (p'_0, \ldots, p'_{100})$. By the definition of $\mu_D$:
$$\mu_D = \sum_{i=1}^{100} \frac{i}{100}\, p_i, \tag{7.14}$$
the posterior distribution of $\mu_D$ is derived as a linear combination of $p'_i$.
It can be shown that
$$E(\mu_D) = \sum_{i=1}^{100} \frac{i}{100}\, p'_i, \quad \text{and} \tag{7.15a}$$
$$\mathrm{Var}(\mu_D) = \frac{1}{K' + 1} \left\{ \sum_{i=1}^{100} \left(\frac{i}{100}\right)^2 p'_i - \left( \sum_{i=1}^{100} \frac{i}{100}\, p'_i \right)^2 \right\}. \tag{7.15b}$$
The exact distribution of $\mu_D$ is complicated and therefore is approximated
by a Beta distribution having the same mean and variance. Using
simulation, Tsui et al. suggest that $K = 5$, $p_0 = .8$, $p_{100} = .101$ and the
remaining 99 $p_i$'s being .001 be used as the prior setting for their upper
bound to perform well under repeated sampling for a wide variety of
tainting distributions.
McCray (1984) suggests another nonparametric Bayesian approach
using the multinomial distribution as the data generating model. In his
model, $\mu_D$ is discretized, involving a number of categories, say
$\mu_{Dj}$, $j = 1, \ldots, N_\mu$. The auditor is to provide his assessment of the prior
distribution by assigning probabilities $q_j$ to the values $\mu_{Dj}$. Then the
posterior distribution of $\mu_D$ is determined to be
$$\mathrm{Prob}(\mu_D = \mu_{Dj} \mid w) = \frac{q_j\, L(w \mid \mu_{Dj})}{\sum_k q_k\, L(w \mid \mu_{Dk})}, \tag{7.16}$$
where
$$L(w \mid \mu_{Dj}) = \max \prod_{i=0}^{100} p_i^{\,w_i}, \tag{7.17}$$
in which the maximum is taken over all probabilities $\{p_i\}$ satisfying
$$\sum_{i=1}^{100} \frac{i}{100}\, p_i = \mu_{Dj}. \tag{7.18}$$
It should be noted that the two nonparametric models introduced
above can incorporate negative taintings; that is, the auditor defines any
finite lower and upper limits for tainting and divides the sample space into
a finite number of categories.
Simulation studies have been performed to compare performances of
these Bayesian bounds with the procedures described in earlier sections.
Dworin and Grimlund (1986) compare the performance of their moment
bound with that of McCray's procedure. Several Bayesian and
non-Bayesian procedures are also compared in Smieliauskas (1986). Grimlund
and Felix (1987) provide results of an extensive simulation study that
compares the long run performances of the following bounds: Bayesian
bounds with normal error distribution as discussed in A) above, the Cox
and Snell bound as discussed in C), the bound of Tsui et al. as discussed in D)
and the moment bound discussed in Section 6.
Recently, Tamura (1988) has proposed a nonparametric Bayesian
model using Ferguson's Dirichlet process to incorporate the auditor's prior
prediction of the conditional distribution of the error. It is hypothesized
that the auditor cannot predict the exact form of the error distribution, but
is able to describe the expected form. Let $F_0(z)$ be the expected
distribution function of $z$ representing the auditor's best prior prediction.
The auditor may use any standard parametric model for $F_0$. Alternatively,
$F_0$ may be based directly on past data. The auditor assigns a finite weight
$\alpha_0$ to indicate his uncertainty about the prediction. Then the auditor's
prior prediction is defined by the Dirichlet process with the parameter
$$\alpha(z) = \alpha_0 F_0(z). \tag{7.19}$$
This means that $\mathrm{Prob}(z \le z')$ is distributed according to the beta
distribution $\mathrm{Beta}(\alpha(z'), \alpha_0 - \alpha(z'))$. The posterior prediction given $m$
observations on $z$, say $\mathbf{z} = (z_1, \ldots, z_m)$, is then defined by the Dirichlet
process with the parameter
$$\alpha(z \mid \mathbf{z}) = \{\alpha_0 + m\}\{w_m F_0(z) + (1 - w_m) F_m(z)\}, \tag{7.20}$$
where
$$w_m = \frac{\alpha_0}{\alpha_0 + m} \tag{7.21}$$
and $F_m(z)$ is the empirical distribution function of $z$. The distribution
function of the mean $\mu_z$ of $z$ is given by
$$G(v) = \mathrm{Prob}(\mu_z \le v) = \mathrm{Prob}(T(v) \le 0),$$
where the characteristic function of $T(v)$ is
$$\phi_{(v)}(u) = \exp\left[ -\int \log\{1 - iu(t - v)\}\, d\alpha(t) \right]. \tag{7.22}$$
The distribution of $\mu_z$ is obtained by numerical inversion of (7.22). The
distribution function of the mean tainting $\mu_D$ is, then, given by
$$H(d) = \mathrm{Prob}(\mu_D \le d) = \mathrm{Prob}(p\mu_z \le d) = E\{\mathrm{Prob}(\mu_z \le d/p \mid p)\}. \tag{7.23}$$
This integration can be done numerically. In this work, a beta distribution
is proposed to model $p$.
8. Numerical Examples
In Sections 5 through 7 various methods for setting a confidence
bound for the accounting population error were described. They differ
from the classical methods of Section 4 in the sense that these methods do
not assume that the sampling distributions of their estimators are normal.
Among these new developments, we illustrate in this section the
computation of the following upper bounds for the total population error:
the Stringer bound, the multinomial bound, parametric bounds using the
power function, and the moment bound. In addition, computation of two
Bayesian models developed by Cox and Snell and Tsui et al. will also be
illustrated. Software for computing all but one of these bounds can be
developed easily. The exception is the multinomial bound, which requires
extensive programming unless the number of errors in the sample is either
0 or 1. These methods are designed primarily for setting an upper bound
of an accounting population error contaminated by overstatements in
individual items. The maximum size of the error amount of an item is
assumed not to exceed its book amount. These methods also assume DUS.
Under this sampling design the total population error amount is equal to
the known book amount $Y$ times the mean tainting per dollar unit
$\mu_D = p\mu_z$. We will, therefore, demonstrate the computation of a .95
upper bound for $\mu_D$ using each method. The data used for these
illustrations are hypothetical. Our main objectives are to provide some
comparisons of bounds using the same audit data and also to provide
numerical checks for anyone who wishes to develop software for some of
the bounds illustrated in this section.
A) No errors in the sample. When there are no errors in a sample of
$n$ dollar units, the Stringer, multinomial, and power function bounds are
identical and are given by the .95 upper bound for the population error rate
$p$. The bound is therefore directly computed by
$$\hat{p}_u(0; .95) = 1 - .05^{1/n} \tag{8.1}$$
using the Binomial distribution. For $n = 100$, $\hat{p}_u(0; .95) = .0295$. In
practice, the Poisson approximation of $3.0/n$ is often used. The
computation of the moment bound is more involved but gives a very
similar result.
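As a quick numerical check of (8.1), the following sketch evaluates the exact binomial bound and the Poisson approximation side by side. The function name `zero_error_upper_bound` is chosen here for illustration only.

```python
def zero_error_upper_bound(n, conf=0.95):
    """Upper confidence bound for the error rate p when a sample of n
    dollar units contains no errors: solves (1 - p)**n = 1 - conf."""
    return 1.0 - (1.0 - conf) ** (1.0 / n)

n = 100
exact = zero_error_upper_bound(n)   # binomial bound of (8.1)
poisson = 3.0 / n                   # Poisson approximation used in practice
```

For n = 100 the exact value rounds to .0295, and the Poisson figure of .03 is slightly larger.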
For Bayesian bounds, the value of a .95 confidence bound depends
on the choice of the prior about the error distribution. Using extensive
simulation, Neter and Godfrey (1985) discovered that for certain priors
the Cox and Snell bound demonstrates a desirable relative frequency
behavior under repeated sampling. One such setting is to use the following
values for the mean and the standard deviation for the gamma prior of $p$
and $\mu_z$, respectively: $p_0 = .10$, $\sigma_p = .10$, $\mu_0 = .40$, and $\sigma_\mu = .20$. These can be
related to the parameters $a$ and $b$ in (7.11) as follows.
$$a = (p_0/\sigma_p)^2, \tag{8.2a}$$
$$b = (\mu_0/\sigma_\mu)^2 + 2.0. \tag{8.2b}$$
Thus for no errors in the sample, i.e., $m = 0$, using the above prior values,
we compute
$$a = (.10/.10)^2 = 1,$$
$$b = (.40/.20)^2 + 2.0 = 6.$$
The degrees of freedom for the $F$ distribution are $2(m+a)$ and $2(m+b)$, so
for $m = 0$ they are 2 and 12, respectively. Since the 95 percentile of $F_{2,12}$ is
3.89, and the coefficient, when $n = 100$, is
$$\frac{m\bar{z} + (b-1)\mu_0}{n + a/p_0} \cdot \frac{m+a}{m+b} = \frac{(6-1)(.40)}{100 + 1.0/.10} \cdot \frac{1}{6} = .00303,$$
the 95% Cox and Snell upper bound is $.00303 \times 3.89 = .01177$.
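The arithmetic of this no-error case can be retraced with the short sketch below. It is not a general implementation of the Cox and Snell bound: it covers only m = 0 with a prior for which the numerator degrees of freedom equal 2, because in that special case the F quantile has the closed form (d2/2)[(1 - conf)^(-2/d2) - 1] and no statistical library is needed. The function and argument names are illustrative.

```python
def cox_snell_bound_no_errors(n, p0, sigma_p, mu0, sigma_mu, conf=0.95):
    """Cox and Snell upper bound for m = 0 errors (sketch, special case only)."""
    m = 0
    a = (p0 / sigma_p) ** 2                 # (8.2a)
    b = (mu0 / sigma_mu) ** 2 + 2.0         # (8.2b)
    # Coefficient multiplying the F quantile; the m * z_bar term vanishes for m = 0.
    coefficient = ((b - 1.0) * mu0) / (n + a / p0) * (m + a) / (m + b)
    df1, df2 = 2.0 * (m + a), 2.0 * (m + b)
    if abs(df1 - 2.0) > 1e-12:
        raise ValueError("closed-form F quantile used here needs 2 numerator df")
    # Quantile of F(2, df2): invert the CDF 1 - (1 + 2x/df2)**(-df2/2).
    f_quantile = (df2 / 2.0) * ((1.0 - conf) ** (-2.0 / df2) - 1.0)
    return coefficient, f_quantile, coefficient * f_quantile

coefficient, f_quantile, bound = cox_snell_bound_no_errors(
    n=100, p0=0.10, sigma_p=0.10, mu0=0.40, sigma_mu=0.20
)
```

For other values of m or other priors the numerator degrees of freedom differ from 2, and a general F quantile routine would be required instead of the closed form.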
For another Bayesian bound proposed by Tsui et al. we use the prior
given in Section 7, namely, the Dirichlet prior with parameters $K = 5.0$,
$p_0 = .8$, $p_{100} = .101$, and $p_i = .001$ for $i = 1, \ldots, 99$. Given no error in a
sample of 100 dollar unit observations, the posterior values for these
parameters are $K' = K + n = 105$, and $p'_0 = (K p_0 + w_0)/K' = (5(.8) + 100)/105 =
.99048$. Similarly, $p'_{100} = 5(.101)/105 = .00481$, and $p'_i = 5(.001)/105 =
.00004762$ for $i = 1, \ldots, 99$. The expected value for the posterior $\mu_D$ is then
$$E(\mu_D) = \left( \tfrac{1}{100} + \tfrac{2}{100} + \cdots + \tfrac{99}{100} \right)(.00004762) + \tfrac{100}{100}(.00481) = .007167.$$
To obtain $\mathrm{Var}(\mu_D)$, we compute $\sum_i (i/100)^2 p'_i = .0063731$ so that, by (7.15b),
$$\mathrm{Var}(\mu_D) = \frac{.0063731 - \{E(\mu_D)\}^2}{K' + 1}.$$
The posterior distribution is, then, approximated by the Beta distribution
having the expected value and the variance computed above. The two
parameters $\alpha$ and $\beta$ of the approximating Beta distribution $B(\alpha, \beta)$ are
$$\alpha = E(\mu_D) \left[ \frac{E(\mu_D)\{1 - E(\mu_D)\}}{\mathrm{Var}(\mu_D)} - 1 \right] = .848$$
and
$$\beta = \{1 - E(\mu_D)\} \left[ \frac{E(\mu_D)\{1 - E(\mu_D)\}}{\mathrm{Var}(\mu_D)} - 1 \right] = 117.46.$$
The upper bound is then given by the 95 percentile of the Beta distribution
with parameters .848 and 117.46, which is .0227.
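The posterior update and the moment matching in this example can be retraced as follows. The sketch assumes the Dirichlet prior quoted above and the posterior mean and variance formulas (7.15a) and (7.15b); the function name `tsui_posterior_beta` is illustrative. It stops at the two Beta parameters, since the final percentile needs an incomplete beta routine that is outside the standard library.

```python
def tsui_posterior_beta(counts, K=5.0, prior=None):
    """Posterior mean and variance of mu_D and the matching Beta parameters.

    counts[i] is the number of sampled dollar units with tainting i/100,
    i = 0, ..., 100. The default prior is the setting quoted in the text.
    """
    if prior is None:
        prior = [0.8] + [0.001] * 99 + [0.101]
    n = sum(counts)
    K_post = K + n                                           # (7.13a)
    post = [(K * p + w) / K_post for p, w in zip(prior, counts)]   # (7.13c)
    mean = sum(i / 100 * p for i, p in enumerate(post))            # (7.15a)
    second = sum((i / 100) ** 2 * p for i, p in enumerate(post))
    var = (second - mean ** 2) / (K_post + 1)                      # (7.15b)
    # Method-of-moments Beta parameters with the same mean and variance.
    common = mean * (1 - mean) / var - 1
    return mean, second, var, mean * common, (1 - mean) * common

counts = [100] + [0] * 100          # no errors in 100 dollar units
mean, second, var, alpha, beta = tsui_posterior_beta(counts)
```

Given the two parameters, the 95 percentile can then be read from any Beta quantile routine.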
B) One error in the sample. When the DUS audit data contain one
error, each method produces a different result. First of all, for computation
of the Stringer bound, we determine a .95 upper bound for $p$, $\hat{p}_u(m; .95)$,
for $m = 0$ and 1. Software is available for computing these values (e.g.,
BELBIN in International Mathematical and Statistical Libraries (IMSL)).
We compute $\hat{p}_u(0; .95) = .0295$ and $\hat{p}_u(1; .95) = .0466$. Suppose that the
observed tainting is $t = .25$. Then a .95 Stringer bound is
$$\hat{p}_u(0; .95) + t\,[\hat{p}_u(1; .95) - \hat{p}_u(0; .95)] = .0295 + .25(.0466 - .0295) = .0338.$$
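For readers without access to such software, the binomial upper limits can be obtained by bisection on the binomial distribution function, as in the sketch below. The helper names are illustrative. The function `stringer_bound` applies the usual Stringer formula, with taintings taken in decreasing order, which reduces to the expression above when there is a single error.

```python
from math import comb

def binom_cdf(m, n, p):
    """P(X <= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(m + 1))

def upper_limit(n, m, conf=0.95):
    """Upper confidence limit for p given m errors in n units:
    the p at which P(X <= m) falls to 1 - conf, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(m, n, mid) > 1 - conf:
            lo = mid          # p still too small
        else:
            hi = mid
    return (lo + hi) / 2

def stringer_bound(n, taintings, conf=0.95):
    """Stringer bound on the mean tainting per dollar unit."""
    t_sorted = sorted(taintings, reverse=True)
    limits = [upper_limit(n, j, conf) for j in range(len(t_sorted) + 1)]
    increments = (t * (limits[j] - limits[j - 1])
                  for j, t in enumerate(t_sorted, start=1))
    return limits[0] + sum(increments)

one_error = stringer_bound(100, [0.25])
two_errors = stringer_bound(100, [0.25, 0.40])
```

With taintings .25 and .40 the same function gives the two-error Stringer value reported in Table 4.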
Second, the multinomial bound has an explicit solution for one error.
It is convenient to express the observed tainting in cents, so set $i = 100t$.
Denote also a .95 lower bound for $p$ as $\hat{p}_l(m; .95)$ when a sample of $n$
observations contains $m$ errors. Then a .95 multinomial bound for $m = 1$ is
given by $(i\,p_i + 100\,p_{100})/100$, where $p_i$ and $p_{100}$ are determined as
follows. Let
$$p_0 = \max\left\{ \left[ \frac{.05}{1 + \dfrac{i\,n}{(100 - i)(n - 1)}} \right]^{1/n},\ \hat{p}_l(n-1; .95) \right\}. \tag{8.3}$$
Then
$$p_i = \frac{1}{n} \left[ \frac{.05}{p_0^{\,n-1}} - p_0 \right] \tag{8.4}$$
and
$$p_{100} = 1 - p_0 - p_i. \tag{8.5}$$
To illustrate the above computation, using $i = 25$ and $n = 100$, we compute
that $\hat{p}_l(99; .95) = .9534$ and
$$\left[ \frac{.05}{1 + \dfrac{25(100)}{(100 - 25)(100 - 1)}} \right]^{1/100} = .96767,$$
so that $p_0 = .96767$. Then by (8.4),
$$p_{25} = \frac{1}{100} \left[ \frac{.05}{.96767^{99}} - .96767 \right] = .00326.$$
Hence, $p_{100} = 1.0 - .96767 - .00326 = .0291$. A .95 multinomial upper
bound, when $m = 1$, is then $.25(.00326) + .0291 = .02988$.
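The one-error computation can be checked numerically with the sketch below, which follows (8.3) through (8.5) as read here. The helper `lower_limit_all_but_one` is a hypothetical name: it solves for the probability at which observing at least n - 1 error-free units out of n has probability .05, which reproduces the value .9534 used above.

```python
def lower_limit_all_but_one(n, alpha):
    """Solve p**n + n * p**(n-1) * (1 - p) = alpha for p by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        tail = mid ** n + n * mid ** (n - 1) * (1 - mid)
        if tail < alpha:
            lo = mid          # tail probability increases with p
        else:
            hi = mid
    return (lo + hi) / 2

def multinomial_bound_one_error(n, taint, conf=0.95):
    """Multinomial upper bound on the mean tainting for exactly one error."""
    alpha = 1 - conf
    i = 100 * taint                                   # tainting in cents
    interior = (alpha / (1 + i * n / ((100 - i) * (n - 1)))) ** (1 / n)
    p0 = max(interior, lower_limit_all_but_one(n, alpha))      # (8.3)
    p_i = (alpha / p0 ** (n - 1) - p0) / n                      # (8.4)
    p_100 = 1 - p0 - p_i                                        # (8.5)
    return p0, p_i, p_100, (i * p_i + 100 * p_100) / 100

p0, p_i, p_100, bound = multinomial_bound_one_error(100, 0.25)
```

The interior value in (8.3) is the stationary point obtained by maximizing the bound subject to the .05 probability constraint, so the maximum with the lower limit only matters when that stationary point falls below it.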
Third, we discuss computation of the parametric bound using the
power function for modeling the distribution of tainting. The density of $z$
is $f(z) = \lambda z^{\lambda - 1}$ for $0$ [...]
A random sample of size $n$ from the distribution (8.9) and (8.10) is called
the bootstrap sample. Denote $\mu^*_D$ as the value of $\hat{\mu}_D$
computed from a single bootstrap sample. The distribution of $\mu^*_D$ under
sampling from (8.9) and (8.10) is the bootstrap distribution of $\mu^*_D$. The
95 percentile of the bootstrap distribution is used to set a bound for $\mu_D$.
To approximate the bootstrap sampling distribution, we may use
simulation. Let $B$ be the number of independent bootstrap samples. Then
an estimate of a .95 upper bound is $U_B$ such that [...]
$$A = 4 m_2^3 / m_3^2, \tag{8.12}$$
$$B = (1/2)\, m_3 / m_2 \tag{8.13}$$
and
$$G = m_1 - 2 m_2^2 / m_3. \tag{8.14}$$
For computation of $m_i$ of the sample mean taintings a number of heuristic
arguments are introduced. First of all, we compute the average tainting $\bar{t}$
= .325 of the two observations. Suppose that the population audited is a
population of accounts receivable. Then, we compute, without any
statistical explanation being given, the third data point, $t^*$:
$$t^* = .81\,[1 - .667 \tanh(10\bar{t})]\,[1 + .667 \tanh(m/10)] = .3071.$$
The term in the second pair of brackets will not be used when the
population is inventory. $t^*$ is so constructed that when there is no error in
a sample, the upper bound is very close to the Stringer bound. Using,
thus, three data points - two observed and one constructed - the first three
noncentral moments are computed for $z$, i.e., the tainting of items in error.
They are:
$$\nu_{z,1} = (.25 + .40 + .3071)/3 = .31903,$$
$$\nu_{z,2} = (.25^2 + .40^2 + .3071^2)/3 = .1056, \text{ and}$$
$$\nu_{z,3} = (.25^3 + .40^3 + .3071^3)/3 = .03619.$$
The noncentral moments of $d$ are simply $p$ times the noncentral moments
of $z$. Using well known properties of moments, the population central
second and third moments can then be derived from noncentral moments.
These population central moments are used to determine the three
noncentral moments of the sample mean. Throughout these steps the error
rate $p$ is treated as a nuisance parameter but at this stage is integrated out
using the normalized likelihood function of $p$. Then, the noncentral
moments of the sample mean are shown to be as follows:
$$\nu_{d,1} = \frac{m+1}{n+2}\,\nu_{z,1}, \tag{8.15}$$
$$\nu_{d,2} = \frac{1}{n}\left[ \frac{m+1}{n+2}\,\nu_{z,2} + (n-1)\,\frac{m+1}{n+2}\,\frac{m+2}{n+3}\,\nu_{z,1}^2 \right], \tag{8.16}$$
$$\nu_{d,3} = \frac{1}{n^2}\left[ \frac{m+1}{n+2}\,\nu_{z,3} + 3(n-1)\,\frac{m+1}{n+2}\,\frac{m+2}{n+3}\,\nu_{z,1}\nu_{z,2} + (n-1)(n-2)\,\frac{m+1}{n+2}\,\frac{m+2}{n+3}\,\frac{m+3}{n+4}\,\nu_{z,1}^3 \right]. \tag{8.17}$$
Using (8.15) through (8.17), we compute $\nu_{d,1} = .93831 \times 10^{-2}$,
$\nu_{d,2} = .14615 \times 10^{-3}$, and $\nu_{d,3} = .29792 \times 10^{-5}$. Then,
$$m_1 = \nu_{d,1} = .93831 \times 10^{-2}, \tag{8.18}$$
$$m_2 = \nu_{d,2} - \nu_{d,1}^2 = .5811 \times 10^{-4}, \tag{8.19}$$
$$m_3 = \nu_{d,3} - 3\nu_{d,1}\nu_{d,2} + 2\nu_{d,1}^3 = .51748 \times 10^{-6}. \tag{8.20}$$
Using these values, we compute $A = 2.93$, $B = 0.00445$ and $G = -.00366$.
These parameter estimates are used to determine the 95 percentile of the
gamma distribution to set a .95 moment bound. The bound is .0238. For
comparison, for the same audit data, the Stringer bound = .0401, the
parametric bound = .0238, and using the prior settings previously
selected, the Cox and Snell bound = .0248 and the Tsui et al. bound =
.0304. Table 4 tabulates the results. Note that when there is no error in
the sample ($m = 0$), the two Bayesian bounds, under the headings C&S and
Tsui, are considerably smaller than the four other bounds. The reason is
that the four other bounds assume that all taints are 100% when there is no
error in the sample. When the sample does contain some errors, the
bounds are closer, as shown for $m = 1$ and 2.
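The chain of moment calculations for the two-error example can be retraced with the sketch below. It assumes the heuristic third data point and the formulas (8.12) through (8.20) as reconstructed here, needs at least one observed tainting, and stops at the three gamma parameters A, B and G; the final 95 percentile requires a gamma quantile routine that is not part of the standard library. The function name and the `receivables` flag are illustrative.

```python
from math import tanh

def moment_bound_gamma_params(n, taintings, receivables=True):
    """Gamma parameters (A, B, G) for the moment bound; needs m >= 1 errors."""
    m = len(taintings)
    t_bar = sum(taintings) / m
    # Heuristic constructed data point t*; the second factor is used
    # only for accounts receivable populations.
    t_star = 0.81 * (1 - 0.667 * tanh(10 * t_bar))
    if receivables:
        t_star *= 1 + 0.667 * tanh(m / 10)
    points = list(taintings) + [t_star]
    vz1, vz2, vz3 = (sum(t ** k for t in points) / len(points) for k in (1, 2, 3))
    # Noncentral moments of the sample mean, (8.15) to (8.17).
    r1 = (m + 1) / (n + 2)
    r2 = r1 * (m + 2) / (n + 3)
    r3 = r2 * (m + 3) / (n + 4)
    vd1 = r1 * vz1
    vd2 = (r1 * vz2 + (n - 1) * r2 * vz1 ** 2) / n
    vd3 = (r1 * vz3 + 3 * (n - 1) * r2 * vz1 * vz2
           + (n - 1) * (n - 2) * r3 * vz1 ** 3) / n ** 2
    # Central moments, (8.18) to (8.20).
    m1 = vd1
    m2 = vd2 - vd1 ** 2
    m3 = vd3 - 3 * vd1 * vd2 + 2 * vd1 ** 3
    # Three-parameter gamma matched to the moments, (8.12) to (8.14).
    A = 4 * m2 ** 3 / m3 ** 2
    B = m3 / (2 * m2)
    G = m1 - 2 * m2 ** 2 / m3
    return t_star, (m1, m2, m3), (A, B, G)

t_star, central, (A, B, G) = moment_bound_gamma_params(100, [0.25, 0.40])
```

Because the third central moment is a small difference of larger terms, the parameters are sensitive to rounding in the intermediate moments, so a check against the printed values should allow a small tolerance.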
Table 4
Comparison of Six .95 Upper Confidence Bounds for $\mu_D$: the
Stringer bound, the Multinomial bound, the Moment bound, the
Parametric bound, the Cox and Snell bound, and the Tsui et al. bound.
(Sample size is n = 100)

No. of Errors            Str.    Mult.    Moment   Para.   C & S   Tsui
m = 0                    .0295   .0295    .0295    .0295   .0118   .0227
m = 1 (t = .25)          .0338   .0299    .0156    .0152   .0182   .0255
m = 2 (t = .40, .25)     .0401   .0315*   .0239    .0238   .0248   .0304

Note: * This value was computed by the software made available by R.
Plante.