Errata
The errata list is a list of errors and their corrections that were found after the product was released. Use the Github issue tracker to submit new errors.
Chapter 1
- p. 12-13, code for Figure 1.7 and Figure 1.8 -
value_counts
creates now column namedproportion
:... X.loc[4, 'proportion'] = 0 # there are no samples with 4 blemishes add a row X = X.sort_index() # sort by number of blemishes ax = X['proportion'].plot.bar(color='grey', legend=False) ... X['Cumulative Frequency'] = X['proportion'].cumsum() ...
Chapter 2
- p. 55, Section 2.2.1.1 - Clarification: The
scipy
package uses the term probability mass function instead of probability distribution function for discrete distributions. Both terms are used in the literature. In Modern Statistics, we use the term probability distribution function for both discrete and continuous distributions.
Chapter 3
-
p. 162, Equation 3.30 - Replace percentile for the first $\chi^2$:
$\left(\frac{(n-1)S^2}{\chi^2{1-\alpha/2}[n-1]}, \frac{(n-1)S^2}{\chi^2{\alpha/2}[n-1]}\right)$
- p. 186, Figure 3.15 - Incorrect distribution used to create the Figure. The updated code creates the following Figure. The additional dashed lines show the approximated HPD.
- p. 190 - Code sample, the
compute_bootci
function cannot identifynp.mean
as a function (see PR). Replace withB = pg.compute_bootci(etchrate, func=lambda x: np.mean(x), n_boot=1000, confidence=0.95, return_dist=True, seed=1)
- p. 196 - Code sample, the print statement needs to contain f-strings. Replace
print('Xbar {Xbar:.2f} / SX {SX:.3f}') print('Ybar {Xbar:.2f} / SY {SX:.3f}')
with
print(f'Xbar {Xbar:.2f} / SX {SX:.3f}') print(f'Ybar {Ybar:.2f} / SY {SY:.3f}')
Chapter 4
- p. 251:
Scale
in the regression summary output corresponds to $S_e^2$.The $S_e^2$ value is shown in the regression summary output as
Scale
. - Exercise 4.27: effects => affects
Chapter 6
- p. 333 - naming inconsistency of formula for polynomial model with seasonality. Replace
poly_formula
withformula
:formula = 'Dow_Jones_Index ~ C(month) + I(trend**3) + I(trend**2) + trend + 1' model_3 = smf.ols(formula=formula, data=dow1941_df).fit()
- p. 357 - Correct value of PMSE
The empirical PMSE is 0.8505.
Chapter 7
- p. 379 - iteritems deprecated in pandas package; code change required:
df = pd.DataFrame([ {satisfaction: counts for satisfaction, counts in response.value_counts().items()}, {satisfaction: counts for satisfaction, counts in response[q1_5].value_counts().items()}, ])
Chapter 8
- p. 400 - Import from
from skfda.preprocessing.dim_reduction.projection
is deprecated. Usefrom skfda.preprocessing.dim_reduction import FPCA
- p. 404 - In equation 8.2, the matrix $V$ needs to be transposed:
$DTM \approx U * S * V’$ - p. 408 - estimating the network using HillClimbSearch now requires setting
max_indegree
to 1est = HillClimbSearch(data=abc) model = est.estimate(max_indegree=1, max_iter=int(1e4), show_progress=False, scoring_method='k2score')
the updated code creates the following Figure 8.11 on p. 414 and requires adjustment of the text in Example 8.4
We can see that changes to the Equipment and Suppliers have the largest ATE.
Index
- P-value, 152, 215