Programming Machine Learning - Dimension numbers correct? (page 118)

Here I am, at long last! Sorry for the slow reply, @mscdit. I was knocked out by the flu.

This does indeed look like a mistake in that diagram (one that is even repeated in two consecutive chapters). w1 should be (785, 200), not (785, 201). The hidden layer then comes out as (M, 200), and it only becomes (M, 201) once you prepend the bias column.
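If you want to check the shapes yourself, here is a minimal NumPy sketch. It is not the book's actual code: M = 5 is an arbitrary number of examples, the 784 input columns are my assumption (785 minus the bias column), the zero-filled arrays are placeholders, and I skip the activation function because it doesn't change any shape.

```python
import numpy as np

def prepend_bias(X):
    # Insert a column of 1s at position 0 (the bias column).
    return np.insert(X, 0, 1, axis=1)

M = 5                                   # hypothetical number of examples
X = prepend_bias(np.zeros((M, 784)))    # assumed 784 inputs -> (M, 785)
w1 = np.zeros((785, 200))               # corrected shape: (785, 200)

h = X @ w1                              # hidden layer: (M, 200)
h_with_bias = prepend_bias(h)           # after prepending the bias: (M, 201)

print(X.shape, w1.shape, h.shape, h_with_bias.shape)
```

With w1 shaped (785, 201) instead, the hidden layer would already be (M, 201) before the bias column is added, and (M, 202) after, which is why the number in the diagram can't be right.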

Thank you for spotting it! I’ll add it to the errata for future editions.