Programming Machine Learning: Incorrect formula

Alexandr · 3 November 2020 09:31

Hi everyone!
There is an error on page 71 of the book “Programming Machine Learning: From Coding to Deep Learning” by P. Perrotta. You can’t just replace the weighted sum with a sigmoid without first taking the derivative of the sigmoid. See the explanation here: https://yadi.sk/i/T75bWimUYPAe7Q

Best regards.

nusco · 3 November 2020 10:51

Hey, Alexandr! Thank you for taking the time to report this, and even taking a picture of your notes!

I kinda regret not putting that calculation in the book, because from the explanation I wrote, it looks like the derivative of the sigmoid just disappeared. Actually, the formula in the book does take that derivative into account. The sigmoid isn’t explicit in the final formula, but it’s baked into the values of y-hat.
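One way to see that the sigmoid's derivative is accounted for is a quick numerical check. The sketch below is not the book's code: the function names, the tiny dataset, and the weights are made up for illustration. It compares the formula without an explicit sigmoid derivative (the average of x * (y-hat - y)) against a finite-difference estimate of the log loss gradient.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w):
    # y-hat: the sigmoid applied to the weighted sum
    return sigmoid(sum(xi * wi for xi, wi in zip(x, w)))

def log_loss(X, Y, w):
    total = 0.0
    for x, y in zip(X, Y):
        y_hat = predict(x, w)
        total += y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat)
    return -total / len(X)

def analytic_gradient(X, Y, w):
    # No explicit sigmoid derivative here: it only enters through y-hat
    grad = [0.0] * len(w)
    for x, y in zip(X, Y):
        error = predict(x, w) - y
        for j, xj in enumerate(x):
            grad[j] += xj * error
    return [g / len(X) for g in grad]

def numerical_gradient(X, Y, w, eps=1e-6):
    # Central finite differences, one weight at a time
    grad = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grad.append((log_loss(X, Y, w_plus) - log_loss(X, Y, w_minus)) / (2 * eps))
    return grad

# Hypothetical data: the first column is a bias term, labels are 0 or 1
X = [[1.0, 0.5, -1.2], [1.0, -0.3, 0.8], [1.0, 1.5, 0.1], [1.0, -2.0, -0.4]]
Y = [1, 0, 1, 0]
w = [0.2, -0.5, 0.3]

analytic = analytic_gradient(X, Y, w)
numerical = numerical_gradient(X, Y, w)
gradients_match = all(
    math.isclose(a, n, abs_tol=1e-6) for a, n in zip(analytic, numerical)
)
print(gradients_match)
```

If the two gradients agree, the shorter formula is the full derivative of the loss, with the sigmoid's derivative cancelled out during the calculation rather than skipped.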

It’s been a while since I went through the steps myself, but I googled for a step-by-step calculation and found this one: https://medium.com/analytics-vidhya/derivative-of-log-loss-function-for-logistic-regression-9b832f025c2d

My apologies for not writing the steps by hand (busy day!), but I checked the calculation in the link above, and I cannot see any obvious errors. If you do, can you please point them out?

Alexandr · 3 November 2020 11:22

Thank you!
I just didn’t read it carefully enough. I think it would be a good idea to include that link (https://medium.com/analytics-vidhya/derivative-of-log-loss-function-for-logistic-regression-9b832f025c2d) in the book.
Best regards.
P.S.
Thank you! I think you’ve written the best book for beginners (I’ve read a lot of them).

nusco · 3 November 2020 12:22

No problem, Alexandr, and thank you for reporting the issue anyway!

shane · 6 April 2022 09:36

Hi @nusco
I am confused about a passage in this book.

In the highlighted content, I think the statement is incorrect: the steeper the curve, the smaller the learning step should be, because when the curve is steeper, even a slight change in w makes the loss change greatly.
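To make the question concrete, here is a minimal sketch of plain gradient descent. It assumes the highlighted passage refers to the usual update rule, where the step is the learning rate times the gradient. The quadratic loss, the starting points, and the learning rates are made up for illustration and are not taken from the book.

```python
def loss(w):
    # Toy loss with its minimum at w = 0; steeper the further w is from 0
    return w ** 2

def gradient(w):
    return 2 * w

def step(w, lr):
    # Plain gradient descent: the step is proportional to the gradient
    return w - lr * gradient(w)

lr = 0.1
steep_w = 10.0   # far from the minimum, where the curve is steep
flat_w = 0.5     # close to the minimum, where the curve is flatter

steep_step = abs(step(steep_w, lr) - steep_w)
flat_step = abs(step(flat_w, lr) - flat_w)
print("step where the curve is steep:", steep_step)
print("step where the curve is flat:", flat_step)

# With a learning rate that is too large, the step from the steep
# point overshoots the minimum and the loss gets worse
big_lr = 1.1
overshoot_w = step(steep_w, big_lr)
print("loss before:", loss(steep_w), "loss after overshoot:", loss(overshoot_w))
```

In this toy setup the step is larger where the curve is steeper. With a small learning rate that moves w toward the minimum faster, while with a learning rate that is too large the same rule overshoots, which is the situation the concern above describes.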