Hi everyone!
There is an error on page 71 of the book “Programming Machine Learning: From Coding to Deep Learning” by P. Perrotta. You can’t just replace the weighted sum with a sigmoid without first taking the derivative of the sigmoid. See the explanation here: https://yadi.sk/i/T75bWimUYPAe7Q
Hey, Alexandr! Thank you for taking the time to report this, and even taking a picture of your notes!
I kinda regret not putting that calculation in the book, because from the explanation I wrote, it looks like the derivative of the sigmoid just disappeared. Actually, the formula in the book does take that derivative into account. The sigmoid isn’t explicit in the final formula, but it’s baked into the values of y-hat.
My apologies for not writing the steps by hand (busy day!), but I checked the calculation in the link above, and I cannot see any obvious errors. If you do, can you please point them out?
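For anyone who wants to check this without redoing the algebra, here is a minimal numerical sketch. It assumes the loss is the log loss (cross-entropy) and that y-hat is the sigmoid of the weighted sum; the function names and the random data are illustrative, not taken from the book. It compares the gradient formula with no explicit sigmoid derivative against a finite-difference estimate of the gradient.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def log_loss(X, Y, w):
    # Assumed loss: average cross-entropy between labels and predictions.
    y_hat = sigmoid(X @ w)
    return -np.mean(Y * np.log(y_hat) + (1 - Y) * np.log(1 - y_hat))

def analytic_gradient(X, Y, w):
    # The sigmoid's derivative y_hat * (1 - y_hat) cancels against the
    # derivative of the log loss, leaving only (y_hat - Y).
    return X.T @ (sigmoid(X @ w) - Y) / X.shape[0]

def numerical_gradient(X, Y, w, eps=1e-6):
    # Central finite differences, one weight at a time.
    grad = np.zeros_like(w)
    for i in range(w.shape[0]):
        step = np.zeros_like(w)
        step[i] = eps
        grad[i] = (log_loss(X, Y, w + step) - log_loss(X, Y, w - step)) / (2 * eps)
    return grad

# Made-up data: 20 examples, 3 input variables, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
Y = rng.integers(0, 2, size=(20, 1)).astype(float)
w = rng.normal(size=(3, 1))

max_diff = np.max(np.abs(analytic_gradient(X, Y, w) - numerical_gradient(X, Y, w)))
print(max_diff)
```

If the two gradients agree up to floating-point noise, the sigmoid's derivative has not been dropped: it is accounted for inside y-hat, under the assumptions stated above.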
Hi Nusco
I think the highlighted passage is incorrect. The steeper the curve, the smaller the learning step should be, because on a steep curve even a slight change in w makes the loss change greatly.
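To make the question concrete, here is a minimal sketch of the update rule under discussion, assuming plain gradient descent with a fixed learning rate. The one-weight loss `w ** 2`, the value of `lr`, and the function names are hypothetical, chosen only for illustration. The update subtracts `lr * slope`, so the step grows with the slope; whether that overshoots the minimum depends on how small `lr` is.

```python
def loss(w):
    # Hypothetical one-weight loss, with its minimum at w = 0.
    return w ** 2

def slope(w):
    # Derivative of the loss with respect to w.
    return 2 * w

lr = 0.1  # assumed fixed learning rate

for w in (0.5, 5.0):
    step = lr * slope(w)   # size of the update at this point
    new_w = w - step       # gradient descent update
    print(f"w={w}: slope={slope(w)}, step={step}, loss {loss(w)} -> {loss(new_w)}")
```

Trying a much larger `lr` in this sketch shows the opposite case, where the step taken on the steep part of the curve is large enough to jump past the minimum.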