Here I am. Hello, @samuiweb_gm!
The trick with the bias can be confusing, so let me try to explain it here.
In Chapter 2, we use a line to approximate the data. Here is its equation:
ŷ = x * w + b
So we calculate the output ŷ based on the value of the input x. We do it with two variables, or “parameters”: w and b.
By contrast, in Chapter 4 we have multiple inputs: x1, x2, and so on. So we start by calculating the output based on those inputs, each given a weight… and a final bias, like we did before:
ŷ = x1 * w1 + x2 * w2 + x3 * w3 + b
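To make that concrete, here is a tiny sketch with made-up numbers (not the book’s code): the formula is just a weighted sum of the inputs, plus the bias.

```python
import numpy as np

x = np.array([2.0, 3.0, 5.0])   # inputs x1, x2, x3 (made-up values)
w = np.array([0.5, 1.0, -0.2])  # weights w1, w2, w3 (made-up values)
b = 4.0                         # bias

# y_hat = x1*w1 + x2*w2 + x3*w3 + b
y_hat = x @ w + b
print(y_hat)  # 7.0
```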
The trick in “Bye, bye, bias” is all about turning that b into just another weight (let’s call it w0), by associating it with an artificial input:
ŷ = x1 * w1 + x2 * w2 + x3 * w3 + x0 * w0
The last two formulae are the same as long as we do two things:
- We rename b to w0.
- We add an artificial input x0 that has a value of 1, so that when we multiply it by w0, nothing changes.
So, to answer your question directly: b is still a variable, and it’s become a weight like any other. What we added is another input, and that one has a constant value of 1. By doing that, we can remove all the code that deals with the special case of b, and treat all the weights and the bias the same.
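Here is the same idea as a small sketch (again with made-up numbers, not the book’s code): we prepend an artificial input x0 = 1 and fold b into the weights as w0, and the prediction comes out identical.

```python
import numpy as np

x = np.array([2.0, 3.0, 5.0])   # original inputs x1, x2, x3
w = np.array([0.5, 1.0, -0.2])  # weights w1, w2, w3
b = 4.0                         # bias

# With the bias as a special case:
y_with_bias = x @ w + b

# "Bye, bye, bias": prepend x0 = 1 and rename b to w0
x_aug = np.insert(x, 0, 1.0)    # [x0, x1, x2, x3] = [1, 2, 3, 5]
w_aug = np.insert(w, 0, b)      # [w0, w1, w2, w3], where w0 is the old b

# Now the bias is just another term in the weighted sum
y_no_bias = x_aug @ w_aug

print(y_with_bias, y_no_bias)   # 7.0 7.0
```

Notice that the second version has no special case anywhere: the bias is just the weight that happens to be multiplied by a constant input of 1.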
Does that make it clear?