Programming Machine Learning: Bias in linear regression

Hi there,
I bought “Programming Machine Learning” (great book) from PragProg.

A question (from a newbie) for the author, or anyone who can help:
At the end of Chapter 4, the section “Bye Bye, Bias” describes how to add the bias parameter to the multiple linear regression algorithm.
Why is the bias kept constant at every iteration, when in the “single” linear regression of Chapter 2 it is a variable?

Thank you!

2 Likes

You can probably mention them directly, I reckon: @PragmaticBookshelf.

@AstonJ should it be enough that such questions are put in the PragProg Customers category, or are mentions desirable at all?

2 Likes

This section is fine - no mentions needed :smiley:

PragProg will be keeping an eye on the section (and their authors on threads related to their books) so they will see the thread at some point, though keep in mind they are busy so it might not be right away.

I am also currently working on our first iteration of book portals which will make it clearer where to post and find threads relating to specific books. Hoping to get the first version of this up early next week (to begin with we’ll just use the standard portal template, then customise this after reviewing how it’s used).

2 Likes

Here I am. Hello, @samuiweb_gm!

The trick with the bias can be confusing, so let me try to explain it here.

In Chapter 2, we use a line to approximate the data. Here is its equation:

ŷ = x * w + b

So we calculate the output based on the value of the input x. We do it with two variables, or “parameters”: w and b.
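
To make that concrete, here is a minimal Python sketch of that prediction (my own toy code, not a listing from the book):

    # Prediction for single-variable linear regression: ŷ = x * w + b.
    # During training, gradient descent updates *both* parameters,
    # w and b, at every iteration.
    def predict(x, w, b):
        return x * w + b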

By contrast, in Chapter 4 we have multiple inputs: x1, x2, and so on. So we start by calculating the output based on those inputs, each given a weight… and a final bias, like we did before:

ŷ = x1 * w1 + x2 * w2 + x3 * w3 + b

The trick in “Bye Bye, Bias” is all about turning that b into just another weight (let’s call it w0) by associating it with an artificial input:

ŷ = x1 * w1 + x2 * w2 + x3 * w3 + x0 * w0

The last two formulae are the same as long as we do two things:

  • We rename b to w0.
  • We add an artificial input x0 that has a value of 1, so that when we multiply it by w0, nothing changes.

So, to answer your question directly: b is still a variable, and it’s become a weight like any other. What we added is another input, and that one has a constant value of 1. By doing that, we can remove all the code that deals with the special case of b, and treat all the weights and the bias the same.
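
In NumPy, the whole trick comes down to prepending a column of ones to the input matrix. Here is a small sketch (a toy example of mine, with made-up numbers, not the exact listing from the book):

    import numpy as np

    # Toy data: 4 examples, 3 inputs each (x1, x2, x3).
    X = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0],
                  [1.0, 0.0, 1.0]])

    # Prepend the artificial input x0 = 1 to every example.
    X_with_bias = np.insert(X, 0, 1, axis=1)

    # One weight per column; w[0] plays the role of the old b.
    w = np.array([0.3, 0.5, 0.1, 0.2])

    # A single matrix multiplication computes
    # ŷ = x0 * w0 + x1 * w1 + x2 * w2 + x3 * w3 for every example,
    # with no special-case code for the bias.
    y_hat = np.matmul(X_with_bias, w)

During training, w[0] gets updated by gradient descent just like every other weight, so the bias is still learned; the only constant is the artificial input, which stays fixed at 1.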

Does that make it clear?

4 Likes

Much clearer, thank you.
Anyway, everything else so far has been explained very well. I can only recommend this book to everyone!

PS
I wrote in English for the public, but I’m Italian: Grazie Paolo!

2 Likes

Prego! :smiley:

1 Like