Archives: April 2017

April 22, 2017

Explanation of the vectorized form of the for-loop gradient descent calculation in Exercise 1 of the Coursera ML class

by viggy. Categories: Uncategorized

Source: Storing this for future reference

If you are wondering how the seemingly complex for loop can be vectorized and crammed into a single one-line expression, read on. The vectorized form is:

theta = theta - (alpha/m) * (X' * (X * theta - y))

Given below is a detailed explanation of how we arrive at this vectorized expression from the gradient descent algorithm:

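This is the gradient descent algorithm used to fine-tune the value of θ. Repeat until convergence, updating every θj simultaneously:

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$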
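As a rough sketch, the loop form of one gradient descent step might look like the following in Python with NumPy (the course itself uses Octave, so this is a translation, not the course's code). It assumes the usual rule in which each θj is reduced by alpha/m times the sum over all examples of the prediction error times the jth feature. The function name `gradient_step_loop` and the tiny data set are made up for illustration, and y and theta are kept as 1-D arrays here for simplicity.

```python
import numpy as np

def gradient_step_loop(X, y, theta, alpha):
    """One gradient descent step written with explicit loops (illustrative)."""
    m, n = X.shape
    new_theta = np.zeros(n)
    for j in range(n):
        total = 0.0
        for i in range(m):
            # prediction for example i: dot product of row i with theta
            h_i = sum(X[i, k] * theta[k] for k in range(n))
            total += (h_i - y[i]) * X[i, j]
        # every theta_j is computed from the old theta (simultaneous update)
        new_theta[j] = theta[j] - (alpha / m) * total
    return new_theta

# Made-up data: the first column of X is the constant intercept feature.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
theta = gradient_step_loop(X, y, np.zeros(2), alpha=0.1)
print(theta)
```

This nested loop is the "seemingly complex" form that the rest of the post collapses into a single matrix expression.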

Assume that the following values of X, y and θ are given:

m = number of training examples
n = number of features + 1

Here

m = 5 (training examples)
n = 4 (features+1)
X = m x n matrix
y = m x 1 column vector
θ = n x 1 column vector
x(i) is the ith training example (the ith row of X)
xj is the jth feature in a given training example

Further,

h(x) = ([X] * [θ]) (m x 1 matrix of predicted values for our training set)
h(x) - y = ([X] * [θ] - [y]) (m x 1 matrix of errors in our predictions)
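To make the shapes concrete, here is a small NumPy sketch using m = 5 and n = 4 as in the text. The data is random and purely illustrative, and the all-ones first column is an assumption based on n being "features + 1"; only the shapes matter here.

```python
import numpy as np

m, n = 5, 4
rng = np.random.default_rng(0)            # made-up data, only the shapes matter

# m x n matrix; first column of ones is the assumed intercept feature
X = np.hstack([np.ones((m, 1)), rng.random((m, n - 1))])
y = rng.random((m, 1))                    # m x 1 column vector
theta = np.zeros((n, 1))                  # n x 1 column vector

h = X @ theta                             # m x 1 predictions
E = h - y                                 # m x 1 errors
print(X.shape, theta.shape, h.shape, E.shape)  # (5, 4) (4, 1) (5, 1) (5, 1)
```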

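The whole objective of machine learning is to minimize the errors in our predictions. From the definitions above, the errors matrix E is an m x 1 column vector:

$$E = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_m \end{bmatrix} = \begin{bmatrix} h(x^{(1)}) - y^{(1)} \\ h(x^{(2)}) - y^{(2)} \\ \vdots \\ h(x^{(m)}) - y^{(m)} \end{bmatrix}$$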

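To calculate the new value of θj, we need the sum of all errors (m rows), each multiplied by the jth feature value of the corresponding training example in X. That is, take every value in E, multiply it by the jth feature of the corresponding training example, and add the products together. This gives us the new (and hopefully better) value of θj. Repeat this process for every j, that is, for each of the n features. Writing e_i for the ith entry of E, this can be written in matrix form as:

$$\theta := \theta - \frac{\alpha}{m} \begin{bmatrix} \sum_{i=1}^{m} e_i \, x_1^{(i)} \\ \sum_{i=1}^{m} e_i \, x_2^{(i)} \\ \vdots \\ \sum_{i=1}^{m} e_i \, x_n^{(i)} \end{bmatrix}$$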
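The per-feature summation described here can be sketched as a double loop. The numbers below are made up, and `sums` is a hypothetical name for the column of summed products (one entry per feature j).

```python
import numpy as np

# Made-up example with m = 3 training examples and n = 3 features.
X = np.array([[1.0, 2.0, 0.5],
              [1.0, 0.0, 1.5],
              [1.0, 3.0, 2.0]])
E = np.array([[0.5], [-1.0], [2.0]])      # m x 1 errors

m, n = X.shape
sums = np.zeros((n, 1))
for j in range(n):                        # one sum per feature
    for i in range(m):                    # add up error * jth feature over all examples
        sums[j, 0] += E[i, 0] * X[i, j]

print(sums.ravel().tolist())              # [1.5, 7.0, 2.75]
```

Each entry of `sums` is what gets scaled by alpha/m and subtracted from the matching θj.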


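This can be simplified, with e_i the ith entry of E and the rows of X being the training examples, as:

$$\theta := \theta - \frac{\alpha}{m} \left( \begin{bmatrix} e_1 & e_2 & \cdots & e_m \end{bmatrix} \begin{bmatrix} x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)} \\ x_1^{(2)} & x_2^{(2)} & \cdots & x_n^{(2)} \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{(m)} & x_2^{(m)} & \cdots & x_n^{(m)} \end{bmatrix} \right)'$$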

[E]' x [X] gives us a row vector, since E' is a 1 x m matrix and X is an m x n matrix, so their product is 1 x n. But we need a column vector with the same n x 1 shape as θ, hence we transpose the result.
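A quick shape check of this step in NumPy, with made-up values for E and X (`row` and `col` are illustrative names):

```python
import numpy as np

m, n = 5, 4
E = np.arange(1.0, m + 1).reshape(m, 1)   # made-up m x 1 errors: 1, 2, ..., 5
X = np.ones((m, n))                       # made-up m x n matrix of ones

row = E.T @ X        # (1 x m) times (m x n) gives a 1 x n row vector
col = row.T          # transposed to n x 1, the same shape as theta
print(row.shape, col.shape)               # (1, 4) (4, 1)
```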

More succinctly, it can be written as:

theta = theta - (alpha/m) * (E' * X)'

Since (A * B)' = (B' * A'), and A'' = A, we can also write the above as:

theta = theta - (alpha/m) * (X' * E)
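The transpose identity used here is easy to check numerically. The sketch below uses random, made-up matrices of the shapes from the text and compares (E' * X)' with X' * E.

```python
import numpy as np

rng = np.random.default_rng(1)            # made-up data for the check
X = rng.random((5, 4))                    # m x n
E = rng.random((5, 1))                    # m x 1

lhs = (E.T @ X).T                         # (E' * X)'
rhs = X.T @ E                             # X' * E
print(np.allclose(lhs, rhs))              # True
```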

Substituting E = X * theta - y, this is the original expression we started out with:

theta = theta - (alpha/m) * (X' * (X * theta - y))
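For reference, a minimal NumPy translation of this one-line Octave update might look like the sketch below. The function name `gradient_step`, the learning rate, the iteration count, and the data (points lying exactly on y = 1 + 2x) are all assumptions chosen for illustration.

```python
import numpy as np

def gradient_step(X, y, theta, alpha):
    """Vectorized update: theta - (alpha/m) * X' * (X * theta - y)."""
    m = X.shape[0]
    return theta - (alpha / m) * (X.T @ (X @ theta - y))

# Made-up data on the line y = 1 + 2x; first column of X is the intercept term.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[1.0], [3.0], [5.0], [7.0]])

theta = np.zeros((2, 1))
for _ in range(5000):                     # iteration count chosen arbitrarily
    theta = gradient_step(X, y, theta, alpha=0.1)

print(theta.ravel())                      # should be close to [1, 2]
```

With this data and learning rate the iterates settle near an intercept of 1 and a slope of 2. Other data sets may need a different alpha or more iterations.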

April 15, 2017

The book of 0 and 1

by viggy. Categories: Uncategorized

Imagine I had a nice, fat, fancy notebook with great binding, completely empty. I ask a young child who knows how to write 0 and 1 beautifully to fill that notebook completely, line by line, page by page, with 0s and 1s. I give the child full liberty to write them in any sequence. At the end of the week, after 7 long days of writing, the child returns and gives back the book, very happy with the accomplishment. As a token of appreciation, I give the child a golden pen, which writes with golden ink, and ask the child to write a small sequence of 0s and 1s on the binding of the book.
Finally my book of 0 and 1 is ready. If at this moment I came to you and asked you to buy the book from me for a small amount of money, then unless you were feeling really generous or took pity on me, you would of course tell me that the book is worth nothing.

Now at this stage, I ask another random person to choose any page of the book and any sequence of 0s and 1s, and to feed it into any machine he or she possesses. What might happen? The machine reads the sequence and returns either gibberish or something we can understand. If the machine returns gibberish, I ask the person to change the machine and try again until it returns something more meaningful.

I do the same with many different people. Each chooses their own random sequence from the book and their own machine to interpret it, and then tries to get something meaningful out of it that they can understand.

Now I have a book of some value, as people are able to make some meaning out of it. Wonderful. I start a game, asking people to find the longest sequence in the book that makes sense. The game goes viral, and there are now many machines that are able to make meaningful statements out of the sequences in the book. Very soon, we design machines specifically so that they return meaningful statements just for the sequences of the book.
This book is the book of religion.