Showing posts with label Tech. Show all posts

Tuesday, March 30, 2010

Online resources to study machine learning

I have been collecting online resources for machine learning, and I am sharing them here since they should help other grad students. If you have interesting ML links, please post them in the comments.

Definition
  • Many people are not clear about the boundaries (as flimsy as they may be) between traditional AI, Neuroscience and Machine Learning. This introduction by Tom Mitchell ought to help.
Where to study from?
  • Lectures by Andrew Ng (Stanford) are said to be a good start.
  • Hundreds more ML lectures are available at videolectures.net. These videos are good for learning more about the topics that interest you.
  • Long list of textbooks to read (links 1 and 2). This text is free online.
How to keep up with the field?
  • ML is a fast changing field and you need to update yourself with the latest papers. The journals/conferences I found to be relevant are:
  1. JMLR
  2. ICML
  3. NIPS
  4. COLT
  5. Pattern Recognition
  6. TPAMI
  7. Machine Learning
  • I found these blogs to be useful too
  1. http://hunch.net/
  2. http://www.reddit.com/r/MachineLearning/
  3. http://metaoptimize.com/qa/
  4. http://mark.reid.name/iem
  5. CIML
Programming tools
  • Not everyone can afford MATLAB. Fortunately Python (with its NumPy extension) can be used as a substitute.
  • Before implementing any standard algorithm, look for an online ML library that already provides it. (tip: some libraries work only on *nix environments)
Machine learning has a lot in common with statistics; in fact, it can be said to be derived from statistics. This link has a funny take on why ML is more popular.

Addendum (06/01/2010)
Here are some links to online lecture videos on related topics. There might be other course videos out there. Do your own search if you have time.
  • Calculus: Several course videos here. My pick
  • Linear Algebra: The famous MIT course by Gilbert Strang
  • Probability theory: From UCLA
  • Convex optimization: Stanford lectures by Stephen Boyd. Part 2 of the course is available on YouTube

Wednesday, February 25, 2009

Netbooks update

I bought one: a Dell Inspiron Mini 9 for $180 ($169 + tax) from the outlet website. It's the minimum configuration with 512 MB RAM and a 4 GB SSD, running Ubuntu Linux. For the price it's simply great. I will complement the system with a 4 GB or 8 GB SDHC card. I am eagerly waiting for the delivery. Today they will release the Mini 10 model, so expect more deals on the Mini 9 and Mini 12. I don't think there will be a cheaper deal in the near future.
 
Another good deal that I read about was the sale on the ASUS Eee PC 900A in Best Buy stores. The reported prices varied from $160 to $200 in different areas. It is also a good deal, but it should be noted that it has very low battery backup. It's reported to last less than 1 hr 30 mins, whereas the Mini 9 lasts about 3 hrs 30 mins.
 
Once I get the system I will also explore online file storage and remote desktop access to my laptop and work server. I might add more RAM later. For Windows netbooks there are several options like ZumoDrive, MS Live Mesh and Gladinet for extra storage and remote access. But the Linux models are always the cheaper ones.

Sunday, February 22, 2009

Netbooks - now is not the time.

I have been scouring the net to find a good deal on a netbook. For the uninitiated, a netbook is a small laptop with a screen that measures around 9 inches diagonally and that weighs about 1 kg. Its processing power is limited, and it cannot be used to run heavy programs like games or MATLAB. Most models can't run Windows Vista, but who cares about that? The most popular models are the Asus Eee PC, MSI Wind, Acer Aspire One and Dell Inspiron Mini. Lenovo, HP, Toshiba and Sony also have models, though they are not as popular as the others due to their high cost.
 
Apart from the small size and light weight, the good ones also have up to 7 hrs of battery life. A netbook is usually priced between $250 and $400. I think it makes sense to buy one as a second computer for use while travelling, and also for lectures, when you might want to look up PDFs or chat on Gtalk. My current laptop weighs about 3.5 kg and it's a pain carrying it around.
 
That said, when I looked deeper into the details, this did not seem to be the best time for such a purchase. The machines have a processor called the Intel Atom. It is power efficient but not powerful. Benchmarks show that the performance of a 1.6 GHz Atom is almost the same as that of an older Intel processor, the 900 MHz Celeron M. The Intel platform seems to be unnecessarily costly; at least that's what the tech forums say.
 
Other processors for netbooks, like an ARM-based Freescale chip that would be much more power efficient, are slated for release later this year. Another reason to wait is the impending release of NVIDIA's ION platform, which will make the systems much more graphics capable. Also, the pricing seems to be wrong. Dell sells its bare-bones model for $249, but just 3 weeks back it sold a few refurbished pieces on the outlet website for $174. I believe that when newer models are released, many of the older ones will be available for less than $200. So wait, if not for the newer chips, at least for lower prices.

Tuesday, February 3, 2009

Singularity

One of my first blog entries was on the singularity. If the enthusiasm seen on the net is any indication, this concept has gained wide attention, and now there is a singularity institute (1) and even a singularity university (2). Check out the many blogs (3) for more fascinating news on this.

Singularity refers to a technological development that will forever change the fate of the human race. A line from Arthur C. Clarke, the author of 2001: A Space Odyssey, comes to mind: "Any sufficiently advanced technology is indistinguishable from magic" (his third law). Its appeal to science-fiction-literate geeks is obvious. I belong to this category and have been following the evolution of the idea for some time now.

Among the paths proposed towards singularity are the creation of an artificial intelligence, the creation of a superior human brain using genetics and symbiotic electronics (like in a cyborg), or a combination of both with feedback resulting in resonance. I have no idea about the second path, but the first one is familiar. My research is related and I have read some of the scientific literature in the area, since it is my personal goal too. It is with that knowledge that I make this daring statement: "we don't yet have the technology to do this".

That might not sound surprising to many, but I will still go ahead and clarify. We have methods that show promise in very limited applications, but not much has been done to tie these methods together. Reinforcement learning, genetic programming and Gödel machines are keywords that will return related results on Google (also). But I couldn't find any evidence of work done to create intelligence. Maybe the old AI idea of coding everything can help. There is so much work done in the field on the separate methods that I feel it's time we started doing extensive research on creating a true meta-learner. This is where all the singularity enthusiasts can help: they can provide a framework that might in the future be the base from which to develop true AI.

So the university and institute have a lot of work to do, and I wish them all the success. I must say that the community also seems to have a lot of nut cases. Some of them see this as a shot at immortality: they will upload their brains to a machine and, when possible, download them to a cloned brain. Ha! And just imagine the impact immortality will have on our society.

Thursday, January 22, 2009

TED talks

Technology - Entertainment - Design

Experts from these three walks of life share their ideas every year during the TED talks. Now we can see them on YouTube.

http://www.ted.com/

http://www.youtube.com/user/tedtalksdirector

These are truly remarkable people, most of whom have not only pioneered new techniques but are also so eloquent that they can convey the crux of their ideas in about 20 minutes. Each person gives a talk on new ideas from their field, like physics or philosophy, and in their own way makes sure that everyone gains something from the lecture.

And the themes are very diverse. Today I watched videos on string theory, the power of memes, insight into the workings of the brain and the problem with too much choice. They were all engaging and informative. I feel wiser and entertained at the same time. Perfect.

Thursday, May 15, 2008

Tale of interviews: The last rejection

Nvidia too has rejected me now. Before the interview I had searched the web for their usual questions and found them to be easy. But the interviewer seemed to have his own method. He asked questions on one of my projects and tried to probe deeper into the area of computer architecture to gauge my knowledge. On seeing that everything was hollow in there he threw me an easy question:
Given four registers A, B, C and D, write to B so that the bits of C at positions where A has a 1, and the bits of D at positions where A has a 0, are copied into B in linear time.
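
A minimal sketch of one way to read this question, written in Python rather than the C a driver team would probably expect. It treats A as a selection mask: a bitwise AND keeps the bits of C where A is 1, a second AND with the inverted mask keeps the bits of D where A is 0, and an OR merges the two. The function name `select_bits` and the 32-bit default register width are assumptions made for illustration, not part of the original question.

```python
def select_bits(a: int, c: int, d: int, width: int = 32) -> int:
    """Return the value for B: bits from C where A is 1, bits from D where A is 0.

    `width` is an assumed register size; it only limits the result to that
    many bits, because Python integers are unbounded.
    """
    mask = (1 << width) - 1
    # (a & c) keeps C's bits where A is 1; (~a & d) keeps D's bits where A is 0.
    return ((a & c) | (~a & d)) & mask


if __name__ == "__main__":
    b = select_bits(0b1100, 0b1010, 0b0101, width=4)
    print(format(b, "04b"))
```

Every bit position is handled once by a fixed number of bitwise operations, so the work grows linearly with the register width at worst, and on real hardware it is a handful of instructions per word.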

He also asked me something about mutexes and semaphores... blah blah. But I still wonder why they took 10 days to make a decision about me. And yes, the interview was for a software intern position in device driver development for their mobile team.

Thursday, April 3, 2008

Tale of interviews: Interview Experiences Part Deux

I have had two more interviews since the first part of this post. The first one was with NVIDIA. The interview call lasted all of 3 minutes and 34 seconds. The interviewer, a senior executive in the company, called me 11 minutes late. After some preliminaries he asked me what my areas of interest were, and whether I preferred the mobile or the graphics team. I had to pick one and I picked mobile.

Then he said "AHA! But I am from the graphics team. Good bye. Take care." Well, it was something like that. But he did mention that he would forward my resume to the other team, and now I have got another interview call from the company. I hope it won't be another anticlimax, with the interviewer referring me to yet another team.

And now about the second interview. It seems my performance in the Google interview wasn't that bad after all. I had my second interview round yesterday. Please note that there are usually only two rounds of interviews at Google, and if I got through this one I would hit the jackpot. But of course I didn't do well. Obviously that's the reason why I am posting this. Remember the Shakespearean verse, "All ye, pity not thy fate but blog thy misery till thee plague all ye mates". OK, I will cut the nonsense and go right into the questions:

1) Write code to generate spam messages. You are given a passage. Jumble the words from the passage and make a message of arbitrary size. A simple example can make it clear. For example the passage can be "Jumble the words from the passage and make a message of arbitrary size. A simple example can make it clear. For example the passage can be". The generated message should contain words from the passage with each word being followed only by one of the words that follow it in the original passage, like in "passage can make a simple example the words from the passage".
2) Give test cases for this code.
3) What is the complexity of the code?
4) Do you know OOP? Good. What is an interface?
5) When you type an address into your browser, what are the steps that result in the web page getting displayed?
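
A rough sketch of one way to approach question 1 (not the answer given in the interview): record, for every word, the list of words that follow it somewhere in the passage, then walk that table at random. The helper names `tokenize`, `build_followers` and `generate_message` are made up for this example. Lower-casing the words and stripping punctuation is an assumption, chosen because the sample output in the question ignores case and full stops.

```python
import random
from collections import defaultdict

PUNCTUATION = ".,;:!?\"'"


def tokenize(passage):
    """Split into lower-case words with surrounding punctuation removed (an assumption)."""
    stripped = (word.strip(PUNCTUATION).lower() for word in passage.split())
    return [word for word in stripped if word]


def build_followers(passage):
    """Map each word to the list of words that directly follow it in the passage."""
    words = tokenize(passage)
    followers = defaultdict(list)
    for current, following in zip(words, words[1:]):
        followers[current].append(following)
    return followers


def generate_message(passage, length, seed=None):
    """Generate up to `length` words; each word follows its predecessor somewhere in the passage."""
    rng = random.Random(seed)
    words = tokenize(passage)
    if not words or length <= 0:
        return ""
    followers = build_followers(passage)
    word = rng.choice(words)
    message = [word]
    while len(message) < length:
        options = followers.get(word)
        if not options:
            # The walk reached a word that nothing follows; stop early
            # rather than break the "followed only by" rule.
            break
        word = rng.choice(options)
        message.append(word)
    return " ".join(message)


if __name__ == "__main__":
    text = "Jumble the words from the passage and make a message of arbitrary size."
    print(generate_message(text, 10))
```

For question 3, under these assumptions building the table takes time proportional to the number of words in the passage, and generating a message takes time proportional to the requested length. For question 2, the obvious cases to test are an empty passage, a one-word passage, a requested length of zero, and a walk that hits the last word of the passage.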
 
I am not a cruel person and so won't make you read through my answers. I mean you just might laugh your guts out.

[Update 04/07/2008 : I got a reject from Google. On the bright side now I know that the universe is back to normal]

Thursday, March 27, 2008

How to find class rank from mean and standard deviation?

For some of my courses the TAs don't give the class rank, only the mean and standard deviation. But I can calculate my percentile from this information in the following manner:
1. Calculate the standard score (z-score) - http://en.wikipedia.org/wiki/Z_score. This is a measure of how many standard deviations away from the mean you are.
2. From the table in http://www.medfriendly.com/standardscoretopercentileconversion.html you can find the percentile that corresponds to your standard score.

We thus calculate the percentile rank of a score which is the percentage of scores in its frequency distribution which are lower. A normal distribution of scores is assumed and hence this is an approximation. There are many people who have wanted this information and this post is as much for their benefit as for my future reference.
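
The two steps above can also be done without the lookup table. Here is a minimal sketch, assuming normally distributed scores as stated: the z-score is (score - mean) / standard deviation, and the percentile comes from the standard normal CDF, which Python's `math.erf` can express. The function names and the `class_size` argument used to turn a percentile into an approximate rank are assumptions added for illustration.

```python
import math


def z_score(score, mean, std_dev):
    """Number of standard deviations the score lies above (or below) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (score - mean) / std_dev


def percentile_from_z(z):
    """Percentage of scores expected to fall below z under a normal distribution."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


def estimated_rank(percentile, class_size):
    """Approximate rank (1 = top), assuming `class_size` students in total."""
    students_above = round(class_size * (1.0 - percentile / 100.0))
    return max(1, min(class_size, students_above + 1))


if __name__ == "__main__":
    z = z_score(score=85, mean=70, std_dev=10)
    p = percentile_from_z(z)
    print(f"z = {z:.2f}, percentile = {p:.1f}, rank = {estimated_rank(p, 60)} of 60")
```

Like the table, this is only as good as the normality assumption, so for a small or skewed class the rank is a rough estimate.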

Friday, March 21, 2008

Interview experiences

Once upon a time there lived two giants Miso and Goog. They amassed great empires and were greedy to expand their realms. They were in need of minions to do their dirty biddings. Who they gonna call?

Yes, I was fortunate enough to get interview calls from Microsoft and Google. This post is about how I wasted both the opportunities. Both were phone interviews. First was the MS interview. I wasn't very tense. I was absolutely unprepared and had in fact taken a mid-day nap before it. And when the interviewer threw me an easy question to see if I was worth his while, I was dumbstruck. I knew it was a very easy question, but I just couldn't answer it. Two weeks after the interview I got their reject mail. It was as expected.

After two more weeks I got an email from Google (everyone's dream company) about an interview. Today, after another two weeks, the interview happened. It wasn't as bad as the MS one, but again I didn't perform well. What irks me is that this time I had over two weeks' notice and I still botched it up. So a friendly piece of advice to all readers: start your preparation before applying for a job. What? You knew it already? Fine. OK OK. I am an idiot.

Here's an excerpt from the Google e-mail:
"Your technical phone interview may include questions from one or more of the following areas: coding, algorithms and design and problem solving. For more information on Algorithms you can visit:
http://www.topcoder.com/tc?module=Static&d1=tutorials&d2=alg_index

Don't forget to have an active dialogue with the interviewer and talk through how you are solving the problems throughout the interview. I recommend you spend some time exploring our website to get into the right mind frame. Google Labs is a good starting point and can be found at http://labs.google.com. I also recommend you take a look at our resume and interview tips pages as well at
http://www.google.com/support/jobs/bin/static.py?page=students.html&sid=tip
"

The questions asked were:

MS -
  1. Implement Fibonacci number generation in C++, both recursively and non-recursively. Give their complexities.
  2. Given two strings, how would you find the characters they have in common? Give the complexity.
  3. Give test cases for unit testing the algorithm above.
  4. A lot of questions were asked about the projects in my resume.
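
A hedged sketch of the first two MS questions. The interviewer asked for C++; Python is used here only to keep the idea short, and these are not the answers given in the interview. The function names and the convention F(0) = 0, F(1) = 1 are assumptions.

```python
def fib_recursive(n):
    """Naive recursion: the number of calls grows exponentially with n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n):
    """Loop keeping only the last two values: n additions, constant extra space."""
    if n < 0:
        raise ValueError("n must be non-negative")
    previous, current = 0, 1
    for _ in range(n):
        previous, current = current, previous + current
    return previous


def common_characters(first, second):
    """Distinct characters present in both strings.

    Building each set is linear in the string length, so the expected cost
    is O(len(first) + len(second)).
    """
    return set(first) & set(second)


if __name__ == "__main__":
    print([fib_iterative(i) for i in range(10)])
    print(sorted(common_characters("interview", "google")))
```

For question 3, reasonable unit tests would cover empty strings, strings with no characters in common, repeated characters, and differences in case.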

Google -
  1. Some basic questions about a project in my resume.
  2. Given a set of coins of various denominations, how would you find the minimum number of coins required to make up $x?
  3. Given two strings, determine if they are anagrams using an O(n) algorithm.
  4. Given a matrix of numbers with all rows sorted left to right and all columns sorted top to bottom, how would you find the position of a number x in the matrix? What is the complexity?
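
Sketches of commonly used approaches to Google questions 2 to 4 (again, not what was said in the interview). The coin question uses dynamic programming over amounts and assumes positive integer denominations with the target expressed in the smallest unit, such as cents. The anagram check counts characters. The matrix search starts at the top-right corner and discards one row or one column per step. All function names are made up for this example.

```python
from collections import Counter


def min_coins(denominations, amount):
    """Fewest coins summing to `amount`, or None if impossible.

    Runs in O(amount * len(denominations)) time.
    """
    unreachable = float("inf")
    best = [0] + [unreachable] * amount
    for total in range(1, amount + 1):
        for coin in denominations:
            if coin <= total and best[total - coin] + 1 < best[total]:
                best[total] = best[total - coin] + 1
    return None if best[amount] == unreachable else best[amount]


def are_anagrams(first, second):
    """True if both strings use the same characters the same number of times; O(n)."""
    return len(first) == len(second) and Counter(first) == Counter(second)


def find_in_sorted_matrix(matrix, x):
    """Return (row, col) of x, or None if absent; O(rows + cols) steps."""
    if not matrix or not matrix[0]:
        return None
    row, col = 0, len(matrix[0]) - 1
    while row < len(matrix) and col >= 0:
        value = matrix[row][col]
        if value == x:
            return (row, col)
        if value > x:
            col -= 1  # everything below in this column is even larger
        else:
            row += 1  # everything to the left in this row is even smaller
    return None


if __name__ == "__main__":
    print(min_coins([1, 5, 10, 25], 30))
    print(are_anagrams("listen", "silent"))
    print(find_in_sorted_matrix([[1, 4, 7], [2, 5, 8], [3, 6, 9]], 6))
```

A greedy choice of the largest coin first is simpler, but it can fail for some denomination sets (for example 1, 3 and 4 when making 6), which is why the sketch uses the table instead.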

Links that you might find useful if you are also preparing for interviews (will be expanded as I find more):
http://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html