January 25, 2014

Deep learning libraries in python


Theano is a generic mathematical expression evaluation library. The main advantage is, it compiles to either CPU or GPU. By default it uses cpu. 


Pylearn2 is based on theano. So you can get advantage of running on GPU.

Download from https://github.com/lisa-lab/pylearn2

  • Logistic Regression 
  • K-means 
  • Multilayer Perceptron (MLP)
  • Restricted Boltzmann Machines (RBM)
  • Deep Boltzmann Machines (DBM)
  • Ising 
  • Auto Encoder: Autoencoders, denoising autoencoders, and stacked DAEs.
  • Maxout Networks (Maxout)
It is not limited to above models and any one can add new model. Above models are pre implemented.


It is also GPU accelerated deep learning library. Its based on PyCuda.

  • Logistic Regression
  • Neural network regression
  • Muti-Task neural net
Download from https://github.com/hannes-brt/hebel


GPU accelerated. Based on cudamat. 

  • Feed-forward Neural Nets
  • Restricted Boltzmann Machines
  • Deep Belief Nets
  • Autoencoders
  • Deep Boltzmann Machines
  • Convolutional Neural Nets


Python tutorial on Restricted Bolzmann Machines - https://github.com/echen/restricted-boltzmann-machines
Modular Restricted Bolzmann machine implementation. Its based on theano. https://github.com/benanne/morb
Matrix Library for cuda - https://github.com/deeplearningais/CUV

January 17, 2014

Linear Separability

In geometry, a two dimensional plane is linearly separable if we can divide the plane using a line. In case of n-dimensions it is a hyperplane. In Machine learning, we use this property to classify the data. Lets first understand what do we mean by Learning here ?. 

Learning vs Memorising

Oxford dictionary defines learning as "the acquisition of knowledge or skills through study, experience, or being taught". Memorising is the ability to remember/call what ever we have seen before. The key difference is, if we learn correctly we can able to answer the things which we haven't seen before. While learning, we generalise the things we have seen before and apply the same to new things we haven't seen. Achieving 100% memorisation in computers is very very easy. Its matter of how fast we can go through the things we have seen and get the answer. The challenge lies in how we make computer learn. To achieve that we will not define any explicit rules for any particular problem rather we will create a model where we will give the computer ability to generalise from examples we have given. The key for learning is ability to generalise

Linear separability

Lets take following example. For now lets ignore how we will convert out problem to cartesian coordinates. 
The problem is if a new point comes, consider the green cross in below picture. In what category we will put it into ?. Is it belongs to blue cluster or red cluster ?.
Most of us will agree that it belongs to red cluster. Because it is close to it. What we do intuitively is we draw a link between these two clusters and we will decide based on what side it is falling into.

For two dimension it is a line and for multi-dimensions it is a hyperplane.

If the data can be separated using a hyperplane (straight line in case of two dimension) then we call the data linearly separable. What we defined now is a simple algorithm to classify the data into two classes. We defined a simple classifier now that can divide the data into two classes. How can we do that if we have more than one class ?.

Multi-class classification

Once we have an algorithm to classify the data into two classes, we can generalise it to multiple classes. There are two approaches for doing that. 
  • One vs Rest (one vs all): Here we will have n classifiers. Meaning we will have n hyperplanes. First we will check whether it belongs to class1. We will check this by drawing the hyperplane between class1 and rest of the classes (we will mix all other classes into one). We will repeat this for all the classes. 
  • One vs One: This is more complicated but is useful for certain class of problems. Here we will have (n * (n - 1)/2) classifiers. For 4 classes we will have 6 classifiers. We will compare class1 vs class2, class1 vs class3, class1 vs class4, class2 vs class3, class2 vs class4, class4 vs class4. Whichever class comes more times we will classify it as that class. 

Revisiting Linear separability

Not all problems are linearly separable. Take following example. 

We can't draw any line that separates these two classes. Only way is to draw an ellipse. These kind of problems are solved into two ways. As i said before, draw an ellipse instead of line. Meaning, we are using non-linear function to classify the data. 

The other way (ex. kernel trick in svm) is to project the data to higher dimension and check whether it is linearly separable. If we project above data into 3rd dimension we will see it as, 

Clearly, above data is linearly separable using a hyperplane. In the next blog, we will see how we can use this for sentiment analysis. 


  • http://en.wikipedia.org/wiki/Linear_separability
  • SVM Example: http://stackoverflow.com/questions/9480605/what-is-the-relation-between-the-number-of-support-vectors-and-training-data-and
  • Multi-class classification: http://scikit-learn.org/stable/modules/multiclass.html
  • http://www.statsoft.com/textbook/naive-bayes-classifier/

September 6, 2009

Hello World Arduino !

I got my arduino board today. If you don't know what arduino is, "Arduino is an open-source electronics prototyping platform based on flexible, easy-to-use hardware and software. It's intended for artists, designers, hobbyists, and anyone interested in creating interactive objects or environments". (from http://www.arduino.cc/)

I have written sort of hello world of Arduino !. Blinking LED. Follow the excellent tutorials from ladyada.net.

Similar Boards.
At first I was confused between freeduino, sanguino, seeduino etc. These are all clones of arduino (There are many more. Checkout wiki).Although arduino hardware and software references are available opensource, the name "Arduino" should not be used in the derived works. So these boards often uses a name that ends with "duino". Often you find freeduino and sanguino. There is no difference between freeduino and arduino. Sanguino is a more powerful varient of arduino. You can find the difference between arduino, sanguino and arduino mega here.

Where to buy ?.

The board I got is from makershed. There are lot of starter kits available. Before buying any of these check out comparison of these boards here.

If you are in India then you don't have much options. Makershed is shifting the boards to India but the shipping cost is 25$. You can find freeduino and sanguino board from Bhasha e-store. It is Pune based company. The price of these boards is very less Rs 1000 (20$) and you can get useful components (including 16x2 lcd, breadboard and few more components) for about Rs500 (10$). This is the cheapest of all.

Useful links.


August 8, 2009

Disassembling Inspiron Laptop

On a weekend (2 months back I guess) I didn't know what to do :) and I decided to disassemble my laptop. Here is the guide. Mine is Dell Inspiron 1420.

First remove the battery pack from the system.

Before disassembling, identify the important parts of the system.
1. Battery
2. CPU Fan

3. RAM
4. Hard disk
The nice thing here is that if we want to upgrade one specific part, we can do it without touching other parts. For example, if we want to upgrade the hard disk we can just remove screws shown in part 4.
We have to start disassembling from front side. First time I started from backside and I couldn't continue further. As shown in figure, pull the highlighted part and you can able to remove the entire frame (including the frame that has power button).

The next step is to remove the keyboard.
Be careful while removing the keyboard. Back side there will be a connection to mother board.
After removing the screws, pull it up and remove the connection to the motherboard.

Remove this connector and keyboard will be disconnected from the rest of the system.

Next step is to remove the monitor. First remove the supported cups (1 and 2 in figure). Just pull them off by holding the thick edge.

Remove the connection to monitor (4 in above figure)

Now the hard part comes. Remove the wifi cables. Be patient while remvoing. Here, reassembling takes more time than disassembling.
We have removed all the connections to the monitor.

Now remove the rest of screws and you are set.

August 1, 2009

Appengine Task Queue API & Gotchas

I recently added a new feature for Appsd. It shows the recent iphone related news from twitter. For this we search twitter every 10 minutes and gather all links. It shows the highest tweeted links. I used task queue api and cron for crawling. Here are things I learned from task queue api.
  1. As of now Task queue api is in experimental state. So you have to import it from google.appengine.api.labs.taskqueue. When it got stabilized you have to import it from google.appengine.api.taskqueue. So proper way of importing is, first try to import from google.appengine.api.taskqueue. If it fails then import it from labs.

  2. If the response of the request is not HTTP 200 OK then appengine tries to reexecute the task after some time. This is a big problem. It has two side effects. Task might have failed because of some temporal problem or because of some bug or exception. If it is the second case, reexecuting the task after some time won't solve the problem. Second problem is, it basically exhausts your 10000 quota limit.

  3. Now there is no way to increase the quota of tasks from 10000 requests. So if each task is not taking more time then pack more tasks into single task. For instance I process multiple urls in single task.

  4. If you change schema, don't forget to support the old schema. (Refer 2nd point)

  5. when you change something from post to get then do support get request and return nothing otherwise task will fail. (Again refer 2nd point)
Update: Task queue uses exponential backoff scheme to prevent single error exhausting your quota. see this post.

June 11, 2009

Startup City

Last Saturday i attended Startup city at Nimhans convention centre. It was second time i attended. These were few companies that i found interesting.

MedFlo Telematics: They Develop telemedicine kiosks. This project is already started and they deployed few kiosks in rural areas of Kerala.
Vidteq: They show driving directions. Instead of showing just maps they show actual video directions. Among the companies that participated in conference, this is the only service i used before. Though this is a very good thought, i find streaming a bit sluggish. They should provide even more options like controlling the stream speed etc to make it more usable. This is simplified version of Google street maps.
Iken Solutions: They developed recommendation engine for different domains.
Latlong: Its a location based service. You can get the results by SMSing or by installing their custom software. Since GPS devices are not prevalent in India they maped tower information to location.
Ennovasys: Provides supply chain management solutions. Companies can directly track their inventory directly using their software and it is integrated with Google maps.
Panini: They developed Indian language keypad for mobiles. They did statical language modeling to predict the word you are typing. Accordingly they change the characters in the keypad. So you can type faster. You can send the text throughSMS and they even compress the text while sending. So you can send many characters in one SMS. The main disadvantage with this software is, the same software should present on other mobile to read.

May 29, 2009

Market oriented programming

Apart from CS , the other subject I like the most is Economics . Why economics ? The interesting aspect of economics is, we convert each and every problem into problem of 2 variables namely supply and demand . Thus the complexity of analyzing more variables is reduced to that of 2. I liked the simplicity of this kind of analysis.

Basic principles of economics can be applied to any subject that deals with analyzing the interaction between limited set of resources and players. Economics' analysis explains why certain behavior is happening and how it ought to have happened etc. If the behavior is not what it ought to have been then we change the rules to bring it back to intended one.

Examples of where economics have been applied ( from Wiki ), Economic analysis is applied throughout society, in business, finance and government, but also in crime, education, the family, health, law, politics, religion, social institutions, war, and science.

When I was doing my masters I had given talk on Market Oriented Programming. I am sharing the slides. In this talk I had taken few examples where economic analysis had been applied to computer science problems. As I said earlier, economic analysis is best suited for analyzing Resource allocation problems. We face this problem quite often in computer science. For example,

1. How to allocate the storage space to users. (You can say fixed, but is it really good ?. There are many who are not using any allocated spaces and others who always wants more space. How to draw the line here ?)
2. Load balancing
3. Adsense. Perhaps this is the best example of how the theory of economics can be applied to a CS problem.
4. Facebook is planning a feature where you will buy some credits and give that credits to friends notifications and posts. More on it http://news.cnet.com/8301-17852_3-10212452-71.html


I haven't explained about it in my slides. But I wanted to give overview of it. The problem for Google is ( Here we are taking Google as an example and its true with any other search engine), what kind of Ads it has to show on a page ?. What ever Google do should satisfy three entities, 1. User, 2. Advertiser and 3. Google themselves. So, it has to achieve the following objectives.
  1. It should show relevant Ads to user. If it is not showing relevant ads then users find it less interesting and nobody look at them and it leads to Ad Blindness. Eventually users won't follow ads.
  2. It should satisfy the advertiser. Advertiser should get as many users as possible to their site through the Ads. This is possible only when it's Ad is shown at right place and at right time. For this, each advertiser bids for certain ad terms (Its like describing what Ad they are going to show ) and pays amount to Google based on users visiting the advertisers page. (i.e how many users actually clicked the Ad) Potentially there will be many advertisers, to maximize their chance of showing the ad they have to bid more.
  3. Google should maximize its revenue from above transactions.

How Google achieved these objectives ?
  • 1 and 2 are of related and assumed Google did its best to show relevant Ads. :)
  • 3. Google should maximize its revenue from advertisers. This is done by introducing competition among advertisers. Google uses a variant of vickrey auction (In vickrey auction, highest bidder wins, but the price paid by him is the second-highest bid). Since there will be few Ad places Google should find a way to choose few from these advertisers. ( For example, if you search in google, in your right hand side at the most google shows 8 Ads. So here we have only 8 Ad places or slots )

Whatever mechanism Google chooses,it should not have any side effects. For example, lets assume that Google chooses the advertisers based on the bid amount. What is the problem with this approach ?. Lets see whether it achieves our objectives.
  1. Users should get relevant Ads: This objective may or may not be achieved and it depends on honesty of advertisers. For example, when a user is searching for BMW cars, he should not get advertisements about Ferrari, Toyota cars. This is possible if Ferrari and Toyota bids for BMW keyword. So advertisers may lie and Google should consider it. Second reason why advertisers can lie is, remember they pay only on number of clicks, so advertisers may bid for many irrelevant terms for high price. They do this to increase the brand awareness. Since they are bidding high, their Ads will always show up and users will not click these Ads because these are irrelevant Ads and eventually advertisers will not pay money to Google. The Advantage advertisers get is they show their Ads on lot of pages by paying little money. If Google is not showing relevant Ads, as discussed, it will increase the Ad revenue.
  2. Advertisers should get users to their page: Again this may or may not be achieved for the same above reasons.
  3. Google should maximize its revenue: At the first glance it appears that google got maximum revenue but its not true. Since it may not be showing relevant Ads, it increases the Ad Blindness and looses its revenue. It may be good for short term and definitely not good for long term. Yahoo used the above approach for long time and eBay exploited it. (I got this info from Prabhakar Raghavan's Talk)
Actually Google uses combination of Quality score and Bid amount. Quality score is again combination of CTR (Click through ratio and it is ratio of number of clicks of Ad to number of times they showed the ad), Relevance and Landing page quality. See below YouTube video to know more about Google Ad auction. By using Quality score google eliminates the above problems and still making lot of money :) .

In economics Mechanism design is the study of designing rules of a game or system to achieve a specific outcome, even though each agent may be self-interested. These systems are formally analyzed using Game theory.


Introduction to the Google Ad Auction ( http://www.youtube.com/watch?v=K7l0a2PVhPQ )
Example of how bittorrent client uses mechanism design to improve file sharing. http://torrentfreak.com/bittorrent-to-be-pimped-by-nobel-prize-071019/