Last week, in a project to categorize a list of news websites, we wrote a Python script based on one from the great book Programming Collective Intelligence.
Although the explanation given in the book was of great help and we ended up with our own working script, the sample code gave us a hard time at the start. Due to at least one typo in the code, we just could not get the code to run.
It seemed as if the original code could not process the list of URLs stored in a txt file.
Fortunately, we were not the first to run into this problem. With some help from the Department of Computer Science at Old Dominion University, we started with a working script as a base for our own classifier.
A working version of generatefeedvector.py can be found at: http://www.cs.odu.edu/~hany/teaching/cs495-f12/lectures/lecture_4/code/generatefeedvector.python
Just in case, the code for a working generatefeedvector.py is given below as well. Don’t forget that you need some extra Python modules installed and a text file with URLs to get this working.
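Since the text file with URLs was where we got stuck, here is a minimal sketch of just that step. It is not taken from generatefeedvector.py; the function name read_feed_urls and the file name feedlist.txt are our own assumptions for illustration, and it only uses the Python standard library. It assumes one URL per line and skips blank lines and lines starting with #.

```python
from pathlib import Path


def read_feed_urls(path):
    """Return the URLs listed in a text file, one per line.

    Hypothetical helper for illustration: blank lines and lines
    starting with '#' are skipped, surrounding whitespace is removed.
    """
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


if __name__ == "__main__":
    # 'feedlist.txt' is an assumed file name; use whatever your list is called.
    list_file = Path("feedlist.txt")
    if list_file.exists():
        for url in read_feed_urls(list_file):
            print(url)
```

Stripping each line before using it avoids passing a trailing newline or stray spaces along with the URL, which is an easy mistake to make when reading a list from a file.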