Recently, while I was setting up buildbot for continuous integration of my project, a friend told me he was sketching his ideas about community building in a mind map. So why not jot down what I learn about buildbot as a mind map too? After an hour and a half of fun with XMind, here is the result.
These pictures really remind me of those sleepless nights back in my university days. The fractal pictures look even more beautiful with the slick interface of Flickr.
Last time I wrote a program to download all the videos in my playlists. Wouldn't it be nice if I could watch the videos one by one while they are still downloading? All we have to do is start two threads: one downloads the videos and puts them in a queue, and the other takes videos from the queue and plays them.
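The two-thread idea can be sketched roughly as follows. This is only an illustration in Python 3, not the original program: `fetch` and `play` are hypothetical callables supplied by the caller (one downloads a URL and returns a local path, the other plays a file), and a `None` sentinel is assumed as the signal that no more videos are coming.

```python
import queue
import threading


def downloader(urls, ready, fetch):
    """Producer: download each video and queue its local path."""
    for url in urls:
        ready.put(fetch(url))
    ready.put(None)  # sentinel: nothing more to play


def player(ready, play):
    """Consumer: play videos in order as soon as they are available."""
    while True:
        path = ready.get()  # blocks until the downloader queues something
        if path is None:
            break
        play(path)


def watch_while_downloading(urls, fetch, play):
    ready = queue.Queue()
    producer = threading.Thread(target=downloader, args=(urls, ready, fetch))
    consumer = threading.Thread(target=player, args=(ready, play))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
```

Because `queue.Queue` is thread-safe and `get()` blocks, the player simply waits whenever it catches up with the downloader.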
Sometimes I want to watch the videos in my own or a friend's YouTube playlists on my laptop while I am offline. So I googled around and found some tools for downloading videos. Most of them, whether web-based or Firefox plugins, only let me download one video at a time; the exception is youtube-dl. So the problem becomes how to extract the video URLs (or YouTube IDs) from my playlists and feed them to youtube-dl, and fortunately the YouTube API makes that easy. The first step is to download http://www.arrakis.es/~rggi3/youtube-dl/youtube-dl and rename it to youtubeDL.py so it can be reused as a Python module. Then we can start extracting the video links. Here is a quick implementation.
import feedparser, urllib, re, sys
# Download http://www.arrakis.es/~rggi3/youtube-dl/youtube-dl and rename it to youtubeDL.py
from youtubeDL import FileDownloader, YoutubeIE, MetacafeIE, YoutubePlaylistIE, DownloadError

def retrieve_playlist(username):
    playlists_url = 'http://gdata.youtube.com/feeds/api/users/%s/playlists' % username
    feed = feedparser.parse(playlists_url)
    playlists = []
    for en in feed.entries:
        title = en.title
        id_num = en.id.split('/')[-1]
        pages = gen_playlist_pages(id_num)
        playlists.append(dict(title=title, id_num=id_num, pages=pages))
    return playlists

def gen_playlist_pages(id_num):
    # Build 4 page URLs of 50 entries each (up to 200 videos per playlist).
    page = 'http://gdata.youtube.com/feeds/api/playlists/%s' % id_num
    pages = []
    for i in range(4):
        params = urllib.urlencode({'start-index': 1 + 50 * i, 'max-results': 50})
        _page = '%s?%s' % (page, params)
        pages.append(_page)
    return pages

def get_video_links_from_playlists(playlists):
    video_lists = []
    for pl in playlists:
        video_links = []
        for p in pl['pages']:
            feed = feedparser.parse(p)
            for en in feed.entries:
                # Keep only entries whose link points at a watch page.
                if re.search(r'watch', en.link):
                    video_links.append(en.link)
        pl.update(dict(video_links=video_links))
        video_lists.append(pl)
    return video_lists

def download_videos(video_lists):
    youtube_ie = YoutubeIE()
    metacafe_ie = MetacafeIE(youtube_ie)
    youtube_pl_ie = YoutubePlaylistIE(youtube_ie)
    retcode = 0
    for vl in video_lists:
        # Each playlist is saved into a directory named after its title.
        outtmpl = vl.get('title', 'no_playlist_title') + u'/%(stitle)s-%(id)s.%(ext)s'
        fd = FileDownloader({'outtmpl': outtmpl})
        fd.add_info_extractor(youtube_pl_ie)
        fd.add_info_extractor(metacafe_ie)
        fd.add_info_extractor(youtube_ie)
        try:
            retcode = fd.download(vl.get('video_links'))
        except DownloadError:
            # yes, we should handle this... maybe later
            pass
    sys.exit(retcode)

if __name__ == '__main__':
    pls = retrieve_playlist('your_youtube_username_here')
    video_list = get_video_links_from_playlists(pls)
    download_videos(video_list)
Just put youtubeDL.py and this script, saved as playlists-dl.py, in the same directory, change 'your_youtube_username_here' to your user name, and run python playlists-dl.py. All the video clips in all your playlists will then be downloaded :).
Todo:
1. Let the user specify the username on the command line.
2. If the user specifies playlist IDs, download only the videos in those playlists.
3. Use multiple threads to reduce the total download time.
4. Play the downloaded videos while other downloads are still going on …
5. …
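As a rough idea of how the first two items might look, here is a small Python 3 sketch using argparse. It is not part of the script above: the option names (`username`, `-p/--playlist`) and the helper `select_playlists` are made up for illustration, and the playlist dictionaries are assumed to carry the `id_num` key that retrieve_playlist produces.

```python
import argparse


def parse_args(argv=None):
    """Parse a user name and an optional list of playlist IDs."""
    parser = argparse.ArgumentParser(
        description="Download the videos in a user's YouTube playlists.")
    parser.add_argument("username", help="user whose playlists are read")
    parser.add_argument("-p", "--playlist", action="append", default=[],
                        metavar="ID",
                        help="only download this playlist (repeatable)")
    return parser.parse_args(argv)


def select_playlists(playlists, wanted_ids):
    """Keep every playlist when no IDs are given, otherwise only those asked for."""
    if not wanted_ids:
        return playlists
    wanted = set(wanted_ids)
    return [pl for pl in playlists if pl["id_num"] in wanted]
```

With this in place the main block would call parse_args() and pass args.username to retrieve_playlist instead of a hard-coded string.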
I have been reading Design Patterns in Ruby. It's quite an interesting read; the author explains the GoF patterns in a very clear and readable way. For example, in chapter 3, "Varying the Algorithm with the Template Method", the author explains the Template Method pattern clearly with an example project much like one I had to do many years ago: spewing out reports with the same contents in many different formats such as HTML, PDF, plain text, etc. In chapter 14, "Easier Object Construction with the Builder", the author simulates the process of building a computer with the Builder pattern. The "Builders in the Wild" section of this chapter mentions that MailFactory in Ruby is an example of the builder pattern, which reminds me that email.mime.text.MIMEText in Python is a builder pattern example too. Anyway, it's good to finally see a design patterns book written from the perspective of a dynamic language.
Besides MIT SIMILE Timeline, I just discovered XTimeline, an even easier way to build a timeline on a web page. Well, I mean easier if you build it manually. To give it a try, I found this article with Google: Timeline: Countdown to the 2010 Olympic Games. Here is how it looks for the first three events.
I have been reading Programming Collective Intelligence by Toby Segaran recently. It's quite interesting and inspiring. In chapter 2, the author talks about how to make recommendations using the preference data of a group of people. Usually, we first collect the rating data, then implement a reasonable algorithm to calculate a metric (or score) from that data, and finally make recommendations in order of score.
For example, if I want to extend my music collection, asking other people who share my favorite band might be the best bet. Thanks to Audioscrobbler, the Last.FM web service, we can collect this data quite easily. With pyscrobbler, a set of Python bindings to the AudioScrobbler APIs based on ElementTree, we can write a function to get the rating data and calculate the score:
from audioscrobbler import AudioScrobblerQuery
import operator, sys

# n: total number of bands to be recommended.
def getRecommendations(favoriteBand, n=10):
    # Audioscrobbler returns 50 fans by default, so we use 50 as the full score.
    FULLSCORE = 50
    fans = [f.element().get('username') for f in AudioScrobblerQuery(artist=favoriteBand).fans()]
    bands = {}
    for f in fans:
        for a in AudioScrobblerQuery(user=f).topartists()[:FULLSCORE]:
            name = a.name.__str__()
            rank = int(a.rank.__str__())
            # Rank #1 gets score=FULLSCORE, rank #2 gets score=FULLSCORE-1, etc.
            score = FULLSCORE - rank + 1
            if name in bands:
                bands[name] += score
            else:
                bands[name] = score
    # We do not return the artist the user just passed in.
    bands.pop(favoriteBand, None)
    recom = sorted(bands.items(), key=operator.itemgetter(1), reverse=True)
    return recom[:n]
Here is an example of running this function in the IPython shell:
In [3]: getRecommendations('U2')
Out[3]:
[('Red Hot Chili Peppers', 757),
('Coldplay', 669),
('The Beatles', 532),
('Nirvana', 423),
('Aerosmith', 409),
('R.E.M.', 401),
('Moby', 381),
('Queen', 379),
('Pink Floyd', 378),
('Green Day', 376)]
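The scoring rule in getRecommendations can be tried without the web service. The sketch below (Python 3) is only an illustration under stated assumptions: instead of querying Audioscrobbler, it takes a hypothetical dictionary mapping each fan to that fan's top artists in rank order, and the fan names and lists in the usage note are invented.

```python
from collections import defaultdict


def rank_score(rank, full_score=50):
    """Rank 1 earns full_score points, rank 2 earns full_score - 1, and so on."""
    return full_score - rank + 1


def recommend(favorite_band, top_artists_by_fan, n=10, full_score=50):
    """Sum the rank scores over all fans and return the n best (artist, score) pairs."""
    totals = defaultdict(int)
    for top_artists in top_artists_by_fan.values():
        for rank, name in enumerate(top_artists[:full_score], start=1):
            totals[name] += rank_score(rank, full_score)
    totals.pop(favorite_band, None)  # never recommend the band we started from
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

For instance, with two made-up fans, recommend("U2", {"ann": ["U2", "Coldplay", "Queen"], "bob": ["Coldplay", "Moby", "U2"]}) puts Coldplay first because both fans rank it near the top.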
The Statistics Canada census includes results on the mode of transportation used by the employed labour force in different areas of Canada. Although the data is quite straightforward, it would be nice to have a map view that lets us easily see the geographic differences. It turns out to be an interesting topic for a Google Maps mashup.
Using Exhibit 2.0, a JavaScript library which makes it easy to create interactive web pages, all that is left to do is dump the raw CSV data, together with the latitude and longitude of each location, into the proper JSON format. With geopy, simplejson and Python's built-in csv module, here is an implementation:
import csv, re
from geopy import geocoders
import simplejson

COLUMN = ["geocode","place_of_residence","total_Mode_of_transportation","car_truck_or_van_as_driver","car_truck_or_van_as_passenger","total_Sustainable_transportation","public_transit","walked","bicycle","other"]
GMAPKEY = "Google_Map_API_Key_here"

def transformcsv2json(file, jsonfile='output.js'):
    reader = csv.reader(open(file))
    column = COLUMN + ['city','province','latlng','label']
    items = []
    fc = 0

    def getlatlng(place):
        # For some unknown reason, geocoding sometimes fails several times before it succeeds.
        for i in range(25):
            try:
                g = geocoders.Google(GMAPKEY)
                place, (lat, lng) = g.geocode(place)
                latlng = str(lat) + ', ' + str(lng)
                return latlng
            except Exception:
                print "|"*60, "Warning! ", place, " ", i, " times geocoding failed!"
        # Give up after 25 attempts.
        raise Exception

    for row in reader:
        city, province = mapLocationName(row[1])
        if city and province:
            place = city + " " + province
            try:
                latlng = getlatlng(place)
                rowplus = row + [city, province, latlng, place]
                transItem = dict(zip(column, rowplus))
                items.append(transItem)
            except Exception:
                # Take note of the place which failed geocoding, but keep on transforming the next data row anyway.
                print "?"*60, "Error! ", place, " ", "geocoding failed!"
                fc += 1
    print "#"*60, 'total number of geocoding fail: ', fc
    f = open(jsonfile, 'w')
    f.write(simplejson.dumps({'items': items}, ensure_ascii=False))
    f.close()

def mapLocationName(location):
    # Statistics Canada province abbreviations differ from the ones Google Maps uses.
    mapP = {'Alta.':'AB', 'B.C.':'BC', 'Man.':'MB','N.B.':'NB','N.L.':'NL','N.S.':'NS','N.W.T.':'NT','N.U.':'NU','Ont.':'ON','P.E.I.':'PE','Que.':'QC','Sask.':'SK','Y.T.':'YT'}
    try:
        city, province = location.split(',')
        province = re.findall(r'\((.*)\)', province)[0]
        # For places spanning two provinces, keep the first one.
        if re.search(r'\/', province):
            province = province.split('/')[0]
        province = mapP[province]
    except (ValueError, IndexError, KeyError):
        city = location
        province = ''
        print 'Warning! ', location, ' parsing failed'
    return city, province

if __name__ == '__main__':
    transformcsv2json('placeofresidence.csv', 'placeofresidence.js')
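The core of the transformation, stripped of the geocoding, is just zipping column names onto each CSV row and wrapping the rows in an "items" list. Here is a minimal Python 3 sketch of that step only; it is an illustration rather than a port of the script above. The `locate` argument is a hypothetical callable standing in for the geocoder, and the three-column sample in the usage note is invented.

```python
import csv
import io
import json


def csv_to_exhibit_json(csv_text, columns, locate):
    """Turn CSV rows into the {"items": [...]} structure written by the script."""
    items = []
    for row in csv.reader(io.StringIO(csv_text)):
        item = dict(zip(columns, row))   # column name -> cell value
        place = item["place_of_residence"]
        item["label"] = place            # same role as the 'label' column above
        item["latlng"] = locate(place)   # e.g. "43.65, -79.38"
        items.append(item)
    return json.dumps({"items": items}, ensure_ascii=False)
```

Calling csv_to_exhibit_json("1,Toronto,100\n", ["geocode", "place_of_residence", "total"], lambda place: "0.0, 0.0") returns a JSON string with a single item carrying the three CSV fields plus label and latlng.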
Flickr Leech is a slick site I often visit. One can browse multiple pictures at the same time by favorite tags, user name, interestingness, etc. Today, when I was looking at all those beautiful pictures from all over the world again, an idea came to me: maybe I can download those pictures automatically to use as my screensaver slides? I started to dig into the Flickr API, and it turns out the answer is yes indeed. Here is my quick hack in Python: flickrDownload.py. After executing
python flickrDownload.py dog 200
I started downloading the 200 most popular pictures tagged with "dog" from Flickr. Then I opened F-Spot in Ubuntu, tagged all these pictures with dog, and went to Edit->Preferences to set them as my screensaver slides.
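For readers curious about what such a script has to do, here is a small Python 3 sketch of the two URL-building steps. It is not flickrDownload.py itself, whose contents are not shown here. It assumes the `flickr.photos.search` REST method with JSON output, treats "most popular" as sorting by interestingness, and expects each photo record to carry the `id`, `secret` and `server` fields; the function names are made up.

```python
from urllib.parse import urlencode

REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def search_url(api_key, tag, count):
    """Build a flickr.photos.search request for the top `count` photos with `tag`."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "tags": tag,
        "sort": "interestingness-desc",  # assumption: "most popular" = most interesting
        "per_page": count,
        "format": "json",
        "nojsoncallback": 1,
    }
    return REST_ENDPOINT + "?" + urlencode(params)


def photo_url(photo):
    """Build the image URL for one photo record from the search response."""
    return "https://live.staticflickr.com/{server}/{id}_{secret}.jpg".format(**photo)
```

A downloader would fetch search_url(key, "dog", 200), parse the JSON, and save the file behind photo_url(p) for each returned photo.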