The thoughts of a web 2.0 research fellow on all things in the technological sphere that capture his interest.

Thursday, 10 July 2008

Python for API Dummies

Python is a really simple programming language for the novice programmer, so I held an afternoon's "workshop" for a couple of PhD students in my front room.

The aim of the workshop was to provide sufficient information about programming in Python so that at the end of the afternoon the user could:
-Install Python libraries
-Download information through various APIs
-Manipulate the downloaded information.
As it was necessary to create an extensive slide show, covering everything from installing Python to getting data from the Yahoo API, I thought it might be of interest to other novice users who don't know where to start.





It doesn't necessarily include the quickest or most efficient way of doing things, but it is simple and does the job.
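
To give a flavour of the workshop aims listed above, here is a minimal sketch in present-day Python 3 that uses only the standard library, so nothing extra needs installing. It is not taken from the slides: the endpoint `https://api.example.com/search`, the `query` and `format` parameters, and the shape of the JSON response are all assumptions made up for illustration, and they are not the Yahoo API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: a stand-in for whichever API you are working with.
BASE_URL = "https://api.example.com/search"


def build_url(query, base_url=BASE_URL):
    """Build the request URL, escaping the query string safely."""
    return base_url + "?" + urlencode({"query": query, "format": "json"})


def download(url):
    """Fetch the URL and return the response body as text (needs network access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def titles_by_length(raw_json):
    """Manipulate the download: pull out the result titles, shortest first."""
    data = json.loads(raw_json)
    # Assumed response shape: {"results": [{"title": "..."}, ...]}
    titles = [item["title"] for item in data.get("results", [])]
    return sorted(titles, key=len)


if __name__ == "__main__":
    # A canned response stands in for download(build_url("python api")),
    # so the sketch runs without a network connection.
    sample = '{"results": [{"title": "Python for API Dummies"}, {"title": "Eee PC"}]}'
    print(build_url("python api"))
    print(titles_by_length(sample))
```

Keeping the download step separate from the manipulation step means the second can be tried out on a saved response before pointing the first at a real service.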

If you have any questions about specific points, feel free to ask...the questions can't be more stupid than the questions the PhD students asked...and some of the slides could probably benefit from further explanation.

posted by David | 1 comment

Sunday, 6 July 2008

UK Public Data APIs: It's only the start...

Last year a report for the Cabinet Office concluded that the government should make more of its data publicly available. This was followed last month by Gordon Brown announcing that online maps with crimes plotted on them will be made available, and now we are finally beginning to see some of the public data being released.

Already there seems to be an embarrassment of riches: a neighbourhood statistics API from the Office for National Statistics; transport information from Transport Direct; health care services and information from the NHS...and the list just keeps going on. Although I have been messing about with APIs for a number of years, the quantity of data available means that I have no idea where to start. The good news is that the Power of Information Task Force are offering up to £20,000 to develop any ideas that you may have.

Whilst I have not had a chance to play about with any of the data yet, I do have one criticism: the use of a .co.uk domain name (i.e., www.showusabetterway.co.uk). As the Power of Information Task Force has a government email address (i.e., poi@cabinet-office.gov.uk), why didn't they use a government domain name? Such domain names are restricted, and therefore provide an indication of legitimacy.

posted by David | 0 comments

Sunday, 6 April 2008

Average photos per Flickr Member: ZERO

67% of Flickr members have no photos! Whilst Lotka's law teaches us that the majority of contributors to a community make very few contributions, I was still surprised at the number of members with no photos; after all, I am not talking about visitors to the site, but those who have taken the trouble to join. What is the point of joining Flickr if you are not going to put photos on the site?

Data was collected about the number of photos for 324 randomly selected users: 216 had no photos, an additional 58 had fewer than 20, and only 50 had over 20.
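
For anyone who wants to check the arithmetic, the percentages can be recomputed from the three counts above. This is a minimal sketch written for illustration, not the script used to collect the data, and the group labels are my own.

```python
# Counts reported in the post for the 324 sampled Flickr members.
counts = {"no photos": 216, "fewer than 20": 58, "over 20": 50}

total = sum(counts.values())


def shares(groups):
    """Return each group's share of the sample as a whole-number percentage."""
    group_total = sum(groups.values())
    return {label: round(100 * n / group_total) for label, n in groups.items()}


if __name__ == "__main__":
    print(total)
    for label, pct in shares(counts).items():
        print(f"{label}: {pct}%")
```

Because more than half of the sample (216 of 324) has no photos, the median member has zero photos, which is presumably the sense in which the headline "average" is zero.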


Really I should have a look at whether these members without photos are active in other ways (e.g., as members of groups or leavers of comments), but this was little more than an aside as I spend my time messing about with Python. I have now loaded Python on my main computer as well as my Eee PC, and can barely believe how easy it is!

posted by David | 4 comments

Wednesday, 2 April 2008

Programming Python on the Eee PC

Since Friday I have been spending a lot of time programming in Python on the Eee PC, and the more I program the more I like both the language and the ease of having it on the Eee PC. Over at the Beeb, Bill Thompson poses the question "Who will write tomorrow's code?" I suggested last week that the Eee PC (and other similar devices) may be the answer, and now I am more convinced than ever.

Already I have been writing programs in Python that use the Twitter, Flickr and Digg APIs, programs that could form the basis of numerous articles that I will never get around to writing...it's SO easy (with the possible exception of installing the simplejson library that the Twitter library relies on). I just wish some other sites would roll out APIs (e.g., StumbleUpon and Reddit).
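
For anyone wondering what the simplejson dependency is for: API responses often arrive as JSON text, and the library turns that text into ordinary Python lists and dictionaries. From Python 2.6 onwards the same functionality ships in the standard library as `json`, so a common pattern is to try simplejson first and fall back. A minimal sketch follows; the sample payload and its field names are invented for illustration and are not the real response format of the Twitter, Flickr or Digg APIs.

```python
try:
    import simplejson as json  # third-party package, used if it is installed
except ImportError:
    import json  # standard library equivalent (Python 2.6 and later)

# Invented payload: real API responses will have different fields.
SAMPLE_RESPONSE = """
[
  {"user": "alice", "text": "Trying out Python on a netbook"},
  {"user": "bob", "text": "APIs everywhere"},
  {"user": "alice", "text": "It really is easy"}
]
"""


def posts_per_user(raw_json):
    """Decode the JSON text and count how many posts each user made."""
    tally = {}
    for post in json.loads(raw_json):
        tally[post["user"]] = tally.get(post["user"], 0) + 1
    return tally


if __name__ == "__main__":
    print(posts_per_user(SAMPLE_RESPONSE))
```

Once the response is a plain list of dictionaries, everything after that is ordinary Python, which is a large part of why working with these APIs feels so easy.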

So, do we all need to become top-class programmers? No. But if you can program, even to a basic level, the web becomes a much more exciting and interactive place.

posted by David | 1 comment

Friday, 21 March 2008

Giga-blast from the past

It is all too easy to forget about some of the alternative search engines out there, and I must admit that I can't remember the last time I used Gigablast. It was therefore good to read on ResearchBuzz that Gigablast are now offering site search, which I have now added to the right-hand frame of my blog (too often people overlook the blog search in the blogger toolbar/banner).

Gigablast seems to have had a bit of a makeover since I last visited (when it looked something like THIS), and now it even has a very limited API. Personally I would like to see the API extended and a few advanced operators added; surely that's an easy way of getting a competitive advantage over the other search engines.

Personally I hate the growth of Google search, and love any opportunity to support other search engines.

posted by David | 0 comments

Friday, 7 September 2007

Catching up with Google Search

My failure to find a decent mobile RSS feed reader means that every time I have even a couple of days off I return to a Bloglines account that has thousands of posts waiting to be read...nonetheless, three days later, I am finally on top of them all again.

The good news that I have returned to is that Google Search are opening up more of their data for university researchers, following quickly on the heels of Microsoft's new Webmaster Portal. Although the two programs are aimed at different communities, they are both likely to open up a wealth of information to those interested in webometric research.

Whilst access to the search engine data is welcome, I'm guessing that the more sensitive additional information about how search engines crawl and index web pages will continue to be kept secret, even though such information is perceived as necessary in the scientific community if much of the research is to be taken seriously.

posted by David | 2 comments