Webometric Thoughts

August 31, 2008

Twittering with Python …on the web

Filed under: Python,Twitter,programming — admin @ 9:20 pm

Programming is really addictive, especially when you are bad at it. Whilst the proficient programmer can deal with their problems in a matter of minutes, bad programmers can spend hours on the simplest of problems. Today I decided to start messing about with some server-side programming for the first time; now I find myself wondering what happened to my Sunday.

This particular form allows anyone to post to your Twitter account (and then displays the comments that have already been posted):

Enter Comment:

Now all I need to do is think of a use for anonymous twittering….

The code:
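import cgi
import urllib

# Emit the CGI header before any other output (Python 2 syntax throughout)
print "Content-type: text/html\n\n"

# Parse the fields submitted by the form
form = cgi.FieldStorage()

# Post the comment to Twitter, but only if one was actually submitted
if "status" in form:
    data = urllib.urlencode({"status": form["status"].value})
    res = urllib.urlopen("http://USERNAME:PASSWORD@twitter.com/statuses/update.xml", data)

# Print every line of the timeline feed that mentions a title element
lines = urllib.urlopen('http://twitter.com/statuses/user_timeline/16066835.rss')
for line in lines:
    if line.find('title') != -1:
        print line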
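
The line.find('title') test in the code above matches any line that happens to contain the word 'title', markup and all. As a rough sketch of one alternative (not the code behind the form), the titles could be pulled out with an XML parser instead. This version assumes Python 3 and a standard RSS 2.0 layout; extract_titles and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET


def extract_titles(rss_text):
    """Return the text of every <item><title> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    # Only <item> elements are visited, so the channel's own <title> is skipped
    return [item.findtext("title", default="") for item in root.iter("item")]


# Hypothetical sample feed, standing in for the user_timeline RSS
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Twitter / example</title>
    <item><title>example: first comment</title></item>
    <item><title>example: second comment</title></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for title in extract_titles(SAMPLE_FEED):
        print(title)
```

Parsing the feed this way returns just the comment text, without the surrounding tags that the line-by-line version prints.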

August 30, 2008

Remembering the Olympics: BBC thrashed NBC!

Filed under: BBC,olympics,streaming — admin @ 12:53 pm

Whilst the Olympics seems AGES ago now, I’ve only just come across some of the streaming figures for the BBC and NBC:
NBC – 75.5 million streams
BBC – 40 million streams
Obviously comparisons are never simple, with different time zones likely to affect the choice of the Internet over the TV, but as the USA has five times the population of the UK, 75.5 million streams doesn’t look very impressive.
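
To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It uses only the two stream counts above and treats "five times the population" as a round-number assumption rather than census data.

```python
# Stream counts quoted above, in millions
nbc_streams = 75.5
bbc_streams = 40.0

# Round-number assumption from the post: the USA has about 5x the UK's population
population_ratio = 5

# Scale NBC's total down to a UK-sized audience
nbc_scaled = nbc_streams / population_ratio

# How many times more streams per head the BBC served
bbc_advantage = bbc_streams / nbc_scaled

print(f"NBC scaled to UK population: {nbc_scaled:.1f} million streams")
print(f"BBC streams per head vs NBC: {bbc_advantage:.1f}x")
```

On those assumptions NBC's figure shrinks to about 15 million UK-equivalent streams, so the BBC served a little over two and a half times as many per head.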

It seems likely that London 2012 will make these figures seem ridiculously small. Hopefully the BBC will be offering more than just the 6 streams (it’s amazing that there was still so much that I wanted to see and couldn’t), and live mobile streams will be readily available.

UPDATE (11/09/08): Whilst I thought the BBC’s 40 million streams thrashed NBC, as the BBC is actually claiming 50 million streams, I guess the BBC thrashed NBC then delicately placed a cherry on top.

August 29, 2008

Maplin’s Minibook

Filed under: Eee PC,Eee PC 901,Maplin's minibook — admin @ 3:53 pm

I remember a time when being seen in public with my Eee PC 701 would always get people asking questions; these days they are seemingly everywhere. A stroll into Dixons (or whatever my local branch is called these days) at lunchtime found four different mini-notebooks available. The one that really caught my eye, however, was the one in Maplin’s:

Whilst it isn’t the most powerful of the mini-notebooks, or the most aesthetically pleasing, with a 7-inch screen you can’t argue with the weight: 0.65kg!! That makes the 0.922kg of the Eee PC 701 look decidedly overweight, whilst the Eee 901 will barely be able to hold its head up in public at 1.14kg.

Maybe I need Maplin’s minibook for those days when I am feeling too lazy to carry my 701…or should I just hold out for the one that comes with a helium filled balloon?

I am also slightly surprised at the claim on the ‘minibook’ trademark, after all, wasn’t the Eee PC 701 first launched in the UK as the RM minibook?

August 28, 2008

Internet Explorer 8 – Beta 2

Filed under: browser — admin @ 7:57 pm

Almost 6 months after the first beta was launched, today saw the launch of Internet Explorer 8 – beta 2. With a new privacy mode – widely labelled ‘porn mode’ by the newspapers – it has managed to get more irrational responses than you would expect from the launch of a new browser:

Sick just another way to allow married men to access porn without the wife finding out, bet this and other sites are the reason courts have seen previously very secure 15 year plus marriages hit the rocks when wives dont live up to a sexed-up husband’s hard core porn expectations.
- Dee, Hampshire, England, 28/8/2008 18:39

I can see this “improvement” being very popular with paedophiles. Transfer all their pics to private online photo browsers, and no need for them to be worried about having their computers seized by the police when this is released.
- Mike, London, England

Who knows why people get over-excited at ‘publishing’ their angry/ignorant comments on the big public sites, but it reminds me that there was a great post in the Guardian a couple of weeks ago about it: What about are rights?

Blogroll is dead; long live customizable Reddit

Filed under: Reddit — admin @ 9:30 am

Back in November I complained about the static nature of most blogrolls. I then changed my own blogroll, and to my knowledge never changed it again (despite the best of intentions). As such the Read Write Web’s post on the customizable nature of Reddit has spurred me into action, and I have now created a Webometrics Reddit for those stories/pages I find interesting.

Whilst I have called it ‘Webometrics’, in the same way that my blog covers whatever I fancy writing about on a particular day, it will highlight anything I find of interest, with the ‘hottest’ five stories shown in the sidebar of my blog. The reddit has been left open, so if you want to submit a link feel free…in fact I would encourage it.

Email is dead; long live ….

Filed under: email,spam — admin @ 8:24 am

I have always enjoyed checking to see if I have any post. Whilst it is usually a bill, the offer of a credit card, or the promotion of Sky TV, there is always the chance that it could be something far more interesting and exotic (such as the free book of stamps I received last week). When the Royal Mail did away with the second post, I overcame the loss by checking my emails more often. Now, however, I find that my email is increasingly unreliable.

My email problems started back in March when Hotmail started only selectively sending my emails. Whilst this was seemingly a glitch with Hotmail, as opposed to a problem with email as a communication medium, the seeds of doubt were sown. Since then my university has installed an email filtering system to deal with spam. Not a selective filtering system that is applied to those people who have a problem with spam, but across the board!

I now find that unless I spend twice as long constructing an email, e.g., filling it with excess text and making sure there are no mentions of banks or money (as a general rule my emails rarely mention viagra anyway), only half of my Hotmail emails reach their university destination without being quarantined for a few hours first. In addition, a number of emails sent from university addresses to me are mysteriously disappearing. Yes, I could do away with my Hotmail account and use my university account, but that does not seem to be a practical solution. Most people (especially students) have one account that they already check regularly when they arrive at a university; forcing them to check another email address will merely mean communications sent to that address are left for days or weeks on end before being retrieved.

Whilst I’m sure that some IT departments are more selective when they roll out spam guards, unless you are aware of the exact filtering systems in place at the email’s destination, you can never be sure whether your email has reached its destination or not. As such email, in its current state, is as good as dead. Unfortunately there doesn’t seem to be anything capable of replacing it, but if someone does come up with a solution there will be a lot of money to be made.

August 21, 2008

Iterasi: Create your own archive!

Filed under: archive,national archives,web archives,webometrics — admin @ 12:27 pm

The UK’s web archive is pretty rubbish; Iterasi (highlighted by TechCrunch) is therefore a great addition to the web.

Rather than merely bookmarking a URL, you can archive the actual page, and can continue archiving the page on a regular basis if you so wish. The only downsides to the site are that it only allows you to archive on a daily basis (for the front pages of news sites you may want to archive more regularly), and it only archives when your computer, with its list of scheduled saves, is turned on.

The potential for webometric studies is obvious: it would seem as though even the most technologically incompetent of us can now simply collect longitudinal data. For example, Google searches may be collected on a daily basis to see how the results or the number of hits change…and once you have archived a page, it’s very simple to then embed the page:

It also has potential for bloggers; when they discuss a page or story bloggers can now be sure that their readers will have access to the page that they saw rather than an updated version. How content providers will react to the archiving of their content is yet to be seen.
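
For anyone who would rather script the kind of longitudinal collection mentioned above than rely on a browser add-on, the basic idea is small enough to sketch in Python. This is not how Iterasi works; it is a minimal illustration that assumes Python 3, and save_snapshot, the snapshots directory, and the file-naming scheme are all invented for the example. Run once a day from a scheduler, it would build up a dated series of copies of a page.

```python
import datetime
import pathlib
import urllib.request


def fetch(url):
    """Download a page and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_snapshot(url, content, directory="snapshots", when=None):
    """Write one dated copy of a page and return the path it was saved to."""
    when = when or datetime.date.today()
    folder = pathlib.Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    # Build a filesystem-safe name from the URL plus the date (hypothetical scheme)
    safe_name = "".join(c if c.isalnum() else "_" for c in url)
    path = folder / f"{safe_name}_{when.isoformat()}.html"
    path.write_bytes(content)
    return path


if __name__ == "__main__":
    page = "http://example.com/"
    print(save_snapshot(page, fetch(page)))
```

Keeping the download and the save as separate functions means the saving step can be tried out on any bytes without touching the network.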

August 19, 2008

A week of Asking not Googling

Filed under: Ask,Google,search engines — admin @ 1:32 pm

In response to Google’s continued growth in the search engine market, last week I decided to attempt to give up Google Search. For the last week my search engine of choice has been Ask:

Whilst I have regularly used different search engines over the years, a fundamental shift came in my searching behaviour in about 2000. Pre-2000 there was no single search engine that dominated my search activities; I went all over the place: HotBot, Yahoo, Lycos, AltaVista. I even Asked Jeeves on occasion. Since 2000, however, Google has dominated my life, with only occasional visits to those few search engines that continue to have their own index. Trying to break eight years of Google-dominated search has a number of difficulties.
Habit – You don’t have to think about typing in Google, your fingers seemingly hit the keys before you have even decided what you want to search for; this is not an easy habit to break. I quickly remember, however, as soon as I have typed GOOGLE, or after I have typed in my search terms, that I am trying to give up Google Search, and force myself to go to Ask and type in the queries again (however tempting the Google results may look). I think Google is a habit that can be broken, but it isn’t easy.
Trust – After using Google for eight years, I find that I trust them to do their job as a search engine. Whilst I understand the limitations of any single search engine (i.e., that Google’s search engine is not exhaustive, and that other search engines will have different pages indexed), tens of thousands of queries have taught me what to expect from the Google index. When I don’t get the results I need from Google I make a judgement as to whether I need to try somewhere else or adjust my search terms; if I don’t find what I want on Ask my first reaction is to question the quality of the search engine. Trust is something that can only come with time.
The Google Package – Google now offers more than just web search: blogs, emails, news, image search, blog search, scholar. No other search engine provides such a variety of products that are of the same quality (the UK version of Ask News is currently rubbish); you can’t help but return to a certain extent. Vigilance is constantly necessary if we are to stop falling back into our old Google habits.
The Gold Standard – Despite trying to break away from Google search, I still care how it ranks my pages. Most people use Google. Most of my traffic comes from Google. It means more to have a high Google rank than a high Ask rank, and you can’t help but check.

Whilst it was always going to be difficult to move away from Google, Ask’s search compares favourably; you just have to make an effort to break the Google habit, and give Ask enough time to build up a level of trust. For all Google’s growth, there are two things I prefer about Ask: the Skins (the polka dot background always cheers me up); and the URL is 3 keystrokes shorter. How many man hours would have been saved if Google had called themselves Goo?

August 15, 2008

OfCom: The Communication Market 2008

Filed under: Office of Communications,SMS,internet use — admin @ 7:51 am

You can’t help but love the annual Communications Market Report from the Office of Communications; it provides massive amounts of detail and information on the UK’s communication market. As with last year’s document, it is a rather large document (365 pages), so not for the faint hearted. Luckily, however, the BBC have published a nice summary with the most interesting graphs.

My favourite graph is the “Time Spent Using Communications Services”; it’s interesting to see how your own usage compares to others:

Personally I was shocked at how low internet usage was…at the moment, due to the Olympic coverage, I tend to be online for a couple of hours before I even get out of bed! Closer inspection of the OfCom document finds that the 24 mins does not include time spent watching streaming media, which partially explains the low number of internet minutes. My personal graph would look something like this (for personal/non-work use):

Whilst it seems as though there are not enough hours in the day, the portability of laptops like the Eee PC means that we can always be online. It is a rare occasion when I am watching the TV without the internet, whilst I only really listen to the radio when cooking the dinner. My fixed phone usage is zero due to the phone line still not being connected by Virgin!

I thought that the most surprising finding was the continued rise of SMS use (up 28% on last year). Personally I think it must be close to a peak now, with an increasing number of mobile internet applications becoming available, such as Nokia Chat and other instant messaging services.

August 14, 2008
