The thoughts of a web 2.0 research fellow on all things in the technological sphere that capture his interest.

Friday, 22 January 2010

Semantic Webometrics - A few thoughts

The other day an academic colleague asked what I was working on at the moment. My answer included semantic webometrics and, unsurprisingly, he wanted some more detail. However, 'working on' would be a bit of an exaggeration; 'have a few ideas but nothing on paper yet' would have been more appropriate. As such I thought I'd write down some of my rough thoughts on semantic webometrics.

Webometrics
For those who may have stumbled upon this blog from a non-webometric background, Webometrics as defined by Björneborn (2004), and as used by most of the webometrics community, means the:
...study of the quantitative aspects of the construction and use of information resources, structures and technologies on the Web drawing on bibliometric and informetric approaches.
Many of these quantitative studies have focused on hyperlinks. For example, investigating whether there is a correlation between a university's inlinks (a.k.a. backlinks) and a university's research ranking, or whether the interconnectedness of organisations in a region (as seen through interlinking web sites) can give an indication of a region's level of innovation [outrageous self-citation].

One of the problems with many of these link-analyses is that they include a lot of noise. For example, when counting a university's inlinks you will be counting both those from an academic highlighting a university's quality research, and those from the disgruntled student highlighting his most hated tutor. Traditionally we have tried to understand the extent of this noise through large scale content analysis - the extremely tedious manual classification of web links and web pages.

The semantic web
A semantic web is one where information on the web is structured so that it is meaningful to computers. Well known examples of the semantic web include the FOAF ontology, which allows people to express their relationships with one another (e.g., the FOAF of Tim Berners-Lee), and the use of microformats for certain types of structured content including contact details (as included at www.davidstuart.co.uk) and reviews (which are now indexed by Google as Rich Snippets). This extra information can be used to reduce the amount of noise and enable meaningful webometric studies.

Semantic webometrics
So when I say semantic webometrics I mean - webometric studies that make use of the additional information included in an increasingly semantic web.

For example, a semantic webometric study of the connection between an institution's inlinks and research ranking would take into consideration who had placed the links and the attributes that they had associated with them. A semantic webometric study of the relationships between organisations would look at the explicit relationships contained in FOAF files as well as the implicit information on web pages.
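As a rough illustration of taking link attributes into consideration, the sketch below tallies the links on a page that point at a given institution, grouped by the values of their rel attribute (nofollow being the obvious one). It is only a minimal sketch using Python's standard library: the class and function names, the sample page, and the example.ac.uk domain are all made up for illustration, and a real study would have to crawl the linking pages first.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class RelLinkCounter(HTMLParser):
    """Tally links to a target domain by the rel values attached to them.

    Hypothetical helper written for this illustration only.
    """

    def __init__(self, target_domain):
        super().__init__()
        self.target_domain = target_domain.lower()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        host = urlparse(attributes.get("href") or "").netloc.lower()
        # Keep only links pointing at the target domain or one of its subdomains.
        if host != self.target_domain and not host.endswith("." + self.target_domain):
            return
        rel_values = (attributes.get("rel") or "").lower().split()
        # A link without a rel attribute is counted under "(none)".
        for value in rel_values or ["(none)"]:
            self.counts[value] += 1


def count_inlinks_by_rel(html, target_domain):
    """Return a dict mapping each rel value to the number of matching inlinks."""
    parser = RelLinkCounter(target_domain)
    parser.feed(html)
    parser.close()
    return dict(parser.counts)


# Made-up page: one plain link, one nofollow link, and one link elsewhere.
sample_page = """
<p><a href="http://www.example.ac.uk/research">Great research</a></p>
<p><a rel="nofollow" href="http://www.example.ac.uk/staff/tutor">Worst tutor</a></p>
<p><a rel="friend met" href="http://elsewhere.example.org/">A friend</a></p>
"""

print(count_inlinks_by_rel(sample_page, "example.ac.uk"))
```

For the sample page this reports one inlink with no rel value and one marked nofollow; the third link is ignored because it points at a different domain. Whether a nofollow link should count for less in a study is a research decision, not something the code settles.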

Conclusions
Unfortunately there is relatively little semantic information embedded in the majority of web pages/sites, and where it is widespread, e.g., with the nofollow link attribute, webometricians have yet to develop the tools to make use of it.

As such we need to take an information-centred approach to semantic webometric research rather than a problem-centred approach. Whilst still small, the amount of semantic data being embedded in the web is increasing all the time; webometricians need to investigate what is available and how they can use it.


posted by David at | 0 Comments Links to this post

Monday, 4 January 2010

Predictions: What are they good for?

At this time of year (or rather a few weeks ago if they weren't drowning under a pile of work) technology bloggers all around the world make predictions about the coming year, and reflect upon the predictions they made the previous year. Looking back on my previous predictions I can't help but realise how slowly the world of technology moves.

Last year's predictions
1. N97 takes Nokia back to the top of the pile. Unfortunately I have only come across one person with an N97 in the past year; Apple and its apps continue to beguile everyone in their path.
2. Distributed social networks will shrink Facebook traffic. Unfortunately Google Wave launched too late in the year, and with too many problems, for it to make any real impact. But the notion of a distributed system has been well and truly planted in people's minds.
3. Project Kangaroo will hit UK desktops. The legal watching of video online is increasing, with new entrants in the market such as Blinkbox, but unfortunately Project Kangaroo fell foul of the Competition Commission.
4. The general public continue to ignore QR codes. Despite my pessimism QR codes have actually started to creep into some unexpected places. For example, the University of Bath uses them in numerous places, including its library catalogue. Whilst they have become more popular than I imagined, they are still ignored by most of the public.
5. No Google alternative will emerge. Yahoo Search closes up shop, Bing has more money than sense, and Google marches on.

This year's predictions - On a similar theme
1. iPhone + Augmented Reality = Increased Market Share. I hate the iPhone because if you want to install anything on an iPhone you have to check it's OK with Apple first, for which they will take a 30% cut of the price of the app. Unfortunately the centralised app-store is the reason so many people like it. It simplifies the process of downloading new applications, and as we see an increase in glossy augmented reality mobile applications the iPhone will continue to be perceived as the obvious choice.
2. Google Wave takes off. Despite hating Google, I'm backing Google Wave for two reasons: i) We need something better than email, ii) I really want to see an open distributed system. It still has a lot of teething problems, but nothing that can't be overcome.
3. Project Canvas fails. Project Kangaroo failed because of the complaints of Murdoch, and I'm sure Project Canvas will as well, especially if we see a Tory government after the next election.
4. No change in search. Market share will stay the same and no one will embrace the potential of the wisdom of the crowd. Search strikes me as one of the more antiquated areas of the web, with little real innovation occurring. I think things will start to change in 2011, if the semantic web takes a foothold this year.
5. The year of the Semantic Web. After years of talk, I have the feeling that this could be the one where we start to see the semantic web making an impact both through the opening up of large data sets, and the marking up of web pages with microformats. As someone who is fed up with poking and tweeting, I'm looking to the semantic web to inject a bit of life into the web.

As for Twitter, I don't really care. I'm bored of it now.


Sunday, 3 January 2010

2009 in Books: 47

Whilst I have little doubt that the web is a wonderful thing, I personally waste a lot of time online reading half-formed, half-baked, off-the-cuff opinions. There are a lot of things that are better said in 300 pages than 140 characters. Unfortunately my mindless clicking online leaves far less room for books than I would like. At a minimum I would expect to read 50 books in a year; unfortunately (thanks to that ever encroaching web) 2009 saw me read a mere 47, or rather, finish 47 books. My shelves are littered with half-read books which, if I return to them, I will feel it necessary to start again from the beginning.

The work related books: 16
'Work' can be stretched to cover a multitude of subjects that I am interested in, from sociology, through the narrative, to Second Life.

Unfortunately some of the work related books are far less enjoyable. Often (although not always) these were the ones that I had offered to review for a journal and therefore have to struggle through to the end.

Whilst some books are always worse than others, without a doubt Knowledge Networks: The Social Software Perspective (Premier Reference Source) was not only the worst book I read this year, but one of the worst publishing efforts I have ever seen.

Other non-fiction: 19
There isn't much of a theme to the rest of my non-fiction, although I possibly got a bit carried away with books about Samuel Johnson.


The one with least merit is The Impulse Factor: Why Some of Us Play it Safe and Others Risk it All; don't even think about buying this book. For the keen-eyed wondering what happened to book number 19, it was HOW TO USE BOOKS; I can only presume that it was the lack of a picture that meant Amazon wouldn't let me add it to a widget.

The Fiction Books: 12
Curiously my fictional reads of 2009 both started and ended with an Adrian Mole, and there is the usual inclusion of personal favourites such as Grisham and Irving. But beyond that it is a curious selection of odds and ends.



Conclusions
Clumped together it looks a slightly bizarre collection, especially the fiction shelves (I believe Mr Majeika was free in a cereal box a previous year), but there again I suppose a lot of people's do. As with every other year I shall resolve to read far more in 2010; maybe I should also resolve to read better books in 2010.


Sunday, 27 December 2009

The Cat, the Bullfighter, and Google Books

As a general rule I take the web for granted. Although I'm old enough to remember [a lot of] life before the web, because I was aware of services like Prestel and had dialed up the local BBS years before, I merely saw the web as a natural progression.

Occasionally, however, an inconsequential event does make me stop and realise how much we really take for granted. Last night I was curled up with Erving Goffman's Frame Analysis: An Essay on the Organization of Experience (as I've mentioned before his works provide useful frameworks for understanding the social web). On page 424 he uses an example of a cat 'frame breaking' as it circled the bull ring whilst the bullfighters were hiding behind barriers from the bull. Despite it being nearly midnight I could go over to my computer, go to http://books.google.com/ and browse or search my way to the relevant issue of Life magazine to see the full-page picture of the event.

A book published in 1974, referenced a photograph published in 1955, and I could see that photo in a matter of moments. Something that would have been impossible for the vast majority of people who have ever read Frame Analysis over the past 35 years.


Saturday, 26 December 2009

Do you need a Pocket Projector?

My research group were recently assigned a large amount of cash for new equipment. Beyond the usual list of desktops, laptops, and netbooks, I decided to ask Twitter for more interesting suggestions.
The idea of a portable projector appealed from both a work and a social perspective, offering the opportunity for demonstrating webometric presentations on the fly (it's a very visual subject), as well as watching films and TV on the big screen. The best Pocket Projector I could find was the Adapt ADPP-305 Pocket Projector. Luckily it arrived on Christmas Eve, so I have had a couple of days to test it out - albeit mostly for watching Christmas TV.

The projector promises up to 100 inches, although you'd want a very dark room for it to be a clear 100 inch image. So far I have connected my laptop and my Wii to the projector, and run it off the mains, although there is also a 4GB internal memory, and a battery if you want to leave the laptop and leads at home.

There's no doubt it is a nice bit of kit (although one of the tripod legs is a bit loose on mine - it's nothing a bit of glue won't sort out), providing a good picture, and is reasonably priced.

As projectors continue to improve I can imagine the traditional TV being squeezed out by computers on the one side, and projectors for the big screen experience on the other. On Christmas Day I projected the Gruffalo onto the wall at 70 inches; a 70 inch flat screen would not only cost thousands, but would continue to take up space when not being used.


Wednesday, 16 December 2009

The meaning of 'citations' on the social web

It has long been recognised in the world of scientific publishing that the rich get richer and the poor get poorer as authors vie for attention: The Matthew Effect. On the social web the problem is exacerbated through a combination of greater variation in the quality of publishing, the ease of 'citing', and social rewards for citing first. This can result in highly cited works of very dubious quality.

Publishing has traditionally been limited by space: If you publish one article you won't have space to publish another, therefore you should choose wisely. Online the cost of space is negligible, so sites often publish stories that are pretty worthless. A good example of this is an Econsultancy post on "What would a Tory government mean for SEO?"

As a pointless article with absolutely no substance it should attract little attention. However it is on a very popular site, so (according to Topsy) it nonetheless gets 50 retweets (admittedly one is mine). It's a pattern that's regularly repeated all over the web, with numbers dwarfing a mere 50 retweets. A post that may be described as "naive and lazy" by one person, can easily find itself retweeted over a thousand times if the right person is posting it and the mob want to jump on the bandwagon.

Whilst the cost of space is negligible, few online publishers (and promoters) take into consideration the cost of time to the reader. Whilst users don't have to subscribe to a feed, many will have subscribed to feeds as the sites were working hard to build a reputation; unfortunately the feeds remain in the feedreader when the sites are starting to coast. Maybe it's time that I put Econsultancy in the same bin as Mashable, which tipped from being mostly useful to mostly pointless over a year ago for me.

Whilst many of the posts on this site will fall firmly in the "not worth the attention" category, my audience has little expectation of anything else :-)


Thursday, 10 December 2009

SpongeBob Top Trumps - With 2D Codes

Back in September, on a trip to Walsall for a Social Media Curry, I picked up a pack of 3D SpongeBob Top Trumps. Unlike the traditional Top Trumps, the latest versions have an interactive element with 2D barcodes printed on the back of some of the cards which can be read by a web cam with special software.

Earlier this week, Top Trumps finally released the necessary SpongeBob software. Why the cards were on sale almost three months before the software I don't know, but at last I can have my photo taken with SpongeBob.

The actual SpongeBob software is a bit rubbish, and really wasn't worth a three month wait, but it does show some of the potential of 2D codes for bridging the gap between the real and virtual worlds.
