Webometric Thoughts

October 3, 2008

It’s Porn Friday!!!

Filed under: Click,Hitwise,webometrics — admin @ 9:21 am

It’s not that today has been designated the official porn day of the year, merely that Friday is the day when adult web sites get most of their traffic. That’s just one of the facts scattered throughout Bill Tancer’s Click: What Millions of People are Doing Online and Why It Matters, albeit the most memorable:

Whilst very much a popular book, rather than an academic book, it’s a worthwhile read from a webometric perspective. If nothing else you can curse the limited amount of data we have access to in comparison to our commercial counterparts: Whereas we have to count links, they get to follow click-streams; following the mood and reactions of people around the world.

Whilst there is obviously big money to be made with the Hitwise data, as well as with the data of their competitors, maybe they would find the data even easier to sell if it had been shown to stand up to the rigour of the academic community and the peer review process. My door is always open :-)

October 2, 2008

Google 2001 v. Google 2008

Filed under: Google 2001,webometrics — admin @ 9:41 am

In honour of their 10th birthday Google brought back their oldest available index a couple of days ago: Google 2001. This provides a great opportunity for looking at how the web has changed, especially the growth of certain terms in comparison to others.

As a webometrician, the obvious choice is to see how ‘webometrics’ has grown. However, with changes in the index size, the results are only meaningful in comparison to another result. In this case I have decided on ‘Mike Thelwall’, the hyper-productive author of over 100 papers in the field, who, luckily, also has an unusual name.

Whilst there were a similar number of documents at the start, and both have grown at an extremely fast rate, webometrics has grown at the faster rate. Scientific proof that there is more to webometrics than Mike Thelwall!

It would be nice if Google opened up some other indexes so that more points could be added to the graph.
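The comparison above boils down to dividing one growth factor by another, so that changes in index size cancel out. A minimal sketch in Python follows; the function names and the hit counts are hypothetical placeholders, not the figures behind the graph.

```python
def growth_factor(count_then: int, count_now: int) -> float:
    """Return how many times larger the later hit count is than the earlier one."""
    if count_then <= 0:
        raise ValueError("the earlier count must be positive")
    return count_now / count_then


def relative_growth(term_then: int, term_now: int,
                    baseline_then: int, baseline_now: int) -> float:
    """Growth of a term divided by the growth of a baseline term.

    A value above 1 means the term grew faster than the baseline.
    """
    return growth_factor(term_then, term_now) / growth_factor(baseline_then, baseline_now)


# Hypothetical hit counts, invented for illustration only.
ratio = relative_growth(term_then=100, term_now=50_000,
                        baseline_then=100, baseline_now=20_000)
print(f"relative growth: {ratio:.2f}")
```

With only two index snapshots this gives a single ratio; more snapshots would turn it into a series.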

September 24, 2008

Google Insights for Search: Term order is all important!

Filed under: Google,Insights for Search,Iran,cybermetrics,webometrics — admin @ 9:40 am

Unfortunately most poor academics don’t have access to the same data as Bill Tancer; instead we generally have to make do with the crumbs from Google and the other search engines. This morning, however, I was reminded of how careful we need to be when using the tools the search engines offer us.

Today I was using Google Insights for Search to compare the terms cybermetrics and webometrics. Whilst I am part of the Statistical Cybermetrics Research Group, as a group we tend to discuss ‘webometrics’. Google Insights for Search clearly shows that whilst there was once a time when cybermetrics ruled supreme, webometrics is now far more popular.

More importantly, however, I also noticed that Iran wasn’t highlighted on the map for the term ‘webometrics’, despite Iran having a (relatively) strong webometrics community.

Basically, because Iran does not appear in the results for ‘cybermetrics’ (which was my first search term), it is not calculated for ‘webometrics’. If I had added the term ‘webometrics’ first, then the term ‘cybermetrics’ the map would have looked very different:

The solution would seem to be to include a universal search term first, but those that immediately spring to mind are not necessarily the sort that you would want appearing on a corporate slide-show.
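The term-order effect can be mimicked with a toy model. The sketch below assumes, as inferred above, that only regions with data for the first term are drawn for every later term; this is a guess at the behaviour, not Google's documented method. The function name, the region names other than Iran, and all the scores are hypothetical.

```python
def regions_on_map(terms, interest_by_term):
    """Toy model: a region is drawn only if it has data for the FIRST term."""
    if not terms:
        return {}
    # Regions eligible for the whole comparison are fixed by the first term.
    eligible = set(interest_by_term.get(terms[0], {}))
    return {
        term: {
            region: score
            for region, score in interest_by_term.get(term, {}).items()
            if region in eligible
        }
        for term in terms
    }


# Hypothetical interest scores, invented for illustration only.
interest = {
    "cybermetrics": {"Region A": 100, "Region B": 60},
    "webometrics": {"Iran": 100, "Region A": 40, "Region B": 30},
}

cyber_first = regions_on_map(["cybermetrics", "webometrics"], interest)
web_first = regions_on_map(["webometrics", "cybermetrics"], interest)

print("Iran" in cyber_first["webometrics"])  # False: dropped by the first term
print("Iran" in web_first["webometrics"])    # True: webometrics sets the regions
```

Under this assumption, a first term with data in every region would keep every region on the map, which is why a universal search term looks like the fix.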

September 5, 2008

Webometrician v. Webometrician: Who will conquer the world first?

One of the joys of Google Analytics is watching the map slowly filling up as you get traffic from different parts of the world. However, whilst North America and Western Europe quickly fill up, other parts of the world have been more reluctant to visit my Webometric Thoughts. Almost a year after I started using Google Analytics there has still been no traffic from many countries in Africa.

Oh, what a tangled web we weave… is wondering how to start filling his map, hoping to attract visitors from Ukraine, Belarus, Georgia, Armenia, and Moldova. Whilst I am also waiting for some traffic from Belarus and Georgia, at least I can sleep comfortably in the knowledge of 28 visits from Ukraine, 2 from Armenia, and 1 from Moldova.

Whilst the gauntlet has been thrown down by Kim at Oh, what a tangled web we weave…, I would expect the Belarusian, Georgian, and Armenian traffic to arrive by the end of the week (especially as I have sensibly included the demonyms as well as country names). And whilst Kim has decided to include the terms Google and Facebook in his post to increase the likelihood of traffic, I’m going with the Google Insights for Search suggestions of Minsk, Tbilisi, and Yerevan.

Update: Ooops…just realised I was chasing Armenian traffic after already having had Armenian traffic. So it should really say “I would expect the Belarusian, Georgian, and EXTRA Armenian traffic to arrive by the end of the week”

August 21, 2008

Iterasi: Create your own archive!

Filed under: archive,national archives,web archives,webometrics — admin @ 12:27 pm

The UK’s web archive is pretty rubbish, so Iterasi (highlighted by TechCrunch) is a great addition to the web.

Rather than merely bookmarking a URL, you can archive the actual page, and can continue archiving the page on a regular basis if you so wish. The only downsides to the site are that it only allows you to archive on a daily basis (for the front pages of news sites you may want to archive more regularly), and it only archives when your computer, with its list of scheduled saves, is turned on.

The potential for webometric studies is obvious: it would seem that even the most technologically incompetent of us can now easily collect longitudinal data. For example, Google searches may be collected on a daily basis to see how the results or the number of hits change…and once you have archived a page, it’s very simple to then embed the page:

It also has potential for bloggers; when they discuss a page or story bloggers can now be sure that their readers will have access to the page that they saw rather than an updated version. How content providers will react to the archiving of their content is yet to be seen.

August 14, 2008

August 6, 2008

Google Insights for Search: What next?

Filed under: Barack Obama,Google,Insights for Search,webometrics — admin @ 1:17 pm

In addition to Google Trends, Google are now offering Google Insights for Search (http://google.com/insights/search/#) (via TechCrunch). Not only can you filter the terms by category, for example helping to distinguish between Apple (Computers & Electronics) and apple (Food & Drink), but it will also give a nice visual representation of the geographic data.

We can now quickly see that Iran is the country most interested in webometrics:

The maps also offer a whole new type of vanity searching. The “David Stuart” brand has yet to make major inroads in Africa, Asia or South America. I was grateful, however, to find that my own vanity searches had not overly affected the results (at a city level London is the hub rather than Wolverhampton).

Some bloke called Barack Obama, on the other hand, seems to have made inroads all over, with the exception of the Middle East.

The obvious question, based on the directory structure of the Insights for Search URL (http://google.com/insights/search/#), is what other insight services are Google going to offer? Insights for Maps? Insights for Shopping? Insights for News?

Webometricians are NOT Web Celebrities!!

Filed under: Celebrity Meter,Wired,webometrics — admin @ 9:45 am

When it comes to being a web celebrity, it is not surprising to find that webometricians are near the bottom of the pile; a fact I blame on our spending too much time counting other people’s links rather than creating content worth linking to. Anyway, Wired have created a nifty little application (highlighted by Media Futurist) that can help you determine your ‘web celebrity’ score by using data from Google’s Social Graph.

At the moment it only bases your score on MySpace, Twitter, and your blog/web site, so your score depends a lot on how much you use these sites; my thousands of Facebook friends and hundreds of Delicious bookmark followers mean nothing. Nonetheless, true to Webometric Thoughts fashion, here is a comparison of the three main webometrics blogs/bloggers (only using their Twitter and blog addresses):

Holmberg’s Oh what a tangled web we weave… :
2 (twitter) + 4 (blog) = 6
Thelwall’s Webometrics Blog :
10 (twitter) + 15 (blog) = 25
My Webometric Thoughts:
6 (twitter) + 7 (blog) = 13

To give these numbers a bit of perspective, Barack Obama’s current ranking is 9,069 (4,509 without MySpace). Thelwall may have won this battle, but we are all losing the war. It would be interesting to see, however, how the Celebrity Meter compares with a qualitative evaluation of web celebrity, such as Forbes’ list of the top 25 web celebrities.

Whilst ‘web celebrity’ is just a bit of fun, it does show the potential of the Google Social Graph data, and as far as I am aware no webometrician has used it to any practical purpose yet.

July 25, 2008

A Webometric Thesis

Filed under: link analysis,webometrics — admin @ 10:33 am

The finishing of a PhD is more of a whimper than a bang. It has been seven months since I handed in my thesis, and despite having had only the most minor of revisions (total time approximately 4 hours), I have only just received the certificate for my masterpiece:

Whilst there are often complaints about the inability of government to work as effectively as ‘the marketplace’, we should all be grateful that academia is not in charge of the country; nothing would happen for years on end.

As many weeks have also passed since I sent my thesis to the University’s electronic repository, and it still hasn’t appeared online, I have decided to put it online myself.

Web Manifestations of Knowledge-based Innovation Systems
Innovation is widely recognised as essential to the modern economy. The term knowledge-based innovation system has been used to refer to innovation systems which recognise the importance of an economy’s knowledge base and the efficient interactions between important actors from the different sectors of society. Such interactions are thought to enable greater innovation by the system as a whole. Whilst it may not be possible to fully understand all the complex relationships involved within knowledge-based innovation systems, within the field of informetrics bibliometric methodologies have emerged that allow us to analyse some of the relationships that contribute to the innovation process. However, due to the limitations in traditional bibliometric sources it is important to investigate new potential sources of information. The web is one such source. This thesis documents an investigation into the potential of the web to provide information about knowledge-based innovation systems in the United Kingdom.

Within this thesis the link analysis methodologies that have previously been successfully applied to investigations of the academic community (Thelwall, 2004a) are applied to organisations from different sections of society to determine whether link analysis of the web can provide a new source of information about knowledge-based innovation systems in the UK. This study makes the case that data may be collected ethically to provide information about the interconnections between web sites of various different sizes and from within different sectors of society, that there are significant differences in the linking practices of web sites within different sectors, and that reciprocal links provide a better indication of collaboration than uni-directional web links. Most importantly the study shows that the web provides new information about the relationships between organisations, rather than just a repetition of the same information from an alternative source. Whilst the study has shown that there is a lot of potential for the web as a source of information on knowledge-based innovation systems, the same richness that makes it such a potentially useful source makes applications of large scale studies very labour intensive.

Obviously the above abstract will have all but the greatest dullard champing at the bit, and I have therefore made it available in both PDF and Word Document formats.
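For anyone who prefers code to abstracts, the distinction between reciprocal and uni-directional links can be sketched in a few lines of Python. This is an illustration only, not the method used in the thesis; the function name and the example.* site names are hypothetical.

```python
def split_links(links):
    """Separate directed site-to-site links into reciprocal pairs and one-way links."""
    # Ignore self-links and duplicates.
    link_set = {(source, target) for source, target in links if source != target}
    # A pair of sites is reciprocal when each links to the other.
    reciprocal = {
        frozenset((source, target))
        for source, target in link_set
        if (target, source) in link_set
    }
    one_way = {
        (source, target)
        for source, target in link_set
        if (target, source) not in link_set
    }
    return reciprocal, one_way


# Hypothetical inter-site links, invented for illustration only.
links = [
    ("university.example", "company.example"),
    ("company.example", "university.example"),
    ("university.example", "agency.example"),
]

reciprocal, one_way = split_links(links)
print(len(reciprocal), "reciprocal pair(s);", len(one_way), "one-way link(s)")
```

Reciprocal pairs are stored as unordered sets so that A↔B is counted once rather than twice.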

June 10, 2008

Is the web linguistically on the left or right?

I am currently in the middle of reading David Crystal’s (2006) ‘Language and the Internet’, an interesting book that, when it started mentioning style guides, got me wondering about whether style guides could be used to determine whether the UK web space was politically on the left, or on the right. The leading broadsheets from both sides of the political debate have publicly available style guides (i.e., The Telegraph and The Guardian), and the differences could be used as the basis of such a linguistic-webometric investigation.

My personal favourite style guide section is The Telegraph’s Banned Words. Whilst the banning of terms such as ‘Europhobe’ has obvious political motivations, you have to wonder whether it was really necessary to explicitly ban referring to ‘perverted Scout leaders’ (whilst Google Trends does not show the phrase to be endemic, that may be because of the Telegraph’s quick action). It is interesting to note, however, that despite the Telegraph’s authoritarian values, they seem to be very lax with their own language: the supposedly banned ‘mass exodus’ was used only a few days ago. Surely there will be letters to the editor!

Unfortunately these days search engines try to be helpful, and ignore many of the differences. For example, ‘Yahoo’ and ‘Yahoo!’ are both treated as the same, when any fool would know that the exclamation mark reflects the searching for more conservative opinions on the search engine. It would be nice to be able to turn a search engine’s ‘helpful’ features off occasionally.
