One of the hundreds of posts in my feed-reader this morning was about the British Library electronic theses service (via SCIT blog). As my own thesis should be included, I decided to indulge in a bit of vanity searching. Result: EThOS has a long way to go.
I would expect my thesis to turn up for the term ‘webometrics’; in fact, it is about the only term for which someone might actually want to read it. Unfortunately, the only webometric thesis listed belongs to Xuemei Li.
My thesis does, however, turn up for the wholly inappropriate ‘bibliometrics’.
Seemingly the reason for my appearance under ‘bibliometrics’ and not ‘webometrics’ is that ‘bibliometrics’ appears in my abstract whereas ‘webometrics’ does not. Whilst this may seem reasonable at first, theoretically the University of Wolverhampton are taking part in the project, and their record includes a number of keywords carefully selected by me, including ‘webometrics’. The British Library also fails to provide a link to my thesis, despite it being scattered over the web like confetti: “Not yet available for download”.
Young academics brought up on Google Scholar, with its full-text searching and links to the numerous copies on the web, are unlikely to see the value in EThOS and its traditional OPAC style. I’d like to see an electronic thesis service that separates the wheat from the chaff, with full-text searching and links to the documents, and I believe that librarians could aid retrieval by classifying such documents, but this is not what EThOS is currently offering. It’s still in beta and likely to improve, but it has a frighteningly long way to go, and you do wonder whether they should have buddied up with one of the big search engines to produce a more user-friendly version.
Archiving the web is a massive job, and whilst the Internet Archive does as good a job as can be expected from a single centralised organisation, there really is a need for better national web archives. I am brought to the beginnings of a little rant by ResourceShelf highlighting Canada’s new government web archives. Whilst I am sure that this will do a great job of archiving the Canadian government’s web sites, it is a drop in the ocean of the number of Canadian web sites that could and should be kept, and seems an awfully long time coming.
However, the British really don’t have a leg to stand on when it comes to complaining about archives, as our own archive is a particularly sorry affair, based on crawls of fewer than 3,000 web sites. Rather than following the Canadian route and covering a particular domain exhaustively, it instead selects various sites (with the permission of their owners) of interest to the consortium members. As they say themselves, “there is a danger that invaluable scholarly, cultural and scientific resources will be lost to future generations”. Personally, I feel these archives do little to stop the vast majority of resources being lost.
We wouldn’t accept a national library that contained such a pitiful selection of books, but somehow we allow such pathetic web archives to continue. In the UK I would like to see:
-The British Library given the right to copy every web page, in the same way as it has the right to a copy of every book (there should be no need to ask for permission, and selection misses too much).
-It should be provided with the expertise and money to archive the whole of the .uk domain.
-If necessary, to appease those who confuse the public world of the web with the private mutterings of a conversation and the thoughts of a diary, it should allow pages to be deep-archived* on request for a period of time, rather than permanently deleted.
Whilst I am sure that institutions such as the British Library are trying to improve the UK’s web archive, the current outputs seem remarkably underwhelming for a supposedly rich nation at the forefront of the world’s knowledge economy.
*deep-archived: I couldn’t think of a better term for something that is in the archive but not public; ‘private’ seems to have different connotations.