Search Engine Tips

Hakia, a semantic search engine, uses Yahoo’s BOSS and does a good job of neatly clustering results (e.g. try a search for a country); it also has reasonable ranking – something many new search engines don’t manage well. By contrast, the much-hyped Powerset, another recent semantic search engine, is simply disappointing.

Cluuz aims at graphically clustering search results. So far, I have been sceptical about visualizations of search results, because mind-map-style graphs (e.g. in Thinkmap or Quintura) are either rather trivial or useless. But Cluuz (or, similarly, KartOO) clusters sources rather than semantic characteristics. Not that this is entirely new – Clusty (previously Vivisimo; I’ve used the example in the post on information literacy vs usability) does it in the form of a regular list – but the focus on sources is one I much appreciate, and showing relationships between sources might prove promising.

A last tip for today: searchme‘s sequential presentation of search results is too playful for my liking, because I need a quick overview to decide what I want to take a closer look at. However, a really nice feature is the integration of stacks, which allows you to create visual bookmarks – something completely different, because it is not about seeking new objects, but rather about showing known ones (cf. Theresa Neil’s extensive post on seek or show).


Introductions to the Semantic Web

The idea of the Semantic Web has been around for a long time. Tim Berners-Lee articulated it in his plenary talk at the first WWW conference in 1994 and published a famous article in Scientific American in 2001. Basically, the idea of the Semantic Web is to give data a format in which computers can process its meaning. However, the idea hasn’t really picked up speed in all those years. This has changed recently, and a series of articles (incidentally published on netzwertig, a blog produced by Zeix’ sister company Blogwerk) explains the nuts and bolts of the Semantic Web in plain German. Part 1 gives a general introduction (Semantisches Web Teil 1: Was steckt hinter dem Begriff? – «What’s behind the term?»), part 2 explains the technical background (Die technische Umsetzung – «The technical implementation») and part 3 gives practical examples (konkrete Anwendungsbeispiele – «concrete examples of use»).

I’ve been looking for English equivalents to these articles. The closest I’ve come across so far is Read Write Web’s The Road to the Semantic Web, which explains why the Semantic Web could be important to us («The promise is that we will be doing less of what we are doing now – namely sifting through piles of irrelevant information») and gives a short introduction to the data formats used for computer processing. Two later articles on the same blog go into more detail about why the implementation of the Semantic Web is proving so difficult. Not only is the technical background hard to understand (Difficulties with the Classic Approach), but transforming the data into a computer-readable format is a lot of work, and so far the market has not rewarded those who take the pains to do it (Top-Down: A New Approach to the Semantic Web).
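The data format at the heart of the Semantic Web is RDF, which expresses every fact as a subject–predicate–object triple. As a rough illustration of the principle – a minimal sketch in plain Python, with made-up data, not the actual RDF syntax from the articles – a machine can store such triples and answer pattern questions about them:

```python
# RDF-style facts: each statement is a (subject, predicate, object) triple.
# The data here is illustrative, not taken from any real vocabulary.
triples = [
    ("TimBernersLee", "invented", "WorldWideWeb"),
    ("WorldWideWeb", "describedIn", "ScientificAmerican2001"),
    ("TimBernersLee", "type", "Person"),
]

def query(facts, s=None, p=None, o=None):
    """Return every triple matching the pattern; None acts as a wildcard."""
    return [t for t in facts
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Everything we "know" about Tim Berners-Lee:
print(query(triples, s="TimBernersLee"))
# → [('TimBernersLee', 'invented', 'WorldWideWeb'), ('TimBernersLee', 'type', 'Person')]
```

Because the meaning sits in the structure rather than in running text, a program can answer «who is of type Person?» without understanding a single sentence – which is exactly the promise (and the data-entry burden) the articles describe.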

The many comments on «Difficulties with the Classic Approach» show that most people attribute the lack of success of the Semantic Web idea to the difficulties of its practical implementation: «It’s way too technical and scientific and not really practical for the mere mortal». Despite its capital S, the subject of semantics plays a minor role in the discussion. And here, in my opinion, lies the crux of the matter: for highly structured and standardized data such as addresses, people, books etc., the corresponding metadata are simple to generate from the semantic point of view, and accordingly, these are the areas in which commercial tools are evolving. For the rest of the contents of the web, the idea of mapping ontologies against each other is daunting at best. Comment no. 17 on the post mentioned above gives some practical examples of the difficulties encountered even with structured data. And another trend on the web, tagging, takes exactly this fuzziness of meaning into account, particularly stressing the importance of connotations, i.e. emotional associations with words, for human beings. I hope to deal with that subject on this blog soon.

Don’t make me think?

The article Is Google making us stupid?, published in The Atlantic, examines the way the Web influences how we read – and think. Many users have the impression that their attention span for longer articles and books has decreased with their use of the Web, and the author supports this impression with results from neurological and psychological experiments. The way we skim information on the Web influences the way our brain works:

We can expect […] that the circuits [inside our brain] woven by our use of the Net will be different from those woven by our reading of books and other printed works.

As a usability consultant, I feel I am not without responsibility for this development. We support the hectic, unconcentrated behavior of our users and proclaim rules for writing online texts quite contrary to what we learnt at school: not to substitute nouns with synonyms to make a text more varied, not to use figures of speech or word-play (not to mention irony or allusions), because they are imprecise or ambiguous. Not only do we support the behavior the author of the article bemoans; so do trends in artificial intelligence which treat terms open to interpretation as bugs to be fixed. What is at stake is that

In the quiet spaces opened up by the sustained, undistracted reading of a book, or by any other act of contemplation, for that matter, we make our own associations, draw our own inferences and analogies, foster our own ideas.

So see for yourself whether you are still able to concentrate on an article of over 4000 words (more than 5 A4 pages on my printer), peppered with historical examples in the best scholarly tradition. If you can, I’m sure you will enjoy it – and that it will make you think.

Zotero research tool

I admit I’m stretching the category of «information literacy». But I’ve just discovered Zotero, and I’m enthusiastic about it. It’s a Firefox extension to import, manage, annotate, tag, search for, export, … research sources. With one click, it imports bibliographic data from library catalogs as well as Amazon, Google Books and others, and the same functions can be used for capturing websites.

Is there a future for books?

A study of the reading habits of Germans produced some interesting results, e.g.

  • people who spend a lot of time on the Web read more books than others, and vice versa
  • people who read a lot spend more time reading than they did in the past (for which I couldn’t find a definition), while people who read little indicated they read even less, i.e. the gap between the «information rich» and the «information poor» is widening.
  • Harry Potter – not surprisingly – has a great effect on reading. Kids who read Harry Potter read significantly more books than those who don’t read this series.

The study (in German): PricewaterhouseCoopers, Haben Bücher eine Zukunft? («Do books have a future?»), found on Infobib.

Metadata revival?

Metadata is a big thing with archivists and other people concerned with context, but I must admit that in all my professional years, I have never worked on a web project which actually used the Dublin Core Metadata Element Set. The most probable reason people don’t seem to bother much about metadata – at least in a standardized form – is that popular search engines don’t seem to take them into account. Or at least they didn’t until recently.
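For readers who have never seen it in the wild: Dublin Core metadata usually sits in a page’s head as `<meta>` tags with names like DC.title or DC.creator. A minimal sketch of reading such tags with Python’s standard html.parser – the page snippet and its values are invented for illustration:

```python
from html.parser import HTMLParser

# Hypothetical page head carrying Dublin Core metadata.
PAGE = """
<head>
  <meta name="DC.title" content="Metadata revival?">
  <meta name="DC.creator" content="Example Author">
  <meta name="DC.date" content="2008-06-30">
</head>
"""

class DublinCoreParser(HTMLParser):
    """Collect every meta tag whose name attribute starts with 'DC.'."""
    def __init__(self):
        super().__init__()
        self.dc = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            a = dict(attrs)
            name = a.get("name", "")
            if name.startswith("DC."):
                self.dc[name] = a.get("content", "")

parser = DublinCoreParser()
parser.feed(PAGE)
print(parser.dc)
```

The point of the standard is exactly this machine-readability: a crawler doesn’t have to guess a title or author from the page text, because the creator has declared them in a predictable place – which is also, as discussed below, why search engines distrust these self-declarations for ranking.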

Let’s have a look at a (very!) brief history of search result designs:

Search engines I remember from my early web experience returned results looking somewhat like this (our government portal hasn’t made much progress since):

Search result for «passport» on the government portal

Then up comes Google and introduces a design which has become pretty much standard:

Google search results page

More recently, both Google and Yahoo have started introducing structured search results:

Yahoo Search Gallery, Country Profile Armenia

Search results are also increasingly shown in clusters based mainly on format (text in general, news, entries from encyclopedias; images, video etc.):

Yahoo India search results

So adding metadata in some kind of standardized form does seem to be a recent trend for

  • clustering search results and
  • displaying search results.

Metadata provided by the creators of web sites are used for these displays. However, these metadata are explicitly not used for the search algorithms themselves, as an article on Yahoo and the Future of Search reports. Metadata provided by creators tend to bias the outcome, and the analysis of a broader text corpus by powerful search engines provides more significant results than metadata out of context.

Still, the increased use of metadata points to interesting directions:

  • Search results are becoming more context-sensitive. Metadata help the user choose the appropriate context, e.g. for disambiguation or clarification of a query. Search interfaces are taking the iterative nature of search into account, getting closer to the process of questions and answers users need to clarify their information needs.
  • Possible actions after having found the desired content are beginning to be transferred to the search sites (search engines becoming portals may – or may not – be part of the development). Users can view details, maps or reviews, check opening hours, buy tickets or conduct site-search without having to leave the search results page. This is enabled by a deeper integration of applications into results.

Google Search for NASA

Site-search from search results page

Making the complicated simple

My motto as an information architect:

Making the complicated simple (quotation by Charles Mingus)

From inspireUX – thanks for the hint, Jan.