Tuesday, April 08, 2008
Hi commons researchers,
I've just analysed how well Google and Yahoo can search for commons content (mostly Creative Commons, because that's what their advanced search interfaces offer), and thought I'd share. Basically it's an update of my research from Finding and Quantifying Australia's Online Commons. I hope it's all pretty self-explanatory. Please ask questions, and of course point out any flaws in my methods or examples.
Also, I just have to emphasise the "No" in Yahoo's column in row 1: yes, I am in fact saying that the only licences Yahoo recognises are the US/unported ones, which means it ignores the vast majority of Creative Commons licences. (That leads on to a whole other conversation about quantification, but I'll leave that for now.)
(I've formatted this table in Courier New so it should come out well-aligned, but who knows.)
Feature                       | Google | Yahoo |
------------------------------+--------+-------+
1. Multiple CC jurisdictions  | Yes    | No    | (e.g.)
2. 'link:' query element      | No     | Yes   | (e.g. G, Y)
3. RDF-based CC search        | Yes    | No    | (e.g.)
4. meta name="dc:rights" *    | Yes    | ? **  | (e.g.)
5. link-based CC search       | No     | Yes   | (e.g.)
6. Media-specific search      | No     | No    | (G, Y)
7. Shows licence elements     | No     | No    | ****
8. CC public domain stamp *** | Yes    | Yes   | (e.g.)
9. CC-(L)GPL stamp            | No     | No    | (e.g.)
* I can't rule out that Google's result here actually comes from <a rel="license"> on the links to the licence (as described here: http://microformats.org/wiki/rel-license).
** I don't know of any pages that have <meta name="dc:rights"> metadata (or <a rel="license"> metadata?) but don't have links to licences.
*** Insofar as the appropriate metadata is present.
**** (i.e. doesn't show which result uses which licence)
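To make rows 1, 4 and 5 a bit more concrete, here is a rough Python sketch of how you might check which licence markers a page actually carries before using it as an example. It is only my illustration, not how either search engine works: the class and function names are made up, it only looks at <a rel="license">, <meta name="dc:rights"> and plain links to creativecommons.org, and it does not look for embedded RDF at all. The jurisdiction helper assumes licence URLs of the simple form /licenses/<code>/<version>/<jurisdiction>/.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LicenceSignals(HTMLParser):
    """Collects licence markers from a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rel_license = []  # hrefs of <a rel="license"> links
        self.dc_rights = []    # content of <meta name="dc:rights">
        self.cc_links = []     # any <a> pointing at creativecommons.org

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            href = attrs.get("href") or ""
            # rel can hold several space-separated values
            if "license" in (attrs.get("rel") or "").lower().split():
                self.rel_license.append(href)
            host = (urlparse(href).hostname or "").lower()
            if host == "creativecommons.org" or host.endswith(".creativecommons.org"):
                self.cc_links.append(href)
        elif tag == "meta" and (attrs.get("name") or "").lower() == "dc:rights":
            self.dc_rights.append(attrs.get("content") or "")


def cc_jurisdiction(url):
    """Return the jurisdiction code in a CC licence URL, or None.

    Assumes the simple path form /licenses/<code>/<version>/<jurisdiction>/;
    None means no jurisdiction segment was found (e.g. an unported licence).
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) >= 4 and parts[0] == "licenses":
        tail = parts[3]
        # Skip deed/legalcode page names, which are not jurisdictions.
        if tail != "legalcode" and not tail.startswith("deed"):
            return tail
    return None
```

To use it, create a LicenceSignals instance, call feed() with the page's HTML, and inspect the three lists. A page that is a clean example for row 4 would ideally have something in dc_rights and nothing in the other two, although (as footnote ** says) I don't know of any such page.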
Notes about example pages (from rows 1, 3-5, 8-9):
- To determine whether a search engine can find a given page, first look at the page and pick out enough snippets of content to build a query that definitely returns that page, then test that query to make sure the search engine can find it (e.g. '"clinton lies again" digg' for row 8). Then run the same query as an advanced search with Creative Commons search turned on and see whether the page is still found.
- The example pages should all be specific to the feature they exemplify. E.g. the Phylocom example from row 9 has all the right links, logos and metadata for the CC-GPL, and in particular has no other Creative Commons licence present, yet it does not show up in the Creative Commons search results.
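The two-step test in the first note can be written down as a small Python sketch. This is just to pin down the logic: the search argument is a hypothetical function you would have to supply yourself (I'm not assuming any real Google or Yahoo API here), taking a query and a cc_only flag and returning a list of result URLs.

```python
def check_cc_recognition(page_url, query, search):
    """Classify one example page for one search engine.

    search(query, cc_only) is a hypothetical callable returning result URLs.
    """
    # Step 1: the ordinary search must find the page, otherwise the
    # query is not specific enough (or the page is not indexed).
    if page_url not in search(query, cc_only=False):
        return "inconclusive"
    # Step 2: repeat the same query with Creative Commons search on.
    if page_url in search(query, cc_only=True):
        return "recognised"
    return "not recognised"
```

An "inconclusive" result says nothing about licence support; it just means you need a better query before the comparison is meaningful.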
Labels: ben, Creative Commons, quantification, search