Wednesday, April 30, 2008
Google Code Search lets you search for source code files by licence type, so of course I was interested in whether this could be used for quantifying indexable source code on the web. And luckily GCS lets you search for all works with a given licence. (If you don't understand why that's a big deal, try doing a search for all Creative Commons licensed work using Google Search.) Even better, using the regex facility you can search for all works! You sure as heck can't do that with a regular Google web search.
Okay, so here are the latest results, including hyperlinks to searches for you to try them yourself:
- all (by regex: .*) : 36,700,000
- gpl : 8,960,000
- lgpl : 4,640,000
- bsd : 3,110,000
- mit : 903,000
- cpl : 136,000
- artistic : 192
- apache : 156
- disclaimer : 130
- python : 108
- zope : 103
- mozilla : 94
- qpl : 86
- ibm : 67
- sleepycat : 51
- apple : 47
- lucent : 19
- nasa : 15
- aladdin : 9
And here's a spreadsheet with graph included. However, note the discontinuity (in absolute and trend terms) in that (logarithmic) graph between the approximate results (the rounded counts, from cpl upwards) and the specific results (the exact counts, from artistic downwards), which suggests Google's approximations are not very good.
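If you'd rather see that discontinuity in numbers than squint at a graph, here's a rough sketch in Python. It takes the per-licence counts from the list above (leaving out the "all" total, since it isn't a licence), ranks them, and finds the biggest drop between neighbours on a log10 scale. The function name `largest_log_gap` is just something I made up for illustration, and this only measures the jump in the reported counts; it doesn't tell you anything about how Google arrives at them.

```python
import math

# Per-licence result counts copied from the list above.
# The "all" total is left out because it is not a licence.
counts = {
    "gpl": 8_960_000,
    "lgpl": 4_640_000,
    "bsd": 3_110_000,
    "mit": 903_000,
    "cpl": 136_000,
    "artistic": 192,
    "apache": 156,
    "disclaimer": 130,
    "python": 108,
    "zope": 103,
    "mozilla": 94,
    "qpl": 86,
    "ibm": 67,
    "sleepycat": 51,
    "apple": 47,
    "lucent": 19,
    "nasa": 15,
    "aladdin": 9,
}


def largest_log_gap(counts):
    """Return (upper, lower, gap) for the biggest log10 drop
    between two neighbouring licences when ranked by count."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    best = None
    for (name_a, a), (name_b, b) in zip(ranked, ranked[1:]):
        gap = math.log10(a) - math.log10(b)
        if best is None or gap > best[2]:
            best = (name_a, name_b, gap)
    return best


upper, lower, gap = largest_log_gap(counts)
print(f"largest drop: {upper} -> {lower}, {gap:.2f} orders of magnitude")
```

On these numbers the biggest drop is from cpl to artistic, which is exactly where the rounded counts stop and the exact counts start.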
Labels: ben, free software, licensing, quantification, search