Tag Archives: google

What price a byline? (Or: what’s wrong with Knol)

A reader criticised my frequent referencing of Wikipedia in my last post, on the basis that everyone knows what WP is and that indeed some of us have Firefox extensions[1] to make quickly consulting it easy. I admitted he had a point, prompting another reader to protest that it doesn’t matter where the links go, as long as they’re informative and well-written. The degree to which they were both right was strikingly indicative of how far WP has come. Given that it’s so often the first search result on Google for a huge number of queries, making explicit links to it can seem like adding links to dictionary.com for longer semantemes[2]. And the reason I reference it so often is that its collective writing style and usual accuracy are ideal for a quick introduction to what may be unfamiliar ground.

But its status as the #1 go-to place for so many Google queries didn’t go unnoticed in Mountain View. Yesterday Google finally released their long-in-development About.com-mimic Knol. A “Knol” is an unnecessary neologism coined by Google to mean a “unit of knowledge”, but the basic idea seems to be to compete with Wikipedia on the authoritative content front, by meeting one of the oft-heard (albeit not so much anymore, if only due to exhaustion) criticisms of WP: that you can’t trust it because you don’t know who wrote it. Knol’s points of difference with WP are then as follows:

  • You can have more than one article on a topic.
  • Articles are signed by their authors.
  • Advertising will be displayed, and the revenue will be split with authors.
  • The level of collaborative editing allowed on each article is controlled by the author, as is the licensing.

I’ve been reading through a few of its articles, and what’s striking me is what they’ve lost by not having universal editing. So often WP was compared to the Encyclopædia Britannica. Knol tries to compromise between the two, but in doing so completely erodes the role of the editor: the person who doesn’t actually write the content, but polishes it to a publishable standard, and makes it consistent with the rest of the corpus of work. Today’s featured Knol is on Migraines and Migraine Management. It’s written by a neurologist, so you know it’s authoritative, and it doesn’t allow public editing, so you know it hasn’t been tampered with.

But compare it with WP’s article on Migraines, and you’ll see just how badly it needs an editor. It’s written as if it’s intended for paper, with non-hyperlinked cross references: “Migraine is defined as a headache that [TABLE 2]:”. “[TABLE 2]” is a JPEG image at reduced size. There’s no reason for that to be an image rather than an actual HTML table. (Additionally, Google, there’s no reason for the images to be inline with the content like that. Consider a Tufte-like layout, where the tables and references and footnotes can go out to the side.)

Throughout Knol you’ll find all sorts of bad design practice. I swear I saw image hotlinking in one article before. But in particular, a lot of the seed articles seem to be HTML dumps of articles already written by medical professionals, like this one. It’s closed collaboration, so unlike WP, you can’t just drop in and quickly format that into something presentable (at present there’s no change in style, the intra-page headings are just capitalised, there’s an odd amount of whitespace, and the page title itself isn’t capitalised).

There are two big surprises here, given that this is a Google project, and how long it’s been in development there. And if they don’t fix them, I fear an epic failure.

The first is that they’ve provided such an unstructured writing environment. If you’re trying to create a body of high quality written material, there are ways you can structure your editing environment so that content conforms to certain styles and expectations. It’s particularly in Google’s interest to do so, since as they keep telling the SEO world, well-structured HTML and documents are easier for them to index and search. And yet Knol’s featured Migraines article has swathes of tabular content in the un-indexable, inaccessible JPEG format.
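To make the contrast concrete, here is a minimal sketch of what a structured editor could emit for tabular content instead of a JPEG. The function name and the sample rows are made up for illustration; they are not Knol's tooling or the content of the article's actual table.

```python
from html import escape


def rows_to_html_table(header, rows):
    """Render tabular data as a plain HTML table.

    Text inside <th>/<td> cells can be indexed by a crawler and read by a
    screen reader, which a picture of a table cannot.
    """
    head_cells = "".join(f"<th>{escape(str(h))}</th>" for h in header)
    body_rows = []
    for row in rows:
        cells = "".join(f"<td>{escape(str(c))}</td>" for c in row)
        body_rows.append(f"<tr>{cells}</tr>")
    body = "".join(body_rows)
    return (
        "<table>"
        f"<thead><tr>{head_cells}</tr></thead>"
        f"<tbody>{body}</tbody>"
        "</table>"
    )


# Hypothetical placeholder rows, purely to show the shape of the output.
example_table = rows_to_html_table(
    ["Criterion", "Description"],
    [["Example criterion A", "Placeholder text"],
     ["Example criterion B", "Text with <angle brackets> & ampersands"]],
)
```

The escaping step matters as much as the markup: an editing environment that accepts structured rows can guarantee valid, searchable output no matter what the author types.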

The second is much more subtle and can’t be fixed with as much of a technology patch as the first can. Google have failed to realise that often the most expert authors are going to be simultaneously the least equipped to properly format and polish their own documents (whether it be due to lack of technical skills, or time), and also the least willing to submit their work to editorial changes from the unwashed anonymous masses. The fix for this I think will involve a recognition and separation of the two types of editing that happen on Wikipedia: authoring or fixing of content; and editing for quality control (fixing grammar, spelling, style, adding useful metadata to a document). Then build a system to acknowledge the good editors, not just the good authors. Then encourage authors to allow editorial changes from recognised quality editors. In fact, drop the “closed collaboration” option altogether.
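As a rough sketch of that separation, assuming a made-up permission model (none of these class names, fields, or thresholds come from Knol or Wikipedia), the two kinds of edit could be told apart and gated differently:

```python
from dataclasses import dataclass, field
from enum import Enum


class EditKind(Enum):
    CONTENT = "content"    # authoring or fixing what the article says
    COPYEDIT = "copyedit"  # grammar, spelling, style, metadata


@dataclass
class Article:
    author: str
    # Hypothetical author-controlled switch for outside copyedits.
    allow_recognised_copyedits: bool = True


@dataclass
class EditorRegistry:
    # Hypothetical reputation store: editor name -> accepted copyedit count.
    accepted_copyedits: dict = field(default_factory=dict)
    recognition_threshold: int = 10  # arbitrary illustrative cut-off

    def record_accepted(self, editor):
        """Credit an editor for one accepted copyedit."""
        self.accepted_copyedits[editor] = self.accepted_copyedits.get(editor, 0) + 1

    def is_recognised(self, editor):
        return self.accepted_copyedits.get(editor, 0) >= self.recognition_threshold


def may_apply(article, editor, kind, registry):
    """Authors may do anything; others may only copyedit, and only once recognised."""
    if editor == article.author:
        return True
    if kind is EditKind.CONTENT:
        return False
    return article.allow_recognised_copyedits and registry.is_recognised(editor)
```

The point of the split is that the expert keeps sole control over what the article says, while the polish can come from anyone who has earned a track record for it.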

This is even harder than getting good quality content in the first place. Writing is glamorous. Editing isn’t, but it’s so very important. Knol’s only got half the problem solved.

[1] Certainly one of my favourite extensions is the perennially useful Googlepedia, which remixes your Google search results to embed the first returned WP article on the right (it’s particularly nice on widescreen monitors).

[2] So it’s not a directly applicable synonym of ‘word’, but it was the best the thesaurus could give me

Tact and keywords

The rise of the laser-beam-narrow targeting allowed by Google AdWords and AdSense has led to some interesting uses. It’s also led to accusations of insensitivity on Google’s part, who explicitly point out they don’t exercise human editorial control over ad placement. But Cameron showed me one last night that leaves me feeling slightly odd, and this one isn’t Google’s “fault” as much as the advertiser.

[Image: Campbell Live ad on Google search results page]

For the non-NZers, the biggest news item this week here has been a quite tragic accident where six students and a teacher from a high school, Elim Christian College, were swept to their deaths after a flash flood during an outdoor exercise in a gorge. What you see above is an ad using the school’s name as a keyword for Campbell Live, an evening TV news/interview show, or rather for their specific portal page on the subject.

I can see why they did it. They may have done it automatically even, with some system to buy up keywords on common phrases in hot stories, and part of me thinks it’s a good idea. But I still can’t help feeling that this is a somewhat tasteless use.

Google’s new broadside against AWS

Apparently Thomas Watson of IBM never actually said in 1943 that the world market only had room for 5 computers. Still, the misattribution’s been favourite fodder for years on lists of short-sighted predictions, along with Bill G’s equally misattributed “Nobody needs more than 640K of memory”.

The funny thing about history is how we’re now at a point where people are actually looking at that first nonquote afresh. Sun’s John Gage, one of their original employees, once famously said that the network is the computer, and in that light Watson’s nonquote starts to make some sense. To be more specific, replace “network” with “distributed computing platform”. The idea is simple. Only a few companies have the resources and expertise to maintain an international-scale computing environment that applications can scale across to meet the gigantic range of demand the internet can provide. It’s also the source of a compelling business model for the potential owners of such “computers”.

Amazon Web Services have been the biggest and most prominent push in this direction for some time. Sun did come up with their Sun Grid product, but it was a dud by most accounts. Why? Because it wasn’t really connected to the internet. AWS (by which I primarily mean the EC2 computing service and the S3 storage service) is oriented around supporting web applications and rich internet applications. They recognised the value in providing a service that small developers can build on with a reasonable expectation that, should they hit the ball out of the park, they’ll be able to handle any surge in traffic without going into the red for three years to come.

It was always strange that the world’s most famous distributed computing platform, Google, should not be a forerunner in this game. But that’s changed now. They’ve arrived with a flash and a bang. Scoble has videos of the launch, but the bare facts seem pretty cool: a Python environment with access to a storage service based on BigTable, and free accounts. The accounts are limited to 500MB of storage, 200 million megacycles/day CPU time, and 10 GB/day bandwidth, with the obvious business plan being to provide scaling resources beyond that for a fee.
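To get a feel for what the free bandwidth quota buys, here is a back-of-the-envelope sketch. The 10 GB/day figure is from the launch; the 100 KB average page weight and the use of decimal units (1 GB = 10^9 bytes) are my own assumptions, so treat the result as a rough order of magnitude only.

```python
# Free-tier bandwidth limit quoted at launch, in decimal bytes (assumption: 1 GB = 10**9 bytes).
FREE_BANDWIDTH_BYTES_PER_DAY = 10 * 10**9


def page_views_per_day(avg_page_bytes, bandwidth_bytes_per_day=FREE_BANDWIDTH_BYTES_PER_DAY):
    """How many whole page views fit in the daily bandwidth quota."""
    if avg_page_bytes <= 0:
        raise ValueError("average page size must be positive")
    return bandwidth_bytes_per_day // avg_page_bytes


# Assumed average page weight of 100 KB (hypothetical, not a measured figure).
views = page_views_per_day(100 * 10**3)
print(views)  # 100000
```

Under those assumptions the free tier covers about a hundred thousand page views a day on bandwidth alone, which is plenty of room for an experiment before anyone has to pay.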

Potentially it’s a huge announcement for web developers, and for Python. Google has a very strong brand when it comes to reputed distributed computing power, and fears of platform lock-in are mostly eroded by the open tools architecture. It would require a rewrite to move an app from the GAE to your own servers (unless you thought about everything closely up front), but it wouldn’t be a huge one. It’s WSGI compliant, and it even comes with Django built-in. The only really unique part about the platform is their GQL query language, which is an acceptable change from the norm since BigTable isn’t a row-oriented database. That and some other features like Google Accounts integration (which they should really hurry up and turn into an OpenID service), which is of secondary value to the average developer.
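WSGI compliance is what keeps the lock-in worry small, so here is a minimal sketch of what a WSGI application looks like. It uses only the standard library and is written for a current Python 3 rather than for whatever runtime the App Engine provides; the function names and the port are arbitrary choices for illustration, and nothing here is App Engine specific.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """A WSGI app is just a callable taking the request environ and a start_response callback."""
    body = b"Hello from a plain WSGI app\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]  # an iterable of byte strings


def serve(port=8000):
    """Run the app on the standard library's reference server (for local play only)."""
    with make_server("localhost", port, application) as server:
        server.serve_forever()
```

Because the callable knows nothing about its host, the same `application` can in principle be handed to any WSGI-capable server; the parts of a real app that would need rewriting are the ones that talk to platform services such as the datastore.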

It’s limited availability and the first 10000 accounts have already been snapped up (and I missed out :-(), but there’s an SDK available for playing around with. Expect some nice experimental web apps in the next few months, afforded by the very low barrier to entry on this.