Print Encyclopedias Join Dinosaurs (Part 2)

by Michael Hart on April 8, 2008

In 1985 when Gary Kildall, IBM’s first choice before Bill Gates to design their PC’s operating system a few years earlier, came out with the first electronic encyclopedia, who would figure it would be only a quarter of a century before print encyclopedias faded from the limelight to join vinyl records and dinosaurs?

$999 would buy you an external Sony CD drive and Grolier’s CD, pretty much the same price as the paper encyclopedias, but with the option of putting any number of CDs in the drive.

This was only a year after the famous “1984” Super Bowl ad that ran only once and changed Super Bowl ads forever.

It was only a year after IBM offered the AT.

It was the beginning of the second generation of big time home computers, and the truth is that very few people realized what a huge difference this was all going to make.

As for my biggest public plans for the future, I was personally working on putting Shakespeare online.

In private, I was making 50 foot printer cables for the U of I, for $109, mostly because everyone had said it was impossible.

Note:

I should add here, for the benefit of one of the trolls I know, that I predicted back in the early 1980’s that the wide variety of very expensive printer cables would be replaced by a narrow, very narrow, variety costing only a fraction as much.

How did I know this?

I saw the same thing happen with hi-fi cables.

When my Dad bought our first hi-fi 55 years ago he had to hire, literally hire, an expert to come out to our house and make the cables that wired the various components together for $34 each.

This would be about $250 in today’s money!!!

Yet you can buy most stereo [his were mono] cables today for an average price of a couple dollars, though you can obviously get a gold, silver, nickel or whatever cable for much more. We could only get plain wires and connectors back then.

So it was obvious to me that the stunning array of cables would not continue, nor their stunning prices.

I was also charging $180/hour for computer consulting, as I was one of the only people anyone knew of who did both hardware and software consulting, which was a HUGE advantage, in an age when most users could not tell you whether a problem was hardware or software related.

It was a kind of “Golden Age” of computing, but expensive.

A full tilt IBM-AT or Mac system might have cost you $10,000 in dollars that would be worth over $20,000 today, according to several of the “Inflation Calculators” available online today.

Can you imagine spending $20,000 on a computer today???

It is hard to spend $2,000 on a computer today, one hundreds of times more powerful, with 10,000 times the drive storage, color palettes with over 16 million colors, etc., etc., etc.

The average computer today sells for under $500.

Note:

I said hundreds of times faster, but. . . .

The first PCs ran at a few megahertz, and today’s computers are running at a few gigahertz, but you have to multiply that by the increased “word” size of how many bits get run per hertz. Most of today’s computers run 64 bits, while the early PCs ran 8 bits, and I seem to recall some that ran only 4 bits, but that was on computers earlier than we are talking about here.

So, if you have a chip running at 2.44 gigahertz, that is, literally by the clock, 500 times as fast as those first IBM-PCs that ran at 4.88 megahertz, or half of 1,000 times as fast. However you have the “word” size at 64 bits, which is 8 times as much, so the overall difference is about 4,000 times as fast. Warning: not all programs will run at 64 bits, so your mileage may vary; and the same is true, sad to say, for the multi-processor systems. Not all programs run perfectly on those multi-processor systems, so your mileage will vary even more on those systems.
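The estimate above can be checked with a few lines of Python. This is only a sketch of the rough "clock rate times word size" model used in this note, with the clock rates and word sizes taken from the text; it is not a benchmark, and the function name is just an illustration.

```python
# Rough speed comparison using the figures quoted in the text.
# Assumption: "speed" is modeled as clock rate times word size, which is the
# simplification used in this note and ignores everything else about a CPU.

OLD_CLOCK_HZ = 4_880_000        # first IBM-PC clock rate as quoted in the text
NEW_CLOCK_HZ = 2_440_000_000    # example modern chip from the text
OLD_WORD_BITS = 8
NEW_WORD_BITS = 64


def rough_speedup(old_hz, new_hz, old_bits, new_bits):
    """Return (clock ratio, word-size ratio, combined ratio)."""
    clock_ratio = new_hz / old_hz
    word_ratio = new_bits / old_bits
    return clock_ratio, word_ratio, clock_ratio * word_ratio


clock_ratio, word_ratio, overall = rough_speedup(
    OLD_CLOCK_HZ, NEW_CLOCK_HZ, OLD_WORD_BITS, NEW_WORD_BITS
)
print(f"clock: {clock_ratio:.0f}x, word size: {word_ratio:.0f}x, overall: {overall:.0f}x")
```

Under this naive model the clock alone gives 500 times, the word size 8 times, and the two together about 4,000 times.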

By the way, in researching this article I read a few articles a decade or two old about how computer speed could not keep going faster and faster, much as a mile runner who started at 10 minutes per mile, and got to 8 minutes per mile after a while, later 6 minutes per mile, could obviously NOT be expected to then get to 4 minutes and two minutes per mile.

Well, surprise, surprise, those pundits were wrong and the news of today contains the first chips running at 1,000 times the speed of the first IBM-PC in terms of clock rate, and which will have at least 8 times the bit count per clock tick, when they arrive over the counter in yet another impossible generation of CPUs.

Back To Encyclopedias

Once the first CDROM encyclopedia was out, others followed, prices got very competitive, and “bundles” of software appeared with many computers offering the computer, hard drive and CDROM drive, and all the other components, and a stack of CDs all for less than the first $999 offer mentioned above.

I, myself, bought two such systems for about $700 each, with an assortment of CDs including the Grolier’s, Microsoft Bookshelf, world and US atlases, a great multi-language dictionary, and an assortment of other programs.

That’s how much things changed in perhaps the next five years.

A few years ago I got the 2002 Britannica, brand new, alone, at Best Buy for about $17, combined with some other purchase.

Only a few of the most hard-boiled conservative products, like a copy of the Oxford English Dictionary on disc, still cost a real amount of money. But with Merriam-Webster’s unabridged available at $150 with CDROM, and the American Heritage for much less, or any of a handful of other dictionaries, the pressure is on. Oxford may yet have to drop its prices to the normal range, and who knows if their famous print edition, the size of the larger print encyclopedias, will then survive much longer.

A Little Un-Advertising

Most of the products mentioned above like to thrill you with an assortment of huge numbers about how many thousands of articles and millions of words they contain, but the truth is that their contents are not really measured in that many million words.

Let’s say a giant dictionary has 25,000 words and uses 100 word average entries per word, for 25,000,000 characters.

25 megabytes.

Gee, that original CD from Gary Kildall could hold over 500 meg without undue stress.

That would be 20 dictionaries of this size, plus space left for various pieces of software to enhance the process.

Today’s flash drives can give you 25 GIGABYTES for $100.

A thousand times as much storage, read/write, much faster.

Let’s move on to the huge multi-volume behemoths.

Let’s say you have a 25 volume extravaganza.

Let’s even say each volume has 1,000 pages.

Let’s say each page has 4,000 characters.

That’s 4 million characters per volume for 100 million total.

100 megabytes.

Again, who worries about 100 megabyte files these days?

Anyone with broadband probably downloads files much larger many times without being amazed at the result.
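For anyone who wants to check the multi-volume arithmetic, here is a minimal sketch in Python. It assumes one character takes one byte and that a megabyte is a million bytes, which is the rough convention used in this article; the 500 megabyte CD figure is the one quoted earlier.

```python
# Back-of-the-envelope size of the text in a large print encyclopedia.
# Assumptions: 1 character = 1 byte, 1 megabyte = 1,000,000 bytes.

VOLUMES = 25
PAGES_PER_VOLUME = 1_000
CHARS_PER_PAGE = 4_000
CD_CAPACITY_BYTES = 500_000_000   # "over 500 meg" per CD, as quoted in the text


def text_size_bytes(volumes, pages_per_volume, chars_per_page):
    """Total characters (treated as bytes) in the whole set."""
    return volumes * pages_per_volume * chars_per_page


encyclopedia_bytes = text_size_bytes(VOLUMES, PAGES_PER_VOLUME, CHARS_PER_PAGE)
encyclopedia_mb = encyclopedia_bytes // 1_000_000
sets_per_cd = CD_CAPACITY_BYTES // encyclopedia_bytes

print(f"text size: {encyclopedia_mb} megabytes")
print(f"complete sets per CD (text only): {sets_per_cd}")
```

With these figures the whole 25 volume set comes to 100 megabytes of text, so five such sets would fit on a single CD before any compression.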

A Little More Un-Advertising

These reference sources try to pretend that you cannot download their entire database simply because it’s not feasible.

The truth is that millions of people download entire DVD images every single day, each one of which is many times larger than a copy of the online Britannica, or every word of any of the more weighty reference products.

Obviously it will take more space if they include pictures.

National Geographic would probably be the most intense example, yet they have packed their first 108 years into 36 CDs. At half a gigabyte per CD, that would be 18 gigabytes.

OK, that would take a week to download, but only because graphic files are bulky compared to text and because you are including a whole century of output.

However, if you ran that download in the background it might be only two weeks, and you would never notice the load.

Even more un-advertising would have to include admitting that the US has pretty lousy bandwidth, and that the 20 or so countries with better average bandwidth would allow an easier download than in the US, and for less cost per file.

Bottlenecking or Artificial Scarcity

The way the MBA Generation tries to keep the information flow at a staggeringly low level is to create artificial bottlenecks for the transmission of that flow.

Extending copyright is the most obvious one of these, because a public domain piece of information can be put online in numbers of ways so vast that bottlenecking becomes irrelevant.

The fact that the modern Britannica is sized on the same order of magnitude as the world famous 11th edition of a century ago is pretty much totally ignored in their publicity.

The fact that you could download the entire 11th edition in the time it takes to eat a sandwich is something they ignore.

Here’s The Truth, In Plain Numbers

Warning: These Are Large Numbers [You may want to stop here. . . .]

Let’s take the largest unit of information, the Library of Congress; at least that’s the largest in “normal” use these days.

Let’s say the Library of Congress, or another of the large libraries at the top of the world’s listings, has books numbered in the range of about 30 million, plus or minus a few million.

Let’s further say that each of these has 1 million characters.

This is being generous in the face of the fact that so many of a collection such as this might be better described as pamphlets.

However we must also consider weighty tomes such as Shakespeare, The Bible, or other similarly sized works at 5 million, or large novels such as Moby Dick at 2 million.

So let’s go with a million characters times ~32 million entries:

~32 trillion characters in The Library of Congress.

Remember, we are only considering the words, not the pictures.

Words compress very nicely in computer files.

You can get about 2.5 times as many words into the same space via “.zip” files or any of many other similar compression formats.

This means an $80 500 gigabyte drive could hold 1.25 terabytes.

This means 10 of these at $800 could hold 12.5 terabytes.

20 of these at $1600 would hold 25 terabytes.

25 of them at $2,000 will hold 31.25 terabytes, or every word in the average one of the world’s largest libraries.
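The drive arithmetic above can be reproduced with a short Python sketch. It uses only the figures in the text (about 32 million books, a million characters each, 2.5 to 1 compression, $80 for a 500 gigabyte drive) and assumes one character per byte and decimal terabytes. The function names are only illustrative.

```python
import math

# Figures from the text; 1 character = 1 byte and decimal units are assumed.
BOOKS = 32_000_000
CHARS_PER_BOOK = 1_000_000
COMPRESSION_RATIO = 2.5            # ".zip"-style text compression, per the text
DRIVE_BYTES = 500_000_000_000      # one 500 gigabyte drive
DRIVE_PRICE_DOLLARS = 80
BYTES_PER_TB = 1_000_000_000_000


def text_capacity_bytes(drive_count):
    """Uncompressed text that fits on drive_count drives once compressed."""
    return drive_count * DRIVE_BYTES * COMPRESSION_RATIO


def drives_needed(total_text_bytes):
    """Whole drives needed to hold the given amount of text, compressed."""
    return math.ceil(total_text_bytes / (DRIVE_BYTES * COMPRESSION_RATIO))


library_bytes = BOOKS * CHARS_PER_BOOK
for count in (1, 10, 20, 25):
    capacity_tb = text_capacity_bytes(count) / BYTES_PER_TB
    print(f"{count:2d} drives, ${count * DRIVE_PRICE_DOLLARS}: {capacity_tb} terabytes of text")

print(f"library text: {library_bytes / BYTES_PER_TB} terabytes")
print(f"drives needed for exactly that much: {drives_needed(library_bytes)}")
```

Taken literally, 32 trillion characters needs 25.6 drives, so the sketch rounds up to 26. The 25 drive figure in the text is the same estimate within the "~" on the book count.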

The Last Bit, or Byte, of Un-Advertising

OK, the average computer goes for under $500.

For $2000 you can add enough hard drive to hold a LARGE library.

Total cost of hardware: less than $2,500.

Then add in the cost of your high speed connection.

Calculate how many gigabytes per day you can download.

Warning: some places limit you to one gig per day.

Some wireless connections limit you to 1/3 gig per day. . . .

Be careful to ask about this before you sign. . . .

At 3 gigabytes per day you could download 1 terabyte per year.

And have 10% left over for entertainment value.

It would take you a decade to download a major great library.

At 6 gigabytes per day you could download 2 terabytes per year.

Now only half a decade.

At 25 gigabytes per day you are down to around a year. . . .

It’s possible. . . .

And it will just get more possible every single year. . . .
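Here is a rough Python sketch of the download times above. It assumes the library is about 32 trillion characters at one byte each, that you download it compressed at 2.5 to 1, and that the connection runs at its daily limit every day of a 365 day year. These are simplifications for illustration, not measurements.

```python
# Rough download-time estimate for the text of a very large library.
# Assumptions: 1 character = 1 byte, decimal gigabytes, compressed download,
# and the daily limit is reached every day of the year.

LIBRARY_TEXT_BYTES = 32_000_000 * 1_000_000   # ~32 trillion characters
COMPRESSION_RATIO = 2.5
BYTES_PER_GB = 1_000_000_000
DAYS_PER_YEAR = 365


def years_to_download(gb_per_day):
    """Years needed to fetch the compressed library at gb_per_day."""
    compressed_bytes = LIBRARY_TEXT_BYTES / COMPRESSION_RATIO
    bytes_per_year = gb_per_day * BYTES_PER_GB * DAYS_PER_YEAR
    return compressed_bytes / bytes_per_year


for rate in (3, 6, 25):
    print(f"{rate} GB/day -> about {years_to_download(rate):.1f} years")
```

The results land a little above a decade, a little above half a decade, and a bit over a year, in line with the round numbers in the text.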

Note:

At the same time I sent out the first draft of this, I had a note in my email advising me of 1T drives for $119. . . . Call it $120.

That cuts all the terabyte prices listed above by 25% and cuts the space, cabling, heat, etc., by HALF!!!
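A quick check of that note, as a small Python sketch using only the two prices mentioned ($80 for 500 gigabytes, $120 for 1 terabyte); the helper name is just for illustration.

```python
# Compare cost per terabyte for the two drive sizes mentioned in the text.

def price_per_tb(price_dollars, capacity_tb):
    """Dollars per terabyte for a drive of the given price and capacity."""
    return price_dollars / capacity_tb


old_price_per_tb = price_per_tb(80, 0.5)    # 500 gigabyte drive at $80
new_price_per_tb = price_per_tb(120, 1.0)   # 1 terabyte drive at $120
cut_percent = (old_price_per_tb - new_price_per_tb) / old_price_per_tb * 100
drive_count_ratio = 0.5 / 1.0               # drives needed, new versus old

print(f"${old_price_per_tb:.0f}/TB -> ${new_price_per_tb:.0f}/TB, a {cut_percent:.0f}% cut")
print(f"drive count ratio: {drive_count_ratio}")
```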

by Michael S. Hart
Founder, 1971
Project Gutenberg
Inventor of eBooks
