Tuesday, April 22, 2008

The Long Tail

James asks, "What does 'long-tailed' mean?"

The "Long Tail" is the right-hand side of a graphed power law function, which emerges when a few items have a high value and many, many items have a low value. Think of sales of popular music or books, or frequency of words in written English.

More to the point, think of Amazon.com vs. Wal-Mart's book section. Because inventory costs are low, Amazon can afford to sell very infrequent copies of rather obscure books - those that occupy the Long Tail. Low inventory and delivery cost means the threshold for commercial viability is drastically lower and more creators can get a (tiny) slice of the action.
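For the numerically inclined, here's a rough sketch of the idea in Python. It assumes sales follow a simple Zipf-style power law (sales at rank r are proportional to 1/r); the catalog size, the top seller's sales figure, and the break-even thresholds are all numbers I made up for illustration, not real data from Amazon or anyone else.

```python
def zipf_sales(n_titles, top_sales=10_000.0, exponent=1.0):
    """Hypothetical sales per title: the title at rank r sells top_sales / r**exponent."""
    return [top_sales / rank ** exponent for rank in range(1, n_titles + 1)]


def tail_share(sales, head_size):
    """Fraction of total sales that comes from titles ranked below the top head_size."""
    return sum(sales[head_size:]) / sum(sales)


def viable_titles(sales, threshold):
    """Count the titles whose sales meet or exceed a break-even threshold."""
    return sum(1 for s in sales if s >= threshold)


# Assumed catalog of 100,000 titles, sorted from best seller to most obscure.
sales = zipf_sales(100_000)

# Share of all sales coming from everything outside the top 1,000 titles.
print(f"Tail share beyond the top 1,000: {tail_share(sales, 1_000):.0%}")

# A high-overhead seller (assumed break-even: 100 copies) vs. a low-overhead one (1 copy).
print("Viable titles at a 100-copy threshold:", viable_titles(sales, 100.0))
print("Viable titles at a 1-copy threshold:", viable_titles(sales, 1.0))
```

Under these made-up numbers, dropping the break-even threshold from 100 copies to 1 multiplies the number of commercially viable titles a hundredfold, which is the whole point of the Long Tail.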

Between now and the future

Jessamyn West links to a good comment from publisher Tim O'Reilly about how technology is upending business models:

I don’t doubt that in the long run, there will be new long-tail economic models that support investment in specialized forms of content that don’t have the volume to be supported by advertising, but we’re heading for a really tricky period where the old models will be dead before the new ones have arrived.

Agreed. I'm fairly blasé about how the publishing business will shake out, eventually. But in the meantime, the growing pains will be rather intense - something the music industry is experiencing and which the Kodak Company wisely foresaw and prepared for.

Clinton is seduced by the dim side

Or at least that's what she wants us to believe.

Oops, that would look like evolution

Lizards Rapidly Evolve After Introduction to Island

Sunday, April 20, 2008

Patrick wins!

Danica Patrick won her first Indy Car race in Japan Saturday night and I didn't get to see it. Originally scheduled for ESPN2, the race was postponed 22 hours for rain and broadcast on ESPN Classic. ESPN f*&^'in' Classic! That's where you go to see replays of thirty-year-old Banana Bowl games and Comcast doesn't include it (or the BTN) in their basic package.

I'm glad this finally happened. Congratulations to Patrick and her crew.


So I'm back from MAC - the Midwest Archives Conference - in Louisville. Had a good time (did the earth move for you, too?) and learned a few things, met a few people, and ate some terrific brownies.

If there's a common thread, it's to make me appreciate going through the SI program at Michigan when I did. Much of the emphasis on emerging technologies seemed to me ... well, very familiar and rather obvious. SI puts a lot of emphasis on information technology and its effects on storage and retrieval, so it was slightly surprising to realize that some of the folks with less recent training might find some things like Web 2.0 unfamiliar.

I also have to appreciate SI's emphasis on presentations - in most classes, I've had to give a 10-30 minute presentation with PowerPoint slides and get critiqued by my instructor and my peers. And I've sat through innumerable classmates' presentations and done the same for/to them. There were some great presentations in Louisville, but I also saw and heard a few that resembled what I was doing two years ago.

Some highlights:

Matt Blessing from Marquette, Helmut Knies from the Wisconsin Historical Society, and Roland Baumann from Oberlin, spoke about turning your collecting policy into actuality. Much of this session focused on the diplomatic art of cultivating donor relations. I'd heard much of this before in SI's archives practicum lectures, but it's always a good idea to hear it more than once - especially when you're a new archivist who happened to badly lose his last game of Diplomacy.

Cynthia Miller of the Henry Ford Museum led us through several exercises in selecting "The Useful Ten Words of the Ten Thousand" that might describe a photograph. I've heard Cynthia speak before (at the aforementioned practicum), but I picked up a useful rule here. Instead of asking whether a term might describe a photograph, she asks, Would a user who was searching on that term be glad that this photograph turned up? That sounds like a rule that's likely to narrow the list of good terms you might choose for your description of a photo and I'm going to employ that when cataloging my own photos (which is itself a rather exhausting project that I may never finish). Also, you need to consider the terms that are useful to your particular repository as well as those useful to a researcher. Standard references for controlled vocabulary terms are the Thesaurus for Graphic Materials (TGM) and The Getty Art & Architecture Thesaurus Online. When describing objects, Cynthia prefers the Getty to TGM.

Leah Broaddus of SIU-Carbondale gave an assessment of the University of Illinois's open source Archon software, which provides archival organization and generates a searchable public web site. Sounds like a cool product that any small repository might be grateful to learn about. In the same session, Chris Prom of UI at Champaign-Urbana described his experience using Google Analytics, also free ("I am at least as cheap as Leah," said Chris), to better understand how internet searchers are using one's website. One key insight: people who are searching on a topic through Google will reach the topic page directly. That beautiful home page that you spent so much loving attention on? Most people never see it. It's important that your deeper pages not be dead ends, that they fully identify your institution, and that they indicate where in the overall website structure the page is located.

Finally, we closed out on a discussion of Web 2.0, beginning with Kevin Leonard's YouTube video, Beth Yakel's discussion of what UM has learned through the Polar Bear Expedition Digital Collections, and Kevin Schlesier's talk on community collaboration with physical exhibitions - again, this is all about diplomacy.

Again, I had a great time at MAC. Everything went smoothly and there was a lot to learn, not least the fact that I really have learned a lot at SI. I don't know whether I'll be staying in the Midwest, as I'm open to moving almost anywhere, but MAC is well worth attending, wherever I might be.

Galileos are rare

Via The Island of Doubt comes this comment from Michael Tobis of Wired:

I'd like to caution especially my younger readers that you may be very smart, but you should assume that you are making a mistake if you find yourself thinking you are smarter than every scientist in the world put together. A feeling like that is wrong a million times for every time it's even half right.

Adolescents who fall into this trap too spectacularly have a hole to dig themselves out of. It's not a great way to enter adulthood, having been spectacularly and publicly wrong, but youthful indiscretions are often forgiven or forgotten.

If you have such a feeling of superiority too strongly as an adult the world will not treat you kindly. You will almost certainly be wrong, wrong in the sense that 2 + 2 = 5 is wrong. Most likely you will be called a crackpot. It's surprisingly common to be possessed by this feeling of superiority but it is usually tragic. Science is a team sport.

Or, as Oliver Wendell Holmes put it, "You may have genius. The contrary is, of course, probable."

Friday, April 18, 2008

Consider yourself warned

Louisville, KY, Waterfront Park

I got rocked!

I'm in Louisville for the Midwest Archives Conference and got awakened last night by a jolt on my bed. I thought, Man, that feels like an earthquake, but it seemed rather unlikely. Unlikely or not, that's what happened.

[Update: the earthquake has been downgraded from 5.4 to 5.2. I thought it didn't quite feel like a 5.4 ....]

Wednesday, April 16, 2008

Web archives, pt 3

ArchivesNext has the NARA response to their announcement that they won't be taking a web snapshot at the end of the current election cycle. It does sound like poor use of resources for NARA to do this, as their focus is mainly on documenting the functions and programs of the agency. The confusion between various kinds of records is touched on in these guidelines.

In the archives

Evidence from the archives of the firm that built Titanic is used to bolster a theory of poor materials contributing to her sinking.


I probably won't venture too often into politics and economics, but this, from John McCain - wow. Just wow. In an attempt to pander to all parties, McCain is embracing the worst part of the Republican viewpoint - supply-side economics - and then jettisoning the only thing they ever do seem to understand, namely, market forces.

We've practiced supply-side for twenty of the last twenty-eight years now and the correspondence with ballooning deficits is pretty darn close to 1.0. The only time economic growth has conjured up enough cash to avoid heavy borrowing is when - no, can it be? - when the Democrats shifted taxes back to the wealthy and raised the minimum wage. Every Republican in Congress voted against the 1993 budget, predicting economic meltdown. Oh, to have such hard times again!

Then, when it comes to dealing with rising energy prices, McCain wants to artificially boost consumption! Oh, that'll fix everything. I agree that gas taxes hit the poor and working classes the hardest. But you could address two problems at once by shifting the tax burden back towards the people who can afford it while preserving high gas prices. Even if the working stiff can afford it, he'll still be more tempted to economize on his driving at $3.50 per gallon than he was at $1.50.

Now all McCain needs to do is propose a hike in payroll taxes and his journey to the dim side will be complete.


A set of interesting articles, from Science Daily, on developing more human-like robots:

"Robotic Minds Think Alike?"
"First Humanoid Robot That Will Develop Language May Be Coming Soon"
"Intelligent Software Helps Build Perfect Robotic Hand"

The first two are especially interesting, continuing the theme of trying to mimic human cognition rather than building on the calculation that computers have always done well.

Tuesday, April 15, 2008

Expelled exposed

The ScienceBlogs people are asking for more links to Expelled Exposed, hoping to boost its Googlebility. I'm sure the contribution of my pitiful little blog will be the fuzzy value* "about zero," but here it is, anyway. I hope that some misinformed, but genuinely curious, people might find their way to this site.

*Yes, I'm reading up on fuzzy logic just now. Not sure why, other than pure curiosity.


Heather sent me this link about police who may have had too rigid a concept of "order." It made me want to read "Shooting the Elephant" again, which reminded me that I've been wanting to reread Kim, and since it was on the same shelf I just had to take The Jungle Book, which I've never read.

At some point it occurred to me that "Shooting the Elephant" was written by Orwell, not Kipling. I don't know why I always forget that.

[And, later yet, I realize the correct title is "Shooting an Elephant." Sigh.]


You see a lot of t-shirts like this at the University of Michigan. Each department has their own, done in exactly the same style. So you'll see "Michigan Engineering," "Michigan Social Work," etc., but you have to look just a little closely to see which field the wearer is boasting.

My favorite, though, has to be the one I saw this morning which reads "Michigan Undecided."

Sunday, April 13, 2008

Web archives, part 2

Someone at NARA has posted a comment at this blog responding to the accusation that web pages are being lost to history. The writer points to this partnership with North Texas Libraries to indicate what is actually being done to preserve web documents, while criticizing the quality of the program that has just been discarded. The comment turns into a rant toward the end, but makes some good points along the way.

NARA has extended their traditional records scheduling to include internal e-records, but it's not clear to me that this includes web pages. Libraries have always collected government documents, not so much to preserve them for all time, but simply to make the information available at all to the public. I don't see public libraries having a strong motivation to archive web pages, but I should think archives and academic libraries would.

This was too fun not to post:

Saturday, April 12, 2008

Web archives

NARA is coming under fire for jettisoning their practice of taking snapshots of federal websites every two years. Their argument is that the agencies are already archiving their websites more frequently than that, so it's a waste of time and money to do a less thorough copy of the same thing.

What I didn't get out of this writeup is NARA's role in ensuring that the agencies do their own archiving, maintain good standards, and guarantee their preservation. That's what a records management program is supposed to do, because you don't want to leave it all to a few thousand individual managers who have little expertise and even less incentive to maintain inactive records.

Thursday, April 10, 2008

Time capsule

The 49-year-old YMCA building in Ann Arbor is coming down and the demolition workers found an unexpected time capsule.

Fact v. fiction

Philippe Sands has an article in Vanity Fair concerning the "torture memos" and the responsibility of White House officials for installing the "aggressive" interrogation regime. But there's another aspect, which Sands especially describes in an interview with Democracy Now!, that I found telling:

PHILIPPE SANDS: I went back. I spoke with others, including Diane Beaver again and Mike Dunlavey, and went into great detail. And it turns out, as she described it to me, the TV program 24 had many friends down at Guantanamo. And the timing is fascinating. The abusive interrogations started in November 2002, just three weeks after the start of the second series of 24. And it seems that there was a direct connection between that program and the creating of an environment in which individuals felt it was permissible to push the envelope, as it was put to me.

AMY GOODMAN: And what did the lawyers say about 24?

PHILIPPE SANDS: Diane Beaver no longer feels able to watch 24. I mean, she told me she recognizes now that this is extremely problematic, but that 24 was being broadcast into Guantanamo by cable television. The first series ran throughout 2002, and they were active viewers. It was an extremely popular program.

Which might not be such a big deal, except that at the debates for the GOP nomination, torture was presented in exactly the terms of a '24' script:

The candidates also were asked to respond to a hypothetical scenario — homicide bombings at three shopping centers near major U.S. cities. With hundreds dead and thousands injured, a fourth attack is averted when the attackers are captured off the Florida coast and taken to Guantanamo Bay to be questioned. U.S. intelligence believes another, larger attack is planned and could come at any time. How aggressively should the detainees be interrogated about where the next attack might be?

And if you've been holding someone for a month and are just getting frustrated? Uh, we won't consider the more realistic scenario.

So there's Ronald Reagan invoking the spirit of John Wayne, the election of Arnold Schwarzenegger in the hopes that he would be a political Terminator, and the brief infatuation with an actor from 'Law and Order.' What is it about fiction that conservatives just don't grasp?

Tuesday, April 8, 2008

Reconstructing unwritten history

The NYT Science section has a nice article on some recent thinking about the Anasazi. About their mysterious migrations, the writer says:

"Scientists once thought the answer lay in impersonal factors like the onset of a great drought or a little ice age. But as evidence accumulates, those explanations have come to seem too pat — and slavishly deterministic."

This touches on the problem of granularity, which I struggled with while studying history in grad school. In essence, if you look closely at a historical process or event, everything seems very chaotic and unpredictable, with innumerable chances for things to have gone differently. As you step farther and farther back, certain things take on an air of inevitability and those small decisions seem to matter less and less.

The less information you have - which often means, the further back the period you are studying - the more those big forces (like climate change) seem to dominate. As you fill in details (personality conflicts, random events), contingency takes center stage. Rather like gravity and quantum mechanics. Somewhere those two worlds intersect, but I never had a solid grasp on balancing them against each other.

Popline, continued

More on the Popline case, where administrators at Johns Hopkins blocked searches on "abortion" and related terms. USAID claims that they never asked for this action and Popline overreacted to some complaints about specific articles. This is probably true, but I don't consider it an innocent mistake. It's an indication of how far the expectation of censorship has gone in this country, when people begin to censor themselves even more than the government demands of them.

Thursday, April 3, 2008

Caving to those who want to hide information

I wasn't familiar with POPLINE, "the world's largest database on reproductive health," until now. It's maintained by the INFO Project at the Johns Hopkins Bloomberg School of Public Health/Center for Communication Programs and is funded by the United States Agency for International Development (USAID). Now, the folks running USAID are strongly anti-abortion, and apparently the folks at the INFO Project would rather play ball than search for other funding.

I especially like the "I hope this helps" at the end of the reply. No, denying access to health information never helps. Not a bit.

And no, it's not an April Fool's joke at all. You really can't find a single article at POPLINE if you search on "abortion."