Recently in Content Management Category

MarkLogic on Debian

I really like the MarkLogic product offering. Not just because it's fast and stable, but also because they offer a community license (other CMS and XML database vendors take note - if I can't try it, I probably won't recommend it to my company). So, I decided to install it on my Debian box today. Here's my story.

First off, I didn't have the alien or daemon packages installed. You'll need those (among others). I'm not going to list every package you need; if you're using Debian, you can probably figure that out yourself.

  1. Download it from MarkLogic. I'm running on a 32-bit processor, so I downloaded the Red Hat 32-bit rpm.
  2. Use alien to convert the rpm to a deb.
  3. Use dpkg -i debfile
  4. mkdir /var/lock/subsys
  5. Edit /etc/init.d/MarkLogic and change the line
    . /etc/rc.d/init.d/functions
    to
    #. /etc/rc.d/init.d/functions
  6. Start the server - /etc/init.d/MarkLogic start
  7. Follow the rest of the install guide
It's really easy, so you have no excuse for not using a fully functional XML database.
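If you'd rather script step 5 than edit the file by hand, here's a minimal sketch in Python. It assumes the init script contains the source line exactly as shown in the steps above; the function name is hypothetical, and the sketch only transforms text, so reading and writing /etc/init.d/MarkLogic (as root) is left to you.

```python
def comment_out_functions_include(script_text: str) -> str:
    """Comment out the Red Hat-specific `functions` include in an init script."""
    target = ". /etc/rc.d/init.d/functions"
    patched = []
    for line in script_text.splitlines(keepends=True):
        # Match only the exact, uncommented source line; leave everything else alone.
        if line.strip() == target:
            indent = line[: len(line) - len(line.lstrip())]
            patched.append(indent + "#" + line.lstrip())
        else:
            patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    # A made-up three-line script, just to show the effect.
    sample = "#!/bin/sh\n. /etc/rc.d/init.d/functions\necho start\n"
    print(comment_out_functions_include(sample), end="")
```

Running it twice is harmless, because an already commented line no longer matches.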


Day 2 and 3 at CMS Strategies

Day 2 and 3 at the content management strategies conference - quick hits.

The TEXTML team does a brilliant job of marketing their XML database as a CMS. It is, if you do a bunch of implementation work, but out of the box, it isn't a CMS.

DITA is starting to move to other communities, not just technical publication groups. Legal and other industries are just starting to move into it, but that's a great sign for the long term health of DITA.

Scott Wolff gave a very good presentation on the "5 Mistakes to Avoid when Buying a CMS". I won't steal his thunder (I'm sure he'll have the presentation on his web site), but here are the big things I took away from it:

  1. Buy a hosted CMS if it meets your needs. If it doesn't, buy an on-site CMS, if one fits your needs. If neither is possible, then, and only then, build your own.
  2. Start small, and build. Implement a CMS for one group in your company and then add other groups over time.
  3. Not every group needs a CMS.

IBM still thinks that putting all content accessed via conrefs into one or more shared files is a good plan. I disagree. It's no better than having a bunch of topics with a single element (the method I discussed in Conrefs and “Shared Content”) that are designed for sharing. Yes, you need to avoid a "spaghetti sharing model", but neither of these methods solves the problem.

Last, but not least - I can be a real loud-mouth when it comes to open forums.

I'm attending the Content Management Strategies conference here in San Francisco (you can take a look at the agenda if you are interested in it).

Some quick comments on day 1.

Lots of people! Many more than last year's conference in Annapolis. The welcome presentation said it was 68% larger, and over 300 people were there.

The vendor area is too crowded. I wouldn't be very happy if I was a vendor.

The keynote presentation was slanted, and a bit dull. It assumed that an enterprise CMS is the only way to fly, yet it never defined what a CMS is and gave little detail on why one is required.

Search and context

John Battelle came by to talk to a bunch of us folks at salesforce.com yesterday. He gave an interesting talk that combined points from his book (The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture) and some other, related comments. One of the things he talked about is a search scenario he recently posted on his blog: Google Launches Biz Local AdWords: It's Just the Start..... Go read the scenario and then come back here (I know his blog is much more interesting than mine, but let's finish this dialog, then go read more there).

The scenario is interesting from my point of view as an information designer who wants to make information access more effective. The truth is, for most things, the problem isn't finding the information, it's finding the appropriate information. John's scenario is very similar to an existing search, but adds two contexts: location (the shopper is in a particular location) and interest (s/he wants to buy a bottle of wine, known from the location [in a grocery store's wine section] but also from the search engine choice).

The context is what makes the search results valuable. I can do this search now while at the grocery store using my Nokia 770. The information I want will be retrieved, but since the search engine won't know the context, I will have to filter it myself to gain the same value. I probably won't be very successful because there is too much information for me. People are amazing pattern matchers, but we cannot effectively handle the volume of content such a search would return.

Traditional documentation does the same thing. We put a bunch of information out there, and the users have to use the table of contents, index, or search to find the content they want, and then they have to filter it for their context. We need to add context if we're going to solve the information glut problem.

Context sensitive help is a step in the right direction. We link from a particular UI to a particular bit of information. That's good, but it's very rare that a UI is simple enough that one help topic can be sure to give the user exactly what they want. Usually the content is too broad, forcing the user to dig into it (which takes away most of the advantages you gain from the context sensitive link). Sometimes the content is too narrow, which is even worse.

Contextual embedded help is the next step. Some software products already do this. Tax software programs are a great example. They give you information on how to complete each task without forcing you to ask for help. They don't forget though, that users often need more help, and they include links to that as well.

One thing that I haven't seen in embedded contextual help is something I call "the onion" for lack of a better term.

Embedded context help needs to be specific enough to be useful, but if it's too specific, it might not help the user put the current task in the context of their goal (quick aside: a task is a specific action, and a goal is what you want to accomplish. For example, figuring out your gross income is a task; completing your taxes is a goal). Each embedded context help topic needs to include a weighted related topics list that gets more general the further out you go. Depending on the topic, it may also have a related topics list that gets more specific.

Here's an example. My goal is to complete my taxes. My current task is to compute my gross income. When I get to the UI page for doing that, the embedded help topic is "Computing your gross income". It has two sets of links. The first is a list of more specific tasks ("Calculate earned interest income", "Calculate income from salary", etc.). The second is more general and includes concepts about gross income, other types of income, and a link to the overview of the entire tax preparation process.

It's an onion rather than a ladder because you may move out to a more general layer, but it may be to a topic that isn't a direct hierarchical parent of the topic you are looking at. For example, from computing your gross income, you may move to a conceptual topic on the alternative minimum tax. Related topics, sure, but you wouldn't put them in the same information hierarchy.
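To make the onion a little more concrete, here's a rough sketch of the data model in Python. Everything in it is an assumption for illustration: the class names, the idea of storing the layer as a signed integer (negative for more specific, positive for more general), and the weights, which are made up. It isn't how any existing help system works, just one way the weighted lists could be represented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RelatedLink:
    title: str
    # Onion layer relative to the current topic:
    # negative = more specific, positive = more general.
    layer: int
    # Relevance within a layer; higher sorts first. Values are made up.
    weight: float


@dataclass
class HelpTopic:
    title: str
    related: List[RelatedLink] = field(default_factory=list)

    def more_specific(self) -> List[RelatedLink]:
        links = [link for link in self.related if link.layer < 0]
        # Nearest layer first, then highest weight.
        return sorted(links, key=lambda link: (abs(link.layer), -link.weight))

    def more_general(self) -> List[RelatedLink]:
        links = [link for link in self.related if link.layer > 0]
        # Nearest layer first, then highest weight.
        return sorted(links, key=lambda link: (link.layer, -link.weight))


gross_income = HelpTopic(
    "Computing your gross income",
    [
        RelatedLink("Calculate earned interest income", -1, 0.6),
        RelatedLink("Calculate income from salary", -1, 0.9),
        RelatedLink("Overview of the tax preparation process", 2, 0.8),
        RelatedLink("Concepts: gross income", 1, 0.9),
        # Not a hierarchical parent, but still a more general layer.
        RelatedLink("Concepts: the alternative minimum tax", 1, 0.4),
    ],
)

for link in gross_income.more_general():
    print(link.layer, link.title)
```

Because a link only records a layer and a weight, not a parent, the alternative minimum tax topic can sit one layer out from gross income without being part of the same hierarchy.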

There's a huge information design problem that needs to be solved to make this model work, but I think it's worth the effort.

A neat idea for conref dependencies?

Still on conrefs and dependencies.

I think I've just come up with a neat idea for handling conrefs in a mythical CMS (a reminder, we don't use a CMS, so I don't know if any CMS already does this or could do this).

When I create a conref in Topic B to an element in Topic A, the CMS includes metadata that indicates which version of Topic A I'm conref'ing from. A writer changes Topic A. I open Topic B for editing, and my CMS tells me, "Topic A/elementname has changed. Do you want to review the change or keep your existing content?" If I review it, I see the standard three pane merge tool view - my current topic with the old content, my topic with the new content, and Topic A. There I can choose to consume the content from Topic A (the conref is unchanged), reject it completely (losing all content for that element), or keep the previous version of the content (the element with the conref gets replaced by an element with the content from the earlier version of Topic A).

The second part of this would be reporting on conrefs. Before publishing, you run a report that notifies you of all the topics with conrefs to versions of topics that are not the most recent version and have not been resolved. This allows you to see at a glance conrefs that may no longer be valid.
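Here's a minimal sketch of the idea in Python, for the mythical CMS only. All of the names (Topic, Conref, pinned_version, stale_conrefs, accept_change) are hypothetical, and it assumes the simplest possible versioning: an integer per topic that goes up on every edit.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Topic:
    topic_id: str
    version: int = 1
    elements: Dict[str, str] = field(default_factory=dict)

    def edit(self, element: str, text: str) -> None:
        # Every saved change bumps the topic version.
        self.elements[element] = text
        self.version += 1


@dataclass
class Conref:
    from_topic: str      # topic that holds the conref (Topic B)
    to_topic: str        # topic that owns the content (Topic A)
    element: str
    pinned_version: int  # version of to_topic the writer last reviewed


def stale_conrefs(topics: Dict[str, Topic], conrefs: List[Conref]) -> List[Conref]:
    """Pre-publish report: conrefs pinned to an older version of their source."""
    return [c for c in conrefs if c.pinned_version < topics[c.to_topic].version]


def accept_change(topics: Dict[str, Topic], conref: Conref) -> None:
    """The writer reviewed the change and keeps the conref: re-pin it."""
    conref.pinned_version = topics[conref.to_topic].version


topics = {"A": Topic("A", elements={"note": "Back up first."}), "B": Topic("B")}
conrefs = [Conref("B", "A", "note", pinned_version=topics["A"].version)]

# Another writer changes Topic A, so the conref in Topic B shows up in the report.
topics["A"].edit("note", "Back up your data first.")
print([c.from_topic for c in stale_conrefs(topics, conrefs)])

# After review, the writer chooses to consume the new content.
accept_change(topics, conrefs[0])
print(stale_conrefs(topics, conrefs))
```

The other two choices in the merge view (reject the content, or keep the previous version as local content) both remove the conref, so they would simply drop it from the list rather than re-pin it.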

A neat idea or mad ravings of a hungry geek? You tell me.

DITA 2006 Conference

I attended the DITA 2006 conference in Raleigh, NC, last week. Overall, it was a very good conference. Good networking opportunities, some really good meetings with vendors, some really nice presentations.

One of the more interesting things that I noticed was the number of attendees that were surprised, perhaps astounded, that IBM doesn't use a CMS for their content.

I think I've covered this before here, but traditional content management systems, which are usually designed to work with binary files that represent documents, need to change to support a topic-based system like DITA. CMS vendors need to learn from SCMS tools like Perforce, ClearCase, etc. When you have more than 2,000 XML files and a large team of writers without strict boundaries, diff and merge, build and test, dependency analysis, branching, and other common SCMS functions are vitally important.

SiberLogic seems to understand that software documentation and software development are moving closer together in their needs. However, I heard some rumblings from the crowd that SiberSafe, SiberLogic's CMS tool, may not yet be ready for prime time. It does have one feature that I think is vital for DITA CMS tools - "Where Used".

When a writer opens a topic for editing, they need to know the side effects of the changes they make. When you are writing code, the build will fail if you screw up a dependency. In DITA, your dependencies can fail in two ways:

  1. Just like code, you can break a dependency.
  2. Unlike code, you can change the context of an element in a way that makes a dependent topic invalid.
I've written about this topic before, so I won't go into it now. Suffice it to say, any information architect who isn't concerned about this isn't thinking clearly.
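For what it's worth, a "Where Used" lookup is conceptually just a reverse index over conrefs. The sketch below is my own illustration in Python, not a description of how SiberSafe or any other CMS implements it; the function names and topic names are hypothetical. It can catch the first kind of failure (a conref to a topic that no longer exists) mechanically. For the second kind, the best it can do is tell the writer which topics to go and re-read.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Each pair is (topic holding the conref, topic owning the content).
ConrefPair = Tuple[str, str]


def build_where_used(conrefs: Iterable[ConrefPair]) -> Dict[str, Set[str]]:
    """Map each content-owning topic to the topics that conref from it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for holder, owner in conrefs:
        index[owner].add(holder)
    return dict(index)


def broken_conrefs(
    conrefs: Iterable[ConrefPair], existing_topics: Set[str]
) -> List[ConrefPair]:
    """Failure mode 1: conrefs whose source topic no longer exists."""
    return [(holder, owner) for holder, owner in conrefs if owner not in existing_topics]


conrefs = [
    ("install_guide", "warnings"),
    ("upgrade_guide", "warnings"),
    ("faq", "glossary"),
]

# Before editing "warnings", list the topics that will feel the change.
where_used = build_where_used(conrefs)
print(sorted(where_used["warnings"]))

# "glossary" is missing from the set of existing topics, so its conref is broken.
print(broken_conrefs(conrefs, {"install_guide", "upgrade_guide", "faq", "warnings"}))
```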