Thoughts and Notes Ideas that stay with me long enough to get written down

27Apr/100

DITA – The Things Experts Say I am Doing Wrong

A few months ago, JoAnn Hackos told me that I'm not really using DITA.  Yesterday, Eliot Kimber told me I'm not really doing reuse.

In a way, it's an affirmation for me to have disagreements with such knowledgeable, skillful people and to honestly believe that I'm right.

We use generic DITA topics. JoAnn's point is that without using specializations of the generic topic in DITA we are not taking advantage of DITA. We're just writing XML. In a way, I agree. One of the great things about using specialized topics is that you can easily create and enforce a consistent information model. However (you knew that was coming, right?), DITA is much more than an information model. You can create information models using many different architectures or processes. It's quite possible to create and enforce an information model using plain text files. Besides, taking JoAnn's position to its logical extreme, how can you say you are really using DITA if you haven't specialized everything? Why is a task topic specific enough? Shouldn't every project have a task topic that is designed for its information model? If that's a requirement to really be using DITA, then nobody is really using DITA.

At the CMS/DITA conference, Eliot asked if anyone using DITA 1.1 was reusing content. I, and others, raised our hands. He then made the statement that we were wrong - no one using DITA 1.1 is doing reuse. It was a typical Kimberism: a broad, strong statement made, in part, to make his audience think about reuse. When questioned about it, he clarified by saying that "unbounded reuse is not possible in DITA 1.1," which, I think, nearly everyone would agree with. However, bounded reuse is possible. For example, in our system all of our content is stored in one repository. All the content is always available to every project at build time. As an author of the API documentation, I can always depend on the content from the online help. That means conrefs are always valid. It gets trickier when you start using cross-references, either using the xref element or a reltable. The problem is that while I can be sure the content that is being linked to is available at build time, I cannot be sure that the content is available in the target. To use an example, I cannot guarantee that "About Accounts," a help topic, is going to be in the PDF version of the API guide, so any content in the API guide that includes a link to "About Accounts" will break. "So what," you say. "As the API writer I won't put in one of those links."

Ah, but what if you are reusing content and that content includes a reference to "About Accounts"?  If it does, your guide has a broken link.  Here's the bad part - you cannot know that the content you reuse doesn't include that link.  The way to resolve that, as Eliot hinted at, is to bound your reuse.  You need to either ensure that elements with cross-references are not reused, or you need to test for broken links at build time.  We do the latter.

If a link in the PDF fails, we drop it.  The text is still there, but the link is gone.  The writer gets a message from the build indicating the issue, and then it's the writer's choice to either drop the reuse, change the source with the cross-reference, or let the build drop the link.
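The drop-the-link-and-warn behavior can be sketched in a few lines of Python. This is only an illustration, not the build described above: the function name, the warning text, and the use of a regular expression instead of a real XML parser are all assumptions made to keep the example short.

```python
import re

# Hypothetical pattern for an inline xref such as
# <xref href="about_accounts.dita">About Accounts</xref>.
# A real build would use an XML parser; a regex keeps the sketch short.
XREF = re.compile(r'<xref\s+href="([^"#]+)[^"]*"[^>]*>(.*?)</xref>', re.DOTALL)


def drop_broken_xrefs(topic_xml, topics_in_target):
    """Return (cleaned_xml, warnings) for one topic in one deliverable.

    An xref whose target file is not part of the deliverable is replaced
    by its link text, and a warning is recorded for the writer.
    """
    warnings = []

    def replace(match):
        href, text = match.group(1), match.group(2)
        if href in topics_in_target:
            return match.group(0)  # target ships in this output; keep the link
        warnings.append("dropped link to %s" % href)
        return text  # keep the words, lose the link

    return XREF.sub(replace, topic_xml), warnings
```

Called with the set of topic files that actually ship in the PDF, it returns the topic with broken links flattened to plain text plus the messages the writer would see. What the writer does next (drop the reuse, change the source, or accept the dropped link) stays a human decision.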

In a way I like the problem.  I have two reasons for that.

Our main focus is online content delivery. When your online project includes search and hierarchical navigation tools (like a table of contents, index, etc.), inline links are, IMO, a usability problem rather than a usability enhancement. First, because we need to style a link differently than non-link text, the link jumps out when a user scans the content. The user focuses on it. That leads to the second problem: a treasure hunt. The user scans the page, finds a link, clicks it, scans that page, sees a link, clicks it, etc. Next thing you know, they have scanned 10 different topics, the answer to their question wasn't as apparent as all the links, and the user gets frustrated: "I looked all over the doc and I couldn't find an answer." So, inline xrefs actually make the documentation less effective.

The second reason is a more subtle one.  A topic is the smallest piece of information that can stand on its own.  If you include an xref, is it really standing on its own?  It's a crutch that lets the writer be lazy.  If they need the information in the link to really grok the information in the topic, then the topic doesn't stand alone.  If the topic works without the link, then why include it?  You don't know what the user is trying to do, and trying to predict it with an inline link muddies the path the user would choose to follow.

22Feb/100

Testing IE – Virtual PC

Okay, I've complained a lot about Internet Explorer.  With good reason, in some cases, sometimes just because I'm complaining.

One of the things that has always bugged me about IE - you can only have one version on your computer.  That makes it tough to test versions 6, 7, and 8 without having multiple computers.

Microsoft stepped up and created virtual machine images for each version. See IE6 and IE7 Running on a Single Machine.

Do I want to be using multiple VMs to test a browser?  No, but at least now I can do it.

23Oct/090

Filterchains – just another reason to love ant

One of the projects I'm working on is conditional publishing of DITA content using ant. Basically what I want to do is, given a list of files, build the output that is affected when those files change. We manage our files using Perforce, so the workflow is something like this:

  1. Check out files
  2. Edit/modify content
  3. Run conditional build
  4. Review output
  5. Check-in files

When you check files out of Perforce, they get put into a changelist, so, to get my file list, I need to ask Perforce what files are in the changelist. Perforce has a nice way to do that. The command is,

p4 opened -c

The problem is the output includes a bunch of information I don't want.  I get lines like this:

//doc/main/core/build/build.xml#164 - edit change 1075456 (text) by sanderson@docbuild

What I want is

//doc/main/core/build/build.xml

or, even better,

files-in-changelist=/home/sanderson/doc/main/core/build/build.xml

That is the format ant expects for a property file. Property files are nice, because you can load them up and use that property in ant.
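To make the target transformation concrete, here is a minimal Python sketch of it. It is not the ant filterchain itself, just the same before-and-after shown above. The function name, the mapping from the depot root //doc to /home/sanderson/doc (inferred from the two example lines), and the comma used to join several files are assumptions.

```python
import re

# Matches the depot path at the start of a "p4 opened" line, up to the "#rev".
OPENED_LINE = re.compile(r"^(//[^#]+)#\d+ - ")


def to_property(p4_output, depot_root="//doc", local_root="/home/sanderson/doc"):
    """Turn "p4 opened" output into one property line.

    depot_root and local_root are assumed values taken from the example
    lines; joining multiple paths with a comma is also an assumption.
    """
    paths = []
    for line in p4_output.splitlines():
        match = OPENED_LINE.match(line)
        if match and match.group(1).startswith(depot_root):
            # Swap the depot prefix for the local workspace prefix.
            paths.append(local_root + match.group(1)[len(depot_root):])
    return "files-in-changelist=" + ",".join(paths)
```

Everything after the depot path (revision, action, changelist number, file type, user) is simply discarded, which is all the cleanup the property file needs.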

What to do, what to do? In my case, I turn to ant. If I have a build issue, someone else has probably run into it, so, I look there first.

Sure enough, ant has a type called filterchain for exactly these kinds of uses. Here's what I wound up with:



  
      
          
          
          
      
      
          
             
             
             

          
      

Just another reason to love ant.

12Nov/070

Using EXSLT date-time functions in Saxon 6.5.5

I needed to put a time-stamp in a file I was generating with XSLT and Saxon. Saxon supports parts of EXSLT, and one of the parts supported is date:date-time(). It's kind of challenging to figure out exactly how to make it work, though, at least it was for me. In hopes that I'll save someone else some work, here's how to add a time-stamp using Saxon 6.5.5.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:saxon="http://icl.com/saxon"
xmlns:exsl="http://exslt.org/common"
xmlns:date="http://exslt.org/dates-and-times"
xmlns:func="http://exslt.org/functions"
extension-element-prefixes="saxon exsl date func">
<!-- The date prefix must be bound to the EXSLT dates-and-times namespace above -->
<xsl:template match="/">
Date-time: <xsl:value-of select="date:date-time()"/>
Date-year: <xsl:value-of select="date:year()"/>
Date-month-in-year: <xsl:value-of select="date:month-in-year()"/>
</xsl:template>
</xsl:stylesheet>

I hope that helps someone.

12Nov/060

oXygen and profiling

I ran into a little bit of a problem profiling an XSL transformation in oXygen today. Any errors at all in the transformation will cause the profiling to return no value. So, just as a note, make sure your transform doesn't return any errors, not even errors related to being unable to open files using the document function.

7Apr/060

MarkLogic on Debian

I really like the MarkLogic product offering. Not just because it's fast and stable, but also because they offer a community license (other CMS and XML database vendors take note - if I can't try it, I probably won't recommend it to my company). So, I decided to install it on my Debian box today. Here's my story.
First off, I didn't have the alien or daemon package installed. You'll need those (among others), so make sure you have those installed. I'm not going to list every package you need. If you're using Debian, you can probably figure that out yourself.

  1. Download it from MarkLogic. I'm running on a 32-bit processor, so I downloaded the Red Hat 32-bit rpm.
  2. Use alien to convert the rpm to a deb.
  3. Use dpkg -i debfile
  4. mkdir /var/lock/subsys
  5. Edit /etc/init.d/MarkLogic and change the line
    . /etc/rc.d/init.d/functions
    to
    #. /etc/rc.d/init.d/functions
  6. Start the server - /etc/init.d/MarkLogic start
  7. Follow the rest of the install guide
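If you would rather script step 5 than edit the init script by hand, something like this Python sketch would do it. It only transforms the text of the script; reading and writing /etc/init.d/MarkLogic (as root) is left to you, and the function name is made up for the example.

```python
def comment_out_functions(init_script):
    """Comment out the line that sources the Red Hat init functions.

    Takes the text of the init script and returns the edited text.
    Lines that are already commented out are left alone.
    """
    target = ". /etc/rc.d/init.d/functions"
    lines = []
    for line in init_script.splitlines():
        if line.strip() == target:
            line = "#" + line  # Debian has no /etc/rc.d, so skip sourcing it
        lines.append(line)
    return "\n".join(lines) + "\n"
```

Running it twice is harmless, since the second pass no longer finds an uncommented match.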

It's really easy, so you have no excuse for not using a fully functional XML database.

6Apr/060

Day 2 and 3 at CMS Strategies

Day 2 and 3 at the content management strategies conference - quick hits.
The TEXTML team does a brilliant job of marketing their XML database as a CMS. It is, if you do a bunch of implementation work, but out of the box, it isn't a CMS.
DITA is starting to move to other communities, not just technical publication groups. Legal and other industries are just starting to move into it, but that's a great sign for the long term health of DITA.
Scott Wolff gave a very good presentation on the "5 Mistakes to Avoid when Buying a CMS". I won't steal his thunder (I'm sure he'll have the presentation on his web site), but here are the big things I took away from it:

  1. Buy a hosted CMS if one meets your needs. If it doesn't, buy an on-site CMS, if one fits your needs. If neither of those is possible, then, and only then, build your own.
  2. Start small, and build. Implement a CMS for one group in your company and then add other groups over time.
  3. Not every group needs a CMS

IBM still thinks putting all information that is accessed via conrefs into a single (or multiple) shared files is a good plan. I disagree. It's no better than having a bunch of topics with a single element (the method I discussed in Conrefs and “Shared Content”) that are designed for sharing. Yes, you need to avoid a "spaghetti sharing model", but neither of these methods solves the problem.
Last, but not least - I can be a real loud-mouth when it comes to open forums.

4Apr/060

Content Management Strategies Conference, Day 1

I'm attending the Content Management Strategies conference here in San Francisco (you can take a look at the agenda if you are interested in it).
Some quick comments on day 1.
Lots of people! Many more than last year's conference in Annapolis. The welcome presentation said it was 68% larger, and over 300 people were there.
The vendor area is too crowded. I wouldn't be very happy if I was a vendor.
The keynote presentation was slanted, and a bit dull. There was an assumption that using an enterprise CMS is the only way to fly, yet what a CMS is wasn't defined and the reasons why a CMS is required were not very detailed.

1Apr/060

Search and context

John Battelle came by to talk to a bunch of us folks at salesforce.com yesterday. He gave an interesting talk that combined points from his book (The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture ) and some other, related comments. One of the things he talked about is a search scenario he recently posted on his blog: Google Launches Biz Local AdWords: It's Just the Start..... Go read the scenario and then come back here (I know his blog is much more interesting than mine, but let's finish this dialog, then go read more there).

The scenario is interesting from my point of view as an information designer that wants to make information access more effective. The truth is, for most things, the problem isn't finding the information, it's finding the appropriate information. John's scenario is very similar to an existing search, but adds two contexts - location (the shopper is in a particular location) and interest (s/he wants to buy a bottle of wine, known from the location [in a grocery store's wine section] but also from the search engine choice).

The context is what makes the search results valuable. I can do this search now while at the grocery store using my Nokia 770. The information I want will be retrieved, but since the search engine won't know the context, I will have to filter it myself to gain the same value. I probably won't be very successful because there is too much information for me. People are amazing pattern matchers, but we cannot effectively handle the volume of content such a search would return.

Traditional documentation does the same thing. We put out there a bunch of information, and the users have to use the table of contents, index, or search to find the content they want, and then they have to filter it for their context. We need to add context if we're going to solve the information glut problem.

Context sensitive help is a step in the right direction. We link from a particular UI to a particular bit of information. That's good, but it's very rare that a UI is simple enough that one help topic can be sure to give the user exactly what they want. Usually the content is too broad, forcing the user to dig into it (which takes away most of the advantages you gain from the context sensitive link). Sometimes the content is too narrow, which is even worse.
Contextual embedded help is the next step. Some software products already do this. Tax software programs are a great example. They give you information on how to complete each task without forcing you to ask for help. They don't forget though, that users often need more help, and they include links to that as well.

One thing that I haven't seen in embedded contextual help is something I call "the onion" for lack of a better term.
Embedded context help needs to be specific enough to be useful, but if it's too specific, it might not help the user put the current task in the context of their goal (quick aside, a task is a specific action, and a goal is what you want to accomplish - for example, figuring out your gross income is a task, completing your taxes is a goal). Each embedded context help topic needs to include a weighted related topics list that gets more general the further out you go. Depending on the topic, it may also have a related topics list that gets more specific.

Here's an example. My goal is to complete my taxes. My current task is to compute my gross income. When I get to the UI page for doing that, the embedded help topic is "Computing your gross income". It has two sets of links, one a list of more specific tasks ("Calculate earned interest income", "Calculate income from salary", etc.). The second set of links is more general and includes concepts about gross income, other types of income, and a link to the overview of the entire tax preparation process.

It's an onion rather than a ladder because you may move out to a more general layer, but it may be to a topic that isn't a direct hierarchical parent of the topic you are looking at. For example, from computing your gross income, you may move to a conceptual topic on the alternative minimum tax. Related topics, sure, but you wouldn't put them in the same information hierarchy.
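One way to picture the onion as data is a topic that carries a list of related topics, each tagged with a layer number instead of a parent. The Python sketch below is only an illustration of that idea: the class name, the layer numbers, and the title "About gross income" are invented for the example, and the weighting is reduced to a simple integer.

```python
from dataclasses import dataclass, field


@dataclass
class HelpTopic:
    title: str
    # (layer, title) pairs. Negative layers are more specific, positive
    # layers are more general. The numbers are illustrative weights.
    related: list = field(default_factory=list)

    def more_specific(self):
        """Related topics that drill down from this one."""
        return [t for layer, t in self.related if layer < 0]

    def more_general(self):
        """Related topics ordered from the nearest layer outward."""
        outer = [pair for pair in self.related if pair[0] > 0]
        return [t for layer, t in sorted(outer, key=lambda pair: pair[0])]


topic = HelpTopic(
    "Computing your gross income",
    [
        (-1, "Calculate earned interest income"),
        (-1, "Calculate income from salary"),
        (3, "Tax preparation overview"),
        (1, "About gross income"),
        (2, "Alternative minimum tax"),
    ],
)
```

Note that nothing in the structure says "Alternative minimum tax" is a parent of the gross income topic. It just sits in a layer further out, which is the difference between an onion and a ladder.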

There's a huge information design problem that needs to be solved to make this model work, but I think it's worth the effort.

30Mar/060

A neat idea for conref dependencies?

Still on conrefs and dependencies.
I think I've just come up with a neat idea for handling conrefs in a mythical CMS (a reminder, we don't use a CMS, so I don't know if any CMS already does this or could do this).
When I create a conref in Topic B to an element in Topic A, the CMS includes metadata that indicates which version of Topic A I'm conref'ing from. A writer changes Topic A. I open Topic B for editing, and my CMS tells me, "Topic A/elementname has changed. Do you want to review the change or keep your existing content?" If I review it, I see the standard three pane merge tool view - my current topic with the old content, my topic with the new content, and Topic A. There I can choose to consume the content from Topic A (the conref is unchanged), reject it completely (losing all content for that element), or keep the previous version of the content (the element with the conref gets replaced by an element with the content from the earlier version of Topic A).
The second part of this would be reporting on conrefs. Before publishing, you run a report that notifies you of all the topics with conrefs to versions of topics that are not the most recent version and have not been resolved. This allows you to see at a glance conrefs that may no longer be valid.
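Since the CMS is mythical, so is this code, but the metadata and the report are simple enough to sketch in Python. Every name here (the Conref record, its fields, the stale_conrefs function, integer version numbers) is hypothetical and only meant to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass
class Conref:
    referencing_topic: str   # the topic that contains the conref, e.g. "Topic B"
    source_topic: str        # the topic being pulled from, e.g. "Topic A"
    element: str             # the element in the source topic
    pinned_version: int      # version of the source when the conref was made
    resolved: bool = False   # True once a writer has reviewed the change


def stale_conrefs(conrefs, current_versions):
    """List conrefs that point at an older version and are still unreviewed."""
    return [
        c for c in conrefs
        if c.pinned_version < current_versions[c.source_topic] and not c.resolved
    ]
```

The pre-publish report is then just the output of stale_conrefs for the whole doc set. An empty list means every conref either points at the latest version of its source or has been looked at by a writer.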
A neat idea or mad ravings of a hungry geek? You tell me.