For the rest of this article, I'll focus on contextual inquiry, which is the initial step of the contextual design process.
The point of contextual inquiry is to collect data that allows the development team to understand the user and the type of work that the user does. Understanding the user is the goal, but contextual inquiry gets you only halfway there: after the inquiry, the development team still needs to build work models based on the data collected. The work models themselves aren't part of contextual inquiry, so I won't address them here; suffice it to say that contextual inquiry produces the data that drive the construction of the work models.
Contextual inquiry is something like an interview between IT team members and business users, though "interview" is not exactly correct, since it connotes a question-and-answer session, and that isn't really what the inquiry is about. It might be better to think of a contextual inquiry as an observation session, where an analyst from the IT team meets with a business user in the business user's workspace (office, cube, whatever) and observes the business user performing real-life business work. The analyst asks questions throughout, but these are mostly requests for clarification that arise from the task at hand, rather than questions prepared in advance. In a good contextual inquiry, the analyst acts as an apprentice to the business user, who is trying to demonstrate and explain the job to the analyst in a concrete fashion. This means that it's the business user who's doing most of the driving, and not the analyst. The outcome, if things go well, is that the analyst has a good sense of the type of work that his or her particular user does.
To generalize this knowledge across multiple users, the analysts who conduct the contextual inquiries bring their data back to the team and conduct modeling and interpretation sessions, as mentioned above. Because the inquiries serve as a basis for model building, they need to involve "sufficiently many" users: enough to reveal generalizations about the nature of the task domain. From the Beyer and Holtzblatt book, I believe that the recommended number of users is 10-20, but really it depends on the task domain. If the tasks are highly uniform across users, you might be able to get away with fewer. It's not really a matter of proving statistical significance; it is a judgment call.