CODAP Case Clustering

Last updated Jan 10, 1997

Case clustering is a method of grouping together incumbents who perform similar jobs (Archer, 1966). The similarity of jobs is normally based on the overlap in the percent of time spent on tasks. The standard CODAP clustering method uses a hierarchical approach that requires the execution of two programs: OVRLAP and GROUP.

The program OVRLAP computes a similarity measure between every pair of cases contained in a Case Data file. This measure is computed for each pair of cases by comparing the corresponding responses for all tasks. OVRLAP can process a maximum of 7000 cases and 7000 tasks. The output file is a similarity matrix.
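As a rough illustration of the pairwise comparison described above, the following Python sketch builds a small similarity matrix. It assumes that overlap is the sum, across tasks, of the smaller of the two cases' percent-time values; the text above does not spell out OVRLAP's exact formula, so treat this as one plausible reading rather than the program's actual code. The function names and the sample ratings are hypothetical.

```python
def percent_time(ratings):
    """Convert raw task ratings into percent time spent (values sum to 100)."""
    total = sum(ratings)
    if total == 0:
        return [0.0 for _ in ratings]
    return [100.0 * r / total for r in ratings]


def overlap(case_a, case_b):
    """Assumed overlap measure: sum over tasks of the smaller percent-time value."""
    return sum(min(a, b) for a, b in zip(case_a, case_b))


def similarity_matrix(cases):
    """Symmetric matrix of pairwise overlaps for a list of percent-time profiles."""
    n = len(cases)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            matrix[i][j] = matrix[j][i] = overlap(cases[i], cases[j])
    return matrix


# Three hypothetical cases rated on four tasks.
raw = [[5, 5, 0, 0], [5, 0, 5, 0], [0, 0, 3, 7]]
cases = [percent_time(r) for r in raw]
sim = similarity_matrix(cases)
```

Under this assumption, two cases with identical percent-time profiles overlap 100, and two cases with no tasks in common overlap 0.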

The program that actually performs the hierarchical clustering is GROUP (Phalen, 1975). GROUP uses the similarity matrix produced by OVRLAP as its input, and creates an output Cluster Solution file which records the details of the clustering process. GROUP's output report shows the groups that merged at each stage of the clustering process. This report is referred to as a group membership listing (Phalen and Christal, 1973).
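The stage-by-stage merging that GROUP reports can be sketched in Python as follows. This is a minimal illustration, not GROUP's implementation: it assumes that at each stage the two groups with the highest average pairwise overlap are merged (an average-linkage rule), which the text above does not confirm. The function names and the sample matrix are hypothetical.

```python
def average_overlap(sim, group_a, group_b):
    """Mean similarity over all pairs with one member in each group."""
    values = [sim[i][j] for i in group_a for j in group_b]
    return sum(values) / len(values)


def cluster(sim):
    """Merge groups until one remains; return one record per merge stage."""
    groups = [[i] for i in range(len(sim))]
    stages = []
    while len(groups) > 1:
        best = None
        # Find the pair of groups with the highest average overlap.
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                score = average_overlap(sim, groups[a], groups[b])
                if best is None or score > best[0]:
                    best = (score, a, b)
        score, a, b = best
        stages.append((groups[a], groups[b], score))
        merged = groups[a] + groups[b]
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]
    return stages


# Hypothetical similarity matrix for four cases.
sim = [
    [100.0, 80.0, 20.0, 10.0],
    [80.0, 100.0, 30.0, 20.0],
    [20.0, 30.0, 100.0, 70.0],
    [10.0, 20.0, 70.0, 100.0],
]
for stage, (left, right, score) in enumerate(cluster(sim), start=1):
    print(stage, left, right, score)
```

Each printed line plays the role of one entry in a group membership listing: the stage number, the two groups that merged, and their average overlap at the time of the merge.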

MPATH allows the user to reorder the presentation sequence based on the average value of any specified variable(s). MPATH does not alter the clustering solution; it simply creates a new presentation sequence. The primary variable is used for the sort, with the secondary and tertiary variables used to break ties. The most common application uses C0007 (number of tasks performed) as the sort variable. The resulting DIAGRM report will display groups with the higher average number of tasks performed to the left. MPATH also displays a revised group membership listing.
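The reordering idea can be sketched in Python as a sort on group averages. This is only an illustration under stated assumptions: it orders a flat list of groups, whereas MPATH works on the full clustering solution, and it sorts the tie-breaking variables in descending order as well. The variable name GRADE, the function name, and the sample records are hypothetical; C0007 is the variable named in the text.

```python
from statistics import mean


def reorder_groups(groups, records, sort_vars):
    """Order groups by descending average of each sort variable.

    The first variable is the primary key; later ones only break ties.
    """
    def key(group):
        return tuple(mean(records[m][v] for m in group) for v in sort_vars)

    return sorted(groups, key=key, reverse=True)


# Hypothetical case records keyed by case number.
records = {
    1: {"C0007": 40, "GRADE": 5},
    2: {"C0007": 60, "GRADE": 4},
    3: {"C0007": 120, "GRADE": 6},
    4: {"C0007": 100, "GRADE": 7},
    5: {"C0007": 50, "GRADE": 7},
}
groups = [[1, 2], [3, 4], [5]]
ordered = reorder_groups(groups, records, ["C0007", "GRADE"])
```

Here groups [1, 2] and [5] tie on average C0007, so the secondary variable decides their order. Group membership is unchanged; only the left-to-right sequence differs.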

The DIAGRM report is the primary output of the clustering process. DIAGRM is run to show the major points in the clustering process, and has flexibility with respect to the level of detail reported (Phalen and Christal, 1973). The final output report is a tree structure which may be up to 50 printed pages in length and 20 pages in width. The size of the output may be adjusted by altering the minimum starter group size, specifying a restricted KPATH range, and/or changing the minimum average overlap between requirements.

Each diagram box, which represents a stage in the diagram, displays information about that stage. The first line gives the stage number and the number of members in that stage, respectively. The second line gives the KPATH range, and the last line reports the average overlap between and average overlap within, respectively. DIAGRM also provides a group membership listing of all selected starter groups.
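The two overlap figures on the last line of a diagram box can be illustrated with a short Python sketch. It assumes that "average overlap between" is the mean similarity across pairs drawn from the two groups merging at a stage, and that "average overlap within" is the mean similarity across all pairs inside the merged group; these definitions are assumptions, since the text above does not define them. The function names and the sample matrix are hypothetical.

```python
from itertools import combinations


def average_between(sim, group_a, group_b):
    """Mean similarity over pairs with one member in each merging group."""
    values = [sim[i][j] for i in group_a for j in group_b]
    return sum(values) / len(values)


def average_within(sim, group):
    """Mean similarity over all distinct pairs inside one group."""
    pairs = list(combinations(group, 2))
    if not pairs:
        return None  # undefined for a single-member group
    return sum(sim[i][j] for i, j in pairs) / len(pairs)


# Hypothetical similarity matrix for four cases.
sim = [
    [100.0, 80.0, 20.0, 10.0],
    [80.0, 100.0, 30.0, 20.0],
    [20.0, 30.0, 100.0, 70.0],
    [10.0, 20.0, 70.0, 100.0],
]
left, right = [0, 1], [2, 3]
between = average_between(sim, left, right)
within = average_within(sim, left + right)
```

In this example the two groups are internally similar but share little with each other, so the between value is well below the within value of either group taken alone.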
