3
Task Load Model

As shown in Figure 3-1, the task load model consists of four basic data and modeling components:

  1. A defined set of R-side tasks performed by a controller when working traffic in a sector;

  2. Analyses of traffic operations and flight-planning data to simulate the traffic activity in a sector during a time period and to indicate traffic events that trigger the occurrence of one or more of the defined R-side tasks;

  3. The calculation and assignment of time spent by the R-side controller in performing each triggered task; and

  4. Computation of total task load by summing the time spent by the R-side controller on these triggered tasks plus some additional time spent by the controller in monitoring the radar screen for all flights transiting the sector.
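
As a rough illustration of component 4, the sketch below sums per-minute task seconds into a windowed total. The record layout, the function names `one_minute_load` and `rolled_up_load`, and all sample values are assumptions made for illustration only; the 15-minute window centered on a given minute follows the rollup described later in this chapter, and monitoring is treated here as just another task.

```python
from collections import defaultdict

# Hypothetical per-minute records: (minute, task, aircraft_id, seconds).
# The values are invented for illustration; they are not model output.
records = [
    (10, "entry", "AC1", 9.5),
    (11, "entry", "AC1", 7.9),
    (12, "monitoring", "AC1", 2.0),
    (17, "exit", "AC1", 14.0),
    (30, "entry", "AC2", 17.4),
]


def one_minute_load(records):
    """Sum seconds of R-side work per minute across all tasks and aircraft."""
    per_minute = defaultdict(float)
    for minute, _task, _aircraft, seconds in records:
        per_minute[minute] += seconds
    return per_minute


def rolled_up_load(per_minute, n, half_window=7):
    """Task load for the 15-minute window centered on minute n (n-7 .. n+7)."""
    window = range(n - half_window, n + half_window + 1)
    return sum(per_minute.get(i, 0.0) for i in window)


per_minute = one_minute_load(records)
window_load = rolled_up_load(per_minute, 14)  # covers minutes 7 through 21
```

In this sketch, task times are fixed and simply added, which mirrors the chapter's stated assumption of no multitasking.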

Explanations of each of these components are provided in this chapter along with a summary of the description by the Center for Advanced Aviation System Development (CAASD) of its efforts to evaluate and validate the traffic simulations, key model parameters such as task times, and the task load output from the model. The chapter concludes with the committee’s assessment of each modeling element, including evaluation and validation efforts.

TASKS IN MODEL

Eight Triggered Tasks

Each of the triggered tasks is associated with the action of an aircraft under the control of a sector. In most cases, “tasks” are actually aircraft





[Figure 3-1 is a flow diagram with four numbered panels: (1) Eight Triggered R-Side Tasks; (2) R-Side Task Trigger Data; (3) Task Time Assignment; (4) R-Side Task Load Computation.]

FIGURE 3-1 CAASD task load model.

actions, or events, that require the lead controller to execute a series of procedures. For example, an aircraft entering a sector is referred to as an entry task. When an entry occurs, the lead controller must take a number of actions that involve verbal communication with the pilot and other controllers, such as identifying the aircraft, establishing a clearance plan, and accepting the hand-off. Only the R-side procedures handled by the lead controller are included in the model. For reasons explained in Chapter 2, the CAASD model does not estimate D-side task load.

The eight tasks in the model that are triggered by traffic activity are the following:

  • Entry, which encompasses the actions that the R-side controller must take in accepting the hand-off of an aircraft from another sector. Three types of hand-offs are distinguished in the model: those involving aircraft entering from international airspace, from a sector controlled by a different en route center, and from a sector controlled by the same center. Hand-offs are differentiated in this way in order to associate different task times with each.

  • Exit, which encompasses the actions that the R-side controller takes in handing off an aircraft to a downstream sector. Hand-offs are differentiated in the same manner as for the entry task.

  • Flash through is an entry in which the R-side controller handles the aircraft for less than 2 minutes. If after accepting the hand-off of an aircraft from an upstream sector, the controller determines that communication with the aircraft is not needed (e.g., the aircraft will be in the sector for a very short time and is separated from other aircraft), the controller will hand off the aircraft to the downstream sector without ever communicating with the pilot. The upstream sector controller will instruct the pilot to tune to the downstream sector’s radio frequency. Because the flash through task entails less work than the full entry task, the assigned task time is less than that for the entry task.

  • Nonradar arrival includes the actions taken by the R-side controller to provide services to aircraft arriving at an airport not offered the full range of radar services.

  • Nonradar departure includes the actions taken by the R-side controller to provide services to aircraft departing from an airport not offered the full range of radar services.

  • Transition consists of the actions performed by the R-side controller in clearing an aircraft to climb or descend and monitoring the aircraft to ensure that it is separated from other aircraft during the altitude transition.

  • Separation assurance requires the R-side controller to identify aircraft pairs that are projected to lose lateral and vertical separation, determine a maneuver to ensure that separation will not be lost, and issue the maneuver clearance. Identification of this task does not mean that the aircraft pair has lost separation, only that a controller’s attention was likely drawn to the pair to ensure separation.

  • Delay involves the vectoring performed by the R-side controller for traffic separation and coordination such as merging aircraft in a stream. It also involves other controller actions that increase (delay) or decrease (cut short) the amount of time that aircraft are controlled by the sector such as holding, rerouting, and diversion.

All eight of these tasks are triggered in the model by a specific aircraft action, or event, that can be identified through the flight data that are used to simulate the traffic experienced in each sector for given time periods. The one modeled R-side task that is omitted from this list is a more generalized task referred to as “monitoring.” This task entails scanning the radar display to maintain situational awareness and ensure that aircraft are following their clearances. All aircraft that transit the sector are assumed to require some amount of monitoring by the R-side controller. The modeling of the monitoring task is explained later in the discussion of task time derivation.

According to CAASD, these nine tasks are the main contributors to the lead controller’s R-side task load. The committee was told that they represent about 90 percent of the R-side tasks performed based on CAASD’s review of the literature and consultations with controller personnel and other subject matter experts.

TRAFFIC SIMULATIONS AND TASK TRIGGERS

Data Sources

The two sources of flight data used to simulate traffic and trigger tasks are the National Offload Program (NOP) and the CAASD Analysis Platform for En Route (CAPER). NOP is a collection of messages generated
by each en route center’s host computer system as traffic is worked through the center. It provides aircraft and sector identification, hand-off time, altitude, equipment type, origin, and destination. These data are used to trigger entries, exits, flash throughs, transitions, and nonradar arrivals and departures. NOP messages also provide the flight and equipment information needed to determine whether the flight is a military operation, for which higher times are allowed for some tasks (as explained later).

Because FAA does not currently maintain an operations data set that contains information to trigger separation and delay tasks, CAASD uses its own analytic model to develop this information. By analyzing flight-plan and tracking data from FAA’s Traffic Flow Management System, CAPER produces a four-dimensional projection of the path of an aircraft flying through the airspace from departure to arrival with associated transit times. Separation events are identified for aircraft pairs when the CAPER-modeled positions of two aircraft are predicted to come within a set of lateral and vertical threshold parameters. These parameters have been set to identify situations that would likely have drawn the attention of the R-side controller to ensure separation.¹

¹ According to CAASD, the algorithms used by CAPER to detect separation events are similar to the conflict probe algorithms in the User Request Evaluation Tool (URET) automation, which is used operationally by en route controllers. The URET prototype was developed by CAASD to assist controllers with timely detection and resolution of predicted separation problems. URET is designed for the D-side controller, who typically has a more strategic look-ahead for potential separation. The prototype was used operationally for over 5 years at en route centers in Memphis, Tennessee, and Indianapolis, Indiana, to develop requirements for a capability to be deployed nationwide.

The CAPER model is also used to trigger the reroute and hold delay tasks. It does this by capturing the change in the estimated time of arrival (ETA) as a flight enters and exits a sector. CAPER records relevant data for all active flights each time their modeled trajectories are updated. This process creates a historical record of the change in ETA for each flight including the time stamp for when the trajectory was updated and the reason for the update. For instance, a flight may enter a sector with an ETA of 1:00 p.m.; however, while it transits the sector it is vectored to provide the spacing needed to manage the traffic flows in a downstream sector. The extra time taken to vector is 5 minutes, and thus the ETA indicated upon exit from the sector will show 1:05 p.m. The model infers
the type of action taken by assessing the size of the change in ETA. For instance, a large change in ETA is inferred to be caused by holding rather than a simple rerouting. As explained later, each of these types of delay is assigned a different task time. Finally, changes in flight-plan information are also assessed by CAPER to identify reroutes and diversions.

Evaluation and Calibration of Data Sources

CAASD performs various checks on the simulated traffic and its data sources, particularly the CAPER tool used for separation and delay triggers. The CAPER output, for instance, is routinely examined against recorded weather data to ensure that during known periods of severe weather the sector traffic simulations show an increase in the occurrence of delay tasks from holds, reroutes, and diversions. Likewise, checks are made to ensure that during periods of heavy traffic involving high volumes of crossing and transitioning aircraft the model detects higher levels of the separation-related task load.

When the use of CAPER to trigger separation events was first considered, CAASD compared the separations identified by CAPER for all Indianapolis en route sectors with actual aircraft conflicts identified in the sectors by a prototype of FAA’s URET. Consistency in results led CAASD to conclude that CAPER would generate accurate separation event data for the task load model. Further analyses were performed to calibrate the parameters used by CAPER in detecting separation events. These analyses included the random sampling of URET conflicts and the checking of flight-track histories, flight-plan amendments, and controller–pilot voice recordings. In addition to the CAPER–URET comparisons, CAASD performed sensitivity analyses to evaluate the use of alternative parameter thresholds for lateral and vertical separation minimums as well as the maximum look-ahead time for CAPER to probe flights along their trajectories.

On the basis of these analyses and through an iterative process of calibrating the CAPER trajectory modeler and air-to-air separation probe algorithms, CAASD believes there is a strong correspondence between URET conflicts and CAPER separation events. Similar checks have apparently been performed on CAPER results used for triggering the delay events. In addition, CAASD continues to sample the modeled flights to verify whether the calculated delay is consistent with observed flight
operations. These comparisons with actual traffic observations are used to adjust the CAPER parameters when warranted.

TASK TIMES AND SCHEDULES

Derivation of Task Times

The time assigned to each triggered task is a critical element of the task load model, since the task load output is a time summation. In briefings to the committee, CAASD maintained that it encountered difficulty establishing task times from the literature. Model developers apparently could find little documentation on the task times used in other relevant efforts to model controller task performance, and in the few cases in which task times could be identified, the tasks did not represent the same set of actions or assumptions as those used in the CAASD model. Hence, whereas early versions of the model tried to use literature-based task times (which had to be modified and supplemented by estimates from operational experts), CAASD later concluded that another process was needed to derive these times.

CAASD’s investigation of options for deriving task times led it to select the modeling process known as Goals, Operators, Methods, and Selection Rules (GOMS) to develop many of the task times in the model. GOMS is a type of human information processor model that is used to predict user performance for a given task and to provide an estimate of how much time it takes to accomplish the task. In short, the model assumes that humans pursue tasks according to goals. Each goal is accomplished by employing various “operators” consisting of cognitive processes, perceptions, and motor actions. These operators are sequenced into “methods” that relate how the operators are used to accomplish the goal. Because there can be more than one method for accomplishing a goal, various selection rules (e.g., “if–then” statements) are employed to describe when a user would choose one method over another. In this way, GOMS modeling predicts the time it takes for a person to accomplish a task by associating specific times with each operator and then sequencing them according to the selected method.

Specifically, GOMS-generated task times are used in the CAASD model for the entry, exit, flash through, transition, separation, nonradar,
and shortcut (delay-related) tasks. Accordingly, CAASD had to decompose each of these tasks into constituent subtasks and then further convert each subtask into its execution method and the specific operators involved (e.g., uttering words, pressing keys) as modeled by GOMS. To identify the subtasks and their execution methods and operators, model developers consulted with subject matter experts. The nature of these consultations and the methods used were not documented or explained to the committee other than to describe them as involving an iterative process until experts were satisfied with the decomposition of subtasks into operators. CAASD then simulated the actions for each operator to develop execution times. Documentation on the method of simulation was not provided, although apparently these simulations did not attempt to capture all of the perceptual or cognitive processes that affect execution time, but only the time required to perform motor actions. While CAASD informed the committee that the GOMS estimates do include subtasks such as identification of the problem, priority-ranking of the problem, and generation of a problem resolution, the potential impact on task time estimates from not including the time required for perceptual and cognitive processes was never addressed in depth.

The subtasks associated with each of the eight triggered tasks identified above, and their GOMS-generated execution times, are shown in Tables 3-1 and 3-2. It merits noting that GOMS is not used for most of the delay tasks because model developers could not identify constituent subtasks. Accordingly, CAASD consulted with subject matter experts to establish the times assigned to the delay-related actions of rerouting, holding, and diverting. To establish these times, traffic replays were presented to the experts, who estimated the task completion time. These delay task times range from 25 to 75 seconds.

The consultations with subject matter experts also led CAASD to conclude that longer task times were needed for certain types of traffic, particularly flights involving propeller and military aircraft. For this particular traffic, assigned task times are increased by 25 percent for entry, exit, and nonradar arrivals and departures, under the assumption that additional communication and coordination are required. Although the validity of this adjustment factor was not researched by the committee, CAASD maintains that it was derived from information in FAA’s Position Classification Standard for Air Traffic Control Series.
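
The GOMS idea described above (operator times sequenced by a method, with a selection rule choosing the method) can be illustrated with a minimal sketch. The operator times are the GOMS values the chapter reports in Table 3-3; the two methods, their step counts, and the selection rule are hypothetical and are not CAASD's actual decomposition.

```python
# Operator times in milliseconds: the GOMS values reported in Table 3-3.
OPERATOR_MS = {"syllable": 150, "keystroke": 280, "fixation": 500}

# Hypothetical methods, each an ordered list of (operator, count) steps.
# The step counts are illustrative assumptions only.
METHODS = {
    "voice_clearance": [("fixation", 2), ("syllable", 17)],
    "keyboard_entry": [("fixation", 2), ("keystroke", 5)],
}


def select_method(pilot_on_frequency):
    """Hypothetical if-then selection rule choosing between the two methods."""
    return "voice_clearance" if pilot_on_frequency else "keyboard_entry"


def predicted_time_ms(method_name):
    """Predict execution time by summing operator times along the method."""
    return sum(OPERATOR_MS[op] * count for op, count in METHODS[method_name])


voice_ms = predicted_time_ms(select_method(True))      # 2*500 + 17*150
keyboard_ms = predicted_time_ms(select_method(False))  # 2*500 + 5*280
```

Like the simulations the chapter describes, a sum of this kind only reflects the operators that are listed; any perceptual or cognitive time left out of a method is simply absent from the estimate.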

TABLE 3-1 Times Assigned to R-Side Tasks and Subtasks

Event or Main Task    Subtask                             Subtask Time (s)    Task Time (s)
Entry                 Identify aircraft                   1.5                 17.4
                      Establish clearance plan            6.4
                      Pilot call-in, hand-off accepted    9.5
Exit                  Identify aircraft                   1.5                 14.0
                      Automated hand-off                  1.3
                      Change frequency                    11.2
Flash through         Identify aircraft                   1.5                 12.3
                      Verify flight path exit sector      1.3
                      Contact center controller           9.5
Transition            Identify aircraft                   1.5                 14.6
                      Determine altitude                  1.6
                      Determine clear path                1.4
                      Issue clearance                     5.4
                      Listen to readback                  4.7
Nonradar arrival      Identify aircraft                   1.5                 52.3
                      Altitude assignment/restriction     9.3
                      Traffic advisory                    11.8
                      Issue approach clearance            9.7
                      Weather issuance                    9.3
                      Change frequency                    10.7
Nonradar departure    Identify aircraft                   1.5                 21.0
                      Altitude verification               1.9
                      Radar identification                8.3
                      Altitude assignment/restriction     9.3
Separation            Identify aircraft                   1.5                 27.6
                      Determine Vector 1                  1.3
                      Ensure clear path                   1.4
                      Issue Vector 1 clearance            9.1
                      Determine Vector 2                  1.3
                      Ensure clear path                   1.4
                      Issue Vector 2 clearance            7.5
                      Listen to readback                  4.1
Delay (shortcut)      Identify aircraft                   1.5                 11.0
                      Determine clear path                1.4
                      Issue direct clearance              5.0
                      Listen to readback                  3.1
Delay (low)           None                                                    25.0
Delay (medium)        None                                                    35.0
Delay (high)          None                                                    45.0
Delay (reroute)       None                                                    60.0
Delay (diversion)     None                                                    75.0
Delay (hold)          None                                                    3 s/min in holding
Monitoring            None                                                    Variable

s = seconds; min = minutes.
NOTE: When aircraft is military or propeller, all task times for entry, exit, and nonradar arrivals and departures increase 25 percent.
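
To make the arithmetic behind Table 3-1 concrete, the following sketch sums subtask times into a task time and applies the 25 percent increase from the table note. The subtask values are copied from the table for three tasks; the dictionary layout, the `task_time` function, and the flight-type labels are assumptions introduced for illustration.

```python
# Subtask times in seconds, copied from Table 3-1 for three of the tasks.
SUBTASKS = {
    "entry": [
        ("Identify aircraft", 1.5),
        ("Establish clearance plan", 6.4),
        ("Pilot call-in, hand-off accepted", 9.5),
    ],
    "exit": [
        ("Identify aircraft", 1.5),
        ("Automated hand-off", 1.3),
        ("Change frequency", 11.2),
    ],
    "transition": [
        ("Identify aircraft", 1.5),
        ("Determine altitude", 1.6),
        ("Determine clear path", 1.4),
        ("Issue clearance", 5.4),
        ("Listen to readback", 4.7),
    ],
}

# Tasks whose times increase 25 percent for military or propeller aircraft.
UPLIFT_TASKS = {"entry", "exit", "nonradar arrival", "nonradar departure"}
UPLIFT_FACTOR = 1.25


def task_time(task, flight_type="jet"):
    """Return the task time in seconds; flight_type labels are hypothetical."""
    base = sum(seconds for _name, seconds in SUBTASKS[task])
    if flight_type in {"military", "propeller"} and task in UPLIFT_TASKS:
        return base * UPLIFT_FACTOR
    return base


entry_jet = task_time("entry")                        # sum of the three subtasks
entry_military = task_time("entry", "military")       # entry time with the uplift
transition_prop = task_time("transition", "propeller")  # transition is not uplifted
```

Transition, separation, and delay tasks are left unchanged for every flight type here because the table note applies the increase only to entry, exit, and nonradar arrivals and departures.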

TABLE 3-2 Task Scheduling Distributions

Task                  Distribution Minutes           Distribution Type
Entry                 E to E + 3                     Quasi-uniform
Exit                  X − 2 to X + 2                 Custom
Flash through         E to E + 1                     Uniform
Transition            ½(E + X) to ½(E + X) + 1       Uniform
Nonradar arrival      E + 4 to X                     Uniform
Nonradar departure    E − 2 to E + 3                 Uniform
Separation            S − 6 to S + 1                 Quasi-uniform
Delay                 E to X                         Uniform

E = minute of hand-off from upstream sector
X = minute of hand-off to downstream sector
S = minute separation event begins

To assign times to the monitoring task, CAASD evaluated several alternative approaches. At first, the model assumed that a fixed amount of controller time is spent monitoring each aircraft transiting the sector. However, because some of the task times already presumed a certain amount of monitoring, especially in the case of the time-consuming delay tasks, CAASD was concerned that such a fixed time would lead to double counting and an overestimation of total monitoring task time for some flights. Model developers therefore created an algorithm to produce a monitoring task time for each aircraft transiting a sector. The algorithm assigns a monitoring time per minute a flight is in the sector. The assigned time varies depending on the composition of the R-side task load generated by the flight. Hence, if the flight’s R-side task load consists of many tasks identified as being complex (such as delays that already include a large amount of monitoring time), the algorithm selects a lower monitoring rate per minute because it is assumed that a large amount of monitoring time is already included in the flight’s R-side task load.

Task Scheduling

As discussed earlier, many of the tasks used in the model are made up of subtasks. These subtasks are presumed to be performed by the R-side controller in a defined order but interspersed with subtasks being performed for other tasks. Because the model computes task load for 1-minute intervals to obtain rolling 15-minute task load estimates, CAASD needed
a method for distributing the subtasks over the entire span it takes to complete the full task. The model therefore sequences subtasks in a pattern that is thought to be typical for the task and over a completion period that is considered typical.² These sequencing and scheduling profiles, which are shown in Tables 3-1 and 3-2, were determined through consultations with subject matter experts. As shown in Table 3-2, the entry, exit, and separation tasks are assumed to have nonuniform distributions, since the majority of the task time is scheduled at the point when the task is triggered. Time for transition, flash through, and delay tasks is uniformly distributed over the life of the flight in the sector. The time for nonradar arrivals and departures is uniformly distributed on the basis of the entry time identified by the trigger.

² For example, a sector entry is identified from a hand-off message contained in NOP data, with the time of the message being noted. The entry task time is then distributed around the time of the hand-off message relative to a typical sequence of actions, or subtasks, that a controller performs for a hand-off. The larger portion of the task time is scheduled at the minute that the hand-off is accepted in consideration of the actions a controller typically performs when accepting a hand-off: determining that the aircraft is not in conflict with other aircraft and that it is following its recorded flight plan. The remaining time is spread over the few minutes following the hand-off message representing the time the controller would be monitoring the aircraft until it actually crosses the boundary into the controller’s airspace.

Evaluation of Task Times

CAASD selected GOMS to model task times because it was viewed as providing efficiency and flexibility, since the time estimate can be calculated with relatively little effort if the operators and methods are known and accurate operator time data are available. In discussion with model developers, committee members noted that GOMS models are more commonly used to assess user performance across various prototype products and systems in which there are few practical opportunities for direct observation of human performance in operational or experimental settings, such as evaluating alternative workstation layouts and computer interfaces. When questioned about the applicability of GOMS to the air traffic control environment, which is an observable operational setting, the model developers restated their belief that GOMS offers the needed efficiency and flexibility and provided the 12 literature sources identified in Box 3-1
as containing examples of previous research supporting the use of GOMS for modeling task times in situations similar to the air traffic control domain. While the committee did not review each of these sources, it notes that only a few (4 of the 12) appear to involve air traffic control tasks.

Questioned on how the GOMS times were validated, CAASD pointed to the limited comparisons that have been made with experimental data from human-in-the-loop (HITL) experiments conducted in 2008 by FAA. As shown in Tables 3-3 and 3-4, when the GOMS operator times are compared with the HITL times, the former are found to be 10 to 20 percent lower than the latter. According to CAASD, these GOMS error rates are comparable with GOMS error rates generally, as observed from the literature cited in Box 3-1.

TABLE 3-3 Comparison of GOMS and HITL Operator Times by Data Source and Operator

Operator              GOMS Value    HITL Average    HITL Standard Deviation    GOMS Value as a Percent of HITL Average
Syllable utterance    150 ms        187 ms          12.3 ms                    80.2
Keystroke             280 ms        247 ms          25.7 ms                    88.2
Fixation              500 ms        542 ms          117.3 ms                   92.3

ms = milliseconds.
SOURCE: CAASD submission to committee.

TABLE 3-4 Comparison of GOMS and HITL Times: Times Aggregated to Task Level with Typical Usage*

Typical Usage    GOMS        HITL Average    GOMS Value as a Percent of HITL Average
17 syllables     2,550 ms    3,181 ms        80.2
5 keystrokes     1,400 ms    1,237 ms        88.4
2 fixations      1,000 ms    1,083 ms        92.3

*A typical usage would be a pilot readback consisting of 17 syllables.
ms = milliseconds.
SOURCE: CAASD submission to committee.

However, because the FAA HITL experiments were not conducted for the specific purpose of developing model times, they could only be used to assess some of the task times. CAASD has therefore consulted with
operational experts to obtain their opinions on the validity of the GOMS times. Although the nature of these consultations was not explained, they apparently led CAASD to conclude that the GOMS times are generally reasonable but require some adjustment to represent the times associated with speaking, which the experts thought were too high. The experts consulted also raised questions about the model’s basic assumption that tasks are performed sequentially rather than in parallel in some instances.

BOX 3-1 Sources Cited by CAASD in Support of Using GOMS Modeling for Deriving Controller Task Times*

Card, S., T. P. Moran, and A. Newell. 1983. The Psychology of Human–Computer Interaction. Lawrence Erlbaum Associates, Mahwah, N.J.

Endestad, T., and P. Meyer. 1993. GOMS Analysis as an Evaluation Tool in Process Control: An Evaluation of the ISACS-1 Prototype and the COPMA System. Technical Report HWR-349. Organization for Economic Cooperation and Development Halden Reactor Project. Institute for Energiteknikk, Halden, Norway.

Estes, S., C. Bonaceto, K. Long, S. Mills, and F. Sogandares. 2009. Carbon Copy: The Benefits of Autonomous Cognitive Models of Air Traffic Controllers in Large-Scale Simulations. In Proceedings of the 8th USA/Europe Air Traffic Management Research and Development Seminar, Napa, Calif.

Gong, R. 1993. Validating and Refining the GOMS Model Methodology for Software User Interface Design and Evaluation. PhD dissertation. University of Michigan, Ann Arbor.

Gray, W. D., B. E. John, and M. E. Atwood. 1993. Project Ernestine: A Validation of GOMS for Prediction. Human–Computer Interaction, Vol. 8, No. 3, pp. 237–309.

Irving, S., P. Polson, and J. E. Irving. 1994. A GOMS Analysis of the Advanced Automated Cockpit. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Celebrating Interdependence, April 24–28, Boston, Mass., pp. 344–350.

Kieras, D. E., S. Wood, K. Abotel, and A. Hornof. 1995. GLEAN: A Computer-Based Tool for Rapid GOMS Model Usability Evaluation of User Interface Designs. In Proceedings of the 8th Annual Association for Computing Machinery (ACM) Symposium on User Interface and Software Technology, Nov. 15–17, Pittsburgh, Pa., pp. 91–100.

Lee, A. 1992. Accuracy of MHP/GOMS Predictions for the Task of Issuing Recurrent Commands. In ACM–Special Interest Group on Human–Computer Interaction (SIGCHI) Conference on Human Factors in Computing Systems, Monterey, Calif., pp. 105–106.

Lee, S. M., U. Ravinder, and J. C. Johnston. 2005. Developing an Agent Model of Human Performance in Air Traffic Control Using APEX Cognitive Architecture. In Proceedings of the 2005 Winter Simulation Conference, Orlando, Fla., Vols. 1–4, pp. 979–987.

Nesbitt, K., D. Gorton, and J. Rantanen. 1994. A Case Study of GOMS Analysis: Extension of User Interfaces. Technical Report BHPR/ETR/R/94/048. Newcastle Laboratories, Wallsend, New South Wales, Australia.

Ravinder, U., R. W. Remington, and S. Lee. 2005. A Reactive Computational Model of En-Route Controller. In Proceedings of the 2005 IEEE International Conference on Systems, Man and Cybernetics, Oct. 10–12, Waikoloa, Hawaii, pp. 1628–1633.

Smith, E. C. 2008. Flight Management System Execution Task Time Modeling for Loading Terminal Area Navigation Procedure Changes. In Proceedings of the Human Factors and Ergonomics Society 52nd Annual Meeting, Vol. 52, No. 13, pp. 912–916.

* GOMS Modeling for the MITRE En Route Workload Model, a briefing presented by MITRE to the TRB Committee for a Review of the En Route Air Traffic Control Complexity and Workload Model, March 2010.

When the committee asked about the possibility of performing dedicated HITL experiments, CAASD agreed that doing so could yield a rich array of information but restated the concern that the experiments can be time-consuming and expensive. Model developers reported that they are continuing to evaluate the 2008 HITL data to assess the prevalence
of multitasking and to research the GOMS speech operator to make it produce times that are closer to those indicated by the HITL data and the judgment of consulted experts. In general, CAASD believes the future use of HITL experiments will be confined to the development and validation of a select number of task performance times because of the perceived time and expense of conducting such experiments.

TASK LOAD COMPUTATION

Rollup to Task Load

As previously discussed, the task times are scheduled in 1-minute intervals. The model processes the 1-minute intervals with the output rolled up to a larger time interval, typically 15 minutes. The processing that the model performs to roll up 1-minute task load is summarized by the following equation:

$$W_n = \sum_{i=n-7}^{n+7} \sum_{j} \sum_{k} w_{ijk}$$

where $W_n$ = 15-minute workload at minute $n$, and $w_{ijk}$ = 1-minute workload at minute $i$ due to task $j$ being performed in service of aircraft $k$.

The task load output is computed by summing all of the time spent on R-side tasks during the measured period. In theory, the highest value for R-side task load for a 15-minute period is 900 seconds (15 minutes times 60 seconds/minute), assuming (unrealistically) that a controller can effectively use all 900 seconds of available time and that a second controller is handling the D-side task load. As discussed in Chapter 4, when the task load rollup exceeds a certain threshold (around 600 seconds), it is assumed that two controllers are working the traffic.

Evaluation of Task Load Rollups

At various stages in the development of the model, CAASD has undertaken evaluations of its task load output for accuracy. Early evaluations,
including the one described in Box 3-2, suggested that the results of the model were a major improvement over the volume-based (aircraft count) metrics that had been used previously to inform controller staffing requirements. These initial evaluations caused FAA to favor the task-based approach over the earlier methods.

In more recent evaluations, CAASD once again turned to experts for their opinions on the task load output. In 2006, CAASD assembled a group of front-line managers from 10 en route centers.3 Each manager was briefed on the background, objectives, and outputs of the task load model. Before the model results were presented, the managers were asked to rank their respective sectors by traffic complexity. The rankings were then compared with the rankings of the same sectors based on the traffic simulations and task loads generated by the model. The participants were asked if the model’s results were accurate in characterizing the individual sectors in terms of typical traffic volume and types of activity (e.g., prevalence of separations, delays, transitions). According to CAASD, for most of the sectors the managers responded that the model results closely matched their own perception of sector traffic complexity.

Although based on perceptions, these assessments were used by CAASD as guidance in making further refinements to elements of the model, particularly the delay task. The evaluations were also one of the factors that caused CAASD to seek additional information to represent international flights and flights to and from airports with no radar services.

COMMITTEE ASSESSMENT

FAA asked the committee to examine the input data and processes used for modeling traffic activity, the tasks and methods used to assign task times, and the means for validating model assumptions, parameters, and output. An assessment of each is offered next.
3 Atlanta, Georgia; Boston, Massachusetts; Dallas, Texas; Denver, Colorado; Houston, Texas; Memphis, Tennessee; Minneapolis, Minnesota; New York City; Salt Lake City, Utah; and Seattle, Washington.

BOX 3-2
CAASD Comparisons of Task Load Output with Results of Dynamic Density Experiments

A concept known as dynamic density was critical to the free flight paradigm guiding the planning of the NAS in the late 1990s and early 2000s. The dynamic density concept is built on two of the same basic principles as the CAASD model: (a) complexity affects the capacity of a sector, and (b) complexity is dynamic and changes over the course of a day for a sector. In 2002, the FAA requested that CAASD evaluate the effectiveness of four sets of dynamic density metrics developed by various research organizations for predicting air traffic complexity as perceived by controllers.* To conduct the study, traffic scenarios were evaluated by controllers in HITL experiments at the FAA Technical Center using a rating scale from the National Aeronautics and Space Administration (NASA) known as the Air Traffic Workload Input Technique (ATWIT). Using ATWIT as the basis, controllers were asked to rate their subjective assessment of the complexity level they experienced on a scale of 1 to 7.

CAASD leveraged the results from those dynamic density experiments in a 2004 analysis of the task load model, comparing the scores provided in that study with the values generated by running the model with scenario data obtained from the dynamic density HITL experiments. The results of the study indicated that as estimated task load increased, the controller-perceived complexity rating tended to increase as well. While the actual predicted amount of task load was not validated, CAASD believes that the analysis demonstrated consistency between increasing task load and increasing complexity. In addition, CAASD concluded that the results indicated that the output of the model outperformed aircraft count as a predictor of both perceived complexity and the number of required controllers as rated by operational experts involved in the original dynamic density experiments.

* Holly, K., Y. Cabeza, M. Callaham, D. Greenbaum, A. Masalonis, and C. Wanke. 2002. Feasibility of Using Air Traffic Complexity Metrics for TFM Decision Support. MTR 02W0000055. MITRE Corporation, McLean, Va.
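
The statement in Box 3-2 that perceived complexity "tended to increase" with estimated task load describes a monotonic relationship. One common way to quantify such a relationship is a rank correlation. The sketch below is only an illustration of that idea: the report does not say which statistic CAASD used, the choice of Spearman's coefficient is an assumption, and every number in the example is an invented placeholder rather than a value from the 2004 analysis.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented placeholder values for six hypothetical scenarios (NOT study data).
task_load_seconds = [120, 250, 310, 480, 560, 640]  # hypothetical model output
aircraft_count = [5, 9, 7, 12, 10, 14]              # hypothetical traffic counts
atwit_rating = [1, 2, 3, 4, 6, 7]                   # hypothetical ratings on the 1-to-7 scale

print("task load vs. rating:     ", spearman(task_load_seconds, atwit_rating))
print("aircraft count vs. rating:", spearman(aircraft_count, atwit_rating))
```

A rank-based statistic fits ordinal ratings such as a 1-to-7 scale because it does not assume equal spacing between rating levels. As the box notes, agreement in ordering does not validate the absolute amount of predicted task load.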

Traffic Modeling

Compared with simple traffic counts, the simulations of traffic in the CAASD model provide a more complete picture of both the volume and nature of traffic activity in the en route sectors. The simulations are developed through an array of traffic operations and flight-planning data that represent opportunistic use of many existing traffic data and modeling tools. The traffic activity is modeled in sufficient depth and resolution to enable reasonable approximations of traffic complexity and associated controller tasks. Because the simulated traffic can be checked against records of actual traffic activity, there is ample opportunity to validate the output accuracy and to guide the development and recalibration of modeling processes and parameters. CAASD appears to have taken advantage of these opportunities to improve the traffic modeling capabilities.

Task Coverage

The nine tasks in the model appear to be representative R-side services that must be performed in response to traffic. However, CAASD’s assertion that the model covers 90 percent of the R-side tasks is not well established. To be sure, not all R-side responsibilities are modeled; for instance, the committee observes that there are no tasks associated with issuing weather and traffic advisories, which is an R-side service. While such unmodeled tasks may or may not have a significant effect on task load, the rationale for their absence and the potential impact on task load need to be addressed.

Compared with the other modeled tasks, monitoring is the most confusing and difficult to connect to traffic activity. Monitoring involves scanning of the radar display by the controller to maintain situational awareness of flights under sector control. The model assumes that monitoring is performed by the R-side controller for all traffic, which is a reasonable assumption. It is assumed further that a certain (but undefined) amount of monitoring is already included in other task times, particularly in the time-consuming delay tasks. CAASD nevertheless added a separate monitoring task so as not to underestimate monitoring, particularly for the most straightforward traffic transiting a sector. While monitoring is an important task, its treatment in the model is confusing and
unconvincing. Since the concern about overestimating monitoring time relates mainly to the delay tasks, a simpler and more transparent treatment would be to define how much monitoring time is already included in these task times.

It is important to keep in mind that the nine tasks in the model represent only the R-side tasks. In considering the scope of R-side services only, the modeled tasks may be adequate in coverage. From the standpoint of estimating PTT, however, the model provides an incomplete picture of controller task load because the modeled tasks are not linked to D-side services. More consideration is given to this shortcoming in the following chapter.

Task Time Derivation

For seven of the nine modeled tasks, GOMS is used to derive task times. The other task times are developed through consultations with subject matter experts. None of the task times is derived from the observation and analysis of controllers performing tasks in the field or in experiments. CAASD’s comparison of some GOMS and HITL times indicates that the former are 10 to 20 percent lower than the latter. These limited comparisons, however, are the only means by which task times have been evaluated, apart from asking subject matter experts to assess them.

CAASD claims the literature lacks relevant task times, prompting it to use GOMS and other means for estimating times. CAASD selected GOMS as a primary method believing it to be an efficient and inexpensive approach, particularly when compared with gathering and analyzing data from operational and experimental settings, such as those for HITL experiments. CAASD believes that GOMS modeling will allow for continued updating of the task times even as controller procedures and capabilities change.

The committee questions the extensive use of GOMS for task time derivation and the complete absence of task times developed through field observation or HITL experiments. GOMS modeling is typically used where there is limited opportunity to observe and analyze task performance in operational or experimental settings. These conditions do not exist in the air traffic control domain. The GOMS-derived times are based largely on expert judgment, and only loosely validated against task
performance data obtained from HITL experiments conducted for other purposes. Given these circumstances, there is no way to know whether the task times used in the model are at all valid.

Computation of Task Load

The addition of task times to calculate R-side task load may be the simplest approach to computing task load while still being reasonable. However, adding task times does not account for the possibility (and real-world probability) that some tasks are performed concurrently and that the time it takes to perform tasks can change depending on the total task load or the number of controllers working the sector. These tenuous assumptions may or may not be critical to the task load results. Examining their potential impact on task load, however, is important for making a convincing case that the assumptions do not represent serious modeling deficiencies. This case has not been made.
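
To make the additive assumption concrete, the following is a minimal sketch of the rollup described earlier in the chapter: 1-minute task times are summed over all tasks and aircraft, then over a 15-minute window running from minute n - 7 to minute n + 7. It is not CAASD's implementation. The function names, the event tuple layout, the task labels, and the sample values are all hypothetical, and the 600-second threshold is treated as exact here although the text gives it only as approximate.

```python
from collections import defaultdict

WINDOW_HALF_WIDTH = 7            # minutes on each side of n, giving a 15-minute window
MAX_SECONDS = 15 * 60            # theoretical ceiling of 900 seconds per window
TWO_CONTROLLER_THRESHOLD = 600   # "around 600 seconds" in the text; exact value assumed


def one_minute_totals(task_events):
    """Sum task time over tasks j and aircraft k for each minute i.

    task_events: iterable of (minute, task, aircraft, seconds) tuples
    (a hypothetical layout chosen for this sketch).
    """
    totals = defaultdict(float)
    for minute, _task, _aircraft, seconds in task_events:
        totals[minute] += seconds
    return totals


def rollup(task_events, n):
    """Return the 15-minute task load centered on minute n, in seconds."""
    totals = one_minute_totals(task_events)
    first = n - WINDOW_HALF_WIDTH
    last = n + WINDOW_HALF_WIDTH
    return sum(totals.get(i, 0.0) for i in range(first, last + 1))


def exceeds_two_controller_threshold(task_load_seconds):
    """True when the rollup is above the assumed two-controller threshold."""
    return task_load_seconds > TWO_CONTROLLER_THRESHOLD


# Hypothetical events: (minute, task label, aircraft id, seconds spent).
events = [
    (3, "handoff", "AC1", 20.0),
    (10, "separation", "AC1", 45.0),
    (10, "handoff", "AC2", 15.0),    # same minute as the task above
    (17, "delay", "AC3", 30.0),
    (18, "handoff", "AC3", 25.0),    # falls outside the window centered on minute 10
]

load_at_10 = rollup(events, 10)
print(load_at_10, exceeds_two_controller_threshold(load_at_10))
```

Note that the two tasks at minute 10 are simply added, as if performed one after the other. That is the behavior the committee questions: a purely additive rollup cannot represent concurrent task performance, and nothing in it prevents the total from exceeding the 900-second ceiling.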