4
Engaging Data Users
In Chapter 1, the dynamic and growing role of the Internet as a force
for change in National Center for Science and Engineering Statistics
(NCSES) dissemination practices was briefly discussed. Rosabeth Moss
Kanter has made the point that, although the Internet offers new challenges
and opportunities, the quality of the customer experience remains centrally
important to the success of many (Kanter, 2011). In developing dissemina-
tion policies and procedures, fulfilling the needs of data users in a manner
that exceeds expectations of the user should be a key goal for NCSES.
Although NCSES has long been committed to serving the needs of data
users, it has not gathered sufficient information on who its users are, how
they use its data, and how well it is meeting their needs. Although NCSES
has made several notable attempts to gather this intelligence about user
needs, it does not have a formal, systematic, consistent, structured, and
continuing program for doing so.
One problem for NCSES is that there are multiple communities of
users for which products must be developed. Furthermore, the breadth and
diversity of NCSES data users will expand as it orients itself to the broader
mission mandated by the America COMPETES Act. For the most part, out-
reach efforts have been addressed to those whom NCSES perceives to be in
its main user community. The user community consists mostly of research-
ers and analysts of research and development (R&D) expenditures and the
R&D workforce, particularly those concerned with federal science policy.
The panel heard from key data users in the course of its workshop and
through interviews conducted by panel members and staff. These users
represented the legislative and administrative branches of the federal
64 COMMUNICATING SCIENCE AND ENGINEERING DATA
government, the organizations that support federal government science
and engineering (S&E) analysis, the academic community, and regional
economic development analysts. In the presentations and interviews, these
users were asked to address, from their perspective, the current practices of
the National Science Foundation (NSF) for communicating and disseminat-
ing information in hard-copy publication format as well as on the Internet
through the NCSES website and the Integrated Science and Engineering
Resources Data System (WebCASPAR) and Scientists and Engineers
Statistical Data System (SESTAT) database retrieval systems.
CONGRESSIONAL COMMITTEE STAFF
Panel and staff members met with staff of the House Subcommittee on
Research and Science Education to discuss congressional staff uses of the
NSF S&E information. Staff work in support of the committee is a fast-
turnaround operation, requiring speed in retrieving data and easy access.
In fulfilling its work, the committee staff makes extensive use of S&E Indi-
cators in hard copy. The staff relies on the report narrative to help them
interpret the data; the analysis helps them put the numbers into perspective.
They expressed the view that data tables lacking explanation are subject
to misinterpretation. Like other user groups interviewed by the panel, the
congressional staff expressed concern about the timeliness and frequency
of the survey-based data.
The main use of the website occurs when the staff is away from the
office and hard copies of the publications. They most often use Google as
the search engine for discovering S&E information, commenting that the
search capability of the NSF site is cumbersome and unreliable. In response
to a question about use of WebCASPAR, there seemed to be confusion as to
what WebCASPAR is and whether, in fact, they did use it at all. The staff
often turns to the American Association for the Advancement of Science
web database when they need NSF statistics, because it is readily available
and comprehensive.
The House committee staff would like to have access to Indicators in
June rather than in the following spring, and the committee had proposed
legislation to make that happen; the legislation was not supported in the
Senate.
Staff also expressed a need for more usability tools, such as the ability
to link to other data. This capability may be available in Data.gov, but the
staff has not used Data.gov very much. They were also interested in the
possibility of visualization tools for the data. Some data needed for support
of legislative initiatives are not presented in the aggregation (i.e., tables and
cross-cuts) they desire. For example, the staff would like disaggregated S&E
workforce and science, technology, engineering, and mathematics education
data by occupation, industry, and geography. Also, they need more
data broken out by field of science and engineering.
CONGRESSIONAL RESEARCH SERVICE
As an arm of the Congress, the Congressional Research Service (CRS)
responds to members of Congress and the congressional committees. In
meeting the requirements of Congress for objective and impartial analysis,
CRS publishes periodic reports on trends in federal support for R&D, as
well as reports on special topics in R&D funding. Both types of studies rely
heavily on data from NSF, both as originally published and as summarized
in such publications as Science and Engineering Indicators and as extracted
from the NSF website. The panel met with Christine Matthews, specialist in
science and technology policy in the Resources, Science, and Industry Divi-
sion of the Congressional Research Service. She is the primary staff contact
with NSF. Her recent publications include The U.S. Science and Technol-
ogy Workforce (2009), Science, Engineering, and Mathematics Education:
Status and Issues (2007), and National Science Foundation: Major Research
Equipment and Facility Construction (2009).
Matthews is a frequent user of NSF information. She makes 8-10 visits
to the NSF website each day and is a listserv subscriber. Although she visits
the NSF website often during a given day, many of those searches are on the
general NSF awards site and sites for divisions other than NCSES.
In addition to general information on S&E expenditures and work-
force, she specifically references data on academic R&D for historically
black colleges and universities (HBCUs) and information on R&D facilities
and equipment. She directs most of her specific inquiries through the NSF
congressional liaison office, mentioning staff member George Wilson.
She commented that her use of the data is limited by the curtailment
in the amount of published information in NSF reports that accompanied
the shift from hard-copy to electronic dissemination of several of the key
reports. The HBCU data, for example, were located in a special report with
analysis and extensive tables, but now they appear only as an InfoBrief
and in data tables.
Most of her data requests are filled by data readily available on the
website. She has requested special data runs only a few times, noting
that not everyone has the ability to request special data runs. Her experi-
ence with WebCASPAR is positive, as it is user-friendly. She has not used
SESTAT.
The timeliness of the data is not a particular problem for her. She recognizes
that the data require time for collection and processing. For most of
her uses, the data are sufficiently timely. She is able to satisfactorily explain
the lags to congressional staff members when pressed. She does not generally
use visualizations of NCSES data, but when she does, she would prefer
visualizations in color.
OFFICE OF SCIENCE AND TECHNOLOGY POLICY
Representing the Office of Science and Technology Policy (OSTP), Kei
Koizumi summarized the extensive use of NSF S&E information by this
agency of the Executive Office of the President. He accesses NCSES data
primarily through the NCSES website, using the detailed
statistical tables for individual surveys. He commented that the InfoBrief
series is useful in that it informs him about which data are new. He reads
each InfoBrief and explores some of the data further. For data outside his
core area (R&D expenditures data), he often looks for the data in S&E
Indicators, and, if needed, he goes to the most current data on the NCSES
website. He uses WebCASPAR to access historical data and long time series.
His overall comments focused attention on the timeliness of the data,
suggesting that, to users, the data are never timely enough, although some
of the lags are understandable. He remains optimistic that next year the
data will be available earlier. He expressed concerns over the quality of the
data, and the methodology employed in the federal funds survey, which
were summarized in a recent National Research Council report (National
Research Council, 2010).
SCIENCE AND TECHNOLOGY POLICY INSTITUTE
The Science and Technology Policy Institute (STPI) was created by Con-
gress in 1991 to provide rigorous objective advice and analysis to OSTP and
other executive branch agencies, offices, and councils. Bhavya Lal and Asha
Balakrishnan reported on the activities and interests of STPI, which can be
considered a very sophisticated user of NSF S&E information. STPI sup-
ports sponsors in three broad areas: strategic planning; portfolio, program,
and technology evaluation; and policy analysis and assessment.
In their presentation, Lal and Balakrishnan reported on several specific
examples of the attempts by STPI to use NSF S&E information. In one
task, investigators sought to determine the amount of research funded by
government and industry for specific subfields of interest (i.e., networking
and information technology). They were able to obtain basic research as a
percentage of R&D “by source” and “by performer” for government and
industry, but not broken out by specific fields or sectors of interest as broad
as networking and information technology. They were able to get data on
industry R&D by fields (i.e., North American Industry Classification System
codes), but without the breakdown of basic research, applied research,
and development funding. Based on this experience, the investigators recommended
that NSF provide access to the data in a raw format.
Their overall view was that access to NSF-NCSES data tables and briefs
is extremely helpful in STPI’s support of OSTP and executive agencies.
However, access to the data in a raw format would better enable assess-
ment of emerging fields. The STPI researchers would like to obtain the data
sets that underlie the special tabulations related to publications, patents,
and other complex data. Similarly, they would like access to more notes on
conversions, particularly to international data, to understand underlying
assumptions; for example, China’s S&E doctoral degrees. For their work,
they requested more detail on R&D funding/R&D obligations by field of
science and by agency, although, for their needs, those data need not be
publicly available.
ACADEMIC USES
Paula Stephan of Georgia State University, who classifies herself as
a “chronic” user of NSF S&E information, summarized her uses of the
data. She has a license with NSF, and about 40 to 50 times a year she
uses restricted files pertaining to SDR, SED and SESTAT, InfoBriefs, and
the Science and Engineering Indicators appendix tables. She accesses data
through WebCASPAR. Graduate students use WebCASPAR to build tables
and create such variables as stocks of R&D, stocks of graduate students,
and stocks of postdoctorates by university and field. She reported that
WebCASPAR can be difficult for new users to navigate, but they have to
use WebCASPAR because the NCSES web page does not always have the
most up-to-date links to data. For example, the number of doctorates for
2007 and 2008 is available only from WebCASPAR.
She commented that the S&E Indicators appendix tables are easy to use
and that the tables are very well named, so it is easy to find data. The ability
to export the data to Excel allows one to easily analyze data.
Stephan noted that she does not use table tools, but her colleague,
Henry Sauermann, did so for a study, and he reported that table tools
provided him with exactly what he needed (starting salaries for industry
life scientists). She pointed out that NSF staff have been very responsive to
user needs. For example, in 2002 users recommended that NCSES collect
information on starting salaries of new Ph.D.s in the SED, and, beginning
in 2007, the question was on the SED.
She suggested a need for more user support. Data workshops were held
for three years that brought together users and potential users of licensed
data. This same approach could be useful for acclimating users to web-
based data. It would be a good way to find out how people use the data and
to find out difficulties with or questions that people have about the data.
Like other users, Stephan commented that a major problem with the
data is timeliness. The lack of timeliness affects the ability of researchers
to assess current issues, such as the effect of the 2008-2010 recession on
salaries, the availability of positions, the length of time individuals stay in
postdoctoral status, and the mobility of S&E personnel. As an example
of the lag, she pointed out that the 2008 SDR will be publicly released in
November 2010 but the restricted data will not be released for licensed use
until sometime in 2011 (the data were collected in October 2008). Owing
to this lag, the data will provide little useful information about how the
recession affected careers: analysts will have to wait until fall 2012 to get
the 2010 data and will have to wait until sometime in 2013 to get the
restricted data.
Similarly, the earliest SED data collected during the recession (for July 1,
2008, to June 30, 2009) were not scheduled to be released until November
2010 (note: the data release was subsequently delayed to allow for correc-
tion of data quality issues in the racial/ethnic data). So it is “early” reces-
sion data, although it will be analytically important because it will be the
third year for which salary data have been collected in SED: when these
SED salary data are available, analysts will be able to learn a good deal
comparing the data with earlier years. However, such analyses will have to
wait until November 2011 when the 2010 SED (July 1, 2009, to June 30,
2010) data are released (and assuming that salary data are made available).
Stephan pointed out that timeliness is not a new issue. She quoted a
2000 National Research Council report: “SRS must substantially reduce the
period of time between the reference date and data release date for each of
its surveys to improve the relevance and usefulness of its data” (National
Research Council, 2000, p. 5).
REGIONAL ECONOMIC DEVELOPMENT USERS
Jeffrey Alexander, a senior science and technology policy analyst with
SRI International, is a frequent user of NSF S&E information and a con-
tractor to NSF. In his presentation, he summarized his previous private-
sector uses of the information, mainly focused on uses of the data for
analysis of technology applications at the state policy level.
He accessed data from the website and through use of WebCASPAR.
He stated a major caution about the comparability of data sources and
noted that good metadata (data about the data) are not generally available
for NCSES data. In particular, he said there is a need for more detailed
geographic coding of the data so one can be confident in matching NSF
data with data from the Bureau of Labor Statistics and other sources. Like
other users, he expressed a concern with the timeliness of the data and said
that timeliness is a key factor in the relevance of the data.
With regard to access, Alexander said he often needs trend data, so he
most generally goes to the tables on the web page to extract specific data
items. He has problems in downloading multiple files, and he finds that
the WebCASPAR and SESTAT tools are not very user-friendly. A useful
enhancement would be to enable searches for variables across the various
surveys. He does not use the printed publications, although he finds that
the InfoBriefs are very useful in announcing and highlighting new products.
Alexander suggested that NCSES needs to become a center of informa-
tion for the user community, and it should devote more attention to reach-
ing out to larger users with information about how to access data as well
as to seek input for improvements.
LIMITATIONS OF USER ANALYSIS
The input received in the workshop and in the interviews was very
helpful to the panel in framing its analysis of user needs. The users of
NCSES data can conveniently, if imprecisely, be classified as primary users
(those who directly use NCSES data in their research and analysis); second-
ary users (those who indirectly rely on NCSES products to understand and
gauge the implications for programs, policy, and advocacy, and those who
assist others in obtaining access to the data); and tertiary users (the public).
The input of primary users was extensively provided in the panel work-
shops and in interview sessions, and some information was gathered from
secondary users, but information from tertiary users was less systematically
gathered and is given less attention in this report. Information about the
needs of all user groups has begun to emerge only since NCSES started to
conduct consumer surveys.
It is incumbent on NCSES to consider the needs of all of these groups
and the technology platforms they use to access the data as the agency
considers the program of measurement and outreach discussed in this
report. NCSES could consider novel means of harvesting information
about data use to analyze usage patterns, such as reviewing citations to
NCSES data in publications, periodicals, and news items. For example,
to get a sense of users who are citing S&E Indicators, a panel member
did a Web of Science “cited reference search” on *NAT SCI BOARD and
(sci eng ind*). This exercise yielded a list of 691 publications going back
to 1988, shortly after S&E Indicators was introduced under that name.
Google Scholar is another potential source of such information.
Reaching out to a wide variety of data users by means of surveys or
interviews would be another worthwhile initiative. Moreover, such inter-
actions would inform NCSES not only about user dissemination needs,
but also about their substantive data needs, such as subject, variables, and
level of geography. A list of organizations that could be contacted to assist
in obtaining input on uses of S&E information would include the Ameri-
can Association for the Advancement of Science, the American Economic
Association, the Association of Public and Land-Grant Universities, the
Association for Public Policy Analysis and Management, the Association for
University Business and Economic Research, the Council for Community
and Economic Research, the Industry Studies Association, the International
Association for Social Science Information Services and Technology, the
Inter-university Consortium for Political and Social Research, the
National Association for Business Economics, the Special Libraries Asso-
ciation Divisions of Biomedical and Life Sciences and Engineering, and the
State Science and Technology Institute. One means of ensuring that the
needs of the secondary and tertiary data users are met is to ensure that pro-
grams of outreach are specially directed to members of the media—those
who rerelease the NCSES data and interpret them to the public.
Among the tools that NCSES has used to assess user needs, according
to John Gawalt, NCSES program director for information and technol-
ogy services at the time of the workshop, are URCHIN, a web statistics
analysis program that analyzes web server log file content and displays the
traffic information on the basis of the log data, and WebTrends, software
that collects and presents information about user behavior on its website.
With proper permissions and protections, NCSES is also contemplating
using cookies to identify return users and increase the efficiency of filling
data requests.
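
As a rough illustration of the kind of log-file analysis described above, the following Python sketch tallies successful page requests and distinct visiting hosts. It is not the method used by URCHIN or WebTrends; it assumes that the server writes entries in the Common Log Format, and the sample entries, addresses, and page paths are hypothetical.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize_log(lines):
    """Count successful (2xx) requests per path and collect distinct hosts."""
    hits = Counter()
    hosts = set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        if match.group("status").startswith("2"):
            hits[match.group("path")] += 1
            hosts.add(match.group("host"))
    return hits, hosts


# Hypothetical sample entries; these are not real NCSES traffic.
SAMPLE_LOG = [
    '192.0.2.1 - - [12/May/2011:10:00:01 -0400] "GET /statistics/ HTTP/1.1" 200 5120',
    '192.0.2.2 - - [12/May/2011:10:00:05 -0400] "GET /statistics/seind10/ HTTP/1.1" 200 20480',
    '192.0.2.1 - - [12/May/2011:10:01:10 -0400] "GET /statistics/ HTTP/1.1" 200 5120',
    '192.0.2.3 - - [12/May/2011:10:02:00 -0400] "GET /missing.html HTTP/1.1" 404 312',
]

hits, hosts = summarize_log(SAMPLE_LOG)
for path, count in hits.most_common():
    print(path, count)
print("distinct hosts:", len(hosts))
```

Counts of this kind show which pages are requested most often, but they say little about who the users are or whether their needs were met, which is why the survey and workshop approaches discussed in this chapter remain necessary.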
In April 2011, NCSES took another step in the direction of obtaining
user feedback when it placed a link on the website that directs users to a
short customer survey to formally measure satisfaction and initiated an
email-based sample survey, sent to customers who had requested electronic
notification of new NCSES reports. As of mid-August 2011, the agency
had received 44 responses to the website survey and 20 responses to the
email survey. Most of those responding to both surveys were researchers,
students, and teachers, with a smaller number of librarians, reporters, and
policy makers, including legislative staff members.
The respondents viewed the organization of the home page in positive
or neutral terms, reporting that they could find what they were looking
for using the current topical groupings or that they could find what they
needed even though the organization was not satisfactory. Not surprisingly,
researchers tended to want more in-depth reports with extensive data and
analysis and detailed data tables, whereas reporters and policy makers were
more likely to be satisfied with short, topical reports with summary data
and analysis. Students and teachers varied in their needs and were about
split between wanting short, topical reports and wanting more in-depth
reports. Detailed data tables were commonly requested from this subset
of customers as well. The staff of NCSES reports that it will continue to
solicit the views of visitors to the website and to periodically solicit views
from a sample of requestors of electronic notification of NCSES reports in
the future.
Recommendation 4-1. The National Center for Science and Engineering
Statistics (NCSES) should analyze the results of its initial online con-
sumer survey and refine it over time. Using input from other sources,
such as regular structured user focus groups and panel-based periodic
user surveys, NCSES should regularly and systematically collect and
analyze patterns of data use by web users in order to develop a typol-
ogy of data users and to identify usability issues.
The surveys are a useful start, but there is much more that can be
accomplished by way of seeking the input of data users. In seeking a model
for outreach to users, NCSES could consider modeling its efforts on the
very aggressive program of Statistics Canada, described at the workshop
by panel member Diane Fournier. Statistics Canada uses a combination
of online questionnaires, focus groups, and usability testing to assess user
needs and the usability of its website. One advantage of this approach,
although it is resource intensive, is the possibility of gathering useful infor-
mation from a wide range of users, both from regular users, who are knowl-
edgeable, and from secondary and tertiary users, who are less familiar with
the data.
Another initiative that NCSES could undertake to better determine
user needs is to renew the data workshops that it conducted for several
years but has since discontinued. Those workshops brought together users
and potential users of licensed data. This same approach could be useful
for acclimating users to web-based data and for introducing frequent users to
changes in data dissemination practices and procedures. Such data work-
shops would be a good way to find out how knowledgeable data users
use NCSES data and to find out what concerns users have about the data.
These workshops could be conducted onsite, in remote locations (perhaps
in conjunction with meetings of interested associations), or by means of
webinars (perhaps hosted by interested associations).
The input received in the workshop and in the interviews was very
helpful to the panel in framing its analysis of user needs. We recognize that
the analysis relies mainly on the input of primary and, to a lesser extent,
secondary users. The panel was not able in the time allowed to systemati-
cally gather much information from tertiary users (such as policy makers,
the media, and librarians). Nonetheless, the panel thinks that it is incum-
bent on NCSES to consider the needs of all three of these groups and the
technology platforms that they use to access the data as it considers the
program of measurement and outreach discussed in this report.
The agency can begin by developing a concrete typology of its data
users. One approach to this would be to develop user personas—that is,
stereotypical characters who represent the variety of user types for the science
and engineering data (Pruitt and Adlin, 2006, p. 3). These personas are
usually developed by distilling data collected in interviews with users, much
as the panel has tried to do in this report. The personas could be formal-
ized in short descriptions to aid data dissemination designers, in that they
provide a common description of the needs, skills, and the environment
faced by the various user personas.
A related approach would be to develop a typology of user interac-
tion scenarios that describe what users do with the online resources. The
user scenario would provide a concrete and flexibly detailed representation
of the tasks that users will try to carry out with the systems (Rosson and
Carroll, 2002). These two aids (personas and scenarios) provide for a user-
centered integration of the system life cycle. Once done, they will serve as
a reference for subsequent redesign and they help to focus the design of
usability tests and user assistance programs.
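
One way to make personas and scenarios concrete for designers is to record them as simple structured data. The Python sketch below is only an illustration: the class names, field names, and tier assignments are assumptions made for the example, not an NCSES schema, and the two entries are loosely distilled from the user interviews summarized earlier in this chapter.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """A short, structured description of one type of data user."""
    name: str
    user_tier: str  # assumed values: "primary", "secondary", or "tertiary"
    needs: list = field(default_factory=list)
    skills: list = field(default_factory=list)
    environment: str = ""


@dataclass
class Scenario:
    """A task that a given persona tries to carry out with the online resources."""
    persona: Persona
    task: str
    resources: list = field(default_factory=list)


def scenarios_by_tier(scenarios):
    """Group scenario tasks by the user tier of the persona performing them."""
    grouped = {}
    for scenario in scenarios:
        grouped.setdefault(scenario.persona.user_tier, []).append(scenario.task)
    return grouped


# Illustrative entries; the tier labels are assumptions made for this sketch.
researcher = Persona(
    name="Academic researcher",
    user_tier="primary",
    needs=["long time series", "restricted-use files"],
    skills=["statistical software", "WebCASPAR"],
    environment="university office",
)
committee_staff = Persona(
    name="Congressional committee staff member",
    user_tier="secondary",
    needs=["fast turnaround", "narrative that explains the tables"],
    skills=["general web search"],
    environment="often away from hard-copy publications",
)

scenarios = [
    Scenario(researcher, "Build a table of doctorates by university and field",
             ["WebCASPAR"]),
    Scenario(committee_staff, "Find a workforce figure for a hearing",
             ["S&E Indicators", "NCSES website"]),
]

grouped = scenarios_by_tier(scenarios)
for tier, tasks in grouped.items():
    print(tier, tasks)
```

Keeping the descriptions in one shared structure makes it easier to check that each user tier is covered by at least one scenario before usability tests are designed.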
Recommendation 4-2. The National Center for Science and Engineer-
ing Statistics should educate users about the data and learn about the
needs of users in a structured way by reinstating the program of user
workshops and instituting user webinars.
The outreach activities discussed in this chapter, along with the devel-
opment of a formal typology of users, will assist NCSES to better under-
stand and respond to user needs. These activities will also assist the agency
in allocating its scarce resources to the groups and needs that have the
greatest return to the dissemination investment.
USING WIKI AND OTHER COLLABORATION
TOOLS FOR COMMUNICATION WITH USERS
Online collaboration tools, or wikis, offer another means of obtaining
user input. Wikis have greatly improved the ability of
federal agencies to establish open lines of communication and engage com-
munities interested in their activities (Schroeder, Eynon, and Fry, 2009).
The most widely used wiki tool is Wikipedia, the collaboratively created
online encyclopedia. The Wikimedia Foundation provides the computing
infrastructure, the server, wiki software, general rules for entries,
and style guidelines. Content is generated by anyone who has access to an
Internet browser. Users can edit existing content pages or create new pages
on topics not yet covered. The Wikipedia wiki software provides the online
editing environment, tracks the changes made to pages, and allows contributors
to engage in an online discussion about the content of pages. Page
and text formatting is accomplished by simple specialized mark-up tags.
Wiki software tools have increasingly been adopted by government
agencies as a platform for sharing information and as a means of encour-
aging the sharing of best practices and other types of information. Wiki
software is available from commercial software vendors and as open-source
software. Standard tools include software for group editing of online con-
tent, blog pages, threaded discussions, and file management for group
access to files and images.
A version of wiki software has served as the foundation of Eurostat’s dissemination
system, called “Statistics Explained.”1 This is a new way of publishing
European statistics, explaining what they mean, what is behind the figures,
and how they can be of use, in an easily understandable language. Statistics
Explained looks similar to Wikipedia, but unlike it, information can be
updated only by Eurostat staff, thus ensuring the authenticity and reliability
of the content. The latest data can be accessed through hyperlinks available
in each statistical article.
The U.S. General Services Administration (GSA) operates a wiki envi-
ronment to encourage communication across governmental entities. The
GSA site emphasizes a “community of practice” model for taking advan-
tage of wiki software. People who have some engagement in a particular
subject or project can benefit from a central online point of contact rather
than attempting communication through a series of email conversations.
Wikis and other online collaboration tools can help maintain a dialog
with academics and outside experts. Wiki pages on technical issues related
to the database could generate a valuable two-way flow of information
about technical issues between outside researchers and staff experts at
NCSES.
KEEPING USERS INFORMED
The current NCSES websites and published reports appropriately point
users to technical descriptions of the data collections and identify staff who
are ready and able to assist users in their use of the data. However, a perusal
of other federal statistical agency websites reveals further useful information-sharing
practices. For example, the Census Bureau’s Manufacturing and Construction
Division, which manages the Business Research and Development and
Innovation Survey (BRDIS) for NCSES, includes on its website (see
http://www.census.gov/mcd/clearance/ [November 2011]) a listing of the open
opportunities for public comment noted in the Federal Register, identifies
1 See http://epp.eurostat.ec.europa.eu/statistics_explained/index.php/Main_Page [November
2011].
planned changes, and includes copies of the forms and the supporting docu-
ments as submitted to the Office of Management and Budget (OMB). The
technical information in these OMB clearance packages can assist users in
understanding the strengths and weaknesses of the data.
ENHANCING USABILITY OF NCSES DATA
Usability is generally understood to be the extent to which a product
can be used by specified users to achieve specified goals with effectiveness,
efficiency, and satisfaction in a specified context of use (ISO 9241-11). The
field of website usability is developing rapidly and now includes sophisti-
cated methods to gather feedback from users about their interactions with
websites.
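
The three components of this definition can be given simple operational measures in a usability test. The Python sketch below assumes one common set of choices, which ISO 9241-11 does not itself prescribe: effectiveness as the share of test sessions in which the task was completed, efficiency as the mean time taken in completed sessions, and satisfaction as the mean of a 1-to-5 rating. The session records are hypothetical.

```python
from statistics import mean


def usability_summary(sessions):
    """Summarize test sessions as effectiveness, efficiency, and satisfaction.

    Each session is a dict with the keys "completed" (bool), "seconds"
    (time spent on the task), and "rating" (1-5 satisfaction score).
    These keys are assumptions made for this sketch.
    """
    if not sessions:
        raise ValueError("at least one session is required")
    completed = [s for s in sessions if s["completed"]]
    return {
        # Share of sessions in which the user achieved the goal.
        "effectiveness": len(completed) / len(sessions),
        # Mean time on task, counting only completed sessions.
        "efficiency_seconds": (
            mean(s["seconds"] for s in completed) if completed else None
        ),
        # Mean self-reported rating across all sessions.
        "satisfaction": mean(s["rating"] for s in sessions),
    }


# Hypothetical results for one task, such as locating a detailed data table.
SESSIONS = [
    {"completed": True, "seconds": 60, "rating": 4},
    {"completed": True, "seconds": 120, "rating": 5},
    {"completed": False, "seconds": 300, "rating": 2},
    {"completed": True, "seconds": 90, "rating": 4},
]

summary = usability_summary(SESSIONS)
print(summary)
```

Reporting the three figures separately, rather than as a single score, is consistent with the observation that no single broad measure of website usability exists.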
Although there is no single broad measure of website usability, some
very useful guidelines are contained in a federal government publication,
Research-Based Web Design and Usability Guidelines, prepared jointly
by the U.S. Department of Health and Human Services and the General
Services Administration (U.S. Department of Health and Human Services
and the General Services Administration, No date). Identified on the web
as Usability.gov, the publication contains guidelines that emphasize the
need to design websites “to facilitate and encourage efficient and effective
human-computer interactions.” The guidelines call on website designers to
strive to reduce the user’s workload by taking advantage of the computer’s
capabilities on the premise that users will make the best use of websites
when information is displayed in a directly usable format and content
organization is highly intuitive.
The guidelines make the point that task sequences make a difference.
The sequencing of user tasks needs to be consistent with how users typically
do the tasks for which they have visited the site. Designers should use
techniques that do not require users to remember information for more than
a few seconds, employ terminology that is understandable, and take care not
to overload users with information. Likewise, users should never be
sent to unsolicited windows or unhelpful graphics. The guidelines empha-
size that speed in accomplishing a task is important, and users should not
have to wait for more than a few seconds for a page to load, and, while
waiting, should be supplied with appropriate feedback. Tasks like printing
information should be made easy.2
2 See http://www.usability.gov/pdfs/chapter2.pdf [November 2011].
Evaluation of the NCSES Website
In order to assess how well the current NCSES website (see
http://www.nsf.gov/statistics/ [November 2011]) fulfills these basic usability guidelines
and criteria, the panel conducted an evaluation of the site as it appeared
in May 2011. This review was by no means exhaustive. Rather, the goal
was to stimulate the development of a formal usability process by briefly
reviewing the current design. Clearly, further user research would be neces-
sary prior to making improvements to the current design. Any decision to
change the website’s design, including content and organization, must be
based on user feedback and a usability evaluation testing strategy, which is
presented at the end of this review.
It is apparent that having the NCSES web pages as a subsite of the NSF
website poses limitations for NCSES website designers. If not treated care-
fully, this fact of life could increase the difficulty of navigating the site for
NCSES data users. For example, the NCSES tab “Statistics” leads to what
is, in effect, a different site altogether. The visual cue that tells users
they are still within the NSF website is the shared visual design template
(same header, footer, and title format with image), which is therefore crucial.
However, the main issue with this design is that users can have some
difficulty finding NCSES if their point of entry is the NSF home page. From
the NSF home page, users are expected to find what they are looking for by
exploring the site through main and secondary navigation.
Once users find the NCSES subsite (from the NSF home page or directly
via an Internet search engine or bookmark), users are faced with an organi-
zation-centric site rather than a user-centric site based on tasks. The current
design appears to try to educate return visitors on how to navigate the site,
but it would be best to organize the site in a way that all users (frequent,
infrequent, or new users) can quickly and efficiently accomplish the task
they are setting out to do. Suggestions for reorganizing the NCSES subsite
appear in Appendix B.
On the whole, the evaluation of the NSF website points to the need for
more systematic user-centered design and more regular usability evaluation.
There are a number of methods in use, including expert heuristic evalua-
tion, usability testing with small samples of actual users, and large-scale
web browsing analytics.
A heuristic evaluation is recommended as an initial approach. It is
one lightweight method that web designers use for discovering usability
problems in a user interface design (electronic or paper prototypes), so
that problems can be addressed throughout an iterative design process. The
evaluation is usually employed early in the design process—the earlier, the
better. Jakob Nielsen, an expert in this field, recommends having between
three and five evaluators separately review the interface design. The number
of issues discovered increases with each evaluator, but the benefit relative
to cost begins to decrease after five (Nielsen, 1994; Nielsen and Molich, 1990).
Along with the customer surveys and focus groups recommended in Recom-
mendation 4-1, the heuristic evaluation can intelligently inform the process
of designing a more effective and efficient website.
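
The diminishing return from additional evaluators can be illustrated with a simple model. The sketch below assumes the problem-discovery curve proposed by Nielsen and Landauer (1993), a source not cited in this chapter, in which the share of problems found by i independent evaluators is 1 - (1 - λ)^i. The detection rate of 0.31 is an illustrative assumption, roughly the average those authors reported across projects; the actual rate for the NCSES site is unknown.

```python
def share_found(evaluators: int, detection_rate: float = 0.31) -> float:
    """Expected share of usability problems found by independent evaluators.

    Assumes every evaluator finds any given problem with the same
    probability (detection_rate), independently of the others.
    The default of 0.31 is an illustrative assumption, not an NCSES figure.
    """
    if evaluators < 0:
        raise ValueError("evaluators must be non-negative")
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    return 1.0 - (1.0 - detection_rate) ** evaluators


def marginal_gain(evaluators: int, detection_rate: float = 0.31) -> float:
    """Additional share of problems found by adding the last evaluator."""
    if evaluators < 1:
        raise ValueError("evaluators must be at least 1")
    return share_found(evaluators, detection_rate) - share_found(
        evaluators - 1, detection_rate
    )


if __name__ == "__main__":
    # Print the cumulative share and the gain from each added evaluator.
    for n in range(1, 9):
        print(f"{n} evaluators: {share_found(n):.0%} found, "
              f"+{marginal_gain(n):.1%} from the last one")
```

Under these assumptions each added evaluator contributes less than the one before, which is consistent with the recommendation to stop at about five.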
Recommendation 4-3. The National Center for Science and Engineer-
ing Statistics should employ user-focused design and user analysis,
starting with an initial heuristic evaluation and continuing as a regular
and systematic part of its website and tool development.
Meeting Compliance Standards
Websites should be designed to ensure that everyone, including users
who have difficulty seeing, hearing, and making precise movements, can
use them. Generally, this means ensuring that websites facilitate the use
of common assistive technologies. As a federal government agency, NSF is
governed by the Section 508 regulations. These amendments to the Reha-
bilitation Act require federal agencies to make their electronic and infor-
mation technology accessible to people with disabilities. Section 508 was
enacted to eliminate barriers in information technology, to make available
new opportunities for people with disabilities, and to encourage develop-
ment of technologies that will help achieve those goals. The U.S. Access
Board has responsibility for the Section 508 standards and has announced
its intention to harmonize the web portions of its Section 508 regulations
with Web Content Accessibility Guidelines (WCAG) 2.0, for which the Web
Accessibility Initiative (WAI) has responsibility. Statistical Policy Directive
Number 4 (March 2008) directs statistical agencies to make information
available to all in forms that are readily accessible.3
Some of the major accessibility issues to be dealt with include the
following:
• provide text equivalents for nontext elements;
• ensure that scripts allow accessibility;
• provide frame titles;
• enable users to skip repetitive navigation links;
• ensure that plug-ins and applets meet the requirements for acces-
sibility; and
• synchronize all multimedia elements.
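
Two of the items above, text equivalents and frame titles, lend themselves to simple automated checks. The following is a minimal sketch using only the Python standard library. The class name, function name, and sample file names are hypothetical, and a check of this kind covers only a small part of a Section 508 review; it cannot judge whether an alt text is actually meaningful.

```python
from html.parser import HTMLParser


class AccessibilityAudit(HTMLParser):
    """Flags images without an alt attribute and frames without a title."""

    def __init__(self) -> None:
        super().__init__()
        self.findings: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        line, _ = self.getpos()
        # An empty alt="" is accepted here: it marks a decorative image.
        if tag == "img" and "alt" not in attributes:
            self.findings.append(f"line {line}: <img> has no alt attribute")
        if tag in ("frame", "iframe") and not attributes.get("title"):
            self.findings.append(f"line {line}: <{tag}> has no title")


def audit(html: str) -> list[str]:
    """Return a list of findings for one HTML document."""
    parser = AccessibilityAudit()
    parser.feed(html)
    parser.close()
    return parser.findings


# Hypothetical page fragment used only for illustration.
SAMPLE = """<html><body>
<img src="chart.png">
<img src="logo.png" alt="NSF logo">
<iframe src="table.html"></iframe>
</body></html>"""

if __name__ == "__main__":
    for finding in audit(SAMPLE):
        print(finding)
```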
3 A summary of Section 508 is available. See
http://www.section508.gov/index.cfm?fuseAction=stdsSum [November 2011].
When it is not possible to ensure that all pages of a site are accessible,
designers should provide equivalent information to ensure that all users
have equal access to all information.4 Other standards include the “web
accessibility initiative” of the World Wide Web Consortium (W3C), which
provides guidance and tools for a range of websites and applications. Even
more significant, given the possibility for rich dynamic interaction with
these data resources, is that W3C has also developed standards for access
to dynamic content, with specific guidelines in four categories:
1. Accessible rich Internet applications: address accessibility of
dynamic web content, such as content developed with Ajax, dynamic
HTML, or other such technologies.
2. Authoring tool accessibility guidelines: address the accessibility of
the tools used to create websites.
3. User agent accessibility guidelines: address assistive technology for
web browsers and media players.
4. Web content accessibility guidelines: address the information in a
website, including text, images, forms, and sounds.
The convention when considering web design for individuals with
disabilities is to ensure that the site is accessible to those who are visually
impaired. However, there is a much wider range of ways in which some-
one’s access to information should be considered when developing websites
and web applications. For example, a chart that is color-coded may not be
readily interpreted by someone with color blindness, multimedia files may
not be accessible to someone with deafness unless they are accompanied by
transcripts, and someone with a cognitive disability, such as attention deficit
disorder, may find websites that lack a clear and consistent organization
difficult to navigate.5
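
For the color-coded chart example, one mechanically checkable proxy is the luminance contrast between two series colors. The sketch below implements the relative luminance and contrast ratio definitions from WCAG 2.0. It does not simulate color-vision deficiency, and the use of this ratio to compare chart colors is an assumption made here for illustration, not a requirement stated in this chapter.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.0 definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an sRGB color, from 0 (black) to 1 (white)."""
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(a: tuple[int, int, int], b: tuple[int, int, int]) -> float:
    """WCAG 2.0 contrast ratio between two colors, from 1 to 21."""
    la, lb = relative_luminance(a), relative_luminance(b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    red, green = (255, 0, 0), (0, 128, 0)
    # A low ratio suggests the two series differ mainly in hue, so a reader
    # who cannot perceive that hue difference may not tell them apart.
    print(f"red vs. green: {contrast_ratio(red, green):.2f}")
    print(f"black vs. white: {contrast_ratio((0, 0, 0), (255, 255, 255)):.2f}")
```

When two series colors have a low ratio, a second cue such as line style, marker shape, or direct labeling keeps the chart readable without relying on color alone.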
Data Accessibility Issues
The accessibility of tabular data and data visualization is an open
research question. Although W3C has pioneered standards for accessibility
of dynamic user interfaces, many other issues, including table navigation,
navigation of large numeric data sets, and dynamic data visualization,
raise computer-human interaction challenges that have been explored only
4 U.S. Department of Health and Human Services, Research-based Web Design and Usability
Guidelines, p. 23. (2008). See http://www.usability.gov/guidelines/guidelines_book.pdf [May
2011].
5 Presentation of Judy Brewer, director of the Web Accessibility Initiative at the W3C, on the
issue of accessibility of information on the web.
peripherally. The issue of accessibility is a clear opportunity for NSF to
partner with scientists with disabilities and those who work on interface
design and so lead by example.
In order for NSF S&E information to be used, it must be accessible to
users. Having nearly eliminated hard-copy publication of the data in favor
of electronic dissemination, mainly through the web, NSF is committed
to providing web-based data in an accessible format, not only for
trained, sophisticated users but also for users who are less confident of
their ability to access data on the Internet. Importantly, the user
population includes people with disabilities for whom, by law and right,
special accommodations need to be made.
The panel benefited from a presentation by Judy Brewer, who directs
the WAI at W3C. W3C hosts the WAI to develop standards, guidelines, and
resources to make the web accessible for people with disabilities; ensure
accessibility of W3C technologies (20-30 per year); and develop educational
resources to support web accessibility.
Brewer stated that Web 2.0 adds new opportunities for persons with
disabilities, and that data visualization is a key to effective communication.
However, people with disabilities face a number of barriers to web acces-
sibility, including missing alternative text for images, missing captions for
audio, forms that “time out” before they can submit them, images that
flash and may cause seizures, text that moves or refreshes before they can
interact with it, and websites that do not work with assistive technologies
that many people with disabilities rely on.
In response to a question, Brewer addressed the continued problem of
making tabular information accessible, and she requested input on where
the WAI should go in this area. She referred to a workshop held by the
National Institute of Standards and Technology on complex tabular infor-
mation that resulted in several recommendations.
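
The workshop recommendations are not reproduced here. Independently of them, a widely used baseline for simple tables is to mark up the caption and the row and column headers explicitly, so that assistive technologies can announce the headers that apply to each cell. The sketch below generates such markup. The function name and the sample values are hypothetical and made up for illustration, and complex multi-level tables would need more than this.

```python
from html import escape


def accessible_table(caption, column_headers, rows):
    """Build an HTML table with a caption and scoped header cells.

    column_headers: labels for every column, including the row-header column.
    rows: sequence of (row_header, cells) pairs, one cell per data column.
    """
    lines = [
        "<table>",
        f"  <caption>{escape(caption)}</caption>",
        "  <thead>",
        "    <tr>",
    ]
    for header in column_headers:
        lines.append(f'      <th scope="col">{escape(str(header))}</th>')
    lines += ["    </tr>", "  </thead>", "  <tbody>"]
    for row_header, cells in rows:
        if len(cells) != len(column_headers) - 1:
            raise ValueError(f"row {row_header!r} has the wrong number of cells")
        lines.append("    <tr>")
        lines.append(f'      <th scope="row">{escape(str(row_header))}</th>')
        for cell in cells:
            lines.append(f"      <td>{escape(str(cell))}</td>")
        lines.append("    </tr>")
    lines += ["  </tbody>", "</table>"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values only; these are not NCSES statistics.
    print(accessible_table(
        "Placeholder table (values are made up)",
        ["Field", "2009", "2010"],
        [("Field A", [10, 12]), ("Field B", [7, 9])],
    ))
```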
Brewer argued for publishing existing S&E data in compliance with
Section 508 requirements, while continuing R&D on accessibility tech-
niques for new technologies, improved accessibility supports for cognitive
disabilities, and more affordable assistive technologies, such as tablets. She
said WAI would partner with agencies to ensure that dissemination tools
are accessible.
Recommendation 4-4. The National Science Foundation should spon-
sor research and development on accessible data visualization tools
and approaches and potential other means for browsing and explor-
ing tabular data that can be offered via web, mobile, and tablet-based
applications, or browser-based ones.