Hunter Gatherer:
Interaction Support for the Creation and Management of Within-Web-Page Collections

m.c. schraefel,1 Yuxiang Zhu,1 David Modjeska,2 Daniel Wigdor,1 Shengdong Zhao1
1Dept. of Computer Science | 2Faculty of Information Systems
University of Toronto, Toronto, Canada
{mc|dwigdor|yuxiang|shengdong}@dgp.toronto.edu; modjeska@fis.toronto.edu

Figure 1. Hunter Gatherer at work. A sample collection page is on the right. Below each component in the collection is the link to the component's source page. Each component has a default, editable title. Collections can contain any web page element: shown here are images, forms and text. In the upper left is the List/Edit window to monitor the collection as it is being created. In the lower left are the pages from which the collection was created. A video demonstration is available at http://shaka.dgp.toronto.edu/hg/overview

Copyright is held by the author/owner(s).
WWW2002, May 7-11, 2002, Honolulu, Hawaii, USA.
ACM 1-58113-449-5/02/0005.

ABSTRACT

Hunter Gatherer is an interface that lets Web users carry out three main tasks: (1) collect components from within Web pages; (2) represent those components in a collection; (3) edit those component collections. Our research shows that while the practice of making collections of content from within Web pages is common, it is not frequent, due in large part to poor interaction support in existing tools. We engaged with users in task analysis as well as iterative design reviews in order to understand the interaction issues that are part of within-Web-page collection making and to design an interaction that would support that process.

We report here on that design development, as well as on the evaluations of the tool that evolved from that process, and the future work stemming from these results, in which our critical question is: what happens to users' perceptions of web-based resources and their web-based information management practices when they can treat this information as harvestable, recontextualizable data, rather than as fixed pages?

Categories

H5.4 Hypertext/Hypermedia-Architectures, Navigation, User issues. H5.2 User Interfaces-Prototyping.

General Terms

Design, Experimentation, Human Factors.

Keywords

Web-based interaction design, information gathering and management, attention, collections, transclusions

1. INTRODUCTION

Studies of Web-based information interaction, such as [2][5], have generally dealt with a Web page as the smallest unit of consideration. Task analysis carried out in a user study reported in [13] indicates, however, that users regularly need to deal with smaller units, that is, information components from within Web pages. The study found two things: (1) Web users want to be able to make collections of information found within Web pages, but (2) users only infrequently make such collections, in large part because of poor interaction support for this activity. For instance, bookmarks, which reference entire pages, often capture more than the desired data; this forces users first to load and then to sift through multiple pages to find the desired material. Text editors cause users to shift attention between the information gathering task in the browser and the information management task in the editor. With editors, users often forget or neglect to label the collected component with a title or the URL of the source page, which makes later access to the original material difficult and degrades the value of the collection over time.

Despite these shortcomings, those surveyed still expressed a need to create collections from material within Web pages. Scenarios for such collections are easy to imagine: a journalist might want to build a collection of different newspaper coverage of the same story. A student might build a heterogeneous collection to reflect her current term, including courses, professors, gym hours and so on.

We developed Hunter Gatherer (HG) both to support this kind of within-Web-page collection making and to investigate how this novel interaction design might affect Web-based information practices. Hunter Gatherer (Figure 1) blends the transparency of bookmark capture for component selection, with the support of an editor for revising collections. The tool also automates the inclusion of a contextual, editable header/annotation for each component, and grabs the URL of the source page for that component (Figure 1, right; Figure 2, close up), so that users can return to the source document at any time.

Figure 2. Close up of a single component in a collection. The figure shows the collection title in the window name, and the automatic addition of both the element header and the URL back to the component's source page.

Our interaction goal for Hunter Gatherer's design is to let users, rather than the tool, determine which information activity they wish to focus on: gathering, management or contemplation of the collection. Our software goal has been to create a tool that integrates with the browser and utilizes web-based protocols so that the user does not require additional software to carry out these tasks. Our larger research goal is to use this tool to help us investigate both perceptions of and expectations of what might be called information flexibility in an information space that has previously defined the smallest unit of information to be the Web page. We wish to investigate how this might change once users have tools which can support information harvesting, in which they can replant or repurpose information elements from one context into ones of their own devising.

Hunter Gatherer is the result of an iterative process of user-based design, surveys and evaluation. This paper describes the most recent version of the artifact, the associated interaction design, and its evaluations. We begin with a discussion of Web-based collection management tools research and illustrate where this work does not address the interaction problem most relevant to within-Web-page collection making: shifting focus between information capture and post-capture information management. We follow this with a discussion of our prototype tool development. We present our evaluation and consequent evolution of the tool over several iterations. Finally, we report on lessons learned from these evaluations, and describe how the results have helped to refine our understanding of the tasks we hope to support, and the steps we wish to pursue in future work.

2. RELATED WORK

Our research investigates the problems faced by Web users who wish to carry out two related tasks: to gather information components from a variety of Web sources and to manage that gathered information. When we focus on information gathering on the Web, we foreground the process that Marshall et al. [9] refer to as "information triage," the act of moving through a variety of sources to determine quickly whether they are of potential worth. The sticking point occurs when, on making such a determination, we wish to capture the component identified for retrieval. When users are engaged in information triage, they currently lack a method for putting the identified components into a collection without needing to make the collecting activity a foreground task. While there has been much work done on the management of Web-based document collections (which we discuss below), there has been less work on the interaction activity of placing the identified information from the source into the collection. Therefore, our work has focused especially on the latter process.

2.1 Bookmarks and Visualization

Our design model for the kind of transparent interaction that we wish to emulate has been bookmark-making. Bookmarking is well integrated with most Web browsers. The user engages a simple command key sequence, or makes a menu selection, and the current page is added to a list of bookmarks. With slightly more concentration on the bookmark task, users can shift focus to more specific information management tasks: many bookmark tools, for instance, support adding bookmarks directly to specific folders within the bookmark list. Such interaction supports a gradient of task focus, from peripheral attention to main focus. While bookmarking supports this multiple attention level for interaction, its failure to help users retrieve information effectively from bookmarks has been well discussed in Abrams et al. [1]. To deal with the shortcomings of bookmarks for retrieving information, several research and commercial applications have been developed. While not completely applicable to our research, there are related findings from that web-based work which inform ours.

Card, Robertson and York's WebBooks [2] is an early example of an application for bookmark visualization. In this work, the entire Web page is always available, eliminating the requirement for a user to load each interesting bookmark iteratively. Collections of pages are visualized as books, where pages in the collection can be quickly "flipped through." While the WebBook eliminates the need for users to load pages, it still focuses on a complete Web page as the artifact of value.

More recently, Robertson et al. developed the Data Mountain tool to let users arrange bookmarks as page thumbnails on an inclined plane. Compared with Internet Explorer's Favorites bookmark tool, participants were able to retrieve pages more quickly and with fewer errors [12]. Czerwinski et al. extended this work; they demonstrated that the name and the location of a bookmark on the plane were the two factors most important for successful retrieval; a page's thumbnail image was less important [5].

Amento, Terveen, Hill and Hix's TopicShop work [2][14] draws particularly on the Data Mountain research for letting users manage collections of sites on a given topic. In this case, an algorithm developed for TopicShop captures candidate sites, which become available to a user in a multi-paned window. In the site profile pane, for instance, a list of sites shows miniature thumbnails of the page, along with relevant site characteristics, such as name and number of links in and out of the page. This information helps users decide if they wish to visit the site. Users can then drag chosen sites into a "work area." The site is represented here as a thumbnail. Thumbnails can be "piled" into groups; groups are in turn reflected in the site profile window. Evaluation participants found this multi-view approach to evaluating and organizing collections to be TopicShop's most effective feature.

Once again, the Web page is the entity of value. This makes sense in the case of TopicShop, as the entire page or site is desired overall, since, by design, the pages collected are themselves either all "on topic" (e.g., a fan site) or are collections of links to such sites. It is not clear if the TopicShop algorithm could be extended to capture, for instance, a more heterogeneous notion of topic, as in the preceding student scenario. There, "My Term" as a topic might reflect an associative set of components such as courses and student loan information, rather than clusters of similar information.

2.2 Editors

Some editors such as Microsoft Windows' Front Page and Netscape Navigator's Communicator are better integrated for the within-Web-page collection process than basic text editors or even some word processors. Both applications let users open a blank, editable page into which they can drag content, including images, from the browser to the editor. Users can then edit the collected information in any way they wish. Unlike bookmark managers, the editor page makes all the collected components readily apparent to a user looking at the file. The file can be saved to a server via the editor's integrated FTP support. Users can also access the URL of any collected image. The same cannot be said, however, for any collected text. Unless the URL is specifically grabbed, that information is not captured. Similarly, the user must label the content themselves, since no page information (such as page title) travels with the copied content. Word processors such as Microsoft Word support drag and drop of both text and images from Web pages into files; plain text editors support text capture.

2.3 Hybrids: Spatial Hypertext

In Spatial Hypertext, which predates the emergence of the Web, the notion of the page, per se, does not exist. Documents are always already collections of data objects, like one's own notes on a topic, or references to other works. These data objects are manipulated in a 2D visualization space, so that the space in which a user creates a hypertext is also the space in which that document is viewed. This is a more elastic version of hypertext than what the Web currently supports. As an intermediary, Mark Bernstein's Web Squirrel(*) is a tool that attempts to bring some of the data object (rather than Web page) approach to Web practice, though its main use is for annotating bookmarks rather than capturing components within pages. Web Squirrel lets users create and copy information (such as URLs) into a Web Squirrel file. The data is represented as squares to be directly manipulated in a 2D space. The objects can then be arranged and annotated. Agents sift through information in a collection (or "farm" in Squirrel parlance) and suggest connections among collected objects. Like bookmark lists, which only reveal a page title, not the page content, the Web Squirrel boxes hide the annotation/link information attached to them. Also, only one box's information can be revealed at a time. As well, while users can copy and paste text information from a Web page into Web Squirrel, the source URL for that text is lost unless the user also grabs the URL and drops that into the application. This URL will then show up as a distinct box from the text. Finally, Web Squirrel does not capture images or other media.

(*) http://www.eastgate.com/squirrel/FAQ.html

2.4 Overview

With the exception of a hybrid tool like Web Squirrel and the Spatial Hypertext work that informs it, Web-based research has focused on managing whole Web pages and sites, rather than on the discrete content within a Web page. Even in Spatial Hypertext with its emphasis on capturing one's own annotations, however, there is little consideration of the interaction of getting content from one context to another. We wish to expand the research to consider this interaction aspect of the movement among information gathering, capture and reflection, and how that can be supported in a web-based approach.

3. HUNTER GATHERER DESIGN PROCESS

Our main goal for Hunter Gatherer has been to support the interaction process of collecting within-Web-page components. To determine how best to do this, we carried out a task analysis, a tool comparison and an initial prototype design review [13].

3.1 Goals

From our tools and task analysis, and prototype design review, we determined 3 requirements for Hunter Gatherer.

* First, the addition of components to collections must be as transparent as highlighting text.

* Second, the interaction must support user-determined, not tool-forced focus shift among component selection, addition, monitoring, and management.

* Third, the collected components must automatically capture enough contextual information for the collection to be immediately valuable for the user.

In the following sections, we present an overview of the artifact to support this process, and its evaluation in terms of these three goals.

3.2 Description of the Tool and Architecture Overview

3.2.1 Browser Integration

Hunter Gatherer is a browser-based, not a stand-alone application. By integrating Hunter Gatherer within the browser in a manner similar to browser support for bookmarking, we are able to minimize the forced divided attention [15] introduced by shifting between one application (the browser) and another (the editor); between information triage and management. Our approach is also proxy based. This means that the user does not have to download additional software to access the tool. While not perfect, the proxy approach also lets us support multiple operating systems and browsers simultaneously. Further, our interest is in the potential impact of supporting within-Web-page collection making on Web information practices. Multiple OS support lets us deploy the tool over a wide user space for this assessment.

3.2.2 Relation to Open Hypermedia

Hunter Gatherer collections are created by rendering references to a collection of addresses for the components within the Web pages. This means that there is no copying of content; only referencing of content addresses. This strategy closely emulates the Open Hypermedia concept of creating collections of smaller-than-page-size elements for what [7] refers to as "pick-up" styled, or arbitrary and user-determined, collections of components. By referencing locations within documents, HG Collections may also be framed in Open Hypermedia terms as user-defined (or user-authored) composites of anchors, as recommended by Halasz, "constructed by reference rather than by value" [8, p355]. We describe the benefits of this approach following an overview of the tool's architecture.

3.3 Basic Architecture

In the current system, once the client browser makes a request for a page, that page is run through a server-side process to convert the HTML to XML-compliant XHTML. Once the page is in XHTML, we can use the tree structure of XML's Document Object Model for the document to determine the location of a particular component selected by the user. We have 2 methods to identify components for selection. The first is by page element, such as a paragraph, indicated by XHTML tags like <p></p>.

The second method is to use XML's associated XPath to identify entities within elements, so that in <p>some text</p> a user can select, for instance, from the last "e" of "some" to the first "t" of "text". This latter method emulates the act of highlighting a portion of a Web page for copying. In the current iteration of Hunter Gatherer, we have discovered a number of incompatibilities across systems for within-element text selection, so we have temporarily taken this approach off line.
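As a rough illustration of the two selection methods, the following Python sketch uses the standard library's ElementTree, which supports a limited XPath subset, on a tiny well-formed page. The sample markup, the function names and the character-offset convention are assumptions made for illustration; they are not the Hunter Gatherer implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment standing in for a converted page.
PAGE = "<html><body><p>intro</p><p>some text</p></body></html>"


def select_element(root, path):
    """Method 1: pick a whole element by its location in the tree."""
    return root.find(path)


def select_within_element(root, path, start, end):
    """Method 2: pick a character range inside one element's text."""
    element = root.find(path)
    return element.text[start:end]


root = ET.fromstring(PAGE)

# Whole-element selection: the second paragraph under <body>.
paragraph = select_element(root, "./body/p[2]")
print(paragraph.text)  # some text

# Within-element selection: last "e" of "some" through first "t" of "text".
print(select_within_element(root, "./body/p[2]", 3, 6))  # e t
```

The element path plus a pair of offsets is one plausible way to express a within-element selection; a full XPath engine could express the same range differently.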

Once a user indicates a selected component is to be added to the collection, the server process either (a) creates a new collection Web-page if one is not already in use or (b) adds the component to the active collection. The component has a default, editable title assigned to it, consisting of the source page's title and a few keywords from the component. We also use the URL part of the component address to create a URL for each component to take the user back to the component's source page. The collection can then be represented as what we call an Aggregated URL. For instance,

http://[server]/examples/servlet/Collection_b?aurl=http%3a%2f%2fwww%2eutoronto%2eca% 2fphonebook%2f%23H1%231%234%23Find%20profs%20with%20this...[UofT%20Phone%20Book%20Search] %7chttp%3a%2f%2fwww%2eutoronto%2eca%2fphysical%2ffac%5fserv%2ffacilities%5fsub%2fACentre %2ehtml%23P%234%231%23...[Gym%20Hours]&pagetitle=U%20of%20T%20and%20related%20info

represents an AURL with 2 components; the title of each component is the bracketed text at the end of that component. The final attribute of the URL is the title for the collection itself, which will appear in the title of the Web page containing the collection.
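To make the shape of an AURL concrete, here is a minimal Python sketch that builds and parses a string of the same general form as the example above. It assumes, from that example only, that components are joined with "|", that each component is a source URL followed by "#" and a locator, and that the title is the trailing bracketed text. The locator is treated as an opaque string because its fields are not specified here. The host name server.example and the function names are hypothetical. Python's quote also leaves characters such as "." unescaped, so the output differs cosmetically from the example.

```python
from urllib.parse import quote, urlsplit, parse_qs

# Hypothetical servlet address; the real server name is not given.
BASE = "http://server.example/examples/servlet/Collection_b"


def build_aurl(components, page_title, base=BASE):
    """components: list of (source_url, locator, title) tuples."""
    parts = [f"{url}#{locator}[{title}]" for url, locator, title in components]
    aurl = quote("|".join(parts), safe="")
    return f"{base}?aurl={aurl}&pagetitle={quote(page_title, safe='')}"


def parse_aurl(aurl_string):
    """Return (page_title, [(source_url, locator, title), ...])."""
    query = parse_qs(urlsplit(aurl_string).query)
    page_title = query["pagetitle"][0]
    components = []
    for part in query["aurl"][0].split("|"):
        body, _, title = part.rpartition("[")
        source_url, _, locator = body.partition("#")
        components.append((source_url, locator, title.rstrip("]")))
    return page_title, components


example = [
    ("http://www.utoronto.ca/phonebook/", "H1#1#4", "UofT Phone Book Search"),
    ("http://www.utoronto.ca/physical/fac_serv/facilities_sub/ACentre.html",
     "P#4#1", "Gym Hours"),
]
aurl = build_aurl(example, "U of T and related info")
title, parsed = parse_aurl(aurl)
print(title)
for source_url, locator, component_title in parsed:
    print(component_title, "->", source_url)
```

Because the whole collection lives in the URL, sharing or bookmarking the string is enough to reproduce the collection, which is the property the following paragraphs rely on.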

Portability. In emailing or otherwise sharing Collection AURLs, each user can view and non-destructively edit the collection, since editing only changes an AURL, and one user's changes to an AURL have no impact on another's.

Dynamic Components. The reference-based approach to collections makes collections dynamic. If a user includes a component for the local weather, each time the page is loaded, the user will see the latest forecast; if the user references a bank account balance, it will show up as the current balance. In some cases, it may be necessary to construct methods to let users identify which components are important to be set as static and which can remain dynamic. For now, we are interested particularly in better understanding the interaction between component selection, capture and management rather than considering the long-term archival properties of a collection. That said, dynamic versus static raises interesting questions about where the static material is stored. It is relatively simple to save the collection as a local HTML file that will keep the HTML attributes, like links, in the page alive, but that reintroduces a user-side problem for future retrieval of the file. A server-side solution would likely require a network Web disk approach. We are looking into the design of this extension.

Relative Addressing and Bumping. The Document Object Model (DOM) of Web pages lets us access locations within a Web page relative to the root of the document. For instance, a page may have two elements after the document root, paragraph A, <p>A</p>, and paragraph B, <p>B</p>. If the user selects paragraph B, we initially used the location of paragraph B in the document tree to create the address for that component in the collection's AURL. This approach had one potential drawback: if an author adds a new paragraph between A and B, the new paragraph becomes B, and the old B becomes paragraph C. We call this effect "bumping." If the user previously collected paragraph B, they would now have the new paragraph B in their collection. Our solution (implemented after our initial field trials) came from the Annotation community [3]. In annotation, one of the goals is to keep an annotation associated with a particular component, even if that component is moved within a document. We have recently adapted Phelps and Wilensky's Robust Intra-document Location algorithms for reattaching annotations to altered components [11] to keep track of "bumped" components. We have yet to formally quantify the success rate of this approach to component tracking, but informally, the technique has proven highly robust and will be part of our Prototype 2 evaluations.

It may be important to note, however, that such robustness is not a key priority for interaction design evaluation. In our field trials, losing components by being "bumped" in this way has not shown up as a concern for users. We do not have enough data yet to know whether or not this is because most collections reflect structurally static pages, so bumping is an infrequent occurrence, or if the collections themselves are being created for shorter term projects, rather than archival purposes, so that if a page changes structurally, users have not encountered this problem showing up in their collections.
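The bumping effect is easy to reproduce in a few lines. The Python sketch below addresses a component by its child position under the document root, shows how an inserted paragraph changes what that address resolves to, and then applies a deliberately naive fallback that searches for the remembered text. The fallback is an assumption made for illustration only; it is not the Robust Intra-document Location algorithm of [11], which is considerably more sophisticated.

```python
import xml.etree.ElementTree as ET


def resolve(root, path):
    """Follow a list of child indexes from the root to a node."""
    node = root
    for index in path:
        node = list(node)[index]
    return node


def resolve_with_fallback(root, path, expected_text):
    """Try the positional address first; if the content there has
    changed, fall back to a naive search for the remembered text."""
    try:
        node = resolve(root, path)
        if node.text == expected_text:
            return node
    except IndexError:
        pass
    for node in root.iter():
        if node.text == expected_text:
            return node
    return None


original = ET.fromstring("<body><p>A</p><p>B</p></body>")
path_to_b = [1]  # second child of the root
print(resolve(original, path_to_b).text)  # B

# The author inserts a paragraph between A and B.
revised = ET.fromstring("<body><p>A</p><p>New</p><p>B</p></body>")
print(resolve(revised, path_to_b).text)  # New (the component was bumped)
print(resolve_with_fallback(revised, path_to_b, "B").text)  # B
```

Storing a little of the component's content alongside its position is what makes the recovery possible; the positional address alone carries no way to notice that the page changed.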

Transclusions. By referencing components with AURLs rather than by copying the content, Hunter Gatherer embodies a version of Nelson's Transclusions [10]. Transclusions propose creating and publishing hypermedia documents by reference, in part so that authors can control both private and public organization and publication of information resources. While the issues of intellectual property raised by letting a user reference parts of a page outside its own (potentially banner-added) context are outside the scope of this paper, one could imagine a method of extending Hunter Gatherer to support authorizing Web sites/pages/components for publication within public or private collections with something like a robots.txt file, or by implementing Nelson's own Transcopyright [10].

4. PROTOTYPES

We now turn to a description of our first alpha-distributed prototype and the evaluation of its interaction.

4.1 First Alpha Prototype

After our initial task analysis, we created a first proof-of-concept prototype to evaluate the concept in a design review with 26 participants [13]. The prototype allowed us to demonstrate the concept of within-page capture as well as the AURL for rendering component collections as new web pages. That prototype relied on the authoring of specific anchors within web pages: if the author had defined a div element within the web page and given it an ID or Title attribute, Hunter Gatherer could collect the div-wrapped content as a component. The results of the design review suggested that we were on the right track with the tool and interaction, but that supporting only author-defined components within web pages would limit the viability of the tool. To address this problem we developed our alpha prototype to support both author-defined and user-determined component selection. We used this first alpha in both lab evaluations and field studies.

4.2 Component Selection in the Alpha Prototype

There are three steps to collect a page component in Hunter Gatherer: (1) select the component to be collected (Figure 3); (2) with that component selected, press the "a" key; (3) a dialog box appears (Figure 4) asking if the user wishes to add the component or not. The user can click "ok" or press the return key to approve the collection. We plan to make this last step part of a user's tool preferences, since in our design reviews, some users wished to be asked to confirm a selection; others did not. The current default is to ask. The user can continue to add components in this manner. Any component that can be displayed in a Web page can be added to a collection, from images to applets.

Figure 3. Component Selection. As the user holds the control key and drags the cursor over the page, available components are indicated by borders appearing around them. By holding the control and shift keys, users can select multiple components.

Figure 4. The user has selected a component (indicated by the border around the selection) and hit the "a" key to add the component to a collection. The dialog box then appears to ask the user to confirm the addition.

The selection and add process is relatively transparent. It does not require the user, after selecting a component, to shift attention from the browser to an editor application, paste content into that application's file, go back to the browser, copy the URL, go back to the editor, paste the URL, add a note to contextualize the component, save the file, go back to the browser and refocus on hunting for the next component. The user simply identifies a component to be added; the system automatically adds the component to the collection and creates an editable title for the component that, by default, contains the title of the component's source page. The process also automatically adds the URL as a link back to the source document. By automating these steps, the tool lets users focus their main attention on their information gathering task until they decide to shift that focus to a different task.

Prototype Selection Note. The visual feedback for selecting a part or parts of a Web page is indicated by borders around elements (Fig. 3) rather than by highlighting. As users, we are used to interpreting highlighting as something that can be edited to a fine-grained level. Since the first prototype could not support this degree of selection fully, we opted to use borders to indicate what is selectable, since such bounding boxes are less likely to be interpreted as being as refinable as highlighting.

In our latest prototype, users can select components down to the level of a character within a word. We will evaluate whether we should keep both selection indicators, highlighting and bounding boxes, or use highlighting only.

Figure 5. List/Edit View of a Collection. This collection contains components from multiple Web sites. Users can sort, delete or rename components listed, and rename or preview the collection.

4.3 Collection Interaction

When the user first selects a component to be added to a collection, a small window, the List/Edit view, opens (Figure 5). This window displays a list of the components in the collection which allows a user to monitor the growth of that collection. As soon as a component is added to a collection, the name of the component is added to the List/Edit view. As a browser window, the List/Edit view can be closed or partially occluded by moving any other window over it, or it can be arranged to be peripherally available as shown in Figure 1 above. Figure 1 shows the List/Edit view visible beside the main browser window.

The List/Edit view as a separate window lets users determine the degree to which they wish to monitor a collection: each time they add a component after a collection has been initiated, the List/Edit View window does not come to the front, but stays where placed. Indeed, in the first design review of the initial prototype [13], the ability to adjust the "focus" of the List/Edit View to monitor collection state was seen to be an essential feature for the tool. If the user wishes to move task focus from adding components to the collection to dealing with the collection itself, they can do so via the List/Edit View. This window for monitoring collection state also acts as the editor palette for the collection. Users have several editing options available: they can rename a component, sort components in the list, delete components from the list, give the collection a title and preview the collection in a browser window.
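The editing options listed above map naturally onto a small, immutable data structure. The Python sketch below is one possible model, assuming hypothetical class and method names and example URLs; it is not taken from the Hunter Gatherer code. Each operation returns a new collection rather than changing the old one, which mirrors the non-destructive editing that follows from a collection being nothing more than its AURL.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Component:
    title: str        # editable title shown in the List/Edit view
    source_url: str   # link back to the component's source page


@dataclass(frozen=True)
class Collection:
    title: str
    components: Tuple[Component, ...] = ()

    def add(self, component):
        return replace(self, components=self.components + (component,))

    def rename(self, index, new_title):
        items = list(self.components)
        items[index] = replace(items[index], title=new_title)
        return replace(self, components=tuple(items))

    def delete(self, index):
        kept = self.components[:index] + self.components[index + 1:]
        return replace(self, components=kept)

    def sort_by_title(self):
        ordered = sorted(self.components, key=lambda c: c.title.lower())
        return replace(self, components=tuple(ordered))

    def retitle(self, new_title):
        return replace(self, title=new_title)


# Hypothetical example data.
term = (
    Collection("Untitled")
    .add(Component("Gym Hours", "http://example.org/gym"))
    .add(Component("Courses", "http://example.org/courses"))
)
edited = term.retitle("My Term").sort_by_title().rename(0, "Fall Courses")
print([c.title for c in edited.components])
print([c.title for c in term.components])  # the earlier version is unchanged
```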

4.4 Collection View

When the user selects Preview from the List/Edit View, a new browser window opens, displaying each of the components represented by the list, in the order in which they are displayed in that list. With both List/Edit View and Collection View open, as in TopicShop, users have two ways to visualize the collection simultaneously. As shown in Figure 2, each component appears with an automatically generated header: the title of the component's source Web page. The component also appears in the Collection with the source page URL as a link. At any time, the user can click that link to open the source page for that component. Likewise, any links within the captured component behave just as they would in the component's source page.

4.5 Gradations of Interaction: Focus

Throughout the collection making process with the prototype, the user can move among hunting for sources, selecting components from those sources, adding those components to a collection, editing the content of a collection, previewing the collection, and saving a version of the collection (by making a bookmark, for instance, of the current collection AURL). If the user at a later point wishes to return to a collection, they load its AURL, which may be done by selecting a bookmark for a collection or by pasting the AURL from an email message into the browser's Location area. To edit the collection further, the user clicks the "edit" link from the collection page, and a List/Edit View window of that collection opens, listing all its components. The user can continue to view or revise that collection. By having all views as browser windows, the user determines which part of the collection making activity they wish to foreground, keep in the background or have peripherally available, simply by arranging the browser's windows.

5. EVALUATION

In order to assess how Hunter Gatherer meets the requirements for collection, focus shift and continued value, we initiated 2 evaluations: an experiment to assess the tool's efficiency and a field study to gain insight into how a new way of working with Web-based information may fit into daily practice. The experiment was designed to assess tool efficiency and effectiveness compared with current best practice: if the tool is not more effective/efficient than existing methods, then there would be little reason for users to adopt the tool. Because we want to deploy the tool widely, the tool must be efficient and robust. The field study, on the other hand, was designed to assess tool affect in the context of Web-based information management practices. This largely self-reporting study would be our starting point to understand how to quantify tool use/impact on these information practices.

5.1 Alpha Prototype Experiment

5.1.1 Design and Methodology

We set up a 2x2, within-subjects study to test the efficiency of Hunter Gatherer compared to an editor for creating collections. To reduce learning curve noise in the data for the editor-based collections, we chose Microsoft Word as the most familiar editor among participants. The first factor in the experiment was tool (Hunter Gatherer vs. Word); the second factor was data set (Web pages on a Chemistry program; Web pages on a Physics program). We first ran a pilot study with five participants, refined the protocol, and ran the formal experiment with 12 participants, representing a mix of technical and non-technical undergraduate and graduate students at the University of Toronto.

At the start of the evaluation, users were given 15 minutes of training time with Hunter Gatherer. Users were then asked to build two collections, each from a given set of bookmarks, and to make each collection clear enough to be used by someone else. This direction was meant to motivate participants to use the tools' editing capability to create the most effective collection possible within the time constraints. We alternated which tool a participant would use first, Word or Hunter Gatherer. To reduce potential learning effects, we prepared two similar collections of bookmarks, one on the Chemistry program and one on the Physics program at the University. The pages for each set were taken from the same general Web sites, so that pages were similar but for content. So, for each tool, the participant used similarly structured data with distinct content. Participants were given 5 minutes with each set of 3 bookmarks to familiarize themselves with the content of the pages before each tool trial. Participants were then given 15 minutes to build a collection from the bookmarks that would (a) explain how to get a minor in the given subject, (b) list and describe the required courses, and (c) show the course instructors for those courses for the term. The experiment let us test HG in terms of our 3 requirements: (1) the efficiency of component addition, (2) the effectiveness of HG in the complete collection making cycle, and (3) the immediate legibility of the resulting collection.
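The alternation of tool order can be illustrated with a small scheduling sketch. The text states only that the starting tool was alternated and that each tool was used with a different data set; the rule used here for pairing tools with data sets (switching every two participants) is an assumption for illustration, as are the function and constant names.

```python
TOOLS = ("Hunter Gatherer", "Word")
DATA_SETS = ("Chemistry", "Physics")


def assign_conditions(n_participants):
    """Return, for each participant, two (tool, data set) trials in order."""
    schedule = []
    for p in range(n_participants):
        # Stated in the text: alternate which tool is used first.
        first_tool = p % 2
        # ASSUMPTION (not in the text): switch the tool/data-set pairing
        # every two participants so all four orders occur equally often.
        first_data = (p // 2) % 2
        trials = [
            (TOOLS[first_tool], DATA_SETS[first_data]),
            (TOOLS[1 - first_tool], DATA_SETS[1 - first_data]),
        ]
        schedule.append(trials)
    return schedule
```

Under this assumed rule, 12 participants are spread evenly over the four possible tool and data set orders, and every participant sees both tools and both data sets.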

5.1.2 Empirical Results

A one-way within-subjects ANOVA comparing average component collection time using HG and Word showed a significant effect of tool type on collection time (F = 5.730, p < .040). Participants required an average of 6.70 seconds using HG and an average of 10.9 seconds using Word (Figure 6). The effect of the content variable was non-significant and was pooled over.
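For readers unfamiliar with the test, the following sketch shows how an F statistic for a one-way within-subjects ANOVA can be computed from per-participant mean times. It is a generic textbook computation, not the authors' analysis script; the function name and input layout are assumptions, and no data from the study is reproduced here.

```python
def within_subjects_f(conditions):
    """One-way repeated-measures ANOVA.

    `conditions` is a list of equal-length lists, one per condition,
    where position i in every list belongs to participant i.
    Returns (F, df_condition, df_error).
    """
    k = len(conditions)     # number of conditions (e.g. 2 tools)
    n = len(conditions[0])  # number of participants
    values = [x for cond in conditions for x in cond]
    grand = sum(values) / (k * n)

    # Variation explained by condition (tool).
    cond_means = [sum(cond) / n for cond in conditions]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)

    # Variation explained by stable differences between participants.
    subj_means = [sum(cond[i] for cond in conditions) / k for i in range(n)]
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)

    # Residual variation is what remains after removing both.
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error
```

With two conditions, this F equals the square of the paired t statistic. If all 12 participants contributed to the analysis, the degrees of freedom would be (1, 11); the text does not state them, so this is an inference.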

 

 

Figure 6. Hunter Gatherer is significantly more efficient in component addition than Word.

5.2 Observations

General Observations. Despite the training session with Hunter Gatherer, in which we also demonstrated that each captured component contains a default header and source page URL, only 3 participants included the URL of the source page for a given component when using Word to build a collection. The collections, on average, had over a dozen components. The participants who included URLs did so for only a few components, and each of them had used Hunter Gatherer as their first collection making tool.

Word-specific Observations. In creating collections in Word, many participants over-captured the information required from the Web page, and then edited the extra material out of the collection file. Also in editing, Word was more efficient than Hunter Gatherer for revising component headers. Headers in Word could be edited directly in the collection, whereas Hunter Gatherer required participants to move to the List/Edit view and enter a dialog box to make a change. This motivated our revision of the tool to support editing of the component headers directly in the Collection View.

HG-specific Observations. In the post-evaluation questionnaire, most users reported that they would like to be able to highlight components to collect them, in addition to using the bounding box, as a method for component selection. Participants also commented that sorting components in collections was "easier" in Hunter Gatherer than in Word. Similarly, when asked what the best feature of Hunter Gatherer is, 10 out of 12 participants reported the automatic capture of the component's URL.

5.3 Analysis

We have met our first design requirement to make the addition of a component as efficient as selecting text in a browser. Though participants expressed a desire to have highlighting as a selection method, HG selection performance was significantly better than with Word. The Hunter Gatherer method is also more effective than Word for component addition, since HG automatically adds both a header for the component and the URL for the source page, the latter addition indicated by users as the most valuable attribute of the tool.

The alpha prototype only partially met our second design requirement to support user-determined focus shift among collection tasks. Header editing in the prototype forced users to concentrate on the tool, rather than the task: double clicking a header title in List/Edit view and ok'ing a change in a header dialog box is less transparent than editing the header directly in a file. Our observations also ind