A User Study of Visual Linked Data Query and Exploration in Mobile Devices

Tracking #: 948-2159

Authors: 
Balazs Pete
Rob Brennan

Responsible editor: 
Guest editors, Linked Data Visualization

Submission type: 
Full Paper
Abstract: 
This paper describes the iterative development and evaluation of highly usable mobile user interface elements for the query and exploration of geographical Linked Data. It includes an analysis and synthesis of the current state of the art in geographical Linked Data visualization and of industry design guidelines for mobile device user interfaces. It addresses the lack of published research on the usability or user experience of mobile Linked Data applications. The usability studies described here compare the usability of custom mobile Linked Data query and exploration interfaces with that of standard geographical Linked Data interfaces available on fixed platforms. Evidence was collected suggesting that, despite the limitations of a mobile interface for complex tasks such as Linked Data query and exploration, it is possible to attain usability on mobile devices equivalent to that of fixed platforms. The importance of visual feedback for users was demonstrated when designing for the limited screen area of mobile devices. The user study provides evidence that task-oriented HCI elements or controls are more important for usability than dataset explanation or visualization. The limited screen area of mobile devices often necessitates multi-screen task dialogs. This study provides evidence that minimizing the memory load on the user for multi-screen tasks, through visual cues of state in subsequent screens or even animated transitions between screens, produces a better user experience. The prototype mobile application developed as part of this user study delivers highly usable exploration and query of geographical Linked Data sets and compares favourably to state-of-the-art fixed-platform geographical tools. The paper also presents a unified analysis of industrial mobile HCI best practice and state-of-the-art Linked Data visualization application requirements that will act as guidelines for future mobile Linked Data application development. The user study provides a case study of how to design mobile Linked Data interfaces for specialized data sets with known use cases.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
By Emmanuel Pietriga submitted on 31/Jan/2015
Suggestion:
Reject
Review Comment:

The authors report on the iterative design of a mobile user interface for exploring a specific RDF-ized dataset. This dataset describes events that involved some type of political violence in the USA over the last 230 years. Events in this dataset are linked to named entities in DBpedia. The iterative design of this interface was informed by three user studies that were aimed at assessing its usability.

I was quite eager to read this paper. I agree that the community needs to get away from low-level representations of RDF graphs, whether they be as node-link diagrams or some other representation, and start to explore other types of representations for linked data. One promising avenue for this is to explore familiar representations. Geo-localized data and temporal data lend themselves quite well to this. The authors make a strong case for this direction of research in the introduction, which I really enjoyed reading. I had great expectations for the remainder of the paper, especially given that it seemed to be contributing some interesting user studies based on what I read in the abstract.

Unfortunately, I am afraid that the paper fails to deliver any tangible contribution beyond the introduction. First, there isn't much novelty in this work. Providing a panel with buttons and other widgets to filter a dataset of geolocalized/temporal events and displaying the result set on a map is not new. The software architecture is not new. If it is and there is some significant contribution in there, the authors need to explain it in much more detail. Doing all this on a mobile device is not new either. The authors do not claim that their contribution lies in the novelty of this UI, actually. Rather, the contribution would seem to lie in the iterative design & development process and the results from the user studies. This is perfectly fine, and such types of contributions are quite welcome, especially for this issue of SWJ. But if the contribution of a paper lies solely in its user studies, those have to be both rock solid in terms of experiment design & analysis, and they have to yield some insightful results. To my disappointment, the paper fails on both fronts, as detailed below.

But before I delve into the detail of the issues I have with the studies, there is one high-level issue I want to discuss: the rationale for studying these interfaces on a tablet is not strong enough, and makes the point of this whole enterprise a bit unclear. The authors justify designing an interface that will run on tablets rather than desktop computers on the sole basis that "haptic controls [...] could facilitate a richer, more engaging experience". This is a bit far-fetched. Why would domain experts (social scientists) work on a tablet for this sort of analysis, given that for many tasks those devices have more limited capabilities than desktop computers (as acknowledged by the authors themselves)? If the authors think that there are true advantages in using a tablet, specific hypotheses have to be formulated that capture this particular aspect of the UI, and those hypotheses have to be tested.

Another issue is that Section 3 does not cover related work effectively. It tries to give a very broad overview of many research areas, but mostly makes high-level statements about visualization, UI design and mobile HCI, failing to reference truly relevant work. This is not what is expected from the Related Work section of a research paper.

Detailed comments about the iterative design and user studies:

- It is not clear how the application requirements get derived from the survey of existing applications. Besides, I would argue that one does not derive requirements just by looking at existing applications. This is not a proper design method. One needs to look at actual user needs and technological limitations, just to mention the most obvious.

- Related to the above, involving "authors, colleagues and friends" in the prototyping process is pretty awkward. It is ok to have some informal first steps, but why didn't you involve representative members of the actual target audience in the design process? This would probably have made the whole effort and end-result much more meaningful.

- The adherence to iOS guidelines is anecdotal in this context.

- What led to the selection of these particular two interfaces (tag cloud and list) in the first prototype? This seems quite arbitrary. A related issue is that the flaws of the list UI discussed on p16 (selection of multiple values) are quite obvious. On mobile interfaces, lists that allow multiple selections will usually use tick marks to indicate selected items. Why didn't you do that in the first place? This would likely have addressed this issue. The more general point here is that section 5.3.4 isn't particularly insightful to the reader. This section is just reporting on a design choice that was obviously wrong in the first place. The reader does not learn anything new here.

- The experiment design is loosely described. Actually, the sections called "experimental design" do not describe the experimental design at all. You need to systematically report on the number of participants, how they were recruited, what characterizes them (what criteria were used to categorize them as experts or non-experts), and the apparatus (type of tablet, display resolution, desktop monitor resolution in experiment #3). This is very important, both to assess the validity of the experiment design and the subsequent analysis of results, and for the sake of replication of the studies' results.

- Empirical results are not reported appropriately. In section 5.3.3 the authors report on two more-or-less random t-tests (in an incomplete manner, actually), but this is the only place where we get some sort of statistical analysis. All other analyses are merely based on the comparison of mean values between conditions. This is unacceptable, as there is no way to assess the statistical significance of differences in measures across conditions, making the whole reporting almost useless to the reader. Proper statistical analyses have to be performed on the data (a minimal sketch of what I mean is given after these detailed comments).

- The authors make some random observations about the data, sometimes making general claims that contradict what they have reported on. For instance, in the first experiment, they write that UI2 has performed better than UI1. Looking at Table 4, task completion times with UI1 are actually shorter for 2 out of the 3 tasks! Add to this that the mean task completion time over the three tasks is quite likely not significantly different between UI1 (272s) and UI2 (269s). For "familiar" users, UI1 was actually twice as fast overall. This is just the most obvious example of such problems in the paper. The interpretation of results is in several places not only misleading but plain wrong. Another instance of this issue is the first conclusion drawn in 5.4.7. Again, this is not acceptable for a research paper.

- There are several issues with the counterbalancing of conditions throughout the experiments (number of participants per condition, presentation order).

- Experiment 2: I fail to understand why the authors recruited people who participated in experiment 1 as novice users. What is the point? This would only contribute to reducing the differences with experts in an uncontrolled manner. Another question related to participants, as mentioned above, is what qualified some participants as "experts"? What were the selection criteria?

- The number of participants was somewhat low in all cases, which makes it unlikely that there would be strong observations to be made out of a proper results analysis.

- Comparing quantitative results across experiments (as the authors do with #1 and #2) can only _suggest_ possible facts. These comparisons do not _show_ or _demonstrate_ anything, simply because the conditions (UI, tasks, etc.), and the participants, were different across those studies.

- The Map4RDF interface needs to be described. Again, this is both to help readers interpret the results, and for the sake of reproducibility of the experiment.

- I do not understand how users' preferences were collected in experiment #3 (mobile vs. desktop applications). Actually, I am not even sure I understand what this means. Does it mean that (A) they preferred mobile applications vs. desktop applications generally speaking? Or that (B) they preferred one or the other just for this particular application? If (A), this isn't meaningful, as it is unlikely that one would have a preference for mobile vs. desktop independently from the application and its context of use. If (B), then the authors are extracting an additional factor for driving their analysis _from the measures_. This is questionable in terms of methodology. Beyond that, the observations in that respect are more or less stating the obvious.
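As a concrete illustration of the statistical reporting asked for above, here is a minimal sketch using hypothetical per-participant completion times (not the authors' data) and assuming a simple between-subjects comparison of the two interface conditions:

    # Hypothetical example only: test whether completion times differ between
    # the two interface conditions instead of just comparing their mean values.
    from scipy import stats

    ui1_times = [250, 310, 255, 290, 275]  # placeholder per-participant totals, UI1 (s)
    ui2_times = [240, 305, 260, 300, 270]  # placeholder per-participant totals, UI2 (s)

    # Between-subjects comparison; stats.ttest_rel would be the within-subjects analogue.
    t, p = stats.ttest_ind(ui1_times, ui2_times, equal_var=False)
    print(f"Welch's t = {t:.2f}, p = {p:.3f}")  # report the statistic and p-value, not just means

Reporting the test statistic and p-value (and checking the test's assumptions) is what lets the reader judge whether an observed difference in means is meaningful.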

In the end, I think that the community definitely needs to move in the direction suggested in the introduction of this paper. I strongly encourage the authors to keep working in this direction and resubmit their work once it has reached a higher level of maturity. But I can only recommend that, before they resubmit, the authors read more literature about how to design, run, analyze and report on user studies. Example venues where such papers can be found include any SIGCHI journal and conference proceedings (CHI, UIST, CSCW, ToCHI) and other journals (IJHCS, etc.).

Finally, any future submission should be proofread. There are a lot of typos, grammar mistakes and missing words.

References to figures need to be fixed. Most of them are wrong.

Some positive aspects of the paper worth mentioning:
- Figures 9 and 10 are very nice and help the reader quickly understand the differences between two successive design iterations.
- I liked the fact that the authors provided a detailed rationale for each of the tasks (questions) performed by participants.

Review #2
By Tomi Kauppinen submitted on 02/Feb/2015
Suggestion:
Major Revision
Review Comment:

My review is structured according to (1) originality, (2) significance of the results, and (3) quality of writing.

(1) originality

The paper is an original contribution. Although there are plenty of papers comparing web and mobile interfaces, I am not aware of one concentrating on Linked Data exploration with these two interface options.

(2) significance of the results

The authors have a nice goal of structuring the research around hypotheses and presenting the results accordingly. My concern is that, for example, the following hypothesis is not very generic (i.e. it is very specific to this study): "Prototype 2 will score higher in usability than the usability baseline for prototype 1 created in Experiment 1, given the addition of visual feedback to the application." Thus I would recommend that it be reformulated to be more generic, something like "visual feedback provides ...".

Moreover, I wonder whether the hypothesis "The cloud tag based design for information layout on the visual query interface is more usable than a vertical list based design." has already been studied in other papers. This sounds like a question that could be studied more generically, without concentrating specifically on mobile and web interfaces.

Other than these points, I like the way the authors have presented the research setting and results. Thus I would recommend a revision where the authors reformulate and better argue the hypotheses and discuss the results in relation to the existing literature (when applicable).

(3) quality of writing

This paper is very well written and structured, and it does not have any serious spelling or grammatical issues. The only thing I would clarify is the referencing of hypotheses. At the end of the paper the authors refer to hypotheses by symbol (e.g. H1, H2). It would be very good to use these symbols already where the hypotheses are first introduced (e.g. section 5.3.1) to support the reader.

Review #3
By Mark Gahegan submitted on 04/Feb/2015
Suggestion:
Major Revision
Review Comment:

A User Study of Visual Linked Data Query and Exploration in Mobile Devices
Overview

This paper covers a lot of ground, and perhaps running so many research questions together confuses the issues and the findings? As a result, I think it is over-long, and it would be better to focus a much shorter paper on the Linked Data aspects and remove the material on user-centred design and evaluation, and present that elsewhere.

The main focus, as evidenced in the conclusions, is a user-centred design and test methodology for improving layout and interface design for a mobile GIS application.
Some claims are made about overcoming the limits of small screen sizes. Would not desktop geoviz also benefit from the same design improvements to the interface though—such as visual cues regarding state, or animated transitions? Which of these is unique to mobile devices? Would we see the user experience improve here as well, if such techniques were used on the desktop?

The introduction of the paper makes some connections to the theme of Linked Data and the semantic web, but to my mind these are not central to the main aims and more detailed content of the paper. I wonder therefore if the paper is targeting the right journal. A cartography journal such as CAGIS might be a better fit. The design and evaluation sections are well executed, and I think they make a valuable contribution to the field of map interaction and geovisualisation design. I'm less convinced by the Linked Data aspects: they are present, but I don't think they play a major role; the work stands without them, and the system itself can work with 'canned' or offline data.

Specific issues
Introduction: I agree that user-facing systems should be intuitive, attractive and usable, but what has this got to do with linked data? I think it would be better to keep these two issues separate, and explain the motivation for each separately. So, what does a linked data application actually provide to end users that is missing from traditional geoviz systems? What, if anything, is different between "best practice for information visualisation of Linked Data" and best practice for any information visualisation?

"Here we define geographical Linked Data as any data-set that contains sufficient properties to be effectively located on a global map in terms of global latitude and longitude."
This definition is quite different from that of typical linked data (e.g. http://en.wikipedia.org/wiki/Linked_data), so I think you should discuss why you chose this definition and what these differences imply. Your definition, by contrast, seems to apply to geographic information in general, whether or not it is Linked.
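Under the paper's working definition, deciding whether a resource is "geographical" reduces to checking for usable coordinate properties. A minimal illustration of that reading (hypothetical file name, assuming the common WGS84 vocabulary) would be:

    # Hypothetical sketch: count resources that can be placed on a global map
    # because they expose WGS84 latitude/longitude properties.
    from rdflib import Graph, Namespace

    GEO = Namespace("http://www.w3.org/2003/01/geo/wgs84_pos#")

    g = Graph()
    g.parse("events.ttl", format="turtle")  # placeholder dataset file

    mappable = [s for s in set(g.subjects())
                if (s, GEO.lat, None) in g and (s, GEO.long, None) in g]
    print(f"{len(mappable)} resources can be located on a global map")

Nothing in such a check involves links to other datasets, which is exactly why the definition reads as a definition of geographic information in general rather than of Linked Data.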

Section 3.1 Guidelines used for designing the app are sound and represent best practice.
"Direct interaction has also been shown to increase user confidence in the results that applications return." Provide a reference to back up this claim.

Section 3.4. Not sure if it is necessary or helpful to reiterate the iOS design principles here. Could you simply refer to them? Either way, no citation is provided.

Section 3.5 is a restatement of work already published (by others). Could it be moved to an appendix?

Section 3.6…if caching is needed, then how do you go about it?
I like Table 1; that's a really nice, succinct summary of mobile linked data apps to date. Table 2 is useful also, but seems somewhat off topic. The focus of the paper is not to survey linked mobile apps. How many of these apps can display geographic data/maps? It might be more useful to show that if you want to demonstrate that you are filling a vacant niche.
How efficient is it to move geospatial data around when it is encoded using RDF?

Section 4
On the screenshots shown in Figure 7, what is the ordering used to lay out the options? It does not appear to be alphabetical, but they both show the same ordering. Is ‘shopping’ really a motivation? If yes, that really deserves a paper in its own right!

Section 5
Are the authors, colleagues and friends really a good representative sample for paper evaluation? Why were no formal methods used here?

In the final user study, can you say more about the participants (how many, with what background)? The evaluation method seems sound.

Mixing new and repeat users in section 5.4.3 is a mistake I think, as it is likely to mask or change the intended effect of the study, which is to highlight differences between domain experts and domain novices. But I do note that the experts still did better in terms of time. I’m not convinced we can take these results (Table 6, &) at face value though.

Section 6
Conclusions seem valid, though I note that none of them relate to Linked Data.