Review Comment:
The paper follows on from previously published work on VOWL, and provides more detail about the two implementations ProtégéVOWL and WebVOWL, in addition to a new evaluation with a set of (domain) expert reviewers.
The paper is fairly easy to follow, and does a good job of illustrating a new contribution to visualisation of ontologies. Value over existing work is clearly discussed, as well as additional avenues of use, and the paper concludes with pointers to further work.
Overall, I think the paper makes a good contribution to the field; the comments that follow predominantly concern points in the presentation where I found myself looking for additional information to answer a question raised by the discussion at that point. I suspect a fair bit of this comes down to making sure that new work is presented without repeating too much of previous work, and/or not presenting so much of what appears to be fairly extended work that the paper becomes too broad. Or, also, the trap we all easily fall into: the reader does not have all the context the authors do, so what may be obvious to the latter simply is not to the former.
The other area I would draw attention to is the discussion of the evaluations (detail below).
************
It is not until fairly late in the paper that the EKAW paper from which this is extended is first referenced. I only just noticed, on my third read, that it is actually the first footnote on the first page! But the footnote is not actually referenced from the text. I expected this reference at the top of page 2, where the previous version of VOWL and the related papers are first brought up.
Also, "a demo of WebVOWL will be presented at EKAW 2014" - at the time of submission, EKAW 2014 had already taken place.
I remain a bit confused - WebVOWL is described as "a standalone application entirely based on open web standards". Is it a web-based tool, as the name implies, or a standalone application as described? This is finally answered - on p.10. It might be useful to clarify this earlier: a "web application" isn't quite the same as a "standalone application".
It is also later referred to as "WebVOWL, a responsive web application …" - what exactly does "responsive" mean, and how is it measured? Is there any reliance on a network connection, and what were the specs of the machines the tests were carried out on?
In a top-end research lab, the equipment available far surpasses what the average, non-technical end user will have access to, especially at work; depending on what they typically do, many users simply do not need much computing power. FYI, I make this comment based on experience working with, among others, aerospace engineers doing a decent amount of data crunching, where we eventually had to provide alternative machines (security was not the issue here). The point is, the description may be valid, but it needs to be qualified.
A bit pedantic, but "The evaluations helped to improve VOWL and confirm that it creates comparatively intuitive and usable ontology visualizations." - "usable" is so broad that I'm not sure it contributes much after "intuitive". Usable for what? Usable to whom (as in, which user type)?
"Many approaches visualize ontologies as graphs, which is a natural way to depict the structure of the concepts and relationships in a domain of knowledge." - playing devil's advocate here, because I do not disagree with the point completely - BUT, what makes graphs a natural way to depict this structure? I could cite half a dozen articles that justifiably say the opposite, or that some other structure works better.
On the same point, and probably more importantly, there are very valid arguments against using force-based layouts - in fact, the authors raise one toward the end - using the word "appealing" to refer to them is debatable.
These arguments become clearer further on in the paper - it would be useful to add a sentence or two here justifying the points, with relevant references.
p.3 - "However, the ontology is converted into the NodeTrix structure for the visualization, making it difficult to get an impression of its global structure and topology." - how is NodeTrix responsible for the issue here?
"developers were given more freedom in the parametrization of VOWL " - does this mean that it can be extended?
Related to this - "VOWL does not specify a particular scaling method for the circle radius, but proper results will likely be achieved with a logarithmic or square-root scaling in most cases." - what is this claim based on? Again, does this mean the end user can extend the tool to do this?
The answer appears to be yes, emphasis on "appears".
Doesn't pointing to multiple instances of owl:Thing increase clutter - one of the things VOWL is supposed to avoid?
It's nice to see consideration for use in monochrome. However, there is no evidence provided to back this up - was this explicitly evaluated with end users? Or tested using some other verifiable method? I can see the argument with the text labels - which is a fair point, but this also contributes to clutter (a point noted during the evaluation). My question is also whether the current colour scheme works sufficiently well in monochrome that without the text labels it would still be usable.
The authors refer to the use of Venn diagrams in the description of the tools, and again wrt comments by participants in S5.2. However, I struggled a bit with 5.2.5 because, up to that point, there were no examples in any of the snapshots. It may be useful to point forward to Fig. 5 when this is first introduced, and especially in section 5.2.5.
EVALUATION
It would be useful to provide a brief (one sentence) description of the user types in the previous evaluations. For instance, on p.6:
"The representations of these elements were considered intuitive by many participants of the user study that compared VOWL 1 to the UML-based visualization of ontologies [55]." - it's impossible for me to interpret this properly without knowing what types of users these were. I know the answer is in the paper referenced, but there are 65 references in all … I should be able to read this paper without having to go to each of them for extra information.
Actually, this is finally provided in section 5, so alternatively point forward to this section.
I have a bit of a problem with the report of previous evaluation(s).
5.1 does a good job of telling me about the users - what I was missing earlier. However, at the end of the section I'm not sure I really understand what the results were, beyond that VOWL was compared with a set of (named) tools and came out looking good. This section needs one of two things: either present the results in more detail, or give a much higher-level summary and simply point the reader to the previous paper with the detail, keeping the focus on 5.2 with this as background - see also the point below on relating the two sections.
The cover letter refers to a new evaluation with expert reviewers. However, I didn't find a specific reference to this effect in the paper itself (apart from in the abstract) - I guess this is section 5.2? If so, it would be useful to state that this is a follow-on to the previous evaluation reported (5.1), and to give in the paper itself the information (currently in the cover letter) about why it was considered a good path to go down.
"While the pick-and-pin feature was generally thought of as useful, one participant even asked for such a feature on his own." (5.2.1) - I don't understand this. Also, what was this additional feature? Further, I really don't follow the argument in 5.2.2 - multiplication would increase clutter, so this seems a bit contradictory. And if participants reported that they wouldn't want to answer the question for which this was relevant, that's even more confusing. Finally, what was the reason for the one exception?
Wrt my earlier comment about force-based layouts being natural or appealing, what were some of the other layouts requested by participants (apart from the mind map - which, strictly speaking, is not too different)?
Wrt the restrictions on the visualisation of set operations at the end of S.5, what is the expected impact on use? Was this evaluated with the participants?
The two reports in the evaluation section ARE related to each other. However, I do not see any discussion to that effect. This is important - a key aim of VOWL is to support end users who would not normally work with ontologies, or understand their structure in any great detail. The second evaluation with experts is actually very good in that it picks up additional requirements for ontology visualisation that the former may not, and therefore helps to ensure that these would be available to all users. At the same time, however, feedback from domain experts, unless they also have training in HCI/usability, will not pick up on what would be difficult for non-experts to work with.
It would be useful if 5.2 referred back to key issues raised in 5.1 during the discussion, and/or if an additional (relatively short) discussion were included as a summary at the end of the evaluation section, showing the value of the 2nd evaluation to these (casual) users.
Also, while it's perfectly acceptable to present only qualitative evaluation results, for a journal paper you really need to justify why this is so, and whether or not the results can be seen as representative of the target user population. Off the top of my head, one reason might be the small numbers in each study; however, that alone is not enough. Another might be the use of expert reviewers. But it is not for me to surmise - it is for the authors to clarify.
MINOR POINTS
p.5, col 2, top - "The recommended color scheme has been designed in accordance with the general guidelines: For instance, and inline with …" - is there something missing here? The first sentence doesn't end. Also, it should be "[in line] with".
CITATIONS & REFERENCES
OntoViz is mentioned but never referenced.
Ordering at the start is weird - it appears to list URLs only, but then there are a few more URLs scattered within the rest of the references (which are otherwise listed alphabetically).
Check that capitalisation is maintained for acronyms, e.g., [11] "OWLGrEd: a uml style graphical notation and editor for OWL 2."
And also consistently named, e.g., RDFgravity in [62] but "RDF Gravity" in text.
LANGUAGE & PRESENTATION
OWL === Web Ontology Language - would suggest "OWL (Web Ontology Language)"
"an increasing number of people in modern knowledge societies get in contact with ontologies." -> "… COME INTO contact with"
"users would not discover them as flawlessly as the permanently displayed elements" (p.12) - "flawlessly" here is a bit strange; maybe "effortlessly" or "easily"?
"5.3. Benchmark of the VOWL Visualization" - consider "BenchmarkING" or "Benchmark TESTING", as otherwise the header implies that the "VOWL Visualization" is the benchmark, rather than the thing being tested.
Ditto "… in contrast to ProtégéVOWL at the time the benchmark was performed…"
Overall, well written and easy to read. A number of minor corrections are needed - these should be picked up by an automated check and a proofread.