Review Comment:
The paper presents an overview and the current status of the Linked Connection (LC) framework, where semantic technologies are employed to allow the inexpensive publication of public transport data gathered from heterogeneous, possibly linked, data sources. The three components of the framework are presented: (i) the vocabulary designed to deal with rapidly changing ("live") data; (ii) the server developed to process data feeds compliant with transport standards (GTFS-RT); and (iii) a size-based file fragmentation strategy for data management and exchange. The resulting framework allows the authors to face challenges such as guaranteeing URI stability and validity over time and the management of past ("historical") data for query optimisation. The authors show how the new version of LC is employed in the context of planning bus routes in Belgium and Spain, and evaluate it against its previous version in terms of query execution performance. Results show that fragmenting the data and caching the fragments improve the execution of route planning queries.
Overall, the work is interesting from an industrial perspective, and it is clear that it is the result of research conducted over several years, hence definitely fit for a journal venue. I believe that the problem of dealing with historical & live data (let’s call them dynamic data) and rapidly managing them is quite relevant, not only in transport, but also in areas dealing with streaming sensors such as smart cities and the Internet of Things.
With respect to the evaluation criteria, the paper presents some novel ideas compared with the authors' previous work, and the results are somewhat significant (even though the framework is only compared against the authors’ own baseline, and some of the conclusions seem quite trivial). Quality is overall good, with a number of things to be changed (see last part). In general, I have some comments and concerns, which make me say the paper might need some major modifications before acceptance.
# Introduction
This is clear enough, I believe: the problem, challenges and overall approach are well stated. It might be good to give a bit more description of Figure 1 though (lines 29-30), to put the reader in context. Also, I am not sure whether that sentence is in any way related to the previous sentence. If so, this should be better clarified.
# Related work
Not convinced at all about this section. This seems more of a "background&motivation" section, where the authors mostly report on their own previous work and what brought them to implement new features for their framework. The current section can and should stay (as background&motivation), but I recommend that the authors create a new "related work" section, where they actually locate their work in the literature. More specifically:
1) I was surprised to see no mention of approaches from the transport domain for managing route planning, yet this is quite a big area (transportation networks/transport planning/intelligent transportation systems). What have people in these areas done? Which algorithms do they use, and how do they manage data? Why is the Linked Connection framework supposed to be better?
2) While I do see the fundamental differences, another quite relevant area should be RDF stream processing/querying, which I am sure the authors are aware of. Many stream processing systems and frameworks have been looking at the integration of historical and real-time data too, so they should be mentioned and the main differences with LC have to be highlighted.
3) Anything from the IoT and Wireless Sensor Networks areas for the management of dynamic data?
4) Also check how semantic technologies have been used in smart cities to support the management of heterogeneous volatile data.
These are of course only pointers, and more could be included.
On page 5, lines 1-15: The conclusions are very weird: "we think that it is the moment to" … based on what? I would use "based on the evidence presented […], we propose …". And following on from that: what constitutes reliable transport data?
# Section 3 (the framework)
The description of the framework is ok in subsections 3.1 & 3.2, but the quality drops noticeably in 3.3. This last subsection should be revised more carefully, as the reader gets lost. The sentence on lines 15-19 (second column) is neither clear nor grammatically correct. The sentence on page 8, col 1, lines 21-31 is also too long. Other minors below.
W.r.t. the use case of section 3.4, has the framework not been used to implement any mobile/web client application relying on it? If yes, it might be useful to present them, for completeness of the work&paper, but also to show the soundness of the framework. Was/Is LC used by NMBS at all?
# Evaluation
As I said, the work was compared only against the previous version of the framework, but the authors devise a large number of experimental settings whose results are convincing enough. I would not call them "scenarios" but rather "experimental settings": in the end you use the same system with parameter variations.
In general the whole section is quite lengthy and could be simplified by putting all the exp. settings in a table, or by merging figures on similar topics (e.g. Fig 11&12, Fig 13&14, 15&16 etc…), as it is also hard to interpret results if they come one after the other. Btw, perhaps the discussion of results could be held alongside the results? This might help readability.
On page 9, col 2, what exactly is the role of the list on lines 33-40? It sort of comes out of the blue; perhaps an introductory sentence could be useful.
On page 10, line 35, could you point to the GTFS datasets and GTFS-RT feeds available as open data?
# Conclusions and future work
This is also okay. I would nevertheless invite the authors to think about and briefly discuss which other applications the LC framework could have outside the transport domain.
# Style & Minors
- Captions in Tables&Figures should have a full stop at the end (there are very few as such). Also, captions for Tables go above the table, captions for Figures go below.
- page 1, line 35 : the manage of these >> the management of these
- on the top (found in a couple of sections) >> on top
- page 6, lines 11-15 should be unindented (and add a full stop after "Figure 4" on line 14)
- page 7, lines 24-28 should be unindented
- page 7, lines 39-43, replace with first column sentence: After defining how to create Linked Connections feeds which take into account live data, our third contribution is how to provide reliable access to historical and live transport data in a cost-efficient way.
- page 7 line 27 : both, the historical and the live connections >> both historical and live connections
- page 9, line 15, a fragment Fk-1 whose >> a fragment Fk-1, whose
- page 9, line 42 : the algorithm process >> the algorithm processes
- page 9, line 50: figure 1 >> Figure 1 (also make sure you use Fig or Figure consistently throughout the paper)
- page 10, line 32 : table 2 >> Table 2 (see comment above)
- page 18, line 43 : by relaying >> by relying, and "a set of common semantics" >> common standards and vocabularies