Knowledge Engineering with Semantic Technologies to Identify and Warn of Transport Disruptions

Tracking #: 2640-3854

Authors: 
David Corsar
Milan Markovic
Peter Edwards
Paul Gault
Caitlin Cottrill
John D. Nelson
Somayajulu Sripada

Responsible editor: 
Guest Editors Transportation Data 2020

Submission type: 
Full Paper
Abstract: 
Public transport operators and transport authorities are increasingly using social media channels to disseminate relevant information to travellers. Many operators utilise platforms such as Twitter to provide both customer service and real-time passenger information - including details about service disruptions. However, for such reports to be useful to travellers, they must first find the information, relate it to their travel plans to determine if their journey will be adversely impacted, and, if so, decide how to adapt their plans. This paper describes the TravelBot intelligent system, developed to perform the first two of these tasks and warn public transport users of potential disruptions to their journey. Developing TravelBot combined knowledge engineering processes with iterative user-led design activities culminating in a real-world user evaluation. Semantic Web technologies are used to represent and integrate transport knowledge obtained from open data, social media posts, and users, and to support reasoning processes that infer structured representations of events described in social media posts. Inferred events are assessed to determine if they are are likely to disrupt the planned travel of TravelBot users, and, if so, users are sent personalised warnings. Evaluations of the system based on data collected during a user trial found that social media posts are processed in an acceptable length of time, and users generally considered the information provided by the system to be useful. Areas in which the event inference capability could be improved are also identified and offer future research opportunities.
Tags: 
Reviewed

Decision/Status: 
Reject

Solicited Reviews:
Review #1
Anonymous submitted on 19/Jan/2021
Suggestion:
Reject
Review Comment:

The paper introduces an interesting tool called TravelBot. From an innovation point of view, this tool offers new and appealing capabilities to travellers, who will receive information about their previously registered trips in "almost" real time. The relevance of this topic is well introduced and justified in the introduction, where its necessity is clearly stated.

In general, the quality of the paper is very good, and the topic addressed has a clear impact. The text is clear and well written; furthermore, it explains in a simple yet very complete way the whole process followed to implement TravelBot.

Additionally, a set of experiments to assess the efficiency of the tool is presented: one to show the performance of TravelBot when inferring events at different granularities; another to show the overall performance of TravelBot when processing social media; and finally, one to assess the user experience.

From a technical point of view, it seems that the authors took the best from several fields and followed good practices (like using NeON). However, it remains unclear how one critical step was performed: translating heterogeneous data coming from the different sources into RDF.

The following issues should be addressed:
- It is unclear if this tool is free and available since no reference to the code was provided in the paper. Thus, the paper should clarify what licence TravelBot has, how it can be found, and how it can be downloaded or used. 

- From a technical point of view, all the data fetched or provided by users ends up in a Knowledge Graph modelled according to the ontology developed for TravelBot. The paper mentions several sources: user-provided trips, tweets, open data, data from OpenStreetMap, etc. It is unclear how the data coming from those sources is translated into RDF (is TravelBot using custom code for the translation, or does it rely on a mapping-based approach?). This could be clarified in a sub-section of section 5.

- The Abstract has the following typo "assessed to determine if they are are likely to disrupt"

Despite its high quality, the paper has been submitted as a 'Full paper', yet it clearly presents a tool that combines existing techniques to deliver a set of user functionalities. What is missing for it to be considered a 'Full paper' are research hypotheses and experiments that prove them. Note that the current experimentation addresses the quality of the tool's functionalities (efficiency of the tool and user experience) rather than proving a set of research hypotheses. In the reviewer's view, this would be a very good resource paper if submitted as 'Reports on tools and systems' (once the issues pointed out above are resolved), but it is not a 'Full paper' research paper.

Review #2
Anonymous submitted on 22/Mar/2021
Suggestion:
Major Revision
Review Comment:

The paper describes the TravelBot system. The main goal of the system is to warn public transport users of potential disruption to their journey.
It exploits knowledge engineering processes to extract useful information from social media platforms (Twitter) to achieve such a goal. It uses semantic web technologies to check if a user has to be notified of a service disruption. The system has been tested in Aberdeen with 13 participants.

The system has been shown to be useful for the participants, even with a low recall of the disruption events.
However, as the authors reported in the conclusion section, further experiments are required to test how the system would perform in a more significant scenario, both in terms of response time and user satisfaction.

The paper is well written and highlights how knowledge engineering and semantic technologies can be exploited in the public transport domain.
However, an in-depth comparison (features, response time, user satisfaction) with commercial solutions [1,2,3,4,5] would better motivate the need for such a system.

[1] https://moovitapp.com/en-uk/ios/personalized-service-alerts/
[2] https://citymapper.com/news/1454/get-notified-when-your-line-is-disrupted
[3] https://www.avantiwestcoast.co.uk/travel-information/set-up-disruption-a...
[4] https://en.gvb.nl/klantenservice/omleiding-op-je-route-je-weet-het-met-s...
[5] https://transportnsw.info/opal-travel-app-how-to-enable-notifications

Review #3
By Julian Rojas submitted on 25/Apr/2021
Suggestion:
Major Revision
Review Comment:

This paper describes a system framework to address the use case of automated user notifications about relevant public transit operation disruptions. The proposed system relies on Semantic Web technologies to provide an integrated domain knowledge model, entity identification and inference capabilities.

In general, this work provides a very good example of a use case where Semantic Web technologies are used in combination with NLP technologies. The addressed use case is very interesting and highly relevant for the transportation domain. Also, the experimental design seems to be well founded and supported by real user-based evaluations. However, I see some limitations that prevent me from recommending it for publication in its current state:

- From what I can see, the described system framework was designed and implemented mostly during previous work, namely [11], [12] and [27]. It therefore remains unclear which contributions are introduced in this paper as opposed to the authors' previous work. For example, "the knowledge model for transportation information" is mentioned as a contribution of this paper, but the same model was already introduced in [11]. If there are new technical contributions, they should be made explicit in the paper. What was designed/created before? What is new in this paper?
In the current state of the paper, the main contribution lies in the experimental setup and the performed evaluation, which should be highlighted as such throughout the paper.

- In line with the previous comment, I miss the definition of clear research question(s) and hypotheses for this work. The evaluation hints at what the authors aim to study, but I would rather see the questions and hypotheses explicitly described in the paper. They should also guide the conclusions, since the research questions should be answered/validated based on the results of the evaluation.

- Because the system architecture was implemented at least 6 years ago (as seen from GitHub activity), certain architectural design choices could be considered outdated or even deprecated nowadays. Examples are (i) the transit ontology, which was superseded by the linkedGTFS ontology (http://vocab.gtfs.org/) over 6 years ago; (ii) the use of SPIN for defining inference rules, which now has SHACL as a well-known standard successor (https://spinrdf.org/spin-shacl.html); (iii) the reliance on D2RQ for Linked Data generation over relational databases, which was replaced by the standard R2RML; and (iv) the Ontotext KIM system, which is a core module of this system and does not seem to exist anymore. I would suggest that the authors at least discuss why their implementation, despite being outdated, may still remain relevant, and provide perspectives on how current technologies and standards would impact their system design and which technologies may be used to replace those that no longer exist.

- This work spans multiple technical domains that are not sufficiently covered in the related work, which focuses only on event detection from social media. I would suggest at least elaborating on related work about the use of semantics in the transport domain, for example recent semantic models based on TransModel (https://oeg-upm.github.io/snap-docs/). Transport-related works are briefly mentioned in [11], but this paper needs a more up-to-date review. Related work on semantic inference is also missing, as is a comparison of your approach with it. Covering these would help to present a more complete and up-to-date technological landscape.

- The time performance experiment presented in section 8.2 is used to show that the system is able to perform its whole process in a reasonable time, and it breaks down the time taken by each individual sub-process. However, I think your work would be more complete if the scalability limitations of this system were assessed and discussed. If it is supposed to be deployed in a city with possibly thousands of users and complex transport systems, how many Twitter accounts can it monitor, for how many users, and for how many transportation routes?

Minor remarks:

- Typo in the abstract: "to determine if they are are" -> "to determine if they are".
- Link in footnote 6 is broken.
- Typo in section 3: "providing appropriate privacy" -> "provided appropriate privacy".
- The TravelBot ontology is not linked in the paper?
- Link in footnote 31 is broken.