Clover Quiz: a trivia game powered by DBpedia

Tracking #: 1706-2918

Authors: 
Guillermo Vega Gorgojo

Responsible editor: 
Jens Lehmann

Submission type: 
Application Report
Abstract: 
DBpedia is a large-scale and multilingual knowledge base generated by extracting structured data from Wikipedia. There have been several attempts to use DBpedia to generate questions for trivia games, but these initiatives have not succeeded in producing large, varied, and entertaining question sets. Moreover, latency is too high for an interactive game if questions are created by submitting live queries to the public DBpedia endpoint. These limitations are addressed in Clover Quiz, a turn-based multiplayer trivia game for Android devices with more than 200K multiple choice questions (in English and Spanish) about different domains generated out of DBpedia. Questions are created off-line through a data extraction pipeline and a versatile template-based mechanism. A back-end server manages the question set and the associated images, while a mobile app has been developed and released on Google Play. The game is available free of charge and has been downloaded by more than 5K users since its release in March 2017. Players have answered more than 614K questions and the overall rating of the game is 4.3 out of 5.0. Therefore, Clover Quiz demonstrates the advantages of semantic technologies for collecting data and automating the generation of multiple choice questions in a scalable way.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Magnus Knuth submitted on 11/Oct/2017
Suggestion:
Reject
Review Comment:

Review SWJ "Clover Quiz: a trivia game powered by DBpedia"

The paper presents a trivia quiz game for mobile devices that uses questions generated from facts in the DBpedia knowledge graph.
There are a number of publications on generating quiz games from Linked Data in general and DBpedia in particular. Compared to existing work on that topic, the author claims to provide a superior question generation approach. The game supports multiple choice questions. The questions are generated beforehand in an off-line process.

The question generation process has four steps: Domain specification, Data gathering, Category expansion, and Category annotation.
The Domain specification step demands a manual definition of SPARQL queries, which are tightly bound to the DBpedia schema (or shapes). Listings 1 and 2 use raw data triples (properties from the dbp: namespace; these are created from the infobox keys if no mapping exists, since the 2016-10 release if and only if no mapping exists, and hence are language dependent), which according to the DBpedia best practices should be avoided for high-quality data. In the Data gathering step, data is queried from the DBpedia endpoint and stored in a JSON file for each entity. The Category expansion step is described as being done via a recursive script, although it is already covered by the SPARQL query given in Listing 1; indeed, instead of a recursive script, a single SPARQL query would be sufficient (see the sketch below). The Category annotation step is unclear: it seems to map categories to a proprietary taxonomy, which is nowhere described or motivated, and it appears to be a simple removal of the dbc: namespace.
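For illustration, the following is a minimal Python/SPARQL sketch of what is meant here; it is not the author's code, and dbc:Museums and the public endpoint are placeholder assumptions. A single SPARQL 1.1 property path over skos:broader collects all transitive subcategories without any recursive scripting:

from SPARQLWrapper import SPARQLWrapper, JSON  # pip install sparqlwrapper

ENDPOINT = "http://dbpedia.org/sparql"
QUERY = """
PREFIX dbc:  <http://dbpedia.org/resource/Category:>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?sub WHERE {
  ?sub skos:broader+ dbc:Museums .   # all transitive subcategories of dbc:Museums
}
LIMIT 1000
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
subcategories = [b["sub"]["value"] for b in results["results"]["bindings"]]
print(len(subcategories), "subcategories found")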

The remainder of the paper describes how questions are generated and weighted. Questions are generated from the entity files and question templates, which again are (hand-crafted?) JSON files. Entities are ranked according to a popularity score based on in- and out-links. Since PageRank values are already provided for DBpedia, it is unclear why these are not used.
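As an illustration of this point, the sketch below shows one way in- and out-link counts could be obtained directly from DBpedia via dbo:wikiPageWikiLink; the entity dbr:Berlin is only a placeholder, and the paper's actual scoring formula is not reproduced here:

from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT (COUNT(DISTINCT ?in) AS ?inLinks) (COUNT(DISTINCT ?out) AS ?outLinks) WHERE {
  { ?in dbo:wikiPageWikiLink dbr:Berlin . }    # pages linking to the entity
  UNION
  { dbr:Berlin dbo:wikiPageWikiLink ?out . }   # pages the entity links to
}
"""

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
row = sparql.query().convert()["results"]["bindings"][0]
print("in-links:", row["inLinks"]["value"], "out-links:", row["outLinks"]["value"])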

The paper concludes with a lengthy chapter on the back-end server and mobile app, statistics on game usage, and lessons learned.

In this paper I cannot recognize a valuable contribution for the Semantic Web community. The extent of application of Semantic Web technology is limited: Linked Data technology is used exclusively for data gathering via hand-crafted SPARQL queries. As the data is queried, it is transformed into a semantically weak JSON format. For that reason, the finally generated questions are hardly connected to the original data, so that it would demand considerable effort to re-extract the facts a question was compiled from. From a Semantic Web perspective I would consider that approach an anti-pattern; hence the application does not have exemplary character. The contribution is insufficient for an Application Report in SWJ. The paper does, however, touch on some interesting topics, e.g. fact extraction from the Wikipedia categories, and leaves some questions unanswered, e.g. how can relations (or graph patterns) that are relevant for quizzes be (automatically) identified?

I strongly recommend rejecting the paper for the SWJ.

Review #2
By Agis Papantoniou submitted on 31/Oct/2017
Suggestion:
Minor Revision
Review Comment:

The paper describes Clover Quiz, a trivia game powered by DBpedia.
It is an interesting application that also showcases how Semantic Web technologies can be used in edutainment/gamification. This is the main reason I suggest that the paper be accepted for publication with a minor revision.
Quality of the application: I used the application and it functions as described in the paper. The number of downloads and the respective comments also confirm this.
I would suggest, though, one thing for a possible future version:
Since the problem of DBpedia latency is addressed (graphs are constructed locally), the author should consider implementing an online/real-time version of the game, in addition to the turn-based one, by adding a game room (since Node.js is already used). This way the gaming experience would go to "another level" and become really challenging.

With respect to the second dimension, i.e. clarity and readability, especially in conveying the key ideas of semantic technologies in the application, I am also quite satisfied. The paper explains quite well how the technologies are used, along with the game concept and the underlying processes. I would suggest, though, the following minor improvements:
1. What does "systematically querying DBpedia" during the data gathering stage mean? I would like the author to provide more information.
2. The second paragraph of the question generation section (4) is, in my opinion, too detailed and tires the reader a bit. It could be shortened.
3. An overall architecture diagram, showing not only the back end but also how the processes are run and which components are involved, could further enhance readability.
4. Could the author describe a short example of a problem report?

Review #3
Anonymous submitted on 11/Dec/2017
Suggestion:
Major Revision
Review Comment:

This is an application report on Clover Quiz, a turn-based multiplayer trivia game for Android devices with more than 200K multiple choice questions (in English and Spanish) about different domains generated out of DBpedia.

Positive aspects of the app/report:
The application seems to be popular among users and also has a good number of downloads. It is good to see a real mobile app based on DBpedia. The report largely covers the technical aspects from a development/game perspective, and the author has done a good job of presenting them.

Improvements required:
The report is weak when it comes to reporting existing research and describing how existing research was utilised to guide some of the design decisions in the development of the game.

The language of the text also takes, in many places, more liberty than strict research writing allows, for example "DBpedia constitutes the main hub of the Semantic Web".

I am not too convinced by the latency argument presented by the author ("DBpedia endpoint, hence latency is too high for an interactive trivia game, as reported in [13]."), since there are many benchmarks showing that triplestores can handle significantly larger knowledge graphs and offer sub-second query response times.
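To make this point verifiable, here is a small illustrative Python sketch (an assumption on my part, not part of the submission) that times a simple class query against the public DBpedia endpoint; it gives a rough feel for the round-trip times the latency claim should be weighed against:

import time
from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?museum WHERE { ?museum a dbo:Museum . } LIMIT 100
"""

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)

start = time.perf_counter()
sparql.query().convert()
print(f"round-trip time: {time.perf_counter() - start:.2f} s")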

Data extraction process:
There are a few unanswered questions here, including:
1. Mapping the textual keyword to a concept from DBpedia: what if there are multiple options, or no option matching the keyword is available but a synonym or similar concept exists? (A small sketch of this ambiguity follows the list.)
2. The domain-specific file: the format of the file seems ad hoc; although DBpedia has irregular property patterns, there are at least some common ones, so could a few templates be used across all the domains?
3. Wikipedia has category chains and you have used them in your work; however, how do you decide how far to go down the subcategories?
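As a hypothetical illustration of the ambiguity raised in point 1 (not the mechanism used in the paper), the candidate meanings DBpedia itself records for a keyword such as "Mercury" can be listed via its disambiguation links:

from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?candidate WHERE {
  dbr:Mercury dbo:wikiPageDisambiguates ?candidate .   # planet, element, god, ...
}
"""

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
for b in sparql.query().convert()["results"]["bindings"]:
    print(b["candidate"]["value"])

Any keyword-to-concept mapping has to resolve, or at least acknowledge, this kind of ambiguity.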

The comparison with the previous work is quite broad, with criticism such as: "question generation schemes that are not able to produce varied, large, and entertaining questions." For example, there are many works on the topic of Linked Data based question answering, including a series of workshops (QALD). Although the submission is an application report, it needs to be positioned within these works. A few examples include:
https://qald.sebastianwalter.org/
https://www.sciencedirect.com/science/article/pii/S157082681300022X
https://dl.acm.org/citation.cfm?id=2557529

The existing research also needs to be reflected in the various decisions made in designing the game, for example the data gathering phase, question generation, etc., to show whether they were influenced by one approach over another. At the moment the paper does not refer to previous works and uses quite a broad brush in its criticism of existing work, although it seems that parts of this work follow design/research principles of other games/applications.

I would have expected at least some new lessons learned from a significant development exercise. Many of the lessons learned are not new, for example the messiness of DBpedia. The author can add further value by focusing on the ease of using DBpedia, performance issues while using it in a production environment, and other development issues specific to using semantics/Linked Data/DBpedia.

I also feel that there is/was an excellent opportunity to analyse qualitative feedback from your 5K users/downloads.


Comments

(To the editor: Major revision paper)

An ad hoc system that generates MCQs from the DBpedia dataset is detailed in the manuscript, with a prime focus on DECREASING THE LATENCY. A versatile template-based mechanism is employed for generating VARIED and ENTERTAINING questions.

The presented approach is interesting. The authors have given adequate design and implementation details of the system, qualifying the article as an application report. However, there are significant shortcomings that need to be rectified to warrant publication.
1. Include relevant references.
2. Theoretical aspects of the approach should be formally discussed.
3. Remove a few claims after including proper references.

Introduction: "creating questions from DBpedia can be significantly improved by splitting this process into a data extraction and a versatile question generation stage" is posited as the main HYPOTHESIS! "This approach can...declarative specifying the classes and question templates of the domains of interest." -- This statement is very obvious and makes us feel that an automated method should have been the prime goal of this paper.

I felt that many references are missing in several places; for example, references for the approaches adopted to find the "popularity of concepts" and for the "question difficulty estimator" are not given.

Can you give an account of the set of templates used and why you have chosen only those templates to generate the varied and entertaining questions -- you may refer to [1] for an example set of templates and their significance.

Associating images to questions was a good move!

Page 6, paragraph above Listing 6: the example used (a museum and a country connected using the property country) is confusing.

Do you have statistics about the number of properties that are declared as "functional" (or inverse, or other property characteristics) in DBpedia -- to justify your argument?
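Such a statistic could, for instance, be obtained with a query like the following (an illustrative sketch, not taken from the paper), which counts the properties declared as owl:FunctionalProperty on the public DBpedia endpoint:

from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT (COUNT(DISTINCT ?p) AS ?functionalProps) WHERE {
  ?p a owl:FunctionalProperty .
}
"""

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
print(sparql.query().convert()["results"]["bindings"][0]["functionalProps"]["value"])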

I am convinced that the Android application based on the manuscript is well developed; however, I feel that the theoretical aspect of the problem mentioned in the paper is limited -- hindering its publication in the SWJ. The Android app is ad hoc in nature and limited to a few predetermined concepts -- an automated method would have been more appreciated. An improvement that I can think of is: given a random subject, potential mappings can be found with DBpedia concepts using tools such as AIDA. Similarly, rather than using random templates, related properties can also be identified and ranked for generating question templates (as in [7]). However, since the focus of this paper is on decreasing the latency, the mentioned change is less relevant.

If I understood your approach correctly, the set of concepts obtained after broadening (up to 4 levels) is used to frame question templates. Do you consider the generated stems to have the same difficulty level?

A considerable amount of related work is missing. I am listing a few works here.

[1] D. Liu and C. Lin. Sherlock: a semi-automatic quiz generation system using linked data. In M. Horridge, M. Rospocher, and J. van Ossenbruggen, editors, Proceedings of the ISWC 2014 Posters & Demonstrations Track, a track within the 13th International Semantic Web Conference, ISWC 2014, Riva del Garda, Italy, October 21, 2014, volume 1272 of CEUR Workshop Proceedings, pages 9-12. CEUR-WS.org, 2014. http://ceur-ws.org/Vol-1272/paper_7.pdf.

[2] Dominic Seyler, Mohamed Yahya, and Klaus Berberich. Generating quiz questions from knowledge graphs. In Proceedings of the 24th International Conference on World Wide Web, WWW '15 Companion, pages 113-114, New York, NY, USA, 2015. ACM.

[3] Dominic Seyler, Mohamed Yahya, and Klaus Berberich. Knowledge questions from knowledge graphs. CoRR, abs/1610.09935, 2016.

[4] Dominic Seyler, Mohamed Yahya, Klaus Berberich, and Omar Alonso. Automated question generation for quality control in human computation tasks. In Proceedings of the 8th ACM Conference on Web Science, WebSci 2016, Hannover, Germany, May 22-25, 2016, pages 360-362, 2016.

[5] E.V. Vinu and P. Sreenivasa Kumar. A novel approach to generate MCQs from domain ontology: Considering DL semantics and open-world assumption. Web Semantics: Science, Services and Agents on the World Wide Web, 34:40-54, 2015. http://dx.doi.org/10.1016/j.websem.2015.05.005.

[6] Ellampallil Venugopal Vinu and Puligundla Sreenivasa Kumar. Improving large-scale assessment tests by ontology based approach. In Proceedings of the Twenty-Eighth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2015, Hollywood, Florida, May 18-20, 2015, page 457, 2015.

[7] E.V. Vinu and P. Sreenivasa Kumar. Automated generation of assessment tests from domain ontologies. Semantic Web Journal, Vol. 6, pages 1023-1047, 2016.

[8] E.V. Vinu, Tahani Alsubait, and P. Sreenivasa Kumar. Modeling of Item-Difficulty for Ontology-based MCQs. arXiv:1607.00869, Technical Report, 2016.

[9] L. Bühmann, R. Usbeck, and A.-C. Ngonga Ngomo. ASSESS - automatic self-assessment using linked data. In M. Arenas, Ó. Corcho, E. Simperl, M. Strohmaier, M. d'Aquin, K. Srinivas, P. T. Groth, M. Dumontier, J. Heflin, K. Thirunarayan, and S. Staab, editors, The Semantic Web - ISWC 2015 - 14th International Semantic Web Conference, Bethlehem, PA, USA, October 11-15, 2015, Proceedings, Part II, volume 9367 of Lecture Notes in Computer Science, pages 76-89. Springer, 2015. doi:10.1007/978-3-319-25010-6_5.

Section 7: "Overall, the template-based...very well." // There are many factors other than popularity and similarity of distractors that decide the difficulty level of a question. Refer to [7, 8].

Section 8: "When designing multiple..., distractors...However, ...use random distractors." // Not all existing methods use random distractors.

The authors did not discuss the natural-language conversion of the generated stems at all.