ML-Schema: An interchangeable format for description of machine learning experiments

Tracking #: 2274-3487

Gustavo Publio
Agnieszka Lawrynowicz
Larisa Soldatova
Pance Panov
Diego Esteves
Joaquin Vanschoren
Tommaso Soru

Responsible editor: 
Guest Editors Semantic E-Science 2018

Submission type: 
Ontology Description
In this paper, we present ML-Schema, proposed by the W3C Machine Learning Schema Community Group. ML-Schema is a top-level ontology that provides a set of classes, properties, and restrictions for representing and interchanging information on machine learning algorithms, datasets, and experiments. ML-Schema, a canonical format, is the result of more than seven years of experience across different research institutions. We discuss the main challenge in the development of ML-Schema, which has been to align existing machine learning ontologies and other relevant representations that were designed for a range of particular purposes, sometimes following incompatible design principles, and that therefore have different structures that do not interoperate easily. The resulting ML-Schema can now be easily extended and specialized, allowing other, more domain-specific ontologies developed in the area of machine learning and data mining to be mapped to it.

Reject (Two Strikes)

Solicited Reviews:
Review #1
By Stefan Dietze submitted on 29/Sep/2019
Major Revision
Review Comment:

I thank the authors for their response and revisions. While some issues regarding clarity and presentation have been addressed neatly, some key concerns regarding the applicability, instantiation and generalisability of the schema are not addressed and even somewhat reinforced by some of the responses.

Regarding generalisability across supervised and unsupervised ML: again, the authors' response is very generic and does not elaborate on how this would be covered (e.g. see my remarks regarding the "Run" class). In supervised learning, an ML implementation may be used to train a "model", which can then be run to produce actual "outputs" (e.g. labels of classified instances). In unsupervised ML, no model is trained; instead, an implementation is run with certain parameters on a dataset to directly produce outputs (e.g. clusters, which in turn may translate to labels). The authors' response to this remark does not address the problem: it only mentions that their model covers the "training" stage (model induction), while at the same time reiterating their claims about generalisability across different types of learning.

The current schema does not actually reflect such differences between different types of ML, not to mention represent increasingly popular and important notions such as "reinforcement learning" or "transfer learning".

Even for traditional ML models, it remains unclear how handcrafted features and the complex engineering behind them can be covered with the provided models.

The authors neither respond sufficiently to these points nor point to actual instances of their schema that address such problems (which would be the best and easiest way to address any concerns). This somewhat reinforces the doubts that another reviewer and I shared regarding the lack of applicability and adoption/use of the schema.

Another key problem is the distinction between the schema and instance level, which is not sufficiently addressed and surfaces in several parts of the schema and paper. The provided responses do not alleviate my concerns but rather reinforce them. For instance, regarding Table 1, the mismatches that surfaced in the original version through this structured table have now been embedded into the text by verbalising the content of Table 1. The same holds for my remark regarding the EvaluationMeasureClass: it remains unclear why the authors state that they plan to instantiate specific measures/values through reification (in this case) but include a dedicated instance for the case of hyperparameter (settings). The modelling problem is exactly the same, i.e. one wants to model metrics (evaluation measures or hyperparameters) and their values.
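
The parallel between the two cases can be sketched as follows. This is only a minimal illustration in plain Python, not the authors' serialization: the `mls:` term names (`HyperParameterSetting`, `ModelEvaluation`, `specifiedBy`, `hasValue`) and all `ex:` identifiers and values are assumptions chosen for illustration, not taken from the paper under review.

```python
# Hypothetical triples: every mls:/ex: name and every value below is an
# illustrative assumption, not quoted from the schema or paper under review.
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: object


def describe_value(node: str, node_type: str, metric: str, value: object) -> List[Triple]:
    """Model 'a metric and its value' with one shared three-triple pattern."""
    return [
        Triple(node, "rdf:type", node_type),
        Triple(node, "mls:specifiedBy", metric),
        Triple(node, "mls:hasValue", value),
    ]


def shape(triples: List[Triple]) -> List[str]:
    """Return only the predicates, i.e. the structural pattern of a description."""
    return [t.predicate for t in triples]


# A hyperparameter setting and an evaluation result, described the same way.
setting = describe_value("ex:setting1", "mls:HyperParameterSetting", "ex:learningRate", 0.01)
evaluation = describe_value("ex:evaluation1", "mls:ModelEvaluation", "ex:accuracy", 0.93)

# Both descriptions share one structure; only the node type and metric differ.
print(shape(setting) == shape(evaluation))
```

Under these assumptions, a single pattern covers both hyperparameters and evaluation measures, which is why treating one through reification and the other through a dedicated instance looks inconsistent.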

Most importantly, the paper still shows a lack of experience and lessons learned from actually using/applying the model, where such issues would have been uncovered and could be discussed and addressed.

The authors also seem to agree with the remark that "hasOutput" may also be confusing, given that here "output" refers to the model itself, but in traditional neural network settings, one would use "output" to refer to the prediction output of a model. However, in their response, they do not address how both should be modelled and distinguished.

In summary, the schema appears to address the claimed contributions only at a very superficial level. Hence, it would be crucial either to improve the schema significantly (and evaluate it through extensive application to real-world ML scenarios) or to narrow the claimed contributions to better reflect what is feasible with the provided schema.

While this work has potential to provide a foundation for better modelling, interpreting and reusing ML models, there is still significant work to be done in order to actually facilitate adoption and use of this schema. I would expect both an improved schema which actually addresses such concerns, and some form of knowledge base of populated real-world instances to evaluate the applicability of the schema. Both should be an iterative approach since actual use and application of the schema will help the authors to surface and comprehend the practical issues raised by the reviewers (and beyond).

Review #2
Anonymous submitted on 04/Oct/2019
Review Comment:

The authors have provided a detailed set of answers to many of the concerns that both the other reviewers and I raised, and I am satisfied with their responses. In particular, I am glad that (in addition to dealing with some of the minor issues, like typos) the authors have significantly re-structured the paper to include sub-sections and to cut down on other sections. I also like the description of the new use case focused on representing ML studies. As such, I believe the article is ready to be accepted.

Review #3
Anonymous submitted on 09/Oct/2019
Review Comment:

This new version is much more readable and better organized. However, I must say that, in my opinion, it does not have the quality required for this journal.
This paper describes an ontology model based essentially on the OpenML data model. Notice that in this version of the paper, Table 1, column OpenML, is the only column that has no N.A.'s. In the previous version of the paper there were 6 N.A.'s.
Additionally, the proposed ontology model does not reuse any of the related ontologies, although the paper describes the equivalences (potential mappings between MLS classes and properties) to related ontologies.
Also, there is no RDF data to be queried, only a few examples of how OpenML data can be converted to this model. If the authors could convert the content of the whole OpenML website (or at least a reasonable part of it) and provide a SPARQL endpoint (a renderer such as Pubby would also be recommended) and some queries to test the ontology and the RDF graph, the proposed MLS ontology would become much more trustworthy. In its current state, it seems to me to be a proposal supported by a few dummy examples and Java code that generates OpenML instances of Dataset, Task, Run and Flow only. SWJ papers deserve better support.
Concerning the proposed ontology, the analysis with OOPS! reports some important pitfalls that a reference ontology should resolve.

Comments per section
Section 2.1. In Figure 1, a legend explaining the meaning of the different types of arrows is missing.
Section 2.1. "It is worth noting that we do not propose yet another ontology [...] existing SOTA ML ontologies". However, there is a namespace, and there are RDF data examples using that namespace. In my opinion this is an ontology.
In the conclusions section it is stated that "we presented ML-Schema, a lightweight [...] ontology for the description of Machine Learning...". Therefore, is it an ontology or not?
Section 2.1. "[...]could exchange SOTA metadata files in a transparent manner, e.g.: from OntoDM and MEX (MLS.Schemadata = MLS.convert([...]))". I would remove the final parenthetical, because its content appears to be uncontextualized pseudocode.
Section 2.2. "[...] and for some time t, p s-depends_on some material entity at t." I do not understand the meaning of s-depends_on.