Linked data for Potential Algal Biomass Production

Paper Title: 
Linked data for Potential Algal Biomass Production
Authors: 
Monika Solanki, Johannes Skarka, and Craig Chapman
Abstract: 
In this paper we present an account of the publication of a suite of datasets, LEAPS, that collectively enable the evaluation of potential algal biomass production sites in North-Western Europe. LEAPS forms the basis of a prototype Web application that enables stakeholders in the algal biomass domain to interactively explore, via various facets, potential algal production sites and sources of their consumables across NUTS regions in North-Western Europe.
Submission type: 
Dataset Description
Responsible editor: 
Krzysztof Janowicz
Decision/Status: 
Accept
Reviews: 

Review 1 by Anne Thessen

The author has addressed all of my comments. I recommend that the work be published. I would be open to the author contacting me offline as mentioned in the response to reviewers.

Review 2 by Boris Villazon-Terrazas

I am happy to see that my comments have been addressed properly.
Moreover, the demo is very good.

I am in favor of accepting this publication as it is now, but I have a couple of minor concerns:
- Page 4: the footnotes of the table use a different font; please fix those accordingly.
- Figure 6: I'm not able to see the URIs without zooming in a bit.
- Resource URIs like http://data.biomass.org/algae/sites/id/546 are not available.
- The footnotes on the last page appear before the References section. They should be at the bottom.
- References should be normalized.
- Finally, I'm not able to see query results from your SPARQL endpoint:
http://www.semanticwebservices.org/enalgae/sparql

Review 3 by Femke Reitsma

Accept as is

The text below is for a previous version of this manuscript

Revised manuscript after an "accept with major revisions". Previous reviews are below.

Review 1 by Femke Reitsma

The paper on linked data for potential algal biomass production is very well written and clearly describes the linked data set and the architecture for browsing that dataset.

My only problem is that the dataset isn't yet available, and the demo video didn't work. Given that the instructions for authors of linked dataset papers clearly mention the need for a URL with version number, availability, etc., I would suggest waiting to publish this paper until the browser and data are actually available.

One minor edit is needed: there are two instances of "ArchGIS", one in the text and the other in Figure 2.

Review 2 by Boris Villazon-Terrazas

The paper describes the transformation, publication and exploitation of Algal Plant Sites data that enable the evaluation of the potential of algal biomass production sites in North Western Europe.

The description of the data set and the method to derive it from table data are clear, and the paper is well written.

Regarding the characteristics of the data set requested to be described in the call, the authors cover:
- Topic coverage, source for the data, purpose and method of creation and maintenance
- Use of established vocabularies (e.g., RDF, OWL, NeoGeo).

However, due to project consortium restrictions, the authors cannot expose the complete LEAPS dataset. The authors provide a simple set of snippets. Moreover, they do not provide an entry on thedatahub.org for their dataset. In this sense, it's important to include the specific license of the dataset.

Also, they do not mention reported usage (other than their own prototype application) or metrics and statistics on external and internal connectivity. Thus, it is difficult to judge the overall usefulness (or potential usefulness) of the dataset. Also, even though the authors point to some areas for future work, they do not explicitly address known shortcomings of the dataset, other than the links. In the final version of this paper, these topics should be addressed more explicitly.

Comments:
- missing space .. Northon Western Europe(NWE) -> Northon Western Europe (NWE)
- extra space ... characteristics of the dataset .Section 6 -> characteristics of the dataset. Section 6
- normalize the font size/face of the URL footnotes
- Figure 1 is hard to see, the authors have to increase the size or improve the quality of the figure
- Section 3 ... authors convert the original sources into a common ArcGIS XML and then from this XML format to RDF. Why didn't they transform the original sources to RDF directly? The authors should improve the justification of this decision.
- Section 4 ... authors claim they made some extensions to NeoGeo vocab, it would be good to briefly describe those extensions, and probably to provide a link to the ontology documentation so we can see how they integrate the reused vocabs.
- I'm not able to see Fig. 6, and this is an important figure. The authors could provide a link to a web page that describes, with examples, the URI patterns they used.
- The figures should be improved in general; some of them are very hard to read. For example, you can get rid of Fig. 3 and provide a link to a site that contains a high-level overview of the reused ontologies with their documentation.
- Finally, authors should check the references, they are not well normalized.

Review 3 by Anne Thessen

I found this to be an interesting read. The author describes a data set designed for a specific purpose, to locate potential algal biomass production sites in Europe. I have just a few minor corrections and one fairly major comment.
1. In the abstract NUTS is mentioned, but the reader has no idea what this is at this point. NUTS is not explained until the second page. I would just leave it out of the abstract and say "across North-Western Europe".
2. Introduction line 10. Insert space between Europe and (.
3. Introduction line 17. Insert space between . and Section. Delete space between dataset and .
4. Motivation line 6. Insert space between EABA and (.
5. Egads! These figures are quite small. My poor eyeballs can't take it. Especially Fig 6.
6. Check spelling of "algal" in second to last sentence in paper.
7. There is an extra . at the end of the References
8. The References seem to be out of order. First is 6, then 1, then 8?

My big comment centers around nutrient and algae data. It sounds to me that the author wants a system where an algal biomass producer could query good growing locations and what species would do well in those locations. In this case, the nutrient data and the algae strain data are crucial for function. (Btw, it would probably be good to mention earlier than the end why there are no nutrient data sets in Fig 4. Maybe just state that the absence of nutrient data will be discussed in section 7. As a reader I was a bit puzzled by this until I got to the end.) Can the author not access nutrient data from repositories like PANGAEA? Nutrients are going to be very important for algal biomass growth, so this system really needs nutrient data to be functional at the level the author desires.

I'm glad to hear the author is working with biologists to get the necessary data. I'm a bit concerned that AlgaeBase is being used as a source of data for "which species does well where". My experience with AlgaeBase is that it does not have the data needed, i.e. species occurrence data, or at least does not have it for the majority of species, or has the data and does not share it. Might it be better to check with European Algae Culture repositories? Perhaps GBIF (www.gbif.org) would be better suited? It seems that this project needs a biologist who can chase up a list of the algae found in Europe and then the acceptable environmental conditions for those algae (which actually may not be known for all species). (This is fairly similar to the types of things I've been thinking about recently.) This of course would change over time because of climate change.

Just to be clear, I don't think these issues should keep the paper from being published. I just see this as a major stumbling block to the desired functionality. What's been done so far is a great start and should be published. Maybe there should be a future work section? A product that delights users is going to take some human curation, I think, but would be a really cool data set and well worth the effort.
