The Humanitarian eXchange Language: Coordinating Disaster Response with Semantic Web Technologies

Tracking #: 448-1622

Authors: 
Carsten Keßler
Chad Hendrix

Responsible editor: 
Guest Editors Semantic Web For All

Submission type: 
Full Paper
Abstract: 
The Humanitarian eXchange Language (HXL) is a project by the United Nations Office for the Coordination of Humanitarian Affairs that aims at refining data management and exchange for disaster response. Data exchange in this field, which often has to deal with chaotic environments heavily affected by an emergency such as a natural disaster or an armed conflict, still happens mostly manually. The goal of HXL is to automate many of these processes, saving valuable time for staff in the field and improving the information flow for decision makers who have to allocate resources for response activities. This paper presents a case study on this initiative, which is set to significantly improve information exchange in the humanitarian domain. We introduce the HXL vocabulary, which provides a formal definition of the terminology used in this domain, and an initial set of tools and services that produce and consume HXL data. The HXL system infrastructure is introduced, along with its data management principles. The paper concludes with an outlook on the future of HXL and its role in the humanitarian ecosystem.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
By Stefan Boyera submitted on 06/Apr/2013
Suggestion:
Accept
Review Comment:

The paper is very interesting and covers all the critical dimensions and potential uses of the HXL language.
A couple of remarks:
- The paper mentions the use of the HXL translator in the field. It might be worth ensuring that the major platforms used in the field integrate HXL output by default and are validated, rather than running such a process in the field during an emergency.
- There is no mention of any standardization group or other group supporting the development of this specification. I believe it is essential that such work be developed by different actors, including NGOs in the field, platform providers, etc., to ensure that all viewpoints are taken into account.

Review #2
By Louiqa Raschid submitted on 19/Apr/2013
Suggestion:
Major Revision
Review Comment:

This is a useful case study of the definition of HXL, motivation and implementation. The organization of the paper is poor and key ideas are not expressed clearly. Thus, the contribution of the current submission is marginal.

Related Work
The comparison with EDXL is shallow. What are the relevant differences between a one-off EDXL request and HXL? Is it because humans interpret EDXL messages and software needs to handle HXL? Or is there something about long-running operations and the back and forth of exchanges rather than the one-off exchange in EDXL?

spending not spendings

Why are ontologies not discussed?

Section 3.1 and 3.2 appear to be both about requirements?
It is unclear why these are separate sections.
Then 3.2 abruptly introduces some architecture details despite the fact that the title is 'resq spec'? Poor organization.

3.1 Realtime response important but not relevant here.
3.1 Need to support updates important but need more details.
3.1 What is the 'intl hum system'?
3.1 Many levels requirement unclear.
3.1 Clusters within the humanitarian organizations seems to be irrelevant.

3.2 'compilation of common operational picture' -- seems to be very important yet it has not been defined or even mentioned until this section?
3.2 - Second paragraph talks about the solution but the reader is still trying to find the problem? What is the problem?

Section 4

'common operational datasets' undefined?

Humanitarian profile seems strange? How about demographic or population or a similar term?
Metadata is too generic and I think that you really mean Provenance?
Response - How about Organizations and Services?
Situation - Do you mean Events?

Section 5 could be shortened since this really does not contribute much to the Semantic Web domain.

Combine Sections 6 and 7 but first motivate the goals and provide short descriptions. This is not a user manual for your software.

Review #3
By Mike Powell submitted on 05/May/2013
Suggestion:
Minor Revision
Review Comment:

I think this is a well written article about an important initiative and should certainly be published. I have a few queries/concerns. I am not sure if they relate to the article or are more about the system it describes. In either case a little more discussion/explanation would be helpful. My first comment also includes a genuine uncertainty about what you actually mean.

1. Who are the 'users' who convert spreadsheet data into HXL? It reads as if they perhaps work for the organisation which has collected the data. If so, I wonder how OCHA intends to secure their co-operation in undertaking this additional work. I also worry that requiring information suppliers to carry out these steps, rather than staff in UN IMOs, will be beyond the capacity of smaller, local organisations. This would then have the potential to exclude from the information system, precisely those closest to the ground and closest to the affected, a result which is undesirable on a host of levels.

2. Is there any space for qualitative information in the system? Information about people's experiences of traveling to safety - both in terms of security incidents and of seasonal issues such as mud/ swollen rivers etc - can be essential both to properly understanding the character of the emergency and in planning relief work.

3. For all your excellent introduction and requirements, this still reads as if 'perfect data' is the dream outcome. Page 5 talks of the need 'to improve the efficiency of this cluster-level reconciliation process'. This worries me a lot. Data collected in the acute phase of an emergency is always messy, incomplete and, at least in part, contradictory. This is partly because the people collecting it are almost inevitably very tired and quite stressed and are thus prone to mistakes. Things are disorganised - so it may be your mechanics not your nurses who come across a group of displaced people and try to assess their state. People and organisations at all levels are competing for resources and may be highly 'selective' in what they choose to share. For this reason I would argue that the process is not so much one of reconciling data as of assessing and interpreting it. Quality is needed here as much as efficiency. The end result is a 'best current estimate'. This will sometimes include the reporting of more than one narrative, citing which data seems to fit which narrative. In the process, one interpretation and choice of one set of data may become the 'official' one and correctly be given more prominence, but problematic data and their interpretation should still be accessible. They may turn out to be right, or the fact that someone is deliberately trying to sow disinformation may be as important as the real figures.

3a. I am surprised that there is no mention of language here. Does the system work only in English or is it available in other languages?
3b. The description of the development of the vocabulary is clear with regard to producing something appropriate for the jargon of international disaster response and for the particular sensitivities of the UN (think Eritrea/Ethiopia border). That is clearly important. Less is said about making the vocabulary sensitive to other users, especially in the location of the emergency. It isn't just placenames but actual vocabulary which can be noticeably different (see for example http://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=3&ved=0CE...). I would have thought that learning about the local vocabulary of any emergency and linking it to HXL by means of 'same as' or 'similar to' would be essential. I am interested to know if this or any other manifestation of heterogeneous ontology practice has been used in this process or, if not, why not? This comment links back to the question of how well the system acts as a resource for multiple users, especially at local or national level.

4. I appreciate the specific requirements of and needs for data on disaster response but note that you yourselves mention the potential overlap with longer term development work. I wondered therefore whether you knew of/ were collaborating with other semantic/linked data initiatives in the sector, especially the AGRIS/AGROVOC work at FAO, or work done with R4D output in the UK (http://r4d.dfid.gov.uk/Output/191248/Default.aspx)?

About the reviewer
Mike Powell worked as an emergency field manager for a large international NGO in a number of large scale humanitarian emergencies in the 1980s. For the last 20 years he has specialised in information management in development and (occasionally) in disaster relief work.

Review #4
By Christophe Guéret submitted on 22/May/2013
Suggestion:
Minor Revision
Review Comment:

This very clear and easy-to-read paper presents an approach to integrating data related to disaster response.
The system uses a vocabulary and a tool to import data. This is overall a very good paper that presents something of importance. There is, however, room for improvement:

* It is said that mobile coverage and satellite connection make it possible to use Internet-based systems on site. This is true but nothing is said about the bandwidth usage of the presented approach. Phones and satellites do not offer the same bandwidth as, say, ADSL lines. In order to evaluate the usability of the system it would be interesting to know how much data transfer the import tool and the dashboard require.

* The usage of HXLator sounds simple and intuitive, but it would be best if that kind of additional work could be avoided in the first place. Would it be possible to design a spreadsheet-like tool that would be used to fill in the data directly? Also, it would be good to say a few words on how HXLator compares to LODRefine and EasyOpenData.

* It is said that a lot of data is tabular because that's a format convenient for users. The assumption could also be made that most of the data that is provided is actually of a statistical nature, and thus fits nicely in a tabular structure using columns/rows as dimensions. Then comes the question of using vocabularies such as DataCube to represent this data. Besides being tailored for statistical data, using DataCube would also improve the re-usability of the data, as DataCube is inspired by the popular SDMX data model.

* The system described is heavily centralised, from the data collection approach, to the URI scheme, to the control mechanism for data quality. Would it be possible to deploy it in a decentralised context where every data contributor describes entities under its own namespace and does its own internal data quality checks?

Review #5
By Martin Murillo submitted on 27/May/2013
Suggestion:
Minor Revision
Review Comment:

The paper provides an excellent, comprehensive, objective and fair overview of the technical and application realms of the Semantic Web as applied to a concrete, high-impact, urgent humanitarian need, thus showing the potential of the Semantic Web for real-life issues that need to be addressed urgently. The paper is technically fair, recognizing gaps, limitations, and the lack of further feedback for the fine-tuning of the vocabulary.

Thus the paper fits very well within the requirements of the journal. The technology that the paper describes indirectly benefits the principal beneficiaries (i.e. the “all”), thus fitting very well within the goals of the special issue as well. Additionally, the authors provide an excellent approach in offering a solution through the “requirements specification” approach, thus giving the paper an appropriate framework for this section. However, there are a few minor changes that must be made for the work to be even better and suitable for publication in the journal. The suggested changes are made in light of the multidisciplinary nature of the paper and the amount of space that the authors dedicated to non-technical issues. Thus I recommend the following changes:

1. While it is well understood how the introduction of a vocabulary and tools will help administratively, the authors need to explain a little and provide an explicit connection to the ultimate beneficiary, the aid recipient. This could be done through an explicit example in a paragraph; i.e. what does a better “long term humanitarian operations” mean for the aid recipient? While this might seem obvious to the authors and some readers, it is important to make it explicit in order for the article to be even more relevant to the special issue and its focus on the “all”. The authors have given an example on short-term issues (i.e. an ambulance that needs to quickly find a nearby hospital); I urge the authors to provide an example of “long term humanitarian operations” in the context of the “all”. Alternatively, the authors could give an explicit example of how this technology might have helped in something that “failed” in the absence of such technology.

2. While I’m sure the authors know otherwise, the paper seems to assume that the rise of a new technology is in itself a sufficient condition for its successful application to a real-life issue; however, as the authors have mentioned, various initiatives were not successful. There are social, cultural, human, and organizational factors that will determine the degree to which the technological tool gets accepted and utilized in the field and at higher levels. The authors have briefly touched on this area; however, given the interdisciplinary focus of the paper, it would be appropriate to briefly explain (a) the reasons why other technologies (explicitly stated by the authors) were not successful OR (b) what makes HXL better than other approaches (i.e. why the UN along with its partner organizations would have better prospects for a successful implementation/adoption). This would provide the Semantic Web community with valuable insights on important issues that need to be considered in the adoption of the Semantic Web in the daily operation of humanitarian and other institutions.

3. The paper touches on the lack of connectivity, a key necessary condition for the proposed technology; however, it does not address limited bandwidth, nor how Semantic Web data (i.e. text files) would be an advantage over the otherwise necessary transfer of heavy formats. Nor does the paper explain how technologies such as Ajax would burden a limited-bandwidth connection (i.e. frequent server requests). The authors mention offline work but fail to elaborate further on this key issue.

4. The paper states: “This paper presents a case study on this initiative, which is set to significantly improve information exchange in the humanitarian domain. We introduce the HXL vocabulary, which provides a formal definition of the terminology used in this domain, and an initial se”. In this context, is the paper itself a case study? Otherwise identify with appropriate titles or indications where the “case study” is.

5. The paper states: “The goal of HXL is to automate many of these processes” maybe this expression needs to be corrected to, for example, “The goal of HXL is to help/contribute to the automatization of many of these processes” or “to be the basis for the automatization of …”

6. The paper states: “It is extremely unlikely, though, that all involved organizations can be convinced to use a common information system, due to the differences both in size and topical focus.” The authors need to make clear OCHA’s relationship with the 5500 organizations; does OCHA fund them? If so, would adoption be more likely?

7. The authors make use of abbreviations that are never expanded, so the reader does not know what they mean (e.g. OGC). There are also instances of duplicated words and misspellings.