Review Comment:
The authors have made a good effort to improve the paper, which is now more compact, more focused, and reads better overall. I think it is ready to be accepted, though I still have a few questions that I hope the authors will clarify.
1. Section 2.1: when viewing the KB as a graph, do you use a constrained set of relationships to form edges? Using DBpedia as an example, did you use all object-object relations? This should be clarified for reproducibility.
2. Page 7, left column, the 'training data' paragraph: so effectively your training data consists of pairs (e, e'), where e is the desired entity for the mention and e' is any other candidate entity, i.e. e' \neq e. How many training pairs are used for each mention? Is e paired with every negative candidate e', or is the pairing more selective? Please clarify and explain.
3. Table 2: please highlight the highest figures.
4. With respect to your first claimed contribution, you added a paragraph on page 4 that says '... we simplified the objective function by changing the .... to remove one of the two parameters...'. I still do not understand which parameters you are referring to. Please clarify, cross-referencing specific sections/equations in your previous paper if necessary. I also think you should discuss the empirical benefits (although this may become clear once you clarify which parameter the simplification removes). Compared to your CIKM paper, the results are comparable and there is no noticeable gain in effectiveness from this change. Does the change improve efficiency? Can you discuss it?
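
To make question 2 concrete, the two pairing schemes I have in mind could be sketched as below. This is only an illustration of the question, not a description of the authors' method: the function names, the sample size k, and the fixed seed are my own assumptions and do not come from the paper.

```python
import random
from typing import Hashable, List, Sequence, Tuple

# A training pair (e, e'): e is the desired entity, e' is a negative candidate.
Pair = Tuple[Hashable, Hashable]


def all_negative_pairs(gold: Hashable, candidates: Sequence[Hashable]) -> List[Pair]:
    """Scheme A (hypothetical): pair the desired entity with every other candidate."""
    return [(gold, c) for c in candidates if c != gold]


def sampled_negative_pairs(
    gold: Hashable, candidates: Sequence[Hashable], k: int, seed: int = 0
) -> List[Pair]:
    """Scheme B (hypothetical): pair the desired entity with at most k sampled negatives."""
    negatives = [c for c in candidates if c != gold]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    chosen = rng.sample(negatives, min(k, len(negatives)))
    return [(gold, c) for c in chosen]


if __name__ == "__main__":
    candidates = ["e1", "e2", "e3", "e4"]
    print(all_negative_pairs("e1", candidates))
    print(sampled_negative_pairs("e1", candidates, k=2))
```

Under scheme A the number of pairs per mention grows with the size of the candidate set, while under scheme B it is capped at k. Stating which of these (or which other selection rule) was used would answer the question.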