Reducing Fragmentation in Incremental Author Name Disambiguation
Author name ambiguity is a hard problem that occurs when several authors publish articles with the same name or when a same author publishes their articles under different names. Traditionally, automatic disambiguation methods process the author names of all citation records in a repository. Aiming efficiency, incremental methods disambiguate author names only when new citation records are inserted into the repository. As a side effect, several citation records of a same author may be associated with different authors, aka, the fragmentation problem. To diminish this problem, we propose a new merge-oriented incremental method capable of reducing such side effect, without the need to apply a traditional disambiguation method on the whole repository. Our experimental evaluation shows that our method produces significant improvements when compared to an incremental baseline and is very competitive with batch-mode methods.