A Chinese character hash function based on strokes for fingerprinting

7 pages • Published: November 2, 2021

Abstract

Character representation in computer systems is the main purpose of character encodings, such as Unicode. The representation of Chinese characters in computer systems is a long-standing issue. It is still not possible to easily represent, for instance to input, some Chinese characters in computers. In this research, we especially consider the issue of the Chinese characters that are not covered by the conventional encodings. In this paper, in continuation of our previous works on a universal character encoding for such characters, we describe a non-ambiguous hash function for any Chinese character. Unlike conventional approaches, this function is based solely on the character strokes, thus eliminating any sort of ambiguity. Given its sparsity and low collision rate, the proposed hash function can then be applied to fingerprinting, which can in turn be applied, for instance, to information retrieval. Simplicity and unambiguity are key to our proposal. This work is then formally evaluated and compared to previous works so as to show its applicability and contribution, and to measure its limits.

Keyphrases: character, fingerprint, function, hash, symbol

In: Yan Shi, Gongzhu Hu, Quan Yuan and Takaaki Goto (editors). Proceedings of ISCA 34th International Conference on Computer Applications in Industry and Engineering, vol 79, pages 64-70.