Kishore Nallan
|
57ac561743
|
Handle special characters in locale tokenization.
|
2022-08-18 10:47:30 +05:30 |
|
Kishore Nallan
|
6729b72b1a
|
Normalize Thai text via NFKC.
|
2022-08-04 20:32:54 +05:30 |
|
Kishore Nallan
|
07d838e385
|
Make symbols for indexing and segmentation configurable.
|
2021-08-26 10:27:18 +05:30 |
|
Kishore Nallan
|
e695ba65c8
|
Add a few locale tokenization tests.
|
2021-06-09 20:20:01 +05:30 |
|
Kishore Nallan
|
8726e27718
|
Support Chinese locale.
|
2021-06-06 22:03:02 +05:30 |
|
Kishore Nallan
|
56d3a26cc5
|
Improve prefix searching on ko locale.
|
2021-05-31 19:47:12 +05:30 |
|
Kishore Nallan
|
8d6742fc6d
|
Normalize ASCII tokens intermixed with non-English text.
|
2021-05-28 14:04:10 +05:30 |
|
Kishore Nallan
|
b3b47f5651
|
Refactor highlighting + tokenizer to simplify logic.
|
2021-04-18 20:37:58 +05:30 |
|
Kishore Nallan
|
1d1712f391
|
Refactor tokenizer to use index, skip and separate logic.
|
2021-04-16 17:55:52 +05:30 |
|
kishorenc
|
3a92685967
|
Integrate with Kakasi.
|
2021-04-05 12:25:50 +05:30 |
|
kishorenc
|
dd72e2a78c
|
Introduce field level locale.
|
2021-04-02 21:28:49 +05:30 |
|
kishorenc
|
c2eec85277
|
Fix highlighting of strings with special characters.
|
2021-03-20 12:58:30 +05:30 |
|
kishorenc
|
f501b137b7
|
Tokenize on special characters.
|
2021-03-16 11:39:53 +05:30 |
|
kishorenc
|
a912a250ff
|
Fix bad unicode characters in highlight snippet.
|
2020-12-28 19:19:59 +05:30 |
|
kishorenc
|
9533b73609
|
Fixed a few highlighting/splitting edge cases.
|
2020-11-17 20:10:34 +05:30 |
|
kishorenc
|
6997e35f72
|
Combine various token operations in a single flow.
Splitting, normalizing etc. are now done in a single loop.
|
2020-11-17 20:10:34 +05:30 |
|