22 Commits

Author SHA1 Message Date
Ozan Armağan
ed87e63961
Fix stemming for non-English locales (#1745)
* Fix stemming for non-English locales

* Fix synonyms

* Use `query_tokens_t.q_unstemmed_tokens` for passing unstemmed tokens

* Refactoring

* Move for loop in parse_search_query to process_tokens completely

* Use q_phrase_dummy and q_exclude_tokens_dummy for unstemmed process_tokens

* Add stopwords_set to process_tokens

* Remove unused variables
2024-05-23 01:20:20 +05:30
Kishore Nallan
c69ff12d3d Remove general punctuation for th locale. 2024-03-14 11:59:56 +05:30
Kishore Nallan
9d9ffd3bf9 Add option to expand search query for suggestion aggregation. 2023-12-27 15:56:02 +05:30
Kishore Nallan
f1cd6038ea Trim query suggestions before aggregation. 2023-08-21 16:46:06 +05:30
Kishore Nallan
a0e7e8826e Make override rules query case insensitive. 2023-04-13 21:49:38 +05:30
Kishore Nallan
cfbcbfc6fb Handle special characters in prefix highlighting. 2023-03-06 13:31:38 +05:30
Kishore Nallan
57ac561743 Handle special characters in locale tokenization. 2022-08-18 10:47:30 +05:30
Kishore Nallan
6729b72b1a Normalize Thai text via NFKC. 2022-08-04 20:32:54 +05:30
Kishore Nallan
4f961f4919 Highlight only the prefix. 2022-01-02 18:08:05 +05:30
Kishore Nallan
c3a85be42f Fix highlight surround num tokens in Cyrillic. 2021-12-28 16:40:38 +05:30
Kishore Nallan
a9c15ec0f7 Fix facet highlighting for Cyrillic text. 2021-12-09 10:54:47 +05:30
Kishore Nallan
bd6fd1c03e Improve Cyrillic support. 2021-12-02 16:17:01 +05:30
Kishore Nallan
07d838e385 Make symbols for indexing and segmentation configurable. 2021-08-26 10:27:18 +05:30
Kishore Nallan
8726e27718 Support Chinese locale. 2021-06-06 22:03:02 +05:30
Kishore Nallan
56d3a26cc5 Improve prefix searching on ko locale. 2021-05-31 19:47:12 +05:30
Kishore Nallan
b3b47f5651 Refactor highlighting + tokenizer to simplify logic. 2021-04-18 20:37:58 +05:30
Kishore Nallan
1d1712f391 Refactor tokenizer to use index, skip and separate logic. 2021-04-16 17:55:52 +05:30
kishorenc
3a92685967 Integrate with Kakasi. 2021-04-05 12:25:50 +05:30
kishorenc
dd72e2a78c Introduce field level locale. 2021-04-02 21:28:49 +05:30
kishorenc
c2eec85277 Fix highlighting of strings with special characters. 2021-03-20 12:58:30 +05:30
kishorenc
9533b73609 Fixed a few highlighting/splitting edge cases. 2020-11-17 20:10:34 +05:30
kishorenc
6997e35f72 Combine various token operations in a single flow.
Splitting, normalizing etc. are now done in a single loop.
2020-11-17 20:10:34 +05:30