579 Commits

Author SHA1 Message Date
Kishore Nallan
39910c872a Capture search related metrics separately. 2021-09-21 11:40:41 +05:30
Kishore Nallan
d75e834ac3 Address warnings. 2021-09-19 21:39:15 +05:30
Kishore Nallan
5b0690fcd8 Allow filtering and deleting using doc IDs. 2021-09-19 18:38:43 +05:30
Kishore Nallan
25d29919ae Multiplex frequency + score based token selection. 2021-09-19 16:31:28 +05:30
Kishore Nallan
27b392cee9 Exhaustive search should not always be enabled during token drop search. 2021-09-18 15:59:35 +05:30
Kishore Nallan
703110264a Dropped tokens should not be prioritized as exact matches. 2021-09-13 16:23:56 +05:30
Kishore Nallan
902704887c Return total_values as part of facet stats, even for strings. 2021-09-10 21:18:21 +05:30
Kishore Nallan
1afa193161 Fix faceting count edge case. 2021-09-10 16:32:22 +05:30
Kishore Nallan
c6fe1369b9 Enable filtering via overrides. 2021-09-08 18:43:45 +05:30
Kishore Nallan
c0fce41c3b Ensure that an import batch cannot contain duplicate doc IDs. 2021-09-07 17:02:58 +05:30
Kishore Nallan
ba67efb7da Support zero weighting for multi-field match scoring. 2021-09-05 14:54:21 +05:30
Kishore Nallan
910256d82c Fix valgrind warnings. 2021-09-05 08:07:00 +05:30
Kishore Nallan
6fc18a0971 Fix test consistency. 2021-09-04 08:53:10 +05:30
Kishore Nallan
48ac6bb82a Weight all components of cross-field match score. 2021-09-03 11:38:46 +05:30
Kishore Nallan
d6e8156973 Fix test again. 2021-09-02 21:11:30 +05:30
Kishore Nallan
75263d52a3 Fix test. 2021-09-02 20:38:31 +05:30
Kishore Nallan
19badcd0cb Move to precise token candidate selection.
No longer approximate.
2021-09-02 17:44:32 +05:30
Kishore Nallan
055f2c7695 Fix perf in scoring results. 2021-09-01 15:27:05 +05:30
Kishore Nallan
7b4450bbf9 Bake concurrency into a single index. 2021-08-31 13:11:50 +05:30
Kishore Nallan
adc816e662 Use token separators while parsing search query as well. 2021-08-28 20:59:05 +05:30
Kishore Nallan
9659d60047 Exhaustive search should ignore typo and drop token thresholds. 2021-08-28 19:33:40 +05:30
Kishore Nallan
b6f1885aec Stricter bounding of typo correction threshold. 2021-08-28 16:38:07 +05:30
Kishore Nallan
ce7b6e12e9 Prioritize record with a field containing all tokens in the query. 2021-08-27 20:52:51 +05:30
Kishore Nallan
07d838e385 Make symbols for indexing and segmentation configurable. 2021-08-26 10:27:18 +05:30
Kishore Nallan
a931bb4b2a Handle highlighting on a field with empty array value. 2021-08-25 17:05:06 +05:30
Kishore Nallan
d4bd6e67e5 Further tweak exact match logic. 2021-08-22 15:47:21 +05:30
Kishore Nallan
2df55e7991 Fix exact value matching. 2021-08-22 13:45:26 +05:30
Kishore Nallan
793e21a1c2 Ensure that search does not fetch existing tokens. 2021-08-18 18:51:39 +05:30
Kishore Nallan
0e2adb4242 Copy-free intersect + score. 2021-08-17 18:37:42 +05:30
Kishore Nallan
26351a6984 Change default value of typo/drop tokens threshold to 1. 2021-08-11 14:20:28 +05:30
Kishore Nallan
55535198a4 Prefix search to be used only for last token. 2021-08-07 13:12:06 +05:30
Kishore Nallan
b5e3a28ace More fixes for highlighting. 2021-08-05 21:31:04 +05:30
Kishore Nallan
261536d0f4 Merge branch '0.22.0-rc' into postings-refactor-integration
# Conflicts:
#	src/collection.cpp
#	src/index.cpp
#	test/collection_specific_test.cpp
2021-07-31 21:35:30 +05:30
Kishore Nallan
b2c12a9b2c Fix more edge cases in highlighting. 2021-07-31 08:59:49 +05:30
Kishore Nallan
331db4f27e Add precision option to geo field sorting. 2021-07-27 19:57:56 +05:30
Kishore Nallan
13cb7b9364 Revert "Highlight field value that is a prefix of the query."
This reverts commit 545027a59bc55b24c2fece112b4fa6a655a1f79e.

# Conflicts:
#	test/collection_specific_test.cpp
2021-07-27 17:57:49 +05:30
Kishore Nallan
b4c222064c Handle bad data in ingestion text gracefully. 2021-07-26 19:44:38 +05:30
Kishore Nallan
e45f18785f Ignore id field present in schema. 2021-07-26 19:44:10 +05:30
Kishore Nallan
38d44a7c8a Highlight field value that is a prefix of the query. 2021-07-26 15:33:03 +05:30
Kishore Nallan
41c16fb7a7 Merge branch '0.22.0-rc' into postings-refactor-integration
# Conflicts:
#	include/index.h
#	include/posting.h
#	include/posting_list.h
#	src/art.cpp
#	src/collection.cpp
#	src/index.cpp
#	src/posting.cpp
#	src/posting_list.cpp
#	test/art_test.cpp
#	test/collection_specific_test.cpp
#	test/collection_test.cpp
#	test/posting_list_test.cpp
2021-07-24 17:10:54 +05:30
Kishore Nallan
89a509513a Ensure that weights can fully control cross-field matching. 2021-07-24 15:08:08 +05:30
Kishore Nallan
e42f78a695 Fix single character full field value highlight. 2021-07-21 19:26:09 +05:30
Kishore Nallan
672c895805 Typo and drop tokens thresholds must be applied independently. 2021-07-16 13:39:52 +05:30
Kishore Nallan
0ae718d067 Use all candidates of a given num_typo value.
Typo tokens threshold should not trigger when we have explored only some of the candidates of a given num_typo value.
2021-07-16 12:07:44 +05:30
Kishore Nallan
bfb122bfec Repeating tokens in an array: fix relevancy. 2021-07-15 15:58:55 +05:30
Kishore Nallan
56247ce6ac Prefix match must be differentiated from single typo. 2021-07-14 11:44:01 +05:30
Kishore Nallan
614f7a9f61 Don't validate the id field. 2021-07-14 11:44:01 +05:30
Kishore Nallan
184ed43727 Improve multi field typo ranking. 2021-07-14 11:44:01 +05:30
Kishore Nallan
9dd3ba9d6d Make typo correction less eager. 2021-07-14 11:44:01 +05:30
Kishore Nallan
091080985f When sorting on geo field, return distance from reference point in response. 2021-07-14 11:44:01 +05:30