96 Commits

Author SHA1 Message Date
Kishore Nallan
56247ce6ac Prefix match must be differentiated from single typo. 2021-07-14 11:44:01 +05:30
Kishore Nallan
994f5021e6 Ensure that geopoint is validated before indexing. 2021-07-14 11:44:01 +05:30
Kishore Nallan
56bbf8df26 Separate geo index for every field + proper deletion. 2021-07-14 11:44:01 +05:30
Kishore Nallan
e4936a9f1a Simplify wildcard query result generation. 2021-07-14 11:44:01 +05:30
Kishore Nallan
5cbf810fe5 Fix upsert behavior: should accept only whole documents. 2021-07-14 11:44:01 +05:30
Kishore Nallan
2391dad879 Field level prefix configuration. 2021-07-14 11:44:01 +05:30
Kishore Nallan
48c423b85a Basics of a block based posting list container. 2021-07-14 11:44:01 +05:30
Kishore Nallan
78ea80153f Allow num_typos to be configured at a per-field level. 2021-07-14 11:44:01 +05:30
Kishore Nallan
529bb55c5c Make exact match behavior configurable. 2021-07-14 11:44:00 +05:30
Kishore Nallan
e0dc73af3e Swap out underlying geo library. 2021-07-14 11:44:00 +05:30
Kishore Nallan
f9a037a4d5 Reduce no-op operations during updates to fix perf. 2021-07-14 11:44:00 +05:30
Kishore Nallan
25f6fe0614 Prioritize records whose fields match exactly with query. 2021-07-14 11:44:00 +05:30
Kishore Nallan
1d1712f391 Refactor tokenizer to use index, skip and separate logic. 2021-04-16 17:55:52 +05:30
kishorenc
dd72e2a78c Introduce field level locale. 2021-04-02 21:28:49 +05:30
kishorenc
c2eec85277 Fix highlighting of strings with special characters. 2021-03-20 12:58:30 +05:30
kishorenc
42732c454d Handle bad bool field coercion. 2021-03-08 14:41:07 +05:30
kishorenc
46856701d6 Use sparse map for facet values. 2021-03-06 11:48:22 +05:30
kishorenc
33f705cace Geo polygon filtering. 2021-03-03 07:23:25 +05:30
kishorenc
4e3307a891 Use string* to specify string/string array coercion. 2021-02-25 15:07:37 +05:30
kishorenc
3a4d21992c Fix edge cases in schema detection. 2021-02-24 21:38:55 +05:30
kishorenc
0a9cf4aee0 Add more tests for testing schema detection. 2021-02-23 20:04:37 +05:30
kishorenc
f1b70384cc Allow fields to be stringified automatically. 2021-02-23 12:58:14 +05:30
kishorenc
c24fc02d4d Persist per-doc coerce setting + allow dropping of bad values. 2021-02-23 09:35:36 +05:30
kishorenc
d2a825799b Make default sorting field optional. 2021-02-21 19:55:31 +05:30
kishorenc
11c41804e5 Handle bad data gracefully. 2021-02-20 12:49:41 +05:30
kishorenc
e9df6e58e2 Allow indexing of fields without pre-defined schema.
# Conflicts:
#	include/collection.h
#	include/index.h
#	src/collection.cpp
#	src/collection_manager.cpp
2021-02-18 19:08:42 +05:30
kishorenc
17fbbd0838 Refactor concurrency model. 2021-02-06 20:17:18 +05:30
kishorenc
b2fba69a73 Address some warnings related to update doc scrubbing. 2020-12-28 19:20:00 +05:30
kishorenc
302cdf137b Fix field-wise num results used for threshold matching. 2020-12-28 19:20:00 +05:30
kishorenc
bc1d88f1eb Consider tokens matching across fields during ranking. 2020-12-28 19:20:00 +05:30
kishorenc
66a44a5afc Expose field weights used for scoring. 2020-12-28 19:20:00 +05:30
kishorenc
435476df5d Rank prefix match below exact match. 2020-12-28 19:19:59 +05:30
kishorenc
8f818f7fcb More exhaustive multi-field ranking. 2020-12-28 19:19:59 +05:30
kishorenc
88918ef958 Synonyms basics. 2020-12-28 19:19:59 +05:30
kishorenc
6883b4db36 Speed up numerical filter + fixed edge case with negative value. 2020-12-28 19:19:59 +05:30
kishorenc
1bed2d7d80 Support excluding tokens from query.
Prefixing a token in the query with "-" will fetch documents that do not contain that token.
2020-12-28 19:19:59 +05:30
kishorenc
7e2b0fcdcb Match query tokens across multiple fields effectively. 2020-12-28 19:19:59 +05:30
kishorenc
eaea93c572 Delete documents matching a filter query. 2020-11-17 20:10:34 +05:30
kishorenc
6997e35f72 Combine various token operations in a single flow.
Splitting, normalizing etc. are now done in a single loop.
2020-11-17 20:10:34 +05:30
kishorenc
6c1455bc2f Return matched tokens in highlight response structure.
Also, allows customization of the highlighting tag used (default being the mark tag).
2020-11-17 20:10:34 +05:30
Kishore Nallan
8d2c881040 Swap loops during facet calculation for performance. 2020-11-02 07:24:52 +05:30
Kishore Nallan
e22297e249 Fixed failing test. 2020-10-24 16:48:25 +05:30
Kishore Nallan
2041de033f Refactor underlying APIs to support insert/update/upsert. 2020-10-24 09:23:33 +05:30
Kishore Nallan
9adbdd1576 Fixed a bug with highlighting in upsert import. 2020-10-17 11:55:42 +05:30
Kishore Nallan
60d4e9bf5a Support upsert during import. 2020-10-10 18:09:17 +05:30
kishorenc
377769294f Add foundational support for document update. 2020-09-27 17:20:22 +05:30
kishorenc
b764b32134 Iterate doc fields for removal to allow partial deletion. 2020-09-26 15:28:18 +05:30
kishorenc
9c2782e93d Use int64_t for default sorting field references. 2020-09-26 15:13:01 +05:30
kishorenc
5795aba81f Speed up token position computation. 2020-09-07 18:41:29 +05:30
kishorenc
aa9e4a226e Filtering on string field should be verbatim by default.
Allow earlier "CONTAINS" behavior via "~" operator.
2020-09-06 16:25:39 +05:30