3695 Commits

Author SHA1 Message Date
Kishore Nallan
f608656393 Move lock inside hnsw_t destructor. 2024-02-29 11:41:59 +05:30
Harpreet Sangar
1f6fbed372
Refactor referenced_in initialization of a collection. (#1585)
* Refactor `referenced_in` initialization of a collection.

* Review changes.

* Review changes.

---------

Co-authored-by: Kishore Nallan <kishorenc@gmail.com>
2024-02-28 20:48:31 +05:30
Krunal Gandhi
482858d05d
pagination for collections GET (#1581)
* add limit and offset support for collection

* use req->params, don't restrict limit param when records are less

* add more checks and not sort with pagination

* minor updates
2024-02-28 19:09:33 +05:30
Ozan Armağan
c30fc2791f
Add support for custom OpenAI API URL (#1583)
* Add support for custom OpenAI API URL

* Fix test
2024-02-28 15:08:37 +05:30
Krunal Gandhi
f514c42e2d
Fix pinned_hits_str ids order (#1578)
* fix pinned_hits_with_grouping

* remove repeated group_limit check

* fix pinned_hits ids order

* remove redundant sort call
2024-02-27 11:26:23 +05:30
Krunal Gandhi
7762f2d080
respect num of groups when typo_tokens_threshold is applied (#1574)
* respect num of groups when typo_tokens_threshold is applied

* update changes after merge
2024-02-26 15:56:08 +05:30
Ozan Armağan
6e24c06e35
Add support for vLLM RAG (#1563)
* Add support for vLLM RAG

* Add `max_bytes` property for conversation models

* Fix truncate conversation test

* Refactoring
2024-02-26 11:18:14 +05:30
Kishore Nallan
1b5846fe49 Remove old search_field API and code. 2024-02-24 12:28:51 +05:30
Ozan Armağan
483e10d07e
Fix sort fields bug for hybrid search with self-populated embeddings (#1570)
* Fix sort fields bug for hybrid search with self-populated embeddings

* Fix typo

* Update if condition for vector distance sort field

* Change sort_fields to sort_fields_std
2024-02-22 21:49:00 +05:30
Krunal Gandhi
d1079c633c
fix pinned hits with grouping and filter (#1572)
* fix pinned_hits_with_grouping

* remove repeated group_limit check
2024-02-22 18:47:42 +05:30
Ozan Armağan
2c21e1306b
Fix stemming bug when searching with synonyms (#1569)
* Fix stemming bug when searching with synonyms

* Remove unnecessary parameter in `parse_search_query`
2024-02-22 16:26:03 +05:30
Harpreet Sangar
ec186c032e
Empty filter value fix. (#1571)
* Empty filter value fix.

* Add test case.
2024-02-22 15:21:49 +05:30
Kishore Nallan
0013c50ffd Fix mac arm build. 2024-02-22 10:11:00 +05:30
Ozan Armağan
670b97dbcd
Fix snowball bazel file for MacOS (#1566) 2024-02-21 16:03:54 +05:30
Kishore Nallan
f1e1485d6b Fix nohits query aggregation. 2024-02-20 20:45:49 +05:30
Ozan Armağan
5722a9dddd
Add support for openai/text-embedding-3-* (#1558)
* Add support for `openai/text-embedding-3-*`

* Fix test

* Rename openai_custom_dims to has_custom_dims

* Simplify custom dimension check condition
2024-02-20 18:26:59 +05:30
Krunal Gandhi
285eb2fa5d
fix excluding upper range val in search (#1564)
* fix excluding upper range val in search

* check lower_range while searching for range
2024-02-20 16:51:46 +05:30
Jason Bosco
996224ee91 Don't error out if bazel cache is not found 2024-02-19 22:06:00 +05:30
Krunal Gandhi
fe2a0be564
refactor counter events with tests (#1557)
* add tests for persisting events

* fix test by adding unique event name

* fix persistence with analytics events

* early return raft_server check

* increment popularity count instead of overwrite

* extract method serialize_as_docs

* move func definition from header to source

* counter events refactor and tests

* fix test

* configure log_to_file per counter event
2024-02-19 20:47:32 +05:30
Krunal Gandhi
01bedbb342
fix the wrong boolean val (#1559) 2024-02-19 20:31:06 +05:30
Kishore Nallan
7b7f0c79d7 Optimize id list iteration. 2024-02-18 07:49:37 +05:30
Ozan Armağan
4371cfd48e
Use regex for parsing CF RAG results and log full response when partial (#1554)
* Use regex for parsing CF RAG results and log full response when it is partial

* Refactor & add test
2024-02-16 09:07:14 +05:30
Kishore Nallan
6e2931bd02 Add guard for long hostnames. 2024-02-15 17:19:36 +05:30
Krunal Gandhi
c7c24e6ab9
Analytics manager fixes (#1553)
* add tests for persisting events

* fix test by adding unique event name

* fix persistence with analytics events

* early return raft_server check

* increment popularity count instead of overwrite

* extract method serialize_as_docs

* move func definition from header to source
2024-02-15 16:51:08 +05:30
Kishore Nallan
bf8a2fc6e5 Don't accept empty nodes file. 2024-02-14 21:39:50 +05:30
Kishore Nallan
cc980c2ecd Treat zero facet sample percent as not sampled. 2024-02-14 12:29:48 +05:30
Vegard Stikbakke
420c55ee6d Fix typo in error message for non-sortable field (#1547) 2024-02-13 17:38:30 +05:30
Kishore Nallan
f7c1678cc3 Handle wraparound of token offset for large doc highlight. 2024-02-13 17:38:23 +05:30
Ozan Armağan
76c57ec407
Add ts namespace for voice query models (#1549)
* Add `ts` namespace for voice query models

* Fix voice query test

* Update error message in invalid voice query test
2024-02-13 16:39:31 +05:30
Ozan Armağan
59cc66248d
Add stemming for queries (#1548) 2024-02-13 15:05:56 +05:30
Kishore Nallan
da8cac463c Fix inheritance of sort field property for nested field. 2024-02-12 21:47:33 +05:30
Harpreet Sangar
54492fafdc
num_tree iterator (#1538)
* Add `num_tree_t::iterator_t`.

* Add `num_tree_t::iterator_t` tests.

* Add `bool_iterator` in `filter_result_iterator_t`.

* Fix `filter_result_iterator_t::compute_iterators`.
2024-02-12 21:45:49 +05:30
Ozan Armağan
ded3a5ec08
Fix context length for RAG models (#1544)
* Fix context length for RAG models

* Fix prompt for cloudflare model

* Fix error in multi search

* Add error handling for Cloudflare API response
2024-02-12 21:44:52 +05:30
Kishore Nallan
ff6e0e43c8 Ignore errors during analytics json serialization. 2024-02-12 15:34:06 +05:30
Krunal Gandhi
c6b25a7964
return error response when stopword is not found (#1545)
* return error response when stopword is not found

* use shared_lock
2024-02-12 14:54:38 +05:30
Kishore Nallan
fc80cc3a72 Handle special characters within non-English locale.
Unless present in symbols to index / separators, it will be skipped.
2024-02-08 16:52:25 +05:30
Ozan Armağan
48df1e70e8
Fix voice query model path for the test (#1540) 2024-02-07 19:35:18 +05:30
Kishore Nallan
ea29b3bea1 Handle empty query while logging analytics. 2024-02-07 18:32:09 +05:30
Kishore Nallan
062d9c1cee Remove stray log. 2024-02-07 18:25:27 +05:30
Harpreet Sangar
a94062481f
Fix crash on calling compute_string_components multiple times in a complex filter query. (#1539)
* Add failing test.

* Fix crash on calling  multiple times in  a complex filter query.

* Rename `compute_result` to `compute_string_components`.
2024-02-07 12:48:19 +05:30
Krunal Gandhi
7b616b19fa
fix asan warning in analytics manager tests (#1537)
* reorder tests to fix asan issues

* fix iterator invalidation

* make approach more verbose
2024-02-06 18:17:57 +05:30
Krunal Gandhi
562e152d9b
remove error response when deleting non existing doc (#1532)
* remove error response when deleting non existing doc

* handle other response when doc not found

* add test and refactor approach

* add response msg when doc not found

* add optional flag ignore_not_found

* change response msg
2024-02-06 15:49:05 +05:30
Ozan Armağan
559b2de337
Fix duplicate results from vector index (#1536) 2024-02-06 13:45:59 +05:30
Harpreet Sangar
56a69d9844
Compute filter result into an array. (#1533)
* Compute filter result into an array.

* Add tests.

* Add tests.
2024-02-06 12:55:47 +05:30
Krunal Gandhi
36fcdccddc
add typo_prefix score and num_tokens_dropped in text_match_info (#1529)
* add num_drop_tokens info in text_match_info

* add typo_prefix_score in text_match_info

* add more tests

* add test with drop_token_threshold=1
2024-02-05 20:44:39 +05:30
Kishore Nallan
fb6cf36604 Make cache num entries configurable. 2024-02-05 10:53:38 +05:30
Kishore Nallan
4a2892c886 Add flag for logging search query at the start of req cycle. 2024-02-02 16:12:56 +05:30
Kishore Nallan
0c814cb3d2 Remove unwanted build flags. 2024-02-02 15:34:44 +05:30
Harpreet Sangar
7e9cb789a7
String not equals filter logic refactoring. (#1528)
* Compute result in case of string not equals filter matching too many ids.

* Deleted id should not be considered a match for string not equals filter.

* Add tests.

* Use `id_list_t::iterator_t` to check if `seq_id` exists in index.
2024-02-02 15:32:18 +05:30
Krunal Gandhi
206cecffbd
Event analytics revised (#1522)
* remove query hits aggregation & store

* refactor analytics changes

* avoid string copy

* typo correction

* event analytics revised

* refactoring code

* add test for collection array
2024-02-02 11:50:42 +05:30