1339 Commits

Author SHA1 Message Date
Kishore Nallan
5591f564c8 Sort the results vector based on score finally.
Required when multiple leaves of a given node are candidate token suggestions.
2016-06-08 10:23:02 +05:30
Kishore Nallan
04d02919b2 Fix memory corruption during unsorted append. 2016-06-04 19:05:47 +05:30
Kishore Nallan
c029e620d9 Clean up the match scoring logic.
Added more comments to illustrate what's happening.
2016-05-31 19:03:40 +05:30
Kishore Nallan
80d9f57b7b Code clean-up. 2016-05-30 20:13:55 +05:30
Kishore Nallan
0f756efe74 Fix sorting - should be in ascending order. 2016-05-30 20:13:44 +05:30
Kishore Nallan
beba88c1da Positional offsets are unsorted, so should be using unsorted append. 2016-05-30 20:12:04 +05:30
Kishore Nallan
3dde71e72e Unsorted append to forarray. 2016-05-30 20:11:26 +05:30
Kishore Nallan
383212be46 Fix bugs in top-K implementation. 2016-05-15 09:01:05 +05:30
Kishore Nallan
884a83f53c Use lower bound search to implement indexOf() 2016-05-15 09:00:42 +05:30
Kishore Nallan
10ff747802 Minor refactoring. Adding more comments. 2016-04-26 20:49:24 +05:30
Kishore Nallan
c667ed5d10 Fix static linking with libfor. 2016-04-25 21:51:02 +05:30
Kishore Nallan
f0f57f2e2d Saving state. 2016-03-23 07:38:43 +05:30
Kishore Nallan
566c4ce666 Intersection of documents across the search tokens. 2016-02-29 19:47:05 +05:30
Kishore Nallan
47df6201b1 Append offset related fields to the art leaf during insertion. 2016-02-28 21:13:54 +05:30
Kishore Nallan
1a7350c0ec Cartesian product of word suggestions for each query token to form search phrases. 2016-02-28 09:24:23 +05:30
Kishore Nallan
71a9c2709b Bug fix: Wrong order of arguments when recursing. 2016-02-28 09:01:58 +05:30
Kishore Nallan
0ba5c4874f Parameterized the number of fuzzy matches that are returned for words with typos. 2016-02-21 19:51:57 +05:30
Kishore Nallan
b88241d9e9 Bug fix: word suggestions were not showing up sorted on their document scores.
Somehow, std::max() on uint16_t does not seem to work. Using a MAX macro.
2016-02-21 19:21:20 +05:30
Kishore Nallan
8ff75e481d Replace callbacks with a result vector.
Document IDs for the given search token will be populated into this result vector.
2016-02-20 23:14:17 +05:30
Kishore Nallan
1ffe38b5c8 Grow the forarray properly depending on the data stored. 2016-02-20 23:12:55 +05:30
Kishore Nallan
cb3b0e1a6e Using a proper document struct when representing leaf values.
Removed experimental submodules. Only using `for` now (compressed array).
2016-01-31 11:20:07 +05:30
Kishore Nallan
ee77fb4d22 Add 2 more external dependencies via git submodule. 2016-01-24 14:35:40 +05:30
Kishore Nallan
22a63be16b Add external deps via git modules. 2016-01-23 18:23:00 +05:30
Kishore Nallan
c095c166f0 Adding external dependencies. 2016-01-17 19:11:05 +05:30
Kishore Nallan
a662e43959 Top-K matches for a given substring seems to work. 2015-12-31 07:22:35 +05:30
Kishore Nallan
2dfc31a519 Sorting on popularity metric - WIP. Still has bugs. 2015-12-29 20:55:50 +05:30
Kishore Nallan
6e87b65598 Migrating ART to CPP. 2015-12-14 15:42:09 +05:30
Kishore Nallan
5246a1683d Adding a max_score field to intermediate nodes that denotes the maximum score of leaf nodes.
This is useful for pruning search space when we want to identify top-K matches for a given prefix.
2015-12-14 08:23:28 +05:30
Kishore Nallan
8f91f11cb1 Iterate index only till end of key len, without considering depth of the term length. 2015-11-30 18:04:50 +05:30
Kishore Nallan
eb1e68620a Fixed a LEAF node issue for "amzfing" with threshold of 2.
Was producing too many spurious single char matches.
2015-11-29 22:13:21 +05:30
Kishore Nallan
ba39be766c Prevent early return during recursion inside loop.
Fixed "amazin" with 0 threshold.
2015-11-29 21:18:48 +05:30
Kishore Nallan
f836443ad9 Fixed a bug with NODE48 traversal for exact search of "amazing". 2015-11-29 19:35:12 +05:30
Kishore Nallan
50a125f7ea Fixed a major bug with NODE256 iteration for prefix "twili". 2015-11-29 16:36:36 +05:30
Kishore Nallan
0d1eca8229 Move duplicated code to a macro. 2015-11-29 09:08:39 +05:30
Kishore Nallan
619a3972d8 Fix another edge case involving early end of term. 2015-11-29 08:22:05 +05:30
Kishore Nallan
e4a2be3ac3 Rewriting fuzzy look-up using incremental levenshtein matrix. WIP. 2015-11-28 22:41:26 +05:30
Kishore Nallan
b7dbec8535 More bug fixes for fuzzy match. 2015-11-26 08:01:08 +05:30
Kishore Nallan
025d3b6bce Fix bugs in fuzzy match. 2015-11-26 07:21:01 +05:30
Kishore Nallan
64f53b6420 Initial commit. Fuzzy prefix match works. 2015-11-10 19:44:44 +05:30
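
Note on commit 5246a1683d: the message says a max_score on intermediate nodes helps prune the search space for top-K matches. The following is a minimal sketch of that idea, not the repository's code. The `Node` struct, `make_leaf`, `make_inner`, `demo_tree`, and `top_k` are hypothetical names invented for illustration, and a plain tree stands in for the ART node types. It assumes each inner node caches the highest leaf score beneath it, so a best-first traversal can stop after K leaves without visiting low-scoring subtrees.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified node; the real ART nodes (NODE48, NODE256, ...) differ.
struct Node {
    std::string key;         // set only on leaves
    uint16_t score = 0;      // leaf score
    uint16_t max_score = 0;  // highest leaf score anywhere in this subtree
    std::vector<std::unique_ptr<Node>> children;
    bool is_leaf() const { return children.empty(); }
};

std::unique_ptr<Node> make_leaf(std::string key, uint16_t score) {
    auto n = std::make_unique<Node>();
    n->key = std::move(key);
    n->score = score;
    n->max_score = score;  // a leaf's bound is its own score
    return n;
}

std::unique_ptr<Node> make_inner(std::vector<std::unique_ptr<Node>> children) {
    assert(!children.empty());
    auto n = std::make_unique<Node>();
    for (const auto& c : children) {
        n->max_score = std::max(n->max_score, c->max_score);
    }
    n->children = std::move(children);
    return n;
}

// Best-first traversal: always expand the node with the highest max_score.
// A leaf popped from the queue cannot be beaten by anything still queued,
// so results come out in descending score order and we stop after k leaves.
std::vector<const Node*> top_k(const Node* root, std::size_t k) {
    auto lower_bound_first = [](const Node* a, const Node* b) {
        return a->max_score < b->max_score;
    };
    std::priority_queue<const Node*, std::vector<const Node*>,
                        decltype(lower_bound_first)>
        frontier(lower_bound_first);
    std::vector<const Node*> results;
    if (root != nullptr) frontier.push(root);
    while (!frontier.empty() && results.size() < k) {
        const Node* node = frontier.top();
        frontier.pop();
        if (node->is_leaf()) {
            results.push_back(node);
            continue;
        }
        for (const auto& child : node->children) frontier.push(child.get());
    }
    return results;
}

// Small example tree with made-up keys and scores.
std::unique_ptr<Node> demo_tree() {
    std::vector<std::unique_ptr<Node>> a;
    a.push_back(make_leaf("amazing", 30));
    a.push_back(make_leaf("amazon", 90));
    std::vector<std::unique_ptr<Node>> b;
    b.push_back(make_leaf("twilight", 50));
    std::vector<std::unique_ptr<Node>> top;
    top.push_back(make_inner(std::move(a)));
    top.push_back(make_inner(std::move(b)));
    return make_inner(std::move(top));
}
```

Calling `top_k(demo_tree().get(), 2)` on this made-up tree returns the leaves for "amazon" and "twilight" in that order, and the "amazing" leaf is never expanded into the results. In a prefix search the same traversal would start from the node reached by the prefix rather than from the root.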