Kishore Nallan
|
734640cd2a
|
Fix size calculation for unsorted append.
|
2016-06-09 17:28:50 +05:30 |
|
Kishore Nallan
|
32cd67c9d1
|
The ART will store the frequency count in addition to the score.
In certain cases, the ability to identify the most similar tokens based on the popularity of the token is useful.
|
2016-06-08 22:18:52 +05:30 |
|
Kishore Nallan
|
bb0e7aefb9
|
Rename score to max_score for internal node and leaf structs.
|
2016-06-08 11:26:52 +05:30 |
|
Kishore Nallan
|
5591f564c8
|
Sort the results vector by score at the end.
Required when multiple leaves of a given node are candidate token suggestions.
|
2016-06-08 10:23:02 +05:30 |
|
Kishore Nallan
|
04d02919b2
|
Fix memory corruption during unsorted append.
|
2016-06-04 19:05:47 +05:30 |
|
Kishore Nallan
|
c029e620d9
|
Clean up the match scoring logic.
Added more comments to illustrate what's happening.
|
2016-05-31 19:03:40 +05:30 |
|
Kishore Nallan
|
80d9f57b7b
|
Code clean-up.
|
2016-05-30 20:13:55 +05:30 |
|
Kishore Nallan
|
0f756efe74
|
Fix sorting - should be in ascending order.
|
2016-05-30 20:13:44 +05:30 |
|
Kishore Nallan
|
beba88c1da
|
Positional offsets are unsorted, so should be using unsorted append.
|
2016-05-30 20:12:04 +05:30 |
|
Kishore Nallan
|
3dde71e72e
|
Unsorted append to forarray.
|
2016-05-30 20:11:26 +05:30 |
|
Kishore Nallan
|
383212be46
|
Fix bugs in top-K implementation.
|
2016-05-15 09:01:05 +05:30 |
|
Kishore Nallan
|
884a83f53c
|
Use lower bound search to implement indexOf()
|
2016-05-15 09:00:42 +05:30 |
|
Kishore Nallan
|
10ff747802
|
Minor refactoring. Adding more comments.
|
2016-04-26 20:49:24 +05:30 |
|
Kishore Nallan
|
c667ed5d10
|
Fix static linking with libfor.
|
2016-04-25 21:51:02 +05:30 |
|
Kishore Nallan
|
f0f57f2e2d
|
Saving state.
|
2016-03-23 07:38:43 +05:30 |
|
Kishore Nallan
|
566c4ce666
|
Intersection of documents across the search tokens.
|
2016-02-29 19:47:05 +05:30 |
|
Kishore Nallan
|
47df6201b1
|
Append offset related fields to the art leaf during insertion.
|
2016-02-28 21:13:54 +05:30 |
|
Kishore Nallan
|
1a7350c0ec
|
Cartesian product of word suggestions for each query token to form search phrases.
|
2016-02-28 09:24:23 +05:30 |
|
Kishore Nallan
|
71a9c2709b
|
Bug fix: Wrong order of arguments when recursing.
|
2016-02-28 09:01:58 +05:30 |
|
Kishore Nallan
|
0ba5c4874f
|
Parameterized the number of fuzzy matches that are returned for words with typos.
|
2016-02-21 19:51:57 +05:30 |
|
Kishore Nallan
|
b88241d9e9
|
Bug fix: word suggestions were not showing up sorted on their document scores.
Somehow, std::max() on uint16_t does not seem to work. Using a MAX macro.
|
2016-02-21 19:21:20 +05:30 |
|
Kishore Nallan
|
8ff75e481d
|
Replace callbacks with a result vector.
Document IDs for the given search token will be populated into this result vector.
|
2016-02-20 23:14:17 +05:30 |
|
Kishore Nallan
|
1ffe38b5c8
|
Grow the forarray properly depending on the data stored.
|
2016-02-20 23:12:55 +05:30 |
|
Kishore Nallan
|
cb3b0e1a6e
|
Using a proper document struct when representing leaf values.
Removed experimental submodules. Only using `for` now (compressed array).
|
2016-01-31 11:20:07 +05:30 |
|
Kishore Nallan
|
ee77fb4d22
|
Add 2 more external dependencies via git submodule.
|
2016-01-24 14:35:40 +05:30 |
|
Kishore Nallan
|
22a63be16b
|
Add external deps via git submodules.
|
2016-01-23 18:23:00 +05:30 |
|
Kishore Nallan
|
c095c166f0
|
Adding external dependencies.
|
2016-01-17 19:11:05 +05:30 |
|
Kishore Nallan
|
a662e43959
|
Top-K matches for a given substring seem to work.
|
2015-12-31 07:22:35 +05:30 |
|
Kishore Nallan
|
2dfc31a519
|
Sorting on popularity metric - WIP. Still has bugs.
|
2015-12-29 20:55:50 +05:30 |
|
Kishore Nallan
|
6e87b65598
|
Migrating ART to CPP.
|
2015-12-14 15:42:09 +05:30 |
|
Kishore Nallan
|
5246a1683d
|
Adding a max_score field to intermediate nodes that denotes the maximum score of leaf nodes.
This is useful for pruning search space when we want to identify top-K matches for a given prefix.
|
2015-12-14 08:23:28 +05:30 |
|
Kishore Nallan
|
8f91f11cb1
|
Iterate index only till end of key len, without considering depth of the term length.
|
2015-11-30 18:04:50 +05:30 |
|
Kishore Nallan
|
eb1e68620a
|
Fixed a LEAF node issue for "amzfing" with threshold of 2.
Was producing too many spurious single char matches.
|
2015-11-29 22:13:21 +05:30 |
|
Kishore Nallan
|
ba39be766c
|
Prevent early return during recursion inside loop.
Fixed "amazin" with 0 threshold.
|
2015-11-29 21:18:48 +05:30 |
|
Kishore Nallan
|
f836443ad9
|
Fixed a bug with NODE48 traversal for exact search of "amazing".
|
2015-11-29 19:35:12 +05:30 |
|
Kishore Nallan
|
50a125f7ea
|
Fixed a major bug with NODE256 iteration for prefix "twili".
|
2015-11-29 16:36:36 +05:30 |
|
Kishore Nallan
|
0d1eca8229
|
Move duplicated code to a macro.
|
2015-11-29 09:08:39 +05:30 |
|
Kishore Nallan
|
619a3972d8
|
Fix another edge case involving early end of term.
|
2015-11-29 08:22:05 +05:30 |
|
Kishore Nallan
|
e4a2be3ac3
|
Rewriting fuzzy look-up using incremental levenshtein matrix. WIP.
|
2015-11-28 22:41:26 +05:30 |
|
Kishore Nallan
|
b7dbec8535
|
More bug fixes for fuzzy match.
|
2015-11-26 08:01:08 +05:30 |
|
Kishore Nallan
|
025d3b6bce
|
Fix bugs in fuzzy match.
|
2015-11-26 07:21:01 +05:30 |
|
Kishore Nallan
|
64f53b6420
|
Initial commit. Fuzzy prefix match works.
|
2015-11-10 19:44:44 +05:30 |
|