typesense

mirror of https://github.com/typesense/typesense.git synced 2025-05-24 15:50:42 +08:00

Author	SHA1	Message	Date
Kishore Nallan	aab5912110	Fuzzy search tests.	2016-11-07 19:36:28 +05:30
Kishore Nallan	ef105dcbd9	Reduce memory footprint.	2016-10-04 21:31:55 +05:30
Kishore Nallan	c96a9d9b35	Adopt Damerau–Levenshtein distance, instead of plain Levenshtein. Specifically, we use the optimal string alignment distance. It treats transposition as a cost of 1, rather than 2. https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance#Optimal_string_alignment_distance	2016-09-10 16:19:04 +05:30
Kishore Nallan	1a53d5692e	Fuzzy match rewrite - still need to work on matching perf.	2016-09-04 12:22:16 +05:30
Kishore Nallan	334ce264a5	For now, disable prefix matches to be considered as whole matches.	2016-09-02 10:34:34 +05:30
Kishore Nallan	1c09ec38a8	Removed redundant token_count field from the leaf.	2016-08-24 09:27:32 +05:30
Kishore Nallan	30cd057201	Split-up fuzzy lookup into separate stages. 1. Collect all the nodes where cost exceeds threshold. 2. Sort these nodes based on a score. 3. Perform top-k iteration to locate high scoring leaves. This ensures that small scoring leaves don't end up trumping leaves with higher score (as it was noticed).	2016-06-10 23:20:44 +05:30
Kishore Nallan	4face51091	Calculation of hits for a token had a bug. Should use search rather than prefix lookup for finding the hits so far for the exact token.	2016-06-09 17:31:32 +05:30
Kishore Nallan	32cd67c9d1	The ART will store the frequency count in addition to the score. In certain cases, the ability to identify the most similar tokens based on the popularity of the token is useful.	2016-06-08 22:18:52 +05:30
Kishore Nallan	bb0e7aefb9	Rename `score` to `max_score` for internal node and leaf structs.	2016-06-08 11:26:52 +05:30
Kishore Nallan	47df6201b1	Append offset related fields to the art leaf during insertion.	2016-02-28 21:13:54 +05:30
Kishore Nallan	0ba5c4874f	Parameterized the number of fuzzy matches that are returned for words with typo.	2016-02-21 19:51:57 +05:30
Kishore Nallan	b88241d9e9	Bug fix: word suggestions were not showing up sorted on their document scores. Somehow, std::max() on uint16_t does not seem to work. Using a MAX macro.	2016-02-21 19:21:20 +05:30
Kishore Nallan	8ff75e481d	Replace callbacks with a result vector. Document IDs for the given search token will be populated into this result vector.	2016-02-20 23:14:17 +05:30
Kishore Nallan	cb3b0e1a6e	Using a proper document struct when representing leaf values. Removed experimental submodules. Only using `for` now (compressed array).	2016-01-31 11:20:07 +05:30
Kishore Nallan	ee77fb4d22	Add 2 more external dependencies via git submodule.	2016-01-24 14:35:40 +05:30
Kishore Nallan	a662e43959	Top-K matches for a given substring seems to work.	2015-12-31 07:22:35 +05:30
Kishore Nallan	2dfc31a519	Sorting on popularity metric - WIP. Still has bugs.	2015-12-29 20:55:50 +05:30
Kishore Nallan	5246a1683d	Adding a max_score field to intermediate nodes that denotes the maximum score of leaf nodes. This is useful for pruning search space when we want to identify top-K matches for a given prefix.	2015-12-14 08:23:28 +05:30
Kishore Nallan	50a125f7ea	Fixed a major bug with NODE256 iteration for prefix "twili".	2015-11-29 16:36:36 +05:30
Kishore Nallan	e4a2be3ac3	Rewriting fuzzy look-up using incremental levenshtein matrix. WIP.	2015-11-28 22:41:26 +05:30
Kishore Nallan	64f53b6420	Initial commit. Fuzzy prefix match works.	2015-11-10 19:44:44 +05:30

22 Commits
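
Commit c96a9d9b35 above describes a move from plain Levenshtein to the optimal string alignment (OSA) variant of Damerau–Levenshtein distance, in which swapping two adjacent characters costs 1 instead of 2. The following is a minimal sketch of that distance as a full-matrix dynamic program. It is not taken from the repository: the function name `osa_distance` is hypothetical, and the actual code in that commit may be structured differently (for example, computed incrementally during tree traversal).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the typesense source): optimal string
// alignment distance between a and b. Insertions, deletions, substitutions
// and transpositions of two adjacent characters each cost 1.
std::size_t osa_distance(const std::string& a, const std::string& b) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();

    // d[i][j] holds the distance between the first i chars of a
    // and the first j chars of b.
    std::vector<std::vector<std::size_t>> d(n + 1, std::vector<std::size_t>(m + 1, 0));

    for (std::size_t i = 0; i <= n; ++i) d[i][0] = i;  // delete all i chars
    for (std::size_t j = 0; j <= m; ++j) d[0][j] = j;  // insert all j chars

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;

            d[i][j] = std::min({d[i - 1][j] + 1,          // deletion
                                d[i][j - 1] + 1,          // insertion
                                d[i - 1][j - 1] + cost}); // match or substitution

            // Transposition of two adjacent characters, counted as one edit.
            if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                d[i][j] = std::min(d[i][j], d[i - 2][j - 2] + 1);
            }
        }
    }
    return d[n][m];
}
```

With this sketch, `osa_distance("ab", "ba")` is 1, where plain Levenshtein would give 2. OSA never edits the same substring twice, so `osa_distance("ca", "abc")` is 3, whereas unrestricted Damerau–Levenshtein distance gives 2.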