Kishore Nallan
4e10fadeb7
Settle for partial matches when the whole query produces no results.
2016-11-26 17:13:16 +05:30
Kishore Nallan
396e10be5d
Refactor collection's search method to be more judicious in using higher costs.
...
Earlier, even if one token produced no result, ALL tokens were searched with a higher cost. This change ensures that we first retry only the token that did not produce results with a larger cost before doing the same for other tokens.
2016-11-24 21:39:20 +05:30
Kishore Nallan
44d55cb13d
Fixed a search issue: tokens that are not found in the index should be skipped.
2016-11-19 16:56:59 +05:30
Kishore Nallan
5736888935
Tests for collection.
2016-11-13 21:59:32 +05:30
Kishore Nallan
ea0da73cfb
Fix C++11 warnings.
2016-11-13 09:56:13 +05:30
Kishore Nallan
18a4528540
Forarray tests.
2016-11-13 09:53:30 +05:30
Kishore Nallan
9bb24331cc
Fuzzy search test - multiple results.
2016-11-12 21:30:22 +05:30
Kishore Nallan
aab5912110
Fuzzy search tests.
2016-11-07 19:36:28 +05:30
Kishore Nallan
c7e58efafd
Add some regression tests for checking out of bounds.
2016-11-06 08:30:00 +05:30
Kishore Nallan
7a0187e6b3
Import and port art tests.
2016-11-01 18:19:21 +05:30
Kishore Nallan
c229b715c5
Build RocksDB as a shared library.
2016-10-22 20:42:02 +05:30
Kishore Nallan
ee68da6f53
Build RocksDB and H2O also as part of the build process.
2016-10-21 09:18:13 +05:30
Kishore Nallan
a789137d55
Build libfor automatically as part of the build process.
2016-10-19 14:46:56 +05:30
Kishore Nallan
da4c31065a
Download and build libfor right from CMake.
2016-10-16 22:22:09 +05:30
Kishore Nallan
d2b903a931
Set up Google Test.
2016-10-16 22:15:11 +05:30
Kishore Nallan
c8eba7cf11
Adopt sequence ID as generated document ID, instead of using UUID.
2016-10-08 21:17:33 +05:30
Kishore Nallan
596430c036
Remove entry from rocksdb and art when required.
2016-10-05 21:24:40 +05:30
Kishore Nallan
ef105dcbd9
Reduce memory footprint.
2016-10-04 21:31:55 +05:30
Kishore Nallan
3e3e08aeca
Log resident memory right after indexing.
2016-10-02 19:12:26 +05:30
Kishore Nallan
d8eee0d04a
Util for logging exec time.
2016-10-02 19:11:59 +05:30
Kishore Nallan
9d5a120dab
Replace unordered_map with sparsepp hashmap. Much faster!
2016-09-27 22:03:41 +05:30
Kishore Nallan
080eceea79
Remove bit packing - use proper struct.
2016-09-27 20:53:38 +05:30
Kishore Nallan
1cf5eb9d9c
Fix path to source directory for make.
2016-09-26 08:18:00 +05:30
Kishore Nallan
5cd8b72d0b
Fixed a bug in top-K sorting.
2016-09-25 13:10:34 +05:30
Kishore Nallan
e777afc97f
API for removing a document from index.
2016-09-24 18:08:57 +05:30
Kishore Nallan
9f75b70b07
Add document end-point.
2016-09-13 21:35:21 +05:30
Kishore Nallan
59f25dca39
Fix libfor repository URL - updated CMakeLists & README.
2016-09-13 18:22:46 +05:30
Kishore Nallan
e7c6c6d3cb
Fixed multi word queries.
2016-09-12 14:25:07 +05:30
Kishore Nallan
2f26b95c5b
Intermediate matching nodes should not be pushed to the results vector.
2016-09-11 12:13:04 +05:30
Kishore Nallan
c96a9d9b35
Adopt Damerau–Levenshtein distance, instead of plain Levenshtein.
...
Specifically, we use the optimal string alignment distance. It treats transposition as a cost of 1, rather than 2.
https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance#Optimal_string_alignment_distance
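For illustration only, a minimal sketch of the optimal string alignment distance described above. The function name osa_distance is hypothetical and this is the textbook dynamic-programming form, not necessarily the code in this commit. It differs from plain Levenshtein only in the extra transposition step.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Optimal string alignment (restricted Damerau-Levenshtein) distance.
// Hypothetical helper for illustration; not taken from the repository.
int osa_distance(const std::string& a, const std::string& b) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();

    // d[i][j] = distance between the first i chars of a and the first j chars of b.
    std::vector<std::vector<int>> d(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 0; i <= n; i++) d[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= m; j++) d[0][j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= n; i++) {
        for (std::size_t j = 1; j <= m; j++) {
            const int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            d[i][j] = std::min({d[i - 1][j] + 1,          // deletion
                                d[i][j - 1] + 1,          // insertion
                                d[i - 1][j - 1] + cost}); // substitution or match
            // Adjacent transposition counts as a single edit (cost 1).
            if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                d[i][j] = std::min(d[i][j], d[i - 2][j - 2] + 1);
            }
        }
    }
    return d[n][m];
}
```

With this sketch, osa_distance("ca", "ac") is 1, whereas plain Levenshtein would give 2 for the same pair.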
2016-09-10 16:19:04 +05:30
Kishore Nallan
618a2020ed
Delete the old fuzzy search implementation.
2016-09-06 17:43:25 +05:30
Kishore Nallan
c25f7ccfdb
Parameterize the num_typos from the query end-point.
2016-09-05 19:13:44 +05:30
Kishore Nallan
93de59be29
Added some conditions for search space reduction that bring performance back on par with the original implementation.
2016-09-05 10:58:32 +05:30
Kishore Nallan
1a53d5692e
Fuzzy match rewrite - still need to work on matching perf.
2016-09-04 12:22:16 +05:30
Kishore Nallan
334ce264a5
For now, do not treat prefix matches as whole matches.
2016-09-02 10:34:34 +05:30
Kishore Nallan
aa46985bab
Handle spaces in query string.
2016-08-30 22:07:17 +05:30
Kishore Nallan
44da808f16
RocksDB based persistence.
2016-08-28 22:04:58 +05:30
Kishore Nallan
1d3af330dd
JSON document as input to collection.add method.
2016-08-28 09:23:30 +05:30
Kishore Nallan
2804b145dd
Add OS X build instructions.
2016-08-27 22:48:01 +05:30
Kishore Nallan
4d2ba27cab
Release memory of value stored when art node is destroyed.
2016-08-27 19:49:52 +05:30
Kishore Nallan
94db15b715
Fixed various issues flagged by Valgrind.
2016-08-27 13:44:53 +05:30
Kishore Nallan
1c36238f19
Fix debug flag.
2016-08-24 22:32:16 +05:30
Kishore Nallan
1e71058917
Length of char* was being calculated wrongly.
...
Need to consider the terminating null character.
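As a generic illustration of the point above (the helper name c_string_storage_size is hypothetical and not from this commit): strlen excludes the terminator, so the number of bytes needed to store or copy a C string is one more than what it returns.

```cpp
#include <cstddef>
#include <cstring>

// Bytes needed to hold a C string, including the terminating '\0'.
// Hypothetical helper for illustration; not taken from the repository.
std::size_t c_string_storage_size(const char* s) {
    return std::strlen(s) + 1;  // strlen() does not count the '\0'
}
```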
2016-08-24 22:31:50 +05:30
Kishore Nallan
1c09ec38a8
Removed redundant token_count field from the leaf.
2016-08-24 09:27:32 +05:30
Kishore Nallan
2a77a1ad66
Removed redundant storage of length in offsets array.
2016-08-24 08:46:02 +05:30
Kishore Nallan
c079b22cbd
Fix typo in test document harness.
...
Added better print debugging in the process.
2016-08-23 22:37:54 +05:30
Kishore Nallan
9b6547f050
Refactor index to be called as collection.
2016-08-23 20:32:37 +05:30
Kishore Nallan
ae34ae3195
Add JSON dep.
2016-08-23 20:31:11 +05:30
Kishore Nallan
7147fa7ed5
Added design and todo docs.
2016-08-16 21:17:37 +05:30
Kishore Nallan
0eeb75b385
Boost dep is not needed.
2016-08-16 14:57:29 +05:30