195 Commits

Author SHA1 Message Date
Kishore Nallan
aab5912110 Fuzzy search tests. 2016-11-07 19:36:28 +05:30
Kishore Nallan
c7e58efafd Add some regression tests for checking out of bounds. 2016-11-06 08:30:00 +05:30
Kishore Nallan
7a0187e6b3 Import and port art tests. 2016-11-01 18:19:21 +05:30
Kishore Nallan
c229b715c5 Build RocksDB as a shared library. 2016-10-22 20:42:02 +05:30
Kishore Nallan
ee68da6f53 Build RocksDB and H2O also as part of the build process. 2016-10-21 09:18:13 +05:30
Kishore Nallan
a789137d55 Build libfor automatically as part of the build process. 2016-10-19 14:46:56 +05:30
Kishore Nallan
da4c31065a Download and build libfor right from CMake. 2016-10-16 22:22:09 +05:30
Kishore Nallan
d2b903a931 Set up Google Test. 2016-10-16 22:15:11 +05:30
Kishore Nallan
c8eba7cf11 Adopt sequence ID as generated document ID, instead of using UUID. 2016-10-08 21:17:33 +05:30
Kishore Nallan
596430c036 Remove entry from rocksdb and art when required. 2016-10-05 21:24:40 +05:30
Kishore Nallan
ef105dcbd9 Reduce memory footprint. 2016-10-04 21:31:55 +05:30
Kishore Nallan
3e3e08aeca Log resident memory right after indexing. 2016-10-02 19:12:26 +05:30
Kishore Nallan
d8eee0d04a Util for logging exec time. 2016-10-02 19:11:59 +05:30
Kishore Nallan
9d5a120dab Replace unordered_map with sparsepp hashmap. Much faster! 2016-09-27 22:03:41 +05:30
Kishore Nallan
080eceea79 Remove bit packing - use proper struct. 2016-09-27 20:53:38 +05:30
Kishore Nallan
1cf5eb9d9c Fix path to source directory for make. 2016-09-26 08:18:00 +05:30
Kishore Nallan
5cd8b72d0b Fixed a bug in top-K sorting. 2016-09-25 13:10:34 +05:30
Kishore Nallan
e777afc97f API for removing a document from index. 2016-09-24 18:08:57 +05:30
Kishore Nallan
9f75b70b07 Add document end-point. 2016-09-13 21:35:21 +05:30
Kishore Nallan
59f25dca39 Fix libfor repository URL - updated CMakeLists & README. 2016-09-13 18:22:46 +05:30
Kishore Nallan
e7c6c6d3cb Fixed multi word queries. 2016-09-12 14:25:07 +05:30
Kishore Nallan
2f26b95c5b Intermediate matching nodes should not be pushed to the results vector. 2016-09-11 12:13:04 +05:30
Kishore Nallan
c96a9d9b35 Adopt Damerau–Levenshtein distance, instead of plain Levenshtein.
Specifically, we use the optimal string alignment distance. It counts a transposition of two adjacent characters as a single edit (cost 1), whereas plain Levenshtein needs two edits (cost 2).

https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance#Optimal_string_alignment_distance
2016-09-10 16:19:04 +05:30
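The cost difference described in the commit above can be sketched in a few lines of C++. This is a minimal illustration of the textbook optimal string alignment recurrence, not the code from this commit: the function name `edit_distance` and the `allow_transposition` flag are hypothetical and exist only so the two distances can be compared side by side.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the repository): dynamic-programming edit
// distance. With allow_transposition == false it is plain Levenshtein; with
// true it is the optimal string alignment (restricted Damerau-Levenshtein)
// distance, where swapping two adjacent characters costs 1.
std::size_t edit_distance(const std::string& a, const std::string& b,
                          bool allow_transposition) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();
    // d[i][j] = distance between the first i chars of a and first j chars of b.
    std::vector<std::vector<std::size_t>> d(n + 1,
                                            std::vector<std::size_t>(m + 1, 0));
    for (std::size_t i = 0; i <= n; ++i) d[i][0] = i;  // delete i chars
    for (std::size_t j = 0; j <= m; ++j) d[0][j] = j;  // insert j chars

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            d[i][j] = std::min({d[i - 1][j] + 1,          // deletion
                                d[i][j - 1] + 1,          // insertion
                                d[i - 1][j - 1] + cost}); // substitution
            // Extra case for optimal string alignment: adjacent transposition.
            if (allow_transposition && i > 1 && j > 1 &&
                a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]) {
                d[i][j] = std::min(d[i][j], d[i - 2][j - 2] + 1);
            }
        }
    }
    return d[n][m];
}

// Small self-check showing the cost difference for a transposition.
void check_transposition_cost() {
    assert(edit_distance("ab", "ba", false) == 2);  // plain Levenshtein
    assert(edit_distance("ab", "ba", true) == 1);   // optimal string alignment
}
```

Note that optimal string alignment never edits the same substring twice, so it can differ from the unrestricted Damerau–Levenshtein distance: for "ca" and "abc" this sketch returns 3, while the unrestricted distance is 2.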
Kishore Nallan
618a2020ed Delete the old fuzzy search implementation. 2016-09-06 17:43:25 +05:30
Kishore Nallan
c25f7ccfdb Parameterize the num_typos from the query end-point. 2016-09-05 19:13:44 +05:30
Kishore Nallan
93de59be29 Added some conditions for search space reduction that bring performance back to that of the original implementation. 2016-09-05 10:58:32 +05:30
Kishore Nallan
1a53d5692e Fuzzy match rewrite - still need to work on matching perf. 2016-09-04 12:22:16 +05:30
Kishore Nallan
334ce264a5 For now, do not treat prefix matches as whole matches. 2016-09-02 10:34:34 +05:30
Kishore Nallan
aa46985bab Handle spaces in query string. 2016-08-30 22:07:17 +05:30
Kishore Nallan
44da808f16 RocksDB based persistence. 2016-08-28 22:04:58 +05:30
Kishore Nallan
1d3af330dd JSON document as input to collection.add method. 2016-08-28 09:23:30 +05:30
Kishore Nallan
2804b145dd Add OS X build instructions. 2016-08-27 22:48:01 +05:30
Kishore Nallan
4d2ba27cab Release memory of value stored when art node is destroyed. 2016-08-27 19:49:52 +05:30
Kishore Nallan
94db15b715 Fixed various issues flagged by Valgrind. 2016-08-27 13:44:53 +05:30
Kishore Nallan
1c36238f19 Fix debug flag. 2016-08-24 22:32:16 +05:30
Kishore Nallan
1e71058917 Length of char* was being calculated incorrectly.
The terminating null character needs to be counted.
2016-08-24 22:31:50 +05:30
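The kind of off-by-one that the commit above describes can be illustrated as follows. This is only a sketch of the general pitfall, assuming the length in question must cover the terminator; the helper names `c_string_size` and `copy_c_string` are hypothetical and do not come from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: bytes needed to hold a C string, counting the
// terminating '\0'. std::strlen alone excludes the terminator.
std::size_t c_string_size(const char* s) {
    return std::strlen(s) + 1;
}

// Hypothetical helper: copies a C string into an owned buffer, terminator
// included, so the copy is itself a valid C string.
std::vector<char> copy_c_string(const char* s) {
    const std::size_t size = c_string_size(s);
    std::vector<char> buffer(size);
    std::memcpy(buffer.data(), s, size);
    return buffer;
}

// Small self-check of the length arithmetic.
void check_c_string_size() {
    const char* word = "art";
    assert(std::strlen(word) == 3);    // characters only
    assert(c_string_size(word) == 4);  // characters plus '\0'
    const std::vector<char> copy = copy_c_string(word);
    assert(copy.back() == '\0');
    assert(std::strcmp(copy.data(), word) == 0);
}
```

Sizing the buffer with `std::strlen(s)` alone would drop the terminator, and any later read of the copy as a C string would run past the end of the buffer.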
Kishore Nallan
1c09ec38a8 Removed redundant token_count field from the leaf. 2016-08-24 09:27:32 +05:30
Kishore Nallan
2a77a1ad66 Removed redundant storage of length in offsets array. 2016-08-24 08:46:02 +05:30
Kishore Nallan
c079b22cbd Fix typo in test document harness.
Added better print debugging in the process.
2016-08-23 22:37:54 +05:30
Kishore Nallan
9b6547f050 Refactor: rename index to collection. 2016-08-23 20:32:37 +05:30
Kishore Nallan
ae34ae3195 Add JSON dep. 2016-08-23 20:31:11 +05:30
Kishore Nallan
7147fa7ed5 Added design and todo docs. 2016-08-16 21:17:37 +05:30
Kishore Nallan
0eeb75b385 Boost dep is not needed. 2016-08-16 14:57:29 +05:30
Kishore Nallan
e6306ac432 Remove crow as dep. 2016-08-14 15:37:45 +05:30
Kishore Nallan
4f10586d13 Add skeleton HTTP server for serving the RESTish API. 2016-08-14 12:20:41 +05:30
Kishore Nallan
a228d153a6 Update README. 2016-08-14 12:19:35 +05:30
Kishore Nallan
a927a32018 Breaking down the long search method into smaller chunks. 2016-08-07 15:59:49 -07:00
Kishore Nallan
ba33da1d51 Lots of code clean up.
* Move stuff out of main to classes
* Standardize naming conventions.
2016-08-07 14:55:26 -07:00
Kishore Nallan
6c2974aaeb Add crow as a dep - http framework. 2016-08-07 14:54:26 -07:00
Kishore Nallan
e1f4b3d513 Constantize arguments, some clean-up code. 2016-08-05 18:26:31 -07:00