Kishore Nallan
f8be8f4f6a
Handle normalization of unicode elegantly.
2018-01-26 12:54:00 +00:00
Kishore Nallan
c3298ba6d8
Address -Wall and -Wextra warnings.
2018-01-25 20:08:13 +05:30
Kishore Nallan
491de5a325
Remove ascii special characters during string normalization.
...
Unicode special characters are retained verbatim; this will be addressed in the future.
2018-01-16 21:16:24 +05:30
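The normalization described in the entry above could look roughly like the following. This is only a minimal sketch, not the project's actual code: the function name `strip_ascii_special` is hypothetical, and it assumes the input is a single UTF-8 token, so ASCII whitespace is treated like any other non-alphanumeric character.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not taken from the code base).
// Removes ASCII characters that are neither letters nor digits and copies
// every byte >= 0x80 through unchanged, so multi-byte UTF-8 sequences
// (including Unicode punctuation) are retained verbatim.
std::string strip_ascii_special(const std::string& token) {
    std::string out;
    out.reserve(token.size());
    for (unsigned char c : token) {
        if (c >= 0x80) {
            // Byte belongs to a multi-byte UTF-8 sequence: keep as-is.
            out.push_back(static_cast<char>(c));
        } else if (std::isalnum(c)) {
            // ASCII letter or digit: keep.
            out.push_back(static_cast<char>(c));
        }
        // Any other ASCII byte (punctuation, whitespace, control) is dropped.
    }
    return out;
}
```

Working byte by byte is enough here because every byte of a multi-byte UTF-8 sequence has its high bit set, so no ASCII check can accidentally split a Unicode character.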
Kishore Nallan
192b00e71f
Address API review comments.
...
1. Move document specific actions under /documents
2. Document creation echoes the full document
3. Collections summary returns full detail on each collection
4. Collections summary endpoint has no nested root attribute
5. When collection or document is deleted, the whole entity is returned in response
2018-01-02 21:35:24 +05:30
Kishore Nallan
f6612cb34e
Refactor scoring loop.
2017-12-30 21:14:31 +05:30
Kishore Nallan
21d3de6145
Improve handling of replication errors / edge cases.
2017-12-24 21:03:30 +05:30
Kishore Nallan
01275c38f2
Use unordered_map for low-volume metadata structures.
...
Iteration order of spp::sparse_hash_map differs between clang and gcc.
2017-12-20 09:16:15 +05:30
Kishore Nallan
8d5f7c18a3
Fix a couple of test compile warnings.
2017-12-17 21:42:45 +05:30
Kishore Nallan
60288631be
Check partial node text iteratively for prefix match.
2017-12-17 10:26:37 +05:30
Kishore Nallan
b1ac1d7d7c
Allow API key to be passed as a GET parameter as a backup to a header.
...
Required for authentication of JSONP requests from JS client.
2017-12-17 07:35:59 +05:30
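A minimal sketch of the header-first, query-parameter-fallback lookup described in the entry above. The `Request` struct and the `extract_api_key` function are hypothetical stand-ins for the server's real request type, header names are assumed to be stored lower-cased, and the GET parameter is assumed to share the header's name; the log does not state any of these details.

```cpp
#include <map>
#include <string>

// Hypothetical request representation, for illustration only.
struct Request {
    std::map<std::string, std::string> headers;       // keys assumed lower-cased
    std::map<std::string, std::string> query_params;  // parsed GET parameters
};

// Returns the API key from the header when present, otherwise from the
// GET parameter, otherwise an empty string.
std::string extract_api_key(const Request& req) {
    const std::string key_name = "x-api-key";  // assumed name for both sources

    auto header_it = req.headers.find(key_name);
    if (header_it != req.headers.end()) {
        return header_it->second;  // the header takes precedence
    }

    auto param_it = req.query_params.find(key_name);
    if (param_it != req.query_params.end()) {
        return param_it->second;   // fallback, e.g. for JSONP requests
    }

    return "";
}
```

JSONP requests are issued through a script tag and cannot set custom headers, which is why a fallback of this kind is needed for the JS client.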
Kishore Nallan
3a9743aa90
Parameterize API key used by replication.
2017-12-16 22:12:03 +05:30
Kishore Nallan
fe0db59877
Support indexing of bool fields.
2017-12-13 20:24:44 +05:30
Kishore Nallan
78b9ee52ec
Make match score computation predictable and consistent across multiple indexes.
2017-11-12 22:31:29 +05:30
Kishore Nallan
c1c7d493e5
Allow an integer to be passed to a field defined as a float.
...
In the real world, this is common and it's better to be practical here.
2017-11-05 10:45:45 +05:30
Kishore Nallan
3907c2d3f9
Fixed an out-of-bounds bug with highlighting.
2017-11-03 21:07:56 +05:30
Kishore Nallan
a7479171b1
Limit number of prefix candidates to a constant.
...
Using the number of results for the comparison causes the total count to change during pagination.
2017-11-02 18:04:55 +05:30
Kishore Nallan
93430444d0
Prefix sort field can also be a float
2017-10-25 18:18:07 +05:30
Kishore Nallan
a2ce56fd67
Allow string fields to be filtered on.
...
A rather convenient feature to have, but it matches on all tokens without typo tolerance.
2017-10-22 19:52:09 +05:30
Kishore Nallan
da295e90e8
Tests for array utils.
2017-10-22 17:05:25 +05:30
Kishore Nallan
0e4517d901
Combine search, facet and sort fields into a single "fields".
...
1. All numerical fields are added to sort index automatically since that makes logical sense.
2. Search fields to be used as a facet are to have a `facet: true` property - removes duplication.
3. If someone wants to use a faceted field also to search against (rare scenario), then they can duplicate that field without the `facet: true` property.
2017-10-17 21:51:07 +05:30
Kishore Nallan
bb3ca4211a
Test for pagination.
2017-09-24 22:00:57 +05:30
Kishore Nallan
b3689e16aa
Improve test harness to cover some missing cases.
2017-09-23 21:21:13 +05:30
Kishore Nallan
b0cb3ceb41
Set a ceiling on num_typos so that 1 and 2 char prefix searches make sense.
2017-09-22 20:59:26 +05:30
Kishore Nallan
e24e0fae5d
Node score should be an int32_t.
2017-09-21 19:40:41 +05:30
Kishore Nallan
901626652a
Make type definitions less verbose.
...
Use string[] instead of STRING_ARRAY and so on.
2017-09-19 22:01:08 +05:30
Kishore Nallan
58bc73312d
Replicate deletion of document and dropping of collection.
2017-09-14 22:44:29 +05:30
Kishore Nallan
6a465a0289
API for fetching all transactions from a given transaction sequence number.
...
Relying on RocksDB underlying APIs for that. The updates are sent in a Base64 encoding in the JSON response.
2017-08-31 09:42:11 +05:30
Kishore Nallan
d351523655
Allow results to be sorted on a float field.
2017-08-20 21:15:48 +05:30
Kishore Nallan
3104dea42a
Generify the topster container to hold both integer and float.
...
Benchmarked to ensure that performance is on par.
2017-08-20 15:25:11 +05:30
Kishore Nallan
ea550f167c
For prefix search, only the last term in the query should be considered as prefix.
2017-08-19 10:42:49 +05:30
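The rule in the entry above can be sketched as follows. The `QueryToken` struct and the `tokenize_for_prefix_search` function are hypothetical, and splitting on whitespace is an assumption; the actual tokenizer may behave differently.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical token type: the text plus a flag saying whether the token
// should be matched as a prefix or as a whole word.
struct QueryToken {
    std::string text;
    bool is_prefix;
};

// Splits the query on whitespace and marks only the final token as a prefix.
std::vector<QueryToken> tokenize_for_prefix_search(const std::string& query) {
    std::vector<QueryToken> tokens;
    std::istringstream stream(query);
    std::string word;
    while (stream >> word) {
        tokens.push_back({word, false});  // whole-word match by default
    }
    if (!tokens.empty()) {
        tokens.back().is_prefix = true;   // the user may still be typing this one
    }
    return tokens;
}
```

For a query such as `harry pot`, this treats `harry` as a complete word and only `pot` as a prefix.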
Kishore Nallan
f5848be750
Address prefix search issues.
...
Score based comparison was broken - test has been enhanced.
2017-08-18 23:17:28 +05:30
Kishore Nallan
38fbbea71f
Ensure that the token ranking field is an unsigned int.
2017-08-10 18:29:59 -04:00
Kishore Nallan
e384b777a1
Collection operations on float fields.
2017-08-10 18:20:58 -04:00
Kishore Nallan
a2f475d7fc
Enable ART to index and search on floating point numbers.
2017-08-09 18:17:26 -04:00
Kishore Nallan
6a6785ef74
Short circuit to speed up single token searches.
...
- Refactor token position population
- Store only the query index in topster instead of storing the full offsets.
- Calculate the offsets finally on the results that are to be returned.
2017-08-08 17:39:23 -04:00
Kishore Nallan
916aaf6526
API for fetching a document ID and listing all collections.
2017-07-28 20:39:51 +05:30
Kishore Nallan
ffba0371b0
Proper API responses when pagination exceeds result boundaries.
2017-07-07 18:36:56 +05:30
Kishore Nallan
c471cd50c3
Implement authentication against an API auth key.
...
The key should be passed via X-API-KEY HTTP header.
2017-07-04 22:18:47 +05:30
Kishore Nallan
06ff49df4a
Added a few more tests.
2017-07-01 22:57:59 +05:30
Kishore Nallan
8295707ed4
Allow pagination of results.
...
`page` and `per_page` can be specified. Simpler to reason about than using the usual `start` and `offset` fields.
2017-06-15 17:14:10 +05:30
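To illustrate how `page` and `per_page` map onto a range of results, here is a minimal sketch. The function name `page_bounds` is hypothetical, pages are assumed to be 1-based, and an out-of-range page is assumed to yield an empty range; none of this is confirmed by the log.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical helper: converts a 1-based page number and a page size into
// a half-open [start, end) range of result indices, clamped to `total`.
std::pair<std::size_t, std::size_t> page_bounds(std::size_t page,
                                                std::size_t per_page,
                                                std::size_t total) {
    if (page == 0 || per_page == 0) {
        return {0, 0};            // invalid input: empty range
    }
    std::size_t start = (page - 1) * per_page;
    if (start >= total) {
        return {total, total};    // page lies past the last result: empty range
    }
    std::size_t end = std::min(start + per_page, total);
    return {start, end};
}
```

With 25 results and `per_page` set to 10, page 3 covers indices 20 to 24 and page 4 is empty.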
Kishore Nallan
57e03efe1f
Contextual snippet only for longer strings.
...
Strings under a defined constant token length will be fully highlighted, instead of showing a snippet of relevant matching portion.
2017-06-14 08:53:23 +02:00
Kishore Nallan
50e08726da
String field tokens which match with query tokens are highlighted in the results.
2017-06-09 14:59:06 -05:00
Kishore Nallan
1d5146f7ff
Track best-matched token offsets needed for highlighting.
...
- We store the best matched token offset positions in Topster KV
- Using run-length encoding (via unions) to pack the offset diffs intelligently
2017-06-09 13:32:03 -05:00
Kishore Nallan
20a3139dd2
Tweak score calculation - number of words present is more important than candidate rank score.
2017-05-27 17:47:26 +05:30
Kishore Nallan
b7bc974b8e
Expose token ranking field properly via the API.
2017-05-27 14:02:32 +05:30
Kishore Nallan
6d3613b750
Limit facets returned to top 10.
2017-05-20 15:25:12 +05:30
Kishore Nallan
a25d2f590d
Sort order is required only during query time.
2017-05-14 17:36:48 +05:30
Kishore Nallan
1992d92eaf
Tests for asc/desc sort order.
2017-05-14 12:25:59 +05:30
Kishore Nallan
060959ad70
Fixed wrong found counts.
2017-05-13 22:31:56 +05:30
Kishore Nallan
f62247cd32
Make the sort_fields take order of sorting.
2017-05-07 21:33:04 +05:30