Kishore Nallan
a6dced3c43
Do all validations upfront before attempting to index fields.
2017-10-23 21:32:51 +05:30
Kishore Nallan
a2ce56fd67
Allow string fields to be filtered on.
...
A rather convenient feature to have, but a string filter must match all tokens, without typo tolerance.
2017-10-22 19:52:09 +05:30
Kishore Nallan
0e4517d901
Combine search, facet and sort fields into a single "fields".
...
1. All numerical fields are added to sort index automatically since that makes logical sense.
2. Search fields to be used as a facet are to have a `facet: true` property - removes duplication.
3. If someone wants to use a faceted field also to search against (rare scenario), then they can duplicate that field without the `facet: true` property.
2017-10-17 21:51:07 +05:30
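The three rules in commit 0e4517d901 can be read as a small classification step over the unified `fields` list. The sketch below is only an illustration of that reading, not the project's code: the helper `split_fields`, the set of numeric type names, and the choice to treat non-facet string fields as search fields are all assumptions.

```python
# Hypothetical illustration of the unified "fields" rules; not the project's code.
NUMERIC_TYPES = {"int32", "int64", "float"}  # assumed type names


def split_fields(fields):
    """Derive search, facet and sort field names from one unified list."""
    search, facet, sort = [], [], []
    for field in fields:
        name, ftype = field["name"], field["type"]
        if ftype in NUMERIC_TYPES:
            # Rule 1: numerical fields are added to the sort index automatically.
            sort.append(name)
        if field.get("facet", False):
            # Rule 2: `facet: true` marks a field as a facet.
            facet.append(name)
        elif ftype == "string":
            # Assumption: remaining string fields are the searchable ones.
            search.append(name)
    return {"search": search, "facet": facet, "sort": sort}


schema = [
    {"name": "title", "type": "string"},
    {"name": "brand", "type": "string", "facet": True},
    {"name": "points", "type": "int32"},
]
derived = split_fields(schema)
```

Under rule 3, a faceted field that should also be searchable would be listed a second time without `facet: true`.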
Kishore Nallan
58bc73312d
Replicate deletion of document and dropping of collection.
2017-09-14 22:44:29 +05:30
Kishore Nallan
58a10877ee
Basics of background replication.
...
Supports inserts. Deletion will be tackled next.
2017-09-14 09:58:44 +05:30
Kishore Nallan
d351523655
Allow results to be sorted on a float field.
2017-08-20 21:15:48 +05:30
Kishore Nallan
f5848be750
Address prefix search issues.
...
Score based comparison was broken - test has been enhanced.
2017-08-18 23:17:28 +05:30
Kishore Nallan
e384b777a1
Collection operations on float fields.
2017-08-10 18:20:58 -04:00
Kishore Nallan
6a6785ef74
Short circuit to speed up single token searches.
...
- Refactor token position population
- Store only the query index in topster instead of storing the full offsets.
- Calculate the offsets finally on the results that are to be returned.
2017-08-08 17:39:23 -04:00
Kishore Nallan
3e54cb4022
API for summary of a collection, including the number of documents indexed in the collection.
2017-07-29 11:46:55 +05:30
Kishore Nallan
916aaf6526
API for fetching a document by ID and listing all collections.
2017-07-28 20:39:51 +05:30
Kishore Nallan
ffba0371b0
Proper API responses when pagination exceeds result boundaries.
2017-07-07 18:36:56 +05:30
Kishore Nallan
8295707ed4
Allow pagination of results.
...
`page` and `per_page` can be specified. Simpler to reason about than the usual `start` and `offset` fields.
2017-06-15 17:14:10 +05:30
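For readers unfamiliar with the `page` / `per_page` style from commit 8295707ed4, it maps onto a start offset as sketched below. This is a minimal illustration, assuming pages are 1-indexed; the `paginate` helper and its defaults are hypothetical, and returning an empty list for an out-of-range page is a placeholder rather than the API's actual response.

```python
# Hypothetical helper; assumes 1-indexed pages.
def paginate(results, page=1, per_page=10):
    """Return the slice of `results` that belongs to the requested page."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page  # equivalent offset into the result list
    return results[start:start + per_page]


hits = list(range(25))
last_page = paginate(hits, page=3, per_page=10)    # the final 5 hits
beyond_end = paginate(hits, page=4, per_page=10)   # nothing left to return
```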
Kishore Nallan
57e03efe1f
Contextual snippet only for longer strings.
...
Strings under a defined constant token length will be fully highlighted, instead of showing a snippet of the relevant matching portion.
2017-06-14 08:53:23 +02:00
Kishore Nallan
1d5146f7ff
Track best-matched token offsets needed for highlighting.
...
- We store the best matched token offset positions in Topster KV
- Using run-length encoding (via unions) to pack the offset diffs intelligently
2017-06-09 13:32:03 -05:00
Kishore Nallan
b7bc974b8e
Expose token ranking field properly via the API.
2017-05-27 14:02:32 +05:30
Kishore Nallan
7531f9b13c
Add `const` in more places.
2017-05-22 18:59:14 +05:30
Kishore Nallan
61bfdf027b
Fix valgrind errors, plugging other leaks.
2017-05-21 15:59:16 +05:30
Kishore Nallan
a5cd45e362
Rewrite facet implementation.
2017-05-20 13:25:56 +05:30
Kishore Nallan
3478aef573
API for deleting a document by ID.
2017-05-17 20:31:24 +05:30
Kishore Nallan
a25d2f590d
Sort order is required only during query time.
2017-05-14 17:36:48 +05:30
Kishore Nallan
060959ad70
Fixed wrong `found` counts.
2017-05-13 22:31:56 +05:30
Kishore Nallan
f62247cd32
Make the `sort_fields` take order of sorting.
2017-05-07 21:33:04 +05:30
Kishore Nallan
70dda716c5
Parameterize the token ordering field.
2017-03-26 21:26:01 +05:30
Kishore Nallan
222e2c689a
Handle indexing document that does not have all the fields defined in the schema.
2017-03-25 21:45:06 +05:30
Kishore Nallan
7af95e7f22
Refactored the facet implementation as well as the query interface.
...
- Facetable and Rankable fields must be defined upfront during collection creation
- During query time, specific rank and facet fields can be mentioned but they should belong to the set declared previously
2017-03-19 19:25:42 +05:30
Kishore Nallan
4776b41dc1
Facet implementation.
2017-03-13 21:09:27 +05:30
Kishore Nallan
96921be016
Parse filter query string.
2017-03-06 21:17:13 +05:30
Kishore Nallan
0760e4d01b
String based filtering.
2017-03-04 20:58:29 +05:30
Kishore Nallan
14168c48fc
Support for "IN" style numerical filter.
2017-03-03 21:50:43 +05:30
Kishore Nallan
aa9945c3c0
Implemented filter on a single int32 value.
2017-02-12 12:51:28 +05:30
Kishore Nallan
3ef10b5bb0
Fix ordering of sequence id rocksdb keys.
2017-02-04 21:32:24 +05:30
Kishore Nallan
b880cfd531
Refactor forarray - split into individual classes.
2017-02-04 16:27:07 +05:30
Kishore Nallan
cab0b36699
Skeleton for filter support.
2017-02-02 09:20:06 +05:30
Kishore Nallan
4e468fb0b9
Index and search on multi-valued numeric field.
2017-01-28 20:32:42 -06:00
Kishore Nallan
385e8cb7a2
Index and search on multi-valued string field.
2017-01-28 10:37:46 -06:00
Kishore Nallan
8475cba007
Minor refactoring of collection manager.
2017-01-26 13:54:30 -06:00
Kishore Nallan
216ac7997a
Restore in-memory index on restart.
2017-01-24 22:13:49 -05:00
Kishore Nallan
a6cacf19d0
Return total number of results found in the API.
2017-01-22 21:39:46 +05:30
Kishore Nallan
b7654baa74
Persist collection's next_seq_id.
2017-01-09 22:14:06 +05:30
Kishore Nallan
3e8f9298a9
Remove redundant string conversion for collection_id.
2017-01-08 22:02:35 +05:30
Kishore Nallan
d831c49817
Move duplicate ID detection right inside topster.
2017-01-08 21:44:36 +05:30
Kishore Nallan
2f08eca12e
Initial sketch for persisting meta information about collections.
2017-01-08 19:47:17 +05:30
Kishore Nallan
2b6293650e
Search across multiple fields.
...
Need to write more tests.
2017-01-01 19:56:26 +05:30
Kishore Nallan
54a60398ab
Parameterize rank fields.
2016-12-29 21:45:38 +05:30
Kishore Nallan
0b88e669f6
Make ART fuzzy_search take min_cost and max_cost instead of only max_cost.
2016-12-28 18:16:43 +05:30
Kishore Nallan
12276b651f
Base work for supporting multiple indexable fields.
2016-12-22 22:26:33 +05:30
Kishore Nallan
e1526319f7
Building up support for prefix based searching and for ranking token suggestions by either frequency or max_score.
2016-11-27 14:56:15 +05:30
Kishore Nallan
4e10fadeb7
Settle for partial matches when the whole query produces no results.
2016-11-26 17:13:16 +05:30
Kishore Nallan
396e10be5d
Refactor collection's search method to be more judicious in using higher costs.
...
Earlier, even if one token produced no result, ALL tokens were searched with a higher cost. This change ensures that we first retry only the token that did not produce results with a larger cost before doing the same for other tokens.
2016-11-24 21:39:20 +05:30
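The idea in commit 396e10be5d, raising the typo cost only for the token that found nothing, can be sketched in simplified form as follows. This is not the project's implementation: it swaps the ART-based fuzzy search for a plain edit-distance scan over a word list, treats each token independently, and every name here (`edit_distance`, `lookup`, `search_tokens`) is hypothetical.

```python
# Simplified, hypothetical sketch; a linear scan stands in for the real index.
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def lookup(vocab, token, cost):
    """Words within `cost` edits of `token`."""
    return [word for word in vocab if edit_distance(token, word) <= cost]


def search_tokens(vocab, tokens, max_cost=2):
    """Raise the cost per token, and only for tokens that found nothing."""
    matches, costs = {}, {}
    for token in tokens:
        for cost in range(max_cost + 1):
            found = lookup(vocab, token, cost)
            if found:
                break  # this token is satisfied; do not escalate further
        matches[token], costs[token] = found, cost
    return matches, costs


vocab = ["search", "engine", "typo"]
matches, costs = search_tokens(vocab, ["search", "engin"])
```

Here only the misspelled token is retried at a higher cost, while the exact token stays at cost 0.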