TokenCountPayloadFilter
From JDONREF Wiki
Adds to each token's integer payload the number of tokens in the same field that carry the same payload.
Sample
For example, the document:
{ "fullName": "BOULEVARD|1 DE|1 PARIS|1 L|2 HOPITAL|2" }
indexed with a mapping like:
"fullName" : {"type": "string", "term_vector" : "with_positions_offsets_payloads", "index_analyzer":"myAnalyzer"}
and settings like:
{
  "index" : {
    "analysis" : {
      "analyzer" : {
        "myAnalyzer" : {
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : ["delimited_payload_filter", "lowercase", "tokencount_payload_filter"]
        }
      },
      "filter" : {
        "delimited_payload_filter" : {
          "type" : "delimited_payload_filter",
          "delimiter" : "|",
          "encoding" : "int"
        },
        "tokencount_payload_filter" : {
          "type" : "tokencountpayloads",
          "factor" : 1000
        }
      }
    }
  }
}
will index the tokens BOULEVARD, DE, PARIS, L, HOPITAL with the respective payloads: 3001, 3001, 3001, 2002, 2002.
- 3001 means there are 3 tokens with payload 1 (3 * factor + 1).
- 2002 means there are 2 tokens with payload 2 (2 * factor + 2).
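The arithmetic above can be reproduced outside the index. The following Python sketch is not the filter's implementation: it only mimics the formula count * factor + payload on the sample field, and the helper name add_token_counts is hypothetical.

```python
from collections import Counter


def add_token_counts(tokens, factor):
    """Return (term, new_payload) pairs for one field.

    tokens is a list of (term, payload) pairs; new_payload is
    count * factor + payload, where count is the number of tokens
    in the field that carry the same payload.
    """
    counts = Counter(payload for _, payload in tokens)
    return [(term, counts[payload] * factor + payload) for term, payload in tokens]


# Parse the sample field the way a whitespace tokenizer followed by a
# "|"-delimited integer payload filter would split it (illustration only).
field = "BOULEVARD|1 DE|1 PARIS|1 L|2 HOPITAL|2"
tokens = []
for item in field.split():
    term, payload = item.split("|")
    tokens.append((term, int(payload)))

encoded = add_token_counts(tokens, factor=1000)
print([payload for _, payload in encoded])  # [3001, 3001, 3001, 2002, 2002]
```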
These factored payloads can be used with the All checker of PayloadCheckerSpanQuery.
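A consumer of these payloads needs to separate the count from the original value. The Python sketch below shows one way to do so with integer division and remainder. It is an assumption about how such payloads could be decoded, not a description of what PayloadCheckerSpanQuery does internally, and the function name split_payload is hypothetical. Decoding is only unambiguous when every original payload is smaller than the factor.

```python
def split_payload(encoded, factor):
    """Split count * factor + payload back into (count, payload).

    Assumes the original payload is in the range [0, factor).
    """
    count, original = divmod(encoded, factor)
    return count, original


print(split_payload(3001, 1000))  # (3, 1)
print(split_payload(2002, 1000))  # (2, 2)
```

With factor 1000, original payloads up to 999 can be recovered this way; a larger payload would spill into the count.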
Features
| Setting | Description |
| factor | (Mandatory) The factor by which the count of tokens with a given payload is multiplied. |
| ignored_types | (Default: none) Payloads of tokens with these types are left unmodified; payloads of all other tokens are rewritten. |
