9.13. Text Search Functions and Operators
Table 9.41, Table 9.42 and Table 9.43 summarize the functions and operators that are provided for full text searching. See Chapter 12 for a detailed explanation of PostgreSQL's text search facility.
Table 9.41. Text Search Operators
Operator | Return Type | Description | Example | Result |
---|---|---|---|---|
@@ | boolean |
tsvector matches tsquery ? | to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat') | t |
@@@ | boolean | deprecated synonym for @@
| to_tsvector('fat cats ate rats') @@@ to_tsquery('cat & rat') | t |
|| | tsvector | concatenate tsvector s | 'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector | 'a':1 'b':2,5 'c':3 'd':4 |
&& | tsquery | AND tsquery s together | 'fat | rat'::tsquery && 'cat'::tsquery | ( 'fat' | 'rat' ) & 'cat' |
|| | tsquery | OR tsquery s together | 'fat | rat'::tsquery || 'cat'::tsquery | ( 'fat' | 'rat' ) | 'cat' |
!! | tsquery | negate a tsquery
| !! 'cat'::tsquery | !'cat' |
<-> | tsquery |
tsquery followed by tsquery
| to_tsquery('fat') <-> to_tsquery('rat') | 'fat' <-> 'rat' |
@> | boolean |
tsquery contains another ? | 'cat'::tsquery @> 'cat & rat'::tsquery | f |
<@ | boolean |
tsquery is contained in ? | 'cat'::tsquery <@ 'cat & rat'::tsquery | t |
Note
The
tsquery
containment operators consider only the lexemes listed in the two queries, ignoring the combining operators.
In addition to the operators shown in the table, the ordinary B-tree comparison operators (=
, <
, etc) are defined for types tsvector
and tsquery
. These are not very useful for text searching but allow, for example, unique indexes to be built on columns of these types.
Table 9.42. Text Search Functions
Function | Return Type | Description | Example | Result |
---|---|---|---|---|
array_to_tsvector(text[])
| tsvector | convert array of lexemes to tsvector
| array_to_tsvector('{fat,cat,rat}'::text[]) | 'cat' 'fat' 'rat' |
get_current_ts_config()
| regconfig | get default text search configuration | get_current_ts_config() | english |
length(tsvector)
| integer | number of lexemes in tsvector
| length('fat:2,4 cat:3 rat:5A'::tsvector) | 3 |
numnode(tsquery)
| integer | number of lexemes plus operators in tsquery
| numnode('(fat & rat) | cat'::tsquery) | 5 |
plainto_tsquery([ config regconfig , ] query text)
| tsquery | produce tsquery ignoring punctuation | plainto_tsquery('english', 'The Fat Rats') | 'fat' & 'rat' |
phraseto_tsquery([ config regconfig , ] query text)
| tsquery | produce tsquery that searches for a phrase, ignoring punctuation | phraseto_tsquery('english', 'The Fat Rats') | 'fat' <-> 'rat' |
websearch_to_tsquery([ config regconfig , ] query text)
| tsquery | produce tsquery from a web search style query | websearch_to_tsquery('english', '"fat rat" or rat') | 'fat' <-> 'rat' | 'rat' |
querytree(query tsquery)
| text | get indexable part of a tsquery
| querytree('foo & ! bar'::tsquery) | 'foo' |
setweight(vector tsvector, weight "char")
| tsvector | assign weight to each element of vector
| setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A') | 'cat':3A 'fat':2A,4A 'rat':5A |
setweight(vector tsvector, weight "char", lexemes text[])
| tsvector | assign weight to elements of vector that are listed in lexemes
| setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A', '{cat,rat}') | 'cat':3A 'fat':2,4 'rat':5A |
strip(tsvector)
| tsvector | remove positions and weights from tsvector
| strip('fat:2,4 cat:3 rat:5A'::tsvector) | 'cat' 'fat' 'rat' |
to_tsquery([ config regconfig , ] query text)
| tsquery | normalize words and convert to tsquery
| to_tsquery('english', 'The & Fat & Rats') | 'fat' & 'rat' |
to_tsvector([ config regconfig , ] document text)
| tsvector | reduce document text to tsvector
| to_tsvector('english', 'The Fat Rats') | 'fat':2 'rat':3 |
to_tsvector([ config regconfig , ] document json(b)) | tsvector | reduce each string value in the document to a tsvector , and then concatenate those in document order to produce a single tsvector
| to_tsvector('english', '{"a": "The Fat Rats"}'::json) | 'fat':2 'rat':3 |
json(b)_to_tsvector([ config regconfig, ] document json(b), filter json(b)) | tsvector | reduce each value in the document, specified by filter to a tsvector , and then concatenate those in document order to produce a single tsvector . filter is a jsonb array, that enumerates what kind of elements need to be included into the resulting tsvector . Possible values for filter are "string" (to include all string values), "numeric" (to include all numeric values in the string format), "boolean" (to include all Boolean values in the string format "true" /"false" ), "key" (to include all keys) or "all" (to include all above). These values can be combined together to include, e.g. all string and numeric values. | json_to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::json, '["string", "numeric"]') | '123':5 'fat':2 'rat':3 |
ts_delete(vector tsvector, lexeme text)
| tsvector | remove given lexeme from vector
| ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat') | 'cat':3 'rat':5A |
ts_delete(vector tsvector, lexemes text[]) | tsvector | remove any occurrence of lexemes in lexemes from vector
| ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat']) | 'cat':3 |
ts_filter(vector tsvector, weights "char"[])
| tsvector | select only elements with given weights from vector
| ts_filter('fat:2,4 cat:3b rat:5A'::tsvector, '{a,b}') | 'cat':3B 'rat':5A |
ts_headline([ config regconfig, ] document text, query tsquery [, options text ])
| text | display a query match | ts_headline('x y z', 'z'::tsquery) | x y <b>z</b> |
ts_headline([ config regconfig, ] document json(b), query tsquery [, options text ]) | text | display a query match | ts_headline('{"a":"x y z"}'::json, 'z'::tsquery) | {"a":"x y <b>z</b>"} |
ts_rank([ weights float4[], ] vector tsvector, query tsquery [, normalization integer ])
| float4 | rank document for query | ts_rank(textsearch, query) | 0.818 |
ts_rank_cd([ weights float4[], ] vector tsvector, query tsquery [, normalization integer ])
| float4 | rank document for query using cover density | ts_rank_cd('{0.1, 0.2, 0.4, 1.0}', textsearch, query) | 2.01317 |
ts_rewrite(query tsquery, target tsquery, substitute tsquery)
| tsquery | replace target with substitute within query | ts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery) | 'b' & ( 'foo' | 'bar' ) |
ts_rewrite(query tsquery, select text) | tsquery | replace using targets and substitutes from a SELECT command | SELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases') | 'b' & ( 'foo' | 'bar' ) |
tsquery_phrase(query1 tsquery, query2 tsquery)
| tsquery | make query that searches for query1 followed by query2 (same as <-> operator) | tsquery_phrase(to_tsquery('fat'), to_tsquery('cat')) | 'fat' <-> 'cat' |
tsquery_phrase(query1 tsquery, query2 tsquery, distance integer) | tsquery | make query that searches for query1 followed by query2 at distance distance
| tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10) | 'fat' <10> 'cat' |
tsvector_to_array(tsvector)
| text[] | convert tsvector to array of lexemes | tsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector) | {cat,fat,rat} |
tsvector_update_trigger()
| trigger | trigger function for automatic tsvector column update | CREATE TRIGGER ... tsvector_update_trigger(tsvcol, 'pg_catalog.swedish', title, body) | |
tsvector_update_trigger_column()
| trigger | trigger function for automatic tsvector column update | CREATE TRIGGER ... tsvector_update_trigger_column(tsvcol, configcol, title, body) | |
unnest(tsvector, OUT lexeme text, OUT positions smallint[], OUT weights text)
| setof record | expand a tsvector to a set of rows | unnest('fat:2,4 cat:3 rat:5A'::tsvector) | (cat,{3},{D}) ... |
Note
All the text search functions that accept an optional
regconfig
argument will use the configuration specified by default_text_search_config when that argument is omitted.
The functions in Table 9.43 are listed separately because they are not usually used in everyday text searching operations. They are helpful for development and debugging of new text search configurations.
Table 9.43. Text Search Debugging Functions
Function | Return Type | Description | Example | Result |
---|---|---|---|---|
ts_debug([ config regconfig, ] document text, OUT alias text, OUT description text, OUT token text, OUT dictionaries regdictionary[], OUT dictionary regdictionary, OUT lexemes text[])
| setof record | test a configuration | ts_debug('english', 'The Brightest supernovaes') | (asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ... |
ts_lexize(dict regdictionary, token text)
| text[] | test a dictionary | ts_lexize('english_stem', 'stars') | {star} |
ts_parse(parser_name text, document text, OUT tokid integer, OUT token text)
| setof record | test a parser | ts_parse('default', 'foo - bar') | (1,foo) ... |
ts_parse(parser_oid oid, document text, OUT tokid integer, OUT token text) | setof record | test a parser | ts_parse(3722, 'foo - bar') | (1,foo) ... |
ts_token_type(parser_name text, OUT tokid integer, OUT alias text, OUT description text)
| setof record | get token types defined by parser | ts_token_type('default') | (1,asciiword,"Word, all ASCII") ... |
ts_token_type(parser_oid oid, OUT tokid integer, OUT alias text, OUT description text) | setof record | get token types defined by parser | ts_token_type(3722) | (1,asciiword,"Word, all ASCII") ... |
ts_stat(sqlquery text, [ weights text, ] OUT word text, OUT ndoc integer, OUT nentry integer)
| setof record | get statistics of a tsvector column | ts_stat('SELECT vector from apod') | (foo,10,15) ... |
© 1996–2019 The PostgreSQL Global Development Group
Licensed under the PostgreSQL License.
https://www.postgresql.org/docs/12/functions-textsearch.html