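postgresql - Full-text search over complex jsonb content

Tags: postgresql full-text-search jsonb postgresql-9.6

I have a fairly complex jsonb column that contains nested arrays and objects, and I need to run full-text search over it. Example JSON: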

{
"buyer": {
    "email": "[email protected]",
    "person": {
        "phone": "1010001419",
        "taxId": "590202081324",
        "address": "г Москва, ул Авиаторов, д 34 ",
        "lastName": "Зайцева",
        "passport": {
            "issuer": "йцукйцук",
            "deptCode": "123241",
            "issueDate": [
                1111,
                11,
                11
            ],
            "numAndSeries": "0001212810"
        },
        "birthDate": [
            1952,
            2,
            18
        ],
        "firstName": "Зоя",
        "birthPlace": "фывфыв",
        "patronymic": "Антоновна",
        "citizenship": "Россия"
    }
},
"dealNo": "05-0000004",
"created": [
    2017,
    3,
    6
],
"services": [
    "SGR"
],
"transactId": "602032128",
"dealDetails": {
    "secondary": {
        "deposit": 200000,
        "sellers": [
            {
                "bank": {
                    "bic": "044525225",
                    "city": "Москва",
                    "name": "ПУБЛИЧНОЕ АКЦИОНЕРНОЕ ОБЩЕСТВО \"СБЕРБАНК РОССИИ\"",
                    "correspondentAccount": "30101810400000000225"
                },
                "email": "[email protected]",
                "amount": 4800000,
                "person": {
                    "phone": "1234132512",
                    "taxId": "590202081324",
                    "address": "г Москва, ул Марьинский Парк, д 45 стр 1 ",
                    "lastName": "Трутненко",
                    "passport": {
                        "issuer": "",
                        "deptCode": "",
                        "issueDate": [
                            -999999999,
                            1,
                            1
                        ],
                        "numAndSeries": ""
                    },
                    "birthDate": [
                        1111,
                        11,
                        11
                    ],
                    "firstName": "ываыаы",
                    "birthPlace": "фывфыв",
                    "patronymic": null,
                    "citizenship": "Россия"
                },
                "account": "48213412341234234234"
            }
        ],
        "propertyAddress": "г Москва, ул Вавилова, д 19 "
    }
},
"bankContacts": {
    "bankOfficeId": 3561,
    "mortgageManager": {
        "casId": 88928,
        "email": "[email protected]",
        "phone": "79853622342",
        "lastName": "Дзержински",
        "firstName": "Макар",
        "patronymic": "Олегович"
    },
    "mortgageDeptHead": {
        "casId": 88923,
        "email": "[email protected]",
        "phone": "72384798798",
        "lastName": "Михрюткин",
        "firstName": "Валентин",
        "patronymic": "Геннадьевич"
    }
},
"contractInfo": {
    "city": "Москва",
    "price": 5000000,
    "cadastralNum": "65:65:76876:876",
    "contractDate": [
        1111,
        11,
        11
    ]
},
"creditContract": {
    "number": "41221312",
    "ownCapital": 1000000,
    "loanCapital": 4000000
}

}

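Specifically, I need to search on deal_no, buyer.person.phone, buyer.person.address.** (all text values here), dealDetails.secondary.sellers[].(all text values here), and bankContacts.(all text values here). What is the best way to do this?

I am using PostgreSQL 9.6.

Best answer

Here is what a very similar task of mine looked like.

The database table looks like this: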

 CREATE TABLE sites (
   id text NOT NULL,
   doc jsonb,
   PRIMARY KEY (id)
 );

The data we store in the doc column is complex, nested JSONB:

   {
      "_id": "123",
      "type": "Site",
      "identification": "Custom ID",
      "title": "SITE 1",
      "address": "UK, London, Mr Tom's street, 2",
      "buildings": [
          {
               "uuid": "12312",
               "identification": "Custom ID",
               "name": "BUILDING 1",
               "deposits": [
                   {
                      "uuid": "12312",
                      "identification": "Custom ID",             
                      "audits": [
                          {
                             "uuid": "12312",         
                              "sample_id": "SAMPLE ID"                
                          }
                       ]
                   }
               ]
          } 
       ]
    }

So the structure of my JSONB looks like this:

SITE 
  -> ARRAY OF BUILDINGS
     -> ARRAY OF DEPOSITS
       -> ARRAY OF AUDITS
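
One way to walk this hierarchy in SQL is to unnest each array level with jsonb_array_elements in a lateral join. The following is only a minimal sketch, not the query from the original answer. The inline sample row, the CTE name sample_sites, and the aliases building, deposit, and audit are made up for illustration, and it assumes that buildings, deposits, and audits are each either a JSON array or absent.

```sql
-- Sample row inlined through a CTE so the query runs on its own;
-- in practice you would select from the sites table instead.
WITH sample_sites(id, doc) AS (
  VALUES (
    '123'::text,
    '{"title": "SITE 1",
      "buildings": [
        {"name": "BUILDING 1",
         "deposits": [
           {"identification": "Custom ID",
            "audits": [{"sample_id": "SAMPLE ID"}]}
         ]}
      ]}'::jsonb
  )
)
SELECT s.id,
       s.doc ->> 'title'              AS site_title,
       b.building ->> 'name'          AS building_name,
       d.deposit ->> 'identification' AS deposit_identification,
       a.audit ->> 'sample_id'        AS audit_sample_id
FROM sample_sites AS s
-- LEFT JOIN ... ON true keeps a parent row even when it has no children
-- at the next level (a missing key yields NULL and an empty set).
LEFT JOIN LATERAL jsonb_array_elements(s.doc -> 'buildings') AS b(building) ON true
LEFT JOIN LATERAL jsonb_array_elements(b.building -> 'deposits') AS d(deposit) ON true
LEFT JOIN LATERAL jsonb_array_elements(d.deposit -> 'audits') AS a(audit) ON true;
```

Each result row pairs one site with one building, one deposit, and one audit, so the nested values become ordinary columns that can be concatenated or aggregated.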

We need to implement full-text search over certain values in each entry type:

SITE (identification, title, address)
BUILDING (identification, name)
DEPOSIT (identification)
AUDIT (sample_id)

The SQL query should run the full-text search only over these field values.
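
A minimal sketch of one way to restrict the search to those fields on PostgreSQL 9.6, where to_tsvector does not yet accept jsonb directly: collect the selected values into one text string, then index and query to_tsvector over that string. The function name site_search_text, the index name sites_search_idx, the 'simple' text search configuration, and the search phrase are all assumptions chosen for illustration. The sketch uses the sites table defined above and assumes the nested keys are JSON arrays or absent.

```sql
-- Hypothetical helper: concatenates only the searchable fields of one site.
-- It uses || and coalesce rather than concat_ws so that every call inside
-- is immutable, which an expression index requires.
CREATE OR REPLACE FUNCTION site_search_text(doc jsonb) RETURNS text AS $$
  SELECT coalesce(doc ->> 'identification', '') || ' ' ||
         coalesce(doc ->> 'title', '')          || ' ' ||
         coalesce(doc ->> 'address', '')        || ' ' ||
         coalesce((
           -- One row per building/deposit/audit combination.
           SELECT string_agg(
                    coalesce(b.elem ->> 'identification', '') || ' ' ||
                    coalesce(b.elem ->> 'name', '')           || ' ' ||
                    coalesce(d.elem ->> 'identification', '') || ' ' ||
                    coalesce(a.elem ->> 'sample_id', ''),
                    ' ')
           FROM jsonb_array_elements(doc -> 'buildings') AS b(elem)
           LEFT JOIN LATERAL jsonb_array_elements(b.elem -> 'deposits') AS d(elem) ON true
           LEFT JOIN LATERAL jsonb_array_elements(d.elem -> 'audits') AS a(elem) ON true
         ), '')
$$ LANGUAGE sql IMMUTABLE;

-- Expression index so the tsvector is not recomputed for every row per query.
CREATE INDEX sites_search_idx ON sites
  USING gin (to_tsvector('simple', site_search_text(doc)));

-- The WHERE expression must match the indexed expression for the index to apply.
SELECT id
FROM sites
WHERE to_tsvector('simple', site_search_text(doc))
      @@ plainto_tsquery('simple', 'sample id');
```

Building and deposit values are repeated once per nested row in the aggregated string, which does not affect whether a document matches. Because the index stores the function's output, changing the function body later means the index has to be rebuilt with REINDEX.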

Regarding postgresql - full-text search over complex jsonb content, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44794073/
