sql - PostgreSQL function confusion

Tags: sql function postgresql

If I write a query like this:

with WordBreakDown (idx, word, wordlength) as (
    select 
        row_number() over () as idx,
        word,
        character_length(word) as wordlength
    from
    unnest(string_to_array('yo momma so fat', ' ')) as word
)
select 
    cast(wbd.idx + (
        select SUM(wbd2.wordlength)
        from WordBreakDown wbd2
        where wbd2.idx <= wbd.idx
        ) - wbd.wordlength as integer) as position,
    cast(wbd.word as character varying(512)) as part
from
    WordBreakDown wbd;  

...I get a four-row table like this:

1;"yo"
4;"momma"
10;"so"
13;"fat"

...which is exactly what I want. However, if I wrap it in a function like this:

drop type if exists split_result cascade;
create type split_result as(
    position integer,
    part character varying(512)
);

drop function if exists split(character varying(512), character(1));    
create function split(
    _s character varying(512), 
    _sep character(1)
    ) returns setof split_result as $$
begin

    return query
    with WordBreakDown (idx, word, wordlength) as (
        select 
            row_number() over () as idx,
            word,
            character_length(word) as wordlength
        from
        unnest(string_to_array(_s, _sep)) as word
    )
    select 
        cast(wbd.idx + (
            select SUM(wbd2.wordlength)
            from WordBreakDown wbd2
            where wbd2.idx <= wbd.idx
            ) - wbd.wordlength as integer) as position,
        cast(wbd.word as character varying(512)) as part
    from
        WordBreakDown wbd;  

end;
$$ language plpgsql;

select * from split('yo momma so fat', ' ');

...I get:

1;"yo momma so fat"

I have been scratching my head over this. What did I mess up?

UPDATE: Following the suggestions below, I have replaced the function with:

CREATE OR REPLACE FUNCTION split(_string character varying(512), _sep character(1))
  RETURNS TABLE (position int, part character varying(512)) AS
$BODY$
BEGIN
    RETURN QUERY
    WITH wbd AS (
        SELECT (row_number() OVER ())::int AS idx
              ,word
              ,length(word) AS wordlength
        FROM   unnest(string_to_array(_string, rpad(_sep, 1))) AS word
        )
    SELECT (sum(wordlength) OVER (ORDER BY idx))::int + idx - wordlength
          ,word::character varying(512) -- AS part
    FROM wbd;  
END;
$BODY$ LANGUAGE plpgsql;

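...which keeps my original function signature for maximum compatibility and captures most of the performance gain. Thanks to everyone who answered; this turned out to be a learning experience on several fronts. Your explanations really helped me understand what was going on.

Best answer

Note this: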

select length(' '::character(1));
 length
--------
      0
(1 row)

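One cause of this confusion is the odd definition of the character type in the SQL standard. From the Postgres documentation for character types:

Values of type character are physically padded with spaces to the specified width n, and are stored and displayed that way. However, the padding spaces are treated as semantically insignificant. Trailing spaces are disregarded when comparing two values of type character, and they will be removed when converting a character value to one of the other string types.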
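
The quoted rules can be checked directly in psql. The statements below are a minimal sketch using only literals (no tables assumed); the column aliases are made up for illustration.

```sql
-- Trailing spaces are disregarded when two character values are compared.
SELECT 'a '::character(2) = 'a'::character(2) AS equal_ignoring_padding;

-- Converting character(1) to text removes the padding, so a lone space
-- turns into the empty string.
SELECT ' '::character(1)::text = '' AS space_becomes_empty;

-- string_to_array takes text arguments. With an empty delimiter the whole
-- input is treated as a single field, which matches the one-row result
-- that the function in the question returned.
SELECT string_to_array('yo momma so fat', ' '::character(1)::text) AS parts;
```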

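Inside the function, _sep is a character(1), so its single space is stripped when it is converted to text for string_to_array, and the delimiter ends up as an empty string. So you should use string_to_array(_s, rpad(_sep,1))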
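
If changing the signature is acceptable, another way to avoid the padding issue is to declare the separator as text, so no character(1) conversion happens at all. The following is only a sketch under that assumption: the function name split_text and its text parameters are hypothetical and do not appear in the original post. The first statement simply checks what rpad does with a character(1) space.

```sql
-- rpad takes text, so the character(1) space first becomes '' and is then
-- padded back to a single space.
SELECT rpad(' '::character(1), 1) = ' ' AS space_restored;

-- Hypothetical variant with text parameters (name and signature are assumptions).
CREATE OR REPLACE FUNCTION split_text(_string text, _sep text)
  RETURNS TABLE ("position" int, part text) AS
$BODY$
BEGIN
    RETURN QUERY
    WITH wbd AS (
        SELECT (row_number() OVER ())::int AS idx   -- 1-based word index
              ,word
              ,length(word) AS wordlength
        FROM   unnest(string_to_array(_string, _sep)) AS word
        )
    -- running total of word lengths, plus one separator per earlier word
    SELECT (sum(wordlength) OVER (ORDER BY idx))::int + idx - wordlength
          ,word
    FROM   wbd;
END;
$BODY$ LANGUAGE plpgsql;

SELECT * FROM split_text('yo momma so fat', ' ');
```

Called this way, the function should return the same four rows as the standalone query in the question, because the separator reaches string_to_array as a one-character text value.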

This question, "sql - PostgreSQL function confusion", comes from Stack Overflow: https://stackoverflow.com/questions/9690960/
