Seperate phone numbers from string in cell - random order

问题: I have a bunch of data that contains a phone number and a birthday as well as other data. {1997-06-28,07742367858} {07791100873,1996-07-14} {30/01/1997,07974335488} {199...

问题:

I have a bunch of data that contains a phone number and a birthday as well as other data.

{1997-06-28,07742367858}
{07791100873,1996-07-14}
{30/01/1997,07974335488}
{1997-04-04,07701003703}
{1996-03-11,07480227283}
{1998-06-20,07713817233}
{1996-09-13,07435148920}
{"21 03 2000",07548542539,1st}
{1996-03-09,07539248008}
{07484642432,1996-03-01}

I am trying to extract the phone number from this, however unsure on how to get this out when the data is not always in the same order.

I would expect to one column that return a phone number, the next which returned a birthday then another which return any arbitrary value in the 3rd column slot.


回答1:

You can try to sort parts of each string by the number of digits they contain. This can be done with the expression:

select length(regexp_replace('1997-06-28', 'D', '', 'g'))

 length 
--------
      8
(1 row)

The query removes curly brackets from strings, splits them by comma, sorts elements by the number of digits and aggregates back to arrays:

with my_data(str) as (
values
    ('{1997-06-28,07742367858}'),
    ('{07791100873,1996-07-14}'),
    ('{30/01/1997,07974335488}'),
    ('{1997-04-04,07701003703}'),
    ('{1996-03-11,07480227283}'),
    ('{1998-06-20,07713817233}'),
    ('{1996-09-13,07435148920}'),
    ('{"21 03 2000",07548542539,1st}'),
    ('{1996-03-09,07539248008}'),
    ('{07484642432,1996-03-01}')
)

select id, array_agg(elem order by length(regexp_replace(elem, 'D', '', 'g')) desc)
from (
    select id, trim(unnest(string_to_array(str, ',')), '"') as elem
    from (
        select trim(str, '{}') as str, row_number() over () as id
        from my_data
        ) s
    ) s
group by id

Result:

 id |           array_agg            
----+--------------------------------
  1 | {07742367858,1997-06-28}
  2 | {07791100873,1996-07-14}
  3 | {07974335488,30/01/1997}
  4 | {07701003703,1997-04-04}
  5 | {07480227283,1996-03-11}
  6 | {07713817233,1998-06-20}
  7 | {07435148920,1996-09-13}
  8 | {07548542539,"21 03 2000",1st}
  9 | {07539248008,1996-03-09}
 10 | {07484642432,1996-03-01}
(10 rows)   

See also this answer Looking for solution to swap position of date format DMY to YMD if you want to normalize dates. You should modify the function:

create or replace function iso_date(text)
returns date language sql immutable as $$
    select case
        when $1 like '__/__/____' then to_date($1, 'DD/MM/YYYY')
        when $1 like '____/__/__' then to_date($1, 'YYYY/MM/DD')
        when $1 like '____-__-__' then to_date($1, 'YYYY-MM-DD')
        when trim($1, '"') like '__ __ ____' then to_date(trim($1, '"'), 'DD MM YYYY')
    end
$$;

and use it:

select id, a[1] as phone, iso_date(a[2]) as birthday, a[3] as comment
from (
    select id, array_agg(elem order by length(regexp_replace(elem, 'D', '', 'g')) desc) as a
    from (
        select id, trim(unnest(string_to_array(str, ',')), '"') as elem
        from (
            select trim(str, '{}') as str, row_number() over () as id
            from my_data
            ) s
        ) s
    group by id
    ) s

 id |    phone    |  birthday  | comment 
----+-------------+------------+---------
  1 | 07742367858 | 1997-06-28 | 
  2 | 07791100873 | 1996-07-14 | 
  3 | 07974335488 | 1997-01-30 | 
  4 | 07701003703 | 1997-04-04 | 
  5 | 07480227283 | 1996-03-11 | 
  6 | 07713817233 | 1998-06-20 | 
  7 | 07435148920 | 1996-09-13 | 
  8 | 07548542539 | 2000-03-21 | 1st
  9 | 07539248008 | 1996-03-09 | 
 10 | 07484642432 | 1996-03-01 | 
(10 rows)
  • 发表于 2019-01-12 02:16
  • 阅读 ( 277 )
  • 分类:网络文章

条评论

请先 登录 后评论
不写代码的码农
小编

篇文章

作家榜 »

  1. 小编 文章
返回顶部
部分文章转自于网络,若有侵权请联系我们删除