问题:
I have a Postgres table with products data from different manufacturers, here the simplified table structure:
CREATE TABLE test_table (
sku text,
manufa...
可以将文章内容翻译成中文,广告屏蔽插件会导致该功能失效:
问题:
I have a Postgres table with products data from different manufacturers, here the simplified table structure:
CREATE TABLE test_table (
sku text,
manufacturer_name text,
price double precision,
stock int
);
INSERT INTO test_table
VALUES ('sku1', 'Manufacturer1', 110.00, 22),
('sku1', 'Manufacturer2', 120.00, 15),
('sku1', 'Manufacturer3', 130.00, 1),
('sku1', 'Manufacturer3', 30.00, 11),
('sku2', 'Manufacturer1', 10.00, 2),
('sku2', 'Manufacturer2', 9.00, 3),
('sku3', 'Manufacturer2', 21.00, 3),
('sku3', 'Manufacturer2', 1.00, 7),
('sku3', 'Manufacturer3', 19.00, 5);
I need to output each Manufacturer for each sku but if there are several identical manufacturers for the same sku I need to select the Manufacturer with the lowest price (note that I also need to include 'stock' column), here desired results:
| sku | man1_price | man1_stock | man2_price | man2_stock | man3_price | man3_stock |
|------|------------|------------|------------|------------|------------|------------|
| sku1 | 110.0 | 22 | 120.0 | 15 | 30.0 | 11 |
| sku2 | 10.0 | 2 | 9.0 | 3 | | |
| sku3 | | | 1.0 | 7 | 19.0 | 5 |
I tried to use Postgres crosstab()
:
SELECT *
FROM crosstab('SELECT sku, manufacturer_name, price
FROM test_table
ORDER BY 1,2',
$$ SELECT DISTINCT manufacturer_name FROM test_table ORDER BY 1 $$
)
AS ct (sku text, "man1_price" double precision,
"man2_price" double precision,
"man3_price" double precision
);
But this produces a table with only one price
column. And I didn't find a way to include the stock
column.
I also tried to use conditional aggregation:
SELECT sku,
MIN(CASE WHEN manufacturer_name = 'Manufacturer1' THEN price END) as man1_price,
MIN(CASE WHEN manufacturer_name = 'Manufacturer1' THEN stock END) as man1_stock,
MIN(CASE WHEN manufacturer_name = 'Manufacturer2' THEN price END) as man2_price,
MIN(CASE WHEN manufacturer_name = 'Manufacturer2' THEN stock END) as man2_stock,
MIN(CASE WHEN manufacturer_name = 'Manufacturer3' THEN price END) as man3_price,
MIN(CASE WHEN manufacturer_name = 'Manufacturer3' THEN stock END) as man3_stock
FROM test_table
GROUP BY sku
ORDER BY sku
And this query also doesn't work in my case - it simply selects min stock level - but if there are few the same Manufacturers for the same sku but with different prices/stocks - this query select min price from one manufacturer and min stock from another.
How can I output each manufacturer's price
and corresponding stock
from that table?
P.S. Thank you all for such helpful answers. My Postgres table is rather small - there no more than 15k of products, (I don't know if such numbers can be useful for proper comparing) but since Erwin Brandstetter asked to compare different queries performance I ran 3 queries with EXPLAIN ANALYZE
, here is their execution time:
Erwin Brandstetter query: 400 - 450 ms
Kjetil S query: 250 - 300 ms
Gordon Linoff query: 200 - 250 ms
a_horse_with_no_name query: 250 - 300 ms
Again - I'm not sure if those numbers can be useful as a reference. For my case, I chose the combined version of Kjetil S
and Gordon Linoff
queries but Erwin Brandstetter
and a_horse_with_no_name
variants are also very useful and interesting. It's worth noting that if my table in the future would end up having more then few Manufacturers - adjusting query and typing their names each time would be tiresome - and hence the query from a_horse_with_no_name
answer would be the most convenient to use.
回答1:
Your last select almost works. But you should add a where condition where rows with non-minimum prices per sku per manufacturer are removed. This produces your expected result:
select
sku,
min( case when manufacturer_name='Manufacturer1' then price end ) man1_price,
min( case when manufacturer_name='Manufacturer1' then stock end ) man1_stock,
min( case when manufacturer_name='Manufacturer2' then price end ) man2_price,
min( case when manufacturer_name='Manufacturer2' then stock end ) man2_stock,
min( case when manufacturer_name='Manufacturer3' then price end ) man3_price,
min( case when manufacturer_name='Manufacturer3' then stock end ) man3_stock
from test_table t
where not exists (
select 1 from test_table
where sku=t.sku
and manufacturer_name=t.manufacturer_name
and price<t.price
)
group by sku
order by 1;
回答2:
I find using a JSON result much easier these days then using a complicated pivot. Producing a single aggregated JSON value doesn't break the inherent restriction of SQL that the number of columns must be known before the query is executed (and must be the same for all rows).
You could using something like this:
select sku,
jsonb_object_agg(manufacturer_name,
jsonb_build_object('price', price, 'stock', stock, 'isMinPrice', price = min_price)) as price_info
from (
select sku,
manufacturer_name,
price,
min(price) over (partition by sku) as min_price,
stock
from test_table
) t
group by sku;
The above returns the following result using your sample data:
sku | price_info
-----+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
sku1 | {"Manufacturer1": {"price": 110, "stock": 22, "isMinPrice": false}, "Manufacturer2": {"price": 120, "stock": 15, "isMinPrice": false}, "Manufacturer3": {"price": 30, "stock": 11, "isMinPrice": true}}
sku2 | {"Manufacturer1": {"price": 10, "stock": 2, "isMinPrice": false}, "Manufacturer2": {"price": 9, "stock": 3, "isMinPrice": true}}
sku3 | {"Manufacturer2": {"price": 1, "stock": 7, "isMinPrice": true}, "Manufacturer3": {"price": 19, "stock": 5, "isMinPrice": false}}
回答3:
I would use distinct on
to limit the data to one manufacturer to one price. And I like the filter
functionality in Postgres. So:
select sku,
max(price) filter (where manufacturer_name = 'Manufacturer1') as man1_price,
max(stock) filter (where manufacturer_name = 'Manufacturer1') as man1_stock,
max(price) filter (where manufacturer_name = 'Manufacturer2') as man2_price,
max(stock) filter (where manufacturer_name = 'Manufacturer2') as man2_stock,
max(price) filter (where manufacturer_name = 'Manufacturer3') as man3_price,
max(stock) filter (where manufacturer_name = 'Manufacturer3') as man3_stock
from (select distinct on (manufacturer_name, sku) t.*
from test_table t
order by manufacturer_name, sku, price
) t
group by sku
order by sku;
回答4:
crosstab()
must provide a static column definition list. Your 2nd parameter:
$$ SELECT DISTINCT manufacturer_name FROM test_table ORDER BY 1 $$
... provides a dynamic list of values that would require a dynamic column definition list. That's not going to work - except by incidence.
The core problem of your task is that crosstab()
expects a single value column from the query in its first parameter. But you want to process two value columns per row (price
and stock
).
One way around this is to pack multiple values in a composite type and extract values in the outer SELECT
.
Create a composite type once:
CREATE TYPE price_stock AS (price float8, stock int);
A temporary table or view also serves the purpose.
Then:
SELECT sku
, (man1).price, (man1).stock
, (man2).price, (man2).stock
, (man3).price, (man3).stock
FROM crosstab(
'SELECT sku, manufacturer_name, (price, stock)::price_stock
FROM test_table
ORDER BY 1,2'
, $$VALUES ('Manufacturer1'),('Manufacturer2'),('Manufacturer3')$$
)
AS ct (sku text
, man1 price_stock
, man2 price_stock
, man3 price_stock
);
For a quick test, or if the row of your underlying table isn't too wide, you can also just use its row type, without creating a custom type:
SELECT sku
, (man1).price, (man1).stock
, (man2).price, (man2).stock
, (man3).price, (man3).stock
FROM crosstab(
'SELECT sku, manufacturer_name, t
FROM test_table t
ORDER BY 1,2'
, $$VALUES ('Manufacturer1'),('Manufacturer2'),('Manufacturer3')$$
)
AS ct (sku text
, man1 test_table
, man2 test_table
, man3 test_table
);
db<>fiddle here
Related: