

 Amazon Redshift 將不再支援從修補程式 198 開始建立新的 Python UDFs。現有 Python UDF 將繼續正常運作至 2026 年 6 月 30 日。如需詳細資訊，請參閱[部落格文章](https://aws.amazon.com/blogs/big-data/amazon-redshift-python-user-defined-functions-will-reach-end-of-support-after-june-30-2026/)。

本文為英文版的機器翻譯版本，如內容有任何歧義或不一致之處，概以英文版為準。

# 識別具有資料扭曲或未排序資料列的資料表
<a name="identify-tables-with-data-skew-or-unsorted-rows"></a>

下列查詢會識別具有不均資料配送 (資料偏度) 或高比例未排序資料列的資料表。

低的 `skew` 值指出資料表資料已正確配送。如果資料表有 4.00 或更高的 `skew` 值，請考慮修改其資料配送樣式。如需詳細資訊，請參閱[次佳資料分佈](query-performance-improvement-opportunities.md#suboptimal-data-distribution)。

如果資料表的 `pct_unsorted` 值大於 20%，請考慮執行 [VACUUM](r_VACUUM_command.md) 命令。如需詳細資訊，請參閱[未排序或排序錯誤的資料列](query-performance-improvement-opportunities.md#unsorted-or-mis-sorted-rows)。

同時另外檢閱每個資料表的 `mbytes` 和 `pct_of_total` 值。這些資料欄會識別資料表的大小，以及資料表耗用的原始磁碟空間百分比。原始磁碟空間包括 Amazon Redshift 保留供內部使用的空間，因此會大於名目磁碟容量，它是可供使用者使用的磁碟空間容量。使用此資訊來確保您的可用磁碟空間至少是您最大資料表的 2.5 倍。有此可用空間可讓系統在處理複雜查詢時將中繼結果寫入至磁碟。

```
select trim(pgn.nspname) as schema, 
trim(a.name) as table, id as tableid, 
decode(pgc.reldiststyle,0, 'even',1,det.distkey ,8,'all') as distkey, dist_ratio.ratio::decimal(10,4) as skew, 
det.head_sort as "sortkey", 
det.n_sortkeys as "#sks", b.mbytes,  
decode(b.mbytes,0,0,((b.mbytes/part.total::decimal)*100)::decimal(5,2)) as pct_of_total, 
decode(det.max_enc,0,'n','y') as enc, a.rows, 
decode( det.n_sortkeys, 0, null, a.unsorted_rows ) as unsorted_rows , 
decode( det.n_sortkeys, 0, null, decode( a.rows,0,0, (a.unsorted_rows::decimal(32)/a.rows)*100) )::decimal(5,2) as pct_unsorted 
from (select db_id, id, name, sum(rows) as rows, 
sum(rows)-sum(sorted_rows) as unsorted_rows 
from stv_tbl_perm a 
group by db_id, id, name) as a 
join pg_class as pgc on pgc.oid = a.id
join pg_namespace as pgn on pgn.oid = pgc.relnamespace
left outer join (select tbl, count(*) as mbytes 
from stv_blocklist group by tbl) b on a.id=b.tbl
inner join (select attrelid, 
min(case attisdistkey when 't' then attname else null end) as "distkey",
min(case attsortkeyord when 1 then attname  else null end ) as head_sort , 
max(attsortkeyord) as n_sortkeys, 
max(attencodingtype) as max_enc 
from pg_attribute group by 1) as det 
on det.attrelid = a.id
inner join ( select tbl, max(mbytes)::decimal(32)/min(mbytes) as ratio 
from (select tbl, trim(name) as name, slice, count(*) as mbytes
from svv_diskusage group by tbl, name, slice ) 
group by tbl, name ) as dist_ratio on a.id = dist_ratio.tbl
join ( select sum(capacity) as  total
from stv_partitions where part_begin=0 ) as part on 1=1
where mbytes is not null 
order by  mbytes desc;
```