Saturday, April 26, 2014

Display MSSQL PDW table rows by node, distribution, partition and COLUMNSTORE state


DBCC PDW_SHOWSPACEUSED () - Displays the number of rows, disk space reserved, and disk space used for a table in a SQL Server PDW database, or displays the disk space reserved and used for an entire SQL Server PDW database.

Advantages
- Useful to check data skew by distribution and node.

Disadvantages
- Not able to display data that sit on delta store, causing confusion that there are no record in the table.
- Not able to further filter.
- Multiple execution require for different table.

DBCC PDW_SHOWPARTITIONSTATS() - Displays the size and number of rows for each partition of a table in a SQL Server PDW database.

Advantages
- Display all row count regardless the data sit on delta store or column store.

Disadvantages
- Does not indicate how many rows of record sit on delta store.
- Not able to further filter.
- Multiple execution require for different table.

Query
SELECT
       Table_Name = OBJECT_NAME(pdw_table_mappings.object_id)
,      Table_Distribution = CASE WHEN pdw_table_distribution_properties.distribution_policy_desc = 'HASH' THEN 'DISTRIBUTE' ELSE pdw_table_distribution_properties.distribution_policy_desc END
,      Index_Type = indexes.type_desc
,      PDW_Node = pdw_nodes_partitions.pdw_node_id
,      Distribution = pdw_distributions.distribution_id
,      Partition_Number = pdw_nodes_partitions.partition_number
,      Partition_Position = CASE WHEN partition_functions.boundary_value_on_right = 1 THEN 'RIGHT' WHEN partition_functions.boundary_value_on_right = 0 THEN 'LEFT' END
,      Partition_Value = partition_range_values.value
,      Rows = CASE WHEN indexes.type IN (0,1) THEN
                           pdw_nodes_partitions.rows
                     ELSE
                           pdw_nodes_column_store_row_groups.total_rows
                     END
,      ColumnStore_State = pdw_nodes_column_store_row_groups.state_description

FROM   sys.pdw_table_mappings

              LEFT JOIN sys.indexes
              ON indexes.object_id = pdw_table_mappings.object_id
              AND indexes.type <> 2

              LEFT JOIN sys.pdw_table_distribution_properties
              ON pdw_table_distribution_properties.object_id = pdw_table_mappings.object_id
                          
              LEFT JOIN sys.pdw_nodes_tables
              ON pdw_nodes_tables.name = pdw_table_mappings.physical_name
             
              LEFT JOIN sys.pdw_distributions
              ON ((pdw_table_distribution_properties.distribution_policy = 2 AND pdw_distributions.pdw_node_id = pdw_nodes_tables.pdw_node_id) OR 1 = 2)
              AND ((pdw_table_distribution_properties.distribution_policy = 2 AND pdw_distributions.name = RIGHT(pdw_nodes_tables.name, 1)) OR 1 = 2)

              LEFT JOIN (   SELECT object_id, index_id, partition_number, pdw_node_id, rows = sum(rows)
                                  FROM sys.pdw_nodes_partitions
                                  GROUP BY object_id, index_id, partition_number, pdw_node_id)pdw_nodes_partitions
              ON pdw_nodes_partitions.object_id = pdw_nodes_tables.object_id
              AND pdw_nodes_partitions.pdw_node_id = pdw_nodes_tables.pdw_node_id
              AND pdw_nodes_partitions.index_id = indexes.index_id
             
              LEFT JOIN sys.partitions
              ON partitions.object_id = indexes.object_id
              AND partitions.index_id = indexes.index_id
              AND partitions.partition_number = pdw_nodes_partitions.partition_number
             
              LEFT JOIN sys.partition_schemes
              ON partition_schemes.data_space_id = indexes.data_space_id

              LEFT JOIN sys.partition_functions
              ON partition_functions.function_id = partition_schemes.function_id

              LEFT JOIN sys.partition_range_values
              ON partition_range_values.function_id = partition_functions.function_id
              AND COALESCE(partition_range_values.boundary_id + 1, 1) = partitions.partition_number   
                    
              LEFT JOIN (   SELECT object_id, index_id, partition_number, state_description, pdw_node_id, total_rows = SUM(total_rows)
                                  FROM sys.pdw_nodes_column_store_row_groups
                                  WHERE total_rows IS NOT NULL
                                  GROUP BY object_id, index_id, partition_number, state_description, pdw_node_id)pdw_nodes_column_store_row_groups
              ON pdw_nodes_column_store_row_groups.object_id = pdw_nodes_partitions.object_id
              AND pdw_nodes_column_store_row_groups.index_id = pdw_nodes_partitions.index_id
              AND pdw_nodes_column_store_row_groups.partition_number = pdw_nodes_partitions.partition_number
              AND pdw_nodes_column_store_row_groups.pdw_node_id = pdw_nodes_partitions.pdw_node_id

Output



No comments:

Post a Comment