Thursday, December 19, 2013

Interpreting Skewness

What does the Skew metric mean?

Overview

You can see the word "Skewness" or "Skew factor" in a lot of places regarding Teradata: documents, applications, etc. Skewed table, skewed CPU. It signals that something is wrong, but what does it mean exactly? How should it be interpreted?

Let's go through some explanation and a bit of simple maths.

Teradata is a massively parallel system, in which uniform units (AMPs) perform the same tasks, each on the portion of data it is responsible for. In an ideal world all AMPs share the work equally, and none has to work more than the average. Reality is colder: this equality (called "even distribution") rarely exists.
Uneven distribution obviously makes the use of the parallel infrastructure inefficient.

But how bad is the situation? That is exactly what Skewness characterizes.

Definitions

Let "RESOURCE" mean the amount of a resource (CPU, I/O, PERM space) consumed by an AMP. AVG, MAX and SUM are taken over all AMPs.
Let AMPno be the number of AMPs in the Teradata system.

Skew factor := 100 - ( AVG ( "RESOURCE" ) / NULLIFZERO ( MAX ("RESOURCE") ) * 100 )

Total[Resource] := SUM("RESOURCE")

Impact[Resource] := MAX("RESOURCE") * AMPno

Parallel Efficiency := Total[Resource] / Impact[Resource] * 100

or, since AVG("RESOURCE") = SUM("RESOURCE") / AMPno, after some transformation:

Parallel Efficiency := 100 - Skew factor
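
The definitions above can be checked numerically. The following is only a minimal Python sketch, not Teradata code: the function name skew_metrics and the sample per-AMP values are made up for illustration, and the NULLIFZERO guard is imitated by returning None when the maximum is zero.

```python
from typing import Optional, Sequence


def skew_metrics(per_amp: Sequence[float]) -> Optional[dict]:
    """Compute the metrics defined above from per-AMP resource values.

    Returns None for an empty input or when MAX is zero, imitating the
    NULL that NULLIFZERO would produce in the formula.
    """
    if not per_amp or max(per_amp) == 0:
        return None
    amp_no = len(per_amp)    # AMPno
    peak = max(per_amp)      # MAX("RESOURCE")
    total = sum(per_amp)     # Total[Resource]
    impact = peak * amp_no   # Impact[Resource]
    skew_factor = 100 - (total / amp_no) / peak * 100
    parallel_efficiency = total / impact * 100
    return {
        "total": total,
        "impact": impact,
        "skew_factor": skew_factor,
        "parallel_efficiency": parallel_efficiency,
    }


# Hypothetical CPU seconds on a 4-AMP system; one AMP does most of the work.
m = skew_metrics([10, 10, 10, 40])
print(m["skew_factor"])          # 56.25
print(m["parallel_efficiency"])  # 43.75
```

For this sample the two results add up to 100, as the transformed formula says.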

Analysis

Codomain

0 <= "Skew factor" < 100

"Total[Resource]" <= "Impact[Resource]"

0 < "Parallel Efficiency" <= 100
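
The two extremes of these ranges can be illustrated with a small Python sketch. It assumes that resource values are non-negative and that at least one AMP consumed something; the 8-AMP system and the values are hypothetical. In the worst case, when a single AMP does all the work, Total[Resource] equals MAX("RESOURCE"), so Parallel Efficiency bottoms out at 100 / AMPno rather than at zero.

```python
def parallel_efficiency(per_amp):
    """Total[Resource] / Impact[Resource] * 100 for per-AMP values.

    Assumes a non-empty list with a positive maximum (otherwise the
    division below would fail).
    """
    peak = max(per_amp)
    return sum(per_amp) / (peak * len(per_amp)) * 100


amp_no = 8  # hypothetical system size

even = [5.0] * amp_no                  # perfectly even distribution
worst = [5.0] + [0.0] * (amp_no - 1)   # one AMP does everything

print(parallel_efficiency(even))   # 100.0 -> skew factor 0
print(parallel_efficiency(worst))  # 12.5  -> skew factor 87.5, i.e. 100 - 100 / amp_no
```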

Meaning

Skew factor: this percentage of the real resources consumed is wasted.
E.g. a 1 GB table with a skew factor of 75 will allocate 4 GB*

Total[Resource]: virtual resource consumption, the simple sum of the individual resource consumptions, measured on the AMPs as if they were independent systems

Impact[Resource]: real resource consumption, as it impacts the parallel infrastructure

Parallel Efficiency: as the name says. E.g. Skew = 80 means 20% efficiency

* Theoretically, another workload with complementary resource allocation (one that consumes less on exactly those AMPs where my load has its excess) could compensate for the parallel inefficiency from the system's point of view, but the probability of this tends to zero.
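
The 1 GB example can be reproduced by rearranging the Parallel Efficiency formula to Impact[Resource] = Total[Resource] / (Parallel Efficiency / 100). The Python sketch below does only this arithmetic; the function name impact_from_total is invented for illustration.

```python
def impact_from_total(total: float, skew_factor: float) -> float:
    """Return Impact[Resource] for a given Total[Resource] and skew factor."""
    if not 0 <= skew_factor < 100:
        raise ValueError("skew factor must be in the range [0, 100)")
    parallel_efficiency = 100 - skew_factor
    return total / (parallel_efficiency / 100)


total_gb = 1.0                                # table size summed over all AMPs
impact_gb = impact_from_total(total_gb, 75)   # space effectively reserved
wasted_pct = (impact_gb - total_gb) / impact_gb * 100

print(impact_gb)   # 4.0
print(wasted_pct)  # 75.0, the skew factor again
```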

Illustration



Skew := Yellow / (Yellow + Green) * 100 [percent]


The "Average" level indicates the arithmetic mean of the AMP-level resource consumptions (it corresponds to Total[Resource]), while "Peak" is the maximum of the AMP-level resource consumptions: the real consumption from the "parallel system view" (it corresponds to Impact[Resource]).

I will write a post on finding skewed tables later.
PRISE Tuning Assistant helps you to find queries using CPU or I/O and helps to get rid of skewness.


2 comments:

  1. Hi Akos,

    Thanks for all your efforts.

    I am confused about
    0<"Parallel Efficiency"<=100

    Why can parallel efficiency not be zero?

    Thanks,
    Niraj

  2. Hi Niraj,

    Sorry for the late answer.
    If you analyze the formula of parallel efficiency, it can only be zero if Total[Resource] is zero. Otherwise it can be arbitrarily low, but always above zero.

    br,
    Ákos
