HEVC – What are CTU, CU, CTB, CB, PB, and TB?

HEVC, also known as H.265 or MPEG-H Part 2 (ISO/IEC 23008-2), is just around the corner. A good overview has been published, the reference implementation is available, and active discussion can be followed in the JCT-VC document management system. The Final Draft International Standard (FDIS) is expected to be produced in January 2013.

However, as always, the standard writers love cryptic acronyms. Probably the first ones to discourage readers of the standard are the block-structure terms: CTU, CU, CTB, CB, PB, and TB.

They are basically replacements for the macroblocks and blocks of prior standards. Unlike 10 years ago, we have much larger frame sizes to deal with. 4K production has become practical and people have started talking about 8K. Even mobile devices have frame sizes larger than HD, such as 2048×1536. We need blocks larger than the traditional macroblock to encode motion vectors efficiently at these frame sizes. On the other hand, small details are still important, and we sometimes want to perform prediction and transformation at a granularity of 4×4.

How can we support a wide variety of block sizes in an efficient manner? That’s the challenge HEVC is trying to solve with those acronyms.

Let’s start from the top level. Suppose we have a picture to encode. HEVC divides the picture into CTUs (Coding Tree Units).

The width and height of the CTU are signaled in the sequence parameter set, meaning that all the CTUs in a video sequence have the same size: 64×64, 32×32, or 16×16.
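To get a feel for the numbers, here is a minimal sketch of how many CTUs are needed to cover a picture. The function name `ctu_grid` is made up for illustration, and the sketch simply assumes that CTUs in the last row or column may hang over the picture edge when the frame size is not a multiple of the CTU size.

```python
def ctu_grid(pic_width, pic_height, ctu_size):
    """Return (columns, rows) of CTUs needed to cover the picture.

    Hypothetical helper for illustration; not part of any HEVC API.
    """
    if ctu_size not in (16, 32, 64):
        raise ValueError("CTU size must be 16, 32, or 64")
    # Integer ceiling division: partial CTUs at the edges still count.
    cols = (pic_width + ctu_size - 1) // ctu_size
    rows = (pic_height + ctu_size - 1) // ctu_size
    return cols, rows


cols, rows = ctu_grid(1920, 1080, 64)
print(cols, rows, cols * rows)  # 30 17 510
```

With 64×64 CTUs, a 1080p picture needs 17 rows because 1080 is not a multiple of 64, so the bottom row is only partially inside the picture.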

We need to understand an important naming convention here. In the HEVC standard, if something is called an xxxUnit, it is a logical coding unit which is in turn encoded into the HEVC bitstream. On the other hand, if something is called an xxxBlock, it is a portion of the video frame buffer that a process operates on.

A CTU (Coding Tree Unit) is therefore a logical unit. It usually consists of three blocks, namely one luma block (Y) and two chroma blocks (Cb and Cr), together with associated syntax elements. Each block is called a CTB (Coding Tree Block).
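The Unit/Block distinction can be pictured as a small data structure. This is only a sketch of the naming convention: the class names, fields, and the `make_ctu` helper are invented for illustration and do not come from the standard or the reference software.

```python
from dataclasses import dataclass, field


@dataclass
class CodingTreeBlock:
    """A region of ONE sample array (one colour component)."""
    component: str  # "Y", "Cb", or "Cr"


@dataclass
class CodingTreeUnit:
    """Logical unit: the blocks plus the syntax elements that describe them."""
    x0: int  # top-left position of the CTU in the picture (luma samples)
    y0: int
    blocks: list
    syntax_elements: dict = field(default_factory=dict)


def make_ctu(x0, y0, monochrome=False):
    # Three blocks for a picture with three sample arrays, one if monochrome.
    components = ["Y"] if monochrome else ["Y", "Cb", "Cr"]
    return CodingTreeUnit(x0, y0, [CodingTreeBlock(c) for c in components])


ctu = make_ctu(0, 0)
print([b.component for b in ctu.blocks])  # ['Y', 'Cb', 'Cr']
```

The same pattern repeats one level down: a CU groups CBs with their syntax elements in the same way a CTU groups CTBs.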

The luma CTB has the same size as the CTU: 64×64, 32×32, or 16×16 (with chroma subsampling such as 4:2:0, the chroma CTBs are correspondingly smaller, e.g. 32×32 for a 64×64 CTU). Depending on the part of the video frame, however, a CTB may be too big a unit for deciding whether to perform inter-picture or intra-picture prediction. Thus, each CTB can be split differently into multiple CBs (Coding Blocks), and each CB becomes the decision point for the prediction type. For example, some CTBs are split into 16×16 CBs while others are split into 8×8 CBs. HEVC supports luma CB sizes all the way from the size of the CTB down to 8×8.

The following picture illustrates how a 64×64 CTB can be split into CBs.
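In code, such a split can be sketched as a recursive quadtree: a block is either kept as one CB or divided into four equal quadrants, down to a minimum of 8×8. The function names and the `should_split` callback are assumptions for illustration; a real encoder would make this decision by comparing coding costs, which is not modelled here.

```python
def split_ctb(x, y, size, should_split, min_cb_size=8):
    """Recursively split a square block; return a list of (x, y, size) CBs."""
    if size > min_cb_size and should_split(x, y, size):
        half = size // 2
        cbs = []
        # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                cbs.extend(split_ctb(x + dx, y + dy, half, should_split, min_cb_size))
        return cbs
    return [(x, y, size)]


# Hypothetical decision rule: keep splitting only the block at the top-left corner.
def split_top_left(x, y, size):
    return x == 0 and y == 0


cbs = split_ctb(0, 0, 64, split_top_left)
print(len(cbs))  # 10
for cb in cbs:
    print(cb)
```

With this toy rule, the 64×64 CTB ends up as three 32×32 CBs, three 16×16 CBs, and four 8×8 CBs in the top-left corner, which together still cover exactly 64×64 samples.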

The CB is the decision point for whether to perform inter-picture or intra-picture prediction. More precisely, the prediction type is coded in the CU (Coding Unit). A CU consists of three CBs (Y, Cb, and Cr) and associated syntax elements.

A CB is good enough for the prediction type decision, but it could still be too large to store motion vectors (inter prediction) or an intra prediction mode. For example, a very small object like a snowflake may be moving in the middle of an 8×8 CB, and we want to use different MVs for different portions of the CB.

Snowfall

Thus, the PB (Prediction Block) was introduced. Each CB can be split into PBs differently depending on its temporal and/or spatial predictability.
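As a rough sketch, the ways a CB can be divided into PBs can be written as a lookup from partition mode to rectangles. The mode names (2Nx2N, 2NxN, Nx2N, NxN, and the asymmetric 2NxnU, 2NxnD, nLx2N, nRx2N) follow HEVC's partition modes, but the function itself is made up for illustration and ignores the standard's restrictions on which modes are allowed for which CB sizes and prediction types.

```python
def split_cb_into_pbs(size, mode):
    """Return the PBs of one CB as (x, y, width, height), relative to the CB."""
    half, quarter = size // 2, size // 4
    layouts = {
        # Symmetric modes
        "2Nx2N": [(0, 0, size, size)],
        "2NxN": [(0, 0, size, half), (0, half, size, half)],
        "Nx2N": [(0, 0, half, size), (half, 0, half, size)],
        "NxN": [(x, y, half, half) for y in (0, half) for x in (0, half)],
        # Asymmetric modes: one quarter-sized strip plus the remainder
        "2NxnU": [(0, 0, size, quarter), (0, quarter, size, size - quarter)],
        "2NxnD": [(0, 0, size, size - quarter), (0, size - quarter, size, quarter)],
        "nLx2N": [(0, 0, quarter, size), (quarter, 0, size - quarter, size)],
        "nRx2N": [(0, 0, size - quarter, size), (size - quarter, 0, quarter, size)],
    }
    return layouts[mode]


print(split_cb_into_pbs(16, "2NxN"))   # [(0, 0, 16, 8), (0, 8, 16, 8)]
print(split_cb_into_pbs(16, "2NxnU"))  # [(0, 0, 16, 4), (0, 4, 16, 12)]
```

Each PB then carries its own motion vectors or intra prediction mode, which is what lets a small moving object inside a CB get a different MV from its background.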

Once the prediction is made, we need to code the residual (the difference between the predicted image and the actual image) with a DCT-like transform. Again, a CB could be too big for this because it may contain both a detailed part (high frequency) and a flat part (low frequency). Therefore, each CB can be split differently into TBs (Transform Blocks). Note that a TB doesn’t have to be aligned with a PB. It is possible, and often makes sense, to perform a single transform across residuals from multiple PBs, and vice versa.
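The independence of the TB and PB splits can be checked with a few rectangles. The sketch below uses made-up helper names and a 16×16 CB as an assumed example; it only tests which PBs each TB overlaps and performs no actual transform.

```python
def overlaps(a, b):
    """True if rectangles a and b, given as (x, y, width, height), share samples."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def pbs_covered_by_tb(tb, pbs):
    """Return the indices of the PBs whose residual a given TB touches."""
    return [i for i, pb in enumerate(pbs) if overlaps(tb, pb)]


# A 16x16 CB predicted as two 16x8 PBs (top and bottom halves).
pbs = [(0, 0, 16, 8), (0, 8, 16, 8)]

# Case 1: one 16x16 TB transforms the residual of both PBs at once.
print(pbs_covered_by_tb((0, 0, 16, 16), pbs))  # [0, 1]

# Case 2: four 8x8 TBs, so each PB's residual is covered by two TBs.
tbs = [(x, y, 8, 8) for y in (0, 8) for x in (0, 8)]
print([pbs_covered_by_tb(tb, pbs) for tb in tbs])  # [[0], [0], [1], [1]]
```

In the first case a single transform spans multiple PBs; in the second a single PB spans multiple transforms. Both are consistent with the CB being the common parent of the two splits.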

Let’s read the draft standard text regarding these terms. It should make more sense now.

CTU (coding tree unit): A coding tree block of luma samples, two corresponding coding tree blocks of chroma samples of a picture that has three sample arrays, or a coding tree block of samples of a monochrome picture or a picture that is coded using three separate colour planes and syntax structures used to code the samples. The division of a slice into coding tree units is a partitioning.

CTB (coding tree block): An NxN block of samples for some value of N. The division of one of the arrays that compose a picture that has three sample arrays or of the array that compose a picture in monochrome format or a picture that is coded using three separate colour planes into coding tree blocks is a partitioning.

CB (coding block): An NxN block of samples for some value of N. The division of a coding tree block into coding blocks is a partitioning.

CU (coding unit): A coding block of luma samples, two corresponding coding blocks of chroma samples of a picture that has three sample arrays, or a coding block of samples of a monochrome picture or a picture that is coded using three separate colour planes and syntax structures used to code the samples. The division of a coding tree unit into coding units is a partitioning.

PB (prediction block): A rectangular MxN block of samples on which the same prediction is applied. The division of a coding block into prediction blocks is a partitioning.

TB (transform block): A rectangular MxN block of samples on which the same transform is applied. The division of a coding block into transform blocks is a partitioning.

About Moto

Engineer who likes coding