HEVC – What are CTU, CU, CTB, CB, PB, and TB?

HEVC, also known as H.265 or MPEG-H Part 2 (ISO/IEC 23008-2), is just around the corner. A good overview has been published. The reference implementation is available. Active discussion can be read in the JCT-VC document management system. The Final Draft International Standard (FDIS) is expected to be produced in January 2013.

However, as always, the standard writers love cryptic acronyms. Probably the first acronyms that discourage readers of the standard are the block-structure terms, namely CTU, CU, CTB, CB, PB, and TB.

They are basically replacements for the macroblocks and blocks of prior standards. Unlike 10 years ago, we have much larger frame sizes to deal with. 4K production has become practical and people have started talking about 8K. Even mobile devices have frame sizes larger than HD, such as 2048×1536. We need blocks larger than the traditional macroblock to encode motion vectors efficiently at these frame sizes. On the other hand, small detail is still important, and we sometimes want to perform prediction and transformation at a granularity of 4×4.

How can we support a wide variety of block sizes in an efficient manner? That’s the challenge HEVC is trying to solve with those acronyms.

Let’s start from the highest level. Suppose we have a picture to encode. HEVC divides the picture into CTUs (Coding Tree Units).

The width and height of the CTU are signaled in the sequence parameter set, meaning that all the CTUs in a video sequence have the same size: 64×64, 32×32, or 16×16.
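To get a feel for what the CTU size means in practice, here is a minimal sketch that counts how many CTUs are needed to cover a picture. The function name `ctu_grid` and the 1920×1080 example are my own choices for illustration, not anything defined by the standard. Picture dimensions need not be multiples of the CTU size, so the sketch rounds up.

```python
# CTU sizes mentioned in the text (luma samples per side).
ALLOWED_CTU_SIZES = (16, 32, 64)


def ctu_grid(width, height, ctu_size):
    """Return (columns, rows) of CTUs needed to cover a width x height picture.

    Hypothetical helper for illustration only.
    """
    if ctu_size not in ALLOWED_CTU_SIZES:
        raise ValueError(f"unsupported CTU size: {ctu_size}")
    # Round up: the last column/row of CTUs may be only partly inside the picture.
    cols = (width + ctu_size - 1) // ctu_size
    rows = (height + ctu_size - 1) // ctu_size
    return cols, rows


if __name__ == "__main__":
    for size in ALLOWED_CTU_SIZES:
        cols, rows = ctu_grid(1920, 1080, size)
        print(f"{size}x{size} CTUs: {cols} x {rows} = {cols * rows}")
```

For a 1920×1080 picture this gives a 30 × 17 grid with 64×64 CTUs, because 1080 is not a multiple of 64 and the bottom row is only partly covered by the picture.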

We need to understand an important naming convention here. In the HEVC standard, if something is called xxxUnit, it is a logical coding unit that is in turn encoded into the HEVC bitstream. On the other hand, if something is called xxxBlock, it is a portion of the video frame buffer that a process operates on.

CTU – A Coding Tree Unit is therefore a logical unit. It usually consists of three blocks, namely one luma (Y) block and two chroma blocks (Cb and Cr), and the associated syntax elements. Each block is called a CTB (Coding Tree Block).

The luma CTB has the same size as the CTU – 64×64, 32×32, or 16×16 (chroma CTBs are smaller when the chroma is subsampled, e.g. half the width and height in 4:2:0). Depending on the part of the video frame, however, a CTB may be too big a unit for deciding whether to perform inter-picture or intra-picture prediction. Thus, each CTB can be split differently into multiple CBs (Coding Blocks), and each CB becomes the decision point for the prediction type. For example, some CTBs are split into 16×16 CBs while others are split into 8×8 CBs. HEVC supports CB sizes all the way from the size of the CTB down to 8×8.

The following picture illustrates how a 64×64 CTB can be split into CBs.

The CB is the decision point for whether to perform inter-picture or intra-picture prediction. More precisely, the prediction type is coded in the CU (Coding Unit). A CU consists of three CBs (Y, Cb, and Cr) and the associated syntax elements.
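The split of a CTB into CBs is a quadtree: a block either stays whole or is divided into four equal quadrants, each of which can be divided again. Below is a minimal sketch of that recursion. The `should_split` callback and the toy rule in `example_decision` are hypothetical stand-ins for a real encoder's decision logic, which this post does not cover.

```python
MIN_CB_SIZE = 8  # smallest CB size mentioned in the text


def split_ctb(x, y, size, should_split):
    """Recursively split a square block into CBs.

    should_split(x, y, size) is a hypothetical callback standing in for the
    encoder's decision logic. Returns a list of (x, y, size) leaf CBs.
    """
    if size > MIN_CB_SIZE and should_split(x, y, size):
        half = size // 2
        cbs = []
        # Visit the quadrants in order: top-left, top-right,
        # bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                cbs.extend(split_ctb(x + dx, y + dy, half, should_split))
        return cbs
    return [(x, y, size)]


def example_decision(x, y, size):
    # Toy rule: split the 64x64 root, then keep splitting only the block
    # that sits at the top-left corner.
    return size == 64 or (x == 0 and y == 0)


if __name__ == "__main__":
    for cb in split_ctb(0, 0, 64, example_decision):
        print(cb)
```

With the toy rule, the 64×64 CTB ends up as four 8×8 CBs, three 16×16 CBs, and three 32×32 CBs, which together still cover exactly 64×64 samples.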

A CB is good enough for the prediction type decision, but it could still be too large to carry a single set of motion vectors (inter prediction) or a single intra prediction mode. For example, a very small object like a snowflake may be moving in the middle of an 8×8 CB – we want to use different MVs for different portions of the CB.

Snowfall

Thus, the PB (Prediction Block) was introduced. Each CB can be split into PBs differently depending on the temporal and/or spatial predictability.
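To make the PB split concrete, the sketch below lists the PB rectangles for each partition shape. The mode names (2Nx2N, 2NxN, Nx2N, NxN, and the asymmetric 2NxnU, 2NxnD, nLx2N, nRx2N) follow the partition modes of the HEVC standard rather than anything stated in this post, and the function name is my own. The sketch only computes geometry and ignores the standard's restrictions on which modes are allowed for which CB sizes and prediction types.

```python
PART_MODES = ("2Nx2N", "2NxN", "Nx2N", "NxN",
              "2NxnU", "2NxnD", "nLx2N", "nRx2N")


def prediction_blocks(size, mode):
    """Return the PBs of a size x size luma CB as (x, y, width, height).

    Geometry only; legality of a mode for a given CB is not checked.
    """
    half, quarter = size // 2, size // 4
    rest = size - quarter  # the larger part of an asymmetric split
    layouts = {
        "2Nx2N": [(0, 0, size, size)],
        "2NxN": [(0, 0, size, half), (0, half, size, half)],
        "Nx2N": [(0, 0, half, size), (half, 0, half, size)],
        "NxN": [(0, 0, half, half), (half, 0, half, half),
                (0, half, half, half), (half, half, half, half)],
        # Asymmetric modes: one quarter-sized strip plus the remainder.
        "2NxnU": [(0, 0, size, quarter), (0, quarter, size, rest)],
        "2NxnD": [(0, 0, size, rest), (0, rest, size, quarter)],
        "nLx2N": [(0, 0, quarter, size), (quarter, 0, rest, size)],
        "nRx2N": [(0, 0, rest, size), (rest, 0, quarter, size)],
    }
    return layouts[mode]


if __name__ == "__main__":
    for mode in PART_MODES:
        print(mode, prediction_blocks(16, mode))
```

In the snowflake example, a split such as Nx2N would let the left and right halves of the CB carry different motion vectors.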

Once the prediction is made, we need to code the residual (the difference between the predicted image and the actual image) with a DCT-like transform. Again, a CB could be too big for this because it may contain both a detailed part (high frequency) and a flat part (low frequency). Therefore, each CB can be split differently into TBs (Transform Blocks). Note that a TB doesn’t have to be aligned with a PB. It is possible, and often makes sense, to perform a single transform across the residuals from multiple PBs, and vice versa.
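The independence of the TB and PB splits can be checked with a few lines of rectangle arithmetic. In this sketch, blocks are (x, y, width, height) tuples inside one 16×16 CB. The helper names and the example layout are assumptions made for illustration.

```python
def intersects(a, b):
    """True if rectangles a and b, given as (x, y, w, h), share any sample."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def pbs_covered_by(tb, pbs):
    """Return the PBs whose residual a given TB would transform."""
    return [pb for pb in pbs if intersects(tb, pb)]


if __name__ == "__main__":
    # A 16x16 CB predicted as two side-by-side 8x16 PBs.
    pbs = [(0, 0, 8, 16), (8, 0, 8, 16)]

    # One 16x16 TB spans the residual of both PBs.
    print(pbs_covered_by((0, 0, 16, 16), pbs))

    # Four 8x8 TBs: each one falls inside a single PB,
    # so every PB is covered by two TBs.
    for tb in [(0, 0, 8, 8), (8, 0, 8, 8), (0, 8, 8, 8), (8, 8, 8, 8)]:
        print(tb, pbs_covered_by(tb, pbs))
```

The first case is one transform across multiple PBs. The second is the reverse, several transforms within one PB.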

Let’s read the draft standard text for these terms. It should make more sense now.

CTU (coding tree unit): A coding tree block of luma samples, two corresponding coding tree blocks of chroma samples of a picture that has three sample arrays, or a coding tree block of samples of a monochrome picture or a picture that is coded using three separate colour planes and syntax structures used to code the samples. The division of a slice into coding tree units is a partitioning.

CTB (coding tree block): An NxN block of samples for some value of N. The division of one of the arrays that compose a picture that has three sample arrays or of the array that compose a picture in monochrome format or a picture that is coded using three separate colour planes into coding tree blocks is a partitioning.

CB (coding block): An NxN block of samples for some value of N. The division of a coding tree block into coding blocks is a partitioning.

CU (coding unit): A coding block of luma samples, two corresponding coding blocks of chroma samples of a picture that has three sample arrays, or a coding block of samples of a monochrome picture or a picture that is coded using three separate colour planes and syntax structures used to code the samples. The division of a coding tree unit into coding units is a partitioning.

PB (prediction block): A rectangular MxN block of samples on which the same prediction is applied. The division of a coding block into prediction blocks is a partitioning.

TB (transform block): A rectangular MxN block of samples on which the same transform is applied. The division of a coding block into transform blocks is a partitioning.

27 Responses to HEVC – What are CTU, CU, CTB, CB, PB, and TB?

me says:

August 29, 2013 at 4:51 pm

Good summary on CTU/CU/PU

tony says:

November 17, 2013 at 11:36 am

Excellent post, clears everything up

kusemanohar says:

March 24, 2014 at 2:24 am

I actually started reading the HEVC review paper and felt lost pretty fast. But this post saved my day…!

suresh says:

June 26, 2014 at 2:32 am

What is a frame buffer? What is the importance of the frame buffer? And how do we configure it?

NV says:

December 9, 2014 at 11:19 pm

Excellent summary.

Tsviatko Jongov says:

January 14, 2015 at 12:24 am

Excellent post. Great work.

pewpew says:

April 7, 2015 at 11:38 pm

Thank you for the informative and understandable post

Asma Idris says:

August 16, 2015 at 3:33 am

I am working on HEVC motion estimation as my final-year project.
I have extracted the frames of a video, and I have the pixel values of a particular frame like this:
[240x320x3 uint8]
How do I partition this further into PUs? I need help. Please reply as soon as possible.

- Ian Murimi says:
  
  April 8, 2016 at 8:14 pm
  
  Hello, I am also doing my final-year project on the same topic. Did you get the help you needed? If you did, I am also in need… this is my email ianmurimi@hotmail.com
  
Sagar Shinde says:

September 24, 2015 at 1:47 am

good one….

jcamilorada says:

October 8, 2015 at 10:09 am

What a great post. That’s the mark of a good teacher: making something complex easy to understand. Thanks for sharing. Which tool did you use to generate the graphics?

Eric says:

June 14, 2016 at 8:05 pm

It’s really an excellent summary and explanation for newbies to HEVC

Jonás Regueira Rodríguez says:

September 4, 2016 at 4:49 am

Congratulations, thank you very much for this great explanation.

Alex Keys says:

December 18, 2016 at 9:06 pm

Great summary. Thanks man

Phong Kah Ho says:

February 27, 2017 at 6:34 pm

Well written, both concise and clear, for one with some prior art experience.

Alexey Pavlov says:

August 14, 2017 at 1:10 am

Could you explain the rule for deciding how to split a CTB? How do you identify the best CB size?

siddartha says:

January 3, 2018 at 4:36 am

Very useful. Thank you for the pictorial representation

faxal axim says:

October 30, 2018 at 12:00 pm

Sir, where can I get an H.265 MATLAB codec?
