# TL-B syntax and semantics (https://docs-i0yym09dy-ton-core-docs.vercel.app/llms/foundations/tlb/syntax-and-semantics/content.md)



## Syntax [#syntax]

Each line of a TL-B file is either a TL-B scheme (i.e., type declaration), a comment, or a blank line.

### TL-B Scheme [#tl-b-scheme]

The TL-B scheme describes how to serialize a certain algebraic data structure into a binary format. Here are some examples:

<Image src="/images/tlb-scheme.png" darkSrc="/images/tlb-scheme-dark.png" alt="General syntax of TL-B schemas" />

In general, each TL-B scheme has the following structure:

* **Constructor** that consists of
  * optional constructor name;
  * tag: empty, `$` or `#`;
  * prefix code or `_`.

```tlb
bool_true$1;
transfer#5fcc3d14;
some#_;
_#_.
```

* **Field definitions**, each of which consists of
  * optional field name (`ident`);
  * type expression (`type-expr`).

```tlb
val:(## 32);
src:MsgAddressInt;
```

* **Constraints**: optional expressions that restrict values which are instances of the `Nat` type.

```tlb
{ n <= 100 };
{ ~b = a + 10 };
{ anycast = 0 };
{ b >= c }.
```

* **Parameter declarations**: declare fields of type `#` (natural numbers) or `Type` (the type of types) that may be used as parameters for parameterized types.
  They are always enclosed in curly brackets `{}`.

```tlb
_ {x:#} my_val:(## x) = A x;
_ {X:Type} my_val:(## 32) next_val:X = A X;
```

* **Combinator name**: the right side of the TL-B scheme that represents the name of the defined combinator. It may be parameterized.

```tlb
_ = MsgAddrSmpl;

// Parameterized combinators
_ = Maybe X;
_ = Hashmap n X.
```

### Comments [#comments]

The comments follow the same conventions as in C++.

```tlb
/*
This is
a comment
*/

// This is a single-line comment
```

## Semantics [#semantics]

From a high-level perspective, the **right-hand side** of each *scheme* is a type, either simple (such as `Bit` or `True`) or parameterized (such as `Hashmap n X`), and
the **left-hand side** describes a way to define, or even to serialize, a value of the type indicated on the right-hand side.

Below, we gradually describe each component of TL-B schemes.

### Constructors [#constructors]

Constructors define a combinator's type, including its state during serialization. Each constructor begins with a name (possibly empty, written as `_`),
such as `message` or `bool_true`, immediately followed by an optional constructor tag, such as `#_` or `$10`, which describes the
bitstring used to encode (serialize) the constructor in question.

Tags may be given in either **binary** (after a dollar sign) or **hexadecimal** notation (after a hash sign).
If a tag is not explicitly provided, the TL-B parser computes a default 32-bit constructor tag: it hashes the text of the scheme defining this constructor,
normalized in a certain fashion, with the [CRC32 algorithm](https://en.wikipedia.org/wiki/Cyclic_redundancy_check) and applies `| 0x80000000` to the result. Therefore,
empty tags must be explicitly provided by `#_` or `$_`.
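The default-tag rule can be sketched in Python as follows. This is only an illustration: the exact normalization of the scheme text is not specified here, so the hypothetical `default_tag` helper takes an already normalized string as input, and the sample scheme text is made up.

```python
import zlib

def default_tag(normalized_scheme: str) -> int:
    """Return crc32(text) | 0x80000000 as a 32-bit value.

    Assumption: `normalized_scheme` is already in the canonical form
    expected by the TL-B parser; that normalization is not reproduced here.
    """
    crc = zlib.crc32(normalized_scheme.encode("utf-8")) & 0xFFFFFFFF
    return crc | 0x80000000  # force the most significant bit to 1

# Hypothetical scheme text, used only to show the call.
tag = default_tag("some_constructor field:# = SomeType")
print(f"#{tag:08x}")
```

Because of the `| 0x80000000` step, every default tag is a 32-bit value whose first bit is `1`.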

All constructor names must be distinct, and constructor tags for the same combinator must constitute a [prefix code](https://en.wikipedia.org/wiki/Prefix_code#:~:text=A%20prefix%20code%20is%20a,code%20word%20in%20the%20system.)
(otherwise the deserialization would not be unique), i.e., no tag can be a prefix of any other.
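A minimal Python sketch of the prefix-code requirement, assuming tags are given as strings of `0` and `1` characters (the `is_prefix_code` name is illustrative and not part of any TL-B tool):

```python
def is_prefix_code(tags: list) -> bool:
    """Return True if no tag is a prefix of another tag in the list.

    Identical tags count as prefixes of each other, so duplicates fail too.
    """
    return not any(
        i != j and other.startswith(tag)
        for i, tag in enumerate(tags)
        for j, other in enumerate(tags)
    )

print(is_prefix_code(["00", "01", "10", "11"]))  # tags of four constructors
print(is_prefix_code(["0", "01"]))               # "0" is a prefix of "01"
```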

Also, there are size limitations:

* maximum number of constructors per type: `64`;
* maximum number of bits for a tag: `63`.

For example, each address in TON is either **internal** or **external**; see the [general info page](/llms/foundations/addresses/overview/content.md).
Addresses are serialized according to the following TL-B schemes:

```tlb
addr_none$00 = MsgAddressExt;
addr_extern$01 ... = MsgAddressExt;
addr_std$10 ...  = MsgAddressInt;
addr_var$11 ... = MsgAddressInt;
...

_ _:MsgAddressInt = MsgAddress;
_ _:MsgAddressExt = MsgAddress;
```

When parsing the binary string `10...` that should be an instance of the `MsgAddress` combinator, the parser extracts the initial two bits that determine the tag.
It then understands that this address is serialized as `addr_std` and continues to parse the string according to the fields defined in this constructor.
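The tag dispatch described above can be sketched in Python. The tag table mirrors the four address constructors listed earlier; the `match_constructor` helper and the string-of-bits input format are assumptions made for illustration.

```python
# Tags of the four address constructors shown above.
MSG_ADDRESS_TAGS = {
    "00": "addr_none",
    "01": "addr_extern",
    "10": "addr_std",
    "11": "addr_var",
}

def match_constructor(bits: str, tags: dict) -> tuple:
    """Return (constructor name, remaining bits) for the matching tag.

    Since the tags form a prefix code, at most one of them can match.
    """
    for tag, name in tags.items():
        if bits.startswith(tag):
            return name, bits[len(tag):]
    raise ValueError("no constructor tag matches")

name, rest = match_constructor("10" + "00000000", MSG_ADDRESS_TAGS)
print(name, rest)
```

The remaining bits would then be parsed according to the fields of the selected constructor.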

All main variations of constructors are presented in the following table:

| Constructor                 | Serialization                                   |
| --------------------------- | ----------------------------------------------- |
| `some#3f5476ca`             | A 32-bit `uint` is serialized from a hex value. |
| `some#5fe`                  | A 12-bit `uint` is serialized from a hex value. |
| `some$0101` or `_$0101`     | Serialize the `0101` raw bits.                  |
| `some` or `some#`           | Serialize `crc32(equation) \| 0x80000000`.      |
| `some#_` or `some$_` or `_` | Serialize nothing.                              |

In addition to the standard hex tag definition, a hexadecimal number may be followed by an underscore `_` character.
This indicates that the tag should be interpreted as the hexadecimal value with the [least significant bit (LSB)](https://en.wikipedia.org/wiki/Bit_numbering#Least_significant_bit) removed.
For example, consider the following schema, which represents a stack integer value:

```tlb
vm_stk_int#0201_ ... = VmStackValue;
```

In this case, the tag is not equal to `0x0201`. To compute the actual tag, remove the LSB from the binary representation of `0x0201`:

```
0000001000000001 -> 000000100000000
```

The resulting tag is the 15-bit binary number `0b000000100000000`.
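The tag notations from the table and the trailing-underscore case can be expanded into bitstrings with a short Python sketch. The `tag_bits` helper is hypothetical; it follows the underscore rule exactly as worded above (drop the least significant bit) and does not cover the CRC32 default for an omitted tag.

```python
def tag_bits(tag: str) -> str:
    """Expand '#5fe', '$0101', '#0201_', '#_' or '$_' into a string of bits."""
    kind, body = tag[0], tag[1:]
    if body == "_":
        return ""                  # '#_' or '$_': nothing is serialized
    if kind == "$":
        return body                # binary tag, used as written
    if kind != "#" or not body:
        # A bare '#' (or no tag at all) needs the CRC32 default instead.
        raise ValueError(f"unsupported tag: {tag!r}")
    drop_lsb = body.endswith("_")
    digits = body.rstrip("_")
    bits = "".join(f"{int(d, 16):04b}" for d in digits)  # 4 bits per hex digit
    return bits[:-1] if drop_lsb else bits

print(tag_bits("#5fe"))     # 12 bits
print(tag_bits("#0201_"))   # 15 bits
```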

### Field definitions [#field-definitions]

Field definitions follow each constructor and its optional tag. A field definition has the format `ident:type-expr`, where:

* `ident` is the field's name. If you don't want to assign a specific name to the field, just leave it as `_`.
* `type-expr` is the field's type. It can be a simple type, a parameterized type with appropriate arguments, or a more complex expression.

**Note: the total size of all fields in a type must not exceed the limits of a single cell:  `1023` bits and `4` references**.

TL-B schemes define types. At the same time, the previously defined types can be used in other schemes in fields.
Therefore, in order to properly understand what types can be assigned to fields, we need to simultaneously figure out how to define the types themselves.

### Types [#types]

#### Simple [#simple]

Fields of simple types are just instances of previously defined or built-in types. They do not contain parameterization or any conditions.

For example, **Tick** and **Tock** transactions are designated for special system smart contracts that must be automatically invoked in every block.
Tick transactions are executed at the start of each masterchain block, while Tock transactions are initiated at the end. Here is how they are represented in TL-B:

```tlb
trans_tick_tock$001 is_tock:Bool storage_ph:TrStoragePhase
    compute_ph:TrComputePhase action:(Maybe ^TrActionPhase)
    aborted:Bool destroyed:Bool = TransactionDescr;
```

So, `is_tock`, `storage_ph`, `compute_ph`, `aborted`, and `destroyed` are fields with simple types.

Below are all the built-in types that can be used in defining fields:

* `#`: 32-bit unsigned integer;
* `## x`: unsigned integer with `x` bits;
* `#< x`: unsigned integer less than `x`, stored as `lenBits(x - 1)` bits, up to 31 bits;
* `#<= x`: unsigned integer less than or equal to `x`, stored as `lenBits(x)` bits, up to 32 bits;
* `Any` or `Cell`: remaining bits and references;
* `uint1` to `uint256`: unsigned integers of 1 to 256 bits;
* `int1` to `int257`: signed integers of 1 to 257 bits;
* `bits1` to `bits1023`: bitstrings of 1 to 1023 bits.
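The storage widths of `#< x` and `#<= x` can be computed with a small Python sketch. It assumes that `lenBits(v)` is the number of bits in the binary representation of `v`, which is what Python's `int.bit_length()` returns; the helper names are illustrative.

```python
def bits_for_less_than(x: int) -> int:
    """Width of `#< x`: lenBits(x - 1)."""
    if x < 1:
        raise ValueError("no natural number is less than 0")
    return (x - 1).bit_length()

def bits_for_at_most(x: int) -> int:
    """Width of `#<= x`: lenBits(x)."""
    if x < 0:
        raise ValueError("x must be a natural number")
    return x.bit_length()

for x in (5, 8):
    print(x, bits_for_less_than(x), bits_for_at_most(x))
```

For instance, values `0..8` need one more bit than values `0..7`, which is why `#<= 8` is wider than `#< 8`.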

#### Contained complex expressions [#contained-complex-expressions]

* **Multiplicative expression for tuple creation**. The expression `x * T` creates a tuple of the natural length `x`, where each element is of type `T`.

```tlb
a$_ a:(## 32) = A;
b$_ b:(2 * A) = PairOf32-bits-uints;
```

* **Serialization in the ref cell**: `^[ ... ]` means that the fields inside the brackets are serialized in a separate cell, which is referenced from the current cell.

```tlb
_ a:(## 32) ^[ b:(## 32) c:(## 32) d:(## 32)] = A;
```

Chains of references are also allowed. In the following example, each variable (`a`, `b`, `c`) is stored in a separate cell, resulting in a chain of three referenced cells:

```tlb
_ ^[ a:(## 32) ^[ b:(## 32) ^[ c:(## 32) ] ] ] = A;
```

Other complex type expressions are related to the `Nat` type only. The `Nat` type is a built-in type that represents natural numbers.
The types `#`, `## x`, `#< x`, and `#<= x` together constitute the `Nat` type. In TL-B schemes, the `+` and `*` operations can be performed on `Nat`.

* **Constraints**: `Nat = Nat | Nat <= Nat | Nat < Nat | Nat >= Nat | Nat > Nat`. Each constraint must be enclosed in curly braces `{}`, and the variables used inside must be defined earlier.

```tlb
_ flag1:(## 10) flag2:# { flag1 + flag2 <= 100 } = Flag;
```

This constraint means that the sum of the `flag1` and `flag2` fields must be less than or equal to `100`.
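A deserializer for `Flag` could enforce the constraint as in the following Python sketch. The input is assumed to be a string of `0` and `1` characters, and `read_uint` and `parse_flag` are hypothetical helpers, not an existing API.

```python
def read_uint(bits: str, pos: int, width: int) -> tuple:
    """Read `width` bits starting at `pos`; return (value, new position)."""
    end = pos + width
    if end > len(bits):
        raise ValueError("not enough bits")
    return (int(bits[pos:end], 2) if width else 0), end

def parse_flag(bits: str) -> dict:
    flag1, pos = read_uint(bits, 0, 10)    # flag1:(## 10)
    flag2, pos = read_uint(bits, pos, 32)  # flag2:# (32-bit unsigned integer)
    if not flag1 + flag2 <= 100:           # { flag1 + flag2 <= 100 }
        raise ValueError("constraint { flag1 + flag2 <= 100 } violated")
    return {"flag1": flag1, "flag2": flag2}

print(parse_flag(f"{40:010b}{60:032b}"))
```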

* **Condition operator**: `Nat?Type` means that if the natural number is positive, then the field has the type `Type`. Otherwise, the field is omitted.

```tlb
_ a:(## 1) b:a?(## 32) = Example;
```

In the `Example` type, the field `b` is serialized only if the `a` field is equal to `1`.

* **Bit selector**: The expression `E . B` means to take bit `B` from the `Nat` value `E`.

```tlb
_ a:(## 2) b:(a . 1)?(## 32) = CondExample;
```

Similarly, in the `CondExample` type, the variable `b` is serialized only if the **second bit** of `a` is `1`.
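The condition operator and the bit selector can be combined in a short Python sketch of a `CondExample` parser. It assumes that bits are indexed from the least significant one, so `a . 1` is `(a >> 1) & 1`; the helper names and the string-of-bits input are illustrative.

```python
def read_uint(bits: str, pos: int, width: int) -> tuple:
    """Read `width` bits starting at `pos`; return (value, new position)."""
    end = pos + width
    if end > len(bits):
        raise ValueError("not enough bits")
    return (int(bits[pos:end], 2) if width else 0), end

def parse_cond_example(bits: str) -> dict:
    a, pos = read_uint(bits, 0, 2)         # a:(## 2)
    b = None
    if (a >> 1) & 1:                       # (a . 1)? assumed LSB-first indexing
        b, pos = read_uint(bits, pos, 32)  # b is present only in this case
    return {"a": a, "b": b}

print(parse_cond_example("10" + f"{7:032b}"))
print(parse_cond_example("01"))
```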

As a real-world example, consider the following `McStateExtra` combinator that describes data stored in each masterchain block.

```tlb
masterchain_state_extra#cc26
  shard_hashes:ShardHashes
  config:ConfigParams
  ^[ flags:(## 16) { flags <= 1 }
     validator_info:ValidatorInfo
     prev_blocks:OldMcBlocksInfo
     after_key_block:Bool
     last_key_block:(Maybe ExtBlkRef)
     block_create_stats:(flags . 0)?BlockCreateStats ]
  global_balance:CurrencyCollection
= McStateExtra;
```

#### Parameterized [#parameterized]

**Parameterized types** are patterns in which other types or natural numbers are parameters. Such parameters are either declared in curly brackets `{}` or declared earlier
as fields of the combinator. Only identifiers of the `Nat` and `Type` types can be parameters.

A simple example of a parameterized type is the following definition of a type `A` that is parameterized by a natural number `x`:

```tlb
_ {x:#} my_val:(## x) = A x;
```

During the deserialization process, an `x`-bit unsigned integer is fetched. For example:

```tlb
_ value:(A 32) = My32UintValue;
```

During the deserialization process of `My32UintValue`, it fetches a 32-bit unsigned integer, as specified by the `32` natural parameter in the `A` type.

Let's consider another example where a combinator `A` parameterized by a type variable `X` is defined:

```tlb
_ {X:Type} my_val:(## 32) next_val:X = A X;
```

During the deserialization process, we will first fetch a 32-bit unsigned integer and then parse the bits and references of the `X` type.

An example usage of such a parameterized type can be:

```tlb
_ bit:(## 1) = Bit;
_ 32intwbit:(A Bit) = 32IntWithBit;
```

In this example, the `Bit` type is passed to `A` as a parameter.

Partial application is also possible with such parameterized types:

```tlb
_ {X:Type} {Y:Type} v1:X v2:Y = A X Y;
_ bit:(## 1) = Bit;
_ {X:Type} bits:(A Bit X) = BitA X;
```

Or even apply partial application to parameterized types themselves:

```tlb
_ {X:Type} v1:X = A X;
_ {X:Type} d1:X = B X;
_ {X:Type} bits:(A (B X)) = AB X;
```

It is also possible to use fields defined previously as parameters to types. The serialization will be determined at runtime.

```tlb
_ a:(## 8) b:(## a) = A;
```

This means that the size of the `b` field is stored inside the `a` field. When deserializing type `A`, we first load the 8-bit unsigned integer from the `a` field and then use this value to determine the size of the `b` field.

This strategy also works for parameterized types:

```tlb
_ {input:#} c:(## input) = B input;
_ a:(## 8) c_in_b:(B a) = A;
```

Since parameters can be natural numbers, one can use arithmetic operations on them:

```tlb
_ {x:#} value:(## x) = ExampleMult (x * 2);
_ _:(ExampleMult 4) = 2BitInteger;

_ {x:#} value:(## x) = ExampleSum (x + 3);
_ _:(ExampleSum 4) = 1BitInteger;
```
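Resolving such an argument amounts to inverting the arithmetic expression: given the value passed to the type, find the `x` that produces it. A minimal Python sketch of that step, with hypothetical helper names:

```python
def solve_mult(argument: int, factor: int) -> int:
    """Find x such that x * factor == argument."""
    x, remainder = divmod(argument, factor)
    if remainder:
        raise ValueError("argument is not a multiple of the factor")
    return x

def solve_sum(argument: int, addend: int) -> int:
    """Find the natural number x such that x + addend == argument."""
    x = argument - addend
    if x < 0:
        raise ValueError("argument is smaller than the addend")
    return x

print(solve_mult(4, 2))  # ExampleMult 4: width of `value` in bits
print(solve_sum(4, 3))   # ExampleSum 4: width of `value` in bits
```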

A good real-world example of parameterized types is the definition of TVM tuples:

```tlb
vm_tupref_nil$_ = VmTupleRef 0;
vm_tupref_single$_ entry:^VmStackValue = VmTupleRef 1;
vm_tupref_any$_ {n:#} ref:^(VmTuple (n + 2)) = VmTupleRef (n + 2);
vm_tuple_nil$_ = VmTuple 0;
vm_tuple_tcons$_ {n:#} head:(VmTupleRef n) tail:^VmStackValue = VmTuple (n + 1);
vm_stk_tuple#07 len:(## 16) data:(VmTuple len) = VmStackValue;
```

For a detailed explanation of how it works, see the [complex and non-trivial examples page](/llms/languages/tl-b/complex-and-non-trivial-examples/content.md).

#### Special [#special]

Currently, TVM allows the following types of cells:

* Ordinary
* PrunedBranch
* Library
* MerkleProof
* MerkleUpdate

By default, all cells are classified and parsed as `Ordinary`. This applies to all cells described in the TL-B as well.

To enable the loading of special cell types in a constructor, prefix the constructor with `!`.

**Example**

```tlb
!merkle_update#02 {X:Type} old_hash:bits256 new_hash:bits256
  old:^X new:^X = MERKLE_UPDATE X;

!merkle_proof#03 {X:Type} virtual_hash:bits256 depth:uint16 virtual_root:^X = MERKLE_PROOF X;
```

This technique allows code generation to mark `SPECIAL` cells when printing a structure and ensures proper validation of structures with special cells.

### Implicit fields and the negate operator (`~`) [#implicit-fields-and-the-negate-operator-]

Some fields may be **implicit**. These fields are defined within curly brackets `{}`, like constraints and the parameters of parameterized types,
which indicates that they are not directly serialized. Instead, their values must be deduced from other data, usually the parameters of the type being serialized.

Some occurrences of variables (that is, already defined fields) are prefixed by a tilde `~`. This indicates that the occurrence is used
in the opposite way to the default behavior. On the left-hand side of the equation, it means that the variable is deduced (computed) from this occurrence
instead of having its previously computed value substituted. Conversely, on the right-hand side, it means that the variable is not deduced from
the type being serialized but is computed during the deserialization process. In other words,
a `~` transforms an *input argument* into an *output argument* or vice versa.

A simple example of the negate operator is the definition of the implicit field `b` based on another field `a`:

```tlb
_ a:(## 32) { b:# } { ~b = a + 100 } = B_Calc_Example;
```

So, after deserialization of `a`, the value of `b` is computed as `a + 100`. After this definition, you can use the new variable as input for `Nat` types:

```tlb
_ a:(## 8) { b:# } { ~b = a + 10 }
  example_dynamic_var:(## b) = B_Calc_Example;
```

The size of `example_dynamic_var` is computed at runtime when we load `a` and use its value to determine the size of `example_dynamic_var`.
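This runtime computation can be sketched in Python. The parser below reads the 8-bit `a`, derives `b` from the constraint, and only then knows how many bits to read; the string-of-bits input and the helper names are assumptions for illustration.

```python
def read_uint(bits: str, pos: int, width: int) -> tuple:
    """Read `width` bits starting at `pos`; return (value, new position)."""
    end = pos + width
    if end > len(bits):
        raise ValueError("not enough bits")
    return (int(bits[pos:end], 2) if width else 0), end

def parse_b_calc_example(bits: str) -> dict:
    a, pos = read_uint(bits, 0, 8)      # a:(## 8)
    b = a + 10                          # { ~b = a + 10 }: b is computed, not read
    value, pos = read_uint(bits, pos, b)  # example_dynamic_var:(## b)
    return {"a": a, "b": b, "example_dynamic_var": value}

# a = 2, so the dynamic field occupies 12 bits.
print(parse_b_calc_example(f"{2:08b}{1000:012b}"))
```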

Alternatively, it can be applied to other types:

```tlb
_ {X:Type} a:^X = PutToRef X;
_ a:(## 32) { b:# } { ~b = a + 100 }
  my_ref: (PutToRef b) = B_Calc_Example;
```

#### Negate operator (`~`) in type definition [#negate-operator--in-type-definition]

```tlb
_ {m:#} n:(## m) = Define ~n m;
_ {n_from_define:#} defined_val:(Define ~n_from_define 8) real_value:(## n_from_define) = Example;
```

Assume we have a type `Define ~n m` that takes `m` and computes `n` by loading it from an `m`-bit unsigned integer.

In the `Example` type, we store the variable computed by the `Define` type into `n_from_define`. We also know it is loaded from an `8`-bit unsigned integer because we apply the `Define` type as `Define ~n_from_define 8`. Now, we can use the `n_from_define` variable in other types to determine the serialization process.
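Viewed as code, a negated parameter behaves like a return value of the type's parser, while a regular parameter behaves like an argument. The following Python sketch models `Define` and `Example` this way; the function names and the string-of-bits input are hypothetical.

```python
def read_uint(bits: str, pos: int, width: int) -> tuple:
    """Read `width` bits starting at `pos`; return (value, new position)."""
    end = pos + width
    if end > len(bits):
        raise ValueError("not enough bits")
    return (int(bits[pos:end], 2) if width else 0), end

def parse_define(bits: str, pos: int, m: int) -> tuple:
    """Define ~n m: `m` is an input, `n` is an output loaded from m bits."""
    n, pos = read_uint(bits, pos, m)
    return n, pos

def parse_example(bits: str) -> dict:
    n_from_define, pos = parse_define(bits, 0, 8)            # Define ~n_from_define 8
    real_value, pos = read_uint(bits, pos, n_from_define)    # real_value:(## n_from_define)
    return {"n_from_define": n_from_define, "real_value": real_value}

print(parse_example(f"{4:08b}" + "1010"))
```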

This technique leads to more complex type definitions, such as **Unary**, which represents dynamic chains of some type.

```tlb
unary_zero$0 = Unary ~0;
unary_succ$1 {n:#} x:(Unary ~n) = Unary ~(n + 1);
_ u:(Unary Any) = UnaryChain;
```

and **Hashmaps**.

```tlb
hm_edge#_ {n:#} {X:Type} {l:#} {m:#} label:(HmLabel ~l n)
          {n = (~m) + l} node:(HashmapNode m X) = Hashmap n X;

hmn_leaf#_ {X:Type} value:X = HashmapNode 0 X;
hmn_fork#_ {n:#} {X:Type} left:^(Hashmap n X)
           right:^(Hashmap n X) = HashmapNode (n + 1) X;

hml_short$0 {m:#} {n:#} len:(Unary ~n) {n <= m} s:(n * Bit) = HmLabel ~n m;
hml_long$10 {m:#} n:(#<= m) s:(n * Bit) = HmLabel ~n m;
hml_same$11 {m:#} v:Bit n:(#<= m) = HmLabel ~n m;

unary_zero$0 = Unary ~0;
unary_succ$1 {n:#} x:(Unary ~n) = Unary ~(n + 1);

hme_empty$0 {n:#} {X:Type} = HashmapE n X;
hme_root$1 {n:#} {X:Type} root:^(Hashmap n X) = HashmapE n X;
```

For a detailed explanation of how these combinators are deserialized, see the [complex and non-trivial examples page](/llms/languages/tl-b/complex-and-non-trivial-examples/content.md).
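As a rough illustration of how the negated parameter `~n` flows through these definitions, here is a Python sketch that parses `Unary ~n` and `HmLabel ~n m` from a string of `0` and `1` characters. It assumes that `#<= m` occupies `lenBits(m)` bits, computed with `int.bit_length()`; all function names are hypothetical.

```python
def read_bits(bits: str, pos: int, width: int) -> tuple:
    """Read `width` raw bits starting at `pos`; return (bits, new position)."""
    end = pos + width
    if end > len(bits):
        raise ValueError("not enough bits")
    return bits[pos:end], end

def read_uint(bits: str, pos: int, width: int) -> tuple:
    chunk, pos = read_bits(bits, pos, width)
    return (int(chunk, 2) if chunk else 0), pos

def parse_unary(bits: str, pos: int) -> tuple:
    """Unary ~n: count unary_succ$1 bits until unary_zero$0; return (n, pos)."""
    n = 0
    while True:
        bit, pos = read_uint(bits, pos, 1)
        if bit == 0:
            return n, pos
        n += 1

def parse_hm_label(bits: str, pos: int, m: int) -> tuple:
    """HmLabel ~n m: return (label bits, n, new position)."""
    first, pos = read_uint(bits, pos, 1)
    if first == 0:                               # hml_short$0
        n, pos = parse_unary(bits, pos)          # len:(Unary ~n)
        if n > m:
            raise ValueError("constraint { n <= m } violated")
        label, pos = read_bits(bits, pos, n)     # s:(n * Bit)
        return label, n, pos
    second, pos = read_uint(bits, pos, 1)
    width = m.bit_length()                       # assumed width of n:(#<= m)
    if second == 0:                              # hml_long$10
        n, pos = read_uint(bits, pos, width)
        if n > m:
            raise ValueError("n exceeds m")
        label, pos = read_bits(bits, pos, n)     # s:(n * Bit)
    else:                                        # hml_same$11
        v, pos = read_bits(bits, pos, 1)         # v:Bit
        n, pos = read_uint(bits, pos, width)
        if n > m:
            raise ValueError("n exceeds m")
        label = v * n                            # bit v repeated n times
    return label, n, pos

print(parse_hm_label("011010", 0, 8))     # hml_short
print(parse_hm_label("100011101", 0, 8))  # hml_long
print(parse_hm_label("1110101", 0, 8))    # hml_same
```

In each branch, `n` is not passed in by the caller; it is produced while reading the label, which is what the `~n` on the right-hand side expresses.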

## References [#references]

* [A description of an older version of TL](https://core.telegram.org/mtproto/TL);
* [`block.tlb`](https://github.com/ton-blockchain/ton/blob/master/crypto/block/block.tlb): the main TL-B file that describes all basic TON blockchain structures.
