Wednesday, January 22, 2025

p2sh – Why does a script size of 254 want three bytes?

If we have a look at the community format of the enter part of a transaction message we see

Discipline Description Dimension
Earlier Transaction hash doubled SHA256-hashed of a (earlier) to-be-used > transaction 32 bytes
Earlier Txout-index non detrimental integer indexing an output of the to-> be-used transaction 4 bytes
Txin-script size non detrimental integer VI = VarInt 1 – 9 bytes
Txin-script / scriptSig Script <in-script size>-many bytes
sequence_no usually 0xFFFFFFFF; irrelevant until transaction’s lock_time is > 0 4 bytes

So the Txin-script size is of sort Varint.

Variable size integer

Integer may be encoded relying on the represented worth to save lots of house. Variable size integers at all times precede an array/vector of a sort of information which will differ in size. Longer numbers are encoded in little endian.

Worth Storage size Format
< 0xFD 1 uint8_t
<= 0xFFFF 3 0xFD adopted by the size as uint16_t
<= 0xFFFF FFFF 5 0xFE adopted by the size as uint32_t
9 0xFF adopted by the size as uint64_t

(my emphasis)

So for script-lengths longer than 0xfc and fewer than 0xffff the script-length is represented as a Varint with three bytes, the primary byte of which should be 0xfd in order that the format may be unambiguously differentiated from different lengths of Varint.

  • If the primary byte we learn is decrease than 253 (0xFD) we all know the Varint is one byte lengthy and the worth we simply learn is the size of the script.

  • If the primary byte we learn is 253 (0xFD) we all know the Varint is three bytes lengthy and we have to discard this byte and skim the subsequent two bytes because the size of the script.

  • Ditto 254 (0xFE), 5 bytes, subsequent 4.

  • Ditto 255 (0xFF), 9 bytes, subsequent eight.

So we won’t characterize a script size of 254 in a one-byte Varint as a result of 254 (0xFE) is a particular marker byte-value for a five-byte Varint. As a substitute of utilizing a one-byte Varint, we’ve got to go to the subsequent dimension up of Varint – a three-byte Varint, and use it is particular marker byte worth to prefix a two-byte (Uint16_t) size worth.

The endianness of the Uint16_t is why the primary byte of the final two is the bottom a part of the Uint16_t worth. 254 is 0x00FE however in little-endian byte-order that’s 0xFE00.


What this teaches us is that the builders of this community protocol prioritised compactness of message dimension over readability and ease.

Given the prevalence of different protocols that take the alternative strategy to message dimension (think about SMTP with XML payloads like DOCX and so on or HTTP with HTML payloads) it should be debateable whether or not this was a vital or good selection. Possibly an easier, much less complicated alternative might have been used, optionally wrapped in a compression envelope. However the alternative was made and I assume it’s one which Bitcoin builders should stay with.

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles