The keys I understand: t + 32-byte hash.
But my problem is the values. I understand from sources such as "What are the keys used in the blockchain levelDB (ie what are the key:value pairs)?" that the values should encode three fields: dat file number, block offset, and tx offset within the block.
However, I've noticed that the values have different sizes, between 5 and 10 bytes over the first thousand entries, so I'm not sure how to decode them into those three fields. Are those fields simply 3 varint values?
Here is my Plyvel code that prints out the lengths, using plyvel==1.5.1 and Bitcoin Core v26.0.0 on Ubuntu 23.10:
#!/usr/bin/env python3
import struct
import plyvel

def decode_varint(data):
    """
    https://github.com/alecalve/python-bitcoin-blockchain-parser/blob/c06f420995b345c9a193c8be6e0916eb70335863/blockchain_parser/utils.py#L41
    """
    assert(len(data) > 0)
    size = int(data[0])
    assert(size <= 255)
    if size < 253:
        return size, 1
    if size == 253:
        format_ = '<H'
    elif size == 254:
        format_ = '<I'
    elif size == 255:
        format_ = '<Q'
    else:
        # Should never be reached
        assert 0, "unknown format_ for size : %s" % size
    size = struct.calcsize(format_)
    return struct.unpack(format_, data[1:size+1])[0], size + 1

ldb = plyvel.DB('/home/ciro/snap/bitcoin-core/common/.bitcoin/indexes/txindex/', compression=None)
i = 0
for key, value in ldb:
    if key[0:1] == b't':
        txid = bytes(reversed(key[1:])).hex()
        print(i)
        print(txid)
        print(len(value))
        print(value.hex(' '))
        value = bytes(reversed(value))
        file, off = decode_varint(value)
        blk_off, off = decode_varint(value[off:])
        tx_off, off = decode_varint(value[off:])
        print((txid, file, blk_off, tx_off))
        print()
    i += 1
but it eventually blows up at:
131344
ec4de461b0dd1350b7596f95c0d7576aa825214d9af0e8c54de567ab0ce70800
7
42 ff c0 43 8b 94 35
Traceback (most recent call last):
  File "/home/ciro/bak/git/bitcoin-strings-with-txids/./tmp.py", line 39, in <module>
    blk_off, off = decode_varint(value[off:])
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ciro/bak/git/bitcoin-strings-with-txids/./tmp.py", line 29, in decode_varint
    return struct.unpack(format_, data[1:size+1])[0], size + 1
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
struct.error: unpack requires a buffer of 8 bytes
So I wonder if I guessed the format wrong, or if it's just a bug in my code.
Comparing to https://en.bitcoin.it/wiki/Protocol_documentation#Variable_length_integer, I would decode:
42 ff c0 43 8b 94 35
manually as:
- 42
- ff: expect 8 bytes next
- c0 43 8b 94 35: only 5 bytes left, blowup
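The manual trace above can be reproduced with a small sketch of a CompactSize reader that checks the remaining length before unpacking. The name `read_compact_size` and the error message are illustrative, not part of any library; the logic follows the variable length integer rules from the wiki page linked above.

```python
import struct

def read_compact_size(data: bytes, pos: int = 0):
    """Decode one CompactSize integer from data starting at pos.

    Returns (value, new_pos). Raises ValueError if the buffer is too short.
    """
    if pos >= len(data):
        raise ValueError("no bytes left")
    prefix = data[pos]
    if prefix < 253:
        # Values below 253 are stored directly in the prefix byte.
        return prefix, pos + 1
    fmt = {253: '<H', 254: '<I', 255: '<Q'}[prefix]
    width = struct.calcsize(fmt)
    payload = data[pos + 1:pos + 1 + width]
    if len(payload) < width:
        raise ValueError(
            f"prefix 0x{prefix:02x} needs {width} bytes, only {len(payload)} left")
    return struct.unpack(fmt, payload)[0], pos + 1 + width

value = bytes.fromhex("42 ff c0 43 8b 94 35")
first, pos = read_compact_size(value)
print(first, pos)  # 66 1
try:
    read_compact_size(value, pos)
except ValueError as e:
    print(e)  # prefix 0xff needs 8 bytes, only 5 left
```

Tracking an absolute position, instead of re-slicing and reusing a relative offset, also keeps the third read from starting at the wrong byte.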
I also tried to reverse value:
value = bytes(reversed(value))
but then it blows up very early, so that is definitely wrong.
I also tried to ignore the error to see if there are others, but there were hundreds of them, so something is definitely wrong with my method.
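One possibility worth testing is that the values are not CompactSize integers at all, but use the separate base-128 `VARINT` encoding defined in Bitcoin Core's `serialize.h`. There, each byte carries 7 bits, a set high bit means "more bytes follow", and 1 is added after every continuation byte. The sketch below assumes that encoding and assumes the three fields appear in the order file number, block position, tx offset. The helper names `read_core_varint` and `decode_tx_pos` are hypothetical.

```python
def read_core_varint(data: bytes, pos: int = 0):
    """Decode one Bitcoin Core style VARINT (MSB base-128) starting at pos.

    Returns (value, new_pos). Raises ValueError on truncated input.
    """
    n = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        n = (n << 7) | (byte & 0x7F)
        if byte & 0x80:
            n += 1  # continuation byte: add one, then keep reading
        else:
            return n, pos

def decode_tx_pos(value: bytes):
    """Assumed layout: three consecutive VARINTs with no trailing bytes."""
    file_no, pos = read_core_varint(value)
    blk_off, pos = read_core_varint(value, pos)
    tx_off, pos = read_core_varint(value, pos)
    if pos != len(value):
        raise ValueError("unexpected trailing bytes")
    return file_no, blk_off, tx_off

print(decode_tx_pos(bytes.fromhex("42 ff c0 43 8b 94 35")))
# (66, 2105539, 199349)
```

Under this reading, the 7-byte value that crashed the CompactSize decoder is consumed exactly, and the variable byte count per field would explain lengths between 5 and 10. Whether the decoded numbers are the right file and offsets would still need checking against the actual blk*.dat file.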