How to Decode Input Data from a Transaction?
When transactions are transmitted over the network or exchanged between applications, they are serialized. Serialization is most commonly used to encode data structures for transmission over a network or for storage in a file. The format of each serialized field is described in the section Format of Various Fields in Serialized Transaction. This article walks through the steps to decode the input data (and the other fields) of a raw Bitcoin transaction.
Table of Contents
- What is Serialization?
- Raw Bitcoin Transaction
- Format of Various Fields in Serialized Transaction
- Decoding the Transaction
- Detailed Explanation
- Verification
- Conclusion
What is Serialization?
Serialization is the process of converting the internal representation of a data structure into a format that can be transmitted one byte at a time, also known as a byte stream.
- The process of converting from the byte-stream representation of a transaction to a library’s internal representation data structure is called deserialization or transaction parsing.
- The process of converting back to a byte stream for transmission over the network, for hashing, or storage on disk is called serialization.
- Most Bitcoin libraries have built-in functions for transaction serialization and deserialization.
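As a small illustration of the two directions, the sketch below serializes a single integer into four little-endian bytes and parses it back with Python's standard struct module. The variable names are chosen only for this example.

```python
import struct

# Serialize: internal representation (a Python int) -> byte stream.
version = 1
stream = struct.pack("<L", version)   # 4 bytes, little-endian
print(stream.hex())                   # 01000000

# Deserialize: byte stream -> internal representation.
(parsed,) = struct.unpack("<L", stream)
print(parsed)                         # 1
```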
Raw Bitcoin Transaction
Below is a sample raw Bitcoin transaction:
0100000002b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b000000001716001415ff0337937ecadd10ce56ffdfd4674817613223f0ffffff8415e7099fba6d81474912d22eb5113bcdcfe45a4a1ae0c2701549a46326384f010000001716001415ff0337937ecadd10ce56ffdfd4674817613223f0ffffff02007daf01000000001976a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac2073f6010000000017a914423877331b30a905240c7e1f2adee4ebaa47c5f68700000000
When split into individual fields, it looks like this:
{
  "version": "01000000",
  "inputcount": "02",
  "inputs": [
    {
      "txid": "b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b",
      "vout": "00000000",
      "scriptsigsize": "17",
      "scriptsig": "16001415ff0337937ecadd10ce56ffdfd4674817613223",
      "sequence": "f0ffffff"
    },
    {
      "txid": "8415e7099fba6d81474912d22eb5113bcdcfe45a4a1ae0c2701549a46326384f",
      "vout": "01000000",
      "scriptsigsize": "17",
      "scriptsig": "16001415ff0337937ecadd10ce56ffdfd4674817613223",
      "sequence": "f0ffffff"
    }
  ],
  "outputcount": "02",
  "outputs": [
    {
      "amount": "007daf0100000000",
      "scriptpubkeysize": "19",
      "scriptpubkey": "76a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac"
    },
    {
      "amount": "2073f60100000000",
      "scriptpubkeysize": "17",
      "scriptpubkey": "a914423877331b30a905240c7e1f2adee4ebaa47c5f687"
    }
  ],
  "locktime": "00000000"
}
Format of Various Fields in Serialized Transaction
Version
Example: 01000000
Size: 4 bytes
Format: Little-Endian
Description: The version number for the transaction. Used to enable new features.
Input Count
Example: 02
Size: variable
Format: Compact Size
Description: Indicates the number of inputs.
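To make the Compact Size format concrete, here is a minimal sketch of how the first byte selects the width of the integer that follows. The function name read_compact_size is made up for this example and is not taken from any library.

```python
def read_compact_size(data: bytes, pos: int = 0):
    """Return (value, bytes_consumed) for the compact-size integer at data[pos]."""
    first = data[pos]
    if first < 0xFD:
        # Values below 0xFD are stored directly in a single byte.
        return first, 1
    # 0xFD, 0xFE and 0xFF announce a 2-, 4- or 8-byte little-endian integer.
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    value = int.from_bytes(data[pos + 1:pos + 1 + width], "little")
    return value, 1 + width

print(read_compact_size(bytes.fromhex("02")))        # (2, 1)
print(read_compact_size(bytes.fromhex("fd0302")))    # (515, 3)
```

The first call decodes the input count of the sample transaction; the second uses an invented three-byte value to show the 0xFD case.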
Input(s)
TXID:
Example: b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b
Size: 32 bytes
Format: Natural Byte Order
Description: The TXID of the transaction containing the output you want to spend.
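Block explorers usually display a TXID with its bytes reversed relative to the natural byte order used inside the raw transaction. The short sketch below reverses the example TXID to obtain the form you would search for; the variable names are only illustrative.

```python
raw_txid = "b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b"

# Reverse byte by byte (not character by character).
display_txid = bytes.fromhex(raw_txid)[::-1].hex()
print(display_txid)
# 3b239ad258a308129049df85c03f97bd4f57b99825174ad2a50509a08ec2a2b9
```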
VOUT:
Example: 00000000
Size: 4 bytes
Format: Little-Endian
Description: The index number of the output you want to spend.
ScriptSig Size:
Example: 17
Size: variable
Format: Compact Size
Description: The size in bytes of the upcoming ScriptSig.
ScriptSig:
Example: 16001415ff0337937ecadd10ce56ffdfd4674817613223
Size: variable
Format: Script
Description: The unlocking code for the output you want to spend.
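The example ScriptSig can itself be taken apart. The sketch below assumes the script consists of a single direct push (opcodes 0x01 to 0x4b push that many bytes), which holds for this example but not for scripts in general.

```python
scriptsig = bytes.fromhex("16001415ff0337937ecadd10ce56ffdfd4674817613223")

push_len = scriptsig[0]                  # 0x16 = push the next 22 bytes
assert 0x01 <= push_len <= 0x4B          # assumption: a direct push opcode
redeem_script = scriptsig[1:1 + push_len]

# A pushed script of the form OP_0 (0x00) + push-20 (0x14) + 20-byte hash
# is a P2WPKH witness program nested inside P2SH.
is_p2wpkh = (len(redeem_script) == 22
             and redeem_script[0] == 0x00
             and redeem_script[1] == 0x14)

print(push_len)               # 22
print(redeem_script.hex())    # 001415ff0337937ecadd10ce56ffdfd4674817613223
print(is_p2wpkh)              # True
```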
Sequence:
Example: f0ffffff
Size: 4 bytes
Format: Little-Endian
Description: Set whether the transaction can be replaced or when it can be mined.
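As a rough illustration of what the sequence value signals, the sketch below decodes the example and applies two commonly used thresholds: a value below 0xfffffffe signals opt-in replace-by-fee (BIP 125), and a value below 0xffffffff allows the locktime to take effect. The variable names are illustrative.

```python
import struct

(sequence,) = struct.unpack("<L", bytes.fromhex("f0ffffff"))
print(hex(sequence))                     # 0xfffffff0

# BIP 125: a sequence below 0xfffffffe signals opt-in replace-by-fee.
signals_rbf = sequence < 0xFFFFFFFE
# Locktime is ignored only when every input has sequence 0xffffffff.
locktime_enabled = sequence < 0xFFFFFFFF
print(signals_rbf, locktime_enabled)     # True True
```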
Output Count
Example: 02
Size: variable
Format: Compact Size
Description: Indicates the number of outputs.
Output(s)
Amount:
Example: 007daf0100000000
Size: 8 bytes
Format: Little-Endian
Description: The value of the output in satoshis.
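To read the amount as a number, decode the 8 bytes as a little-endian integer; dividing by 100,000,000 converts satoshis to BTC. A minimal sketch, using Decimal to avoid floating-point rounding:

```python
from decimal import Decimal

amount_sat = int.from_bytes(bytes.fromhex("007daf0100000000"), "little")
amount_btc = Decimal(amount_sat) / Decimal(100_000_000)   # 1 BTC = 100,000,000 satoshis

print(amount_sat)    # 28278016
print(amount_btc)    # 0.28278016
```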
ScriptPubKey Size:
Example: 19
Size: variable
Format: Compact Size
Description: The size in bytes of the upcoming ScriptPubKey.
ScriptPubKey:
Example: 76a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac
Size: variable
Format: Script
Description: The locking code for this output.
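The two ScriptPubKeys in the sample transaction follow well-known templates. The sketch below recognizes only those two patterns (P2PKH and P2SH) and reports everything else as unknown; classify_script is a hypothetical helper written for this article, not a library function.

```python
def classify_script(script_hex: str) -> str:
    s = bytes.fromhex(script_hex)
    # P2PKH: OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG
    if len(s) == 25 and s[:3] == b"\x76\xa9\x14" and s[23:] == b"\x88\xac":
        return "p2pkh"
    # P2SH: OP_HASH160 <20 bytes> OP_EQUAL
    if len(s) == 23 and s[:2] == b"\xa9\x14" and s[22:] == b"\x87":
        return "p2sh"
    return "unknown"

print(classify_script("76a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac"))  # p2pkh
print(classify_script("a914423877331b30a905240c7e1f2adee4ebaa47c5f687"))      # p2sh
```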
Witness
Stack Items:
Example: 02
Size: variable
Format: Compact Size
Description: The number of items to be pushed on to the stack as part of the unlocking code.
Field
Size:
Example: 47
Size: variable
Format: Compact Size
Description: The size of the upcoming stack item.
Item:
Example: 304…b01
Size: variable
Format: Bytes
Description: The data to be pushed on to the stack.
Locktime
Example: 00000000
Size: 4 bytes
Format: Little-Endian
Description: Set a time or height after which the transaction can be mined.
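How the locktime number is interpreted depends on its size: 0 means no lock, values below 500,000,000 are block heights, and larger values are Unix timestamps. A minimal sketch of that rule, with a helper name invented for this example:

```python
from datetime import datetime, timezone

def describe_locktime(locktime: int) -> str:
    if locktime == 0:
        return "no lock"
    if locktime < 500_000_000:
        return f"block height {locktime}"
    # 500,000,000 and above is read as seconds since the Unix epoch.
    ts = datetime.fromtimestamp(locktime, tz=timezone.utc)
    return f"time {ts.isoformat()}"

locktime = int.from_bytes(bytes.fromhex("00000000"), "little")
print(describe_locktime(locktime))     # no lock
print(describe_locktime(800_000))      # block height 800000
```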
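Note:
When a transaction is serialized, these fields always appear in the order given above. The Witness field is present only in the segregated witness (SegWit) serialization format, where it sits between the outputs and the locktime and is accompanied by a marker byte (00) and a flag byte (01) placed directly after the version. The sample raw transaction above is shown in the legacy format and contains no witness data.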
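A decoder can tell the legacy and SegWit formats apart by looking at the two bytes that follow the version. The sketch below assumes the marker and flag values 00 and 01 defined for SegWit serialization (BIP 144); the helper name and the SegWit prefix are made up for this example.

```python
def uses_segwit_serialization(raw_tx: bytes) -> bool:
    # After the 4-byte version, SegWit serialization has marker 0x00 and flag 0x01.
    # In the legacy format the input count sits there instead, and it is
    # non-zero for any transaction that has at least one input.
    return len(raw_tx) > 5 and raw_tx[4] == 0x00 and raw_tx[5] == 0x01

legacy_prefix = bytes.fromhex("0100000002b9a2c28e")   # start of the sample transaction
segwit_prefix = bytes.fromhex("010000000001")         # hypothetical SegWit prefix

print(uses_segwit_serialization(legacy_prefix))       # False
print(uses_segwit_serialization(segwit_prefix))       # True
```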
Decoding the Transaction
Below is the Python program to decode the transaction:
import struct


def decode_varint(stream):
    """Decode a compact-size integer; return (value, bytes consumed)."""
    n = stream[0]
    if n < 0xFD:
        return n, 1
    elif n == 0xFD:
        return struct.unpack("<H", stream[1:3])[0], 3
    elif n == 0xFE:
        return struct.unpack("<I", stream[1:5])[0], 5
    elif n == 0xFF:
        return struct.unpack("<Q", stream[1:9])[0], 9


def deserialize_tx(serialized_tx):
    """Parse a legacy (non-witness) serialized transaction into a dictionary."""
    offset = 0

    # Version: 4 bytes, little-endian
    version = struct.unpack("<L", serialized_tx[offset:offset+4])[0]
    offset += 4

    # Input count, then each input
    vin_count, vin_offset = decode_varint(serialized_tx[offset:])
    offset += vin_offset
    vin = []
    for _ in range(vin_count):
        # TXID: 32 bytes, reversed into the byte order used for display
        txid = serialized_tx[offset:offset+32][::-1].hex()
        offset += 32
        # Index of the output being spent: 4 bytes, little-endian
        vout = struct.unpack("<L", serialized_tx[offset:offset+4])[0]
        offset += 4
        # ScriptSig: compact-size length followed by the script bytes
        script_length, script_offset = decode_varint(serialized_tx[offset:])
        offset += script_offset
        scriptsig = serialized_tx[offset:offset+script_length].hex()
        offset += script_length
        # Sequence: 4 bytes, little-endian
        sequence = struct.unpack("<L", serialized_tx[offset:offset+4])[0]
        offset += 4
        vin.append({"txid": txid, "vout": vout, "scriptsig": scriptsig,
                    "sequence": sequence})

    # Output count, then each output (from here on, vout is the list of outputs)
    vout_count, vout_offset = decode_varint(serialized_tx[offset:])
    offset += vout_offset
    vout = []
    for _ in range(vout_count):
        # Amount in satoshis: 8 bytes, little-endian
        value = struct.unpack("<Q", serialized_tx[offset:offset+8])[0]
        offset += 8
        # ScriptPubKey: compact-size length followed by the script bytes
        script_length, script_offset = decode_varint(serialized_tx[offset:])
        offset += script_offset
        scriptpubkey = serialized_tx[offset:offset+script_length].hex()
        offset += script_length
        vout.append({"value": value, "scriptpubkey": scriptpubkey})

    # Locktime: 4 bytes, little-endian
    locktime = struct.unpack("<L", serialized_tx[offset:offset+4])[0]

    return {
        "version": version,
        "vin": vin,
        "vout": vout,
        "locktime": locktime
    }


serialized_tx = "0100000002b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b000000001716001415ff0337937ecadd10ce56ffdfd4674817613223f0ffffff8415e7099fba6d81474912d22eb5113bcdcfe45a4a1ae0c2701549a46326384f010000001716001415ff0337937ecadd10ce56ffdfd4674817613223f0ffffff02007daf01000000001976a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac2073f6010000000017a914423877331b30a905240c7e1f2adee4ebaa47c5f68700000000"
transaction = deserialize_tx(bytes.fromhex(serialized_tx))

print("Version:", transaction["version"])
print("Vin:")
for inp in transaction["vin"]:
    print(" TXID:", inp["txid"])
    print(" Vout:", inp["vout"])
    print(" ScriptSig:", inp["scriptsig"])
    print(" Sequence:", inp["sequence"])
print("Vout:")
for out in transaction["vout"]:
    print(" Value:", out["value"])
    print(" ScriptPubKey:", out["scriptpubkey"])
print("Locktime:", transaction["locktime"])
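Output:
Version: 1
Vin:
 TXID: 3b239ad258a308129049df85c03f97bd4f57b99825174ad2a50509a08ec2a2b9
 Vout: 0
 ScriptSig: 16001415ff0337937ecadd10ce56ffdfd4674817613223
 Sequence: 4294967280
 TXID: 4f382663a4491570c2e01a4a5ae4cfcd3b11b52ed2124947816dba9f09e71584
 Vout: 1
 ScriptSig: 16001415ff0337937ecadd10ce56ffdfd4674817613223
 Sequence: 4294967280
Vout:
 Value: 28278016
 ScriptPubKey: 76a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac
 Value: 32928544
 ScriptPubKey: a914423877331b30a905240c7e1f2adee4ebaa47c5f687
Locktime: 0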
Detailed Explanation
1. decode_varint(stream) Function
- This function decodes a variable-length integer from the byte stream.
- It reads the first byte to determine the format of the integer:
  - If the byte is less than 0xFD, it’s a single-byte integer.
  - If the byte is 0xFD, the next 2 bytes represent the integer.
  - If the byte is 0xFE, the next 4 bytes represent the integer.
  - If the byte is 0xFF, the next 8 bytes represent the integer.
- The function returns the decoded integer and the number of bytes consumed.
2. deserialize_tx(serialized_tx) Function
- This function takes the raw byte format of a Bitcoin transaction and deserializes it into a dictionary.
- It begins by setting the offset to 0.
- It reads the transaction version (4 bytes, little-endian) from the byte stream.
- It decodes the number of transaction inputs (vin_count) and advances the offset accordingly.
- For each input (vin), it:
  - Reads the transaction ID (txid, 32 bytes, reversed) and converts it to hexadecimal.
  - Reads the output index (vout, 4 bytes, little-endian).
  - Decodes the length of the scriptSig (script_length) and reads the scriptSig.
  - Reads the sequence number (4 bytes, little-endian).
  - Appends these values to the vin list.
- It decodes the number of transaction outputs (vout_count) and advances the offset accordingly.
- For each output (vout), it:
  - Reads the output value (value, 8 bytes, little-endian).
  - Decodes the length of the scriptPubKey (script_length) and reads the scriptPubKey.
  - Appends these values to the vout list.
- Finally, it reads the transaction locktime (4 bytes, little-endian).
- It returns a dictionary containing the version, list of inputs (vin), list of outputs (vout), and locktime.
3. Deserialization of Serialized Transaction
- The serialized transaction is provided as a hexadecimal string.
- The deserialize_tx function is called with the byte format of the serialized transaction.
- The function returns a dictionary representing the deserialized transaction.
- The deserialized transaction is then printed, including its version, inputs (vin), outputs (vout), and locktime.
This process allows you to convert a raw Bitcoin transaction into a structured format for analysis or further processing.
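The manual offset bookkeeping described above can also be expressed with a stream object that keeps track of the read position itself. The following is only a sketch of that alternative for a single input, using io.BytesIO from the standard library; read_input is a hypothetical helper, and it assumes the scriptSig is shorter than 253 bytes so that its length fits in one byte.

```python
import io
import struct

def read_input(stream: io.BytesIO) -> dict:
    txid = stream.read(32)[::-1].hex()            # reversed for display
    (vout,) = struct.unpack("<L", stream.read(4))
    script_len = stream.read(1)[0]                # assumption: single-byte length
    scriptsig = stream.read(script_len).hex()
    (sequence,) = struct.unpack("<L", stream.read(4))
    return {"txid": txid, "vout": vout, "scriptsig": scriptsig, "sequence": sequence}

# First input of the sample transaction, field by field.
first_input = bytes.fromhex(
    "b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b"
    "00000000"
    "17" "16001415ff0337937ecadd10ce56ffdfd4674817613223"
    "f0ffffff"
)
stream = io.BytesIO(first_input)
parsed = read_input(stream)
print(parsed["vout"], parsed["sequence"])   # 0 4294967280
print(stream.read() == b"")                 # True: every byte was consumed
```

Each read call advances the position, so there is no separate offset variable to keep in sync.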
Verification
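To verify the output, serialize the decoded fields again and compare the result with the original raw transaction. Put the fields you obtained into a dictionary in the format used by the program, run it, and check that the serialized transaction it prints matches the one you started with.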
Below is the Python program to verify the serialized transaction:
import hashlib
import struct


def encode_varint(n):
    """Encode an integer in the compact-size format."""
    if n < 0xFD:
        return struct.pack("<B", n)
    elif n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    elif n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    else:
        return b"\xff" + struct.pack("<Q", n)


def serialize_tx(transaction):
    """Serialize a transaction dictionary in the legacy (non-witness) format."""
    serialized = b""
    # Version
    serialized += struct.pack("<L", transaction["version"])
    # Input count
    serialized += encode_varint(len(transaction["vin"]))
    for inp in transaction["vin"]:
        # TXID: reverse the displayed form back to the byte order used on the wire
        txid_bytes = bytes.fromhex(inp["txid"])[::-1]
        serialized += txid_bytes
        # Output index
        serialized += struct.pack("<L", inp["vout"])
        # ScriptSig length, then ScriptSig
        scriptsig_bytes = bytes.fromhex(inp.get("scriptsig", ""))
        serialized += encode_varint(len(scriptsig_bytes))
        serialized += scriptsig_bytes
        # Sequence
        serialized += struct.pack("<L", inp["sequence"])
    # Output count
    serialized += encode_varint(len(transaction["vout"]))
    for out in transaction["vout"]:
        # Output value
        serialized += struct.pack("<Q", out["value"])
        # ScriptPubKey length, then ScriptPubKey
        scriptpubkey_bytes = bytes.fromhex(out["scriptpubkey"])
        serialized += encode_varint(len(scriptpubkey_bytes))
        serialized += scriptpubkey_bytes
    # Locktime
    serialized += struct.pack("<L", transaction["locktime"])
    return serialized


# Example transaction (only version, locktime, txid, vout, scriptsig,
# sequence, value and scriptpubkey are used by serialize_tx)
transaction = {
    "version": 1,
    "locktime": 0,
    "vin": [
        {
            "txid": "3b239ad258a308129049df85c03f97bd4f57b99825174ad2a50509a08ec2a2b9",
            "vout": 0,
            "prevout": {
                "scriptpubkey": "a914423877331b30a905240c7e1f2adee4ebaa47c5f687",
                "scriptpubkey_asm": "OP_HASH160 OP_PUSHBYTES_20 423877331b30a905240c7e1f2adee4ebaa47c5f6 OP_EQUAL",
                "scriptpubkey_type": "p2sh",
                "scriptpubkey_address": "37jAAWEdJ9D9mXybRobcveioxSkt7Lkwog",
                "value": 2504928
            },
            "scriptsig": "16001415ff0337937ecadd10ce56ffdfd4674817613223",
            "scriptsig_asm": "OP_PUSHBYTES_22 001415ff0337937ecadd10ce56ffdfd4674817613223",
            "witness": [
                "3044022037656a38eff538cb3ccdcd4f47ca80118bcfd60414363b7bc08b1469adedece902206f182c48452cf2b6e6897b03354ed32d0357fc3607f5089b509fef17c498a5cd01",
                "035658f6dd92339165f76caff02f63316433c4b68a247d40b1b323fb1690279e42"
            ],
            "is_coinbase": False,
            "sequence": 4294967280,
            "inner_redeemscript_asm": "OP_0 OP_PUSHBYTES_20 15ff0337937ecadd10ce56ffdfd4674817613223"
        },
        {
            "txid": "4f382663a4491570c2e01a4a5ae4cfcd3b11b52ed2124947816dba9f09e71584",
            "vout": 1,
            "prevout": {
                "scriptpubkey": "a914423877331b30a905240c7e1f2adee4ebaa47c5f687",
                "scriptpubkey_asm": "OP_HASH160 OP_PUSHBYTES_20 423877331b30a905240c7e1f2adee4ebaa47c5f6 OP_EQUAL",
                "scriptpubkey_type": "p2sh",
                "scriptpubkey_address": "37jAAWEdJ9D9mXybRobcveioxSkt7Lkwog",
                "value": 58711992
            },
            "scriptsig": "16001415ff0337937ecadd10ce56ffdfd4674817613223",
            "scriptsig_asm": "OP_PUSHBYTES_22 001415ff0337937ecadd10ce56ffdfd4674817613223",
            "witness": [
                "3044022040a303cd51c50bdf296e89661f8bb3ea411d639c5149cab461a63855d5bf97c502207ac0ce4133e22daba97abc5d65f219c49cea1959247ec398c9c6eaceb4fcd9d801",
                "035658f6dd92339165f76caff02f63316433c4b68a247d40b1b323fb1690279e42"
            ],
            "is_coinbase": False,
            "sequence": 4294967280,
            "inner_redeemscript_asm": "OP_0 OP_PUSHBYTES_20 15ff0337937ecadd10ce56ffdfd4674817613223"
        }
    ],
    "vout": [
        {
            "scriptpubkey": "76a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac",
            "scriptpubkey_asm": "OP_DUP OP_HASH160 OP_PUSHBYTES_20 71a3d2f54b0917dc9d2c877b2861ac52967dec7f OP_EQUALVERIFY OP_CHECKSIG",
            "scriptpubkey_type": "p2pkh",
            "scriptpubkey_address": "1BMscNZbFKdUDYi3bnF5XEmkWT3WPmRBDJ",
            "value": 28278016
        },
        {
            "scriptpubkey": "a914423877331b30a905240c7e1f2adee4ebaa47c5f687",
            "scriptpubkey_asm": "OP_HASH160 OP_PUSHBYTES_20 423877331b30a905240c7e1f2adee4ebaa47c5f6 OP_EQUAL",
            "scriptpubkey_type": "p2sh",
            "scriptpubkey_address": "37jAAWEdJ9D9mXybRobcveioxSkt7Lkwog",
            "value": 32928544
        }
    ]
}

serialized = serialize_tx(transaction)
print("Serialized Transaction : ", serialized.hex())
# TXID: double SHA-256 of the serialized bytes, shown in reversed byte order
txid = hashlib.sha256(hashlib.sha256(serialized).digest()).digest()[::-1].hex()
print("TXID of transaction : ", txid)
Output:
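The serialized transaction printed by the program is the same one that we deserialized at the beginning. The second line is its TXID, the double SHA-256 hash of the serialized bytes shown in reversed byte order.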
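The example dictionary also carries the value of each spent output under "prevout". That information is not part of the serialized transaction, but when it is available the miner fee follows from simple arithmetic: total input value minus total output value. A minimal sketch with the numbers copied from the dictionary above:

```python
# Values in satoshis, copied from the example transaction dictionary.
input_values = [2504928, 58711992]      # "prevout" values of the two inputs
output_values = [28278016, 32928544]    # "value" of the two outputs

fee = sum(input_values) - sum(output_values)
print(fee)   # 10360
```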
Conclusion
Decoding input data from a transaction involves several steps: understanding how transactions are serialized, learning the format of each field in the serialized transaction, decoding the transaction field by field, and verifying the result by serializing the decoded fields again and comparing them with the original bytes.
FAQs related to How to Decode Input Data from a Transaction?
1. What is input data in a transaction?
On smart-contract platforms such as Ethereum, input data refers to the data included in the transaction, such as function calls, parameters, and other instructions for smart contracts that tell the blockchain what to do with the transaction.
2. List some tools that can be used to decode input data from a transaction.
Tools that can be used to decode input data from an Ethereum transaction include Etherscan, Web3.js, Ethers.js, and ABI Decoder.
3. Why is it important to decode the input data from transactions?
Decoding input data from transactions helps to verify the purpose and authenticity of transactions, debug and develop smart contracts, and monitor and analyze blockchain activity.