How to Decode Input Data from a Transaction?

When transactions are transmitted over the network or exchanged between applications, they are serialized. Serialization is most commonly used to encode data structures for transmission over a network or for storage in a file. The serialization format of each transaction field is described in the section Format of Various Fields in Serialized Transaction. This article walks through the steps to decode the input data of a raw Bitcoin transaction.

Table of Contents

  • What is Serialization?
  • Raw Bitcoin Transaction
  • Format of Various Fields in Serialized Transaction
  • Decoding the Transaction
  • Detailed Explanation
  • Verification
  • Conclusion

What is Serialization?

Serialization is the process of converting the internal representation of a data structure into a format that can be transmitted one byte at a time, also known as a byte stream.

  1. The process of converting from the byte-stream representation of a transaction to a library’s internal representation data structure is called deserialization or transaction parsing.
  2. The process of converting back to a byte stream for transmission over the network, for hashing, or storage on disk is called serialization.
  3. Most Bitcoin libraries have built-in functions for transaction serialization and deserialization.
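
As a minimal sketch of that round trip (not taken from any particular Bitcoin library), the snippet below packs two 4-byte fields into a byte stream with Python's struct module and parses them back. The helper names serialize_header and deserialize_header, and the choice of fields, are illustrative assumptions.

```python
import struct

def serialize_header(fields):
    """Pack two unsigned 32-bit integers as little-endian bytes."""
    return struct.pack("<LL", fields["version"], fields["locktime"])

def deserialize_header(stream):
    """Parse the 8-byte stream back into a dictionary."""
    version, locktime = struct.unpack("<LL", stream)
    return {"version": version, "locktime": locktime}

original = {"version": 1, "locktime": 0}
stream = serialize_header(original)
print(stream.hex())                            # 0100000000000000
print(deserialize_header(stream) == original)  # True
```

Deserializing the serialized bytes yields the original structure again, which is the property the verification step later in this article relies on.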

Raw Bitcoin Transaction

Below is a sample raw Bitcoin transaction:

0100000002b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b000000001716001415ff0337937ecadd10ce56ffdfd4674817613223f0ffffff8415e7099fba6d81474912d22eb5113bcdcfe45a4a1ae0c2701549a46326384f010000001716001415ff0337937ecadd10ce56ffdfd4674817613223f0ffffff02007daf01000000001976a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac2073f6010000000017a914423877331b30a905240c7e1f2adee4ebaa47c5f68700000000

When split into individual fields, it looks like this:

{
  "version": "01000000",
  "inputcount": "02",
  "inputs": [
    {
      "txid": "b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b",
      "vout": "00000000",
      "scriptsigsize": "17",
      "scriptsig": "16001415ff0337937ecadd10ce56ffdfd4674817613223",
      "sequence": "f0ffffff"
    },
    {
      "txid": "8415e7099fba6d81474912d22eb5113bcdcfe45a4a1ae0c2701549a46326384f",
      "vout": "01000000",
      "scriptsigsize": "17",
      "scriptsig": "16001415ff0337937ecadd10ce56ffdfd4674817613223",
      "sequence": "f0ffffff"
    }
  ],
  "outputcount": "02",
  "outputs": [
    {
      "amount": "007daf0100000000",
      "scriptpubkeysize": "19",
      "scriptpubkey": "76a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac"
    },
    {
      "amount": "2073f60100000000",
      "scriptpubkeysize": "17",
      "scriptpubkey": "a914423877331b30a905240c7e1f2adee4ebaa47c5f687"
    }
  ],
  "locktime": "00000000"
}

Format of Various Fields in Serialized Transaction

Version

Example: 01000000

Size: 4 bytes

Format: Little-Endian

Description: The version number for the transaction. Used to enable new features.

Input Count

Example: 02

Size: variable

Format: Compact Size

Description: Indicates the number of inputs.
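
To make the Compact Size format concrete, here is a small sketch of how a count could be encoded. It uses the same thresholds as the decode_varint function in this article; the helper name encode_compact_size is an assumption made for illustration.

```python
def encode_compact_size(n):
    """Encode a non-negative integer as a Compact Size field."""
    if n < 0xFD:
        return n.to_bytes(1, "little")            # value fits in the first byte
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")  # prefix fd + 2 bytes
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")  # prefix fe + 4 bytes
    return b"\xff" + n.to_bytes(8, "little")      # prefix ff + 8 bytes

for count in (2, 252, 253, 70000):
    print(count, "->", encode_compact_size(count).hex())
# 2 -> 02
# 252 -> fc
# 253 -> fdfd00
# 70000 -> fe70110100
```

Small counts such as the 02 in the example cost a single byte, which is why the field size is listed as variable.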

Input(s)

TXID:

Example: b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b

Size: 32 bytes

Format: Natural Byte Order

Description: The TXID of the transaction containing the output you want to spend.
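
Inside a raw transaction, the TXID bytes appear in the reverse of the order that block explorers display. A minimal sketch of the conversion, using the example above (the variable names are illustrative):

```python
serialized_txid = "b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b"

# Reverse byte by byte (not character by character) to get the display form.
display_txid = bytes.fromhex(serialized_txid)[::-1].hex()
print(display_txid)
# 3b239ad258a308129049df85c03f97bd4f57b99825174ad2a50509a08ec2a2b9
```

The reversed value is the one that appears in the "txid" field of the example transaction dictionary in the Verification section.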

VOUT:

Example: 00000000

Size: 4 bytes

Format: Little-Endian

Description: The index number of the output you want to spend.

ScriptSig Size:

Example: 17

Size: variable

Format: Compact Size

Description: The size in bytes of the upcoming ScriptSig.

ScriptSig:

Example: 16001415ff0337937ecadd10ce56ffdfd4674817613223

Size: variable

Format: Script

Description: The unlocking code for the output you want to spend.
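
As a rough sketch of what this particular ScriptSig contains, the snippet below splits the example into its single data push and then into the two parts of the pushed script. It only handles this simple shape (one direct push of at most 75 bytes) and is not a general Script parser.

```python
scriptsig = bytes.fromhex("16001415ff0337937ecadd10ce56ffdfd4674817613223")

# Opcodes 0x01..0x4b push that many of the following bytes onto the stack.
push_len = scriptsig[0]
pushed = scriptsig[1:1 + push_len]

# The pushed data is itself a small script: OP_0 followed by a 20-byte push.
first_opcode = pushed[0]
inner_len = pushed[1]
inner_data = pushed[2:2 + inner_len]

print(push_len)          # 22
print(first_opcode)      # 0
print(inner_data.hex())  # 15ff0337937ecadd10ce56ffdfd4674817613223
```

This matches the assembly shown for the same input later in the article: OP_PUSHBYTES_22 on the outside, and OP_0 OP_PUSHBYTES_20 inside the pushed script.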

Sequence:

Example: f0ffffff

Size: 4 bytes

Format: Little-Endian

Description: Set whether the transaction can be replaced or when it can be mined.
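
The sketch below decodes the example sequence value and applies two commonly used checks. The thresholds are taken from BIP 125 (opt-in replace-by-fee) and BIP 68 (relative lock-time) as understood here; treat it as an illustration rather than a full policy check.

```python
sequence = int.from_bytes(bytes.fromhex("f0ffffff"), "little")
print(sequence)       # 4294967280
print(hex(sequence))  # 0xfffffff0

# BIP 125: a sequence below 0xfffffffe signals opt-in replaceability.
signals_rbf = sequence < 0xFFFFFFFE

# BIP 68: when bit 31 is set, the field carries no relative lock-time.
relative_locktime_disabled = bool(sequence & (1 << 31))

print(signals_rbf, relative_locktime_disabled)  # True True
```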

Output Count

Example: 02

Size: variable

Format: Compact Size

Description: Indicates the number of outputs.

Output(s)

Amount:

Example: 007daf0100000000

Size: 8 bytes

Format: Little-Endian

Description: The value of the output in satoshis.
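
To illustrate the little-endian amount field, the following sketch converts both example amounts to satoshis and to BTC. The helper name decode_amount is an assumption for illustration.

```python
SATOSHIS_PER_BTC = 100_000_000

def decode_amount(hex_field):
    """Interpret an 8-byte little-endian hex field as satoshis."""
    return int.from_bytes(bytes.fromhex(hex_field), "little")

for field in ("007daf0100000000", "2073f60100000000"):
    satoshis = decode_amount(field)
    print(satoshis, f"{satoshis / SATOSHIS_PER_BTC:.8f} BTC")
# 28278016 0.28278016 BTC
# 32928544 0.32928544 BTC
```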

ScriptPubKey Size:

Example: 19

Size: variable

Format: Compact Size

Description: The size in bytes of the upcoming ScriptPubKey.

ScriptPubKey:

Example: 76a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac

Size: variable

Format: Script

Description: The locking code for this output.
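
The two ScriptPubKey values in the sample transaction follow well-known templates. A minimal sketch that recognizes just these two patterns by their bytes (the function name classify_scriptpubkey is hypothetical, and other script types are reported as unknown):

```python
def classify_scriptpubkey(hex_script):
    """Recognize two common locking-script templates by their byte pattern."""
    s = bytes.fromhex(hex_script)
    # P2PKH: OP_DUP OP_HASH160 <20 bytes> OP_EQUALVERIFY OP_CHECKSIG
    if len(s) == 25 and s[:3] == b"\x76\xa9\x14" and s[23:] == b"\x88\xac":
        return "p2pkh"
    # P2SH: OP_HASH160 <20 bytes> OP_EQUAL
    if len(s) == 23 and s[:2] == b"\xa9\x14" and s[22:] == b"\x87":
        return "p2sh"
    return "unknown"

print(classify_scriptpubkey("76a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac"))  # p2pkh
print(classify_scriptpubkey("a914423877331b30a905240c7e1f2adee4ebaa47c5f687"))      # p2sh
```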

Witness

Stack Items:

Example: 02

Size: variable

Format: Compact Size

Description: The number of items to be pushed on to the stack as part of the unlocking code.

Field

Size:

Example: 47

Size: variable

Format: Compact Size

Description: The size of the upcoming stack item.

Item:

Example: 304…b01

Size: variable

Format: Bytes

Description: The data to be pushed on to the stack.

Locktime

Example: 00000000

Size: 4 bytes

Format: Little-Endian

Description: Set a time or height after which the transaction can be mined.
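
Whether a locktime value means a block height or a time depends on its size: values below 500,000,000 are read as block heights, and larger values as Unix timestamps. A small sketch of that rule follows; describe_locktime is a hypothetical helper, and the second and third inputs are made-up examples rather than values from the sample transaction.

```python
LOCKTIME_THRESHOLD = 500_000_000

def describe_locktime(hex_field):
    """Explain a 4-byte little-endian locktime field."""
    locktime = int.from_bytes(bytes.fromhex(hex_field), "little")
    if locktime == 0:
        return "no lock"
    if locktime < LOCKTIME_THRESHOLD:
        return f"block height {locktime}"
    return f"unix time {locktime}"

print(describe_locktime("00000000"))  # no lock
print(describe_locktime("40420f00"))  # block height 1000000
print(describe_locktime("0065cd1d"))  # unix time 500000000
```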

Note:

When a transaction is serialized, these fields always appear in the order listed above. The Witness fields are present only in SegWit transactions.
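
In the SegWit serialization defined by BIP 144, a marker byte (00) and a flag byte (01) follow the version, which is how a parser can tell whether witness data is present. The sample transaction in this article is given without them, so it contains no Witness fields. The check below is a sketch; the function name and the second prefix are illustrative assumptions.

```python
def uses_segwit_serialization(raw_hex):
    """Return True if the two bytes after the version are marker 0x00 and flag 0x01."""
    raw = bytes.fromhex(raw_hex)
    return len(raw) > 5 and raw[4] == 0x00 and raw[5] == 0x01

legacy_prefix = "0100000002b9a2"  # start of the sample transaction
segwit_prefix = "010000000001"    # hypothetical: version, marker, flag

print(uses_segwit_serialization(legacy_prefix))  # False
print(uses_segwit_serialization(segwit_prefix))  # True
```

A zero in the position where the input count normally sits cannot be a valid count, which is why it works as a marker.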

Decoding the Transaction

Below is the Python program to decode the transaction:

Python
import struct

def decode_varint(stream):
    n = stream[0]
    if n < 0xFD:
        return n, 1
    elif n == 0xFD:
        return struct.unpack("<H", stream[1:3])[0], 3
    elif n == 0xFE:
        return struct.unpack("<I", stream[1:5])[0], 5
    elif n == 0xFF:
        return struct.unpack("<Q", stream[1:9])[0], 9

def deserialize_tx(serialized_tx):
    offset = 0
    
    version = struct.unpack("<L", serialized_tx[offset:offset+4])[0]
    offset += 4
    
    vin_count, vin_offset = decode_varint(serialized_tx[offset:])
    offset += vin_offset
    vin = []
    for _ in range(vin_count):
        txid = serialized_tx[offset:offset+32][::-1].hex()
        offset += 32
        vout = struct.unpack("<L", serialized_tx[offset:offset+4])[0]
        offset += 4
        script_length, script_offset = decode_varint(serialized_tx[offset:])
        offset += script_offset
        scriptsig = serialized_tx[offset:offset+script_length].hex()
        offset += script_length
        sequence = struct.unpack("<L", serialized_tx[offset:offset+4])[0]
        offset += 4
        vin.append({"txid": txid, "vout": vout, "scriptsig": scriptsig, 
                    "sequence": sequence})
    
    vout_count, vout_offset = decode_varint(serialized_tx[offset:])
    offset += vout_offset
    vout = []
    for _ in range(vout_count):
        value = struct.unpack("<Q", serialized_tx[offset:offset+8])[0]
        offset += 8
        script_length, script_offset = decode_varint(serialized_tx[offset:])
        offset += script_offset
        scriptpubkey = serialized_tx[offset:offset+script_length].hex()
        offset += script_length
        vout.append({"value": value, "scriptpubkey": scriptpubkey})
    
    locktime = struct.unpack("<L", serialized_tx[offset:offset+4])[0]
    
    return {
        "version": version,
        "vin": vin,
        "vout": vout,
        "locktime": locktime
    }

serialized_tx = "0100000002b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b000000001716001415ff0337937ecadd10ce56ffdfd4674817613223f0ffffff8415e7099fba6d81474912d22eb5113bcdcfe45a4a1ae0c2701549a46326384f010000001716001415ff0337937ecadd10ce56ffdfd4674817613223f0ffffff02007daf01000000001976a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac2073f6010000000017a914423877331b30a905240c7e1f2adee4ebaa47c5f68700000000"
transaction = deserialize_tx(bytes.fromhex(serialized_tx))
print("Version:", transaction["version"])
print("Vin:")
for inp in transaction["vin"]:
    print("  TXID:", inp["txid"])
    print("  Vout:", inp["vout"])
    print("  ScriptSig:", inp["scriptsig"])
    print("  Sequence:", inp["sequence"])
print("Vout:")
for out in transaction["vout"]:
    print("  Value:", out["value"])
    print("  ScriptPubKey:", out["scriptpubkey"])
print("Locktime:", transaction["locktime"])

Output:

Decoded Transaction

Detailed Explanation

1. decode_varint(stream) Function

  • This function decodes a variable-length integer from the byte stream.
  • It reads the first byte to determine the format of the integer:
    • If the byte is less than 0xFD, it’s a single-byte integer.
    • If the byte is 0xFD, the next 2 bytes represent the integer.
    • If the byte is 0xFE, the next 4 bytes represent the integer.
    • If the byte is 0xFF, the next 8 bytes represent the integer.
  • The function returns the decoded integer and the number of bytes consumed.
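
The branches described above can be exercised on a few sample byte strings. This is a compact, self-contained variant written for illustration; the name read_compact_size is an assumption, and it returns the same (value, bytes consumed) pair as decode_varint.

```python
def read_compact_size(stream):
    """Return (value, bytes_consumed) for a Compact Size field at the start of stream."""
    first = stream[0]
    if first < 0xFD:
        return first, 1
    # The prefix byte selects how many following bytes hold the value.
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    return int.from_bytes(stream[1:1 + width], "little"), 1 + width

print(read_compact_size(bytes.fromhex("17")))          # (23, 1)
print(read_compact_size(bytes.fromhex("fd2c01")))      # (300, 3)
print(read_compact_size(bytes.fromhex("fe70110100")))  # (70000, 5)
```

The first case is the ScriptSig size from the sample transaction: hex 17 is 23 bytes.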

2. deserialize_tx(serialized_tx) Function

  • This function takes the raw byte format of a Bitcoin transaction and deserializes it into a dictionary.
  • It begins by setting the offset to 0.
  • It reads the transaction version (4 bytes, little-endian) from the byte stream.
  • It decodes the number of transaction inputs (vin_count) and advances the offset accordingly.
  • For each input (vin), it:
    • Reads the transaction ID (txid, 32 bytes, reversed) and converts it to hexadecimal.
    • Reads the output index (vout, 4 bytes, little-endian).
    • Decodes the length of the scriptSig (script_length) and reads the scriptSig.
    • Reads the sequence number (4 bytes, little-endian).
    • Appends these values to the vin list.
  • It decodes the number of transaction outputs (vout_count) and advances the offset accordingly.
  • For each output (vout), it:
    • Reads the output value (value, 8 bytes, little-endian).
    • Decodes the length of the scriptPubKey (script_length) and reads the scriptPubKey.
    • Appends these values to the vout list.
  • Finally, it reads the transaction locktime (4 bytes, little-endian).
  • It returns a dictionary containing the version, list of inputs (vin), list of outputs (vout), and locktime.
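
The per-input steps listed above can also be written against a stream object, so the read position is tracked automatically instead of through a manual offset. This is an alternative sketch, not the article's implementation: read_input is a hypothetical helper, and it assumes the ScriptSig size fits in a single Compact Size byte.

```python
import io
import struct

def read_input(stream):
    """Read one transaction input from a binary stream positioned at its start."""
    txid = stream.read(32)[::-1].hex()
    vout = struct.unpack("<L", stream.read(4))[0]
    script_len = stream.read(1)[0]  # assumes a one-byte Compact Size (< 0xFD)
    scriptsig = stream.read(script_len).hex()
    sequence = struct.unpack("<L", stream.read(4))[0]
    return {"txid": txid, "vout": vout, "scriptsig": scriptsig, "sequence": sequence}

# First input of the sample transaction, field by field.
first_input_hex = (
    "b9a2c28ea00905a5d24a172598b9574fbd973fc085df49901208a358d29a233b"
    "00000000"
    "17" "16001415ff0337937ecadd10ce56ffdfd4674817613223"
    "f0ffffff"
)
stream = io.BytesIO(bytes.fromhex(first_input_hex))
first_input = read_input(stream)
print(first_input["vout"], first_input["sequence"])  # 0 4294967280
print(stream.read() == b"")                          # True, every byte was consumed
```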

3. Deserialization of Serialized Transaction

  • The serialized transaction is provided as a hexadecimal string.
  • The deserialize_tx function is called with the byte format of the serialized transaction.
  • The function returns a dictionary representing the deserialized transaction.
  • The deserialized transaction is then printed, including its version, inputs (vin), outputs (vout), and locktime.

This process allows you to convert a raw Bitcoin transaction into a structured format for analysis or further processing.

Verification

To verify the output, re-serialize the decoded fields with the following code and check that the result matches the original raw transaction.

Below is the Python program to verify the serialized transaction:

Python
import hashlib
import struct

def serialize_tx(transaction):
    serialized = b""
    
    # Version
    serialized += struct.pack("<L", transaction["version"])  

    # Input count
    serialized += encode_varint(len(transaction["vin"]))  
    for inp in transaction["vin"]:
      
        # Reverse the displayed TXID back into its serialized byte order
        txid_bytes = bytes.fromhex(inp["txid"])[::-1]
        serialized += txid_bytes
        
        # Output index
        serialized += struct.pack("<L", inp["vout"])  
        
        # ScriptSig
        scriptsig_bytes = bytes.fromhex(inp.get("scriptsig", ""))
        
        # Script length
        serialized += encode_varint(len(scriptsig_bytes))  
        serialized += scriptsig_bytes
        
        # Sequence
        serialized += struct.pack("<L", inp["sequence"])  

    # Output count    
    serialized += encode_varint(len(transaction["vout"]))  
    for out in transaction["vout"]:
      
        # Output value
        serialized += struct.pack("<Q", out["value"])  
        
        # ScriptPubKey
        scriptpubkey_bytes = bytes.fromhex(out["scriptpubkey"])
        
        # ScriptPubKey size
        serialized += encode_varint(len(scriptpubkey_bytes))  
        serialized += scriptpubkey_bytes

    # Locktime
    serialized += struct.pack("<L", transaction["locktime"])  

    return serialized


def encode_varint(n):
    if n < 0xFD:
        return struct.pack("<B", n)
    elif n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    elif n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    else:
        return b"\xff" + struct.pack("<Q", n)

# Example transaction
transaction = {
  "version": 1,
  "locktime": 0,
  "vin": [
    {
      "txid": "3b239ad258a308129049df85c03f97bd4f57b99825174ad2a50509a08ec2a2b9",
      "vout": 0,
      "prevout": {
        "scriptpubkey": "a914423877331b30a905240c7e1f2adee4ebaa47c5f687",
        "scriptpubkey_asm": "OP_HASH160 OP_PUSHBYTES_20 423877331b30a905240c7e1f2adee4ebaa47c5f6 OP_EQUAL",
        "scriptpubkey_type": "p2sh",
        "scriptpubkey_address": "37jAAWEdJ9D9mXybRobcveioxSkt7Lkwog",
        "value": 2504928
      },
      "scriptsig": "16001415ff0337937ecadd10ce56ffdfd4674817613223",
      "scriptsig_asm": "OP_PUSHBYTES_22 001415ff0337937ecadd10ce56ffdfd4674817613223",
      "witness": [
        "3044022037656a38eff538cb3ccdcd4f47ca80118bcfd60414363b7bc08b1469adedece902206f182c48452cf2b6e6897b03354ed32d0357fc3607f5089b509fef17c498a5cd01",
        "035658f6dd92339165f76caff02f63316433c4b68a247d40b1b323fb1690279e42"
      ],
      "is_coinbase": False,
      "sequence": 4294967280,
      "inner_redeemscript_asm": "OP_0 OP_PUSHBYTES_20 15ff0337937ecadd10ce56ffdfd4674817613223"
    },
    {
      "txid": "4f382663a4491570c2e01a4a5ae4cfcd3b11b52ed2124947816dba9f09e71584",
      "vout": 1,
      "prevout": {
        "scriptpubkey": "a914423877331b30a905240c7e1f2adee4ebaa47c5f687",
        "scriptpubkey_asm": "OP_HASH160 OP_PUSHBYTES_20 423877331b30a905240c7e1f2adee4ebaa47c5f6 OP_EQUAL",
        "scriptpubkey_type": "p2sh",
        "scriptpubkey_address": "37jAAWEdJ9D9mXybRobcveioxSkt7Lkwog",
        "value": 58711992
      },
      "scriptsig": "16001415ff0337937ecadd10ce56ffdfd4674817613223",
      "scriptsig_asm": "OP_PUSHBYTES_22 001415ff0337937ecadd10ce56ffdfd4674817613223",
      "witness": [
        "3044022040a303cd51c50bdf296e89661f8bb3ea411d639c5149cab461a63855d5bf97c502207ac0ce4133e22daba97abc5d65f219c49cea1959247ec398c9c6eaceb4fcd9d801",
        "035658f6dd92339165f76caff02f63316433c4b68a247d40b1b323fb1690279e42"
      ],
      "is_coinbase": False,
      "sequence": 4294967280,
      "inner_redeemscript_asm": "OP_0 OP_PUSHBYTES_20 15ff0337937ecadd10ce56ffdfd4674817613223"
    }
  ],
  "vout": [
    {
      "scriptpubkey": "76a91471a3d2f54b0917dc9d2c877b2861ac52967dec7f88ac",
      "scriptpubkey_asm": "OP_DUP OP_HASH160 OP_PUSHBYTES_20 71a3d2f54b0917dc9d2c877b2861ac52967dec7f OP_EQUALVERIFY OP_CHECKSIG",
      "scriptpubkey_type": "p2pkh",
      "scriptpubkey_address": "1BMscNZbFKdUDYi3bnF5XEmkWT3WPmRBDJ",
      "value": 28278016
    },
    {
      "scriptpubkey": "a914423877331b30a905240c7e1f2adee4ebaa47c5f687",
      "scriptpubkey_asm": "OP_HASH160 OP_PUSHBYTES_20 423877331b30a905240c7e1f2adee4ebaa47c5f6 OP_EQUAL",
      "scriptpubkey_type": "p2sh",
      "scriptpubkey_address": "37jAAWEdJ9D9mXybRobcveioxSkt7Lkwog",
      "value": 32928544
    }
  ]
}

serialized = serialize_tx(transaction)
print("Serialized Transaction : ", serialized.hex())

# The TXID is the double SHA-256 of the serialized bytes, shown in reversed byte order
txid = hashlib.sha256(hashlib.sha256(serialized).digest()).digest()[::-1].hex()
print("TXID of transaction : ", txid)

Output:

The output is the same serialized transaction that we started with before deserializing.

Serialized Transaction
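
The example dictionary above also records the value of each spent output under "prevout", which the raw transaction itself does not contain. With those values, the transaction fee can be worked out as inputs minus outputs. A short sketch using the numbers from that dictionary:

```python
input_values = [2504928, 58711992]    # "prevout" values of the two inputs
output_values = [28278016, 32928544]  # values of the two outputs

fee = sum(input_values) - sum(output_values)
print(fee)  # 10360
```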

Conclusion

Decoding input data from a transaction involves several steps: understanding the format of each field in the serialized transaction, deserializing the byte stream field by field, and verifying the result by re-serializing it and comparing it with the original raw transaction.

FAQs related to How to Decode Input Data from a Transaction?

1. What is input data in a transaction?

On smart contract platforms such as Ethereum, input data refers to the data included in a transaction, such as function calls, parameters, and other instructions that tell a smart contract what to do. In a Bitcoin transaction, as decoded in this article, each input instead references a previous output (TXID and VOUT) and carries a ScriptSig and a sequence number.

2. List some tools that can be used to decode input data from a transaction.

For Ethereum transactions, tools that can be used to decode input data include Etherscan, Web3.js, ethers.js, and ABI Decoder. For Bitcoin transactions, most Bitcoin libraries have built-in functions for transaction deserialization.

3. Why is it important to decode the input data from transactions?

Decoding input data from transactions helps to verify the purpose and authenticity of transactions, to debug and develop smart contracts, and to monitor and analyze blockchain activity.