The Mathematics Behind Crypto Transactions

Aidan Pak

APR 16, 2023

A quick warning: this article is fairly technical. It is best to have an elementary understanding of how the blockchain works before reading. This Bloomberg article is an excellent primer on blockchain technology. With that forewarning out of the way, let's dive into how private keys allow individuals to be the sole possessors of their assets.

Example Bitcoin Transaction From Wallet A to Wallet B

address A = 3685cf9da3eb23c3c82c23cde008b30e6b822fb2ccfe5a27c1ac596bc99450fd:27

address B = 1FWQiwK27EnGXb6BiBMRLJvunJQZZPMcGd

In the Bitcoin transaction above, address A sent 0.14367159 BTC to address B. The long alphanumeric strings assigned to address A and address B are blockchain addresses. These strings are unique and function as the 'username' of the wallet. A blockchain address is essentially a shortened version of the public key (it is derived by hashing the public key and encoding the result). The public key is itself a large number, usually written as a long alphanumeric string, that is mathematically derived from the private key. More specifically, the public key is obtained by feeding the private key into a specific one-way cryptographic function. A shortened version of the public key (the blockchain address) is used as the 'username' of the wallet, rather than the public key itself, mainly for readability. For the purposes of this article, thinking of a blockchain address as essentially the public key is appropriate.

Wallet Generation

When blockchain address A was created, the first step was to generate the private key. Using a cryptographically secure random number generator, the wallet software selects a number between 1 and roughly 2²⁵⁶ (strictly, between 1 and n − 1, where n is the order of the generator point, a value just below 2²⁵⁶). 2²⁵⁶ is an incredibly large number, roughly 1.16 × 10⁷⁷, or about 115 quattuorvigintillion. The private key, call it k (an integer), is inserted into the equation K = k * G, where G is a fixed generator point of the form (x, y). The output K is also a point of the form (x, y), and it is the public key. In its commonly used compressed form, the public key stores only the x coordinate plus a one-byte prefix recording whether y is even or odd. This establishes the public-private key pairing.
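The key-generation step above can be sketched in a few lines of pure Python. This is an illustration under stated assumptions rather than production wallet code: it assumes the secp256k1 curve parameters, the helper names (point_add, scalar_mult, generate_keypair, compress_public_key) are hypothetical, and the arithmetic is not constant-time, so it should never be used to protect real funds.

```python
import secrets

# secp256k1 domain parameters: the curve y^2 = x^3 + 7 over the integers mod P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # order of G
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the negation of p1
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def generate_keypair():
    """Pick a random private key k in [1, N - 1] and return (k, K = k * G)."""
    k = secrets.randbelow(N - 1) + 1
    return k, scalar_mult(k, G)


def compress_public_key(point):
    """Encode K as a prefix byte (0x02 if y is even, 0x03 if odd) plus x."""
    x, y = point
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")


private_key, public_key = generate_keypair()
print("private key k:", hex(private_key))
print("public key K (compressed):", compress_public_key(public_key).hex())
```

Note that scalar_mult needs only a few hundred point additions to go from k to K, while going from K back to k has no comparable shortcut.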

The critical aspect of the equation K = k * G is that it is not ordinary multiplication. Rather, the operation '*' is elliptic curve point multiplication: the point G is added to itself k times under the curve's point-addition rule. The operation is performed on the specific curve y² = x³ + 7 over the finite field 𝔽p (the integers modulo a large prime p), a curve known as secp256k1. It is one-way in practice: computing K from k is fast, but no known algorithm recovers k from K in any feasible amount of time. While a full explanation of why elliptic curve multiplication is so hard to reverse is out of scope for this article (full explanation: here), this property is a critical aspect of cryptography and hence of the secrecy of the private key. Specifically, because the pairing is generated with a function that cannot feasibly be reversed, the public key (the output of the function) can be made publicly available and confidently used as the basis for the public blockchain address, without fear that the private key (the input) can be computed from it by anything other than brute force.

Side Note on Irreversibility

If you want to better understand irreversibility, take a look at a simpler operation that cannot be undone: the modulo operation from modular arithmetic. It is defined as r = x mod n, where r is the remainder of x divided by n. For example, 10 mod 3 is the remainder of 10/3, which equals 1, and 14 mod 2 is the remainder of 14/2, which equals 0. The operation is irreversible because, given only an output, there is no mathematical operation or algorithm that recovers the exact input. For example, take the function r = x mod 3. If I were to tell you that r = 1, there would be no way to determine the exact value of x. The value of x could be 1, 4, 7, 10, 13, and so on to infinity (4 mod 3 = 1, 7 mod 3 = 1, 10 mod 3 = 1). This is significant: imagine the equation were instead PubKey = PrivKey mod 3. The PubKey variable could be used as the 'username' of a wallet and thus exposed to anyone, and the PrivKey variable (the 'secret key') could be used to generate the public key, but the private key would not be recoverable via a reverse operation. This toy scheme would not be secure in practice, since many private keys share the same public key, but it illustrates the idea. Elliptic curve cryptography is much more complex than modular arithmetic, and its one-wayness comes from computational difficulty rather than lost information, but the practical effect is similar: the input cannot be recovered from the output.
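A tiny Python illustration of the example above: every value in the list produces the same remainder, so the output alone cannot tell you which input was used. The variable names are hypothetical and the search range is arbitrary.

```python
# All x in 1..14 that satisfy x mod 3 == 1; each is an equally valid "input".
candidates = [x for x in range(1, 15) if x % 3 == 1]
print(candidates)  # [1, 4, 7, 10, 13]
```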

Transactions

Now that we have established wallet generation and the process of creating a public-private key pair, the natural question is how this cryptographic mechanism enables transactions without a central intermediary. The process is similar to going to a bank and transferring funds to another party. In order to do so, the bank would need to verify that you are, in fact, the owner of the account and are approving the transaction to send funds to the recipient account. In a crypto transaction, these same criteria are satisfied via mathematical proof. More specifically, the owner of a blockchain address can propose a new transaction by using the private key to cryptographically prove they are, in fact, the owner of the wallet and are therefore approving the transaction.

When a new transaction is proposed to the network, the request must contain a transaction message (e.g., 'address 1 paid address 2 x amount') and a digital signature. A digital signature is the output of ECDSA (the Elliptic Curve Digital Signature Algorithm) when it is given the private key and the message to be signed. The digital signature serves as a mathematical proof of ownership, since a valid one can only be created by someone who knows the private key. Thus, whenever a transaction is accompanied by a digital signature, it is possible to verify that the individual who proposed the new transaction does in fact know the private key and hence "owns" the wallet.

An overview of the process is shown in the top of the graphic below (ignore SHA-256 for now), but essentially, a transaction message (e.g., 'Person A paid Person B 3 Bitcoin') is signed by using the private key in the ECDSA equation to generate a digital signature.

This transaction is thus verifiable (bottom of graphic), since anyone can check its validity by calculating two results: (1) the result of running SHA-256 on the transaction message, and (2) the value obtained by running the ECDSA verification algorithm on the digital signature with the public key. This second step is often loosely described as 'decrypting' the signature, although no encryption is actually involved. If the check passes, the verifier has mathematical evidence that the individual who created the digital signature knows the private key, without that individual ever needing to expose the private key itself.

Let's walk through the process in a little more detail through an example. First, we will generate an example public, private key pairing with elliptic curve multiplication (points along curve).

For simplicity, we will now refer to the public key as 'Person A'. Now that we have the public-private key pair for our blockchain address, the first step in the transaction process is to run the transaction message through SHA-256. SHA-256 is a hashing function, which is defined as a "mathematical process that takes input data of any size, performs an operation on it, and returns output data of a fixed size." Hash functions are a peculiar mathematical phenomenon in which the input is extremely hard to recover from the output. This is because the output is always of fixed length and appears pseudorandom with respect to the input.

Notice the difference between the two hashes below.

Despite the messages differing only by the casing of 'B', the outputs are extremely different. This property is critical during the verification process, as it ensures that the original message cannot be manipulated without detection. Because the digital signature is computed over the hash of the message, changing even one character of the transaction message after the signature is created produces a completely different hash, and the verification of the transaction fails.
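This effect is easy to reproduce with Python's standard hashlib module. The sketch below assumes two example messages that differ only in the casing of 'B'. The exact strings used in the original graphic are not known, so these are illustrative.

```python
import hashlib

# Two messages that differ only by the casing of one letter (assumed examples).
msg_a = "Person A paid Person B 3 Bitcoin"
msg_b = "Person A paid Person b 3 Bitcoin"

hash_a = hashlib.sha256(msg_a.encode("utf-8")).hexdigest()
hash_b = hashlib.sha256(msg_b.encode("utf-8")).hexdigest()

# Count how many of the 64 hex characters differ between the two digests.
differing = sum(a != b for a, b in zip(hash_a, hash_b))

print(hash_a)
print(hash_b)
print(f"{differing} of {len(hash_a)} hex characters differ")
```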

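Now that we have the public-private key pair and the hashed message, the next step is to generate the digital signature. The hashed message and private key are plugged into the ECDSA signing algorithm: signature = k^(-1) * (hash + r * privKey) (mod n)

[Here k is a fresh random number chosen for each signature (not the private key from the wallet generation section), r is the x coordinate of the point k * G reduced mod n, and n is the order of the generator point G. Both can be treated as essentially random-generated numbers with a little more math involved.]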

Below is an example signature:

The 'signature value' was generated by taking the message, running SHA-256, then plugging the value of the hashed message, private key, k, and r, into the ECDSA signing algorithm.

Now the transaction is ready to be submitted to the network and the data will specifically include 1. the original transaction message and 2. the digital signature.

In order to verify the validity of the transaction and that Person A is in fact the owner of the private key, a network validator runs the ECDSA verification algorithm. This process combines the digital signature, the public key, and the output of running the transaction message through SHA-256, and checks that they are mathematically consistent with one another.
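The signing and verification steps can be sketched end to end in pure Python. This is a minimal illustration, not a definitive implementation, and it rests on several assumptions: the secp256k1 parameters, a single SHA-256 pass over a plain-text message (real Bitcoin transactions hash a serialized transaction, and do so twice), and hypothetical helper names (point_add, scalar_mult, sign, verify). The arithmetic is not constant-time, so it is unsuitable for real keys.

```python
import hashlib
import secrets

# secp256k1 domain parameters: the curve y^2 = x^3 + 7 over the integers mod P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # order of G
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def message_hash(message: bytes) -> int:
    """SHA-256 of the message, interpreted as an integer."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes, priv_key: int):
    """Return (r, s) with s = k^(-1) * (hash + r * privKey) mod n."""
    z = message_hash(message)
    while True:
        k = secrets.randbelow(N - 1) + 1       # fresh random nonce per signature
        r = scalar_mult(k, G)[0] % N           # x coordinate of R = k * G
        s = pow(k, -1, N) * (z + r * priv_key) % N
        if r != 0 and s != 0:
            return r, s


def verify(message: bytes, signature, pub_key) -> bool:
    """Recompute R' and accept if its x coordinate matches r."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    z = message_hash(message)
    s1 = pow(s, -1, N)
    r_prime = point_add(scalar_mult(z * s1 % N, G),
                        scalar_mult(r * s1 % N, pub_key))
    return r_prime is not None and r_prime[0] % N == r


priv_key = secrets.randbelow(N - 1) + 1
pub_key = scalar_mult(priv_key, G)
message = b"Person A paid Person B 3 Bitcoin"
signature = sign(message, priv_key)
print("valid signature:", verify(message, signature, pub_key))
print("tampered message:", verify(b"Person A paid Person B 30 Bitcoin", signature, pub_key))
```

The verifier never sees priv_key: it works only from the message, the signature, and pub_key.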

Result 1 — Hashed Message

i.e., SHA-256 on 'Person A paid Person B 3 Bitcoin'

Result 1 = be188d100d77c431def6911727ea6f98336c0142bbddca7845af6cc76b40648d

Result 2 — ECDSA Verification Algorithm

(Process of checking the digital signature using the public key)

Digital Signature = 304502200f12a3f9a82f4da2a78a55f22d3549262093a4b52897279ce94749d6f9f6d615022100ce2366ddc086d81f05b416c8ce2d1ecea73bb0fd8afe59dab2e3c10031e2a34d

s1 = signature^(-1) (mod n)

R' = (hash * s1) * G + (r * s1) * pubKey

Result 2 = x coordinate of R'

Result 2 = be188d100d77c431def6911727ea6f98336c0142bbddca7845af6cc76b40648d

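Notice that Result 1 == Result 2, and from this information, one can mathematically verify that the initiator of the transaction does in fact know the private key and is thus the wallet owner. (Strictly speaking, standard ECDSA compares the x coordinate of R' against the r value carried inside the signature rather than against the hash itself, so the two-result comparison here is a simplified picture of that check.) Using this method, a network validator can approve the transaction and submit the exchange to the blockchain.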
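For readers who want to see why the verification equation recovers the signer's point, here is a short derivation. It assumes the standard ECDSA definitions, which the article only touches on: R = k * G is the signer's random point, r is its x coordinate reduced mod n, s is the signature value from the signing equation, and pubKey = privKey * G.

```latex
\begin{aligned}
R' &= (\text{hash}\cdot s^{-1})\,G + (r\cdot s^{-1})\,\text{pubKey} \\
   &= (\text{hash}\cdot s^{-1})\,G + (r\cdot s^{-1}\cdot\text{privKey})\,G \\
   &= s^{-1}\,(\text{hash} + r\cdot\text{privKey})\,G \\
   &= k\,G = R
   \qquad\text{since } s = k^{-1}(\text{hash} + r\cdot\text{privKey}) \pmod{n}
\end{aligned}
```

The last step only works out if s was built with the private key that matches pubKey, which is why a passing check is evidence that the signer knows that key.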