Part Three: Creating and Signing Ethereum Transactions

Transactions must be signed by a private key in order to hold any value. Since all digital assets are created from transactions, signatures play a critical role in any blockchain. In this article, we’ll learn how Ethereum transactions are signed.

Jose Aguinaga
Portis

--

In our previous articles, we went through the process of creating a private key and learned what can be done with it. Specifically, in part one, we learned that keys are nothing more than random numbers of astronomical proportions, whereas in part two we looked at how these numbers can create Ethereum wallets that hold digital assets. In the final part of our miniseries, we will explore how to use these numbers to sign operations to manipulate digital assets and more.

Transactions — out with the old, in with the new

Blockchain transactions are not that different from banking transactions. Sending money to someone, moving money to your savings account, and even taking out loans are all things that can easily be done digitally with most banks nowadays. The internet era has given us the ability to execute most of these actions online without any physical interaction.

Despite their ease of use, these actions are neither easy nor cheap for the banks. Behind the scenes, several third parties are involved in the clearance, verification, and confirmation of your banking operations. On top of that, to remain compliant with banking regulations, financial institutions have to go through lengthy measures to make sure that you are indeed the individual authorized to operate your account. All these layers incur a major cost and are one of the many reasons why payment networks like Visa or Mastercard charge fees on every purchase made, usually based on the amount transacted. And of course, at any point in time, any of these operations can be frozen and, in some cases, reversed.

In 2017, the European Union required financial institutions to implement the Payment Services Directive 2 (Directive 2015/2366), which meant that banks had to implement Strong Customer Authentication (SCA) alongside other security measures. Due to the costs of PSD2 and SCA, by March 2019 only 59% of European banks had managed to comply with the requirements, and the deadline was pushed back by another year.

Blockchain transactions behave under a different set of rules

Due to the distributed and permissionless nature of a public blockchain, anyone can sign and broadcast transactions to the network. Depending on the blockchain, you will pay a fee for the transaction to be “mined” (i.e., picked up by a miner and included in the blockchain), but that fee is usually driven by demand for block space on the network, not by the value of the assets in the transaction. For example, sending $1 from one Ethereum account to another costs the same as sending $1M, all else being equal. Both transactions are equally acceptable to miners, who include them in valid blocks that are then broadcast to the network.

Blocks contain a series of transactions and are appended one after the other. Part of the data used to calculate a new block comes from the previous block, so the blocks form a “chain” of mathematical proofs, hence the name “blockchain.” The data structure used to handle and verify these proofs efficiently is called a “Merkle tree,” and it is part of the reason why tampering with a transaction or block is very easy to detect, which makes faking one nearly impossible.
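To make the chaining idea concrete, here is a minimal sketch of hash-linked blocks with a simple Merkle root. It is a toy model under stated assumptions: SHA-256 from Node's built-in crypto module stands in for Ethereum's Keccak-256, the pairwise tree is much simpler than the Merkle Patricia tries Ethereum actually uses, and all names (makeBlock, isValidChain, the sample transfers) are hypothetical.

```javascript
const { createHash } = require("crypto");

// SHA-256 is a stand-in here; Ethereum itself uses Keccak-256.
function sha256(text) {
  return createHash("sha256").update(text).digest("hex");
}

// Pairwise-hash a list of transactions up to a single Merkle root.
function merkleRoot(transactions) {
  if (transactions.length === 0) return sha256("");
  let level = transactions.map((tx) => sha256(tx));
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      const right = level[i + 1] ?? level[i]; // duplicate the last hash on odd levels
      next.push(sha256(level[i] + right));
    }
    level = next;
  }
  return level[0];
}

// A block commits to its transactions and to the hash of the previous block.
function makeBlock(previousHash, transactions) {
  const root = merkleRoot(transactions);
  return { previousHash, transactions, root, hash: sha256(previousHash + root) };
}

// Recompute every commitment; an edited transaction or a broken link fails.
function isValidChain(chain) {
  return chain.every((block, i) => {
    const root = merkleRoot(block.transactions);
    const linked = i === 0 || block.previousHash === chain[i - 1].hash;
    return linked && root === block.root && block.hash === sha256(block.previousHash + root);
  });
}

const genesis = makeBlock("0".repeat(64), ["alice->bob:1"]);
const second = makeBlock(genesis.hash, ["bob->carol:2", "carol->dave:3"]);
const chain = [genesis, second];

console.log(isValidChain(chain)); // true
genesis.transactions[0] = "alice->bob:1000"; // tamper with history
console.log(isValidChain(chain)); // false
```

Changing a single character in an old transaction changes its Merkle root, which no longer matches the root the block committed to, so the tampering is caught by simply recomputing hashes.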

In addition, blockchain transactions require no verification from any central party. For transactions to be valid, they only need to be signed with a private key using the digital signature algorithm of their blockchain. The Ethereum and Bitcoin blockchains use the ECDSA algorithm, whereas other projects like Cardano or Polkadot rely on the EdDSA algorithm. Both rely on elliptic curves, yet the latter is built on twisted Edwards curves and was designed as an improvement over earlier elliptic-curve signature schemes. Although a transaction can be signed by any private key, a transfer transaction will only be executed successfully if the account connected to the private key used to sign it contains enough funds.

Elliptic-curve signature algorithms rely on the discrete log problem, which is classically stated as follows: given integers a and b and a prime p, find an integer k such that a^k ≡ b (mod p). In the elliptic-curve version, the problem is stated in terms of points: given points P and Q on an elliptic curve, where Q = kP, find k. There is no known efficient method for computing k today, and elliptic-curve schemes do not depend on the factorization problem that underpins other public-key algorithms like RSA, which relies on the difficulty of factoring the product of large prime numbers (and which has recently been the target of attacks using lattice-based techniques). This is the main reason why most blockchain-based systems rely on elliptic curves, although the specific curves and signing algorithms used vary.
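The asymmetry behind the discrete log problem can be sketched with the classic modular version. In the example that follows, computing a^k mod p takes a handful of multiplications, while recovering k means trying candidates one by one. The numbers are deliberately tiny and purely illustrative (p = 1019, a = 2, and the "secret" exponent are assumptions for the demo); real systems use elliptic-curve groups with roughly 2^256 elements, where this brute force is hopeless.

```javascript
// Square-and-multiply: computes base^exponent mod modulus
// in a number of steps proportional to the bit length of the exponent.
function modPow(base, exponent, modulus) {
  if (modulus === 1n) return 0n;
  let result = 1n;
  base %= modulus;
  while (exponent > 0n) {
    if (exponent & 1n) result = (result * base) % modulus;
    base = (base * base) % modulus;
    exponent >>= 1n;
  }
  return result;
}

// Brute-force discrete log: tries k = 0, 1, 2, ... until a^k ≡ b (mod p).
// The work grows with p itself, not with its bit length.
function discreteLog(a, b, p) {
  let power = 1n;
  for (let k = 0n; k < p; k++) {
    if (power === b) return k;
    power = (power * a) % p;
  }
  return null; // b is not a power of a modulo p
}

const p = 1019n; // a small prime, for illustration only
const a = 2n;
const secret = 777n; // plays the role of a private key

const b = modPow(a, secret, p); // easy direction
const found = discreteLog(a, b, p); // hard direction, feasible only because p is tiny

console.log(modPow(a, found, p) === b); // true
```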

Once a transaction has been signed, broadcast to the network, and mined into a block, there’s no way to revert it. Unlike banking operations, successfully mined blockchain transactions cannot be reversed, nor can the ledger be restored to a previous state. Most public blockchain transactions are also publicly visible, and thus the blockchain used for these transactions is the ultimate source of truth for these assets.

Ethereum transactions structure

Now that we have fully understood the nature of blockchain transactions, we are ready to create our first Ethereum-based transaction. We’ll start with a simple transfer transaction: a transfer of 0.1 ETH to the address 0x17A98d2b11Dfb784e63337d2170e21cf5DD04631. A transaction can be described using JavaScript Object Notation (JSON), so when creating this transaction, it would look as follows using MyEtherWallet (by using send offline after logging in):
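The embedded snippet is not reproduced in this text, so here is a sketch of what such an unsigned transaction object may look like. The field values are chosen to be consistent with the signed transaction quoted later in this article (nonce 0, a gas price of 1 Gwei, a gas limit of 21,000). The hex-string formatting and the variable name are assumptions, and MyEtherWallet's exact output may differ.

```javascript
// A simple transfer of 0.1 ETH, described as a plain JSON-compatible object.
const transaction = {
  nonce: "0x0", // first transaction sent from this account
  gasPrice: "0x3b9aca00", // 1,000,000,000 wei = 1 Gwei
  gasLimit: "0x5208", // 21,000 gas
  to: "0x17A98d2b11Dfb784e63337d2170e21cf5DD04631",
  value: "0x16345785d8a0000", // 100,000,000,000,000,000 wei = 0.1 ETH
  data: "0x", // no extra data for a plain transfer
  chainId: 1, // Ethereum mainnet
};

console.log(JSON.stringify(transaction, null, 2));
```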

Immediately, there are a few values that jump out: nonce, gasLimit, gasPrice, data, and chainId. None have anything to do with the contents of our transaction, but, rather, with how our transaction is executed. This is because, in order to send a transaction in Ethereum, you have to define some additional parameters that tell miners how your transaction should be processed. Two of these attributes from our transaction involve “gas”, a measurement unit of computational effort that must be paid to an Ethereum miner in order to commit the transaction to the blockchain network. One is the gasPrice (expressed in a unit called Gwei, which equals 1/10⁹ Ether, Ethereum’s native token), and the other is the gasLimit, which is the maximal amount of gas that is allowed to be used in your transaction. These values can be estimated from an Ethereum node, and as such, are usually filled out automatically by wallet providers.
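As a rough illustration of how the two gas parameters combine, the sketch below computes the maximum fee of a transaction as gasPrice × gasLimit. The two transfers are hypothetical and differ only in the value they move; the 21,000 gas limit and 1 Gwei price are example figures, not a recommendation.

```javascript
const WEI_PER_GWEI = 10n ** 9n;

// Upper bound on the fee: the gas limit times the price paid per unit of gas.
function maxFeeWei(gasPriceGwei, gasLimit) {
  return gasPriceGwei * WEI_PER_GWEI * gasLimit;
}

// Two hypothetical transfers that differ only in the value moved.
const small = { valueWei: 10n ** 15n, gasPriceGwei: 1n, gasLimit: 21000n };
const large = { valueWei: 10n ** 21n, gasPriceGwei: 1n, gasLimit: 21000n };

const smallFee = maxFeeWei(small.gasPriceGwei, small.gasLimit);
const largeFee = maxFeeWei(large.gasPriceGwei, large.gasLimit);

console.log(smallFee.toString()); // 21000000000000 (wei), i.e. 0.000021 ETH
console.log(smallFee === largeFee); // true: the value transferred never enters the formula
```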

Numerical values in Ethereum are usually represented in wei, the smallest unit within the Ethereum blockchain, which is equivalent to 1/10¹⁸ of a single Ether (like the satoshi in Bitcoin). Gas prices are commonly expressed in gigawei (Gwei for short), which equals 1/10⁹ of an Ether. Gas prices are a complicated topic in the Ethereum network, as they tend to fluctuate. EIP-1559, a recently approved change to the network, set to go live later this year, should help mitigate this volatility. On top of wei and Gwei, there are other units to represent these values. To convert across multiple units, you can use https://eth-converter.com/, and to estimate and visualize gas prices, you can use https://ethgasstation.info/.
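The conversions between these units are plain powers of ten, so they can be sketched with BigInt and a bit of string handling. The helper names toWei and fromWei are hypothetical, and the sketch assumes non-negative amounts written as simple decimal strings.

```javascript
// Number of decimal places separating each unit from wei.
const DECIMALS = { wei: 0, gwei: 9, ether: 18 };

// Parse a decimal string such as "0.1" in the given unit into wei.
function toWei(amount, unit) {
  const decimals = DECIMALS[unit];
  const [whole, fraction = ""] = amount.split(".");
  if (fraction.length > decimals) {
    throw new RangeError(`too many decimal places for ${unit}`);
  }
  return BigInt(whole + fraction.padEnd(decimals, "0"));
}

// Format a wei amount as a decimal string in the given unit.
function fromWei(wei, unit) {
  const decimals = DECIMALS[unit];
  if (decimals === 0) return wei.toString();
  const digits = wei.toString().padStart(decimals + 1, "0");
  const whole = digits.slice(0, -decimals);
  const fraction = digits.slice(-decimals).replace(/0+$/, "");
  return fraction ? `${whole}.${fraction}` : whole;
}

console.log(toWei("0.1", "ether")); // 100000000000000000n
console.log(toWei("1", "gwei")); // 1000000000n
console.log(fromWei(21000000000000n, "ether")); // 0.000021
console.log(fromWei(10n ** 9n, "gwei")); // 1
```

Working with BigInt rather than floating-point numbers avoids rounding errors, since amounts in wei routinely exceed what a JavaScript Number can represent exactly.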

In addition to gas parameters, you have to specify on which particular Ethereum network this transaction will be executed. The Ethereum network includes the main network (mainnet) with chainId 1, but there are other testing networks (testnets) to which your transaction can be submitted without any risk of losing economic value, since testnet ETH can be requested from an online faucet. Usually, when developing a Dapp, you will first run it on a local network, then deploy it to a testnet as a final step before going live on mainnet.
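The article does not spell out how the chainId ends up inside a signed transaction. Under EIP-155, it is folded into the v component of the signature as v = chainId × 2 + 35 + recoveryId, where recoveryId is 0 or 1. The helper names below are hypothetical; the sketch only covers legacy-style transactions like the one used in this article.

```javascript
// EIP-155: v = chainId * 2 + 35 + recoveryId, where recoveryId is 0 or 1.
function vFromChainId(chainId, recoveryId) {
  return chainId * 2 + 35 + recoveryId;
}

// Recover the chain a signature is bound to from its v value.
function chainIdFromV(v) {
  if (v === 27 || v === 28) return null; // pre-EIP-155 signature, not bound to any chain
  return Math.floor((v - 35) / 2);
}

console.log(vFromChainId(1, 0)); // 37 (0x25), mainnet
console.log(chainIdFromV(37)); // 1
console.log(chainIdFromV(42)); // 3, the Ropsten testnet
```

Because the chainId is part of what gets signed, a transaction signed for a testnet cannot be replayed on mainnet, and vice versa.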

Last but not least, we have data and nonce. If you want to submit some additional data, you can append it as part of the transaction. When interacting with smart contracts, the data field includes your instruction to that contract. A nonce (“number only used once”) is a numerical value used by the Ethereum network to keep track of the transactions sent from your account, helping avoid double-spending as well as replay attacks. Sometimes transactions get stuck in the network due to low gas prices. Broadcasting a transaction with a higher gas price but the same nonce effectively “replaces” the pending transaction as soon as the new one is picked up by a miner: when the “slow” transaction is seen afterward, it is rejected because its nonce has already been used by a mined transaction.
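The replacement behavior can be sketched with a toy model of a node's pending pool. This is a simplification under stated assumptions: the ToyNode class and its return strings are hypothetical, any strictly higher gas price counts as a valid replacement, and real clients apply additional rules (such as a minimum price bump) that are left out here.

```javascript
class ToyNode {
  constructor() {
    this.nextNonce = new Map(); // sender -> next nonce expected from that sender
    this.pending = new Map(); // "sender:nonce" -> pending transaction
  }

  // Accept, replace, or reject a broadcast transaction.
  submit(tx) {
    const expected = this.nextNonce.get(tx.from) ?? 0;
    if (tx.nonce < expected) return "rejected: nonce already used";
    const key = `${tx.from}:${tx.nonce}`;
    const current = this.pending.get(key);
    if (current && tx.gasPriceGwei <= current.gasPriceGwei) {
      return "rejected: underpriced replacement";
    }
    this.pending.set(key, tx);
    return current ? "replaced" : "accepted";
  }

  // Mine the sender's next pending transaction, consuming its nonce.
  mine(from) {
    const nonce = this.nextNonce.get(from) ?? 0;
    const key = `${from}:${nonce}`;
    const tx = this.pending.get(key);
    if (!tx) return null;
    this.pending.delete(key);
    this.nextNonce.set(from, nonce + 1);
    return tx;
  }
}

const node = new ToyNode();
const slow = { from: "alice", nonce: 0, gasPriceGwei: 1, label: "slow" };
const fast = { from: "alice", nonce: 0, gasPriceGwei: 20, label: "fast" };

console.log(node.submit(slow)); // accepted
console.log(node.submit(fast)); // replaced
console.log(node.mine("alice").label); // fast
console.log(node.submit(slow)); // rejected: nonce already used
```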

Signing an Ethereum transaction

Grabbing our previous JSON, we can finally go ahead and start the signing process. As we described, this process involves the ECDSA algorithm. To sign our transaction with ECDSA, we’ll be using the popular library ethers.js, which already wraps the required calls to the elliptic curve package for using the secp256k1 curve with the ECDSA algorithm.
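The original embedded snippet is not reproduced in this text, so the following is only a sketch of how the signing step might look with ethers.js, not the author's exact code. It assumes the ethers package is installed and uses its Wallet class and signTransaction method; the transaction values and the signTransfer helper name are illustrative assumptions. The library is loaded inside the function so that the transaction object can be inspected even where ethers is not available.

```javascript
// Unsigned transfer of 0.1 ETH; values are illustrative assumptions.
const transaction = {
  nonce: 0,
  gasPrice: "0x3b9aca00", // 1 Gwei
  gasLimit: "0x5208", // 21,000 gas
  to: "0x17A98d2b11Dfb784e63337d2170e21cf5DD04631",
  value: "0x16345785d8a0000", // 0.1 ETH in wei
  data: "0x",
  chainId: 1, // mainnet
};

// Sign the transaction offline; nothing is sent to the network here.
async function signTransfer(privateKey) {
  const { Wallet } = require("ethers"); // assumes `npm install ethers`
  const wallet = new Wallet(privateKey);
  // Resolves to the signed, serialized transaction as a hex string.
  return wallet.signTransaction(transaction);
}

// Example call (never use a published private key for real funds):
// signTransfer("0x616E6769652E6A6A706572657A616775696E6167612E6574682E6C696E6B0D0A")
//   .then((signed) => console.log(signed));
```

The call returns a promise, and no network connection is needed at any point, which is what makes it possible to sign on one machine and broadcast from another.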

You can test this code online in Runkit and match it with the result from MyEtherWallet (MEW) using the private key 0x616E6769652E6A6A706572657A616775696E6167612E6574682E6C696E6B0D0A. The result, 0xf86b80843b9aca008252089417a98d2b11dfb784e63337d2170e21cf5dd0463188016345785d8a00008025a02e47aa4c37e7003af4d3b7d20265691b6c03baba509c0556d21acaca82876cb4a01b5711b8c801584c7875370ed2e9b60260b390cdb63cf57fa6d77899102279a0, represents your signed transaction, ready to be broadcast to the Ethereum network. To broadcast it, you can use MEW directly or Alchemy’s online utility Composer, which passes your signed transaction to the Ethereum network using eth_sendRawTransaction, the JSON-RPC method for submitting signed transactions to an Ethereum node.
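A signed legacy transaction like this one is an RLP-encoded list of nine fields: nonce, gasPrice, gasLimit, to, value, data, v, r, and s. To see what is inside the hex string, the sketch below implements a small RLP decoder with no dependencies and applies it to the signed transaction quoted above. The helper names are hypothetical, and the decoder skips the strictness checks a production implementation would need. The last lines show the shape of the JSON-RPC request body that eth_sendRawTransaction expects; nothing is actually sent.

```javascript
const signedTransaction =
  "0xf86b80843b9aca008252089417a98d2b11dfb784e63337d2170e21cf5dd04631" +
  "88016345785d8a00008025a02e47aa4c37e7003af4d3b7d20265691b6c03baba50" +
  "9c0556d21acaca82876cb4a01b5711b8c801584c7875370ed2e9b60260b390cdb6" +
  "3cf57fa6d77899102279a0";

function hexToBytes(hex) {
  const clean = hex.startsWith("0x") ? hex.slice(2) : hex;
  const bytes = new Uint8Array(clean.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(clean.slice(2 * i, 2 * i + 2), 16);
  }
  return bytes;
}

function bytesToHex(bytes) {
  return "0x" + Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Read a big-endian length stored in `size` bytes.
function readLength(bytes, start, size) {
  let length = 0;
  for (let i = 0; i < size; i++) length = length * 256 + bytes[start + i];
  return length;
}

// Decode one RLP item at `offset`; returns the item and the next offset.
function decodeItem(bytes, offset) {
  const prefix = bytes[offset];
  if (prefix < 0x80) {
    return { item: bytes.slice(offset, offset + 1), next: offset + 1 }; // single byte
  }
  const isList = prefix >= 0xc0;
  const shortBase = isList ? 0xc0 : 0x80;
  const longBase = isList ? 0xf7 : 0xb7;
  const sizeBytes = prefix > longBase ? prefix - longBase : 0;
  const length = sizeBytes ? readLength(bytes, offset + 1, sizeBytes) : prefix - shortBase;
  const start = offset + 1 + sizeBytes;
  const end = start + length;
  if (!isList) return { item: bytes.slice(start, end), next: end }; // byte string
  const items = [];
  let cursor = start;
  while (cursor < end) {
    const decoded = decodeItem(bytes, cursor);
    items.push(decoded.item);
    cursor = decoded.next;
  }
  return { item: items, next: end };
}

const toBigInt = (bytes) => (bytes.length === 0 ? 0n : BigInt(bytesToHex(bytes)));

const [nonce, gasPrice, gasLimit, to, value, data, v, r, s] =
  decodeItem(hexToBytes(signedTransaction), 0).item;

const decodedTransaction = {
  nonce: toBigInt(nonce), // 0n
  gasPrice: toBigInt(gasPrice), // 1000000000n wei = 1 Gwei
  gasLimit: toBigInt(gasLimit), // 21000n
  to: bytesToHex(to), // the recipient address, lowercased
  value: toBigInt(value), // 100000000000000000n wei = 0.1 ETH
  data: bytesToHex(data), // "0x"
  v: toBigInt(v), // 37n
  r: bytesToHex(r),
  s: bytesToHex(s),
};
console.log(decodedTransaction);

// Body of the JSON-RPC request a node expects for broadcasting.
const rpcRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "eth_sendRawTransaction",
  params: [signedTransaction],
};
console.log(JSON.stringify(rpcRequest));
```

The decoded fields match the transfer described earlier: 0.1 ETH to the same address, with the signature values v, r, and s appended at the end.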

Sign now, relay later

Signing transactions as executed above is called “offline signing.” Since we have our private keys under our control, we can create the signature using our Ethereum account and broadcast the signed transaction to the Ethereum network at a later time. Many online wallets do both signing and broadcasting at the same time (e.g., MetaMask, Portis). However, offline signing is particularly useful for applications such as state channels, which are smart contracts that keep track of balances between two accounts and transfer funds once signed transactions are submitted. Offline signing is also a common practice in decentralized exchanges (DEXes), where buy and sell orders are stored off-chain and are only settled on-chain when matched with an order that fits a previously signed transaction. It also plays a big role in layer 2 solutions such as zkRollup and Optimism.

Using Portis, you can sign transactions to interact with the Gas Station Network (GSN). To interact with the GSN, Portis subscribes to a pool of relayers that are able to pay for the gas fees of your transaction. These relayers subscribe to a decentralized contract (like this one in the Ropsten testnet), and Portis sends them a request to relay your transaction. You are still required to sign your transactions (an unsigned transaction is meaningless, after all), but the Portis widget handles all the previous steps behind the scenes, so users can start signing transactions and interacting with smart contracts even when using a brand new wallet that has no ETH to pay for gas fees. Give it a try in our cryptopuppers app! In case you want to learn more, the specification for the GSN (EIP-1613) is available here, and you can see the presentation from TabooKey’s team working with Portis’ beloved cryptopuppers demo here.

Today we covered signing transactions, closing our series about private keys. If you followed our complete private keys series, you now understand where Ethereum accounts come from and how Ethereum transactions are made.

--

Jose Aguinaga
Portis
Web3/Full-Stack. DevOps/Cryptography Enthusiast. Head of Engineering at @hoprnet, previously @MyBit_dapp, @numbrs, @plaid. JavaScript, startups, fintech.