An Experiment In Designing a New Smart Contract Language
Over the past few months, I’ve been working with Todd Proebsting on a small experiment to design and implement a new smart contract language called “Smart”. In this blog post, I’ll explain the motivation behind the project and share what we came up with.
This was designed as a short-term experiment, so it’s time to move on to other things. But hopefully some of the ideas we worked on have value to the community at large.
Background and motivation
Consensys Diligence has the broad mission of “solving smart contract security”. Early on, the team focused almost exclusively on auditing smart contracts that were ready to be deployed. Over time, we published best practices and incubated the tools that became MythX. We started engaging earlier with clients to help with their design and development phases, where we could make an even bigger impact.
The theme of all of these efforts has been to find bigger and bigger levers to pull, and programming language design is a huge lever. As auditors, we have a rather unique perspective on how a programming language can support writing secure smart contracts.
Some opportunities we saw for a new programming language:
- Readability could be greatly improved.
- Complexity could be better managed.
- Common bug classes could be prevented.
The rest of this post will elaborate on each of these points.
As a general rule, we tried to keep the language small and invent as little as possible. We borrowed syntax heavily from Go.
Improving readability
Auditors and (honest) smart contract developers share a common goal: it should be as easy as possible to know what a smart contract does. Smart treats readability, in this sense, as the primary goal.
Programs must be written for people to read, and only incidentally for machines to execute.
— Harold Abelson, Structure and Interpretation of Computer Programs
I’ll show how Smart addresses readability through an example. Suppose you’re reading a smart contract and see the following Solidity code:
(uint256 locked, uint256 spendable) = getBalances(msg.sender);
require(amount <= spendable, "attempting to transfer too much");
This looks reasonable from what you know so far. Let’s assume this is a security token that enforces some sort of lockup period.
Later you read the code for getBalances(address) and discover the following:
- getBalances() actually returns the balances in the opposite order. The spendable balance is returned first, which means the code we read earlier is going to check amount against the wrong value.
- getBalances() has a side effect. It checks the lockup expiration and updates internal state to reflect the new balances.
In Smart, the preceding code would look like this instead:
var spendable uint256
var locked uint256
locked, spendable = spendable, locked from getBalances!(account: msg.sender)
assert amount <= spendable, "attempting to transfer too much"
A lot of information is now available at the call site. The right-hand side of the assignment is required to repeat the exact return value names from getBalances!(), which makes the bug readily apparent. The exclamation mark on getBalances! originates from a Scheme naming convention and indicates that the function mutates state, so you as an auditor know to dig into what state mutation it does.
This explicitness happens in the definition of getBalances! too:
func getBalances!(account address) (spendable uint256, locked uint256) {
    ...
    return { spendable: ..., locked: ... }
}
Here again, it’s hard to make a mistake and return the values in the wrong order. The return statement must match the named return values too.
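Smart itself can’t be run here, so the following is only a rough Python analogue of the idea, not Smart’s actual mechanism. It assumes a hypothetical helper, unpack, that forces the caller to name every returned field; the Balances type and the sample ledger are likewise made up for illustration.

```python
from typing import NamedTuple


class Balances(NamedTuple):
    """Named return value, loosely mirroring Smart's named results."""
    spendable: int
    locked: int


def get_balances(account: str) -> Balances:
    # Hypothetical fixed data standing in for contract storage.
    ledger = {"alice": Balances(spendable=70, locked=30)}
    return ledger[account]


def unpack(result: Balances, *names: str) -> tuple:
    """Return the requested fields, insisting the caller names every field once."""
    if sorted(names) != sorted(result._fields):
        raise TypeError(f"expected names {result._fields}, got {names}")
    return tuple(getattr(result, name) for name in names)


# The caller states which field lands in which variable, so a change in the
# function's return order cannot silently swap the two values.
locked, spendable = unpack(get_balances("alice"), "locked", "spendable")
```

In this sketch the check happens at run time; Smart’s version is meant to be enforced by the compiler, which is the stronger guarantee.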
A few more Smart features are designed to make important things stand out to anyone reading the code:
- Public function names start with a capital letter, and private functions start with a lowercase letter.
- Mutating state requires a special syntax: store! x = 5 to write to a storage variable and call! ... to make an external call. Not only does this stand out (especially with appropriate syntax highlighting), but it’s very “greppable” so someone reading the code can quickly find all such spots in the code.
- No inheritance and no function overloading means that it’s always clear what function is being called.
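To make the “greppable” point concrete, here is a small Python sketch that lists every line containing store! or call!. The regular expression is an assumption based only on the syntax shown in this post, and the sample source is adapted from the examples here.

```python
import re

# Assumed marker syntax, taken from the examples in this post.
MUTATION = re.compile(r"\b(store|call)!")


def find_mutations(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each line with store! or call!."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if MUTATION.search(line)
    ]


SAMPLE = """\
func GetBalances!() (locked uint256, spendable uint256) {
    if block.timestamp >= lockupExpiration {
        store! lockupExpiration = 0
    }
    return { locked: lockedBalance, spendable: spendableBalance }
}
"""

hits = find_mutations(SAMPLE)
for number, line in hits:
    print(f"{number}: {line}")
```

A plain grep for the same pattern would do equally well; the point is that a simple text search finds every state mutation.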
Managing complexity
Smart contracts seem to be getting bigger and more complex over time. This has a huge cost when it comes to readability. Due to the tight coupling and global state in Solidity smart contracts, as the size of the contract grows, the difficulty of auditing the contract tends to grow exponentially.
Decomposition
One of Smart’s main features is decomposition through modularity and data encapsulation. Large contracts are broken into smaller, reusable classes. Each class can be analyzed in isolation.
Classes combine persistent storage and methods that act on that storage. From outside of a class, only public methods are accessible, with no direct access to the class’s storage. This makes it possible to analyze a single class in isolation to verify its behavior. This is true both for human code readers and for automated tools.
Let’s continue with our previous example. Consider the following Solidity code:
contract SecurityToken {
    struct Account {
        uint256 lockedBalance;
        uint256 spendableBalance;
        uint256 lockupExpiration;
    }
    mapping(address => Account) accounts;
    function getBalances(address addr) internal returns (uint256 spendable, uint256 locked) {
        if (now >= accounts[msg.sender].lockupExpiration) {
            accounts[addr].lockupExpiration = 0;
            accounts[addr].spendableBalance += accounts[addr].lockedBalance;
            accounts[addr].lockedBalance = 0;
        }
        return (accounts[addr].spendableBalance, accounts[addr].lockedBalance);
    }
    ...
}
There’s a bug in the above code. The conditional uses msg.sender instead of addr to look up the account. This type of confusion over state access is quite common in my experience. The bug isn’t too hard to spot in such a small piece of code, but it’s harder to see when the code is more complex.
In Smart, this code might look like this instead:
contract SecurityToken {
    storage accounts map[address]Account
    ...
}
class Account {
    storage lockedBalance uint256
    storage spendableBalance uint256
    storage lockupExpiration uint256
    func GetBalances!() (locked uint256, spendable uint256) {
        if block.timestamp >= lockupExpiration {
            store! lockupExpiration = 0
            // We can improve on this too! More in a future example.
            store! spendableBalance += lockedBalance
            store! lockedBalance = 0
        }
        return { locked: lockedBalance, spendable: spendableBalance }
    }
}
In the Account class, there can be no accidents about which account we’re working with. The only data accessible to Account is its own.
(I’ve argued in the past for doing this kind of decomposition and isolation by using a factory that deploys many separate smart contracts, but this can be expensive from a gas perspective.)
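The same isolation can be approximated in ordinary object-oriented code. The Python sketch below is an analogy, not Smart semantics: the method and parameter names (open_account, balances_of, now) are invented for illustration, and Python’s leading-underscore privacy is only a convention, whereas Smart is meant to enforce it.

```python
class Account:
    """Holds one account's balances; it has no view of any other account."""

    def __init__(self, locked: int, spendable: int, lockup_expiration: int) -> None:
        self._locked = locked
        self._spendable = spendable
        self._lockup_expiration = lockup_expiration

    def get_balances(self, now: int) -> dict[str, int]:
        # Unlock once the lockup has expired. There is no msg.sender or
        # address in scope here, so the wrong account cannot be consulted.
        if now >= self._lockup_expiration:
            self._lockup_expiration = 0
            self._spendable += self._locked
            self._locked = 0
        return {"locked": self._locked, "spendable": self._spendable}


class SecurityToken:
    """Owns the address-to-account mapping and delegates everything else."""

    def __init__(self) -> None:
        self._accounts: dict[str, Account] = {}

    def open_account(self, addr: str, locked: int, lockup_expiration: int) -> None:
        self._accounts[addr] = Account(locked, 0, lockup_expiration)

    def balances_of(self, addr: str, now: int) -> dict[str, int]:
        return self._accounts[addr].get_balances(now)


token = SecurityToken()
token.open_account("alice", locked=30, lockup_expiration=100)
token.open_account("bob", locked=50, lockup_expiration=200)
```

Querying at time 150 unlocks only the first account, because each Account consults only its own expiration.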
Dependencies
Smart supports code spread throughout many files. When importing from another file, the programmer can specify the hash of that file’s contents. (This is optional during development for ergonomic reasons, but it should be considered required when deploying to a public network.)
import "helpers.smrt" [0xf20876a21db9946623e3ad503bd624be1713ad379d5eef107800f53b838fe40e] as helpers
By knowing the exact hash of the file contents being imported, it’s possible to refer to previous audits of well-known code. I find myself frequently chasing down which version of a particular OpenZeppelin contract is being used, while a code hash would make this trivial.
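As a rough illustration of what a compiler or build tool would do with such a pinned hash, here is a Python sketch. It assumes SHA-256 purely for the sake of the example (this post does not say which hash function Smart uses), and verify_pinned_source and the sample file contents are hypothetical.

```python
import hashlib


def verify_pinned_source(contents: bytes, expected_hex: str) -> bytes:
    """Return contents only if its SHA-256 digest matches the pinned value."""
    actual = hashlib.sha256(contents).hexdigest()
    if actual != expected_hex.lower().removeprefix("0x"):
        raise ValueError(f"hash mismatch: expected {expected_hex}, got 0x{actual}")
    return contents


# Hypothetical file contents; in practice these bytes would be read from disk.
helpers_source = b"func add(a uint256, b uint256) uint256 { return a + b }\n"

# The value an auditor would record alongside the import statement.
pinned = "0x" + hashlib.sha256(helpers_source).hexdigest()

verified = verify_pinned_source(helpers_source, pinned)
```

Any change to the imported file, even a single byte, produces a different digest and fails the check, which is what lets a previous audit be tied to exact file contents.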
Learning from the past
By now, we have years of history working with Ethereum smart contracts. Certain classes of bugs stand out as problematic time and time again. Where possible, Smart aims to prevent common classes of bugs.
Integer overflow/underflow
Smart’s math operations revert on overflow. Accidental integer overflows have been a persistent pain point in smart contract development, so it’s sensible to build protections into the language itself.
(This is not a differentiator. The next version of Solidity is expected to do this too, and Vyper already does.)
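For readers who haven’t seen checked arithmetic spelled out, here is a minimal Python model of reverting uint256 operations. Python integers are unbounded, so the 256-bit range is checked by hand; the Revert exception and the function names are stand-ins chosen for this sketch.

```python
UINT256_MAX = 2**256 - 1


class Revert(Exception):
    """Stands in for an EVM revert."""


def checked_add(a: int, b: int) -> int:
    result = a + b
    if result > UINT256_MAX:
        raise Revert("addition overflow")
    return result


def checked_sub(a: int, b: int) -> int:
    if b > a:
        raise Revert("subtraction underflow")
    return a - b


def checked_mul(a: int, b: int) -> int:
    result = a * b
    if result > UINT256_MAX:
        raise Revert("multiplication overflow")
    return result
```

With wrapping arithmetic, subtracting 1 from 0 would silently yield the maximum uint256 value; here the operation fails instead.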
Escrow/accounting
Another common smart contract security vulnerability is some failure of escrow. For example, a user can withdraw the same funds repeatedly, effectively stealing from other users.
In the Account class example, I could have easily introduced this type of bug:
class Account {
    storage lockedBalance uint256
    storage spendableBalance uint256
    storage lockupExpiration uint256
    func GetBalances!() (locked uint256, spendable uint256) {
        if block.timestamp >= lockupExpiration {
            store! lockupExpiration = 0
            store! spendableBalance += lockedBalance
            // Whoops! I forgot to zero out lockedBalance, so repeated calls
            // to GetBalances!() will keep increasing spendableBalance.
        }
        return { locked: lockedBalance, spendable: spendableBalance }
    }
}
Smart introduces pooled values to address this type of bug. Pooled values work like double-entry accounting. If a value (e.g. ether or tokens) is received or sent, that value must be added to or subtracted from persistent storage.
The following operations are defined on pooled values:
- Declaration: pool tokens declares a new pool named tokens.
- Minting: mint! 5 tokens returns a new pooled value.
- Burning: burn! 5 tokens from foo decreases foo’s value by 5.
- Moving: balanceA <- 5 tokens from balanceB decreases balanceB by 5 and increases balanceA by 5. Note that the total is preserved.
Here’s a fixed version of the code:
class Account {
    pool tokens
    storage lockedBalance tokens
    storage spendableBalance tokens
    storage lockupExpiration uint256
    func GetBalances!() (locked tokens, spendable tokens) {
        if block.timestamp >= lockupExpiration {
            store! lockupExpiration = 0
            store! spendableBalance <- lockedBalance from lockedBalance
        }
        return { locked: lockedBalance, spendable: spendableBalance }
    }
}
Without an explicit mint! or burn!, there’s no way to create or destroy pooled values (e.g. tokens in the above code). Note that due to the isolation of classes, once you’ve read the Account class and are satisfied that tokens can neither be created nor destroyed, you don’t need to worry about code outside of this class doing some sort of manipulation.
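The conservation property can be modeled at run time in a few lines of Python. This is a sketch of the idea only: the class and method names (Pool, Pooled, move_from) are invented here, and nothing in Python stops a caller from constructing a Pooled value directly, which Smart’s compiler would be expected to reject.

```python
class Pool:
    """A named pool; its total changes only through mint and burn."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.total = 0

    def mint(self, amount: int) -> "Pooled":
        if amount < 0:
            raise ValueError("cannot mint a negative amount")
        self.total += amount
        return Pooled(self, amount)


class Pooled:
    """An amount belonging to one pool (assumed to be created only by mint)."""

    def __init__(self, pool: Pool, amount: int) -> None:
        self.pool = pool
        self.amount = amount

    def _take(self, amount: int) -> None:
        if amount < 0 or amount > self.amount:
            raise ValueError("amount exceeds pooled value")
        self.amount -= amount

    def burn(self, amount: int) -> None:
        self._take(amount)
        self.pool.total -= amount

    def move_from(self, source: "Pooled", amount: int) -> None:
        """Model of `self <- amount from source`; the pool total is unchanged."""
        if source.pool is not self.pool:
            raise TypeError("cannot move values between different pools")
        source._take(amount)
        self.amount += amount


tokens = Pool("tokens")
locked_balance = tokens.mint(30)
spendable_balance = tokens.mint(0)

# Unlocking twice cannot inflate the spendable balance: the second move
# finds nothing left in locked_balance and transfers zero.
for _ in range(2):
    spendable_balance.move_from(locked_balance, locked_balance.amount)
```

Because a move always subtracts from the source exactly what it adds to the destination, the “forgot to zero out lockedBalance” mistake has no way to be expressed.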
It’s possible to temporarily move a pooled value into a local variable, but it must be moved back into persistent storage before the local variable goes out of scope, e.g.:
{
    var temp tokens
    temp <- lockedBalance from lockedBalance
} // Compiler error! We need to move tokens from temp to some persistent storage.
A built-in pool named wei is used for incoming and outgoing ether and helps prevent similar accounting errors there.
@payable
func Buy() {
    store! balance <- purchasePrice from msg.value
    // Compiler error! What if msg.value > purchasePrice? We have to track that
    // excess wei somewhere.
}
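The “must be fully accounted for before the scope ends” rule can be imitated at run time with a context manager. The Python sketch below is only an approximation of what Smart checks at compile time; Value, must_drain, and the refund bucket are hypothetical names, and tracking the excess as a refund is just one possible way to account for it.

```python
from contextlib import contextmanager


class Value:
    """A hypothetical transient amount that must be emptied before scope exit."""

    def __init__(self, amount: int) -> None:
        self.amount = amount

    def take(self, amount: int) -> int:
        if amount < 0 or amount > self.amount:
            raise ValueError("insufficient value")
        self.amount -= amount
        return amount


@contextmanager
def must_drain(value: Value):
    """Runtime stand-in for Smart's compile-time check on pooled locals."""
    yield value
    if value.amount != 0:
        raise RuntimeError(f"{value.amount} unaccounted for at end of scope")


def buy(msg_value: int, purchase_price: int) -> dict[str, int]:
    ledger = {"balance": 0, "refund": 0}
    with must_drain(Value(msg_value)) as payment:
        ledger["balance"] += payment.take(purchase_price)
        # Track the excess explicitly; without this line any overpayment
        # would trip the check in must_drain.
        ledger["refund"] += payment.take(payment.amount)
    return ledger
```

For example, buy(12, 10) records 10 as balance and 2 as refund, so every unit of the incoming value ends up somewhere.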
Reentrancy
Reentrancy bugs are still common. These bugs are only possible if the contract’s state is in violation of some invariant at the time of an external call. Pooled values help to prevent this problem by keeping certain things in a consistent state. We considered other mitigations as well, such as disallowing reentrancy by default, but this was never fully designed or implemented.
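To illustrate why the invariant matters, here is a toy Python model of a withdrawal that makes an external call either before or after its bookkeeping is updated. It says nothing about how Smart would handle reentrancy; Vault, attack, and the callback are all invented for this sketch.

```python
from typing import Callable


class Vault:
    """Toy contract: paid_out should never exceed the sum of initial balances."""

    def __init__(self, balances: dict[str, int], update_first: bool) -> None:
        self.balances = dict(balances)
        self.paid_out = 0
        self.update_first = update_first

    def withdraw(self, user: str, on_receive: Callable[[], None]) -> None:
        amount = self.balances[user]
        if amount == 0:
            return
        if self.update_first:
            self.balances[user] = 0  # state is consistent before the call
        self.paid_out += amount
        on_receive()  # external call; the recipient may call back in
        if not self.update_first:
            self.balances[user] = 0  # too late: the callback saw a stale balance


def attack(vault: Vault, user: str, max_reentries: int) -> int:
    """Reenter withdraw from the callback and return the total paid out."""
    calls = 0

    def on_receive() -> None:
        nonlocal calls
        if calls < max_reentries:
            calls += 1
            vault.withdraw(user, on_receive)

    vault.withdraw(user, on_receive)
    return vault.paid_out


stolen = attack(Vault({"mallory": 10}, update_first=False), "mallory", 2)
safe = attack(Vault({"mallory": 10}, update_first=True), "mallory", 2)
```

In the first case the balance is still nonzero during the external call, so each reentrant call pays out again; in the second the balance is already zero and the reentrant calls do nothing.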
Current state
Enough of Smart has been implemented that we’re confident that the design works, but there’s a lot left to do. Lexical analysis is fairly complete. Semantic analysis (type checking and the like) is most of the way there, but notable exceptions include enforcing pooled types and public/private function naming. Code generation is barely started, which means it’s impossible to actually deploy interesting Smart contracts.
The following is the most complex program we were able to deploy and execute:
contract Test {
    storage num uint256
    storage result uint256
    constructor() {
        store! result = 1
        for store! num = 5; num > 0; store! num = num - 1 {
            store! result = result * num
        }
        assert result == 120 // 5 * 4 * 3 * 2
        assert factorial(n: 5) == 120
    }
    func factorial(n uint256) uint256 {
        if n == 0 {
            return 1
        } else {
            return n * factorial(n: n-1)
        }
    }
}
The future
We (Consensys Diligence) don’t plan to pursue Smart any further right now, but the code is now available under the MIT license at https://github.com/consensys/smartlang. If there’s community interest in picking up development of the language or borrowing its ideas for other languages, please contact us so we can help!