Vitalik Buterin's website
Writing by Vitalik Buterin
https://vitalik.ca/
Mon, 20 Jan 2020 17:23:50 -0800
Base Layers And Functionality Escape Velocity
<p>One common strand of thinking in blockchain land goes as follows: blockchains should be maximally simple, because they are a piece of infrastructure that is difficult to change and would lead to great harms if it breaks, and more complex functionality should be built on top, in the form of layer 2 protocols: <a href="https://www.jeffcoleman.ca/state-channels/">state channels</a>, <a href="https://ethresear.ch/t/minimal-viable-plasma/426">Plasma</a>, <a href="https://ethresear.ch/t/on-chain-scaling-to-potentially-500-tx-sec-through-mass-tx-validation/3477">rollup</a>, and so forth. Layer 2 should be the site of ongoing innovation, layer 1 should be the site of stability and maintenance, with large changes only in emergencies (eg. a one-time set of serious breaking changes to prevent the base protocol's cryptography from falling to quantum computers would be okay).</p>
<p>This kind of layer separation is a very nice idea, and in the long term I strongly support it. However, this kind of thinking misses an important point: while layer 1 cannot be <em>too</em> powerful, as greater power implies greater complexity and hence greater brittleness, layer 1 must also be <em>powerful enough</em> for the layer 2 protocols-on-top that people want to build to actually be possible in the first place. Once a layer 1 protocol has achieved a certain level of functionality, which I will term "functionality escape velocity", then yes, you can do everything else on top without further changing the base. But if layer 1 is not powerful enough, then you can talk about filling in the gap with layer 2 systems, but the reality is that there is no way to actually build those systems without reintroducing a whole set of trust assumptions that the layer 1 was trying to get away from. This post will talk about what minimal functionality constitutes "functionality escape velocity".</p>
<h3 id="a-programming-language">A programming language</h3>
<p>It must be possible to execute custom user-generated scripts on-chain. This programming language can be simple, and actually does not need to be high-performance, but it needs to at least have the level of functionality required to be able to verify arbitrary things that might need to be verified. This is important because the layer 2 protocols that are going to be built on top need to have some kind of verification logic, and this verification logic must be executed by the blockchain somehow.</p>
<p>You may have heard of <a href="https://en.wikipedia.org/wiki/Turing_completeness">Turing completeness</a>; the "layman's intuition" for the term being that if a programming language is Turing complete then it can do anything that a computer theoretically could do. Any program in one Turing-complete language can be translated into an equivalent program in any other Turing-complete language. However, it turns out that we only need something slightly lighter: it's okay to restrict to programs without loops, or programs which are <a href="https://en.wikipedia.org/wiki/Total_functional_programming">guaranteed to terminate</a> in a specific number of steps.</p>
<h3 id="rich-statefulness">Rich Statefulness</h3>
<p>It doesn't just matter that a programming language <em>exists</em>, it also matters precisely how that programming language is integrated into the blockchain. Among the more constricted ways that a language could be integrated is if it is used for pure transaction verification: when you send coins to some address, that address represents a computer program <code>P</code> which would be used to verify a transaction that sends coins <em>from</em> that address. That is, if you send a transaction whose hash is <code>h</code>, then you would supply a signature <code>S</code>, and the blockchain would run <code>P(h, S)</code>, and if that outputs TRUE then the transaction is valid. Often, <code>P</code> is a verifier for a cryptographic signature scheme, but it could do more complex operations. Note particularly that in this model <code>P</code> <em>does not have access</em> to the destination of the transaction.</p>
<p>However, this "pure function" approach is not powerful enough to implement many of the layer 2 protocols that people actually want to build. It can do channels (and channel-based systems like the Lightning Network), but it cannot implement other scaling techniques with stronger properties, it cannot be used to bootstrap systems that do have more complicated notions of state, and so forth.</p>
<p>To give a simple example of what the pure function paradigm cannot do, consider a savings account with the following feature: there is a cryptographic key <code>k</code> which can initiate a withdrawal, and if a withdrawal is initiated, within the next 24 hours that same key <code>k</code> can cancel the withdrawal. If a withdrawal remains uncancelled within 24 hours, then anyone can "poke" the account to finalize that withdrawal. The goal is that if the key is stolen, the account holder can prevent the thief from withdrawing the funds. The thief could of course prevent the legitimate owner from getting the funds, but the attack would not be profitable for the thief and so they would probably not bother with it (see <a href="http://hackingdistributed.com/2016/02/26/how-to-implement-secure-bitcoin-vaults/">the original paper</a> for an explanation of this technique).</p>
<p>Unfortunately this technique cannot be implemented with just pure functions. The problem is this: there needs to be some way to move coins from a "normal" state to an "awaiting withdrawal" state. But the program <code>P</code> does not have access to the destination! Hence, any transaction that could authorize moving the coins to an awaiting withdrawal state could also authorize just stealing those coins immediately; <code>P</code> can't tell the difference. The ability to change the state of coins, without completely setting them free, is important to many kinds of applications, including layer 2 protocols. Plasma itself fits into this "authorize, finalize, cancel" paradigm: an exit from Plasma must be approved, then there is a 7 day challenge period, and within that challenge period the exit could be cancelled if the right evidence is provided. Rollup also needs this property: coins inside a rollup must be controlled by a program that keeps track of a state root <code>R</code>, and changes from <code>R</code> to <code>R'</code> if some verifier <code>P(R, R', data)</code> returns TRUE - but it only changes the state to <code>R'</code> in that case, it does not set the coins free.</p>
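<p>To make the "authorize, finalize, cancel" paradigm concrete, here is a minimal sketch of the savings account as a richly-stateful program. Everything in it is an assumption for illustration: the class and method names are hypothetical, time is a plain integer number of seconds, and comparing a key string stands in for real signature verification. The point is only that the program can move coins into an "awaiting withdrawal" state without releasing them.</p>

```python
from dataclasses import dataclass
from typing import Optional

DELAY = 24 * 60 * 60  # the 24-hour cancellation window, in seconds


@dataclass
class Vault:
    """Hypothetical richly-stateful savings account (illustrative names only)."""
    owner_key: str
    balance: int
    pending_to: Optional[str] = None     # destination of a pending withdrawal
    pending_since: Optional[int] = None  # time the withdrawal was initiated

    def initiate(self, key: str, destination: str, now: int) -> None:
        # Key k moves the coins to "awaiting withdrawal"; nothing is released yet.
        # Assumption: a string comparison stands in for checking a signature.
        if key != self.owner_key or self.pending_to is not None:
            raise PermissionError("cannot initiate withdrawal")
        self.pending_to, self.pending_since = destination, now

    def cancel(self, key: str, now: int) -> None:
        # The same key k can cancel, but only inside the 24-hour window.
        if key != self.owner_key or self.pending_to is None:
            raise PermissionError("cannot cancel")
        if now - self.pending_since >= DELAY:
            raise PermissionError("cancellation window has passed")
        self.pending_to = self.pending_since = None

    def poke(self, now: int) -> Optional[tuple]:
        # Anyone can finalize once the window passes without a cancellation.
        if self.pending_to is None or now - self.pending_since < DELAY:
            return None
        payout = (self.pending_to, self.balance)
        self.balance = 0
        self.pending_to = self.pending_since = None
        return payout
```

<p>Note that <code>initiate</code> records the destination inside the account's own state. That is exactly what a pure verifier <code>P(h, S)</code> with no access to the destination cannot do.</p>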
<p>This ability to authorize state changes without completely setting all coins in an account free, is what I mean by "rich statefulness". It can be implemented in many ways, some UTXO-based, but without it a blockchain is not powerful enough to implement most layer 2 protocols, without including trust assumptions (eg. a set of functionaries who are collectively trusted to execute those richly-stateful programs).</p>
<p><small><i>Note: yes, I know that if <code>P</code> has access to <code>h</code> then you can just include the destination address as part of <code>S</code> and check it against <code>h</code>, and restrict state changes that way. But it is possible to have a programming language that is too resource-limited or otherwise restricted to actually do this; and surprisingly this often actually is the case in blockchain scripting languages.</i></small></p>
<h3 id="sufficient-data-scalability-and-low-latency">Sufficient data scalability and low latency</h3>
<p>It turns out that Plasma and channels, and other layer 2 protocols that are fully off-chain, have some fundamental weaknesses that prevent them from fully replicating the capabilities of layer 1. I go into this in detail <a href="https://vitalik.ca/general/2019/08/28/hybrid_layer_2.html">here</a>; the summary is that these protocols need to have a way of adjudicating situations where some parties maliciously fail to provide data that they promised to provide, and because data publication is not globally verifiable (you don't know when data was published unless you already downloaded it yourself) these adjudication games are not game-theoretically stable. Channels and Plasma cleverly get around this instability by adding additional assumptions, particularly assuming that for every piece of state, there is a single actor that is interested in that state not being incorrectly modified (usually because it represents coins that they own) and so can be trusted to fight on its behalf. However, this is far from general-purpose; systems like <a href="http://uniswap.exchange">Uniswap</a>, for example, include a large "central" contract that is not owned by anyone, and so they cannot effectively be protected by this paradigm.</p>
<p>There is one way to get around this, which is layer 2 protocols that publish very small amounts of data on-chain, but do computation entirely off-chain. If data is guaranteed to be available, then computation being done off-chain is okay, because games for adjudicating who did computation correctly and who did it incorrectly <em>are</em> game-theoretically stable (or could be replaced entirely by <a href="https://vitalik.ca/general/2017/02/01/zk_snarks.html">SNARKs</a> or <a href="https://vitalik.ca/general/2017/11/09/starks_part_1.html">STARKs</a>). This is the logic behind <a href="https://ethresear.ch/t/on-chain-scaling-to-potentially-500-tx-sec-through-mass-tx-validation/3477">ZK rollup</a> and <a href="https://medium.com/plasma-group/ethereum-smart-contracts-in-l2-optimistic-rollup-2c1cef2ec537">optimistic rollup</a>. If a blockchain allows for the publication and guarantees the availability of a reasonably large amount of data, even if its capacity for <em>computation</em> remains very limited, then the blockchain can support these layer-2 protocols and achieve a high level of scalability <em>and</em> functionality.</p>
<p>Just how much data does the blockchain need to be able to process and guarantee? Well, it depends on what TPS you want. With a rollup, you can compress most activity to ~10-20 bytes per transaction, so 1 kB/sec gives you 50-100 TPS, 1 MB/sec gives you 50,000-100,000 TPS, and so forth. Fortunately, internet bandwidth <a href="http://www.circleid.com/posts/20191119_nielsens_law_of_internet_bandwidth/">continues to grow quickly</a>, and does not seem to be slowing down the way Moore's law for computation is, so increasing scaling for data without increasing computational load is quite a viable path for blockchains to take!</p>
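<p>The arithmetic above is just data bandwidth divided by bytes per transaction. A small sketch, assuming 1 kB = 1,000 bytes and 1 MB = 1,000,000 bytes (the function name is made up for this example):</p>

```python
def rollup_tps(bytes_per_second: float, bytes_per_tx: float) -> float:
    """Transactions per second that fit into a given on-chain data rate."""
    return bytes_per_second / bytes_per_tx


# ~10-20 bytes per transaction, as estimated in the text above.
for label, rate in [("1 kB/sec", 1_000), ("1 MB/sec", 1_000_000)]:
    low = rollup_tps(rate, 20)   # pessimistic: 20 bytes per transaction
    high = rollup_tps(rate, 10)  # optimistic: 10 bytes per transaction
    print(f"{label}: {low:,.0f}-{high:,.0f} TPS")
```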
<p>Note also that it is not just data capacity that matters, it is also data latency (ie. having low block times). Layer 2 protocols like rollup (or for that matter Plasma) only give any guarantees of security when the data actually is published to chain; hence, the time it takes for data to be reliably included (ideally "finalized") on chain is the time that it takes between when Alice sends Bob a payment and Bob can be confident that this payment will be included. The block time of the base layer sets the latency for anything whose confirmation depends on things being included in the base layer. This could be worked around with on-chain security deposits, aka "bonds", at the cost of high capital inefficiency, but such an approach is inherently imperfect because a malicious actor could trick an unlimited number of different people by sacrificing one deposit.</p>
<h3 id="conclusions">Conclusions</h3>
<p>"Keep layer 1 simple, make up for it on layer 2" is NOT a universal answer to blockchain scalability and functionality problems, because it fails to take into account that layer 1 blockchains themselves must have a sufficient level of scalability and functionality for this "building on top" to actually be possible (unless your so-called "layer 2 protocols" are just trusted intermediaries). However, it is true that beyond a certain point, any layer 1 functionality <em>can</em> be replicated on layer 2, and in many cases it's a good idea to do this to improve upgradeability. Hence, we need <a href="https://vitalik.ca/general/2018/08/26/layer_1.html">layer 1 development in parallel with layer 2 development in the short term, and more focus on layer 2 in the long term</a>.</p>
Thu, 26 Dec 2019 02:03:10 -0800
https://vitalik.ca/general/2019/12/26/mvb.html
Christmas Special
<p>Since it's Christmas time now, and we're theoretically supposed to be enjoying ourselves and spending time with our families instead of waging endless holy wars on Twitter, this blog post will offer some games that you can play with your friends that will help you have fun <em>and</em> at the same time understand some spooky mathematical concepts!</p>
<h3 id="dimensional-chess">1.58 dimensional chess</h3>
<center>
<br> <a href="https://twitter.com/el33th4xor/status/1138777837320716288"><img src="http://vitalik.ca/files/posts_files/christmas-files/chess_tweet.png" /></a> <br><br>
</center>
<p>This is a variant of chess where the board is set up like this:</p>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/chess.png" /> <br><br>
</center>
<p>The board is still a normal 8x8 board, but there are only 27 open squares. The other 37 squares should be covered up by checkers or Go pieces or anything else to denote that they are inaccessible. The rules are the same as chess, with a few exceptions:</p>
<ul>
<li>White pawns move up, black pawns move left. White pawns take going left-and-up or right-and-up, black pawns take going left-and-down or left-and-up. White pawns promote upon reaching the top, black pawns promote upon reaching the left.</li>
<li>No en passant, castling, or two-step-forward pawn jumps.</li>
<li>Chess pieces cannot move onto <em>or through</em> the 37 covered squares. Knights cannot move onto the 37 covered squares, but don't care what they move "through".</li>
</ul>
<p>The game is called 1.58 dimensional chess because the 27 open squares are chosen according to a pattern based on the <a href="https://en.wikipedia.org/wiki/Sierpi%C5%84ski_triangle">Sierpinski triangle</a>. You start off with a single open square, and then every time you double the width, you take the shape at the end of the previous step, and copy it to the top left, top right and bottom left corners, but leave the bottom right corner inaccessible. Whereas in a one-dimensional structure, doubling the width increases the space by 2x, and in a two-dimensional structure, doubling the width increases the space by 4x (4 = 2<sup>2</sup>), and in a three-dimensional structure, doubling the width increases the space by 8x (8 = 2<sup>3</sup>), here doubling the width increases the space by 3x (3 = 2<sup>1.58496</sup>), hence "1.58 dimensional" (see <a href="https://en.wikipedia.org/wiki/Hausdorff_dimension">Hausdorff dimension</a> for details).</p>
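<p>The doubling construction is easy to reproduce in code. This is a rough sketch, assuming row 0 is the top of the board and column 0 is the left edge; the orientation follows the description in the text and may not match the picture exactly.</p>

```python
import math


def open_squares(doublings: int) -> set:
    """Open (row, col) squares after doubling the board width `doublings` times."""
    squares = {(0, 0)}  # start with a single open square
    size = 1
    for _ in range(doublings):
        squares = (
            squares                                   # copy in the top left
            | {(r, c + size) for r, c in squares}     # copy in the top right
            | {(r + size, c) for r, c in squares}     # copy in the bottom left
        )                                             # bottom right stays covered
        size *= 2
    return squares


board = open_squares(3)  # three doublings: 1 -> 2 -> 4 -> 8 squares wide
for r in range(8):
    print("".join("." if (r, c) in board else "#" for c in range(8)))
print(len(board), "open squares; dimension =", math.log2(3))
```

<p>Each doubling triples the number of open squares, so three doublings give 3<sup>3</sup> = 27.</p>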
<p>The game is substantially simpler and more "tractable" than full-on chess, and it's an interesting exercise in showing how in <a href="https://en.wikipedia.org/wiki/Flatland">lower-dimensional spaces</a> defense becomes much easier than offense. Note that the relative value of different pieces may change here, and new kinds of endings become possible (eg. you can checkmate with just a bishop).</p>
<h3 id="dimensional-tic-tac-toe">3 dimensional tic tac toe</h3>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/tic4.png" /> <br><br>
</center>
<p>The goal here is to get 4 in a straight line, where the line can go in any direction, along an axis or diagonal, including between planes. For example in this configuration X wins:</p>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/tic4_2.png" /> <br><br>
</center>
<p>It's considerably harder than <a href="https://www.quora.com/Is-there-a-way-to-never-lose-at-Tic-Tac-Toe">traditional 2D tic tac toe</a>, and hopefully much more fun!</p>
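<p>If you want to check a position by computer, the winning lines can be enumerated by brute force. A minimal sketch, assuming a 4x4x4 board with cells addressed as <code>(x, y, z)</code> tuples (my own convention, not something fixed by the rules):</p>

```python
from itertools import product

N = 4  # assumed board size: 4 x 4 x 4


def winning_lines(n: int = N) -> set:
    """All straight lines of n cells: along axes, plane diagonals and space diagonals."""
    directions = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    lines = set()
    for start in product(range(n), repeat=3):
        for d in directions:
            cells = [tuple(s + t * k for s, k in zip(start, d)) for t in range(n)]
            # Keep the line only if every cell stays on the board.
            if all(0 <= x < n for cell in cells for x in cell):
                # frozenset merges a line with the same line walked backwards.
                lines.add(frozenset(cells))
    return lines


print(len(winning_lines()))  # 76 lines on a 4x4x4 board
```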
<h3 id="modular-tic-tac-toe">Modular tic-tac-toe</h3>
<p>Here, we go back down to having two dimensions, except we allow lines to wrap around:</p>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/tic4_3.png" /> <br><small><i>X wins</i></small>
</center>
<p><br><br></p>
<p>Note that we allow diagonal lines with any slope, as long as they pass through all four points. Particularly, this means that lines with slope +/- 2 and +/- 1/2 are admissible:</p>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/tic4_4.png" /> <br><br>
</center>
<p>Mathematically, the board can be interpreted as a 2-dimensional vector space over <a href="https://en.wikipedia.org/wiki/Modular_arithmetic">integers modulo 4</a>, with the goal being to fill in a line that passes through four points over this space. Note that there exists at least one line passing through any two points.</p>
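<p>Here is a small sketch that enumerates the wraparound lines. It assumes my reading of the rule: a line is the set of points <code>(x0 + t*dx, y0 + t*dy)</code> mod 4 for <code>t = 0..3</code>, and it only counts if those four points are all different.</p>

```python
from itertools import product

M = 4  # board is 4 x 4, coordinates taken modulo 4


def modular_lines(m: int = M) -> set:
    """All wraparound lines that hit m distinct squares."""
    lines = set()
    for x0, y0, dx, dy in product(range(m), repeat=4):
        points = frozenset(
            ((x0 + t * dx) % m, (y0 + t * dy) % m) for t in range(m)
        )
        if len(points) == m:  # steps like (2, 0) revisit squares, so skip them
            lines.add(points)
    return lines


lines = modular_lines()
print(len(lines))  # 24: rows, columns, and slopes +1, -1, 2 and 1/2
```

<p>Under this reading, slope +2 and slope -2 give the same lines, because 2 and -2 are equal modulo 4.</p>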
<h3 id="tic-tac-toe-over-the-4-element-binary-field">Tic tac toe over the 4-element binary field</h3>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/tic4_5.png" /> <br><br>
</center>
<p>Here, we have the same concept as above, except we use an even spookier mathematical structure, the <a href="https://en.wikipedia.org/wiki/Finite_field#Field_with_four_elements">4-element field</a> of polynomials over <span class="math inline">\(Z_2\)</span> modulo <span class="math inline">\(x^2 + x + 1\)</span>. This structure has pretty much no reasonable geometric interpretation, so I'll just give you the addition and multiplication tables:</p>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/tic4_6.png" /> <br><br>
</center>
<p>OK fine, here are all possible lines, excluding the horizontal and the vertical lines (which are also admissible) for brevity:</p>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/tic4_7.png" style="width: 450px" /> <br><br>
</center>
<p>The lack of geometric interpretation does make the game harder to play; you pretty much have to memorize the twenty winning combinations, though note that they are <em>basically</em> rotations and reflections of the same four basic shapes (axial line, diagonal line, diagonal line starting in the middle, that weird thing that doesn't look like a line).</p>
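<p>If memorizing is not your thing, the twenty lines can also be generated. In this sketch the field elements are encoded as the integers 0 to 3, with the bits being the coefficients of a polynomial over <span class="math inline">\(Z_2\)</span> (so 2 means <span class="math inline">\(x\)</span> and 3 means <span class="math inline">\(x + 1\)</span>). That encoding is my assumption and may be labelled differently from the tables in the picture.</p>

```python
from itertools import product


def gf4_add(a: int, b: int) -> int:
    """Addition of polynomials over Z_2 is bitwise XOR."""
    return a ^ b


def gf4_mul(a: int, b: int) -> int:
    """Carry-less multiplication, reduced modulo x^2 + x + 1 (binary 111)."""
    result = 0
    for i in range(2):
        if (b >> i) & 1:
            result ^= a << i
    if result & 0b100:  # an x^2 term is present: replace it with x + 1
        result ^= 0b111
    return result


def gf4_lines() -> set:
    """All lines {(x0 + t*dx, y0 + t*dy) : t in GF(4)} with a nonzero direction."""
    lines = set()
    for x0, y0, dx, dy in product(range(4), repeat=4):
        if (dx, dy) == (0, 0):
            continue
        lines.add(frozenset(
            (gf4_add(x0, gf4_mul(t, dx)), gf4_add(y0, gf4_mul(t, dy)))
            for t in range(4)
        ))
    return lines


print(len(gf4_lines()))  # 20, including the 4 horizontal and 4 vertical lines
```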
<h3 id="now-play-1.77-dimensional-connect-four.-i-dare-you.">Now play 1.77 dimensional connect four. I dare you.</h3>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/tic4_8.png" style="width: 450px" /> <br><br>
</center>
<h3 id="modular-poker">Modular poker</h3>
<p>Everyone is dealt five cards (you can use whatever variant poker rules you want here in terms of how these cards are dealt and whether or not players have the right to swap cards out). The cards are interpreted as: jack = 11, queen = 12, king = 0, ace = 1. A hand is stronger than another hand if it contains a longer sequence of cards with a constant difference between consecutive cards (allowing wraparound).</p>
<p>Mathematically, this can be represented as follows: a hand is stronger if the player can come up with a line <span class="math inline">\(L(x) = mx+b\)</span> such that they have cards for the numbers <span class="math inline">\(L(0)\)</span>, <span class="math inline">\(L(1)\)</span> ... <span class="math inline">\(L(k)\)</span> for the highest <span class="math inline">\(k\)</span>.</p>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/cards1.png" /> <br><small><i>Example of a full five-card winning hand. y = 4x + 5.</i></small>
</center>
<p><br><br></p>
<p>To break ties between equal maximum-length sequences, count the number of distinct length-three sequences they have; the hand with more distinct length-three sequences wins.</p>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/cards2.png" /> <br><small><i>This hand has four length-three sequences: K 2 4, K 4 8, 2 3 4, 3 8 K. This is rare.</i></small>
</center>
<p><br><br></p>
<p>Only consider lines of length three or higher. If a hand has three or more of the same denomination, that counts as a sequence, but if a hand has two of the same denomination, any sequences passing through that denomination only count as one sequence.</p>
<center>
<br> <img src="http://vitalik.ca/files/posts_files/christmas-files/cards3.png" /> <br><small><i>This hand has no length-three sequences.</i></small>
</center>
<p><br><br></p>
<p>If two hands are completely tied, the hand with the higher highest card (using J = 11, Q = 12, K = 0, A = 1 as above) wins.</p>
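<p>For settling arguments at the table, here is a rough sketch of the two main scoring steps. It assumes that wraparound means arithmetic modulo 13, and it handles pairs by only looking at distinct denominations, which is my reading of the rule above. The function names are made up, and the final highest-card tie-break is left out.</p>

```python
from collections import Counter
from itertools import combinations

MODULUS = 13  # assumption: ace = 1, ..., queen = 12, king = 0, wrapping mod 13


def longest_sequence(hand) -> int:
    """Longest run L(0), L(1), ... with L(x) = m*x + b (mod 13) held in the hand.

    Only results of three or more count as a sequence under the rules.
    """
    counts = Counter(v % MODULUS for v in hand)
    best = max(counts.values())  # m = 0: several cards of one denomination
    for b in counts:
        for m in range(1, MODULUS):
            length = 0
            while length < MODULUS and (b + length * m) % MODULUS in counts:
                length += 1
            best = max(best, length)
    return best


def length_three_sequences(hand) -> int:
    """Number of distinct length-three sequences, used to break ties."""
    counts = Counter(v % MODULUS for v in hand)
    total = sum(1 for c in counts.values() if c >= 3)  # three of a kind
    for a, b, c in combinations(sorted(counts), 3):
        # Three distinct cards form a sequence iff one is the midpoint (mod 13).
        if any((2 * mid - x - y) % MODULUS == 0
               for mid, x, y in ((a, b, c), (b, a, c), (c, a, b))):
            total += 1
    return total


print(longest_sequence([5, 9, 0, 4, 8]))        # 5, the y = 4x + 5 hand
print(length_three_sequences([0, 2, 3, 4, 8]))  # 4, the K 2 3 4 8 hand
```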
<p>Enjoy!</p>
Tue, 24 Dec 2019 17:03:10 -0800
https://vitalik.ca/general/2019/12/24/christmas.html
Quadratic Payments: A Primer
<p><em>Special thanks to Karl Floersch and Jinglan Wang for feedback</em></p>
<p>If you follow applied mechanism design or decentralized governance at all, you may have recently heard one of a few buzzwords: <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2003531">quadratic voting</a>, <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3243656">quadratic funding</a> and <a href="https://kortina.nyc/essays/speech-is-free-distribution-is-not-a-tax-on-the-purchase-of-human-attention-and-political-power/">quadratic attention purchase</a>. These ideas have been gaining popularity rapidly over the last few years, and small-scale tests have already been deployed: the <a href="https://presidential-hackathon.taiwan.gov.tw/en/">Taiwanese presidential hackathon</a> used quadratic voting to vote on winning projects, Gitcoin Grants <a href="https://vitalik.ca/general/2019/10/24/gitcoin.html">used quadratic funding</a> to fund public goods in the Ethereum ecosystem, and the Colorado Democratic party <a href="https://www.wired.com/story/colorado-quadratic-voting-experiment">also experimented with</a> quadratic voting to determine their party platform.</p>
<p>To the proponents of these voting schemes, this is not just another slight improvement to what exists. Rather, it's an initial foray into a fundamentally new class of social technology which has the potential to overturn how we make many public decisions, large and small. The ultimate effect of these schemes rolled out in their full form <em>could be as deeply transformative as the industrial-era advent of mostly-free markets and constitutional democracy</em>. But now, you may be thinking: "These are large promises. What do these new governance technologies have that justifies such claims?"</p>
<h3 id="private-goods-private-markets...">Private goods, private markets...</h3>
<p>To understand what is going on, let us first consider an existing social technology: money, and property rights - the invisible social technology that generally hides behind money. Money and private property are extremely powerful social technologies, for all the reasons classical economists have been stating for over a hundred years. If Bob is producing apples, and Alice wants to buy apples, we can economically model the interaction between the two, and the results <em>seem to make sense</em>:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/Market1.png" />
</center>
<p><br><br></p>
<p>Alice keeps buying apples until the marginal value of the next apple to her is less than the cost of producing it, which is pretty much exactly the optimal thing that could happen. And if the cost of producing the apples is greater than their value to Alice, then Alice just doesn't buy any:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/Market2.png" />
</center>
<p><br><br></p>
<p>This is all formalized in results such as the "<a href="https://en.wikipedia.org/wiki/Fundamental_theorems_of_welfare_economics">fundamental theorems of welfare economics</a>". Now, those of you who have learned some economics may be screaming, but what about <a href="https://en.wikipedia.org/wiki/Imperfect_competition">imperfect competition</a>? <a href="https://en.wikipedia.org/wiki/Information_asymmetry">Asymmetric information</a>? <a href="https://en.wikipedia.org/wiki/Economic_inequality">Economic inequality</a>? <a href="https://en.wikipedia.org/wiki/Public_good">Public goods</a>? <a href="https://en.wikipedia.org/wiki/Externality">Externalities</a>? Many activities in the real world, including those that are key to the progress of human civilization, benefit (or harm) many people in complicated ways. These activities and the consequences that arise from them often cannot be neatly decomposed into sequences of distinct trades between two parties.</p>
<p>But since when do we expect a single package of technologies to solve every problem anyway? "What about oceans?" isn't an argument against <em>cars</em>, it's an argument against <em>car maximalism</em>, the position that we need cars and nothing else. Much like how private property and markets deal with private goods, can we try to use economic means to deduce what kind of social technologies would work well for encouraging production of the public goods that we need?</p>
<h3 id="public-goods-public-markets">... Public goods, public markets</h3>
<p>Private goods (eg. apples) and public goods (eg. public parks, air quality, scientific research, this article...) are different in some key ways. When we are talking about private goods, production for multiple people (eg. the same farmer makes apples for both Alice and Bob) can be decomposed into (i) the farmer making some apples for Alice, and (ii) the farmer making some other apples for Bob. If Alice wants apples but Bob does not, then the farmer makes Alice's apples, collects payment from Alice, and leaves Bob alone. Even complex collaborations (the <a href="https://fee.org/resources/i-pencil/">"I, Pencil" essay</a> popular in libertarian circles comes to mind) can be decomposed into a series of such interactions. When we are talking about public goods, however, <em>this kind of decomposition is not possible</em>. When I write this blog article, it can be read by both Alice and Bob (and everyone else). I <em>could</em> put it behind a paywall, but if it's popular enough it will inevitably get mirrored on third-party sites, and paywalls are in any case annoying and not very effective. Furthermore, making an article available to ten people is not ten times cheaper than making the article available to a hundred people; rather, <em>the cost is exactly the same</em>. So I either produce the article for everyone, or I do not produce it for anyone at all.</p>
<p>So here comes the challenge: how do we aggregate together people's preferences? Some private and public goods are worth producing, others are not. In the case of private goods, the question is easy, because we can just decompose it into a series of decisions for each individual. Whatever amount each person is willing to pay for, that much gets produced for them; the economics is not especially complex. In the case of public goods, however, you cannot "decompose", and so we need to add up people's preferences in a different way.</p>
<p>First of all, let's see what happens if we just put up a plain old regular market: I offer to write an article as long as at least $1000 of money gets donated to me (fun fact: <a href="https://bitcointalk.org/index.php?topic=28681.msg360909#msg360909">I literally did this back in 2011</a>). Every dollar donated increases the probability that the goal will be reached and the article will be published; let us call this "marginal probability" <code>p</code>. At a cost of $<code>k</code>, you can increase the probability that the article will be published by <code>k * p</code> (though eventually the gains will decrease as the probability approaches 100%). Let's say to you personally, the article being published is worth $<code>V</code>. Would you donate? Well, donating a dollar increases the probability it will be published by <code>p</code>, and so gives you an expected $<code>p * V</code> of value. If <code>p * V > 1</code>, you donate, and quite a lot, and if <code>p * V < 1</code> you don't donate at all.</p>
<p>Phrased less mathematically, either you value the article enough (and/or are rich enough) to pay, and if that's the case it's in your interest to keep paying (and influencing) quite a lot, or you don't value the article enough and you contribute nothing. Hence, the only blog articles that get published would be articles where some single person is willing to <a href="https://en.wikipedia.org/wiki/Patronage">basically pay for it themselves</a> (in my experiment in 2011, this prediction was experimentally verified: in <a href="https://bitcointalk.org/index.php?topic=23934.msg306437#msg306437">most</a> <a href="https://bitcointalk.org/index.php?topic=28681.msg360909#msg360909">rounds</a>, over half of the total contribution came from a single donor).</p>
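<p>The all-or-nothing behavior falls straight out of the decision rule. A tiny sketch with made-up numbers (the marginal probability and the two valuations below are purely hypothetical):</p>

```python
def should_donate(p: float, value: float) -> bool:
    """One more dollar is worth giving only if its expected value p * V exceeds $1."""
    return p * value > 1


p = 0.001  # hypothetical: each dollar adds 0.1% to the chance of publication

print(should_donate(p, 5000))  # a reader who values the article at $5000 keeps paying
print(should_donate(p, 200))   # a reader who values it at $200 pays nothing
```

<p>Since the rule does not depend on how much you have already given (until the probability saturates), whoever passes the threshold keeps passing it, and everyone else stays at zero.</p>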
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/Market8.png" />
</center>
<p><br><br></p>
<p>Note that <em>this reasoning applies for any kind of mechanism that involves "buying influence" over matters of public concern</em>. This includes paying for public goods, shareholder voting in corporations, public advertising, bribing politicians, and much more. The little guy has too little influence (not quite zero, because in the real world things like altruism exist) and the big guy has too much. If you had an intuition that markets work great for buying apples, but money is corrupting in "the public sphere", this is basically a simplified mathematical model that shows why.</p>
<p>We can also consider a different mechanism: one-person-one-vote. Let's say you can either vote that I deserve a reward for writing this article, or you can vote that I don't, and my reward is proportional to the number of votes in my favor. We can interpret this as follows: your first "contribution" costs only a small amount of effort, so you'll support an article if you care about it enough, but after that point there is no more room to contribute further; your second contribution "costs" infinity.</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/Market9.png" />
</center>
<p><br><br></p>
<p>Now, you might notice that neither of the graphs above looks quite right. The first graph over-privileges people who <em>care a lot</em> (or are wealthy), the second graph over-privileges people who <em>care only a little</em>, which is also a problem. The single sheep's desire to live is more important than the two wolves' desire to have a tasty dinner.</p>
<p>But what do we actually want? Ultimately, we want a scheme where <em>how much influence you "buy" is proportional to how much you care</em>. In the mathematical lingo above, we want your <code>k</code> to be proportional to your <code>V</code>. But here's the problem: your <code>V</code> determines how much you're willing to pay for <em>one</em> unit of influence. If Alice were willing to pay $100 for the article if she had to fund it herself, then she would be willing to pay $1 for an increased 1% chance it will get written, and if Bob were only willing to pay $50 for the article then he would only be willing to pay $0.5 for the same "unit of influence".</p>
<p>So how do we match these two up? The answer is clever: <em>your n'th unit of influence costs you $n</em>. That is, for example, you could buy your first vote for $0.01, but then your second would cost $0.02, your third $0.03, and so forth. Take Alice in the example above; in such a system she would keep buying units of influence until the cost of the next one got to $1, so she would buy 100 units. Bob would similarly buy until the cost got to $0.5, so he would buy 50 units. Alice's 2x higher valuation turned into 2x more units of influence purchased.</p>
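<p>The Alice and Bob numbers can be checked with a few lines of code. This is only a sketch: prices are in whole cents to avoid floating-point noise, the n'th unit costs n cents as in the example, and I assume a buyer still purchases a unit whose price exactly equals what it is worth to them.</p>

```python
def units_purchased(value_per_unit_cents: int) -> int:
    """Keep buying while the next unit costs no more than it is worth to you."""
    n = 0
    while n + 1 <= value_per_unit_cents:  # the (n+1)'th unit costs n+1 cents
        n += 1
    return n


def total_cost_cents(n: int) -> int:
    """Total paid for n units: 1 + 2 + ... + n cents."""
    return n * (n + 1) // 2


alice = units_purchased(100)  # Alice values one unit of influence at $1.00
bob = units_purchased(50)     # Bob values one unit of influence at $0.50
print(alice, bob)                                        # 100 50
print(total_cost_cents(alice), total_cost_cents(bob))    # 5050 1275 (cents)
```

<p>Alice buys twice as many units as Bob, but pays roughly four times as much in total.</p>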
<p>Let's draw this as a graph:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/Market10.png" />
</center>
<p><br><br></p>
<p>Now let's look at all three beside each other:</p>
<center>
<table>
<tr>
<td>
One dollar one vote
</td>
<td>
Quadratic voting
</td>
<td>
One person one vote
</td>
</tr>
<tr>
<td>
<img src="https://vitalik.ca/files/posts_files/qv-files/Market8.png" />
</td>
<td>
<img src="https://vitalik.ca/files/posts_files/qv-files/Market10.png" />
</td>
<td>
<img src="https://vitalik.ca/files/posts_files/qv-files/Market9.png" />
</td>
</tr>
</table>
</center>
<p><br><br></p>
<p>Notice that only quadratic voting has this nice property that the amount of influence you purchase is proportional to how much you care; the other two mechanisms either over-privilege concentrated interests or over-privilege diffuse interests.</p>
<p>Now, you might ask, where does the <em>quadratic</em> come from? Well, the <em>marginal</em> cost of the n'th vote is $n (or $0.01 * n), but the <em>total</em> cost of n votes is <span class="math inline">\(\approx \frac{n^2}{2}\)</span>. You can view this geometrically as follows:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/qv_triangle.png" />
</center>
<p><br><br></p>
<p>The total cost is the area of a triangle, and you probably learned in math class that area is base * height / 2. And since here base and height are proportional, that basically means that total cost is proportional to number of votes squared - hence, "quadratic". But honestly it's easier to think "your n'th unit of influence costs $n".</p>
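<p>A quick numerical check of the triangle argument, as a sketch: summing the marginal costs 1 + 2 + ... + n gives exactly n * (n + 1) / 2, which approaches n<sup>2</sup> / 2 as n grows. The helper name is hypothetical, and costs are in units of the price step.</p>

```python
def total_cost(n_votes: int) -> int:
    """Sum of the marginal costs 1 + 2 + ... + n_votes."""
    return sum(range(1, n_votes + 1))


for n in (1, 10, 100, 1000):
    exact = total_cost(n)
    approx = n * n / 2
    # The ratio tends to 1 as n grows, which is why "n^2 / 2" is a fair summary.
    print(n, exact, approx, exact / approx)
```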
<p>Finally, you might notice that above I've been vague about what "one unit of influence" actually means. This is deliberate; it can mean different things in different contexts, and the different "flavors" of quadratic payments reflect these different perspectives.</p>
<h3 id="quadratic-voting">Quadratic Voting</h3>
<p><em>See also the original paper: <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2003531">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2003531</a></em></p>
<p>Let us begin by exploring the first "flavor" of quadratic payments: quadratic voting. Imagine that some organization is trying to choose between two choices for some decision that affects all of its members. For example, this could be a company or a nonprofit deciding which part of town to make a new office in, or a government deciding whether or not to implement some policy, or an internet forum deciding whether or not its rules should allow discussion of cryptocurrency prices. Within the context of the organization, the choice made is a public good (or public bad, depending on whom you talk to): everyone "consumes" the results of the same decision, they just have different opinions about how much they like the result.</p>
<p>This seems like a perfect target for quadratic voting. The goal is that option A gets chosen if in total people like A more, and option B gets chosen if in total people like B more. With simple voting ("one person one vote"), the distinction between stronger vs weaker preferences gets ignored, so on issues where one side is of very high value to a few people and the other side is of low value to more people, simple voting is likely to give wrong answers. With a private-goods market mechanism where people can buy as many votes as they want at the same price per vote, the individual with the strongest preference (or the wealthiest) carries everything. Quadratic voting, where you can make n votes in either direction at a cost of n<sup>2</sup>, is right in the middle between these two extremes, and creates the perfect balance.</p>
<br>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/Market7.png?2e" /><br><i><small>Note that in the voting case, we're deciding two options, so different people will favor A over B or B over A; hence, unlike the graphs we saw earlier that start from zero, here voting and preference can both be positive or negative (which option is considered positive and which is negative doesn't matter; the math works out the same way)</small></i>
</center>
<p><br><br></p>
<p>As shown above, because the n'th vote has a cost of <code>n</code>, the number of votes you make is proportional to how much you value one unit of influence over the decision (the value of the decision multiplied by the probability that one vote will tip the result), and hence proportional to how much you care about A being chosen over B or vice versa. Hence, we once again have this nice clean "preference adding" effect.</p>
<p>We can extend quadratic voting in multiple ways. First, we can allow voting between more than two options. While traditional voting schemes inevitably fall prey to various kinds of "strategic voting" issues because of <a href="https://en.wikipedia.org/wiki/Arrow%27s_impossibility_theorem">Arrow's theorem</a> and <a href="https://en.wikipedia.org/wiki/Duverger%27s_law">Duverger's law</a>, quadratic voting <a href="http://www.econ.msu.edu/seminars/docs/QuadMultAltshort19.pdf">continues to be optimal</a> in contexts with more than two choices.</p>
<blockquote>
<p><strong>The intuitive argument for those interested</strong>: suppose there are established candidates A and B and new candidate C. Some people favor C > A > B but others C > B > A. In a regular vote, if both sides think C stands no chance, they decide they may as well vote their preference between A and B, so C gets no votes, and C's failure becomes a self-fulfilling prophecy. In quadratic voting the former group would vote [A +10, B -10, C +1] and the latter [A -10, B +10, C +1], so the A and B votes cancel out and C's popularity shines through.</p>
</blockquote>
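<p>The three-candidate argument can be tallied directly. This is only a sketch: the 50/50 split of the electorate is an assumed figure, and charging each ballot the sum of its squared votes is one common convention for multi-option quadratic voting rather than something the text specifies.</p>

```python
# Ballots from the example: signed votes per candidate.
ballot_c_a_b = {"A": +10, "B": -10, "C": +1}  # voters preferring C > A > B
ballot_c_b_a = {"A": -10, "B": +10, "C": +1}  # voters preferring C > B > A


def tally(ballots):
    """Add up signed votes per candidate across all ballots."""
    totals = {}
    for ballot in ballots:
        for candidate, votes in ballot.items():
            totals[candidate] = totals.get(candidate, 0) + votes
    return totals


def ballot_cost(ballot):
    """Assumed pricing: n votes on a candidate cost n^2, summed over candidates."""
    return sum(votes * votes for votes in ballot.values())


# Assumption for illustration: 50 voters in each camp.
electorate = [ballot_c_a_b] * 50 + [ballot_c_b_a] * 50
totals = tally(electorate)
winner = max(totals, key=totals.get)
print(totals, winner)
```

<p>The A and B votes net out to zero while C keeps every one of its small positive votes, so C comes out ahead.</p>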
<p>Second, we can look not just at voting between discrete options, but also at voting on the setting of a thermostat: anyone can push the thermostat up or down by 0.01 degrees n times by paying a cost of n<sup>2</sup>.</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/tug_of_war.png" /><br><small><i>Plot twist: the side wanting it colder only wins when they convince the other side that "C" stands for "caliente".</i></small>
</center>
<p><br><br></p>
<h3 id="quadratic-funding">Quadratic funding</h3>
<p><em>See also the original paper: <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3243656">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3243656</a></em></p>
<p>Quadratic voting is optimal when you need to make some fixed number of collective decisions. But one weakness of quadratic voting is that it doesn't come with a built-in mechanism for deciding what goes on the ballot in the first place. Proposing votes is potentially a source of considerable power if not handled with care: a malicious actor in control of it can repeatedly propose some decision that a majority weakly approves of and a minority strongly disapproves of, and keep proposing it until the minority runs out of voting tokens (if you do the math you'll see that the minority would burn through tokens much faster than the majority). Let's consider a flavor of quadratic payments that does not run into this issue, and makes the choice of decisions itself endogenous (ie. part of the mechanism itself). In this case, the mechanism is specialized for one particular use case: individual provision of public goods.</p>
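<p>To make the "if you do the math" remark concrete, here is a small sketch with assumed numbers (a 90-person majority and a 10-person minority, each majority member casting a single vote). It computes what each minority member must spend just to tie a single round of the proposal:</p>

```python
def per_voter_cost(votes_each: float) -> float:
    """Quadratic pricing: casting n votes costs n^2 tokens."""
    return votes_each ** 2


# Assumed group sizes, chosen only for illustration.
majority_size, minority_size = 90, 10
majority_votes_each = 1  # weak approval: one vote per person

# Votes each minority member needs so that total votes against equal votes for.
minority_votes_each = majority_size * majority_votes_each / minority_size

majority_cost_each = per_voter_cost(majority_votes_each)
minority_cost_each = per_voter_cost(minority_votes_each)
print(majority_cost_each, minority_cost_each)
```

<p>Under these assumptions each minority member pays 81 tokens per round against 1 token for each majority member, so repeated proposals drain the minority far sooner.</p>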
<p>Let us consider an example where someone is looking to produce a public good (eg. a developer writing an open source software program), and we want to figure out whether or not this program is worth funding. But instead of just thinking about one single public good, let's create a mechanism where <em>anyone</em> can raise funds for what they claim to be a public good project. Anyone can make a contribution to any project; a mechanism keeps track of these contributions and then at the end of some period of time the mechanism calculates a payment to each project. The way that this payment is calculated is as follows: for any given project, take the square root of each contributor's contribution, add these values together, and take the square of the result. Or in math speak:</p>
<p><span class="math display">\[(\sum_{i=1}^n \sqrt{c_i})^2\]</span></p>
<p>If that sounds complicated, here it is graphically:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/quadratic_funding.png" />
</center>
<p><br><br></p>
<p>In any case where there is more than one contributor, the computed payment is greater than the raw sum of contributions; the difference comes out of a central subsidy pool (eg. if ten people each donate $1, then the sum-of-square-roots is $10, and the square of that is $100, so the subsidy is $90). Note that if the subsidy pool is not big enough to make the full required payment to every project, we can just divide the subsidies proportionately by whatever constant makes the totals add up to the subsidy pool's budget; <strong>you can prove that this solves the tragedy-of-the-commons problem as well as you can with that subsidy budget</strong>.</p>
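<p>The payment rule and the proportional scaling can be sketched in a few lines. The function names and the sample projects are hypothetical; the only parts taken from the text are the formula and the idea of scaling every subsidy by one constant when the pool is too small.</p>

```python
from math import sqrt


def quadratic_payment(contributions):
    """(sum of square roots of the contributions) squared."""
    return sum(sqrt(c) for c in contributions) ** 2


def allocate_subsidies(projects, pool):
    """Subsidy per project, scaled down by one constant if the pool is too small."""
    raw = {
        name: quadratic_payment(cs) - sum(cs)  # payment minus raw contributions
        for name, cs in projects.items()
    }
    required = sum(raw.values())
    scale = 1.0 if required <= pool else pool / required
    return {name: subsidy * scale for name, subsidy in raw.items()}


# Hypothetical projects: the same $10 raised in two different ways.
projects = {
    "ten_small_donors": [1.0] * 10,
    "one_large_donor": [10.0],
}
print(allocate_subsidies(projects, pool=1000))  # pool is large enough
print(allocate_subsidies(projects, pool=45))    # pool covers only half
```

<p>The ten-donor project earns a $90 subsidy when the pool allows it, while the single-donor project earns essentially nothing (up to floating-point rounding), even though both raised the same $10.</p>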
<p>There are two ways to intuitively interpret this formula. First, one can look at it through the "fixing market failure" lens, a surgical fix to the <a href="https://en.wikipedia.org/wiki/Tragedy_of_the_commons">tragedy of the commons</a> problem. In any situation where Alice contributes to a project and Bob also contributes to that same project, Alice is making a contribution to something that is valuable not only to herself, but also to Bob. When deciding <em>how much to contribute</em>, Alice was only taking into account the benefit to herself, not Bob, whom she most likely does not even know. The quadratic funding mechanism adds a subsidy to compensate for this effect, determining how much Alice "would have" contributed if she also took into account the benefit her contribution brings to Bob. Furthermore, we can separately calculate the subsidy for each pair of people (nb. if there are <code>N</code> people there are <code>N * (N-1) / 2</code> pairs), and add up all of these subsidies together, and give the project the combined subsidy from all pairs. And it turns out that this gives exactly the quadratic funding formula.</p>
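<p>The pairwise reading can be checked by expanding the square. As a sketch, assume the subsidy attributed to the pair of contributors <code>i</code> and <code>j</code> is <span class="math inline">\(2\sqrt{c_i c_j}\)</span> (this per-pair amount is an assumption read off the expansion; the text does not spell it out):</p>
<p><span class="math display">\[(\sum_{i=1}^n \sqrt{c_i})^2 = \sum_{i=1}^n c_i + \sum_{i=1}^n \sum_{j=i+1}^n 2\sqrt{c_i c_j}\]</span></p>
<p>The first term on the right is the raw sum of contributions and the second is the total subsidy, with one term per pair. For example, if Alice gives $4 and Bob gives $9, the payment is (2 + 3)<sup>2</sup> = $25, which is the $13 they contributed plus a single pair subsidy of 2 * 6 = $12.</p>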
<p>Second, one can look at the formula through a quadratic voting lens. We interpret the quadratic funding as being <em>a special case</em> of quadratic voting, where the contributors to a project are voting for that project and there is one imaginary participant voting against it: the subsidy pool. Every "project" is a motion to take money from the subsidy pool and give it to that project's creator. Everyone sending <span class="math inline">\(c_i\)</span> of funds is making <span class="math inline">\(\sqrt{c_i}\)</span> votes, so there's a total of <span class="math inline">\(\sum_{i=1}^n \sqrt{c_i}\)</span> votes in favor of the motion. To kill the motion, the subsidy pool would need to make more than <span class="math inline">\(\sum_{i=1}^n \sqrt{c_i}\)</span> votes against it, which would cost it more than <span class="math inline">\((\sum_{i=1}^n \sqrt{c_i})^2\)</span>. Hence, <span class="math inline">\((\sum_{i=1}^n \sqrt{c_i})^2\)</span> is the maximum transfer from the subsidy pool to the project that the subsidy pool would not vote to stop.</p>
<p>Quadratic funding is starting to be explored as a mechanism for funding public goods already; <a href="https://vitalik.ca/general/2019/10/24/gitcoin.html">Gitcoin grants</a> for funding public goods in the Ethereum ecosystem is currently the biggest example, and the most recent round led to results that, in my own view, did quite a good job of making a fair allocation to support projects that the community deems valuable.</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/round3.png" /><br><small><i>Numbers in white are raw contribution totals; numbers in green are the extra subsidies.</i></small>
</center>
<p><br><br></p>
<h3 id="quadratic-attention-payments">Quadratic attention payments</h3>
<p><em>See also the original post: <a href="https://kortina.nyc/essays/speech-is-free-distribution-is-not-a-tax-on-the-purchase-of-human-attention-and-political-power/" class="uri">https://kortina.nyc/essays/speech-is-free-distribution-is-not-a-tax-on-the-purchase-of-human-attention-and-political-power/</a></em></p>
<p>One of the defining features of modern capitalism that people love to hate is ads. Our cities have ads:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/ads1.jpg" style="height:300px" /><br><small><i>Source: <a href="https://www.flickr.com/photos/argonavigo/36657795264">https://www.flickr.com/photos/argonavigo/36657795264</a></i></small>
</center>
<p><br><br></p>
<p>Our subway turnstiles have ads:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/ads2.jpg" style="height:300px" /><br><small><i>Source: <a href="https://commons.wikimedia.org/wiki/File:NYC,_subway_ad_on_Prince_St.jpg">https://commons.wikimedia.org/wiki/File:NYC,_subway_ad_on_Prince_St.jpg</a></i></small>
</center>
<p><br><br></p>
<p>Our politics are dominated by ads:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/ads3.jpg" style="height:300px" /><br><small><i>Source: <a href="https://upload.wikimedia.org/wikipedia/commons/e/e3/Billboard_Challenging_the_validity_of_Barack_Obama%27s_Birth_Certificate.JPG">https://upload.wikimedia.org/wikipedia/commons/e/e3/Billboard_Challenging_the_validity_of_Barack_Obama%27s_Birth_Certificate.JPG</a></i></small>
</center>
<p><br><br></p>
<p>And even the rivers and the skies <a href="https://newyork.cbslocal.com/2018/11/13/are-led-boat-advertisements-on-the-hudson-river-going-a-step-too-far/">have ads</a>. Now, there are some places that seem to not have this problem:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/ads4.png" style="height:450px" /><br><br>
</center>
<p>But really they just have a different kind of ads:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/ads5.jpg" style="height:300px" /><br><br>
</center>
<p>Now, recently there are attempts to move beyond this <a href="https://www.theguardian.com/cities/2015/aug/11/can-cities-kick-ads-ban-urban-billboards">in some cities</a>. And <a href="https://twitter.com/jack/status/1189634360472829952">on Twitter</a>. But let's look at the problem systematically and try to see what's going wrong. The answer is actually surprisingly simple: public advertising is the evil twin of public goods production. In the case of public goods production, there is one actor that is taking on an expenditure to produce some product, and this product benefits a large number of people. Because these people cannot effectively coordinate to pay for the public goods by themselves, we get much less public goods than we need, and the ones we do get are those favored by wealthy actors or centralized authorities. Here, there is one actor that reaps a large <em>benefit</em> from forcing other people to look at some image, and this action <em>harms</em> a large number of people. Because these people cannot effectively coordinate to buy out the slots for the ads, we get ads we don't want to see, that are favored by... wealthy actors or centralized authorities.</p>
<p>So how do we solve this dark mirror image of public goods production? With a bright mirror image of quadratic funding: quadratic fees! Imagine a billboard where anyone can pay $1 to put up an ad for one minute, but if they want to do this multiple times the prices go up: $2 for the second minute, $3 for the third minute, etc. Note that you can pay to extend the lifetime of <em>someone else's</em> ad on the billboard, and this also costs you only $1 for the first minute, <em>even if other people already paid to extend the ad's lifetime many times</em>. We can once again interpret this as being a special case of quadratic voting: it's basically the same as the "voting on a thermostat" example above, but where the thermostat in question is the number of minutes an ad stays up.</p>
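<p>A small sketch of the billboard pricing, assuming the price schedule is tracked separately for each payer as described (the function name is hypothetical). It compares one sponsor buying ten minutes alone with ten supporters each buying a single minute for the same ad:</p>

```python
def cost_for_minutes(minutes: int, first_minute_price: int = 1) -> int:
    """Total one payer spends for `minutes` minutes: $1 + $2 + ... + $minutes."""
    return first_minute_price * minutes * (minutes + 1) // 2


# Ten minutes of billboard time, funded two different ways.
solo_sponsor = cost_for_minutes(10)         # one payer buys all ten minutes
ten_supporters = 10 * cost_for_minutes(1)   # ten payers buy one minute each
print(solo_sponsor, ten_supporters)         # 55 10
```

<p>The same airtime costs $55 for a lone sponsor and $10 in total for a crowd, so ads with broad support are cheaper to keep up than ads backed by a single deep pocket.</p>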
<p>This kind of payment model could be applied in cities, on websites, at conferences, or in many other contexts, if the goal is to optimize for putting up things that people want to see (or things that people want other people to see, but even here it's much more democratic than simply buying space) rather than things that wealthy people and centralized institutions want people to see.</p>
<h3 id="complexities-and-caveats">Complexities and caveats</h3>
<p>Perhaps the biggest challenge to consider with this concept of quadratic payments is the practical implementation issue of <a href="https://vitalik.ca/general/2019/04/03/collusion.html">identity and bribery/collusion</a>. Quadratic payments in any form require a model of identity where individuals cannot easily get as many identities as they want: if they could, then they could just keep getting new identities and keep paying $1 to influence some decision as many times as they want, and the mechanism collapses into linear vote-buying. Note that the identity system does <em>not</em> need to be airtight (in the sense of preventing multiple-identity acquisition), and indeed there are good civil-liberties reasons why identity systems probably should <em>not</em> try to be airtight. Rather, it just needs to be robust enough that manipulation is not worth the cost.</p>
<p>Collusion is also tricky. If we can’t prevent people from selling their votes, the mechanisms once again collapse into one-dollar-one-vote. We don't just need votes to be anonymous and private (while still making the final result provable and public); <strong>we need votes to be so private that even the person who made the vote can't prove to anyone else what they voted for</strong>. This is difficult. Secret ballots do this well in the offline world, but secret ballots are a nineteenth-century technology, far too inefficient for the sheer amount of quadratic voting and funding that we want to see in the twenty-first century.</p>
<p>Fortunately, there are <a href="https://ethresear.ch/t/minimal-anti-collusion-infrastructure/5413">technological means that can help</a>, combining together zero-knowledge proofs, encryption and other cryptographic technologies to achieve the precise desired set of privacy and verifiability properties. There are also <a href="https://twitter.com/phildaian/status/1181822995993681921">proposed techniques</a> to verify that private keys actually are in an individual's possession and not in some hardware or cryptographic system that can restrict how they use those keys. However, these techniques are all untested and require quite a bit of further work.</p>
<p>Another challenge is that quadratic payments, being a payment-based mechanism, continue to favor people with more money. Note that because the cost of votes is quadratic, this effect is dampened: someone with 100 times more money only has 10 times more influence, not 100 times, so the extent of the problem goes down by 90% (and even more for ultra-wealthy actors). That said, it may be desirable to mitigate this inequality of power further. This could be done either by denominating quadratic payments in a separate token of which everyone gets a fixed number of units, or giving each person an allocation of funds that can only be used for quadratic-payments use cases: this is basically <a href="https://www.yang2020.com/policies/democracydollars/">Andrew Yang's "democracy dollars"</a> proposal.</p>
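<p>The dampening claim follows from inverting the cost rule: if n votes cost n<sup>2</sup>, a budget of B buys the square root of B votes. A minimal sketch, with an arbitrary assumed base budget:</p>

```python
from math import sqrt


def votes_affordable(budget: float) -> float:
    """Votes a budget buys when casting n votes costs n^2."""
    return sqrt(budget)


base_budget = 25.0  # arbitrary illustrative budget
influence_ratio = votes_affordable(100 * base_budget) / votes_affordable(base_budget)
print(influence_ratio)
```

<p>A budget 100 times larger buys about 10 times the votes, whatever the base budget is.</p>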
<center>
<img src="https://vitalik.ca/files/posts_files/qv-files/Oprah.png" style="height:300px" /><br>
</center>
<p>A third challenge comes from the "<a href="https://en.wikipedia.org/wiki/Rational_ignorance">rational ignorance</a>" and "<a href="https://en.wikipedia.org/wiki/Rational_irrationality">rational irrationality</a>" problems: decentralized public decisions have the weakness that any single individual has very little effect on the outcome, and so little motivation to make sure they are supporting the decision that is best for the long term; instead, pressures such as tribal affiliation may dominate. There are many strands of philosophy that emphasize the ability of large crowds to be very wrong despite (or because of!) their size, and quadratic payments in any form do little to address this.</p>
<p>Quadratic payments do better at mitigating this problem than one-person-one-vote systems, and these problems can be expected to be less severe for medium-scale public goods than for large decisions that affect many millions of people, so it may not be a large challenge at first, but it's certainly an issue worth confronting. One approach is <a href="https://ethresear.ch/t/quadratic-voting-with-sortition/6065">combining quadratic voting with elements of sortition</a>. Another, potentially more long-term durable, approach is to combine quadratic voting with another economic technology that is much more specifically targeted toward rewarding the "correct contrarianism" that can dispel mass delusions: <a href="https://en.wikipedia.org/wiki/Prediction_market">prediction markets</a>. A simple example would be a system where quadratic funding is done <em>retrospectively</em>, so people vote on which public goods were valuable some time ago (eg. even 2 years), and projects are funded up-front by selling shares of the results of these deferred votes; by buying shares people would be both funding the projects and betting on which project would be viewed as successful in 2 years' time. There is a large design space to experiment with here.</p>
<h3 id="conclusion">Conclusion</h3>
<p>As I mentioned at the beginning, quadratic payments do not solve every problem. They solve the problem of governing resources that affect large numbers of people, but they do not solve many other kinds of problems. A particularly important one is information asymmetry and low quality of information in general. For this reason, I am a fan of techniques such as prediction markets (see <a href="https://electionbettingodds.com/">electionbettingodds.com</a> for one example) to solve information-gathering problems, and many applications can be made most effective by combining different mechanisms together.</p>
<p>One particular cause dear to me personally is what I call "entrepreneurial public goods": public goods that in the present only a few people believe are important but in the future many more people will value. In the 19th century, contributing to abolition of slavery may have been one example; in the 21st century I can't give examples that will satisfy every reader because it's the nature of these goods that their importance will only become common knowledge later down the road, but I would point to <a href="https://www.sens.org/">life extension</a> and <a href="https://intelligence.org/">AI risk research</a> as two possible examples.</p>
<p>That said, we don't need to solve every problem today. Quadratic payments are an idea that has only become popular in the last few years; we still have not seen more than small-scale trials of quadratic voting and funding, and quadratic attention payments have not been tried at all! There is still a long way to go. But if we can get these mechanisms off the ground, there is a lot that these mechanisms have to offer!</p>
Sat, 07 Dec 2019 17:03:10 -0800
https://vitalik.ca/general/2019/12/07/quadratic.html
Hard Problems in Cryptocurrency: Five Years Later
<p><em>Special thanks to Justin Drake and Jinglan Wang for feedback</em></p>
<p>In 2014, I made a <a href="https://github.com/ethereum/wiki/wiki/Problems/89fd07ffff8b042134e4ca67a0ce143d574016bd">post</a> and a <a href="https://www.youtube.com/watch?v=rXRtJcNVfQE">presentation</a> with a list of hard problems in math, computer science and economics that I thought were important for the cryptocurrency space (as I then called it) to be able to reach maturity. In the last five years, much has changed. But exactly how much progress on what we thought then was important has been achieved? Where have we succeeded, where have we failed, and where have we changed our minds about what is important? In this post, I'll go through the 16 problems from 2014 one by one, and see just where we are today on each one. At the end, I’ll include my new picks for hard problems of 2019.</p>
<p>The problems are broken down into three categories: (i) cryptographic, and hence expected to be solvable with purely mathematical techniques if they are to be solvable at all, (ii) consensus theory, largely improvements to proof of work and proof of stake, and (iii) economic, and hence having to do with creating structures involving incentives given to different participants, and often involving the application layer more than the protocol layer. We see significant progress in all categories, though some more than others.</p>
<h2 id="cryptographic-problems">Cryptographic problems</h2>
<blockquote style="background-color:#ffe4ff">
<h3>1. Blockchain Scalability</h3>
One of the largest problems facing the cryptocurrency space today is the issue of scalability ... The main concern with [oversized blockchains] is trust: if there are only a few entities capable of running full nodes, then those entities can conspire and agree to give themselves a large number of additional bitcoins, and there would be no way for other users to see for themselves that a block is invalid without processing an entire block themselves.<br />
<strong>Problem:</strong> create a blockchain design that maintains Bitcoin-like security guarantees, but where the maximum size of the most powerful node that needs to exist for the network to keep functioning is substantially sublinear in the number of transactions.
</blockquote>
<p>Status: <strong>Great theoretical progress, pending more real-world evaluation</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face7.png" style="width:50px; height: 50px" /></p>
<p>Scalability is one technical problem that we have had a huge amount of progress on theoretically. Five years ago, almost no one was thinking about sharding; now, sharding designs are commonplace. Aside from <a href="https://github.com/ethereum/eth2.0-specs">Ethereum 2.0</a>, we have <a href="https://eprint.iacr.org/2017/406.pdf">OmniLedger</a>, <a href="https://arxiv.org/abs/1905.09274">LazyLedger</a>, <a href="https://medium.com/@giottodf/zilliqa-a-novel-approach-to-sharding-d79249347a1f">Zilliqa</a> and research papers <a href="https://arxiv.org/pdf/1910.10434.pdf">seemingly coming out every month</a>. In my own view, further progress at this point is incremental. Fundamentally, we already have a number of techniques that allow groups of validators to securely come to consensus on much more data than an individual validator can process, as well as techniques that allow clients to indirectly verify the full validity and availability of blocks even under 51% attack conditions.</p>
<p>These are probably the most important technologies:</p>
<ul>
<li><strong>Random sampling</strong>, allowing a small randomly selected committee to statistically stand in for the full validator set: <a href="https://github.com/ethereum/wiki/wiki/Sharding-FAQ#how-can-we-solve-the-single-shard-takeover-attack-in-an-uncoordinated-majority-model" class="uri">https://github.com/ethereum/wiki/wiki/Sharding-FAQ#how-can-we-solve-the-single-shard-takeover-attack-in-an-uncoordinated-majority-model</a></li>
<li><strong>Fraud proofs</strong>, allowing individual nodes that learn of an error to broadcast its presence to everyone else: <a href="https://bitcoin.stackexchange.com/questions/49647/what-is-a-fraud-proof" class="uri">https://bitcoin.stackexchange.com/questions/49647/what-is-a-fraud-proof</a></li>
<li><strong>Proofs of custody</strong>, allowing validators to probabilistically prove that they individually downloaded and verified some piece of data: <a href="https://ethresear.ch/t/1-bit-aggregation-friendly-custody-bonds/2236" class="uri">https://ethresear.ch/t/1-bit-aggregation-friendly-custody-bonds/2236</a></li>
<li><strong>Data availability proofs</strong>, allowing clients to detect when the bodies of blocks that they have headers for <a href="https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding">are unavailable</a>: <a href="https://arxiv.org/abs/1809.09044" class="uri">https://arxiv.org/abs/1809.09044</a>. See also the newer <a href="https://arxiv.org/abs/1910.01247">coded Merkle trees</a> proposal.</li>
</ul>
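<p>To give a feel for why random sampling works, here is a sketch that computes the chance that a randomly drawn committee ends up with at least two thirds attackers. The two-thirds threshold, the one-third attacker share and the committee sizes are all assumptions chosen for illustration, and the binomial model treats the validator set as large enough that sampling with replacement is a fair approximation.</p>

```python
from math import comb


def prob_committee_captured(committee_size: int, attacker_fraction: float) -> float:
    """P(at least 2/3 of a random committee are attackers), binomial model."""
    threshold = (2 * committee_size + 2) // 3  # ceil(2 * committee_size / 3)
    p = attacker_fraction
    return sum(
        comb(committee_size, k) * p**k * (1 - p) ** (committee_size - k)
        for k in range(threshold, committee_size + 1)
    )


small = prob_committee_captured(16, 1 / 3)
large = prob_committee_captured(128, 1 / 3)
print(f"committee of 16:  {small:.3e}")
print(f"committee of 128: {large:.3e}")
```

<p>The capture probability falls off very quickly as the committee grows, which is why a modest committee can stand in for a much larger validator set.</p>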
<p>There are also other smaller developments like <a href="https://github.com/ethereum/wiki/wiki/Sharding-FAQ#how-can-we-facilitate-cross-shard-communication">Cross-shard communication via receipts</a> as well as "constant-factor" enhancements such as BLS signature aggregation.</p>
<p>That said, fully sharded blockchains have still not been seen in live operation (the partially sharded Zilliqa has recently started running). On the theoretical side, there are mainly disputes about details remaining, along with challenges having to do with stability of sharded networking, developer experience and mitigating risks of centralization; fundamental technical possibility no longer seems in doubt. But the challenges that <em>do</em> remain are challenges that cannot be solved by just thinking about them; only developing the system and seeing Ethereum 2.0 or some similar chain running live will suffice.</p>
<blockquote style="background-color:#ffe4ff">
<h3>2. Timestamping</h3>
<strong>Problem:</strong> create a distributed incentive-compatible system, whether it is an overlay on top of a blockchain or its own blockchain, which maintains the current time to high accuracy. All legitimate users have clocks in a normal distribution around some "real" time with standard deviation 20 seconds ... no two nodes are more than 20 seconds apart. The solution is allowed to rely on an existing concept of "N nodes"; this would in practice be enforced with proof-of-stake or non-sybil tokens (see #9). The system should continuously provide a time which is within 120s (or less if possible) of the internal clock of >99% of honestly participating nodes. External systems may end up relying on this system; hence, it should remain secure against attackers controlling < 25% of nodes regardless of incentives.
</blockquote>
<p>Status: <strong>Some progress</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face2.png" style="width:50px; height: 50px" /></p>
<p>Ethereum has actually survived just fine with a 13-second block time and no particularly advanced timestamping technology; it uses a simple technique where a client does not accept a block whose stated timestamp is later than the client's local time. That said, this has not been tested under serious attacks. The recent <a href="https://ethresear.ch/t/network-adjusted-timestamps/4187">network-adjusted timestamps</a> proposal tries to improve on the status quo by allowing the client to determine the consensus on the time in the case where the client does not locally know the current time to high accuracy; this has not yet been tested. But in general, timestamping is not currently at the foreground of perceived research challenges; perhaps this will change once more proof of stake chains (including Ethereum 2.0 but also others) come online as real live systems and we see what the issues are.</p>
<blockquote style="background-color:#ffe4ff">
<h3>3. Arbitrary Proof of Computation</h3>
<strong>Problem:</strong> create programs <code>POC_PROVE(P,I) -> (O,Q)</code> and <code>POC_VERIFY(P,O,Q) -> { 0, 1 }</code> such that <code>POC_PROVE</code> runs program <code>P</code> on input <code>I</code> and returns the program output <code>O</code> and a proof-of-computation <code>Q</code> and <code>POC_VERIFY</code> takes <code>P</code>, <code>O</code> and <code>Q</code> and outputs whether or not <code>Q</code> and <code>O</code> were legitimately produced by the <code>POC_PROVE</code> algorithm using <code>P</code>.
</blockquote>
<p>Status: <strong>Great theoretical and practical progress</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face1.png" style="width:50px; height: 50px" /></p>
<p>This is basically saying, build a SNARK (or STARK, or SHARK, or...). And <a href="https://medium.com/@VitalikButerin/zk-snarks-under-the-hood-b33151a013f6">we've</a> <a href="https://vitalik.ca/general/2018/07/21/starks_part_3.html">done</a> <a href="https://vitalik.ca/general/2019/09/22/plonk.html">it</a>! SNARKs are now increasingly well understood, and are even already being used in multiple blockchains today (including <a href="https://tornado.cash/">tornado.cash</a> on Ethereum). And SNARKs are extremely useful, both as a privacy technology (see Zcash and tornado.cash) and as a scalability technology (see <a href="https://ethresear.ch/t/on-chain-scaling-to-potentially-500-tx-sec-through-mass-tx-validation/3477">ZK Rollup</a>, <a href="https://www.starkdex.io/">STARKDEX</a> and <a href="https://ethresear.ch/t/stark-proving-low-degree-ness-of-a-data-availability-root-some-analysis/6214">STARKing erasure coded data roots</a>).</p>
<p>There are still challenges with efficiency; making arithmetization-friendly hash functions (see <a href="https://starkware.co/hash-challenge/">here</a> and <a href="https://mimchash.org/">here</a> for bounties for breaking proposed candidates) is a big one, and efficiently proving random memory accesses is another. Furthermore, there's the unsolved question of whether the O(n * log(n)) blowup in prover time is a fundamental limitation or if there is some way to make a succinct proof with only linear overhead as in <a href="https://web.stanford.edu/~buenz/pubs/bulletproofs.pdf">bulletproofs</a> (which unfortunately take linear time to verify). There are also ever-present risks that the existing schemes have bugs. In general, the problems are in the details rather than the fundamentals.</p>
<p><a name="numberfour"></a></p>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="4" type="1">
<li>Code Obfuscation
</h3>
The holy grail is to create an obfuscator O, such that given any program P the obfuscator can produce a second program O(P) = Q such that P and Q return the same output if given the same input and, importantly, Q reveals no information whatsoever about the internals of P. One can hide inside of Q a password, a secret encryption key, or one can simply use Q to hide the proprietary workings of the algorithm itself.
</blockquote></li>
</ol>
<p>Status: <strong>Slow progress</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face3.png" style="width:50px; height: 50px" /></p>
<p>In plain English, the problem is saying that we want to come up with a way to "encrypt" a program so that the encrypted program would still give the same outputs for the same inputs, but the "internals" of the program would be hidden. An example use case for obfuscation is a program containing a private key where the program only allows the private key to sign certain messages.</p>
<p>A solution to code obfuscation would be very useful to blockchain protocols. The use cases are subtle, because one must deal with the possibility that an on-chain obfuscated program will be copied and run in an environment different from the chain itself, but there are many possibilities. One that personally interests me is the ability to remove the centralized operator from <a href="https://ethresear.ch/t/minimal-anti-collusion-infrastructure/5413">collusion-resistance gadgets</a> by replacing the operator with an obfuscated program that contains some proof of work, making it very expensive to run more than once with different inputs as part of an attempt to determine individual participants' actions.</p>
<p>Unfortunately this continues to be a hard problem. There is ongoing work on attacking the problem, with one side making constructions (eg. <a href="https://eprint.iacr.org/2018/615">this</a>) that try to reduce the number of assumptions on mathematical objects that we do not know practically exist (eg. general cryptographic multilinear maps) and another side trying to make practical implementations of the desired mathematical objects. However, all of these paths are still quite far from creating something viable and known to be secure. See <a href="https://eprint.iacr.org/2019/463.pdf" class="uri">https://eprint.iacr.org/2019/463.pdf</a> for a more general overview of the problem.</p>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="5" type="1">
<li>Hash-Based Cryptography
</h3>
<strong>Problem:</strong> create a signature algorithm relying on no security assumption but the random oracle property of hashes that maintains 160 bits of security against classical computers (ie. 80 vs. quantum due to Grover's algorithm) with optimal size and other properties.
</blockquote></li>
</ol>
<p>Status: <strong>Some progress</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face2.png" style="width:50px; height: 50px" /></p>
<p>There have been two strands of progress on this since 2014. <a href="https://cryptojedi.org/papers/sphincs-20141001.pdf">SPHINCS</a>, a "stateless" (meaning, using it multiple times does not require remembering information like a nonce) signature scheme, was released soon after this "hard problems" list was published, and provides a purely hash-based signature scheme of size around 41 kB. Additionally, <a href="https://vitalik.ca/general/2018/07/21/starks_part_3.html">STARKs</a> have been developed, and one can create signatures of similar size based on them. The fact that not just signatures, but also general-purpose zero knowledge proofs, are possible with just hashes was definitely something I did not expect five years ago; I am very happy that this is the case. That said, size continues to be an issue, and ongoing progress (eg. see the very recent <a href="https://arxiv.org/abs/1903.12243">DEEP FRI</a>) is continuing to reduce the size of proofs, though it looks like further progress will be incremental.</p>
<p>The main not-yet-solved problem with hash-based cryptography is aggregate signatures, similar to what <a href="https://ethresear.ch/t/pragmatic-signature-aggregation-with-bls/2105">BLS aggregation</a> makes possible. It's known that we can just make a STARK over many Lamport signatures, but this is inefficient; a more efficient scheme would be welcome. (In case you're wondering if hash-based <em>public key encryption</em> is possible, the answer is, no, you can't do anything with <a href="https://www.boazbarak.org/Papers/merkle.pdf">more than a quadratic attack cost</a>.)</p>
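<p>For readers who have not seen one, a Lamport signature is the simplest example of a signature that relies on nothing but a hash function. The sketch below is a textbook one-time version; the choice of SHA-256, the 256-bit digest and all function names are assumptions made for illustration, and it is neither SPHINCS nor a scheme tuned to the security target in the problem statement.</p>

```python
import hashlib
import secrets

N_BITS = 256  # one pair of secret values per bit of the SHA-256 message digest


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    # Private key: two random 32-byte values per digest bit (one for 0, one for 1).
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(N_BITS)]
    # Public key: the hashes of all private values.
    pk = [(_h(a), _h(b)) for a, b in sk]
    return sk, pk


def _bits(message: bytes):
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(N_BITS)]


def sign(sk, message: bytes):
    # Reveal one secret value per digest bit. A key pair must sign only ONE message.
    return [sk[i][bit] for i, bit in enumerate(_bits(message))]


def verify(pk, message: bytes, signature) -> bool:
    if len(signature) != N_BITS:
        return False
    return all(_h(signature[i]) == pk[i][bit] for i, bit in enumerate(_bits(message)))
```

<p>With these parameters a signature is 256 values of 32 bytes each, ie. 8192 bytes, and each key can be used only once; this is the kind of size and statefulness cost that the schemes discussed above try to improve on.</p>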
<h2 id="consensus-theory-problems">Consensus theory problems</h2>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="6" type="1">
<li>ASIC-Resistant Proof of Work
</h3>
One approach at solving the problem is creating a proof-of-work algorithm based on a type of computation that is very difficult to specialize ... For a more in-depth discussion on ASIC-resistant hardware, see <a href="https://blog.ethereum.org/2014/06/19/mining/" class="uri">https://blog.ethereum.org/2014/06/19/mining/</a>.
</blockquote></li>
</ol>
<p>Status: <strong>Solved as far as we can</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face4.png" style="width:50px; height: 50px" /></p>
<p>About six months after the "hard problems" list was posted, Ethereum settled on its ASIC-resistant proof of work algorithm: <a href="https://github.com/ethereum/wiki/wiki/Ethash">Ethash</a>. Ethash is known as a memory-hard algorithm. The theory is that random-access memory in regular computers is well-optimized already and hence difficult to improve on for specialized applications. Ethash aims to achieve ASIC resistance by making memory access the dominant part of running the PoW computation. Ethash was not the first memory-hard algorithm, but it did add one innovation: it uses pseudorandom lookups over a two-level DAG, allowing for two ways of evaluating the function. First, one could compute it quickly if one has the entire (~2 GB) DAG; this is the memory-hard "fast path". Second, one can compute it much more slowly (still fast enough to check a single provided solution quickly) if one only has the top level of the DAG; this is used for block verification.</p>
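<p>The two ways of evaluating the function can be illustrated with a toy model. This is not Ethash itself: the sizes, the SHA-256 mixing and every name below are assumptions chosen to keep the sketch small. What it shows is only the structure: a small cache from which any large-dataset item can be recomputed, a fast path that reads from the precomputed dataset, and a slow path that recomputes each item on demand.</p>

```python
import hashlib

CACHE_SIZE = 64       # toy "top level": small enough for any verifier to hold
DATASET_SIZE = 4096   # toy "full DAG": what a miner would keep in memory
PARENTS = 8           # cache entries mixed into each dataset item
LOOKUPS = 16          # pseudorandom dataset reads per evaluation


def _h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()


def make_cache(seed: bytes) -> list:
    cache = [_h(seed)]
    for _ in range(CACHE_SIZE - 1):
        cache.append(_h(cache[-1]))
    return cache


def dataset_item(cache: list, index: int) -> bytes:
    # Each dataset item depends on a few pseudorandomly chosen cache entries.
    mix = _h(index.to_bytes(8, "big"))
    for _ in range(PARENTS):
        parent = int.from_bytes(mix[:4], "big") % CACHE_SIZE
        mix = _h(mix, cache[parent])
    return mix


def make_dataset(cache: list) -> list:
    return [dataset_item(cache, i) for i in range(DATASET_SIZE)]


def _evaluate(header: bytes, nonce: int, lookup) -> bytes:
    mix = _h(header, nonce.to_bytes(8, "big"))
    for _ in range(LOOKUPS):
        index = int.from_bytes(mix[:4], "big") % DATASET_SIZE
        mix = _h(mix, lookup(index))
    return mix


def pow_full(header: bytes, nonce: int, dataset: list) -> bytes:
    # Fast path: every lookup is a single memory read.
    return _evaluate(header, nonce, lambda i: dataset[i])


def pow_light(header: bytes, nonce: int, cache: list) -> bytes:
    # Slow path: every lookup recomputes the item from the cache.
    return _evaluate(header, nonce, lambda i: dataset_item(cache, i))
```

<p>In this toy, the light path does <code>PARENTS</code> extra hashes per lookup, which is tolerable for checking one solution but uncompetitive for trying many nonces.</p>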
<p>Ethash has proven remarkably successful at ASIC resistance; after three years and billions of dollars of block rewards, ASICs do exist but are at best <a href="https://blog.miningstore.com/blog/ethereum-mining-hardware-for-2019">2-5 times more power and cost-efficient</a> than GPUs. <a href="https://github.com/ifdefelse/ProgPOW">ProgPoW</a> has been proposed as an alternative, but there is a growing consensus that ASIC-resistant algorithms will inevitably have a limited lifespan, and that ASIC resistance <a href="https://pdaian.com/blog/anti-asic-forks-considered-harmful/">has downsides</a> because it makes 51% attacks cheaper (eg. see the <a href="https://cointelegraph.com/news/ethereum-classic-51-attack-the-reality-of-proof-of-work">51% attack on Ethereum Classic</a>).</p>
<p>I believe that PoW algorithms that provide a medium level of ASIC resistance can be created, but such resistance is limited-term and both ASIC and non-ASIC PoW have disadvantages; in the long term the better choice for blockchain consensus is proof of stake.</p>
<blockquote style="background-color:#ffeeff">
<h3>
<ol start="7" type="1">
<li>Useful Proof of Work
</h3>
making the proof of work function something which is simultaneously useful; a common candidate is something like Folding@home, an existing program where users can download software onto their computers to simulate protein folding and provide researchers with a large supply of data to help them cure diseases.
</blockquote></li>
</ol>
<p>Status: <strong>Probably not feasible, with one exception</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face3.png" style="width:50px; height: 50px" /></p>
<p>The challenge with useful proof of work is that a proof of work algorithm requires many properties:</p>
<ul>
<li>Hard to compute</li>
<li>Easy to verify</li>
<li>Does not depend on large amounts of external data</li>
<li>Can be efficiently computed in small "bite-sized" chunks</li>
</ul>
<p>Unfortunately, there are not many computations that are useful that preserve all of these properties, and most computations that <em>do</em> have all of those properties and are "useful" are only "useful" for far too short a time to build a cryptocurrency around them.</p>
<p>However, there is one possible exception: zero-knowledge-proof generation. Zero knowledge proofs of aspects of blockchain validity (eg. <a href="https://ethresear.ch/t/stark-proving-low-degree-ness-of-a-data-availability-root-some-analysis/6214">data availability roots</a> for a simple example) are difficult to compute, and easy to verify. Furthermore, they are durably difficult to compute; if proofs of "highly structured" computation become too easy, one can simply switch to verifying a blockchain's entire state transition, which becomes extremely expensive due to the need to model the virtual machine and random memory accesses.</p>
<p>Zero-knowledge proofs of blockchain validity provide great value to users of the blockchain, as they can substitute the need to verify the chain directly; <a href="https://codaprotocol.com/">Coda</a> is doing this already, albeit with a simplified blockchain design that is heavily optimized for provability. Such proofs can significantly assist in improving the blockchain's safety and scalability. That said, the total amount of computation that realistically needs to be done is still much less than the amount that's currently done by proof of work miners, so this would at best be an add-on for proof of stake blockchains, not a full-on consensus algorithm.</p>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="8" type="1">
<li>Proof of Stake
</h3>
Another approach to solving the mining centralization problem is to abolish mining entirely, and move to some other mechanism for counting the weight of each node in the consensus. The most popular alternative under discussion to date is "proof of stake" - that is to say, instead of treating the consensus model as "one unit of CPU power, one vote" it becomes "one currency unit, one vote".
</blockquote></li>
</ol>
<p>Status: <strong>Great theoretical progress, pending more real-world evaluation</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face1.png" style="width:50px; height: 50px" /></p>
<p>Near the end of 2014, it became clear to the proof of stake community that some form of "weak subjectivity" <a href="https://blog.ethereum.org/2014/11/25/proof-stake-learned-love-weak-subjectivity">is unavoidable</a>. To maintain economic security, nodes need to obtain a recent checkpoint extra-protocol when they sync for the first time, and again if they go offline for more than a few months. This was a difficult pill to swallow; many PoW advocates still cling to PoW precisely because in a PoW chain the "head" of the chain can be discovered with the only data coming from a trusted source being the blockchain client software itself. PoS advocates, however, were willing to swallow the pill, seeing the added trust requirements as not being large. From there the path to proof of stake through long-duration security deposits became clear.</p>
<p>Most interesting consensus algorithms today are fundamentally similar to <a href="http://pmg.csail.mit.edu/papers/osdi99.pdf">PBFT</a>, but replace the fixed set of validators with a dynamic list that anyone can join by sending tokens into a system-level smart contract with time-locked withdrawals (eg. a withdrawal might in some cases take up to 4 months to complete). In many cases (including Ethereum 2.0), these algorithms achieve "economic finality" by penalizing validators that are caught performing actions that violate the protocol in certain ways (see <a href="https://medium.com/@VitalikButerin/a-proof-of-stake-design-philosophy-506585978d51">here</a> for a philosophical view on what proof of stake accomplishes).</p>
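<p>As one concrete example of "actions that violate the protocol", the Casper FFG paper defines two slashing conditions on a validator's votes: no two distinct votes with the same target height, and no vote whose source-to-target span strictly surrounds another of its votes. A minimal sketch of checking them follows; the <code>Vote</code> type and its field names are hypothetical, and real implementations also verify signatures and checkpoint ancestry.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    validator: str
    source_epoch: int  # height of the justified checkpoint voted from
    target_epoch: int  # height of the checkpoint voted for
    target_root: str   # identifier of the target checkpoint


def is_double_vote(a: Vote, b: Vote) -> bool:
    # Two distinct votes for the same target height.
    return a != b and a.target_epoch == b.target_epoch


def is_surround_vote(a: Vote, b: Vote) -> bool:
    # One vote's source-to-target span strictly contains the other's.
    def surrounds(outer: Vote, inner: Vote) -> bool:
        return (outer.source_epoch < inner.source_epoch
                and inner.target_epoch < outer.target_epoch)
    return surrounds(a, b) or surrounds(b, a)


def is_slashable(a: Vote, b: Vote) -> bool:
    # Only a pair of votes by the same validator can be slashable.
    return a.validator == b.validator and (is_double_vote(a, b) or is_surround_vote(a, b))
```

<p>Anyone who observes such a pair of signed votes can submit it as evidence, at which point the validator's time-locked deposit can be penalized.</p>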
<p>As of today, we have (among many other algorithms):</p>
<ul>
<li><strong>Casper FFG</strong>: <a href="https://arxiv.org/abs/1710.09437" class="uri">https://arxiv.org/abs/1710.09437</a></li>
<li><strong>Tendermint</strong>: <a href="https://tendermint.com/docs/spec/consensus/consensus.html" class="uri">https://tendermint.com/docs/spec/consensus/consensus.html</a></li>
<li><strong>HotStuff</strong>: <a href="https://arxiv.org/abs/1803.05069" class="uri">https://arxiv.org/abs/1803.05069</a></li>
<li><strong>Casper CBC</strong>: <a href="https://vitalik.ca/general/2018/12/05/cbc_casper.html" class="uri">https://vitalik.ca/general/2018/12/05/cbc_casper.html</a></li>
</ul>
<p>There continues to be ongoing refinement (eg. <a href="https://ethresear.ch/t/analysis-of-bouncing-attack-on-ffg/6113">here</a> and <a href="https://ethresear.ch/t/saving-strategy-and-fmd-ghost/6226">here</a>). Eth2 phase 0, the chain that will implement FFG, is currently under implementation and enormous progress has been made. Additionally, Tendermint has been running, in the form of the <a href="https://cosmos.bigdipper.live/validators">Cosmos chain</a>, for several months. Remaining arguments about proof of stake, in my view, have to do with optimizing the economic incentives, and further formalizing the <a href="https://ethresear.ch/t/responding-to-51-attacks-in-casper-ffg/6363">strategy for responding to 51% attacks</a>. Additionally, the <a href="https://github.com/ethereum/eth2.0-specs/issues/701">Casper CBC spec</a> could still use concrete efficiency improvements.</p>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="9" type="1">
<li>Proof of Storage
</h3>
A third approach to the problem is to use a scarce computational resource other than computational power or currency. In this regard, the two main alternatives that have been proposed are storage and bandwidth. There is no way in principle to provide an after-the-fact cryptographic proof that bandwidth was given or used, so proof of bandwidth should most accurately be considered a subset of social proof, discussed in later problems, but proof of storage is something that certainly can be done computationally. An advantage of proof-of-storage is that it is completely ASIC-resistant; the kind of storage that we have in hard drives is already close to optimal.
</blockquote></li>
</ol>
<p>Status: <strong>A lot of theoretical progress, though still a lot to go, as well as more real-world evaluation</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face2.png" style="width:50px; height: 50px" /></p>
<p>There are a number of <a href="https://en.wikipedia.org/wiki/Proof_of_space">blockchains planning to use proof of storage</a> protocols, including <a href="https://eprint.iacr.org/2017/893.pdf">Chia</a> and <a href="https://filecoin.io/filecoin.pdf">Filecoin</a>. That said, these algorithms have not been tested in the wild. My own main concern is centralization: will these algorithms actually be dominated by smaller users using spare storage capacity, or will they be dominated by large mining farms?</p>
<h2 id="economics">Economics</h2>
<p><a name="numberten"></a></p>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="10" type="1">
<li>Stable-value cryptoassets
</h3>
One of the main problems with Bitcoin is the issue of price volatility ... Problem: construct a cryptographic asset with a stable price.
</blockquote></li>
</ol>
<p>Status: <strong>Some progress</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face2.png" style="width:50px; height: 50px" /></p>
<p><a href="https://makerdao.com/en/">MakerDAO</a> is now live, and has been holding stable for nearly two years. It has survived a 93% drop in the value of its underlying collateral asset (ETH), and there is now more than $100 million in DAI issued. It has become a mainstay of the Ethereum ecosystem, and many Ethereum projects have integrated or are integrating with it. Other synthetic token projects, such as <a href="https://umaproject.org/">UMA</a>, are rapidly gaining steam as well.</p>
<p>However, while the MakerDAO system has survived tough economic conditions in 2019, the conditions were by no means the toughest that could happen. In the past, Bitcoin has <a href="https://fortune.com/2017/09/18/bitcoin-crash-history/">fallen by 75%</a> over the course of two days; the same may happen to ether or any other collateral asset some day. Attacks on the underlying blockchain are an even larger untested risk, especially if compounded by price decreases at the same time. Another major challenge, and arguably the larger one, is that the stability of MakerDAO-like systems is dependent on some underlying oracle scheme. Different attempts at oracle systems do exist (see #16), but the jury is still out on how well they can hold up under large amounts of economic stress. So far, the collateral controlled by MakerDAO has been lower than the value of the MKR token; if this relationship reverses, MKR holders may have a collective incentive to try to "loot" the MakerDAO system. There are ways to try to protect against such attacks, but they have not been tested in real life.</p>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="11" type="1">
<li>Decentralized Public Goods Incentivization
</h3>
One of the challenges in economic systems in general is the problem of "public goods". For example, suppose that there is a scientific research project which will cost $1 million to complete, and it is known that if it is completed the resulting research will save one million people $5 each. In total, the social benefit is clear ... [but] from the point of view of each individual person contributing does not make sense ... So far, most problems to public goods have involved centralization. Additional Assumptions And Requirements: A fully trustworthy oracle exists for determining whether or not a certain public good task has been completed (in reality this is false, but this is the domain of another problem)
</blockquote></li>
</ol>
<p>Status: <strong>Some progress</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face2.png" style="width:50px; height: 50px" /></p>
<p>The problem of funding public goods is generally understood to be split into two problems: the funding problem (where to get funding for public goods from) and the preference aggregation problem (how to determine what is a genuine public good, rather than some single individual's pet project, in the first place). This problem focuses specifically on the former, assuming the latter is solved (see the <a href="#numberfourteensic">"decentralized contribution metrics" section below</a> for work on that problem).</p>
<p>In general, there haven't been large new breakthroughs here. There are two major categories of solutions. First, we can try to elicit individual contributions, giving people social rewards for doing so. My own proposal for <a href="https://vitalik.ca/general/2017/03/11/a_note_on_charity.html">charity through marginal price discrimination</a> is one example of this; another is the anti-malaria donation badges on <a href="https://peepeth.com/welcome">Peepeth</a>. Second, we can collect funds from applications that have network effects. Within blockchain land there are several options for doing this:</p>
<ul>
<li>Issuing coins</li>
<li>Taking a portion of transaction fees at protocol level (eg. through <a href="https://github.com/ethereum/EIPs/issues/1559">EIP 1559</a>)</li>
<li>Taking a portion of transaction fees from some layer-2 application (eg. Uniswap, or some scaling solution, or even state rent in an execution environment in Ethereum 2.0)</li>
<li>Taking a portion of other kinds of fees (eg. ENS registration)</li>
</ul>
<p>Outside of blockchain land, this is just the age-old question of how to collect taxes if you're a government, and charge fees if you're a business or other organization.</p>
<p><a name="numbertwelve"></a></p>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="12" type="1">
<li>Reputation systems
</h3>
<strong>Problem:</strong> design a formalized reputation system, including a score rep(A,B) -> V where V is the reputation of B from the point of view of A, a mechanism for determining the probability that one party can be trusted by another, and a mechanism for updating the reputation given a record of a particular open or finalized interaction.
</blockquote></li>
</ol>
<p>Status: <strong>Slow progress</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face3.png" style="width:50px; height: 50px" /></p>
<p>There hasn't really been much work on reputation systems since 2014. Perhaps the best is the use of token curated registries to create curated lists of trustable entities/objects; the <a href="https://blog.kleros.io/erc20-becomes-part-of-the-token/">Kleros ERC20 TCR</a> (yes, that's a <a href="https://medium.com/@tokencuratedregistry/a-simple-overview-of-token-curated-registries-84e2b7b19a06">token-curated registry</a> of legitimate ERC20 tokens) is one example, and there is even an alternative interface to Uniswap (<a href="http://uniswap.ninja" class="uri">http://uniswap.ninja</a>) that uses it as the backend to get the list of tokens and ticker symbols and logos from. Reputation systems of the subjective variety have not really been tried, perhaps because there is just not enough information about the "social graph" of people's connections to each other that has already been published to chain in some form. If such information starts to exist for other reasons, then subjective reputation systems may become more popular.</p>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="13" type="1">
<li>Proof of excellence
</h3>
One interesting, and largely unexplored, solution to the problem of [token] distribution specifically (there are reasons why it cannot be so easily used for mining) is using tasks that are socially useful but require original human-driven creative effort and talent. For example, one can come up with a "proof of proof" currency that rewards players for coming up with mathematical proofs of certain theorems
</blockquote></li>
</ol>
<p>Status: <strong>No progress, problem is largely forgotten</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face5.png" style="width:50px; height: 50px" /></p>
<p>The main alternative approach to token distribution that has instead become popular is <a href="https://en.wikipedia.org/wiki/Airdrop_%28cryptocurrency%29">airdrops</a>; typically, tokens are distributed at launch either proportionately to existing holdings of some other token, or based on some other metric (eg. as in the <a href="https://help.namebase.io/article/4vchu01mec-handshake-airdrop-101">Handshake airdrop</a>). Verifying human creativity directly has not really been attempted, and with recent progress on AI the problem of creating a task that only humans can do but computers can verify may well be too difficult.</p>
<p><a name="numberfifteensic"></a></p>
<blockquote style="background-color:#ffe4ff">
<h3>
15 [sic]. Anti-Sybil systems
</h3>
A problem that is somewhat related to the issue of a reputation system is the challenge of creating a "unique identity system" - a system for generating tokens that prove that an identity is not part of a Sybil attack ... However, we would like to have a system that has nicer and more egalitarian features than "one-dollar-one-vote"; arguably, one-person-one-vote would be ideal.
</blockquote>
<p>Status: <strong>Some progress</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face2.png" style="width:50px; height: 50px" /></p>
<p>There have been quite a few attempts at solving the unique-human problem. Attempts that come to mind include (incomplete list!):</p>
<ul>
<li><strong>HumanityDAO</strong>: <a href="https://www.humanitydao.org/" class="uri">https://www.humanitydao.org/</a></li>
<li><strong>Pseudonym parties</strong>: <a href="https://bford.info/pub/net/sybil.pdf" class="uri">https://bford.info/pub/net/sybil.pdf</a></li>
<li><strong>POAP</strong> ("proof of attendance protocol"): <a href="https://www.poap.xyz/" class="uri">https://www.poap.xyz/</a></li>
<li><strong>BrightID</strong>: <a href="https://www.brightid.org/" class="uri">https://www.brightid.org/</a></li>
</ul>
<p>With the growing interest in techniques like <a href="https://en.wikipedia.org/wiki/Quadratic_voting">quadratic voting</a> and <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3243656">quadratic funding</a>, the need for some kind of human-based anti-sybil system continues to grow. Hopefully, ongoing development of these techniques and new ones can come to meet it.</p>
<a name="numberfourteensic"></a>
<blockquote style="background-color:#ffe4ff">
<h3>
14 [sic]. Decentralized contribution metrics
</h3>
Incentivizing the production of public goods is, unfortunately, not the only problem that centralization solves. The other problem is determining, first, which public goods are worth producing in the first place and, second, determining to what extent a particular effort actually accomplished the production of the public good. This challenge deals with the latter issue.
</blockquote>
<p>Status: <strong>Some progress, some change in focus</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face6.png" style="width:50px; height: 50px" /></p>
<p>More recent work on determining value of public-good contributions does not try to separate determining tasks and determining quality of completion; the reason is that in practice the two are difficult to separate. Work done by specific teams tends to be non-fungible and subjective enough that the most reasonable approach is to look at relevance of task and quality of performance as a single package, and use the same technique to evaluate both.</p>
<p>Fortunately, there has been great progress on this, particularly with the discovery of <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3243656">quadratic funding</a>. Quadratic funding is a mechanism where individuals can make donations to projects, and then based on the number of people who donated and how much they donated, a formula is used to calculate how much they would have donated if they were perfectly coordinated with each other (ie. took each other's interests into account and did not fall prey to the tragedy of the commons). The difference between amount would-have-donated and amount actually donated for any given project is given to that project as a subsidy from some central pool (see #11 for where the central pool funding could come from). Note that this mechanism focuses on satisfying the values of some community, not on satisfying some given goal regardless of whether or not anyone cares about it. Because of the <a href="https://wiki.lesswrong.com/wiki/Complexity_of_value">complexity of values</a> problem, this approach is likely to be much more robust to unknown unknowns.</p>
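<p>The formula in the linked quadratic funding paper sets the "would-have-donated" total for a project to the square of the sum of the square roots of the individual contributions. A minimal sketch of that calculation is below; the function names are my own, and the sketch ignores the practical case where the central pool is too small to pay every subsidy in full.</p>

```python
from math import sqrt


def quadratic_funding_total(contributions):
    """Total the formula says the project should receive: (sum of sqrt(c_i)) squared."""
    return sum(sqrt(c) for c in contributions) ** 2


def subsidy(contributions):
    """Top-up from the central pool: formula total minus what was actually donated."""
    return quadratic_funding_total(contributions) - sum(contributions)
```

<p>For example, 100 people donating $1 each give a formula total of $10,000 and hence a $9,900 subsidy, while a single person donating $100 gives a formula total of $100 and no subsidy at all: the mechanism rewards breadth of support, not just the amount raised.</p>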
<p>Quadratic funding has even been tried in real life with considerable success in the <a href="https://vitalik.ca/general/2019/10/24/gitcoin.html">recent Gitcoin quadratic funding round</a>. There has also been some incremental progress on improving quadratic funding and similar mechanisms; particularly, <a href="https://ethresear.ch/t/pairwise-coordination-subsidies-a-new-quadratic-funding-design/5553">pairwise-bounded quadratic funding</a> to mitigate collusion. There has also been work on specification and implementation of <a href="https://ethresear.ch/t/minimal-anti-collusion-infrastructure/5413">bribe-resistant</a> voting technology, preventing users from proving to third parties who they voted for; this prevents many kinds of collusion and bribe attacks.</p>
<p><a name="numbersixteen"></a></p>
<blockquote style="background-color:#ffe4ff">
<h3>
<ol start="16" type="1">
<li>Decentralized success metrics
</h3>
Problem: come up with and implement a decentralized method for measuring numerical real-world variables ... the system should be able to measure anything that humans can currently reach a rough consensus on (eg. price of an asset, temperature, global CO2 concentration)
</blockquote></li>
</ol>
<p>Status: <strong>Some progress</strong>. <img src="http://vitalik.ca/files/posts_files/progress-files/happy_face2.png" style="width:50px; height: 50px" /></p>
<p>This is now generally just called "the oracle problem". The largest known instance of a decentralized oracle running is <a href="https://www.augur.net/">Augur</a>, which has processed outcomes for millions of dollars of bets. <a href="https://medium.com/@tokencuratedregistry/a-simple-overview-of-token-curated-registries-84e2b7b19a06">Token curated registries</a> such as the <a href="https://tokens.kleros.io/tokens">Kleros TCR for tokens</a> are another example. However, these systems still have not seen a real-world test of the forking mechanism (search for "subjectivocracy" <a href="https://blog.ethereum.org/2015/02/14/subjectivity-exploitability-tradeoff/">here</a>) either due to a highly controversial question or due to an attempted 51% attack. There is also research on the oracle problem happening outside of the blockchain space in the form of the "<a href="https://www2.cs.duke.edu/courses/spring17/compsci590.2/peer_prediction.pdf">peer prediction</a>" literature; see <a href="https://arxiv.org/abs/1911.00272">here</a> for a very recent advancement in the space.</p>
<p>Another looming challenge is that people want to rely on these systems to guide transfers of quantities of assets larger than the economic value of the system's native token. In these conditions, token holders in theory have the incentive to collude to give wrong answers to steal the funds. In such a case, the system would fork and the original system token would likely become valueless, but the original system token holders would still get away with the returns from whatever asset transfer they misdirected. Stablecoins (see <a href="#numberten">#10</a>) are a particularly egregious case of this. One approach to solving this would be a system that assumes that altruistically honest data providers do exist, creates a mechanism to identify them, and only allows them to churn slowly, so that if malicious ones start getting voted in, the users of systems that rely on the oracle can first complete an orderly exit. In any case, more development of oracle tech is very much an important problem.</p>
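<p>The incentive problem can be written as back-of-the-envelope arithmetic. The sketch below assumes the simplest possible model: the colluding token holders capture all of the misdirected assets, and their tokens become worthless after the fork. The function name, parameters and dollar figures are hypothetical.</p>

```python
def collusion_profit(assets_at_stake: float, token_market_cap: float,
                     colluding_share: float) -> float:
    """Net gain to colluding token holders under the simple model above."""
    if not 0.0 <= colluding_share <= 1.0:
        raise ValueError("colluding_share must be between 0 and 1")
    # Value destroyed for the colluders: their tokens in the forked-away system.
    tokens_lost = colluding_share * token_market_cap
    return assets_at_stake - tokens_lost


# Hypothetical numbers: a $20M token guarding $5M vs $50M of assets.
safe_case = collusion_profit(5e6, 20e6, 0.51)
risky_case = collusion_profit(50e6, 20e6, 0.51)
```

<p>Under this model the attack pays whenever the assets relying on the oracle exceed the value of the colluders' tokens, which is why a token's market value puts a rough ceiling on what the oracle can safely secure.</p>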
<h3 id="new-problems">New problems</h3>
<p>If I were to write the hard problems list again in 2019, some would be a continuation of the above problems, but there would be significant changes in emphasis, as well as significant new problems. Here are a few picks:</p>
<ul>
<li><strong>Cryptographic obfuscation</strong>: same as <a href="#numberfour">#4</a> above</li>
<li><strong>Ongoing work on post-quantum cryptography</strong>: both hash-based as well as based on post-quantum-secure "structured" mathematical objects, eg. elliptic curve isogenies, lattices...</li>
<li><strong>Anti-collusion infrastructure</strong>: ongoing work and refinement of <a href="https://ethresear.ch/t/minimal-anti-collusion-infrastructure/5413" class="uri">https://ethresear.ch/t/minimal-anti-collusion-infrastructure/5413</a>, including adding privacy against the operator, adding multi-party computation in a maximally practical way, etc.</li>
<li><strong>Oracles</strong>: same as <a href="#numbersixteen">#16</a> above, but removing the emphasis on "success metrics" and focusing on the general "get real-world data" problem</li>
<li><strong>Unique-human identities</strong> (or, more realistically, semi-unique-human identities): same as what was written as <a href="#numberfifteensic">#15</a> above, but with an emphasis on a less "absolute" solution: it should be much harder to get two identities than one, but making it impossible to get multiple identities is both impossible and potentially harmful even if we do succeed</li>
<li><strong>Homomorphic encryption and multi-party computation</strong>: ongoing improvements are still required for practicality</li>
<li><strong>Decentralized governance mechanisms</strong>: DAOs are cool, but current DAOs are still very primitive; we can do better</li>
<li><strong>Fully formalizing responses to PoS 51% attacks</strong>: ongoing work and refinement of <a href="https://ethresear.ch/t/responding-to-51-attacks-in-casper-ffg/6363" class="uri">https://ethresear.ch/t/responding-to-51-attacks-in-casper-ffg/6363</a></li>
<li><strong>More sources of public goods funding</strong>: the ideal is to charge for congestible resources inside of systems that have network effects (eg. transaction fees), but doing so in decentralized systems requires public legitimacy; hence this is a social problem along with the technical one of finding possible sources</li>
<li><strong>Reputation systems</strong>: same as <a href="#numbertwelve">#12</a> above</li>
</ul>
<p>In general, base-layer problems are slowly but surely decreasing, but application-layer problems are only just getting started.</p>
Fri, 22 Nov 2019 17:03:10 -0800
https://vitalik.ca/general/2019/11/22/progress.html
Review of Gitcoin Quadratic Funding Round 3
<p><em>Special thanks to the Gitcoin team and especially Frank Chen for working with me through these numbers</em></p>
<p>The next round of Gitcoin Grants quadratic funding has just finished, and the numbers for how much each project has received <a href="https://gitcoin.co/blog/gitcoins-q3-match/">were just released</a>. Here are the top ten:</p>
<center>
<img src="http://vitalik.ca/files/posts_files/gitcoin-files/round3.png" style="width:750px" />
</center>
<p><br><br></p>
<p>Altogether, $163,279 was donated to 80 projects by 477 contributors, augmented by a matching pool of $100,000. Nearly half came from four contributions above $10,000: $37,500 to Lighthouse, and $12,500 each to Gas Station Network, Black Girls Code and Public Health Incentives Layer. Out of the remainder, about half came from contributions between $1,000 and $10,000, and the rest came from smaller donations of various sizes. But what matters more here are not the raw donations, but rather the subsidies that the <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3243656">quadratic funding</a> mechanism applied. Gitcoin Grants is there to support valuable public goods in the Ethereum ecosystem, but also to serve as a testbed for this new quadratic donation matching mechanism, and to see how well it lives up to its promise of creating a democratic, market-based and efficient way of funding public goods. This time around, a modified formula based on <a href="https://ethresear.ch/t/pairwise-coordination-subsidies-a-new-quadratic-funding-design/5553">pairwise-bounded coordination subsidies</a> was used, which has the goal of minimizing distortion from large contributions from coordinated actors. And now we get to see how the experiment went.</p>
<h3 id="judging-the-outcomes">Judging the Outcomes</h3>
<p>First, the results. Ultimately, every mechanism for allocating resources, whether centralized, market-based, democratic or otherwise, must stand the test of delivering results, or else sooner or later it will be abandoned for another mechanism that is perceived to be better, even if it is less philosophically clean. Judging results is inherently a subjective exercise; any single person's analysis of a mechanism will inevitably be shaped by how well the results fit their own preferences and tastes. However, in those cases where a mechanism does output a surprising result, one can and should use that as an opportunity to learn, and see whether or not one missed some key information that other participants in the mechanism had.</p>
<p>In my own case, I found the top results very agreeable and a quite reasonable catalogue of projects that are good for the Ethereum community. One of the disparities between these grants and the Ethereum Foundation grants is that the Ethereum Foundation grants (see recent rounds <a href="https://blog.ethereum.org/2019/08/26/announcing-ethereum-foundation-and-co-funded-grants/">here</a> and <a href="https://blog.ethereum.org/2019/02/21/ethereum-foundation-grants-program-wave-5/">here</a>) tend to overwhelmingly focus on technology with only a small section on education and community resources, whereas in the Gitcoin grants while technology still dominates, EthHub is #2 and lower down defiprime.com is #14 and cryptoeconomics.study is #17. In this case my personal opinion is that EF <em>has</em> made a genuine error in undervaluing grants to community/education organizations and Gitcoin's "collective instinct" is correct. Score one for new-age fancy quadratic market democracy.</p>
<p>Another surprising result to me was Austin Griffith getting second place. I personally have never spent too much time thinking about Burner Wallet; I knew that it existed but in my mental space I did not take it too seriously, focusing instead on client development, L2 scaling, privacy and to a lesser extent smart contract wallets (the latter being a key use case of Gas Station Network at #8). After seeing Austin's impressive performance in this Gitcoin round, I asked a few people what was going on.</p>
<p>Burner Wallet (<a href="https://xdai.io/">website</a>, <a href="https://settle.finance/blog/what-is-the-burner-wallet-and-whats-xdai/">explainer article</a>) is an "insta-wallet" that's very easy to use: just load it up on your desktop or phone, and there you have it. It was used successfully at EthDenver to sell food from food trucks, and generally many people appreciate its convenience. Its main weaknesses are lower security and that one of its features, support for xDAI, is dependent on a permissioned chain.</p>
<p>Austin's Gitcoin grant is there to fund his <a href="https://gitcoin.co/grants/122/austin-griffith-ethereum-rampd">ongoing work</a>, and I have heard one criticism: there's many prototypes, but comparatively few "things taken to completion". There is also the critique that as great as Austin is, it's difficult to argue that he's as important to the success of Ethereum as, say, Lighthouse and Prysmatic, though one can reply that what matters is not total value, but rather the marginal value of giving a given project or person an extra $10,000. On the whole, however, I feel like quadratic funding's (Glen would say deliberate!) tendency to select for things like Burner Wallet with populist appeal is a much needed corrective to the influence of the Ethereum tech elite (including myself!) who often value technical impressiveness and undervalue simple and quick things that make it really easy for people to participate in Ethereum. This one is slightly more ambiguous, but I'll say score two for new-age fancy quadratic market democracy.</p>
<p>The main thing that I was disappointed the Gitcoiner-ati did <em>not</em> support more was Gitcoin maintenance itself. The Gitcoin Sustainability Fund only got a total of $1,119 in raw contributions from 18 participants, plus a match of $202. The optional 5% tips that users could give to Gitcoin upon donating were not included in the quadratic matching calculations, but raised another ~$1,000. Given the amount of effort the Gitcoin people put in to making quadratic funding possible, this is not nearly enough; Gitcoin clearly deserves more than 0.9% of the total donations in the round. Meanwhile, the Ethereum Foundation (as well as Consensys and individual donors) have been giving grants to Gitcoin that include supporting Gitcoin itself. Hopefully in future rounds people will support Gitcoin itself too, but for now, score one for good old-fashioned EF technocracy.</p>
<p>On the whole, quadratic funding, while still young and immature, seems to be a remarkably effective complement to the funding preferences of existing institutions, and it seems worthwhile to continue it and even increase its scope and size in the future.</p>
<h3 id="pairwise-bounded-quadratic-funding-vs-traditional-quadratic-funding">Pairwise-bounded quadratic funding vs traditional quadratic funding</h3>
<p>Round 3 differs from previous rounds in that it uses a new <a href="https://ethresear.ch/t/pairwise-coordination-subsidies-a-new-quadratic-funding-design/5553">flavor of quadratic funding</a>, which limits the subsidy per pair of participants. For example, in traditional QF, if two people each donate $10, the subsidy would be $10, and if two people each donate $10,000, the subsidy would be $10,000. This property of traditional QF makes it highly vulnerable to collusion: two key employees of a project (or even two fake accounts owned by the same person) could each donate as much money as they have, and get back a very large subsidy. Pairwise-bounded QF computes the total subsidy to a project by looking through all pairs of contributors, and imposes a maximum bound on the total subsidy that any given pair of participants can trigger (combined across all projects). Pairwise-bounded QF also has the property that it generally penalizes projects that are dominated by large contributors:</p>
<center>
<img src="http://vitalik.ca/files/posts_files/gitcoin-files/chart2.png" />
</center>
<p><br><br></p>
<p>The projects that lost the most relative to traditional QF seem to be projects that have a single large contribution (or sometimes two). For example, "fuzz geth and Parity for EVM consensus bugs" got a $415 match compared to the $2000 it would have gotten in traditional QF; the decrease is explained by the fact that the contributions are dominated by two large $4500 contributions. On the other hand, <a href="http://cryptoeconomics.study">cryptoeconomics.study</a> got $1274, <em>up</em> nearly double from the $750 it would have gotten in traditional QF; this is explained by the large diversity of contributions that the project received and particularly the lack of large sponsors: the largest contribution to cryptoeconomics.study was $100.</p>
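<p>To make the difference concrete, here is a minimal sketch of both rules. It uses the identity <span class="math inline">\(\left(\sum_i \sqrt{c_i}\right)^2 - \sum_i c_i = 2\sum_{i&lt;j}\sqrt{c_i c_j}\)</span> and counts each pair once, which reproduces the $10 and $10,000 figures in the two-donor examples above; the constant factor should not matter if subsidies are scaled to a fixed matching pool. The bounding rule used here (scaling each pair by <code>cap / (cap + pair_total)</code>), the <code>cap</code> value and the sample data are all assumptions for illustration, not necessarily the exact formula or parameters used in the round.</p>

```python
from itertools import combinations
from math import sqrt


def pair_terms(contribs):
    """Return sqrt(c_i * c_j) for every unordered pair of contributors to one project."""
    return {
        (i, j): sqrt(contribs[i] * contribs[j])
        for i, j in combinations(sorted(contribs), 2)
    }


def traditional_subsidy(contribs):
    """Traditional QF subsidy, counting each pair of contributors once."""
    return sum(pair_terms(contribs).values())


def pairwise_bounded_subsidies(projects, cap):
    """Scale each pair's terms so its combined subsidy across all projects stays below cap.

    ASSUMPTION: the factor cap / (cap + pair_total) is an illustrative bounding rule.
    """
    pair_totals = {}
    for contribs in projects.values():
        for pair, term in pair_terms(contribs).items():
            pair_totals[pair] = pair_totals.get(pair, 0.0) + term
    return {
        name: sum(
            term * cap / (cap + pair_totals[pair])
            for pair, term in pair_terms(contribs).items()
        )
        for name, contribs in projects.items()
    }


# Hypothetical data: two large donors versus ten small ones.
projects = {
    "two_whales": {"alice": 10_000, "bob": 10_000},
    "crowd": {f"user{k}": 10 for k in range(10)},
}
traditional = {name: traditional_subsidy(c) for name, c in projects.items()}
bounded = pairwise_bounded_subsidies(projects, cap=25)
for name in projects:
    print(name, round(traditional[name], 2), round(bounded[name], 2))
```

<p>In this toy data the two-donor project loses almost all of its subsidy under the bound, while the project with ten small donors keeps most of it, which is the same qualitative pattern as in the chart.</p>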
<p>Another desirable property of pairwise-bounded QF is that it privileges <em>cross-tribal</em> projects. That is, if there are projects that group A typically supports, and projects that group B typically supports, then projects that manage to get support from both groups get a more favorable subsidy (because the pairs that go between groups are not as saturated). Has this incentive for building bridges appeared in these results?</p>
<p>Unfortunately, my code of honor as a social scientist obliges me to report the negative result: the Ethereum community just does not yet have enough internal tribal structure for effects like this to materialize, and even when there are differences in correlations they don't seem strongly connected to higher subsidies due to pairwise-bounding. Here are the cross-correlations between who contributed to different projects:</p>
<center>
<img src="http://vitalik.ca/files/posts_files/gitcoin-files/correlations.png" />
</center>
<p><br><br></p>
<p>Generally, all projects are slightly positively correlated with each other, with a few exceptions with greater correlation and one exception with broad roughly zero correlation: <a href="https://gitcoin.co/grants/120/nori">Nori</a> (120 in this chart). However, Nori did not do well in pairwise-bounded QF, because over 94% of its donations came from a single $5000 donation.</p>
<h3 id="dominance-of-large-projects">Dominance of large projects</h3>
<p>One other pattern that we saw in this round is that popular projects got disproportionately large grants:</p>
<center>
<img src="http://vitalik.ca/files/posts_files/gitcoin-files/ratios.jpeg" />
</center>
<p><br><br></p>
<p>To be clear, this is not just saying "more contributions, more match", it's saying "more contributions, <em>more match per dollar contributed</em>". Arguably, this is an intended feature of the mechanism. Projects that can get more people to donate to them represent public goods that serve a larger public, and so tragedy of the commons problems are more severe and hence contributions to them should be multiplied more to compensate. However, looking at the list, it's hard to argue that, say, Prysm ($3,848 contributed, $8,566 matched) is a more public good than Nimbus ($1,129 contributed, $496 matched; for the unaware, Prysm and Nimbus are both eth2 clients). The failure does not look too severe; on average, projects near the top do seem to serve a larger public and projects near the bottom do seem niche, but it seems clear that at least part of the disparity is not genuine publicness of the good, but rather inequality of attention. N units of marketing effort can attract attention of N people, and theoretically get N^2 resources.</p>
<p>Of course, this could be solved via a "layer on top" venture-capital style: upstart new projects could get investors to support them, in return for a share of matched contributions received when they get large. Something like this would be needed eventually; predicting future public goods is as important a social function as predicting future private goods. But we could also consider less winner-take-all alternatives; the simplest one would be adjusting the QF formula so it uses an exponent of eg. 1.5 instead of 2. I can see it being worthwhile to try a future round of Gitcoin Grants with such a formula (<span class="math inline">\(\left(\sum_i x_i^{\frac{2}{3}}\right)^{\frac{3}{2}}\)</span> instead of <span class="math inline">\(\left(\sum_i x_i^{\frac{1}{2}}\right)^2\)</span>) to see what the results are like.</p>
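<p>As a rough illustration of what changing the exponent does, the sketch below evaluates both formulas for <span class="math inline">\(N\)</span> contributors who each give $1. The function name and the sample values of <span class="math inline">\(N\)</span> are my own choices for the example; with exponent 2 the total grows like <span class="math inline">\(N^2\)</span>, and with exponent 1.5 it grows like <span class="math inline">\(N^{1.5}\)</span>.</p>

```python
def qf_total(contributions, exponent=2.0):
    """Generalized QF total: (sum_i x_i^(1/exponent))^exponent.

    exponent=2.0 gives the usual (sum of square roots)^2;
    exponent=1.5 gives (sum_i x_i^(2/3))^(3/2).
    """
    return sum(x ** (1.0 / exponent) for x in contributions) ** exponent


for n in (1, 10, 100):
    donations = [1.0] * n  # n contributors giving $1 each
    print(n, qf_total(donations, 2.0), qf_total(donations, 1.5))
```

<p>A single contributor gets back roughly their own contribution under either exponent, so the change only affects how quickly the total grows with the number of contributors.</p>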
<h3 id="individual-leverage-curves">Individual leverage curves</h3>
<p>One key question is, if you donate $1, or $5, or $100, how big an impact can you have on the amount of money that a project gets? Fortunately, we can use the data to calculate these deltas!</p>
<center>
<img src="http://vitalik.ca/files/posts_files/gitcoin-files/deltas.png" />
</center>
<p><br><br></p>
<p>The different lines are for different projects; supporting projects with higher existing support will lead to you getting a bigger multiplier. In all cases, the first dollar is very valuable, with a matching ratio in some cases over 100:1. But the second dollar is much less valuable, and matching ratios quickly taper off; even for the largest projects increasing one's donation from $32 to $64 will only get a 1:1 match, and anything above $100 becomes almost a straight donation with nearly no matching. However, given that it's likely possible to get legitimate-looking Github accounts on the grey market for around those costs, having a cap of a few hundred dollars on the amount of matched funds that any particular account can direct seems like a very reasonable mitigation, despite its costs in limiting the bulk of the matching effect to small-sized donations.</p>
<h3 id="conclusions">Conclusions</h3>
<p>On the whole, this was by far the largest and the most data-rich Gitcoin funding round to date. It successfully attracted hundreds of contributors, reaching a size where we can finally see many significant effects in play and drown out the effects of the more naive forms of small-scale collusion. The experiment already seems to be leading to valuable information that can be used by future quadratic funding implementers to improve their quadratic funding implementations. The case of Austin Griffith is also interesting because the $23,911 in funds that he received comes, in relative terms, surprisingly close to an average salary for a developer if the grants can be repeated on a regular schedule. What this means is that if Gitcoin Grants <em>does</em> continue operating regularly, and attracts and expands its pool of donations, we could be very close to seeing the first "quadratic freelancer" - someone directly "working for the public", funded by donations boosted by quadratic matching subsidies. And at that point we could start to see more experimentation in new forms of organization that live on top of quadratic funding gadgets as a base layer. All in all, this foretells an exciting and, err, radical public-goods funding future ahead of us.</p>
Thu, 24 Oct 2019 18:03:10 -0700
https://vitalik.ca/general/2019/10/24/gitcoin.html
In-person meatspace protocol to prove unconditional possession of a private key
<p><em>Recommended pre-reading: <a href="https://ethresear.ch/t/minimal-anti-collusion-infrastructure/5413" class="uri">https://ethresear.ch/t/minimal-anti-collusion-infrastructure/5413</a></em></p>
<p>Alice slowly walks down the old, dusty stairs of the building into the basement. She thinks wistfully of the old days, when quadratic-voting in the World Collective Market was a much simpler process of linking her public key to a twitter account and opening up metamask to start firing off votes. Of course back then voting in the WCM was used for little; there were a few internet forums that used it for voting on posts, and a few million dollars donated to its <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3243656">quadratic funding</a> oracle. But then it grew, and then the game-theoretic attacks came.</p>
<p>First came the exchange platforms, which started offering "<a href="https://vitalik.ca/general/2018/03/28/plutocracy.html">dividends</a>" to anyone who registered a public key belonging to an exchange and thus provably allowed the exchange to vote on their behalf, breaking the crucial "independent choice" assumption of the quadratic voting and funding mechanisms. And soon after that came the fake accounts - Twitter accounts, Reddit accounts filtered by karma score, national government IDs, all proved vulnerable to either government cheating or hackers, or both. Elaborate infrastructure was instituted at registration time to ensure both that account holders were real people, and that account holders themselves held the keys, not a central custody service purchasing keys by the thousands to buy votes.</p>
<p>And so today, voting is still easy, but initiation, while still not harder than going to a government office, is no longer exactly trivial. But of course, with billions of dollars in donations from now-deceased billionaires and cryptocurrency premines forming part of the WCM's quadratic funding pool, and elements of municipal governance using its quadratic voting protocols, participating is very much worth it.</p>
<p>After reaching the end of the stairs, Alice opens the door and enters the room. Inside the room, she sees a table. On the near side of the table, she sees a single, empty chair. On the far side of the table, she sees four people already sitting down on chairs of their own, the high-reputation Guardians randomly selected by the WCM for Alice's registration ceremony. "Hello, Alice," the person sitting on the leftmost chair, whose name she intuits is Bob, says in a calm voice. "Glad that you can make it," the person sitting beside Bob, whose name she intuits is Charlie, adds.</p>
<p>Alice walks over to the chair that is clearly meant for her and sits down. "Let us begin," the person sitting beside Charlie, whose name by logical progression is David, proclaims. "Alice, do you have your key shares?"</p>
<p>Alice takes out four pocket-sized notebooks, clearly bought from a dollar store, and places them on the table. The person sitting at the right, logically named Evan, takes out his phone, and immediately the others take out theirs. They open up their ethereum wallets. "So," Evan begins, "the current Ethereum beacon chain slot number is 28,205,913, and the block hash starts <code>0xbe48</code>. Do all agree?". "Yes," Alice, Bob, Charlie and David exclaim in unison. Evan continues: "so let us wait for the next block."</p>
<p>The five intently stare at their phones. First for ten seconds, then twenty, then thirty. "Three skipped proposers," Bob mutters, "how unusual". But then after another ten seconds, a new block appears. "Slot number 28,205,917, block hash starts <code>0x62f9</code>, so first digit 6. All agreed?"</p>
<p>"Yes."</p>
<p>"Six mod four is two, and as is prescribed in the Old Ways, we start counting indices from zero, so this means Alice will keep the third book, counting as usual from our left."</p>
<p>Bob takes the first, second and fourth notebooks that Alice provided, leaving the third untouched. Alice takes the remaining notebook and puts it back in her backpack. Bob opens each notebook to a page in the middle with the corner folded, and sees a sequence of letters and numbers written with a pencil in the middle of each page - a standard way of writing the key shares for over a decade, since camera and image processing technology got powerful enough to recognize words and numbers written on single slips of paper even inside an envelope. Bob, Charlie, David and Evan crowd around the books together, and each open up an app on their phone and press a few buttons.</p>
<p>Bob starts reading, as all four start typing into their phones at the same time:</p>
<p>"Alice's first key share is, <code>6-b-d-7-h-k-k-l-o-e-q-q-p-3-y-s-6-x-e-f</code>. Applying the 100,000x iterated SHA256 hash we get <code>e-a-6-6...</code>, confirm?"</p>
<p>"Confirmed," the others replied. "Checking against Alice's precommitted elliptic curve point A0... match."</p>
<p>"Alice's second key share is, <code>f-r-n-m-j-t-x-r-s-3-b-u-n-n-n-i-z-3-d-g</code>. Iterated hash <code>8-0-3-c...</code>, confirm?"</p>
<p>"Confirmed. Checking against Alice's precommitted elliptic curve point A1... match."</p>
<p>"Alice's fourth key share is, <code>i-o-f-s-a-q-f-n-w-f-6-c-e-a-m-s-6-z-z-n</code>. Iterated hash <code>6-a-5-6...</code>, confirm?"</p>
<p>"Confirmed. Checking against Alice's precommitted elliptic curve point A3... match."</p>
<p>"Adding the four precommitted curve points, x coordinate begins <code>3-1-8-3</code>. Alice, confirm that that is the key you wish to register?"</p>
<p>"Confirm."</p>
<p>Bob, Charlie, David and Evan glance down at their smartphone apps one more time, and each tap a few buttons. Alice catches a glance at Charlie's phone; she sees four yellow checkmarks, and an "approval transaction pending" dialog. After a few seconds, the four yellow checkmarks are replaced with a single green checkmark, with a transaction hash ID, too small for Alice to make out the digits from a few meters away, below. Alice's phone soon buzzes, with a notification dialog saying "Registration confirmed".</p>
<p>"Congratulations, Alice," Bob says. "Unconditional possession of your key has been verified. You are now free to send a transaction to the World Collective Market's MPC oracle to update your key."</p>
<p>"Only a 75% probability this would have actually caught me if I didn't actually have all four parts of the key," Alice thought to herself. But it seemed to be enough for an in-person protocol in practice; and if it ever wasn't then they could easily switch to slightly more complex protocols that used low-degree polynomials to achieve exponentially high levels of soundness. Alice taps a few buttons on her smartphone, and a "transaction pending" dialog shows up on the screen. Five seconds later, the dialog disappears and is replaced by a green checkmark. She jumps up with joy and, before Bob, Charlie, David and Evan can say goodbye, runs out of the room, frantically tapping buttons to vote on all the projects and issues in the WCM that she had wanted to support for months.</p>
Tue, 01 Oct 2019 18:03:10 -0700
https://vitalik.ca/general/2019/10/01/story.html
Understanding PLONK
<p><em>Special thanks to Justin Drake, Karl Floersch, Hsiao-wei Wang, Barry Whitehat, Dankrad Feist, Kobi Gurkan and Zac Williamson for review</em></p>
<p>Very recently, Ariel Gabizon, Zac Williamson and Oana Ciobotaru announced a new general-purpose zero-knowledge proof scheme called <a href="https://eprint.iacr.org/2019/953">PLONK</a>, standing for the unwieldy quasi-backronym "Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge". While <a href="https://eprint.iacr.org/2016/260.pdf">improvements</a> to general-purpose <a href="https://arxiv.org/abs/1903.12243">zero-knowledge proof</a> protocols have been <a href="https://dci.mit.edu/zksharks">coming</a> for <a href="https://eprint.iacr.org/2017/1066">years</a>, what PLONK (and the earlier but more complex <a href="https://www.benthamsgaze.org/2019/02/07/introducing-sonic-a-practical-zk-snark-with-a-nearly-trustless-setup/">SONIC</a> and the more recent <a href="https://eprint.iacr.org/2019/1047.pdf">Marlin</a>) bring to the table is a series of enhancements that may greatly improve the usability and progress of these kinds of proofs in general.</p>
<p>The first improvement is that while PLONK still requires a trusted setup procedure similar to that needed for the <a href="https://minezcash.com/zcash-trusted-setup/">SNARKs in Zcash</a>, it is a "universal and updateable" trusted setup. This means two things: first, instead of there being one separate trusted setup for every program you want to prove things about, there is one single trusted setup for the whole scheme after which you can use the scheme with any program (up to some maximum size chosen when making the setup). Second, there is a way for multiple parties to participate in the trusted setup such that it is secure as long as any one of them is honest, and this multi-party procedure is fully sequential: first one person participates, then the second, then the third... The full set of participants does not even need to be known ahead of time; new participants could just add themselves to the end. This makes it easy for the trusted setup to have a large number of participants, making it quite safe in practice.</p>
<p>The second improvement is that the "fancy cryptography" it relies on is one single standardized component, called a "polynomial commitment". PLONK uses "Kate commitments", based on a trusted setup and elliptic curve pairings, but you can instead swap it out with other schemes, such as <a href="https://vitalik.ca/general/2017/11/22/starks_part_2.html">FRI</a> (which would <a href="https://eprint.iacr.org/2019/1020">turn PLONK into a kind of STARK</a>) or DARK (based on hidden-order groups). This means the scheme is theoretically compatible with any (achievable) tradeoff between proof size and security assumptions.</p>
<center>
<img src="http://vitalik.ca/files/posts_files/plonk-files/Tradeoffs.png" />
</center>
<p><br></p>
<p>What this means is that use cases that require different tradeoffs between proof size and security assumptions (or developers that have different ideological positions about this question) can still share the bulk of the same tooling for "arithmetization" - the process for converting a program into a set of polynomial equations that the polynomial commitments are then used to check. If this kind of scheme becomes widely adopted, we can thus expect rapid progress in improving shared arithmetization techniques.</p>
<h2 id="how-plonk-works">How PLONK works</h2>
<p>Let us start with an explanation of how PLONK works, in a somewhat abstracted format that focuses on polynomial equations without immediately explaining how those equations are verified. A key ingredient in PLONK, as is the case in the <a href="https://medium.com/@VitalikButerin/quadratic-arithmetic-programs-from-zero-to-hero-f6d558cea649">QAPs used in SNARKs</a>, is a procedure for converting a problem of the form "give me a value <span class="math inline">\(X\)</span> such that a specific program <span class="math inline">\(P\)</span> that I give you, when evaluated with <span class="math inline">\(X\)</span> as an input, gives some specific result <span class="math inline">\(Y\)</span>" into the problem "give me a set of values that satisfies a set of math equations". The program <span class="math inline">\(P\)</span> can represent many things; for example the problem could be "give me a solution to this sudoku", which you would encode by setting <span class="math inline">\(P\)</span> to be a sudoku verifier plus some initial values encoded and setting <span class="math inline">\(Y\)</span> to <span class="math inline">\(1\)</span> (ie. "yes, this solution is correct"), and a satisfying input <span class="math inline">\(X\)</span> would be a valid solution to the sudoku. This is done by representing <span class="math inline">\(P\)</span> as a circuit with logic gates for addition and multiplication, and converting it into a system of equations where the variables are the values on all the wires and there is one equation per gate (eg. <span class="math inline">\(x_6 = x_4 \cdot x_7\)</span> for multiplication, <span class="math inline">\(x_8 = x_5 + x_9\)</span> for addition).</p>
<p>Here is an example of the problem of finding <span class="math inline">\(x\)</span> such that <span class="math inline">\(P(x) = x^3 + x + 5 = 35\)</span> (hint: <span title="Though other solutions also exist over fields where -31 has a square root; since SNARKs are done over prime fields this is something to watch out for!"><span class="math inline">\(x = 3\)</span></span>):</p>
<center>
<img src="http://vitalik.ca/files/posts_files/plonk-files/Circuit.png" />
</center>
<p><br></p>
<p>We can label the gates and wires as follows:</p>
<center>
<img src="http://vitalik.ca/files/posts_files/plonk-files/Circuit2.png" />
</center>
<p><br></p>
<p>On the gates and wires, we have two types of constraints: <strong>gate constraints</strong> (equations between wires attached to the same gate, eg. <span class="math inline">\(a_1 \cdot b_1 = c_1\)</span>) and <strong>copy constraints</strong> (claims about equality of different wires anywhere in the circuit, eg. <span class="math inline">\(a_0 = a_1 = b_1 = b_2 = a_3\)</span> or <span class="math inline">\(c_0 = a_1\)</span>). We will need to create a structured system of equations, which will ultimately reduce to a very small number of polynomial equations, to represent both.</p>
<p>In PLONK, the setup for these equations is as follows. Each equation is of the following form (think: <span class="math inline">\(L\)</span> = left, <span class="math inline">\(R\)</span> = right, <span class="math inline">\(O\)</span> = output, <span class="math inline">\(M\)</span> = multiplication, <span class="math inline">\(C\)</span> = constant):</p>
<p><span class="math display">\[
\left(Q_{L_{i}}\right) a_{i}+\left(Q_{R_{i}}\right) b_{i}+\left(Q_{O_{i}}\right) c_{i}+\left(Q_{M_{i}}\right) a_{i} b_{i}+Q_{C_{i}}=0
\]</span></p>
<p>Each <span class="math inline">\(Q\)</span> value is a constant; the constants in each equation (and the number of equations) will be different for each program. Each small-letter value is a variable, provided by the user: <span class="math inline">\(a_i\)</span> is the left input wire of the <span class="math inline">\(i\)</span>'th gate, <span class="math inline">\(b_i\)</span> is the right input wire, and <span class="math inline">\(c_i\)</span> is the output wire of the <span class="math inline">\(i\)</span>'th gate. For an addition gate, we set:</p>
<p><span class="math display">\[
Q_{L_{i}}=1, Q_{R_{i}}=1, Q_{M_{i}}=0, Q_{O_{i}}=-1, Q_{C_{i}}=0
\]</span></p>
<p>Plugging these constants into the equation and simplifying gives us <span class="math inline">\(a_i + b_i - c_i = 0\)</span>, which is exactly the constraint that we want. For a multiplication gate, we set:</p>
<p><span class="math display">\[
Q_{L_{i}}=0, Q_{R_{i}}=0, Q_{M_{i}}=1, Q_{O_{i}}=-1, Q_{C_{i}}=0
\]</span></p>
<p>For a constant gate setting <span class="math inline">\(a_i\)</span> to some constant <span class="math inline">\(x\)</span>, we set:</p>
<p><span class="math display">\[
Q_{L_{i}}=1, Q_{R_{i}}=0, Q_{M_{i}}=0, Q_{O_{i}}=0, Q_{C_{i}}=-x
\]</span></p>
<p>You may have noticed that each end of a wire, as well as each wire in a set of wires that clearly must have the same value (eg. <span class="math inline">\(x\)</span>), corresponds to a distinct variable; there's nothing so far forcing the output of one gate to be the same as the input of another gate (what we call "copy constraints"). PLONK does of course have a way of enforcing copy constraints, but we'll get to this later. So now we have a problem where a prover wants to prove that they have a bunch of <span class="math inline">\(a_i, b_i\)</span> and <span class="math inline">\(c_i\)</span> values that satisfy a bunch of equations that are of the same form. This is still a big problem, but unlike "find a satisfying input to this computer program" it's a very <em>structured</em> big problem, and we have mathematical tools to "compress" it.</p>
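<p>To make the gate equation concrete, here is a minimal Python sketch that checks it row by row for the <span class="math inline">\(x^3 + x + 5 = 35\)</span> example. The gate ordering in <code>rows_for</code> is a hypothetical layout chosen for illustration (it is not read off the figure), and the copy constraints are only respected because the function reuses the same values; nothing in the gate checks themselves enforces them.</p>

```python
# Selector tuples are ordered (Q_L, Q_R, Q_O, Q_M, Q_C), matching the gate equation:
#   Q_L*a + Q_R*b + Q_O*c + Q_M*a*b + Q_C = 0
ADD = (1, 1, -1, 0, 0)
MUL = (0, 0, -1, 1, 0)

def const_gate(k):
    """Selectors forcing the left wire a to equal the constant k."""
    return (1, 0, 0, 0, -k)

def gate_holds(selectors, a, b, c):
    """Evaluate one gate equation on the wire values (a, b, c)."""
    q_l, q_r, q_o, q_m, q_c = selectors
    return q_l * a + q_r * b + q_o * c + q_m * a * b + q_c == 0

def rows_for(x):
    """Hypothetical gate layout for x^3 + x + 5 = 35; rows are (selectors, a, b, c)."""
    x2, x3 = x * x, x * x * x
    s = x3 + x
    return [
        (MUL, x, x, x2),              # x * x = x^2
        (MUL, x2, x, x3),             # x^2 * x = x^3
        (ADD, x3, x, s),              # x^3 + x
        (const_gate(5), 5, 0, 0),     # a wire pinned to the constant 5
        (const_gate(35), 35, 0, 0),   # a wire pinned to the expected output 35
        (ADD, s, 5, 35),              # (x^3 + x) + 5 = 35
    ]

def all_gates_hold(x):
    return all(gate_holds(q, a, b, c) for q, a, b, c in rows_for(x))

print(all_gates_hold(3))
print(all_gates_hold(2))
# Without copy constraints, addition and multiplication gates accept all-zero wires:
print(gate_holds(ADD, 0, 0, 0), gate_holds(MUL, 0, 0, 0))
```

<p>The last line is the reason copy constraints are needed: taken in isolation, every addition or multiplication gate is trivially satisfiable.</p>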
<h3 id="from-linear-systems-to-polynomials">From linear systems to polynomials</h3>
<p>If you have read about <a href="https://vitalik.ca/general/2017/11/09/starks_part_1.html">STARKs</a> or <a href="https://medium.com/@VitalikButerin/quadratic-arithmetic-programs-from-zero-to-hero-f6d558cea649">QAPs</a>, the mechanism described in this next section will hopefully feel somewhat familiar, but if you have not that's okay too. The main ingredient here is to understand a <em>polynomial</em> as a mathematical tool for encapsulating a whole lot of values into a single object. Typically, we think of polynomials in "coefficient form", that is an expression like:</p>
<p><span class="math display">\[
y=x^{3}-5 x^{2}+7 x-2
\]</span></p>
<p>But we can also view polynomials in "evaluation form". For example, we can think of the above as being "the" degree <span class="math inline">\(< 4\)</span> polynomial with evaluations <span class="math inline">\((-2, 1, 0, 1)\)</span> at the coordinates <span class="math inline">\((0, 1, 2, 3)\)</span> respectively.</p>
<center>
<img src="http://vitalik.ca/files/posts_files/plonk-files/polynomial_graph.png" />
</center>
<p><br></p>
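<p>As a sanity check on the two forms, here is a small Lagrange interpolation sketch in Python. It works over exact rationals rather than a prime field, purely for readability, and the helper names are my own. It recovers the coefficient form from the four evaluations given above.</p>

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def lagrange_interpolate(xs, ys):
    """Coefficients (lowest degree first) of the degree < len(xs) polynomial through the points."""
    n = len(xs)
    coeffs = [Fraction(0)] * n
    for i in range(n):
        basis = [Fraction(1)]
        denom = Fraction(1)
        for j in range(n):
            if j != i:
                basis = poly_mul(basis, [Fraction(-xs[j]), Fraction(1)])  # times (x - x_j)
                denom *= xs[i] - xs[j]
        for k, b in enumerate(basis):
            coeffs[k] += ys[i] * b / denom
    return coeffs

def evaluate(coeffs, x):
    """Horner evaluation; coeffs are lowest degree first."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

coeffs = lagrange_interpolate([0, 1, 2, 3], [-2, 1, 0, 1])
print(coeffs)  # coefficients of -2 + 7x - 5x^2 + x^3
```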
<p>Now here's the next step. Systems of many equations of the same form can be re-interpreted as a single equation over polynomials. For example, suppose that we have the system:</p>
<p><span class="math display">\[
\begin{array}{l}{2 x_{1}-x_{2}+3 x_{3}=8} \\ {x_{1}+4 x_{2}-5 x_{3}=5} \\ {8 x_{1}-x_{2}-x_{3}=-2}\end{array}
\]</span></p>
<p>Let us define four polynomials in evaluation form: <span class="math inline">\(L(x)\)</span> is the degree <span class="math inline">\(< 3\)</span> polynomial that evaluates to <span class="math inline">\((2, 1, 8)\)</span> at the coordinates <span class="math inline">\((0, 1, 2)\)</span>, and at those same coordinates <span class="math inline">\(M(x)\)</span> evaluates to <span class="math inline">\((-1, 4, -1)\)</span>, <span class="math inline">\(R(x)\)</span> to <span class="math inline">\((3, -5, -1)\)</span> and <span class="math inline">\(O(x)\)</span> to <span class="math inline">\((8, 5, -2)\)</span> (it is okay to directly define polynomials in this way; you can use <a href="https://en.wikipedia.org/wiki/Lagrange_interpolation">Lagrange interpolation</a> to convert to coefficient form). Now, consider the equation:</p>
<p><span class="math display">\[
L(x) \cdot x_{1}+M(x) \cdot x_{2}+R(x) \cdot x_{3}-O(x)=Z(x) H(x)
\]</span></p>
<p>Here, <span class="math inline">\(Z(x)\)</span> is shorthand for <span class="math inline">\((x-0) \cdot (x-1) \cdot (x-2)\)</span> - the minimal (nontrivial) polynomial that returns zero over the evaluation domain <span class="math inline">\((0, 1, 2)\)</span>. A solution to this equation (<span class="math inline">\(x_1 = 1, x_2 = 6, x_3 = 4, H(x) = 0\)</span>) is also a solution to the original system of equations, except the original system does not need <span class="math inline">\(H(x)\)</span>. Notice also that in this case, <span class="math inline">\(H(x)\)</span> is conveniently zero, but in more complex cases <span class="math inline">\(H\)</span> may need to be nonzero.</p>
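<p>Since the four polynomials are defined by their evaluations, checking the combined equation on the domain <span class="math inline">\((0, 1, 2)\)</span> amounts to checking one original equation per coordinate. A minimal sketch (the function name <code>residuals</code> is just for illustration):</p>

```python
# Evaluations of L, M, R, O at the coordinates (0, 1, 2), one per equation in the system
L_EVALS = [2, 1, 8]
M_EVALS = [-1, 4, -1]
R_EVALS = [3, -5, -1]
O_EVALS = [8, 5, -2]

def residuals(x1, x2, x3):
    """Value of L(x)*x1 + M(x)*x2 + R(x)*x3 - O(x) at each coordinate of the domain."""
    return [
        l * x1 + m * x2 + r * x3 - o
        for l, m, r, o in zip(L_EVALS, M_EVALS, R_EVALS, O_EVALS)
    ]

print(residuals(1, 6, 4))  # all zero: the left side vanishes on the whole domain
print(residuals(1, 6, 5))  # nonzero entries: not a solution
```

<p>In this example the left side has degree less than 3 and vanishes at three points, so it is the zero polynomial, which is why <span class="math inline">\(H(x) = 0\)</span> works here.</p>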
<p>So now we know that we can represent a large set of constraints within a small number of mathematical objects (the polynomials). But in the equations that we made above to represent the gate wire constraints, the <span class="math inline">\(x_1, x_2, x_3\)</span> variables are different per equation. We can handle this by making the variables themselves polynomials rather than constants in the same way. And so we get:</p>
<p><span class="math display">\[
Q_{L}(x) a(x)+Q_{R}(x) b(x)+Q_{O}(x) c(x)+Q_{M}(x) a(x) b(x)+Q_{C}(x)=0
\]</span></p>
<p>As before, each <span class="math inline">\(Q\)</span> polynomial is a parameter that can be generated from the program that is being verified, and the <span class="math inline">\(a\)</span>, <span class="math inline">\(b\)</span>, <span class="math inline">\(c\)</span> polynomials are the user-provided inputs.</p>
<h3 id="copy-constraints">Copy constraints</h3>
<p>Now, let us get back to "connecting" the wires. So far, all we have is a bunch of disjoint equations about disjoint values that are independently easy to satisfy: constant gates can be satisfied by setting the value to the constant and addition and multiplication gates can simply be satisfied by setting all wires to zero! To make the problem actually challenging (and actually represent the problem encoded in the original circuit), we need to add an equation that verifies "copy constraints": constraints such as <span class="math inline">\(a(5) = c(7)\)</span>, <span class="math inline">\(c(10) = c(12)\)</span>, etc. This requires some clever trickery.</p>
<p>Our strategy will be to design a "coordinate pair accumulator", a polynomial <span class="math inline">\(p(x)\)</span> which works as follows. First, let <span class="math inline">\(X(x)\)</span> and <span class="math inline">\(Y(x)\)</span> be two polynomials representing the <span class="math inline">\(x\)</span> and <span class="math inline">\(y\)</span> coordinates of a set of points (eg. to represent the set <span class="math inline">\(((0, -2), (1, 1), (2, 0), (3, 1))\)</span> you might set <span class="math inline">\(X(x) = x\)</span> and <span class="math inline">\(Y(x) = x^3 - 5x^2 + 7x - 2\)</span>). Our goal will be to let <span class="math inline">\(p(x)\)</span> represent all the points up to (but not including) the given position, so <span class="math inline">\(p(0)\)</span> starts at <span class="math inline">\(1\)</span>, <span class="math inline">\(p(1)\)</span> represents just the first point, <span class="math inline">\(p(2)\)</span> the first and the second, etc. We will do this by "randomly" selecting two constants, <span class="math inline">\(v_1\)</span> and <span class="math inline">\(v_2\)</span>, and constructing <span class="math inline">\(p(x)\)</span> using the constraints <span class="math inline">\(p(0) = 1\)</span> and <span class="math inline">\(p(x+1) = p(x) \cdot (v_1 + X(x) + v_2 \cdot Y(x))\)</span> at least within the domain <span class="math inline">\((0, 1, 2, 3)\)</span>.</p>
<p>For example, letting <span class="math inline">\(v_1 = 3\)</span> and <span class="math inline">\(v_2 = 2\)</span>, we get:</p>
<center>
<p><img src="http://vitalik.ca/files/posts_files/plonk-files/polynomial_graph3.png" style="width:440px"/><br></p>
<table style="padding-right:136px; border-collapse: collapse;" align="center">
<tr>
<td style="border: 1px solid black" align="right" width="136px"><span class="math inline">\(X(x)\)</span>
</td>
<td style="border: 1px solid black" align="center" width="84px">
0
</td>
<td style="border: 1px solid black" align="center" width="84px">
1
</td>
<td style="border: 1px solid black" align="center" width="84px">
2
</td>
<td style="border: 1px solid black" align="center" width="84px">
3
</td>
<td style="border: 1px solid black" align="center" width="84px">
4
</td>
</tr>
<tr>
<td style="border: 1px solid black" align="right" width="136px">
<span class="math inline">\(Y(x)\)</span>
</td>
<td style="border: 1px solid black" align="center" width="84px">
-2
</td>
<td style="border: 1px solid black" align="center" width="84px">
1
</td>
<td style="border: 1px solid black" align="center" width="84px">
0
</td>
<td style="border: 1px solid black" align="center" width="84px">
1
</td>
<td style="border: 1px solid black" align="center" width="84px">
</td>
</tr>
<tr>
<td style="border: 1px solid black" align="right" width="136px">
<small><span class="math inline">\(v_1 + X(x) + v_2 \cdot Y(x)\)</span></small>
</td>
<td style="border: 1px solid black" align="center" width="84px">
-1
</td>
<td style="border: 1px solid black" align="center" width="84px">
6
</td>
<td style="border: 1px solid black" align="center" width="84px">
5
</td>
<td style="border: 1px solid black" align="center" width="84px">
8
</td>
<td style="border: 1px solid black" align="center" width="84px">
</td>
</tr>
<tr>
<td style="border: 1px solid black" align="right" width="136px">
<span class="math inline">\(p(x)\)</span>
</td>
<td style="border: 1px solid black" align="center" width="84px">
1
</td>
<td style="border: 1px solid black" align="center" width="84px">
-1
</td>
<td style="border: 1px solid black" align="center" width="84px">
-6
</td>
<td style="border: 1px solid black" align="center" width="84px">
-30
</td>
<td style="border: 1px solid black" align="center" width="84px">
-240
</td>
</tr>
</table>
<br> <small><i>Notice that (aside from the first column) every <span class="math inline">\(p(x)\)</span> value equals the value to the left of it multiplied by the value to the left and above it.</i></small>
</center>
<p><br></p>
<p>The result we care about is <span class="math inline">\(p(4) = -240\)</span>. Now, consider the case where instead of <span class="math inline">\(X(x) = x\)</span>, we set <span class="math inline">\(X(x) = \frac{2}{3} x^3 - 4x^2 + \frac{19}{3}x\)</span> (that is, the polynomial that evaluates to <span class="math inline">\((0, 3, 2, 1)\)</span> at the coordinates <span class="math inline">\((0, 1, 2, 3)\)</span>). If you run the same procedure, you'll find that you also get <span class="math inline">\(p(4) = -240\)</span>. This is not a coincidence (in fact, if you randomly pick <span class="math inline">\(v_1\)</span> and <span class="math inline">\(v_2\)</span> from a sufficiently large field, it will <em>almost never</em> happen coincidentally). Rather, this happens because <span class="math inline">\(Y(1) = Y(3)\)</span>, so if you "swap the <span class="math inline">\(X\)</span> coordinates" of the points <span class="math inline">\((1, 1)\)</span> and <span class="math inline">\((3, 1)\)</span> you're not changing the <em>set</em> of points, and because the accumulator encodes a set (as multiplication does not care about order) the value at the end will be the same.</p>
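<p>The accumulator is easy to reproduce by hand or in a few lines of Python. The sketch below works over plain integers, as in the table above, with <span class="math inline">\(v_1 = 3\)</span> and <span class="math inline">\(v_2 = 2\)</span>; the function name <code>accumulate</code> is my own.</p>

```python
def accumulate(xs, ys, v1, v2):
    """Return [p(0), p(1), ..., p(n)] with p(0) = 1 and p(i+1) = p(i) * (v1 + X(i) + v2 * Y(i))."""
    p = [1]
    for x, y in zip(xs, ys):
        p.append(p[-1] * (v1 + x + v2 * y))
    return p

ys = [-2, 1, 0, 1]

original = accumulate([0, 1, 2, 3], ys, 3, 2)
swapped_equal = accumulate([0, 3, 2, 1], ys, 3, 2)    # swap X coords where Y(1) = Y(3)
swapped_unequal = accumulate([0, 2, 1, 3], ys, 3, 2)  # swap X coords where Y(1) != Y(2)

print(original)             # the p(x) row of the table, ending in -240
print(swapped_equal[-1])    # also -240: the set of points is unchanged
print(swapped_unequal[-1])  # a different value: the set of points changed
```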
<p>Now we can start to see the basic technique that we will use to prove copy constraints. First, consider the simple case where we only want to prove copy constraints within one set of wires (eg. we want to prove <span class="math inline">\(a(1) = a(3)\)</span>). We'll make two coordinate accumulators: one where <span class="math inline">\(X(x) = x\)</span> and <span class="math inline">\(Y(x) = a(x)\)</span>, and the other where <span class="math inline">\(Y(x) = a(x)\)</span> but <span class="math inline">\(X'(x)\)</span> is the polynomial that evaluates to the permutation that flips (or otherwise rearranges) the values in each copy constraint; in the <span class="math inline">\(a(1) = a(3)\)</span> case this would mean the permutation would start <span class="math inline">\(0 3 2 1 4...\)</span>. The first accumulator would be compressing <span class="math inline">\(((0, a(0)), (1, a(1)), (2, a(2)), (3, a(3)), (4, a(4))...\)</span>, the second <span class="math inline">\(((0, a(0)), (3, a(1)), (2, a(2)), (1, a(3)), (4, a(4))...\)</span>. The only way the two can give the same result is if <span class="math inline">\(a(1) = a(3)\)</span>.</p>
<p>To prove constraints between <span class="math inline">\(a\)</span>, <span class="math inline">\(b\)</span> and <span class="math inline">\(c\)</span>, we use the same procedure, but instead "accumulate" together points from all three polynomials. We assign each of <span class="math inline">\(a\)</span>, <span class="math inline">\(b\)</span>, <span class="math inline">\(c\)</span> a range of <span class="math inline">\(X\)</span> coordinates (eg. <span class="math inline">\(a\)</span> gets <span class="math inline">\(X_a(x) = x\)</span> ie. <span class="math inline">\(0...n-1\)</span>, <span class="math inline">\(b\)</span> gets <span class="math inline">\(X_b(x) = n+x\)</span>, ie. <span class="math inline">\(n...2n-1\)</span>, <span class="math inline">\(c\)</span> gets <span class="math inline">\(X_c(x) = 2n+x\)</span>, ie. <span class="math inline">\(2n...3n-1\)</span>). To prove copy constraints that hop between different sets of wires, the "alternate" <span class="math inline">\(X\)</span> coordinates would be slices of a permutation across all three sets. For example, if we want to prove <span class="math inline">\(a(2) = b(4)\)</span> with <span class="math inline">\(n = 5\)</span>, then <span class="math inline">\(X'_a(x)\)</span> would have evaluations <span class="math inline">\(0 1 9 3 4\)</span> and <span class="math inline">\(X'_b(x)\)</span> would have evaluations <span class="math inline">\(5 6 7 8 2\)</span> (notice the <span class="math inline">\(2\)</span> and <span class="math inline">\(9\)</span> flipped, where <span class="math inline">\(9\)</span> corresponds to the <span class="math inline">\(b_4\)</span> wire).</p>
<p>Instead of checking equality within one run of the procedure (ie. checking <span class="math inline">\(p(4) = p'(4)\)</span> as before), we would then check <em>the product</em> of the three different runs on each side:</p>
<p><span class="math display">\[
p_{a}(n) \cdot p_{b}(n) \cdot p_{c}(n)=p_{a}^{\prime}(n) \cdot p_{b}^{\prime}(n) \cdot p_{c}^{\prime}(n)
\]</span></p>
<p>The product of the three <span class="math inline">\(p(n)\)</span> evaluations on each side accumulates <em>all</em> coordinate pairs in the <span class="math inline">\(a\)</span>, <span class="math inline">\(b\)</span> and <span class="math inline">\(c\)</span> runs on each side together, so this allows us to do the same check as before, except that we can now check copy constraints not just between positions within one of the three sets of wires <span class="math inline">\(a\)</span>, <span class="math inline">\(b\)</span> or <span class="math inline">\(c\)</span>, but also between one set of wires and another (eg. as in <span class="math inline">\(a(2) = b(4)\)</span>).</p>
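<p>Here is a sketch of the cross-wire check for the <span class="math inline">\(a(2) = b(4)\)</span>, <span class="math inline">\(n = 5\)</span> example. It multiplies all <span class="math inline">\(3n\)</span> factors into one product per side, which equals the product of the three per-wire runs. The modulus, the fixed values of <span class="math inline">\(v_1\)</span> and <span class="math inline">\(v_2\)</span> and the wire values are all arbitrary choices for illustration; in a real proof <span class="math inline">\(v_1\)</span> and <span class="math inline">\(v_2\)</span> would be chosen unpredictably.</p>

```python
MODULUS = 2**31 - 1  # a prime, standing in for the real field (illustrative choice)

def grand_product(coords, values, v1, v2):
    """Multiply (v1 + X + v2 * Y) over all coordinate pairs, mod MODULUS."""
    acc = 1
    for x, y in zip(coords, values):
        acc = acc * (v1 + x + v2 * y) % MODULUS
    return acc

def copy_check(a, b, c, sigma, v1, v2):
    """Compare the accumulated product under identity coordinates vs permuted coordinates."""
    values = a + b + c
    identity = list(range(len(values)))  # a: 0..n-1, b: n..2n-1, c: 2n..3n-1
    return grand_product(identity, values, v1, v2) == grand_product(sigma, values, v1, v2)

n = 5
sigma = list(range(3 * n))
sigma[2], sigma[n + 4] = sigma[n + 4], sigma[2]  # tie a(2) to b(4): swap coordinates 2 and 9

a = [10, 11, 42, 13, 14]
b = [20, 21, 22, 23, 42]   # b(4) equals a(2)
c = [30, 31, 32, 33, 34]
v1, v2 = 1234, 5678        # fixed here only to keep the example reproducible

print(sigma[:n], sigma[n:2 * n])
print(copy_check(a, b, c, sigma, v1, v2))

b_bad = [20, 21, 22, 23, 43]  # violates a(2) = b(4)
print(copy_check(a, b_bad, c, sigma, v1, v2))
```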
<p>And that's all there is to it!</p>
<h3 id="putting-it-all-together">Putting it all together</h3>
<p>In reality, all of this math is done not over integers, but over a prime field; check the section "A Modular Math Interlude" <a href="https://vitalik.ca/general/2017/11/22/starks_part_2.html">here</a> for a description of what prime fields are. Also, for mathematical reasons perhaps best appreciated by reading and understanding <a href="https://vitalik.ca/general/2019/05/12/fft.html">this article on FFT implementation</a>, instead of representing wire indices with <span class="math inline">\(x=0....n-1\)</span>, we'll use powers of <span class="math inline">\(\omega: 1, \omega, \omega ^2....\omega ^{n-1}\)</span> where <span class="math inline">\(\omega\)</span> is a high-order root-of-unity in the field. This changes nothing about the math, except that the coordinate pair accumulator constraint checking equation changes from <span class="math inline">\(p(x + 1) = p(x) \cdot (v_1 + X(x) + v_2 \cdot Y(x))\)</span> to <span class="math inline">\(p(\omega \cdot x) = p(x) \cdot (v_1 + X(x) + v_2 \cdot Y(x))\)</span>, and instead of using <span class="math inline">\(0..n-1\)</span>, <span class="math inline">\(n..2n-1\)</span>, <span class="math inline">\(2n..3n-1\)</span> as coordinates we use <span class="math inline">\(\omega^i, g \cdot \omega^i\)</span> and <span class="math inline">\(g^2 \cdot \omega^i\)</span> where <span class="math inline">\(g\)</span> can be some random high-order element in the field.</p>
<p>Now let's write out all the equations we need to check. First, the main gate-constraint satisfaction check:</p>
<p><span class="math display">\[
Q_{L}(x) a(x)+Q_{R}(x) b(x)+Q_{O}(x) c(x)+Q_{M}(x) a(x) b(x)+Q_{C}(x)=0
\]</span></p>
<p>Then the polynomial accumulator transition constraint (note: think of "<span class="math inline">\(= Z(x) \cdot H(x)\)</span>" as meaning "equals zero for all coordinates within some particular domain that we care about, but not necessarily outside of it"):</p>
<p><span class="math display">\[
\begin{array}{l}{P_{a}(\omega x)-P_{a}(x)\left(v_{1}+x+v_{2} a(x)\right) =Z(x) H_{1}(x)} \\ {P_{a^{\prime}}(\omega x)-P_{a^{\prime}}(x)\left(v_{1}+\sigma_{a}(x)+v_{2} a(x)\right)=Z(x) H_{2}(x)} \\ {P_{b}(\omega x)-P_{b}(x)\left(v_{1}+g x+v_{2} b(x)\right)=Z(x) H_{3}(x)} \\ {P_{b^{\prime}}(\omega x)-P_{b^{\prime}}(x)\left(v_{1}+\sigma_{b}(x)+v_{2} b(x)\right)=Z(x) H_{4}(x)} \\ {P_{c}(\omega x)-P_{c}(x)\left(v_{1}+g^{2} x+v_{2} c(x)\right)=Z(x) H_{5}(x)} \\ {P_{c^{\prime}}(\omega x)-P_{c^{\prime}}(x)\left(v_{1}+\sigma_{c}(x)+v_{2} c(x)\right)=Z(x) H_{6}(x)}\end{array}
\]</span></p>
<p>Then the polynomial accumulator starting and ending constraints:</p>
<p><span class="math display">\[
\begin{array}{l}{P_{a}(1)=P_{b}(1)=P_{c}(1)=P_{a^{\prime}}(1)=P_{b^{\prime}}(1)=P_{c^{\prime}}(1)=1} \\ {P_{a}\left(\omega^{n}\right) P_{b}\left(\omega^{n}\right) P_{c}\left(\omega^{n}\right)=P_{a^{\prime}}\left(\omega^{n}\right) P_{b^{\prime}}\left(\omega^{n}\right) P_{c^{\prime}}\left(\omega^{n}\right)}\end{array}
\]</span></p>
<p>The user-provided polynomials are:</p>
<ul>
<li>The wire assignments <span class="math inline">\(a(x), b(x), c(x)\)</span></li>
<li>The coordinate accumulators <span class="math inline">\(P_a(x), P_b(x), P_c(x), P_{a'}(x), P_{b'}(x), P_{c'}(x)\)</span></li>
<li>The quotients <span class="math inline">\(H(x)\)</span> and <span class="math inline">\(H_1(x)...H_6(x)\)</span></li>
</ul>
<p>The program-specific polynomials that the prover and verifier need to compute ahead of time are:</p>
<ul>
<li><span class="math inline">\(Q_L(x), Q_R(x), Q_O(x), Q_M(x), Q_C(x)\)</span>, which together represent the gates in the circuit (note that <span class="math inline">\(Q_C(x)\)</span> encodes public inputs, so it may need to be computed or modified at runtime)</li>
<li>The "permutation polynomials" <span class="math inline">\(\sigma_a(x), \sigma_b(x)\)</span> and <span class="math inline">\(\sigma_c(x)\)</span>, which encode the copy constraints between the <span class="math inline">\(a\)</span>, <span class="math inline">\(b\)</span> and <span class="math inline">\(c\)</span> wires</li>
</ul>
<p>Note that the verifier need only store commitments to these polynomials. The only remaining polynomial in the above equations is <span class="math inline">\(Z(x) = (x - 1) \cdot (x - \omega) \cdot ... \cdot (x - \omega ^{n-1})\)</span> which is designed to evaluate to zero at all those points. Fortunately, <span class="math inline">\(\omega\)</span> can be chosen to make this polynomial very easy to evaluate: the usual technique is to choose <span class="math inline">\(\omega\)</span> to satisfy <span class="math inline">\(\omega ^n = 1\)</span>, in which case <span class="math inline">\(Z(x) = x^n - 1\)</span>.</p>
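<p>The shortcut <span class="math inline">\(Z(x) = x^n - 1\)</span> can be checked directly in a tiny field. The sketch below uses the field of integers modulo 17 with <span class="math inline">\(n = 4\)</span> and <span class="math inline">\(\omega = 4\)</span>; these are toy parameters picked for illustration, nowhere near the sizes a real system would use.</p>

```python
MODULUS = 17
N = 4
OMEGA = 4  # 4^4 = 256 = 15 * 17 + 1, so OMEGA^N = 1 mod 17

# The evaluation domain 1, w, w^2, ..., w^(n-1)
domain = [pow(OMEGA, i, MODULUS) for i in range(N)]

def z_product(x):
    """Z(x) as the explicit product (x - 1)(x - w)...(x - w^(n-1))."""
    result = 1
    for root in domain:
        result = result * (x - root) % MODULUS
    return result

def z_closed_form(x):
    """Z(x) as x^n - 1."""
    return (pow(x, N, MODULUS) - 1) % MODULUS

print(domain)
print(all(z_product(x) == z_closed_form(x) for x in range(MODULUS)))
```

<p>The closed form needs only one exponentiation, whereas the explicit product needs <span class="math inline">\(n\)</span> multiplications, which matters when the verifier has to evaluate <span class="math inline">\(Z\)</span> at a random point.</p>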
<p>The only constraint on <span class="math inline">\(v_1\)</span> and <span class="math inline">\(v_2\)</span> is that the user must not be able to choose <span class="math inline">\(a(x), b(x)\)</span> or <span class="math inline">\(c(x)\)</span> after <span class="math inline">\(v_1\)</span> and <span class="math inline">\(v_2\)</span> become known, so we can satisfy this by computing <span class="math inline">\(v_1\)</span> and <span class="math inline">\(v_2\)</span> from hashes of commitments to <span class="math inline">\(a(x), b(x)\)</span> and <span class="math inline">\(c(x)\)</span>.</p>
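<p>A minimal sketch of deriving <span class="math inline">\(v_1\)</span> and <span class="math inline">\(v_2\)</span> from hashes of the commitments is given below. Everything specific in it is an assumption made for illustration: the choice of SHA-256, the domain-separation tags, the placeholder modulus, and the idea that commitments arrive as fixed-length byte strings. A real implementation would follow whatever transcript format its protocol specification defines.</p>

```python
import hashlib

# Placeholder prime modulus; a real system would use the scalar field of its curve.
FIELD_MODULUS = 2**255 - 19

def derive_challenges(commit_a: bytes, commit_b: bytes, commit_c: bytes):
    """Derive (v1, v2) from the wire commitments, so they are fixed only after a, b, c are."""
    # Assumes fixed-length commitments, so plain concatenation is unambiguous.
    transcript = commit_a + commit_b + commit_c
    v1 = int.from_bytes(hashlib.sha256(b"v1" + transcript).digest(), "big") % FIELD_MODULUS
    v2 = int.from_bytes(hashlib.sha256(b"v2" + transcript).digest(), "big") % FIELD_MODULUS
    return v1, v2

# Dummy 48-byte strings standing in for real commitments
dummy_a, dummy_b, dummy_c = b"\x01" * 48, b"\x02" * 48, b"\x03" * 48
v1, v2 = derive_challenges(dummy_a, dummy_b, dummy_c)
print(v1 != v2)
```

<p>Because the challenges depend on the commitments, a prover who changes <span class="math inline">\(a(x)\)</span> after seeing <span class="math inline">\(v_1\)</span> and <span class="math inline">\(v_2\)</span> ends up with different challenges and has to start over.</p>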
<p>So now we've turned the program satisfaction problem into a simple problem of satisfying a few equations with polynomials. (PLONK includes some optimizations that remove many of the polynomials in the above equations; I will not go into them here, to preserve simplicity.) But the polynomials themselves, both the program-specific parameters and the user inputs, are <strong>big</strong>. So the next question is, how do we get around this so we can make the proof short?</p>
<h2 id="polynomial-commitments">Polynomial commitments</h2>
<p>A <a href="https://pdfs.semanticscholar.org/31eb/add7a0109a584cfbf94b3afaa3c117c78c91.pdf">polynomial commitment</a> is a short object that "represents" a polynomial, and allows you to verify evaluations of that polynomial, without needing to actually contain all of the data in the polynomial. That is, if someone gives you a commitment <span class="math inline">\(c\)</span> representing <span class="math inline">\(P(x)\)</span>, they can give you a proof that can convince you, for some specific <span class="math inline">\(z\)</span>, what the value of <span class="math inline">\(P(z)\)</span> is. There is a further mathematical result that says that, over a sufficiently big field, if certain kinds of equations (chosen before <span class="math inline">\(z\)</span> is known) about polynomials evaluated at a random <span class="math inline">\(z\)</span> are true, those same equations are true about the whole polynomial as well. For example, if <span class="math inline">\(P(z) \cdot Q(z) + R(z) = S(z) + 5\)</span>, then we know that it's overwhelmingly likely that <span class="math inline">\(P(x) \cdot Q(x) + R(x) = S(x) + 5\)</span> in general. Using such polynomial commitments, we could very easily check all of the polynomial equations above - make the commitments, use them as input to generate <span class="math inline">\(z\)</span>, prove what the evaluations are of each polynomial at <span class="math inline">\(z\)</span>, and then run the equations with these evaluations instead of the original polynomials. But how do these commitments work?</p>
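<p>The "check at a random point" idea can be tried out without any commitments at all. The sketch below uses small made-up polynomials over the prime field modulo <span class="math inline">\(2^{61} - 1\)</span> (an arbitrary choice) and tests the example equation <span class="math inline">\(P \cdot Q + R = S + 5\)</span> at a random <span class="math inline">\(z\)</span>, once with an <span class="math inline">\(S\)</span> that makes it a true identity and once with one that does not.</p>

```python
import secrets

MODULUS = 2**61 - 1  # a prime; illustrative field size

def evaluate(coeffs, z):
    """Horner evaluation mod MODULUS; coeffs are lowest degree first."""
    result = 0
    for c in reversed(coeffs):
        result = (result * z + c) % MODULUS
    return result

P = [1, 1]          # x + 1
Q = [-1, 1]         # x - 1
R = [6]             # 6
S_GOOD = [0, 0, 1]  # x^2, so P*Q + R = S + 5 holds identically
S_BAD = [0, 1, 1]   # x^2 + x, so the two sides differ by x

def identity_holds_at(z, s_coeffs):
    """Check P(z) * Q(z) + R(z) == S(z) + 5 at a single point z."""
    lhs = (evaluate(P, z) * evaluate(Q, z) + evaluate(R, z)) % MODULUS
    rhs = (evaluate(s_coeffs, z) + 5) % MODULUS
    return lhs == rhs

z = secrets.randbelow(MODULUS)
print(identity_holds_at(z, S_GOOD))
print(identity_holds_at(z, S_BAD))  # False unless z happens to be 0
```

<p>A false equation between low-degree polynomials can only hold at as many points as the degree of the difference of the two sides, so a random point from a huge field almost never lands on one.</p>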
<p>There are two parts: the commitment to the polynomial <span class="math inline">\(P(x) \rightarrow c\)</span>, and the opening to a value <span class="math inline">\(P(z)\)</span> at some <span class="math inline">\(z\)</span>. To make a commitment, there are many techniques; one example is <a href="https://vitalik.ca/general/2017/11/22/starks_part_2.html">FRI</a>, and another is Kate commitments which I will describe below. To prove an opening, it turns out that there is a simple generic "subtract-and-divide" trick: to prove that <span class="math inline">\(P(z) = a\)</span>, you prove that</p>
<p><span class="math display">\[
\frac{P(x)-a}{x-z}
\]</span></p>
<p>is also a polynomial (using another polynomial commitment). This works because if the quotient is a polynomial (ie. it is not fractional), then <span class="math inline">\(x - z\)</span> is a factor of <span class="math inline">\(P(x) - a\)</span>, so <span class="math inline">\((P(x) - a)(z) = 0\)</span>, so <span class="math inline">\(P(z) = a\)</span>. Try it with some polynomial, eg. <span class="math inline">\(P(x) = x^3 + 2 \cdot x^2 + 5\)</span> with <span class="math inline">\((z = 6, a = 293)\)</span>, yourself; and try <span class="math inline">\((z = 6, a = 292)\)</span> and see how it fails (if you're lazy, see WolframAlpha <a href="https://www.wolframalpha.com/input/?i=factor+%28%28x%5E3+%2B+2*x%5E2+%2B+5%29+-+293%29+%2F+%28x+-+6%29">here</a> vs <a href="https://www.wolframalpha.com/input/?i=factor+%28%28x%5E3+%2B+2*x%5E2+%2B+5%29+-+292%29+%2F+%28x+-+6%29">here</a>). Note also a generic optimization: to prove many openings of many polynomials at the same time, after committing to the outputs do the subtract-and-divide trick on a <em>random linear combination</em> of the polynomials and the outputs.</p>
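<p>For those who would rather run the subtract-and-divide trick than click through to WolframAlpha, here is a short synthetic-division sketch over the integers (the helper name is my own). A zero remainder means the quotient is a genuine polynomial.</p>

```python
def divide_by_linear(coeffs, z):
    """Divide a polynomial (coeffs highest degree first) by (x - z).

    Returns (quotient_coeffs, remainder) using synthetic division.
    """
    running = []
    carry = 0
    for c in coeffs:
        carry = carry * z + c
        running.append(carry)
    remainder = running.pop()  # the last running value is the remainder
    return running, remainder

# P(x) = x^3 + 2x^2 + 5, so P(x) - a has coefficients [1, 2, 0, 5 - a]
quotient_ok, remainder_ok = divide_by_linear([1, 2, 0, 5 - 293], 6)
quotient_bad, remainder_bad = divide_by_linear([1, 2, 0, 5 - 292], 6)

print(quotient_ok, remainder_ok)  # x^2 + 8x + 48 with no remainder
print(remainder_bad)              # nonzero: the claimed value 292 is wrong
```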
<p>So how do the commitments themselves work? Kate commitments are, fortunately, much simpler than FRI. A trusted-setup procedure generates a set of elliptic curve points <span class="math inline">\(G, G \cdot s, G \cdot s^2\)</span> .... <span class="math inline">\(G \cdot s^n\)</span>, as well as <span class="math inline">\(G_2 \cdot s\)</span>, where <span class="math inline">\(G\)</span> and <span class="math inline">\(G_2\)</span> are the generators of two elliptic curve groups and <span class="math inline">\(s\)</span> is a secret that is forgotten once the procedure is finished (note that there is a multi-party version of this setup, which is secure as long as at least one of the participants forgets their share of the secret). These points are published and considered to be "the proving key" of the scheme; anyone who needs to make a polynomial commitment will need to use these points. A commitment to a degree-d polynomial is made by multiplying each of the first d+1 points in the proving key by the corresponding coefficient in the polynomial, and adding the results together.</p>
<p>Notice that this provides an "evaluation" of that polynomial at <span class="math inline">\(s\)</span>, without knowing <span class="math inline">\(s\)</span>. For example, <span class="math inline">\(x^3 + 2x^2+5\)</span> would be represented by <span class="math inline">\((G \cdot s^3) + 2 \cdot (G \cdot s^2) + 5 \cdot G\)</span>. We can use the notation <span class="math inline">\([P]\)</span> to refer to <span class="math inline">\(P\)</span> encoded in this way (ie. <span class="math inline">\(G \cdot P(s)\)</span>). When doing the subtract-and-divide trick, you can prove that the two polynomials actually satisfy the relation by using <a href="https://medium.com/@VitalikButerin/exploring-elliptic-curve-pairings-c73c1864e627">elliptic curve pairings</a>: check that <span class="math inline">\(e([P] - G \cdot a, G_2) = e([Q], G_2 \cdot s - G_2 \cdot z)\)</span> as a proxy for checking that <span class="math inline">\(P(x) - a = Q(x) \cdot (x - z)\)</span>. Here <span class="math inline">\(G_2 \cdot s\)</span> is the extra point from the trusted setup, and it plays the role of <span class="math inline">\(x\)</span> encoded in the second group.</p>
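<p>To see why the pairing check stands in for the polynomial relation, here is a short derivation. It assumes only the standard bilinearity property <span class="math inline">\(e(G \cdot \alpha, G_2 \cdot \beta) = e(G, G_2)^{\alpha \beta}\)</span>, and it writes the second-group encoding of <span class="math inline">\(x\)</span> as <span class="math inline">\(G_2 \cdot s\)</span>, the extra point from the trusted setup:</p>
<p><span class="math display">\[
\begin{array}{l}{e([P]-G \cdot a, G_{2})=e(G \cdot(P(s)-a), G_{2})=e(G, G_{2})^{P(s)-a}} \\ {e([Q], G_{2} \cdot s-G_{2} \cdot z)=e(G \cdot Q(s), G_{2} \cdot(s-z))=e(G, G_{2})^{Q(s) \cdot(s-z)}}\end{array}
\]</span></p>
<p>Assuming <span class="math inline">\(e(G, G_2)\)</span> generates the target group, the two sides match exactly when <span class="math inline">\(P(s) - a = Q(s) \cdot (s - z)\)</span> modulo the group order, that is, when the relation holds at the hidden point <span class="math inline">\(s\)</span>. Since nobody knows <span class="math inline">\(s\)</span>, this is the same "equation checked at an unpredictable point" argument as before.</p>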
<p>More recently, other types of polynomial commitments have been coming out too. A new scheme called DARK ("Diophantine arguments of knowledge") uses "hidden order groups" such as <a href="https://blogs.ams.org/mathgradblog/2018/02/10/introduction-ideal-class-groups/">class groups</a> to implement another kind of polynomial commitment. Hidden order groups are unique because they allow you to compress arbitrarily large numbers into group elements, even numbers much larger than the size of the group element, in a way that can't be "spoofed"; constructions from VDFs to <a href="https://ethresear.ch/t/rsa-accumulators-for-plasma-cash-history-reduction/3739">accumulators</a> to range proofs to polynomial commitments can be built on top of this. Another option is to use bulletproofs, using regular elliptic curve groups at the cost of the proof taking much longer to verify. Because polynomial commitments are much simpler than full-on zero knowledge proof schemes, we can expect more such schemes to get created in the future.</p>
<h2 id="recap">Recap</h2>
<p>To finish off, let's go over the scheme again. Given a program <span class="math inline">\(P\)</span>, you convert it into a circuit, and generate a set of equations that look like this:</p>
<p><span class="math display">\[
\left(Q_{L_{i}}\right) a_{i}+\left(Q_{R_{i}}\right) b_{i}+\left(Q_{O_{i}}\right) c_{i}+\left(Q_{M_{i}}\right) a_{i} b_{i}+Q_{C_{i}}=0
\]</span></p>
<p>You then convert this set of equations into a single polynomial equation:</p>
<p><span class="math display">\[
Q_{L}(x) a(x)+Q_{R}(x) b(x)+Q_{O}(x) c(x)+Q_{M}(x) a(x) b(x)+Q_{C}(x)=0
\]</span></p>
<p>You also generate from the circuit a list of copy constraints. From these copy constraints you generate the three polynomials representing the permuted wire indices: <span class="math inline">\(\sigma_a(x), \sigma_b(x), \sigma_c(x)\)</span>. To generate a proof, you compute the values of all the wires and convert them into three polynomials: <span class="math inline">\(a(x), b(x), c(x)\)</span>. You also compute six "coordinate pair accumulator" polynomials as part of the permutation-check argument. Finally you compute the cofactors <span class="math inline">\(H_i(x)\)</span>.</p>
<p>There is a set of equations between the polynomials that need to be checked; you can do this by making commitments to the polynomials, opening them at some random <span class="math inline">\(z\)</span> (along with proofs that the openings are correct), and running the equations on these evaluations instead of the original polynomials. The proof itself is just a few commitments and openings and can be checked with a few equations. And that's all there is to it!</p>
Sun, 22 Sep 2019 18:03:10 -0700
https://vitalik.ca/general/2019/09/22/plonk.html
https://vitalik.ca/general/2019/09/22/plonk.htmlgeneralThe Dawn of Hybrid Layer 2 Protocols<p><em>Special thanks to the Plasma Group team for review and feedback</em></p>
<p>Current approaches to layer 2 scaling - basically, Plasma and state channels - are increasingly moving from theory to practice, but at the same time it is becoming easier to see the inherent challenges in treating these techniques as a fully fledged scaling solution for Ethereum. Ethereum was arguably successful in large part because of its very easy developer experience: you write a program, publish the program, and anyone can interact with it. Designing a state channel or Plasma application, on the other hand, relies on a lot of explicit reasoning about incentives and application-specific development complexity. State channels work well for specific use cases such as repeated payments between the same two parties and two-player games (as successfully implemented in <a href="https://www.celer.network/">Celer</a>), but more generalized usage is proving challenging. Plasma, particularly <a href="https://www.learnplasma.org/en/learn/cash.html">Plasma Cash</a>, can work well for payments, but generalization similarly incurs challenges: even implementing a decentralized exchange requires clients to store much more history data, and generalizing to Ethereum-style smart contracts on Plasma seems extremely difficult.</p>
<p>But at the same time, there is a resurgence of a forgotten category of "semi-layer-2" protocols - a category which promises less extreme gains in scaling, but with the benefit of much easier generalization and more favorable security models. A <a href="https://blog.ethereum.org/2014/09/17/scalability-part-1-building-top/">long-forgotten blog post from 2014</a> introduced the idea of "shadow chains", an architecture where block data is published on-chain, but blocks are not <em>verified</em> by default. Rather, blocks are tentatively accepted, and only finalized after some period of time (eg. 2 weeks). During those 2 weeks, a tentatively accepted block can be challenged; only then is the block verified, and if the block proves to be invalid then the chain from that block on is reverted, and the original publisher's deposit is penalized. The contract does not keep track of the full state of the system; it only keeps track of the state root, and users themselves can calculate the state by processing the data submitted to the chain from start to head. A more recent proposal, <a href="https://ethresear.ch/t/on-chain-scaling-to-potentially-500-tx-sec-through-mass-tx-validation/3477">ZK Rollup</a>, does the same thing without challenge periods, by using ZK-SNARKs to verify blocks' validity.</p>
<center>
<img src="https://vitalik.ca/files/posts_files/hybrid-layer-2-files/RollupAnatomy.png"><br> <small><i>Anatomy of a ZK Rollup package that is published on-chain. Hundreds of "internal transactions" that affect the state (ie. account balances) of the ZK Rollup system are compressed into a package that contains ~10 bytes per internal transaction that specifies the state transitions, plus a ~100-300 byte SNARK proving that the transitions are all valid.</i></small>
</center>
<p><br></p>
<p>In both cases, the main chain is used to verify data <em>availability</em>, but does not (directly) verify block <em>validity</em> or perform any significant computation, unless challenges are made. This technique is thus not a jaw-droppingly huge scalability gain, because the on-chain data overhead eventually presents a bottleneck, but it is nevertheless a very significant one. Data is cheaper than computation, and there are ways to compress transaction data very significantly, particularly because the great majority of data in a transaction is the signature and many signatures can be compressed into one through many forms of aggregation. ZK Rollup promises 500 tx/sec, a 30x gain over the Ethereum chain itself, by compressing each transaction to a mere ~10 bytes; signatures do not need to be included because their validity is verified by the zero-knowledge proof. With BLS aggregate signatures a similar throughput can be achieved in shadow chains (more recently called "optimistic rollup" to highlight its similarities to ZK Rollup). The upcoming <a href="https://eth.wiki/en/roadmap/istanbul">Istanbul hard fork</a> will reduce the gas cost of data from 68 per byte to 16 per byte, increasing the throughput of these techniques by another 4x (that's <strong>over 2000 transactions per second</strong>).</p>
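<p>As a rough check on the arithmetic above, here is a minimal sketch that assumes throughput is limited purely by the gas cost of on-chain data, so it scales inversely with the per-byte cost. The constant and function names are illustrative; the numbers are the ones quoted in the text.</p>

```python
# Figures quoted in the text; the names are illustrative only.
OLD_GAS_PER_BYTE = 68   # gas cost of data before Istanbul
NEW_GAS_PER_BYTE = 16   # gas cost of data after Istanbul
CURRENT_TPS = 500       # ZK Rollup throughput claimed before Istanbul


def scaled_throughput(tps, old_cost, new_cost):
    """Assumption: throughput is inversely proportional to data gas cost."""
    return tps * old_cost / new_cost


gain = OLD_GAS_PER_BYTE / NEW_GAS_PER_BYTE
post_istanbul_tps = scaled_throughput(CURRENT_TPS, OLD_GAS_PER_BYTE, NEW_GAS_PER_BYTE)
print(gain, post_istanbul_tps)
```

<p>Under that assumption the gain is slightly more than 4x, which is where the "over 2000" figure comes from.</p>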
<br>
<hr />
<p><br><br></p>
<p>So what is the benefit of data on-chain techniques such as ZK/optimistic rollup versus data off-chain techniques such as Plasma? First of all, there is no need for semi-trusted operators. In ZK Rollup, because validity is verified by cryptographic proofs there is literally no way for a package submitter to be malicious (depending on the setup, a malicious submitter may cause the system to halt for a few seconds, but this is the most harm that can be done). In optimistic rollup, a malicious submitter can publish a bad block, but the next submitter will immediately challenge that block before publishing their own. In both ZK and optimistic rollup, enough data is published on chain to allow anyone to compute the complete internal state, simply by processing all of the submitted deltas in order, and there is no "data withholding attack" that can take this property away. Hence, becoming an operator can be fully permissionless; all that is needed is a security deposit (eg. 10 ETH) for anti-spam purposes.</p>
<p>Second, optimistic rollup particularly is vastly easier to generalize; the state transition function in an optimistic rollup system can be literally anything that can be computed within the gas limit of a single block (including the Merkle branches providing the parts of the state needed to verify the transition). ZK Rollup is theoretically generalizable in the same way, though in practice making ZK-SNARKs over general-purpose computation (such as EVM execution) is very difficult, at least for now. Third, optimistic rollup is much easier to build clients for, as there is less need for second-layer networking infrastructure; more can be done by just scanning the blockchain.</p>
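<p>The optimistic rollup flow described above (publish data, accept tentatively, verify only on challenge, revert and penalize on fraud) can be sketched as a toy model. Everything here is hypothetical: the class and function names are invented, the "state root" is a plain hash rather than a Merkle root, and challenge periods, deposits and finalization are left out. It also assumes a challenge always targets the earliest invalid block.</p>

```python
import hashlib


def state_root(state):
    """Toy stand-in for a Merkle state root: a hash of the sorted balances."""
    return hashlib.sha256(repr(sorted(state.items())).encode()).hexdigest()


def apply_block(state, transfers):
    """Apply (sender, recipient, amount) transfers; return None if any is invalid."""
    new_state = dict(state)
    for sender, recipient, amount in transfers:
        if amount < 0 or new_state.get(sender, 0) < amount:
            return None
        new_state[sender] -= amount
        new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


class ToyOptimisticRollup:
    def __init__(self, genesis_state):
        self.genesis_state = dict(genesis_state)
        self.blocks = []    # published data: (submitter, transfers, claimed_root)
        self.slashed = []   # submitters whose blocks were proven invalid

    def submit(self, submitter, transfers, claimed_root):
        # Tentative acceptance: the data is stored but not verified.
        self.blocks.append((submitter, transfers, claimed_root))

    def replay(self, upto):
        """Recompute the state by processing published blocks [0, upto) in order."""
        state = dict(self.genesis_state)
        for _, transfers, _ in self.blocks[:upto]:
            state = apply_block(state, transfers)  # assumes this prefix is valid
        return state

    def challenge(self, index):
        """Verify one block; on fraud, revert it and every later block."""
        submitter, transfers, claimed_root = self.blocks[index]
        post_state = apply_block(self.replay(index), transfers)
        if post_state is not None and state_root(post_state) == claimed_root:
            return False  # the block was valid, so the challenge fails
        self.slashed.append(submitter)
        del self.blocks[index:]
        return True


rollup = ToyOptimisticRollup({"alice": 10, "bob": 0})
rollup.submit("op1", [("alice", "bob", 4)], state_root({"alice": 6, "bob": 4}))
rollup.submit("op2", [("bob", "alice", 100)], state_root({"alice": 106, "bob": 0}))
honest_challenged = rollup.challenge(0)
fraud_challenged = rollup.challenge(1)
```

<p>Because all transfers are published, anyone can run <code>replay</code> and take over as a submitter; no data needs to be requested from the previous one.</p>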
<p>But where do these advantages come from? The answer lies in a highly technical issue known as the <em>data availability problem</em> (see <a href="https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding">note</a>, <a href="https://www.youtube.com/watch?v=OJT_fR7wexw">video</a>). Basically, there are two ways to try to cheat in a layer-2 system. The first is to publish invalid data to the blockchain. The second is to not publish data at all (eg. in Plasma, publishing the root hash of a new Plasma block to the main chain but without revealing the contents of the block to anyone). Published-but-invalid data is very easy to deal with, because once the data is published on-chain there are multiple ways to figure out unambiguously whether or not it's valid, and an invalid submission is unambiguously invalid so the submitter can be heavily penalized. Unavailable data, on the other hand, is much harder to deal with, because even though unavailability can be detected if challenged, one cannot reliably determine whose fault the non-publication is, especially if data is withheld by default and revealed on-demand only when some verification mechanism tries to verify its availability. This is illustrated in the "Fisherman's dilemma", which shows how a challenge-response game cannot distinguish between malicious submitters and malicious challengers:</p>
<center>
<img src="https://raw.githubusercontent.com/vbuterin/diagrams/master/fisherman_dilemma_1.png"> <br><br> <small><i>Fisherman's dilemma. If you only start watching the given specific piece of data at time T3, you have no idea whether you are living in Case 1 or Case 2, and hence who is at fault.</i></small>
</center>
<p><br></p>
<p>Plasma and channels both work around the fisherman's dilemma by pushing the problem to users: if you as a user decide that another user you are interacting with (a counterparty in a state channel, an operator in a Plasma chain) is not publishing data to you that they should be publishing, it's your responsibility to exit and move to a different counterparty/operator. The fact that you as a user have all of the <em>previous</em> data, and data about all of the transactions <em>you</em> signed, allows you to prove to the chain what assets you held inside the layer-2 protocol, and thus safely bring them out of the system. You prove the existence of a (previously agreed) operation that gave the asset to you, no one else can prove the existence of an operation approved by you that sent the asset to someone else, so you get the asset.</p>
<p>The technique is very elegant. However, it relies on a key assumption: that every state object has a logical "owner", and the state of the object cannot be changed without the owner's consent. This works well for UTXO-based payments (but not account-based payments, where you <em>can</em> edit someone else's balance <em>upward</em> without their consent; this is why account-based Plasma is so hard), and it can even be made to work for a decentralized exchange, but this "ownership" property is far from universal. Some applications, eg. <a href="http://uniswap.exchange">Uniswap</a>, don't have a natural owner, and even in those applications that do, there are often multiple people that can legitimately make edits to the object. And there is no way to allow arbitrary third parties to exit an asset without introducing the possibility of denial-of-service (DoS) attacks, precisely because one cannot prove whether the publisher or submitter is at fault.</p>
<p>There are other issues peculiar to Plasma and channels individually. Channels do not allow off-chain transactions to users that are not already part of the channel (argument: suppose there existed a way to send $1 to an arbitrary new user from inside a channel. Then this technique could be used many times in parallel to send $1 to more users than there are funds in the system, already breaking its security guarantee). Plasma requires users to store large amounts of history data, which gets even bigger when different assets can be intertwined (eg. when an asset is transferred conditional on transfer of another asset, as happens in a decentralized exchange with a single-stage order book mechanism).</p>
<p>Because data-on-chain computation-off-chain layer 2 techniques don't have data availability issues, they have none of these weaknesses. ZK and optimistic rollup take great care to put enough data on chain to allow users to calculate the full state of the layer 2 system, ensuring that if any participant disappears a new one can trivially take their place. The only issue that they have is verifying computation without doing the computation on-chain, which is a much easier problem. And the scalability gains are significant: ~10 bytes per transaction in ZK Rollup, and a similar level of scalability can be achieved in optimistic rollup by using BLS aggregation to aggregate signatures. This corresponds to a theoretical maximum of ~500 transactions per second today, and over 2000 post-Istanbul.</p>
<br>
<hr />
<p><br><br></p>
<p>But what if you want more scalability? Then there is a large middle ground between data-on-chain layer 2 and data-off-chain layer 2 protocols, with many hybrid approaches that give you some of the benefits of both. To give a simple example, the history storage blowup in a decentralized exchange implemented on Plasma Cash can be prevented by publishing a mapping of which orders are matched with which orders (that's less than 4 bytes per order) on chain:</p>
<center>
<img src="https://vitalik.ca/files/posts_files/hybrid-layer-2-files/Plasma%20Cash%200.png" style="width:180px; padding: 40px"> <img src="https://vitalik.ca/files/posts_files/hybrid-layer-2-files/Plasma%20Cash%201.png" style="width:180px; padding: 40px"> <img src="https://vitalik.ca/files/posts_files/hybrid-layer-2-files/Plasma%20Cash%202.png" style="width:180px; padding: 40px"><br> <small><i><b>Left</b>: History data a Plasma Cash user needs to store if they own 1 coin. <b>Middle:</b> History data a Plasma Cash user needs to store if they own 1 coin that was exchanged with another coin using an atomic swap. <b>Right</b>: History data a Plasma Cash user needs to store if the order matching is published on chain.</i></small>
</center>
<p><br></p>
<p>Even outside of the decentralized exchange context, the amount of history that users need to store in Plasma can be reduced by having the Plasma chain periodically publish some per-user data on-chain. One could also imagine a platform which works like Plasma in the case where some state <em>does</em> have a logical "owner" and works like ZK or optimistic rollup in the case where it does not. Plasma developers <a href="https://plasma.build/t/rollup-plasma-for-mass-exits-complex-disputes/90">are already starting to work</a> on these kinds of optimizations.</p>
<p>There is thus a strong case to be made for developers of layer 2 scalability solutions to be more willing to publish per-user data on-chain at least some of the time: it greatly increases ease of development, generality and security, and reduces per-user load (eg. no need for users to store history data). The efficiency losses of doing so are also overstated: even in a fully off-chain layer-2 architecture, users depositing, withdrawing and moving between different counterparties and providers is going to be an inevitable and frequent occurrence, and so there will be a significant amount of per-user on-chain data regardless. The hybrid route opens the door to a relatively fast deployment of fully generalized Ethereum-style smart contracts inside a quasi-layer-2 architecture.</p>
<p>See also:</p>
<ul>
<li><a href="https://medium.com/@plasma_group/db253287af50">Introducing the OVM</a></li>
<li><a href="https://medium.com/plasma-group/ethereum-smart-contracts-in-l2-optimistic-rollup-2c1cef2ec537">Blog post by Karl Floersch</a></li>
<li><a href="https://ethresear.ch/t/minimal-viable-merged-consensus/5617">Related ideas by John Adler</a></li>
</ul>
Wed, 28 Aug 2019 18:03:10 -0700
https://vitalik.ca/general/2019/08/28/hybrid_layer_2.html
Sidechains vs Plasma vs Sharding
<p><em>Special thanks to Jinglan Wang for review and feedback</em></p>
<p>One question that often comes up is: how exactly is sharding different from sidechains or Plasma? All three architectures seem to involve a hub-and-spoke architecture with a central "main chain" that serves as the consensus backbone of the system, and a set of "child" chains containing actual user-level transactions. Hashes from the child chains are usually periodically published into the main chain (sharded chains with no hub are theoretically possible but haven't been done so far; this article will not focus on them, but the arguments are similar). Given this fundamental similarity, why go with one approach over the others?</p>
<p>Distinguishing sidechains from Plasma is simple. Plasma chains are sidechains that have a non-custodial property: if there is any error in the Plasma chain, then the error can be detected, and users can safely exit the Plasma chain and prevent the attacker from doing any lasting damage. The only cost that users suffer is that they must wait for a challenge period and pay some higher transaction fees on the (non-scalable) base chain. Regular sidechains do not have this safety property, so they are less secure. However, designing Plasma chains is in many cases much harder, and one could argue that for many low-value applications the security is not worth the added complexity.</p>
<p>So what about Plasma versus sharding? The key technical difference has to do with the notion of <strong>tight coupling</strong>. Tight coupling is a property of sharding, but NOT a property of sidechains or Plasma, that says that the validity of the main chain ("beacon chain" in Ethereum 2.0) is inseparable from the validity of the child chains. That is, a child chain block that specifies an invalid main chain block as a dependency is by definition invalid, and more importantly a main chain block that includes an invalid child chain block is by definition invalid.</p>
<p>In non-sharded blockchains, this idea that the canonical chain (ie. the chain that everyone accepts as representing the "real" history) is <em>by definition</em> fully available and valid also applies; for example in the case of Bitcoin and Ethereum one typically says that the canonical chain is the "longest valid chain" (or, more pedantically, the "heaviest valid and available chain"). In sharded blockchains, this idea that the canonical chain is the heaviest valid and available chain <em>by definition</em> also applies, with the validity and availability requirement applying to both the main chain and shard chains. The new challenge that a sharded system has, however, is that users have no way of fully verifying the validity and availability of any given chain <em>directly</em>, because there is too much data. The challenge of engineering sharded chains is to get around this limitation by giving users a maximally trustless and practical <em>indirect</em> means to verify which chains are fully available and valid, so that they can still determine which chain is canonical. In practice, this includes techniques like committees, SNARKs/STARKs, fisherman schemes and <a href="https://arxiv.org/abs/1809.09044">fraud and data availability proofs</a>.</p>
<p>If a chain structure does not have this tight-coupling property, then it is arguably not a layer-1 sharding scheme, but rather a layer-2 system sitting on top of a non-scalable layer-1 chain. Plasma is not a tightly-coupled system: an invalid Plasma block absolutely can have its header be committed into the main Ethereum chain, because the Ethereum base layer has no idea that it represents an invalid Plasma block, or even that it represents a Plasma block at all; all that it sees is a transaction containing a small piece of data. However, the consequences of a single Plasma chain failing are localized to within that Plasma chain.</p>
<center>
<table border="1">
<tr>
<td>
<b>Sharding</b>
</td>
<td>
Try really hard to ensure total validity/availability of every part of the system
</td>
</tr>
<tr>
<td>
<b>Plasma</b>
</td>
<td>
Accept local faults but try to limit their consequences
</td>
</tr>
</table>
</center>
<p><br></p>
<p>However, if you try to analyze the process of <em>how</em> users perform the "indirect validation" procedure to determine if the chain they are looking at is fully valid and available without downloading and executing the whole thing, one can find more similarities with how Plasma works. For example, a common technique used to prevent availability issues is fishermen: if a node sees a given piece of a block as unavailable, it can publish a challenge claiming this, creating a time period within which anyone can publish that piece of data. If a challenge goes unanswered for long enough, the block and all blocks that cite it as a dependency can be reverted. This seems fundamentally similar to Plasma, where if a block is unavailable users can publish a message to the main chain to exit their state in response. Both techniques eventually buckle under pressure in the same way: if there are too many false challenges in a sharded system, then users cannot keep track of whether or not all of the availability challenges have been answered, and if there are too many availability challenges in a Plasma system then the main chain could get overwhelmed as the exits fill up the chain's block size limit. In both cases, it seems like there's a system that has nominally <span class="math inline">\(O(C^2)\)</span> scalability (where <span class="math inline">\(C\)</span> is the computing power of one node) but where scalability falls to <span class="math inline">\(O(C)\)</span> in the event of an attack. However, sharding has more defenses against this.</p>
<p>First of all, modern sharded designs use randomly sampled committees, so one cannot easily dominate even one committee enough to produce a fake block unless one has a large portion (perhaps <span class="math inline">\(>\frac{1}{3}\)</span>) of the entire validator set of the chain. Second, there are better strategies for handling data availability than fishermen: data availability proofs. In a scheme using data availability proofs, if a block is <em>unavailable</em>, then clients' data availability checks will fail and clients will see that block as unavailable. If the block is <em>invalid</em>, then even a single fraud proof will convince them of this fact for an entire block. An <span class="math inline">\(O(1)\)</span>-sized fraud proof can convince a client of the invalidity of an <span class="math inline">\(O(C)\)</span>-sized block, and so <span class="math inline">\(O(C)\)</span> data suffices to convince a client of the invalidity of <span class="math inline">\(O(C^2)\)</span> data (this is in the worst case where the client is dealing with <span class="math inline">\(N\)</span> sister blocks all with the same parent of which only one is valid; in more likely cases, one single fraud proof suffices to prove invalidity of an entire invalid chain). Hence, sharded systems are theoretically less vulnerable to being overwhelmed by denial-of-service attacks than Plasma chains.</p>
<p>Third, sharded chains provide stronger guarantees in the face of large and majority attackers (with more than <span class="math inline">\(\frac{1}{3}\)</span> or even <span class="math inline">\(\frac{1}{2}\)</span> of the validator set). A Plasma chain can always be successfully attacked by a 51% attack on the main chain that censors exits; a sharded chain cannot. This is because data availability proofs and fraud proofs happen <em>inside the client</em>, rather than <em>inside the chain</em>, so they cannot be censored by 51% attacks. Fourth, the defenses provided by sharded chains are easier to generalize; Plasma's model of exits requires state to be separated into discrete pieces each of which is in the interest of any single actor to maintain, whereas sharded chains relying on data availability proofs, fraud proofs, fishermen and random sampling are theoretically universal.</p>
<p>So there really is a large difference between validity and availability guarantees that are provided at layer 2, which are limited and more complex as they require explicit reasoning about incentives and which party has an interest in which pieces of state, and guarantees that are provided by a layer 1 system that is committed to fully satisfying them.</p>
<p>But Plasma chains have large advantages too. First, they can be iterated and new designs can be implemented more quickly, as each Plasma chain can be deployed separately without coordinating the rest of the ecosystem. Second, sharding is inherently more fragile, as it attempts to guarantee absolute and total availability and validity of some quantity of data, and this quantity must be set in the protocol; too little, and the system has less scalability than it could have had, too much, and the entire system risks breaking. The maximum safe level of scalability also depends on the number of users of the system, which is an unpredictable variable. Plasma chains, on the other hand, allow different users to make different tradeoffs in this regard, and allow users to adjust more flexibly to changes in circumstances.</p>
<p>Single-operator Plasma chains can also be used to offer more privacy than sharded systems, where all data is public. Even where privacy is not desired, they are potentially more efficient, because the total data availability requirement of sharded systems requires a large extra level of redundancy as a safety margin. In Plasma systems, on the other hand, data requirements for each piece of data can be minimized, to the point where in the long term each individual piece of data may only need to be replicated a few times, rather than a thousand times as is the case in sharded systems.</p>
<p>Hence, in the long term, a hybrid system where a sharded base layer exists, and Plasma chains exist on top of it to provide further scalability, seems like the most likely approach, more able to serve different groups of users' needs than sole reliance on one strategy or the other. And it is unfortunately <em>not</em> the case that at a sufficient level of advancement Plasma and sharding collapse into the same design; the two are in some key ways irreducibly different (eg. the data availability checks made by clients in sharded systems <em>cannot</em> be moved to the main chain in Plasma because these checks only work if they are done subjectively and based on private information). But both scalability solutions (as well as state channels!) have a bright future ahead of them.</p>
Wed, 12 Jun 2019 18:03:10 -0700
https://vitalik.ca/general/2019/06/12/plasma_vs_sharding.html
Fast Fourier Transforms
<p>
<em>Trigger warning: specialized mathematical topic</em>
</p>
<p>
<em>Special thanks to Karl Floersch for feedback</em>
</p>
<p>
One of the more interesting algorithms in number theory is the Fast Fourier transform (FFT). FFTs are a key building block in many algorithms, including <a href="http://www.math.clemson.edu/~sgao/papers/GM10.pdf">extremely fast multiplication of large numbers</a>, multiplication of polynomials, and extremely fast generation and recovery of <a href="https://blog.ethereum.org/2014/08/16/secret-sharing-erasure-coding-guide-aspiring-dropbox-decentralizer">erasure codes</a>. Erasure codes in particular are highly versatile; in addition to their basic use cases in fault-tolerant data storage and recovery, erasure codes also have more advanced use cases such as <a href="https://arxiv.org/pdf/1809.09044">securing data availability in scalable blockchains</a> and <a href="https://vitalik.ca/general/2017/11/09/starks_part_1.html">STARKs</a>. This article will go into what fast Fourier transforms are, and how some of the simpler algorithms for computing them work.
</p>
<h3>
Background
</h3>
<p>
The original <a href="https://en.wikipedia.org/wiki/Fourier_transform">Fourier transform</a> is a mathematical operation that is often described as converting data between the "frequency domain" and the "time domain". What this means more precisely is that if you have a piece of data, then running the algorithm would come up with a collection of sine waves with different frequencies and amplitudes that, if you added them together, would approximate the original data. Fourier transforms can be used for such wonderful things as <a href="https://twitter.com/johncarlosbaez/status/1094671748501405696">expressing square orbits through epicycles</a> and <a href="https://en.wikipedia.org/wiki/Fourier_transform">deriving a set of equations that can draw an elephant</a>:
</p>
<p>
<center>
<table>
<tr>
<td>
<img src="http://vitalik.ca/files/posts_files/fft-files/elephant1.png" /><br> <img src="http://vitalik.ca/files/posts_files/fft-files/elephant3.png" />
</td>
<td>
<img src="http://vitalik.ca/files/posts_files/fft-files/elephant2.png" width="400px"/>
</td>
</tr>
</table>
<br> <small><i>Ok fine, Fourier transforms also have really important applications in signal processing, quantum mechanics, and other areas, and help make significant parts of the global economy happen. But come on, elephants are cooler.</i></small>
</center>
<br>
</p>
<p>
Running the Fourier transform algorithm in the "inverse" direction would simply take the sine waves and add them together and compute the resulting values at as many points as you wanted to sample.
</p>
<p>
The kind of Fourier transform we'll be talking about in this post is a similar algorithm, except instead of being a <em>continuous</em> Fourier transform over <em>real or complex numbers</em>, it's a <em><strong>discrete Fourier transform</strong></em> over <em>finite fields</em> (see the "A Modular Math Interlude" section <a href="https://vitalik.ca/general/2017/11/22/starks_part_2.html">here</a> for a refresher on what finite fields are). Instead of talking about converting between "frequency domain" and "time domain", here we'll talk about two different operations: <em>multi-point polynomial evaluation</em> (evaluating a degree <span class="math inline">\(< N\)</span> polynomial at <span class="math inline">\(N\)</span> different points) and its inverse, <em>polynomial interpolation</em> (given the evaluations of a degree <span class="math inline">\(< N\)</span> polynomial at <span class="math inline">\(N\)</span> different points, recovering the polynomial). For example, if we are operating in the prime field with modulus 5, then the polynomial <span class="math inline">\(y = x^2 + 3\)</span> (for convenience we can write the coefficients in increasing order: <span class="math inline">\([3,0,1]\)</span>) evaluated at the points <span class="math inline">\([0,1,2]\)</span> gives the values <span class="math inline">\([3,4,2]\)</span> (not <span class="math inline">\([3, 4, 7]\)</span> because we're operating in a finite field where the numbers wrap around at 5), and we can actually take the evaluations <span class="math inline">\([3,4,2]\)</span> and the coordinates they were evaluated at (<span class="math inline">\([0,1,2]\)</span>) to recover the original polynomial <span class="math inline">\([3,0,1]\)</span>.
</p>
<p>
There are algorithms for both multi-point evaluation and interpolation that can do either operation in <span class="math inline">\(O(N^2)\)</span> time. Multi-point evaluation is simple: just separately evaluate the polynomial at each point. Here's Python code for evaluating a polynomial at a single point:
</p>
<pre>
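def eval_poly_at(poly, x, modulus):
    # poly holds the coefficients in increasing order of degree.
    y = 0
    power_of_x = 1
    for coefficient in poly:
        # Reduce at every step so the intermediate values stay small.
        y = (y + power_of_x * coefficient) % modulus
        power_of_x = (power_of_x * x) % modulus
    return y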
</pre>
<p>
The algorithm runs a loop going through every coefficient and does one thing for each coefficient, so it runs in <span class="math inline">\(O(N)\)</span> time. Multi-point evaluation involves doing this evaluation at <span class="math inline">\(N\)</span> different points, so the total run time is <span class="math inline">\(O(N^2)\)</span>.
</p>
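<p>
As a usage sketch, multi-point evaluation is just a loop over the points. The wrapper name <code>multi_point_eval</code> is made up for illustration; the example values are the modulus-5 ones from earlier in the post.
</p>

```python
def eval_poly_at(poly, x, modulus):
    """Evaluate poly (coefficients in increasing order of degree) at x."""
    y = 0
    power_of_x = 1
    for coefficient in poly:
        y = (y + power_of_x * coefficient) % modulus
        power_of_x = (power_of_x * x) % modulus
    return y


def multi_point_eval(poly, xs, modulus):
    """Evaluate poly at every point in xs: N evaluations of O(N) work each."""
    return [eval_poly_at(poly, x, modulus) for x in xs]


# y = x^2 + 3 over the prime field with modulus 5, evaluated at 0, 1 and 2.
values = multi_point_eval([3, 0, 1], [0, 1, 2], 5)
print(values)
```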
<p>
Lagrange interpolation is more complicated (search for "Lagrange interpolation" <a href="https://blog.ethereum.org/2014/08/16/secret-sharing-erasure-coding-guide-aspiring-dropbox-decentralizer/">here</a> for a more detailed explanation). The key building block of the basic strategy is that for any domain <span class="math inline">\(D\)</span> and point <span class="math inline">\(x\)</span>, we can construct a polynomial that returns <span class="math inline">\(1\)</span> for <span class="math inline">\(x\)</span> and <span class="math inline">\(0\)</span> for any value in <span class="math inline">\(D\)</span> other than <span class="math inline">\(x\)</span>. For example, if <span class="math inline">\(D = [1,2,3,4]\)</span> and <span class="math inline">\(x = 1\)</span>, the polynomial is:
</p>
<p><span class="math display">\[
y = \frac{(x-2)(x-3)(x-4)}{(1-2)(1-3)(1-4)}
\]</span></p>
<p>
You can mentally plug in <span class="math inline">\(1\)</span>, <span class="math inline">\(2\)</span>, <span class="math inline">\(3\)</span> and <span class="math inline">\(4\)</span> to the above expression and verify that it returns <span class="math inline">\(1\)</span> for <span class="math inline">\(x= 1\)</span> and <span class="math inline">\(0\)</span> in the other three cases.
</p>
<p>
We can recover the polynomial that gives any desired set of outputs on the given domain by multiplying and adding these polynomials. If we call the above polynomial <span class="math inline">\(P_1\)</span>, and the equivalent ones for <span class="math inline">\(x=2\)</span>, <span class="math inline">\(x=3\)</span>, <span class="math inline">\(x=4\)</span>, <span class="math inline">\(P_2\)</span>, <span class="math inline">\(P_3\)</span> and <span class="math inline">\(P_4\)</span>, then the polynomial that returns <span class="math inline">\([3,1,4,1]\)</span> on the domain <span class="math inline">\([1,2,3,4]\)</span> is simply <span class="math inline">\(3 \cdot P_1 + P_2 + 4 \cdot P_3 + P_4\)</span>. Computing the <span class="math inline">\(P_i\)</span> polynomials takes <span class="math inline">\(O(N^2)\)</span> time (you first construct the polynomial that returns 0 on the entire domain, which takes <span class="math inline">\(O(N^2)\)</span> time, then separately divide it by <span class="math inline">\((x - x_i)\)</span> for each <span class="math inline">\(x_i\)</span>), and computing the linear combination takes another <span class="math inline">\(O(N^2)\)</span> time, so it's <span class="math inline">\(O(N^2)\)</span> runtime total.
</p>
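<p>
The strategy just described can be sketched over a prime field as follows. This is one possible implementation, not necessarily the one used elsewhere; the helper names (<code>zero_poly</code>, <code>divide_by_linear</code>, <code>lagrange_interp</code>) are invented, and the modular inverse uses Fermat's little theorem, which assumes the modulus is prime.
</p>

```python
def eval_poly_at(poly, x, modulus):
    """Evaluate poly (coefficients in increasing order of degree) at x."""
    y, power_of_x = 0, 1
    for coefficient in poly:
        y = (y + power_of_x * coefficient) % modulus
        power_of_x = (power_of_x * x) % modulus
    return y


def zero_poly(xs, modulus):
    """Polynomial that returns 0 on the entire domain: product of (x - x_i)."""
    root = [1]
    for x_i in xs:
        # Multiply by (x - x_i): shift up one degree, subtract x_i times root.
        root = [(a - x_i * b) % modulus for a, b in zip([0] + root, root + [0])]
    return root


def divide_by_linear(poly, x_i, modulus):
    """Synthetic division of poly by (x - x_i); assumes x_i is a root of poly."""
    quotient = [0] * (len(poly) - 1)
    carry = 0
    for k in reversed(range(len(quotient))):
        carry = (poly[k + 1] + carry * x_i) % modulus
        quotient[k] = carry
    return quotient


def lagrange_interp(xs, ys, modulus):
    """Return the coefficients of the degree < N polynomial through (xs, ys)."""
    root = zero_poly(xs, modulus)
    result = [0] * len(xs)
    for x_i, y_i in zip(xs, ys):
        numerator = divide_by_linear(root, x_i, modulus)
        denominator = eval_poly_at(numerator, x_i, modulus)
        # Modular inverse via Fermat's little theorem (prime modulus only).
        scale = y_i * pow(denominator, modulus - 2, modulus) % modulus
        for k, coefficient in enumerate(numerator):
            result[k] = (result[k] + scale * coefficient) % modulus
    return result


# Recover y = x^2 + 3 from its evaluations in the field with modulus 5.
recovered = lagrange_interp([0, 1, 2], [3, 4, 2], 5)
print(recovered)
```

<p>
Each of the <span class="math inline">\(N\)</span> loop iterations does <span class="math inline">\(O(N)\)</span> work, matching the <span class="math inline">\(O(N^2)\)</span> total stated above.
</p>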
<p>
What Fast Fourier transforms let us do is make both multi-point evaluation and interpolation much faster.
</p>
<h3>
Fast Fourier Transforms
</h3>
<p>
There is a price you have to pay for using this much faster algorithm, which is that you cannot choose any arbitrary field and any arbitrary domain. With Lagrange interpolation, you could choose whatever x coordinates and y coordinates you wanted, and whatever field you wanted (you could even do it over plain old real numbers), and you could get a polynomial that passes through them. With an FFT, you have to use a finite field, and the domain must be a <em>multiplicative subgroup</em> of the field (that is, a list of powers of some "generator" value). For example, you could use the finite field of integers modulo <span class="math inline">\(337\)</span>, and for the domain use <span class="math inline">\([1, 85, 148, 111, 336, 252, 189, 226]\)</span> (that's the powers of <span class="math inline">\(85\)</span> in the field, eg. <span class="math inline">\(85^3\)</span> % <span class="math inline">\(337 = 111\)</span>; it stops at <span class="math inline">\(226\)</span> because the next power of <span class="math inline">\(85\)</span> cycles back to <span class="math inline">\(1\)</span>). Furthermore, the multiplicative subgroup must have size <span class="math inline">\(2^n\)</span> (there are ways to make it work for numbers of the form <span class="math inline">\(2^{m} \cdot 3^n\)</span> and possibly slightly higher prime powers but then it gets much more complicated and inefficient). The finite field of integers modulo <span class="math inline">\(59\)</span>, for example, would not work, because there are only multiplicative subgroups of order <span class="math inline">\(2\)</span>, <span class="math inline">\(29\)</span> and <span class="math inline">\(58\)</span>; <span class="math inline">\(2\)</span> is too small to be interesting, and the factor <span class="math inline">\(29\)</span> is far too large to be FFT-friendly.
The symmetry that comes from multiplicative groups of size <span class="math inline">\(2^n\)</span> lets us create a recursive algorithm that quite cleverly calculates the results we need from a much smaller amount of work.
</p>
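<p>
To make the domain requirement concrete, here is a minimal sketch that lists the powers of a generator until they cycle back to <span class="math inline">\(1\)</span>. The helper name <code>multiplicative_subgroup</code> is made up for this illustration, and the code assumes the generator is nonzero modulo a prime modulus (otherwise the loop would never return to <span class="math inline">\(1\)</span>).
</p>

```python
def multiplicative_subgroup(generator, modulus):
    """List generator**0, generator**1, ... mod modulus, stopping before the cycle returns to 1.

    Hypothetical helper; assumes generator is nonzero modulo a prime modulus.
    """
    domain = [1]
    value = generator % modulus
    while value != 1:
        domain.append(value)
        value = value * generator % modulus
    return domain


domain = multiplicative_subgroup(85, 337)
print(domain)  # [1, 85, 148, 111, 336, 252, 189, 226]

# Modulo 59, every subgroup size divides 58 = 2 * 29, so no useful power of two shows up.
# (Size 1 is the trivial subgroup generated by 1.)
sizes_mod_59 = sorted({len(multiplicative_subgroup(g, 59)) for g in range(1, 59)})
print(sizes_mod_59)  # [1, 2, 29, 58]
```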
<p>
To understand the algorithm and why it has a low runtime, it's important to understand the general concept of recursion. A recursive algorithm is an algorithm that has two cases: a "base case" where the input to the algorithm is small enough that you can give the output directly, and the "recursive case" where the required computation consists of some "glue computation" plus one or more uses of the same algorithm on smaller inputs. For example, you might have seen recursive algorithms being used for sorting lists. If you have a list (eg. <span class="math inline">\([1,8,7,4,5,6,3,2,9]\)</span>), then you can sort it using the following procedure:
</p>
<ul>
<li>
If the input has one element, then it's already "sorted", so you can just return the input.
</li>
<li>
If the input has more than one element, then separately sort the first half of the list and the second half of the list, and then merge the two sorted sub-lists (call them <span class="math inline">\(A\)</span> and <span class="math inline">\(B\)</span>) as follows. Maintain two counters, <span class="math inline">\(apos\)</span> and <span class="math inline">\(bpos\)</span>, both starting at zero, and maintain an output list, which starts empty. Until either <span class="math inline">\(apos\)</span> or <span class="math inline">\(bpos\)</span> is at the end of the corresponding list, check if <span class="math inline">\(A[apos]\)</span> or <span class="math inline">\(B[bpos]\)</span> is smaller. Whichever is smaller, add that value to the end of the output list, and increase that counter by <span class="math inline">\(1\)</span>. Once this is done, add the rest of whatever list has not been fully processed to the end of the output list, and return the output list.
</li>
</ul>
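<p>
The two steps above can be sketched in Python roughly as follows. The function name <code>merge_sort</code> is my own choice, and the base case also accepts an empty list, which the description above does not cover.
</p>

```python
def merge_sort(values):
    # Base case: zero or one element is already sorted
    if len(values) <= 1:
        return values
    # Recursive case: sort each half separately
    middle = len(values) // 2
    A = merge_sort(values[:middle])
    B = merge_sort(values[middle:])
    # Glue: merge the two sorted halves in a single pass
    apos, bpos, output = 0, 0, []
    while apos < len(A) and bpos < len(B):
        if A[apos] <= B[bpos]:
            output.append(A[apos])
            apos += 1
        else:
            output.append(B[bpos])
            bpos += 1
    # One of the two lists may still have unprocessed items; append them
    output.extend(A[apos:])
    output.extend(B[bpos:])
    return output


print(merge_sort([1, 8, 7, 4, 5, 6, 3, 2, 9]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```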
<p>
Note that the "glue" in the second procedure has runtime <span class="math inline">\(O(N)\)</span>: if each of the two sub-lists has <span class="math inline">\(N\)</span> elements, then you need to run through every item in each list once, so it's <span class="math inline">\(O(N)\)</span> computation total. So the algorithm as a whole works by taking a problem of size <span class="math inline">\(N\)</span>, and breaking it up into two problems of size <span class="math inline">\(\frac{N}{2}\)</span>, plus <span class="math inline">\(O(N)\)</span> of "glue" execution. There is a theorem called the <a href="https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms%29">Master Theorem</a> that lets us compute the total runtime of algorithms like this. It has many sub-cases, but in the case where you break up an execution of size <span class="math inline">\(N\)</span> into <span class="math inline">\(k\)</span> sub-cases of size <span class="math inline">\(\frac{N}{k}\)</span> with <span class="math inline">\(O(N)\)</span> glue (as is the case here), the result is that the execution takes time <span class="math inline">\(O(N \cdot log(N))\)</span>.
</p>
<p>
<center>
<img src="http://vitalik.ca/files/posts_files/fft-files/sorting.png" /><br>
</center>
<br>
</p>
<p>
An FFT works in the same way. We take a problem of size <span class="math inline">\(N\)</span>, break it up into two problems of size <span class="math inline">\(\frac{N}{2}\)</span>, and do <span class="math inline">\(O(N)\)</span> glue work to combine the smaller solutions into a bigger solution, so we get <span class="math inline">\(O(N \cdot log(N))\)</span> runtime total - <em>much faster</em> than <span class="math inline">\(O(N^2)\)</span>. Here is how we do it. I'll describe first how to use an FFT for multi-point evaluation (ie. for some domain <span class="math inline">\(D\)</span> and polynomial <span class="math inline">\(P\)</span>, calculate <span class="math inline">\(P(x)\)</span> for every <span class="math inline">\(x\)</span> in <span class="math inline">\(D\)</span>), and it turns out that you can use the same algorithm for interpolation with a minor tweak.
</p>
<p>
Suppose that we have an FFT where the given domain is the powers of <span class="math inline">\(x\)</span> in some field, where <span class="math inline">\(x^{2^{k}} = 1\)</span> (eg. in the case we introduced above, the domain is the powers of <span class="math inline">\(85\)</span> modulo <span class="math inline">\(337\)</span>, and <span class="math inline">\(85^{2^{3}} = 1\)</span>). We have some polynomial, eg. <span class="math inline">\(y = 6x^7 + 2x^6 + 9x^5 + 5x^4 + x^3 + 4x^2 + x + 3\)</span> (we'll write it as <span class="math inline">\(p = [3, 1, 4, 1, 5, 9, 2, 6]\)</span>). We want to evaluate this polynomial at each point in the domain, ie. at each of the eight powers of <span class="math inline">\(85\)</span>. Here is what we do. First, we break up the polynomial into two parts, which we'll call <span class="math inline">\(evens\)</span> and <span class="math inline">\(odds\)</span>: <span class="math inline">\(evens = [3, 4, 5, 2]\)</span> and <span class="math inline">\(odds = [1, 1, 9, 6]\)</span> (or <span class="math inline">\(evens = 2x^3 + 5x^2 + 4x + 3\)</span> and <span class="math inline">\(odds = 6x^3 + 9x^2 + x + 1\)</span>; yes, this is just taking the even-degree coefficients and the odd-degree coefficients). Now, we note a mathematical observation: <span class="math inline">\(p(x) = evens(x^2) + x \cdot odds(x^2)\)</span> and <span class="math inline">\(p(-x) = evens(x^2) - x \cdot odds(x^2)\)</span> (think about this for yourself and make sure you understand it before going further).
</p>
<p>
Here, we have a nice property: <span class="math inline">\(evens\)</span> and <span class="math inline">\(odds\)</span> are both polynomials half the size of <span class="math inline">\(p\)</span>, and furthermore, the set of possible values of <span class="math inline">\(x^2\)</span> is only half the size of the original domain, because there is a two-to-one correspondence: <span class="math inline">\(x\)</span> and <span class="math inline">\(-x\)</span> are both part of <span class="math inline">\(D\)</span> (eg. in our current domain <span class="math inline">\([1, 85, 148, 111, 336, 252, 189, 226]\)</span>, 1 and 336 are negatives of each other, as <span class="math inline">\(336 = -1\)</span> % <span class="math inline">\(337\)</span>, as are <span class="math inline">\((85, 252)\)</span>, <span class="math inline">\((148, 189)\)</span> and <span class="math inline">\((111, 226)\)</span>). And <span class="math inline">\(x\)</span> and <span class="math inline">\(-x\)</span> always both have the same square. Hence, we can use an FFT to compute the result of <span class="math inline">\(evens(x)\)</span> for every <span class="math inline">\(x\)</span> in the smaller domain consisting of squares of numbers in the original domain (<span class="math inline">\([1, 148, 336, 189]\)</span>), and we can do the same for odds. And voila, we've reduced a size-<span class="math inline">\(N\)</span> problem into two half-size problems.
</p>
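<p>
If you want to convince yourself of the even/odd identity and the two-to-one squaring, here is a small check using the running example. The helper <code>evaluate</code> is a hypothetical direct (slow) evaluator written just for this illustration.
</p>

```python
modulus = 337
domain = [1, 85, 148, 111, 336, 252, 189, 226]
p = [3, 1, 4, 1, 5, 9, 2, 6]  # coefficients, lowest degree first


def evaluate(coeffs, x, modulus):
    """Directly evaluate a polynomial at x (hypothetical helper, O(N) per point)."""
    return sum(c * pow(x, i, modulus) for i, c in enumerate(coeffs)) % modulus


evens, odds = p[::2], p[1::2]
for x in domain:
    x_squared = x * x % modulus
    e = evaluate(evens, x_squared, modulus)
    o = evaluate(odds, x_squared, modulus)
    # p(x) = evens(x^2) + x * odds(x^2)
    assert evaluate(p, x, modulus) == (e + x * o) % modulus
    # p(-x) = evens(x^2) - x * odds(x^2)
    assert evaluate(p, modulus - x, modulus) == (e - x * o) % modulus

# Squaring maps the 8-element domain onto only 4 distinct values
squares = sorted({x * x % modulus for x in domain})
print(squares)  # [1, 148, 189, 336]
```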
<p>
The "glue" is relatively easy (and <span class="math inline">\(O(N)\)</span> in runtime): we receive the evaluations of <span class="math inline">\(evens\)</span> and <span class="math inline">\(odds\)</span> as size-<span class="math inline">\(\frac{N}{2}\)</span> lists, so we simply do <span class="math inline">\(p[i] = evens\_result[i] + domain[i]\cdot odds\_result[i]\)</span> and <span class="math inline">\(p[\frac{N}{2} + i] = evens\_result[i] - domain[i]\cdot odds\_result[i]\)</span> for each index <span class="math inline">\(i\)</span>.
</p>
<p>
Here's the full code:
</p>
<pre>
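def fft(vals, modulus, domain):
    # Base case: a constant polynomial evaluates to itself
    if len(vals) == 1:
        return vals
    # Evaluate the even- and odd-degree halves on the half-size domain of squares
    L = fft(vals[::2], modulus, domain[::2])
    R = fft(vals[1::2], modulus, domain[::2])
    o = [0 for i in vals]
    for i, (x, y) in enumerate(zip(L, R)):
        # Glue: p(d) = evens(d^2) + d*odds(d^2) and p(-d) = evens(d^2) - d*odds(d^2)
        y_times_root = y*domain[i]
        o[i] = (x+y_times_root) % modulus
        o[i+len(L)] = (x-y_times_root) % modulus
    return o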
</pre>
<p>
We can try running it:
</p>
<pre>
>>> fft([3,1,4,1,5,9,2,6], 337, [1, 85, 148, 111, 336, 252, 189, 226])
[31, 70, 109, 74, 334, 181, 232, 4]
</pre>
<p>
And we can check the result; evaluating the polynomial at the position <span class="math inline">\(85\)</span>, for example, actually does give the result <span class="math inline">\(70\)</span>. Note that this only works if the domain is "correct"; it needs to be of the form <span class="math inline">\([x^i\)</span> % <span class="math inline">\(modulus\)</span> for <span class="math inline">\(i\)</span> in <span class="math inline">\(range(n)]\)</span> where <span class="math inline">\(x^n = 1\)</span>.
</p>
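<p>
One way to check every output at once is to compare against a direct evaluation at each point. This sketch repeats the <code>fft</code> function so it can run on its own; <code>naive_evaluate</code> is a hypothetical name for the slow <span class="math inline">\(O(N^2)\)</span> reference.
</p>

```python
def fft(vals, modulus, domain):
    if len(vals) == 1:
        return vals
    L = fft(vals[::2], modulus, domain[::2])
    R = fft(vals[1::2], modulus, domain[::2])
    o = [0 for i in vals]
    for i, (x, y) in enumerate(zip(L, R)):
        y_times_root = y * domain[i]
        o[i] = (x + y_times_root) % modulus
        o[i + len(L)] = (x - y_times_root) % modulus
    return o


def naive_evaluate(coeffs, modulus, domain):
    """Evaluate the polynomial at every domain point directly (hypothetical reference)."""
    return [
        sum(c * pow(x, i, modulus) for i, c in enumerate(coeffs)) % modulus
        for x in domain
    ]


domain = [1, 85, 148, 111, 336, 252, 189, 226]
p = [3, 1, 4, 1, 5, 9, 2, 6]
fast = fft(p, 337, domain)
slow = naive_evaluate(p, 337, domain)
print(fast == slow)  # True
print(fast[1])       # 70, the evaluation at 85
```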
<p>
An inverse FFT is surprisingly simple:
</p>
<pre>
def inverse_fft(vals, modulus, domain):
    vals = fft(vals, modulus, domain)
    # Keep the first item in place, reverse the rest, and divide everything by len(vals)
    return [x * modular_inverse(len(vals), modulus) % modulus for x in [vals[0]] + vals[1:][::-1]]
</pre>
<p>
Basically, run the FFT again, but reverse the result (except the first item stays in place) and divide every value by the length of the list.
</p>
<pre>
>>> domain = [1, 85, 148, 111, 336, 252, 189, 226]
>>> def modular_inverse(x, n): return pow(x, n - 2, n)
>>> values = fft([3,1,4,1,5,9,2,6], 337, domain)
>>> values
[31, 70, 109, 74, 334, 181, 232, 4]
>>> inverse_fft(values, 337, domain)
[3, 1, 4, 1, 5, 9, 2, 6]
</pre>
<p>
Now, what can we use this for? Here's one fun use case: we can use FFTs to multiply numbers very quickly. Suppose we wanted to multiply <span class="math inline">\(1253\)</span> by <span class="math inline">\(1895\)</span>. Here is what we would do. First, we would convert the problem into one that turns out to be slightly easier: multiply the <em>polynomials</em> <span class="math inline">\([3, 5, 2, 1]\)</span> by <span class="math inline">\([5, 9, 8, 1]\)</span> (that's just the digits of the two numbers in increasing order), and then convert the answer back into a number by doing a single pass to carry over tens digits. We can multiply polynomials with FFTs quickly, because it turns out that if you convert a polynomial into <em>evaluation form</em> (ie. <span class="math inline">\(f(x)\)</span> for every <span class="math inline">\(x\)</span> in some domain <span class="math inline">\(D\)</span>), then you can multiply two polynomials simply by multiplying their evaluations. So what we'll do is take the polynomials representing our two numbers in <em>coefficient form</em>, use FFTs to convert them to evaluation form, multiply them pointwise, and convert back:
</p>
<pre>
>>> p1 = [3,5,2,1,0,0,0,0]
>>> p2 = [5,9,8,1,0,0,0,0]
>>> x1 = fft(p1, 337, domain)
>>> x1
[11, 161, 256, 10, 336, 100, 83, 78]
>>> x2 = fft(p2, 337, domain)
>>> x2
[23, 43, 170, 242, 3, 313, 161, 96]
>>> x3 = [(v1 * v2) % 337 for v1, v2 in zip(x1, x2)]
>>> x3
[253, 183, 47, 61, 334, 296, 220, 74]
>>> inverse_fft(x3, 337, domain)
[15, 52, 79, 66, 30, 10, 1, 0]
</pre>
<p>
This requires three FFTs (each <span class="math inline">\(O(N \cdot log(N))\)</span> time) and one pointwise multiplication (<span class="math inline">\(O(N)\)</span> time), so it takes <span class="math inline">\(O(N \cdot log(N))\)</span> time altogether (technically a little bit more than <span class="math inline">\(O(N \cdot log(N))\)</span>, because for very big numbers you would need to replace <span class="math inline">\(337\)</span> with a bigger modulus and that would make multiplication harder, but close enough). This is <em>much faster</em> than schoolbook multiplication, which takes <span class="math inline">\(O(N^2)\)</span> time:
</p>
<pre>
         3   5   2   1
       ----------------
   5 |  15  25  10   5
   9 |      27  45  18   9
   8 |          24  40  16   8
   1 |               3   5   2   1
       ----------------------------
        15  52  79  66  30  10   1
</pre>
<p>
So now we just take the result, and carry the tens digits over (this is a "walk through the list once and do one thing at each point" algorithm so it takes <span class="math inline">\(O(N)\)</span> time):
</p>
<pre>
[15, 52, 79, 66, 30, 10, 1, 0]
[ 5, 53, 79, 66, 30, 10, 1, 0]
[ 5, 3, 84, 66, 30, 10, 1, 0]
[ 5, 3, 4, 74, 30, 10, 1, 0]
[ 5, 3, 4, 4, 37, 10, 1, 0]
[ 5, 3, 4, 4, 7, 13, 1, 0]
[ 5, 3, 4, 4, 7, 3, 2, 0]
</pre>
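<p>
The carry pass can be sketched as a single loop. The function name <code>carry_digits</code> is hypothetical, and the sketch assumes the coefficients are non-negative and listed from least significant to most significant, as in the example above.
</p>

```python
def carry_digits(coeffs, base=10):
    """Turn polynomial coefficients (least significant first) into base-10 digits."""
    digits = []
    carry = 0
    for c in coeffs:
        total = c + carry
        digits.append(total % base)  # keep the ones digit in place
        carry = total // base        # push everything else to the next position
    while carry:                     # any leftover carry extends the number
        digits.append(carry % base)
        carry //= base
    return digits


digits = carry_digits([15, 52, 79, 66, 30, 10, 1, 0])
print(digits)  # [5, 3, 4, 4, 7, 3, 2, 0]

# The list is least significant digit first, so reverse it to read the number
number = int("".join(str(d) for d in reversed(digits)))
print(number)  # 2374435
```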
<p>
And if we read the digits of the final row from right to left (the list is in increasing order of significance), we get <span class="math inline">\(2374435\)</span>. Let's check the answer....
</p>
<pre>
>>> 1253 * 1895
2374435
</pre>
<p>
Yay! It worked. In practice, on such small inputs, the difference between <span class="math inline">\(O(N \cdot log(N))\)</span> and <span class="math inline">\(O(N^2)\)</span> isn't <em>that</em> large, so schoolbook multiplication is faster than this FFT-based multiplication process just because the algorithm is simpler, but on large inputs it makes a really big difference.
</p>
<p>
But FFTs are useful not just for multiplying numbers; as mentioned above, polynomial multiplication and multi-point evaluation are crucially important operations in implementing erasure coding, which is a very important technique for building many kinds of redundant fault-tolerant systems. If you like fault tolerance and you like efficiency, FFTs are your friend.
</p>
<h3>
FFTs and binary fields
</h3>
<p>
Prime fields are not the only kind of finite field out there. Another kind of finite field (really a special case of the more general concept of an <em>extension field</em>, which is kind of like the finite-field equivalent of complex numbers) is the binary field. In a binary field, each element is expressed as a polynomial where all of the coefficients are <span class="math inline">\(0\)</span> or <span class="math inline">\(1\)</span>, eg. <span class="math inline">\(x^3 + x + 1\)</span>. Adding polynomials is done modulo <span class="math inline">\(2\)</span>, and subtraction is the same as addition (as <span class="math inline">\(-1 = 1 \bmod 2\)</span>). We select some irreducible polynomial as a modulus (eg. <span class="math inline">\(x^4 + x + 1\)</span>; <span class="math inline">\(x^4 + 1\)</span> would not work because <span class="math inline">\(x^4 + 1\)</span> can be factored into <span class="math inline">\((x^2 + 1)\cdot(x^2 + 1)\)</span> so it's not "irreducible"); multiplication is done modulo that modulus. For example, in the binary field mod <span class="math inline">\(x^4 + x + 1\)</span>, multiplying <span class="math inline">\(x^2 + 1\)</span> by <span class="math inline">\(x^3 + 1\)</span> would give <span class="math inline">\(x^5 + x^3 + x^2 + 1\)</span> if you just do the multiplication, but <span class="math inline">\(x^5 + x^3 + x^2 + 1 = (x^4 + x + 1)\cdot x + (x^3 + x + 1)\)</span>, so the result is the remainder <span class="math inline">\(x^3 + x + 1\)</span>.
</p>
<p>
We can express this example as a multiplication table. First multiply <span class="math inline">\([1, 0, 0, 1]\)</span> (ie. <span class="math inline">\(x^3 + 1\)</span>) by <span class="math inline">\([1, 0, 1]\)</span> (ie. <span class="math inline">\(x^2 + 1\)</span>):
</p>
<pre>
        1 0 0 1
        --------
    1 | 1 0 0 1
    0 |   0 0 0 0
    1 |     1 0 0 1
        ------------
        1 0 1 1 0 1
</pre>
<p>
The multiplication result contains an <span class="math inline">\(x^5\)</span> term so we can subtract <span class="math inline">\((x^4 + x + 1)\cdot x\)</span>:
</p>
<pre>
    1 0 1 1 0 1
  -   1 1 0 0 1    [(x⁴ + x + 1) shifted right by one to reflect being multiplied by x]
    ------------
    1 1 0 1 0 0
</pre>
<p>
And we get the result, <span class="math inline">\([1, 1, 0, 1]\)</span> (or <span class="math inline">\(x^3 + x + 1\)</span>).
</p>
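<p>
The same multiply-then-reduce procedure can be sketched with integers as bit masks, where bit <span class="math inline">\(i\)</span> holds the coefficient of <span class="math inline">\(x^i\)</span> (so <span class="math inline">\(x^3 + x + 1\)</span> is <code>0b1011</code>, ie. 11). The function name <code>binary_field_multiply</code> and its parameters are assumptions made for this illustration, not the API of any particular library.
</p>

```python
def binary_field_multiply(a, b, modulus=0b10011, degree=4):
    """Multiply two binary field elements modulo x^4 + x + 1 (hypothetical helper)."""
    # Step 1: carry-less multiplication; addition of coefficients mod 2 is XOR
    product = 0
    shift = 0
    while b >> shift:
        if (b >> shift) & 1:
            product ^= a << shift
        shift += 1
    # Step 2: clear every term of degree >= 4 by subtracting (XOR-ing) a shifted modulus
    for bit in range(product.bit_length() - 1, degree - 1, -1):
        if (product >> bit) & 1:
            product ^= modulus << (bit - degree)
    return product


# (x^2 + 1) * (x^3 + 1) = x^3 + x + 1 in this field
result = binary_field_multiply(0b0101, 0b1001)
print(bin(result))  # 0b1011
```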
<p>
<center>
<img src="https://vitalik.ca/files/posts_files/fft-files/addmult.png" style="width:600px"/><br><br> <small><i>Addition and multiplication tables for the binary field mod <span class="math inline">\(x^4 + x + 1\)</span>. Field elements are expressed as integers converted from binary (eg. <span class="math inline">\(x^3 + x^2 \rightarrow 1100 \rightarrow 12\)</span>)</i></small>
</center>
<br>
</p>
<p>
Binary fields are interesting for two reasons. First of all, if you want to erasure-code binary data, then binary fields are really convenient because <span class="math inline">\(N\)</span> bytes of data can be directly encoded as a binary field element, and any binary field elements that you generate by performing computations on it will also be <span class="math inline">\(N\)</span> bytes long. You cannot do this with prime fields because prime fields' size is not exactly a power of two; for example, you could encode every <span class="math inline">\(2\)</span> bytes as a number from <span class="math inline">\(0...65535\)</span> in the prime field modulo <span class="math inline">\(65537\)</span> (which is prime), but if you do an FFT on these values, then the output could contain <span class="math inline">\(65536\)</span>, which cannot be expressed in two bytes. Second, the fact that addition and subtraction become the same operation, and <span class="math inline">\(1 + 1 = 0\)</span>, creates some "structure" which leads to some very interesting consequences. One particularly interesting, and useful, oddity of binary fields is the "<a href="https://en.wikipedia.org/wiki/Freshman%27s_dream">freshman's dream</a>" theorem: <span class="math inline">\((x+y)^2 = x^2 + y^2\)</span> (and the same for exponents <span class="math inline">\(4, 8, 16...\)</span> basically any power of two).
</p>
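<p>
The freshman's dream is easy to check by brute force in a small field. This sketch assumes the 16-element binary field mod <span class="math inline">\(x^4 + x + 1\)</span> with elements stored as integers; <code>gf16_mul</code> is a hypothetical helper name.
</p>

```python
def gf16_mul(a, b, modulus=0b10011):
    """Shift-and-add multiplication in the binary field mod x^4 + x + 1 (hypothetical helper)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add (XOR) the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & 0b10000:
            a ^= modulus         # reduce as soon as an x^4 term appears
    return result


def square(v):
    return gf16_mul(v, v)


for x in range(16):
    for y in range(16):
        total = x ^ y            # addition in a binary field is XOR
        assert square(total) == square(x) ^ square(y)
        # Squaring twice gives the same property for exponent 4
        assert square(square(total)) == square(square(x)) ^ square(square(y))

print("freshman's dream holds for all 256 pairs")
```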
<p>
But if you want to use binary fields for erasure coding, and do so efficiently, then you need to be able to do Fast Fourier transforms over binary fields. But then there is a problem: in a binary field, <em>there are no (nontrivial) multiplicative groups of order <span class="math inline">\(2^n\)</span></em>. This is because the multiplicative group of a binary field with <span class="math inline">\(2^n\)</span> elements has order <span class="math inline">\(2^n - 1\)</span>, which is odd, and the order of any subgroup must divide it. For example, in the binary field with modulus <span class="math inline">\(x^4 + x + 1\)</span>, if you start calculating successive powers of <span class="math inline">\(x+1\)</span>, you cycle back to <span class="math inline">\(1\)</span> after <span class="math inline">\(\it 15\)</span> steps - not <span class="math inline">\(16\)</span>. The reason is that the total number of elements in the field is <span class="math inline">\(16\)</span>, but one of them is zero, and you're never going to reach zero by multiplying any nonzero value by itself in a field, so the powers of <span class="math inline">\(x+1\)</span> cycle through every element but zero, so the cycle length is <span class="math inline">\(15\)</span>, not <span class="math inline">\(16\)</span>. So what do we do?
</p>
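<p>
Here is a small sketch that counts how many steps the powers of an element take to cycle back to <span class="math inline">\(1\)</span> in the field mod <span class="math inline">\(x^4 + x + 1\)</span>. Elements are stored as integers (so <span class="math inline">\(x+1\)</span> is 3), and the helper names <code>gf16_mul</code> and <code>multiplicative_order</code> are made up for the example.
</p>

```python
def gf16_mul(a, b, modulus=0b10011):
    """Shift-and-add multiplication in the binary field mod x^4 + x + 1 (hypothetical helper)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= modulus
    return result


def multiplicative_order(g):
    """Number of steps until the powers of a nonzero element g return to 1."""
    value, steps = g, 1
    while value != 1:
        value = gf16_mul(value, g)
        steps += 1
    return steps


print(multiplicative_order(0b0011))  # 15, for the element x + 1

# Every cycle length divides 15, so none of them is a power of two (other than 1)
orders = sorted({multiplicative_order(g) for g in range(1, 16)})
print(orders)  # [1, 3, 5, 15]
```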
<p>
The reason we needed the domain to have the "structure" of a multiplicative group with <span class="math inline">\(2^n\)</span> elements before is that we needed to reduce the size of the domain by a factor of two by squaring each number in it: the domain <span class="math inline">\([1, 85, 148, 111, 336, 252, 189, 226]\)</span> gets reduced to <span class="math inline">\([1, 148, 336, 189]\)</span> because <span class="math inline">\(1\)</span> is the square of both <span class="math inline">\(1\)</span> and <span class="math inline">\(336\)</span>, <span class="math inline">\(148\)</span> is the square of both <span class="math inline">\(85\)</span> and <span class="math inline">\(252\)</span>, and so forth. But what if in a binary field there's a different way to halve the size of a domain? It turns out that there is: given a domain containing <span class="math inline">\(2^m\)</span> values, including zero (technically the domain must be a <em><a href="https://en.wikipedia.org/wiki/Linear_subspace">subspace</a></em>), we can construct a half-sized new domain <span class="math inline">\(D'\)</span> by taking <span class="math inline">\(x \cdot (x+k)\)</span> for <span class="math inline">\(x\)</span> in <span class="math inline">\(D\)</span> using some specific nonzero <span class="math inline">\(k\)</span> in <span class="math inline">\(D\)</span>. Because the original domain is a subspace, since <span class="math inline">\(k\)</span> is in the domain, any <span class="math inline">\(x\)</span> in the domain has a corresponding <span class="math inline">\(x+k\)</span> also in the domain, and the function <span class="math inline">\(f(x) = x \cdot (x+k)\)</span> returns the same value for <span class="math inline">\(x\)</span> and <span class="math inline">\(x+k\)</span> so we get the same kind of two-to-one correspondence that squaring gives us.
</p>
<center>
<table border="1" cellpadding="10">
<tr>
<td>
<span class="math inline">\(x\)</span>
</td>
<td>
0
</td>
<td>
1
</td>
<td>
2
</td>
<td>
3
</td>
<td>
4
</td>
<td>
5
</td>
<td>
6
</td>
<td>
7
</td>
<td>
8
</td>
<td>
9
</td>
<td>
10
</td>
<td>
11
</td>
<td>
12
</td>
<td>
13
</td>
<td>
14
</td>
<td>
15
</td>
</tr>
<tr>
<td>
<span class="math inline">\(x \cdot (x+1)\)</span>
</td>
<td>
0
</td>
<td>
0
</td>
<td>
6
</td>
<td>
6
</td>
<td>
7
</td>
<td>
7
</td>
<td>
1
</td>
<td>
1
</td>
<td>
4
</td>
<td>
4
</td>
<td>
2
</td>
<td>
2
</td>
<td>
3
</td>
<td>
3
</td>
<td>
5
</td>
<td>
5
</td>
</tr>
</table>
</center>
<p><br></p>
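<p>
The table above can be reproduced with a few lines of code. This is a sketch assuming the field mod <span class="math inline">\(x^4 + x + 1\)</span>, elements stored as integers, and <span class="math inline">\(k = 1\)</span>; <code>gf16_mul</code> is a hypothetical helper name.
</p>

```python
def gf16_mul(a, b, modulus=0b10011):
    """Shift-and-add multiplication in the binary field mod x^4 + x + 1 (hypothetical helper)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= modulus
    return result


k = 1
# x + k is x XOR k, so x and x + k always land on the same value
row = [gf16_mul(x, x ^ k) for x in range(16)]
print(row)  # [0, 0, 6, 6, 7, 7, 1, 1, 4, 4, 2, 2, 3, 3, 5, 5]

# The 16-element domain collapses to 8 distinct values
reduced_domain = sorted(set(row))
print(reduced_domain)  # [0, 1, 2, 3, 4, 5, 6, 7]
```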
<p>
So now, how do we do an FFT on top of this? We'll use the same trick, converting a problem with an <span class="math inline">\(N\)</span>-sized polynomial and <span class="math inline">\(N\)</span>-sized domain into two problems each with an <span class="math inline">\(\frac{N}{2}\)</span>-sized polynomial and <span class="math inline">\(\frac{N}{2}\)</span>-sized domain, but this time using different equations. We'll convert a polynomial <span class="math inline">\(p\)</span> into two polynomials <span class="math inline">\(evens\)</span> and <span class="math inline">\(odds\)</span> such that <span class="math inline">\(p(x) = evens(x \cdot (k-x)) + x \cdot odds(x \cdot (k-x))\)</span> (recall that in a binary field <span class="math inline">\(k-x\)</span> is the same as <span class="math inline">\(x+k\)</span>, so this is the same halving function as before). Note that for the <span class="math inline">\(evens\)</span> and <span class="math inline">\(odds\)</span> that we find, it will <em>also</em> be true that <span class="math inline">\(p(x+k) = evens(x \cdot (k-x)) + (x+k) \cdot odds(x \cdot (k-x))\)</span>. So we can then recursively do an FFT on <span class="math inline">\(evens\)</span> and <span class="math inline">\(odds\)</span> over the reduced domain <span class="math inline">\([x \cdot (k-x)\)</span> for <span class="math inline">\(x\)</span> in <span class="math inline">\(D]\)</span>, and then we use these two formulas to get the answers for two "halves" of the domain, one offset by <span class="math inline">\(k\)</span> from the other.
</p>
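<p>
To see that such <span class="math inline">\(evens\)</span> and <span class="math inline">\(odds\)</span> exist, here is a sketch of one simple (and slow) way to find them: repeatedly divide <span class="math inline">\(p\)</span> by <span class="math inline">\(x^2 + kx\)</span> and collect the two-term remainders. This is only an illustration under my own assumptions (the field mod <span class="math inline">\(x^4 + x + 1\)</span>, <span class="math inline">\(k = 1\)</span>, and hypothetical helper names); it is not necessarily how the linked implementation does it.
</p>

```python
def gf16_mul(a, b, modulus=0b10011):
    """Shift-and-add multiplication in the binary field mod x^4 + x + 1 (hypothetical helper)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= modulus
    return result


def evaluate(coeffs, x):
    """Horner evaluation; coefficients are listed lowest degree first."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf16_mul(acc, x) ^ c
    return acc


def split_evens_odds(p, k):
    """Naive split of p so that p(x) = evens(x*(x+k)) + x * odds(x*(x+k))."""
    evens, odds = [], []
    remaining = list(p)
    while remaining:
        if len(remaining) < 2:
            remaining.append(0)
        # Long division by the monic polynomial x^2 + k*x
        quotient = [0] * (len(remaining) - 2)
        for d in range(len(remaining) - 1, 1, -1):
            c = remaining[d]
            quotient[d - 2] = c
            remaining[d - 1] ^= gf16_mul(c, k)  # cancel the c * x^d term
        # The remainder r0 + r1*x contributes one coefficient to each half
        evens.append(remaining[0])
        odds.append(remaining[1])
        remaining = quotient
    return evens, odds


k = 1
p = [3, 1, 4, 1, 5, 9, 2, 6]
evens, odds = split_evens_odds(p, k)
for x in range(16):
    s = gf16_mul(x, x ^ k)  # x * (x + k), identical for x and x + k
    assert evaluate(p, x) == evaluate(evens, s) ^ gf16_mul(x, evaluate(odds, s))
    assert evaluate(p, x ^ k) == evaluate(evens, s) ^ gf16_mul(x ^ k, evaluate(odds, s))
print(len(evens), len(odds))  # 4 4
```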
<p>
Converting <span class="math inline">\(p\)</span> into <span class="math inline">\(evens\)</span> and <span class="math inline">\(odds\)</span> as described above turns out to itself be nontrivial. The "naive" algorithm for doing this is itself <span class="math inline">\(O(N^2)\)</span>, but it turns out that in a binary field, we can use the fact that <span class="math inline">\((x^2-kx)^2 = x^4 - k^2 \cdot x^2\)</span>, and more generally <span class="math inline">\((x^2-kx)^{2^{i}} = x^{2^{i+1}} - k^{2^{i}} \cdot x^{2^{i}}\)</span> , to create yet another recursive algorithm to do this in <span class="math inline">\(O(N \cdot log(N))\)</span> time.
</p>
<p>
And if you want to do an <em>inverse</em> FFT, to do interpolation, then you need to run the steps in the algorithm in reverse order. You can find the complete code for doing this here: <a href="https://github.com/ethereum/research/tree/master/binary_fft">https://github.com/ethereum/research/tree/master/binary_fft</a>, and a paper with details on more optimal algorithms here: <a href="http://www.math.clemson.edu/~sgao/papers/GM10.pdf">http://www.math.clemson.edu/~sgao/papers/GM10.pdf</a>
</p>
<p>
So what do we get from all of this complexity? Well, we can try running the implementation, which features both a "naive" <span class="math inline">\(O(N^2)\)</span> multi-point evaluation and the optimized FFT-based one, and time both. Here are my results:
</p>
<pre>
>>> import binary_fft as b
>>> import time, random
>>> f = b.BinaryField(1033)
>>> poly = [random.randrange(1024) for i in range(1024)]
>>> a = time.time(); x1 = b._simple_ft(f, poly); time.time() - a
0.5752472877502441
>>> a = time.time(); x2 = b.fft(f, poly, list(range(1024))); time.time() - a
0.03820443153381348
</pre>
<p>
And as the size of the polynomial gets larger, the naive implementation (<code>_simple_ft</code>) gets slower much more quickly than the FFT:
</p>
<pre>
>>> f = b.BinaryField(2053)
>>> poly = [random.randrange(2048) for i in range(2048)]
>>> a = time.time(); x1 = b._simple_ft(f, poly); time.time() - a
2.2243144512176514
>>> a = time.time(); x2 = b.fft(f, poly, list(range(2048))); time.time() - a
0.07896280288696289
</pre>
<p>
And voila, we have an efficient, scalable way to multi-point evaluate and interpolate polynomials. If we want to use FFTs to recover erasure-coded data where we are <em>missing</em> some pieces, then algorithms for this <a href="https://ethresear.ch/t/reed-solomon-erasure-code-recovery-in-n-log-2-n-time-with-ffts/3039">also exist</a>, though they are somewhat less efficient than just doing a single FFT. Enjoy!
</p>
Sun, 12 May 2019 18:03:10 -0700
https://vitalik.ca/general/2019/05/12/fft.html