View on GitHub
File Changes
\title{Formal Specification of the Extended UTXO Model (\red{\textsf{DRAFT}})}

                      
\pagestyle{plain}
-
\date{12th September 2019}
+
\date{9th October 2019}
\author{}

                      
\documentclass[a4paper]{article}
\usepackage{listings}
\usepackage{xcolor}
\usepackage{verbatim}
-
\usepackage[disable]{todonotes}
+
\usepackage{alltt}
+
\usepackage{todonotes}
+
%\usepackage[disable]{todonotes}

                      
\usepackage{hyperref}  % This has to go at the end of the packages.

                      

                      
\newcommand{\validator}{\mi{validator}}
\newcommand{\redeemer}{\mi{redeemer}}
-
\newcommand{\datascript}{\mi{datascript}}
+
\newcommand{\valdata}{\mi{valdata}}
+
\newcommand{\Data}{\ensuremath{\mathsf{Data}}}
+

                      
\newcommand{\outputref}{\mi{outputRef}}
\newcommand{\txin}{\mi{in}}
\newcommand{\txout}{\mi{out}}
\newcommand\B{\ensuremath{\mathbb{B}}}
\newcommand\N{\ensuremath{\mathbb{N}}}
\newcommand\Z{\ensuremath{\mathbb{Z}}}
+
\renewcommand\H{\ensuremath{\mathbb{H}}}
+
%% \H is usually the Hungarian double acute accent

                      
\newcommand{\emptymap}{\ensuremath{\{\}}}

                      

                      
\begin{document}
\maketitle
-
\todokwxm{Michael suggests (and I agree) that we should try to make
-
  clear the difference between the ledger layer and the script layer.
-
  It would be good if we could make the EUTXO model independent of the
-
  details of the scripting language, just saying ``We don't care what
-
  the scripting language actually is: as long as you can do
-
  \textsf{A},\textsf{B},\textsf{C}, and \textsf{D}, then we can make
-
  the EUTXO model work'' (where
-
  \textsf{A},\textsf{B},\textsf{C},\textsf{D} are things like encoding
-
  the \textsf{PendingTx} data, performing validation, verifying the
-
  content of the output data scripts, ...).  Then we can say separately
-
  ``Here's how you can do \textsf{A},\textsf{B},\textsf{C}, and
-
  \textsf{D} in Plutus Core''.  This would make things modular, giving
-
  us a clear and well-defined interface between the EUTXO model and
-
  Plutus Core, which would surely be a good thing.  However, the business
-
  about validating data scripts \textit{really} messes this up.}
-

                      
-
\bigskip

                      
\section{Introduction: The Extended UTXO Model}
\label{sec:intro}
multiple transactions, and our aim is to define a transaction model
which enables the implementation of highly expressive contracts.

                      
-
An important component of our UTXO models are \textit{scripts},
+
An important feature of our UTXO models is \textit{scripts},
programs which run on the blockchain to check the validity of
transactions.  In Cardano, scripts will be programs in the Plutus Core
language~\citep{Plutus-Core-spec}. The Extended UTXO models are largely

                      
\subsection{Structure of the document}
\label{sec:doc-structure}
-
This document proposes two extensions of the basic UTXO model (EUTXO
-
stands for \textit{Extended UTXO}):
+
The papers~\citep{Zahnentferner18-Chimeric} and
+
\citep{Zahnentferner18-UTxO} give a formal specification of a basic
+
UTXO model. See Note~\ref{note:basic-utxo} for some background on
+
this model.
+

                      
+
\medskip
+
\noindent This document proposes two extensions of the basic UTXO
+
model (EUTXO stands for \textit{Extended UTXO}):

                      
\begin{itemize}
  \item \textbf{EUTXO-1} (Section~\ref{sec:eutxo-1}): this extends the
    \textit{custom currencies} and \textit{non-fungible tokens}.
\end{itemize}

                      
-
\noindent We give formal specifications of each of these versions,
-
based on the models appearing in the
-
papers~\citep{Zahnentferner18-Chimeric} and
-
\citep{Zahnentferner18-UTxO}.

                      
\medskip

                      
book~\citep{Plutus-book}.

                      

                      
-

                      
-
\newpage
\section{Notation}
This section defines some basic notation.  We generally follow the
notation established by \citep{Zahnentferner18-UTxO}.

                      
+
\subsection{Basic types}
+
\noindent Throughout the document we assume a number of basic types.
+
These are shown in Figure~\ref{fig:basic-types}.
+

                      
+
\begin{ruledfigure}{H}
+
  \center{
+
    \begin{tabular}{rl}
+
      \B&: the type of booleans, $\{\false, \true\}$\\
+
      \N&: the type of natural numbers, $\{0, 1, 2, \ldots\}$\\
+
      \Z&: the type of integers, $\{,\ldots, -2, -1, 0, 1, 2, \ldots\}$\\
+
      \H&: the type of \textit{bytestrings}, $\bigcup_{n=0}^{\infty}\{0,1\}^{8n}$\\
+
      $\qty = \N$&: an amount of currency, always non-negative\\
+
      $\qtypm = \Z$&: a possibly negative amount of currency\\
+
      $\s{SlotNumber} = \N$&: a slot number\\
+
      $\s{Address} = \N$&: the address of an object in the blockchain\\
+
      $\s{TxId} = \N$&: the identifier of a previous transaction on the chain\\
+
    \end{tabular}
+
  }
+
  \caption{Basic types}
+
  \label{fig:basic-types}
+
\end{ruledfigure}
+

                      
+
\noindent We regard $\N$ as a subtype of $\Z$ and convert freely between
+
natural numbers and non-negative integers.  A bytestring is a sequence
+
of 8-bit bytes.
+

                      
+
\medskip
+
\noindent In practice an \textsf{Address} will be a
+
collision-resistant hash of some object (so the address of an object
+
$X$ is $X^{\#}$ in the notation of Figure~\ref{fig:basic-notation}),
+
and the blockchain will provide an efficient way to retrieve the
+
original object given its hash.  Compiled Plutus Core scripts are
+
stored and transmitted in a serialised form, and when we talk about
+
the hash of a Plutus Core script we mean the hash of the serialised
+
version.
+

                      
+
% Accessing objects indirectly via addresses is helpful because it can
+
% help to reduce memory and disk usage: for example, there may be
+
% scripts for common validation scenarios used in many transactions, and
+
% it is more efficient to store single copies of such scripts rather
+
% than having hundreds of transactions each with their own copy.
+

                      
+
\medskip
+
\noindent We assume that every transaction in a ledger has a unique identifier
+
of type \s{TxId}: in an implementation this would probably be the
+
hash of the transaction.
+

                      
+
\subsection{The \Data{} type}
+
We also define a type \Data{} which can be used to pass information
+
into scripts in a type-safe manner: see Figure~\ref{fig:data-defn}. The
+
definition is given here in EBNF form, but can easily be translated to
+
a Haskell type, for instance.
+

                      
+
\begin{ruledfigure}{H}
+
\begin{alltt}
+
  \Data =
+
      "I" \(\N\)
+
    | "B" \(\H\)
+
    | "Constr" \(\N (\List{\Data})\)
+
    | "Map" \(\List{\Data\times\Data}\)
+
\end{alltt}
+
\caption{The \Data{} type}
+
\label{fig:data-defn}
+
\end{ruledfigure}
+

                      
+
\noindent Thus values of type \Data{} are nested sums and products built up
+
recursively from the base types of integers and bytestrings. This
+
allows one to encode a large variety of first-order data structures:
+
for example, we could encode the list [11,22,33] as \texttt{Constr 1
+
  (I 11, Constr 1 (I 22, Constr 1 (I 33, Constr 0)))}, using
+
\texttt{Constr 0} to represent the \texttt{Nil} constructor of a list
+
type, and \texttt{Constr 1} to represent the \texttt{Cons}
+
constructor.
+

                      
+
The \texttt{Map} constructor is strictly redundant, but is included for
+
convenience to allow a straightforward encoding of records.
+

                      
+
We assume that the scripting language has the ability to parse values
+
of type \Data{}, converting them into a suitable internal representation.
+

                      
+
\subsection{Further notation}
+
Figure~\ref{fig:basic-notation} presents some further notation for use in the
+
sequel. 
+
\todokwxm{Split this over the page break once things have stabilised.}
\begin{ruledfigure}{H}
  \begin{itemize}
\item Types are typeset in $\mathsf{sans~serif}$.
\item The only operation on $\mathsf{Script}$ from the ledger's