Verifying the ArbOS Code Difference

The current ArbOS version used on Arbitrum One and Arbitrum Nova is ArbOS 32, corresponding to the Arbitrum Nitro consensus-v32 git tag.

To audit the code difference from ArbOS 32 to ArbOS 40, you could simply generate a full nitro diff with git diff consensus-v32 consensus-v40 (and also generate a diff of the go-ethereum submodule mentioned in that nitro diff). However, that includes a lot of code that isn’t part of the WASM module root. To filter down to just the replay binary, which defines the state transition function, start by generating a list of the files in the nitro and go-ethereum repositories that the replay binary includes in either ArbOS 32 or ArbOS 40, using this bash script:

#!/usr/bin/env bash
set -e
mkdir -p ~/tmp # this script uses ~/tmp as scratch space and output
# this script should be run in the nitro repository
rm -rf bold contracts-legacy safe-smart-contracts go-ethereum nitro-testnode
git checkout consensus-v32
git submodule update --init --recursive
make .make/solgen
go list -f "{{.Deps}}" ./cmd/replay | tr -d '[]' | sed 's/ /\n/g' | grep 'github.com/offchainlabs/nitro/' | sed 's@github.com/offchainlabs/nitro/@@' | while read dir; do find "$dir" -type f -name '*.go' -maxdepth 1; done | grep -v '_test\.go$' > ~/tmp/consensus-v32-nitro-files.txt
go list -f "{{.Deps}}" ./cmd/replay | tr -d '[]' | sed 's/ /\n/g' | grep 'github.com/ethereum/go-ethereum/' | sed 's@github.com/ethereum/go-ethereum/@go-ethereum/@' | while read dir; do find "$dir" -type f -name '*.go' -maxdepth 1; done | grep -v '_test\.go$' > ~/tmp/consensus-v32-geth-files.txt
rm -rf go-ethereum nitro-testnode
git checkout consensus-v40
git submodule update --init --recursive
make .make/solgen
go list -f "{{.Deps}}" ./cmd/replay | tr -d '[]' | sed 's/ /\n/g' | grep 'github.com/offchainlabs/nitro/' | sed 's@github.com/offchainlabs/nitro/@@' | while read dir; do find "$dir" -type f -name '*.go' -maxdepth 1; done | grep -v '_test\.go$' > ~/tmp/consensus-v40-nitro-files.txt
go list -f "{{.Deps}}" ./cmd/replay | tr -d '[]' | sed 's/ /\n/g' | grep 'github.com/ethereum/go-ethereum/' | sed 's@github.com/ethereum/go-ethereum/@go-ethereum/@' | while read dir; do find "$dir" -type f -name '*.go' -maxdepth 1; done | grep -v '_test\.go$' > ~/tmp/consensus-v40-geth-files.txt
sort -u ~/tmp/consensus-v32-nitro-files.txt ~/tmp/consensus-v40-nitro-files.txt > ~/tmp/replay-binary-nitro-dependencies.txt
sort -u ~/tmp/consensus-v32-geth-files.txt ~/tmp/consensus-v40-geth-files.txt | sed 's@^[./]*go-ethereum/@@' > ~/tmp/replay-binary-geth-dependencies.txt
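
The first stages of each go list pipeline in the script can be tried in isolation. The sketch below is illustrative only: deps_to_dirs is a hypothetical helper name, and the sample string is hand-written rather than real go list output. It uses tr ' ' '\n' in place of sed 's/ /\n/g', on the assumption that you may be running a sed that does not expand \n in the replacement (BSD/macOS sed, for example).

```shell
#!/usr/bin/env bash
# deps_to_dirs PREFIX REPLACEMENT (hypothetical helper, not part of nitro)
# Reads the bracketed, space-separated output of `go list -f "{{.Deps}}"`
# on stdin and prints one package directory per line, keeping only import
# paths that contain PREFIX and rewriting PREFIX to REPLACEMENT.
deps_to_dirs() {
  local prefix="$1" replacement="$2"
  tr -d '[]' | tr ' ' '\n' | grep -F "$prefix" | sed "s@${prefix}@${replacement}@"
}

# Hand-written sample input (an assumption, not captured from a real run).
sample='[fmt github.com/offchainlabs/nitro/arbos github.com/ethereum/go-ethereum/core/vm os]'

# Prints: arbos
printf '%s\n' "$sample" | deps_to_dirs 'github.com/offchainlabs/nitro/' ''
# Prints: go-ethereum/core/vm
printf '%s\n' "$sample" | deps_to_dirs 'github.com/ethereum/go-ethereum/' 'go-ethereum/'
```

Each printed directory is what the script then passes to find to collect the non-test .go files.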

Now, ~/tmp/replay-binary-nitro-dependencies.txt and ~/tmp/replay-binary-geth-dependencies.txt list the dependencies of the replay binary that were present in ArbOS 32 or 40. To use the nitro list to generate a smaller diff of the nitro repository, you can run:

git diff consensus-v32 consensus-v40 -- cmd/replay $(cat ~/tmp/replay-binary-nitro-dependencies.txt) > ~/tmp/arbos40-nitro.diff
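
Before reading the resulting diff, it can help to list which files it actually touches. The following is a minimal sketch, assuming the diff is in the usual git format with "diff --git a/... b/..." headers; diff_files is a hypothetical helper name, and the demo input is a hand-written diff with illustrative file names.

```shell
#!/usr/bin/env bash
# diff_files DIFF (hypothetical helper)
# Prints each file path that appears in a "diff --git" header, one per line.
# Assumes paths do not themselves contain the string " b/".
diff_files() {
  grep '^diff --git ' "$1" | sed 's@^diff --git a/\(.*\) b/.*$@\1@' | sort -u
}

# Demo on a tiny hand-written diff (file names are illustrative only).
demo="$(mktemp)"
cat > "$demo" <<'EOF'
diff --git a/arbos/block.go b/arbos/block.go
--- a/arbos/block.go
+++ b/arbos/block.go
@@ -1 +1 @@
-old
+new
diff --git a/cmd/replay/main.go b/cmd/replay/main.go
--- a/cmd/replay/main.go
+++ b/cmd/replay/main.go
@@ -1 +1 @@
-old
+new
EOF
diff_files "$demo"
rm -f "$demo"
```

Running diff_files ~/tmp/arbos40-nitro.diff would then give a quick overview of the files to review.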

For the go-ethereum submodule, we have added tags for each of consensus-v32 and consensus-v40 to make it easy to address them in git diff commands. You can again use git diff and the files generated by the earlier script to generate a diff limited to code used by the replay binary:

# this should be run inside the go-ethereum submodule folder
git diff consensus-v32 consensus-v40 -- $(cat ~/tmp/replay-binary-geth-dependencies.txt)

This diff also includes the diff between upstream go-ethereum versions v1.13.11 and v1.15.5, as ArbOS 32 used the former and ArbOS 40 uses the latter. To filter out that difference, you can use this tool to find the intersection of two git diffs: Git diff intersection finder

We can use that to find the intersection of the diff of ArbOS 40’s go-ethereum against ArbOS 32’s go-ethereum and the diff of ArbOS 40’s go-ethereum against upstream go-ethereum v1.15.5:

# this should be run inside the go-ethereum submodule folder
git diff consensus-v32 consensus-v40 -- $(cat ~/tmp/replay-binary-geth-dependencies.txt) > ~/tmp/arbos-32-vs-40-geth.diff
git diff v1.15.5 consensus-v40 -- $(cat ~/tmp/replay-binary-geth-dependencies.txt) > ~/tmp/arbos-40-vs-upstream-geth.diff
diff-intersection.py ~/tmp/arbos-32-vs-40-geth.diff ~/tmp/arbos-40-vs-upstream-geth.diff --ignore-files 'core/blockchain*.go' arbitrum_types/txoptions.go 'rawdb/**' 'rpc/**' > ~/tmp/arbos-40-geth-line-intersection.diff

In the above command, the arguments --ignore-files 'core/blockchain*.go' arbitrum_types/txoptions.go 'rawdb/**' 'rpc/**' exclude files that the replay binary includes but whose components it does not use. To also review those diffs, remove those arguments.

Note that by default, diff-intersection.py does a line-based intersection. To instead do an intersection based on chunks in the diff, known as hunks in git terminology, you can add the --only-hunks flag:

diff-intersection.py --only-hunks ~/tmp/arbos-32-vs-40-geth.diff ~/tmp/arbos-40-vs-upstream-geth.diff --ignore-files 'core/blockchain*.go' arbitrum_types/txoptions.go 'rawdb/**' 'rpc/**' > ~/tmp/arbos-40-geth-hunk-intersection.diff
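
To build intuition for what a line-based intersection means, here is a deliberately simplified sketch. It is not how diff-intersection.py is implemented, and it discards file and hunk context. It only prints the added or removed lines that appear in both diffs. The helper names changed_lines and common_changed_lines are hypothetical.

```shell
#!/usr/bin/env bash
# changed_lines DIFF (hypothetical helper)
# Prints the unique added/removed lines of a unified diff, skipping the
# "--- " and "+++ " file headers. Limitation: a changed line whose own text
# begins with "-- " or "++ " looks like a header and is skipped as well.
changed_lines() {
  grep -E '^[+-]' "$1" | grep -Ev '^(\+\+\+|---) ' | LC_ALL=C sort -u
}

# common_changed_lines DIFF_A DIFF_B (hypothetical helper)
# Prints the changed lines that occur in both diffs.
common_changed_lines() {
  LC_ALL=C comm -12 <(changed_lines "$1") <(changed_lines "$2")
}

# Demo with two tiny hand-written diffs.
a="$(mktemp)"; b="$(mktemp)"
printf '%s\n' '--- a/x.go' '+++ b/x.go' '@@ -1,2 +1,2 @@' '-foo' '+bar' '+onlyA' > "$a"
printf '%s\n' '--- a/x.go' '+++ b/x.go' '@@ -1 +1,2 @@' '+bar' '+onlyB' > "$b"
common_changed_lines "$a" "$b"
rm -f "$a" "$b"
```

A hunk-based intersection, by contrast, compares whole hunks, so a change is kept or dropped together with the hunk it belongs to rather than line by line.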