# Articles

• ### Bayesian Inference On 1st Order Logic

Category:Post
February 21, 2021

David Chapman's blog post titled Probability theory does not extend logic has stirred up some controversy. In it, Chapman argues that so-called Bayesian logic, as it is currently understood, is limited to propositional logic (0th-order logic) and cannot generalize to higher-order logics (e.g. predicate logic, a.k.a. 1st-order logic), and thus cannot serve as a general foundation for inference from data under uncertainty.

Chapman provides a few counter-examples that supposedly demonstrate that doing Bayesian inference on statements in 1st-order logic is incoherent. I think much of the confusion surrounding this point arises because Chapman does not use proper probability notation. In the following article I show how Chapman's examples can be properly written, and made sense of, using random variables. Hopefully this clarifies some things.

• ### Primer to Probability Theory and Its Philosophy

Category:Post
June 19, 2020

Probability is a measure defined on events, which are sets of primitive outcomes. Probability theory mostly comes down to constructing events and measuring them. A measure is a generalization of size that encompasses length, area, and volume (as opposed to cardinality, which is defined via bijective mappings).
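To make the "measure on sets of outcomes" picture concrete, here is a minimal sketch (my own illustrative example, not from the post) of a finite probability space for a fair die, where measuring an event means summing the weights of its outcomes:

```python
from fractions import Fraction

# A toy probability space for a fair six-sided die: outcomes and their weights.
outcomes = {1, 2, 3, 4, 5, 6}
p = {w: Fraction(1, 6) for w in outcomes}

def prob(event):
    """Measure an event (a set of primitive outcomes) by summing outcome weights."""
    return sum(p[w] for w in event)

even = {2, 4, 6}
low = {1, 2}

# Additivity: the measure of a union of disjoint events is the sum of their measures.
assert prob(even | {1}) == prob(even) + prob({1})
print(prob(even))        # 1/2
print(prob(even | low))  # 2/3
```

Additivity over disjoint events is exactly the property that makes this a measure, and it is what probability shares with length, area, and volume.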

• ### Notes: Probability & AI Curriculum

Category:Notes
June 17, 2020

This is a snapshot of my curriculum for exploring the following questions:

• Is probability theory all you need to develop AI?
• If not, what is missing?
• Should a theory of AI be expressed in the framework of probability theory at all?
• Do Brains use probability?

• ### Notes: Dutch Book Argument

Category:Notes
June 11, 2020

• ### Notes: Complete Class Theorems

Category:Notes
June 11, 2020

• ### Primer to Shannon's Information Theory

Category:Post
June 9, 2020

Shannon's theory of information is usually just called information theory, but is it deserving of that title? Does Shannon's theory completely capture every possible meaning of the word information? In the grand quests of creating AI and understanding the rules of the universe (i.e. a grand unified theory), information may be key. Intelligent agents search for information and manipulate it. Particle interactions in physics may be viewed as information transfer. The physics of information may be key to interpreting quantum mechanics and resolving the measurement problem.

If you endeavor to answer these hard questions, it is prudent to understand the existing so-called theories of information, so that you can evaluate whether they are powerful enough and take inspiration from them.

Shannon's information theory is a hard nut to crack. Hopefully this primer gets you far enough along to be able to read a textbook like Elements of Information Theory. At the end I start to explore the question of whether Shannon's theory is a complete theory of information, and where it might be lacking.

This post is long because Shannon's information theory is a framework of thought, and that framework has a vocabulary that is needed to appreciate the whole. I attempt to build up this vocabulary gradually, stopping along the way to build intuition. With this vocabulary in hand, you will be ready to explore the big questions at the end of the post.
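As a small taste of that vocabulary, the central quantity is Shannon entropy, the average number of bits needed to describe the outcome of a random variable. A minimal sketch (my own example, not from the post):

```python
import math

def entropy(pmf):
    """Shannon entropy H(X) = -sum_x p(x) * log2 p(x), in bits."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# A fair coin carries exactly 1 bit per flip; a biased coin carries less,
# because its outcomes are more predictable.
print(entropy([0.5, 0.5]))  # 1.0
print(entropy([0.9, 0.1]))  # about 0.47
```

The convention that terms with p(x) = 0 contribute nothing (since p log p → 0 as p → 0) is built into the `if p > 0` filter.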

• ### Notes: Wallace - Emergence of particles from QFT

Category:Notes
January 10, 2020

• ### Notes: Visualizing Quantum Field States

Category:Notes
January 10, 2020

• ### Notes: Solomonoff Induction

Category:Notes
December 31, 2019

• ### Notes: Weak Measurement (Quantum Mechanics)

Category:Notes
December 24, 2019

• ### Notes: Topology - Sphere & Torus

Category:Notes
December 23, 2019

• ### Notes: Cox's Theorem

Category:Notes
December 23, 2019

• ### Quantum State

Category:Post
December 22, 2019

The two views of quantum state:

1. Quantum states are $L^2$-normalized complex-valued functions over classical configuration space.
2. Quantum states are unit vectors residing in a complex Hilbert space, $\mathcal{H}$.
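The two views are connected by expanding the state vector in the (generalized) position basis: the wavefunction is the collection of expansion coefficients. A standard bridge, sketched for a single particle on a line:

$$
\lvert\psi\rangle = \int \psi(x)\,\lvert x\rangle\,\mathrm{d}x,
\qquad
\psi(x) = \langle x \vert \psi \rangle,
\qquad
\langle\psi\vert\psi\rangle = \int \lvert\psi(x)\rvert^{2}\,\mathrm{d}x = 1.
$$

The $L^2$-normalization condition in view 1 is exactly the unit-vector condition in view 2.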

• ### Bias-Variance Decomposition For Machine Learning

Category:Post
July 14, 2019

All about the bias-variance decomposition as it pertains to machine learning. All you need to know:

$$
\underbrace{\mathbb{E}_D\big[(f(x; D) - y(x))^2\big]}_{\text{Avg. error}}
= \underbrace{\big(\mathbb{E}_D[f(x; D)] - y(x)\big)^2}_{\mathrm{Bias}_y(f)^2}
+ \underbrace{\mathbb{V}_D\big[f(x; D)\big]}_{\mathrm{Variance}(f)}
$$
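The decomposition can be checked numerically. Below is a minimal Monte Carlo sketch (my own toy setup, not from the post) in which each "training set" $D$ yields an estimate $f(x; D)$ equal to the true value plus a known bias plus Gaussian noise, so the bias and variance terms are known in advance:

```python
import math
import random

# Monte Carlo illustration of the bias-variance decomposition at a fixed input x.
# Assumed toy model: f(x; D) = y(x) + bias + Gaussian noise across datasets D.
random.seed(0)
y = 2.0                    # true target value y(x)
bias, noise_sd = 0.5, 1.0  # E_D[f] - y, and the sd of f across datasets

preds = [y + bias + random.gauss(0.0, noise_sd) for _ in range(200_000)]

avg_error = math.fsum((f - y) ** 2 for f in preds) / len(preds)
mean_f = math.fsum(preds) / len(preds)
bias_sq = (mean_f - y) ** 2
variance = math.fsum((f - mean_f) ** 2 for f in preds) / len(preds)

# E_D[(f(x;D) - y(x))^2] = Bias_y(f)^2 + Variance(f) holds exactly for the
# sample moments (up to floating-point error), not just in expectation.
assert abs(avg_error - (bias_sq + variance)) < 1e-9
print(bias_sq, variance)   # roughly 0.25 and 1.0
```

The decomposition is an algebraic identity, so it holds exactly for the empirical moments; the Monte Carlo sampling only determines how close `bias_sq` and `variance` land to their theoretical values of 0.25 and 1.0.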