# Articles

• ### Bayesian Inference On 1st Order Logic

Category:Post
February 21, 2021

David Chapman’s blog post titled *Probability theory does not extend logic* has stirred up some controversy. In it, Chapman argues that so-called Bayesian logic, as it is currently understood, is limited to propositional logic (0th order logic) and cannot generalize to higher order logics (e.g. predicate logic, a.k.a. 1st order logic), and thus cannot be a general foundation for inference from data under uncertainty.

Chapman provides a few counter-examples that supposedly demonstrate that doing Bayesian inference on statements in 1st order logic is incoherent. I think there is a lot of confusion surrounding this point because Chapman does not use proper probability notation. In the following article I show how Chapman’s examples can be properly written and made sense of using random variables. Hopefully this clarifies some things.
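To make the idea of treating first-order statements as events concrete, here is a minimal sketch (my own illustration, not one of Chapman’s examples): over a finite domain, a predicate can be modeled as a random function from the domain to truth values, and quantified statements become ordinary events with well-defined probabilities.

```python
import itertools

# Illustrative sketch: a "world" assigns a truth value of predicate P
# to each element of a finite domain. All names here are assumptions
# for illustration, not notation from the original post.
domain = ["a", "b", "c"]

# A uniform prior over all 2^3 possible truth assignments.
worlds = list(itertools.product([True, False], repeat=len(domain)))
prior = {w: 1 / len(worlds) for w in worlds}

def prob(statement):
    """Probability of a statement, i.e. a predicate on worlds."""
    return sum(p for w, p in prior.items() if statement(w))

# The quantified statements are just events over worlds:
forall_P = prob(lambda w: all(w))   # P(forall x. P(x))
exists_P = prob(lambda w: any(w))   # P(exists x. P(x))

print(forall_P, exists_P)  # 0.125 0.875
```

With random variables in place, conditioning on observations (e.g. learning that `P(a)` holds) is ordinary Bayesian updating over the worlds.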

• ### Primer to Probability Theory and Its Philosophy

Category:Post
June 19, 2020

Probability is a measure defined on events, which are sets of primitive outcomes. Probability theory mostly comes down to constructing events and measuring them. A measure is a generalization of size, corresponding to notions like length, area, and volume (as opposed to cardinality, which is defined via bijective mappings).
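The events-as-sets picture can be sketched in a few lines (a minimal illustration with an assumed fair-die example, not taken from the post): a probability measure on a finite outcome space just sums the masses of the outcomes in an event.

```python
from fractions import Fraction

# Illustrative finite outcome space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
mass = {o: Fraction(1, 6) for o in outcomes}  # mass per primitive outcome

def prob(event):
    """Measure an event (a subset of outcomes) by summing outcome masses."""
    return sum(mass[o] for o in event)

even = {2, 4, 6}   # the event "roll is even"
high = {5, 6}      # the event "roll is at least 5"

# Inclusion-exclusion, a consequence of additivity over disjoint events:
assert prob(even | high) == prob(even) + prob(high) - prob(even & high)
print(prob(even))  # 1/2
```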

• ### Notes: Probability & AI Curriculum

Category:Notes
June 17, 2020

This is a snapshot of my curriculum for exploring the following questions:

• Is probability theory all you need to develop AI?
• If not, what is missing?
• Should a theory of AI be expressed in the framework of probability theory at all?
• Do brains use probability?

• ### Notes: Dutch Book Argument

Category:Notes
June 11, 2020

• ### Notes: Complete Class Theorems

Category:Notes
June 11, 2020

• ### Primer to Shannon's Information Theory

Category:Post
June 9, 2020

Shannon’s theory of information is usually just called information theory, but is it deserving of that title? Does Shannon’s theory completely capture every possible meaning of the word information? In the grand quests of creating AI and understanding the rules of the universe (i.e. a grand unified theory), information may be key. Intelligent agents search for information and manipulate it. Particle interactions in physics may be viewed as information transfer. The physics of information may be key to interpreting quantum mechanics and resolving the measurement problem.

If you endeavor to answer these hard questions, it is prudent to understand the existing so-called theories of information, so that you can evaluate whether they are powerful enough and take inspiration from them.

Shannon’s information theory is a hard nut to crack. Hopefully this primer gets you far enough along to be able to read a textbook like *Elements of Information Theory*. At the end I start to explore the question of whether Shannon’s theory is a complete theory of information, and where it might be lacking.

This post is long. That is because Shannon’s information theory is a framework of thought. That framework has a vocabulary which is needed to appreciate the whole. I attempt to gradually build up this vocabulary, stopping along the way to build intuition. With this vocabulary in hand, you will be ready to explore the big questions at the end of this post.
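The core of that vocabulary is Shannon entropy, which can be computed in a couple of lines (a minimal sketch using standard definitions, not code from the post):

```python
import math

def entropy(p):
    """Shannon entropy in bits: H = -sum p(x) log2 p(x).
    Zero-probability outcomes contribute nothing and are skipped."""
    return -sum(q * math.log2(q) for q in p if q > 0)

print(entropy([0.5, 0.5]))   # 1.0: a fair coin carries exactly one bit
print(entropy([0.25] * 4))   # 2.0: four equally likely outcomes need two bits
```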

• ### Notes: Wallace - Emergence of particles from QFT

Category:Notes
January 10, 2020

• ### Notes: Visualizing Quantum Field States

Category:Notes
January 10, 2020

• ### Notes: Solomonoff Induction

Category:Notes
December 31, 2019

• ### Notes: Weak Measurement (Quantum Mechanics)

Category:Notes
December 24, 2019

• ### Notes: Topology - Sphere & Torus

Category:Notes
December 23, 2019

• ### Notes: Cox's Theorem

Category:Notes
December 23, 2019

• ### Quantum State

Category:Post
December 22, 2019

The two views of quantum state:

1. Quantum states are $L^2$-normalized complex-valued functions over classical configuration space.
2. Quantum states are unit vectors residing in a complex Hilbert space, $\mathcal{H}$.
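The two views connect through standard notation (a sketch, not an excerpt from the post): the wavefunction is the position-basis component function of the abstract state vector,

```latex
\psi(x) = \langle x \vert \psi \rangle,
\qquad
\langle \psi \vert \psi \rangle
  = \int \lvert \psi(x) \rvert^{2} \, \mathrm{d}x = 1 .
```

so the $L^2$ normalization of view 1 is exactly the unit-vector condition of view 2.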

• ### Bias-Variance Decomposition For Machine Learning

Category:Post
July 14, 2019
$$\newcommand{\Real}{ {\mathbb{R}} } \newcommand{\E}{ {\mathbb{E}} } \newcommand{\V}{ {\mathbb{V}} } \newcommand{\D}{\mathcal{D}} \newcommand{\Var}{\mathrm{Var}} \newcommand{\Bias}{\mathrm{Bias}} \newcommand\Yh{ {\hat{Y}} } \newcommand{\ep}{ {\boldsymbol{\varepsilon}} } \newcommand{\s}{\mathbb{S}} \DeclareMathOperator*{\argmax}{argmax} \DeclareMathOperator*{\argmin}{argmin}$$

All about the bias-variance decomposition as it pertains to machine learning. All you need to know:

$$\begin{align*}
& \E_D\!\left[(f(x; D) - y(x))^2\right] && \textrm{Avg. error}\\
& = \left(\E_D[f(x; D)] - y(x)\right)^2 && \textrm{Bias}_y(f)^2\\
&\phantom{=}\, + \V_D[f(x; D)] && \textrm{Variance}(f)
\end{align*}$$