Vector spaces have a long history of applications in mathematics and physics. To this collection of applications a new one has been added recently - Large Language Models. It turned out that representing words, phrases and documents as vectors in a high-dimensional vector space provides an effective way to capture semantic relationships and emulate contextual understanding. This theory has nothing to do with LLM's however - it just defines vector space as a mathematical structure as it has been understood from at least the beginning of the XXth century.
The canonical example of a vector space is \(\mathbb{R}^n\) - the set of \(n\)-tuples of real numbers. We can add them adding respective coordinates and scale them by multiplying all coordinates by the same number. In a more abstract approach we start with an abelian group (of vectors) and a field (of scalars) and define an operation of multiplying a vector by a scalar so that the distributive properties \(x(v_1 + v_2) = s v_1 + s v_2\) and \((s_1+s_2)v =s_1 v + s_2 v\) are satisfied for any scalars \(s,s_1,s_2\) and vectors \(v,v_1,v_2\).
A vector space is a field action on an abelian group. The notion of an action is defined in Module_ZF theory as a ring homomorphism valuesd in the ring of endomorphisms of some (abelian) group. In the definition \(S\) is a the ring carrier, \(A\) is the set representing the addition operation of the ring, \(M\) is the ring multiplication, \(V\) is the carrier of the abelian group, \(A_V\) represents the group operation, i.e. the vector addition and \(H\) is the ring homomorphism defining the action.
definition
\( \text{IsVectorSpace}(S,A,M,V,A_V,H) \equiv \) \( \text{IsAfield}(S,A,M) \wedge \text{IsAgroup}(V,A_V) \wedge (A_V \text{ is commutative on } V) \wedge \text{IsAction}(S,A,M,V,A_V,H) \)
The next locale defines context (i.e. common assumptions and notation) when considering vector spaces. We reuse notation from the field0 locale adding more similarly to the module0 locale.
locale vector_space0 = field0 +
assumes mAbGr: \( \text{IsAgroup}(V,A_V) \wedge (A_V \text{ is commutative on } V) \)
assumes mAction: \( \text{IsAction}(K,A,M,V,A_V,H) \)
defines \( \Theta \equiv \text{ TheNeutralElement}(V,A_V) \)
defines \( v_1 +_V v_2 \equiv A_V\langle v_1,v_2\rangle \)
defines \( s \cdot _S v \equiv (H(s))(v) \)
defines \( - v \equiv \text{GroupInv}(V,A_V)(v) \)
defines \( v_1 -_V v_2 \equiv v_1 +_V ( - v_2) \)
We indeed talk about vector spaces in the vector_space0 context.
lemma (in vector_space0) V_vec_space:
shows \( \text{IsVectorSpace}(K,A,M,V,A_V,H) \) using mAbGr, mAction, Field_ZF_1_L1 unfolding IsVectorSpace_defIf a quintuple of sets forms a vector space then the assumptions of the vector_spce0 hold for those sets.
lemma vec_spce_vec_spce_contxt:
assumes \( \text{IsVectorSpace}(K,A,M,V,A_V,H) \)
shows \( vector\_space0(K, A, M, V, A_V, H) \)proofThe assumptions of module0 context hold in the vector_spce0 context.
lemma (in vector_space0) vec_spce_mod:
shows \( module0(K, A, M, V, A_V, H) \)proofPropositions proven in the module0 context are valid in the vector_spce0 context.
using vec_spce_mod
In this section we show that the definition of a vector space as a field action on an abelian group implies the vector space axioms as listed on Wikipedia (March 2024). The first four axioms just state that vectors with addition form an abelian group. That is fine of course, but in such case the axioms for scalars being a field should be listed too, and they are not. The entry on modules is more consistent, it states that module elements form an abelian group, scalars form a ring and lists only four properties of multiplication of scalars by vectors as module axioms. The remaining four axioms are just restatemenst of module axioms and since vector spaces are modules we can prove them by refering to the module axioms proven in the module0 context
Vector addition is associative.
lemma (in vector_space0) vec_spce_ax1:
assumes \( u\in V \), \( v\in V \), \( w\in V \)
shows \( u +_V (v +_V w) = (u +_V v) +_V w \) using assms, group_oper_assocVector addition is commutative.
lemma (in vector_space0) vec_spce_ax2:
assumes \( u\in V \), \( v\in V \)
shows \( u +_V v = v +_V u \) using mAbGr, assms unfolding IsCommutative_defThe zero vector is a vector.
lemma (in vector_space0) vec_spce_ax3a:
shows \( \Theta \in V \) using zero_in_modThe zero vector is the neutral element of addition of vectors.
lemma (in vector_space0) vec_spce_ax3b:
assumes \( v\in V \)
shows \( v +_V \Theta = v \) using assms, zero_neutralThe additive inverse of a vector is a vector.
lemma (in vector_space0) vec_spce_ax4a:
assumes \( v\in V \)
shows \( ( - v) \in V \) using assms, inverse_in_groupSum of of a vector and it's additive inverse is the zero vector.
lemma (in vector_space0) vec_spce_ax4b:
assumes \( v\in V \)
shows \( v +_V ( - v) = \Theta \) using assms, group0_2_L6Scalar multiplication and field multiplication are "compatible" (as Wikipedia calls it).
lemma (in vector_space0) vec_spce_ax5:
assumes \( x\in K \), \( y\in K \), \( v\in V \)
shows \( x\cdot _S(y\cdot _Sv) = (x\cdot y)\cdot _Sv \) using assms, module_ax3Multiplying the identity element of the field by a vector gives the vector.
lemma (in vector_space0) vec_spce_ax6:
assumes \( v\in V \)
shows \( 1 \cdot _Sv = v \) using assms, module_ax4Scalar multiplication is distributive with respect to vector addition.
lemma (in vector_space0) vec_spce_ax7:
assumes \( x\in K \), \( u\in V \), \( v\in V \)
shows \( x\cdot _S(u+_Vv) = x\cdot _Su +_V x\cdot _Sv \) using assms, module_ax1Scalar multiplication is distributive with respect to field addition.
lemma (in vector_space0) vec_spce_ax8:
assumes \( x\in K \), \( y\in K \), \( v\in V \)
shows \( (x + y)\cdot _Sv = x\cdot _Sv +_V y\cdot _Sv \) using assms, module_ax2assumes \( a\in G \), \( b\in G \), \( c\in G \)
shows \( a\cdot (b\cdot c) = a\cdot b\cdot c \)assumes \( x\in \mathcal{M} \)
shows \( x +_V \Theta = x \) and \( \Theta +_V x = x \)assumes \( x\in G \)
shows \( x^{-1}\in G \)assumes \( x\in G \)
shows \( x\cdot x^{-1} = 1 \wedge x^{-1}\cdot x = 1 \)assumes \( r\in R \), \( s\in R \), \( x\in \mathcal{M} \)
shows \( (r\cdot s)\cdot _Sx = r\cdot _S(s\cdot _Sx) \)assumes \( x\in \mathcal{M} \)
shows \( 1 \cdot _Sx = x \)assumes \( r\in R \), \( x\in \mathcal{M} \), \( y\in \mathcal{M} \)
shows \( r\cdot _S(x+_Vy) = r\cdot _Sx +_V r\cdot _Sy \)assumes \( r\in R \), \( s\in R \), \( x\in \mathcal{M} \)
shows \( (r + s)\cdot _Sx = r\cdot _Sx +_V s\cdot _Sx \)