Consider a probability space whose sample space is $$I \times G \times S$$, with intelligence $$I = \{i^0, i^1\}$$, grade $$G = \{g^0, g^1, g^2\}$$, and SAT score $$S = \{s^0, s^1\}$$. In general, defining a probability measure on this space requires [...] independent parameters.
$$P(I, S, G) = P(S \vert I)P(G \vert I)P(I)$$
This is a factorization of the joint distribution into a product of three conditional probability distributions (CPDs). The parameterization involves three Bernoulli distributions, $$P(I)$$, $$P(S \vert i^0)$$, and $$P(S \vert i^1)$$, and two three-valued multinomial distributions, $$P(G \vert i^0)$$ and $$P(G \vert i^1)$$. The total independent parameter count is thus [...], so the factored representation is more compact than the explicit joint.
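The counting argument can be checked mechanically. The following is a minimal sketch, assuming only the cardinalities given above ($$|I| = 2$$, $$|G| = 3$$, $$|S| = 2$$) and the parent structure implied by the factorization; the names `card`, `parents`, and `cpd_params` are illustrative choices, not part of the original text.

```python
from math import prod

# Cardinalities taken from the text: |I| = 2, |G| = 3, |S| = 2.
card = {"I": 2, "G": 3, "S": 2}

# Explicit joint: one probability per joint outcome,
# minus one because the entries must sum to 1.
joint_params = prod(card.values()) - 1

# Parents of each variable in P(I) P(S | I) P(G | I).
parents = {"I": [], "S": ["I"], "G": ["I"]}

def cpd_params(var):
    """Independent parameters in the CPD of `var` given its parents."""
    # One distribution per joint assignment of the parents
    # (the product over an empty parent list is 1).
    rows = prod(card[p] for p in parents[var])
    # Each distribution over k values has k - 1 free parameters.
    return rows * (card[var] - 1)

factored_params = sum(cpd_params(v) for v in parents)

print(joint_params, factored_params)  # prints "11 7"
```

The same function applies to any discrete factorization of this form: only the `card` and `parents` dictionaries need to change.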
This way of representing the joint has another advantage: modularity. When we added the new variable $$G$$, the joint distribution changed entirely. Had we used the explicit representation of the joint, we would have had to write down twelve new numbers. In the factored representation, we could reuse our local probability models for the variables $$I$$ and $$S$$ and specify only the probability model for $$G$$, namely the CPD $$P(G \vert I)$$. This property will turn out to be invaluable in modeling real-world systems.
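Modularity can be illustrated with a small sketch. All probability values below are hypothetical numbers chosen for illustration only, and the names `p_i`, `p_s_given_i`, `p_g_given_i`, `joint_is`, and `joint_isg` are assumptions of this example rather than anything defined in the text. The point is that adding $$G$$ leaves the existing models for $$I$$ and $$S$$ untouched.

```python
from itertools import product

# Hypothetical numbers, for illustration only.
p_i = {"i0": 0.7, "i1": 0.3}
p_s_given_i = {
    "i0": {"s0": 0.95, "s1": 0.05},
    "i1": {"s0": 0.20, "s1": 0.80},
}

def joint_is(i, s):
    """Joint over (I, S) before G is introduced: P(I) P(S | I)."""
    return p_i[i] * p_s_given_i[i][s]

# Adding G requires only one new local model, the CPD P(G | I).
# The two tables above are reused unchanged.
p_g_given_i = {
    "i0": {"g0": 0.20, "g1": 0.34, "g2": 0.46},
    "i1": {"g0": 0.74, "g1": 0.17, "g2": 0.09},
}

def joint_isg(i, s, g):
    """Joint over (I, S, G): P(I) P(S | I) P(G | I)."""
    return joint_is(i, s) * p_g_given_i[i][g]

# Materialize all twelve entries of the explicit joint.
s_vals = ["s0", "s1"]
g_vals = ["g0", "g1", "g2"]
joint = {
    (i, s, g): joint_isg(i, s, g)
    for i, s, g in product(p_i, s_vals, g_vals)
}

total = sum(joint.values())
print(len(joint), round(total, 10))
```

Summing `joint` over the values of $$G$$ recovers `joint_is(i, s)` for every pair, which confirms that the original two-variable model is preserved inside the extended one.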