What is the VC dimension?

289    Asked by DavidEDWARDS in Data Science , Asked on Feb 14, 2023

 I'm studying machine learning from Andrew Ng Stanford lectures and just came across the theory of VC dimensions. According to the lectures and what I understood, the definition of VC dimension can be given as,


If you can find a set of n points, so that it can be shattered by the classifier (i.e. classify all possible 2n labeling correctly) and you cannot find any set of n+1 points that can be shattered (i.e. for any set of n+1 points there is at least one labeling order so that the classifier can not separate all points correctly), then the VC dimension is n.


Also the Professor took an example and explained this nicely. Which is:

Let,

H={set of linear classifiers in 2 Dimensions}

Then any 3 points can be classified by H correctly with separating hyper planes as shown in the following figure.

And that's why the VC dimension of H is 3. Because for any 4 points in a 2D plane, a linear classifier can not shatter all the combinations of the points. For example,

For this set of points, there is no separating hyperplane that can be drawn to classify this set. So the VC dimension is 3.

I got the idea till here. But what if we've followed the following type of pattern?

Or the pattern where three points coincide with each other. Here also we can not draw a separating hyperplane between 3 points. But still this pattern is not considered in the definition of the VC dimension. Why? The same point is also discussed in the lectures I'm watching Here at 16:24 but the professor does not mention the exact reason behind this.

Answered by Diya tomar

The definition of VC dimension is: if there exists a set of n points that can be shattered by the classifier and there is no set of n+1 points that can be shattered by the classifier, then the VC dimension of the classifier is n. The definition does not say: if any set of n points can be shattered by the classifier... If a classifier's VC dimension is 3, it does not have to shatter all possible arrangements of 3 points. If of all arrangements of 3 points you can find at least one such arrangement that can be shattered by the classifier, and cannot find 4 points that can be shattered, then VC dimension is 3.



Your Answer

Interviews

Parent Categories