Can we consider ALTER TABLE DROP COLUMN in SQL Server a metadata-only operation?
I have found several sources stating that ALTER TABLE DROP COLUMN is a metadata-only operation, but I want to know how this is possible. Is the data not purged from the underlying non-clustered indexes and the clustered index or heap when the column is dropped? Also, why does Microsoft Docs say that it is a logged operation? And how does the engine track dropped columns if the data is not erased from the underlying pages?
The column definitions for a given table are not stored in the pages that contain its rows. They are stored only in the database metadata, in system tables such as sys.sysrscols and sys.sysrowsets.
When you drop a column that is not referenced by any other object, the storage engine simply marks the column definition as no longer present. It does this by deleting the pertinent details from several system tables. Deleting the metadata invalidates the procedure cache, which forces a recompile the next time a query references that table. Since the recompile returns only the columns that currently exist in the table, the details of the dropped column are never requested, and the storage engine skips the bytes stored on each page for that column as if it no longer existed.
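A minimal T-SQL sketch of this behavior follows. The table dbo.DropColumnDemo and its columns are hypothetical, and sys.system_internals_partition_columns is an undocumented internal view whose shape may differ between versions, so treat the output as illustrative rather than guaranteed. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical demo table; all names here are illustrative only.
CREATE TABLE dbo.DropColumnDemo
(
    Id      int         NOT NULL PRIMARY KEY CLUSTERED,
    Payload char(2000)  NOT NULL,
    Note    varchar(50) NULL
);
GO

-- Load some rows so the table occupies a noticeable number of pages.
INSERT INTO dbo.DropColumnDemo (Id, Payload, Note)
SELECT TOP (1000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
       REPLICATE('x', 2000),
       'n'
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
GO

-- The drop itself: expected to return quickly because only metadata changes.
ALTER TABLE dbo.DropColumnDemo DROP COLUMN Payload;
GO

-- The documented catalog view no longer lists the column...
SELECT c.name, c.column_id
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.DropColumnDemo');

-- ...while the internal (undocumented) view still tracks it, flagged as dropped.
SELECT pc.partition_column_id,
       pc.max_inrow_length,
       pc.is_dropped
FROM sys.partitions AS p
JOIN sys.system_internals_partition_columns AS pc
     ON pc.partition_id = p.partition_id
WHERE p.object_id = OBJECT_ID(N'dbo.DropColumnDemo');
```

If the internal view behaves as assumed, the row for the former Payload column should still appear with is_dropped = 1, which is how the engine knows to skip those bytes on each page.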
When a subsequent DML operation runs against the table, the affected pages are rewritten without the data for the dropped column. Likewise, if you rebuild the clustered index or heap, the bytes for the dropped column are not written back to the pages. This spreads the cost of dropping the column over time.
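If you would rather reclaim the space right away than wait for later DML, a rebuild is one way to do it. The sketch below assumes a hypothetical table named dbo.DropColumnDemo from which a wide fixed-length column has already been dropped; substitute your own table name. It compares the in-row page count before and after the rebuild.

```sql
-- Page count before the rebuild (index_id 0 = heap, 1 = clustered index).
SELECT ps.index_id, ps.in_row_used_page_count, ps.row_count
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'dbo.DropColumnDemo')
  AND ps.index_id IN (0, 1);

-- Rewrite every row; bytes belonging to dropped columns are not carried over.
-- ALTER TABLE ... REBUILD works for both heaps and clustered tables.
ALTER TABLE dbo.DropColumnDemo REBUILD;

-- Page count after the rebuild; expected to be lower if the dropped column was wide.
SELECT ps.index_id, ps.in_row_used_page_count, ps.row_count
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'dbo.DropColumnDemo')
  AND ps.index_id IN (0, 1);
```

A rebuild is a size-of-data operation, so on a large table it gives up the benefit of the quick metadata-only drop. Schedule it for a maintenance window if that matters.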
However, dropping a column directly is not always possible, for example when the column is part of an index. In that case, the referencing object must be altered or dropped before the column itself can be dropped.
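As a rough illustration of that order of operations, the sketch below uses a hypothetical table dbo.DropColumnIndexed with a non-clustered index on the column to be dropped. It first lists the indexes that reference the column, then removes the index, then drops the column. All object names are assumptions made for the example.

```sql
-- Hypothetical table with an index on the column we want to drop.
CREATE TABLE dbo.DropColumnIndexed
(
    Id   int         NOT NULL PRIMARY KEY CLUSTERED,
    Note varchar(50) NULL
);
CREATE INDEX IX_DropColumnIndexed_Note ON dbo.DropColumnIndexed (Note);
GO

-- Find indexes that reference the column; a direct DROP COLUMN at this
-- point is expected to fail with a dependency error.
SELECT i.name AS index_name
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
     ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
     ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.DropColumnIndexed')
  AND c.name = N'Note';
GO

-- Remove the dependency first, then drop the column.
DROP INDEX IX_DropColumnIndexed_Note ON dbo.DropColumnIndexed;
ALTER TABLE dbo.DropColumnIndexed DROP COLUMN Note;
```

Indexes are only one kind of dependency. Constraints, computed columns, and schema-bound objects that reference the column would need the same treatment.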