Why do we use group by 1 2 in SQL query?

458    Asked by AndreaBailey in SQL Server , Asked on Oct 4, 2022

In SQL queries, we do use Group by clause to apply aggregate functions. But what is the purpose behind using numeric value instead of column name with Group by clause? For example: Group by 1.

Answered by Amit Sinha
We use group by 1 2 because -

Consider below case:
+------------+--------------+-----------+
| date | services | downloads |
+------------+--------------+-----------+
| 2016-05-31 | Apps | 1 |
| 2016-05-31 | Applications | 1 |
| 2016-05-31 | Applications | 1 |
| 2016-05-31 | Apps | 1 |
| 2016-05-31 | Videos | 1 |
| 2016-05-31 | Videos | 1 |
| 2016-06-01 | Apps | 3 |
| 2016-06-01 | Applications | 4 |
| 2016-06-01 | Videos | 2 |
| 2016-06-01 | Apps | 2 |
+------------+--------------+-----------+
You've to find out the number of downloads per service per day considering Apps and Applications as the same service. Grouping by date, services would result in Apps and Applications being considered separate services.
In that case, query would be:
 select date, services, sum(downloads) as downloads
 from test.zvijay_test
 group by date,services
And Output:
+------------+--------------+-----------+
| date | services | downloads |
+------------+--------------+-----------+
| 2016-05-31 | Applications | 2 |
| 2016-05-31 | Apps | 2 |
| 2016-05-31 | Videos | 2 |
| 2016-06-01 | Applications | 4 |
| 2016-06-01 | Apps | 5 |
| 2016-06-01 | Videos | 2 |
+------------+--------------+-----------+
But this is not what you want since Applications and Apps to be grouped is the requirement. So what can we do?
One way is to replace Apps with Applications using a CASE expression or the IF function and then grouping them over services as:
select
  date,
  if(services='Apps','Applications',services) as services,
  sum(downloads) as downloads
from test.zvijay_test
group by date,services
But this still groups services considering Apps and Applications as different services and gives the same output as previously:
+------------+--------------+-----------+
| date | services | downloads |
+------------+--------------+-----------+
| 2016-05-31 | Applications | 2 |
| 2016-05-31 | Applications | 2 |
| 2016-05-31 | Videos | 2 |
| 2016-06-01 | Applications | 4 |
| 2016-06-01 | Applications | 5 |
| 2016-06-01 | Videos | 2 |
+------------+--------------+-----------+

Grouping over a column number allows you to group data on an aliased column.

select

  date,
  if(services='Apps','Applications',services) as services,
  sum(downloads) as downloads
from test.zvijay_test
group by date,2;
And thus giving you desired output as below:
+------------+--------------+-----------+
| date | services | downloads |
+------------+--------------+-----------+
| 2016-05-31 | Applications | 4 |
| 2016-05-31 | Videos | 2 |
| 2016-06-01 | Applications | 9 |
| 2016-06-01 | Videos | 2 |
+------------+--------------+-----------+

I've read many times that this is a lazy way of writing queries or grouping over an aliased column does not work in MySQL, but this is the way of grouping over aliased columns. This isn't the preferred way of writing queries, use it only when you really need to group over an aliased column.



Your Answer

Interviews

Parent Categories