A user wants to compute tf-idf on a very large dataset and write the result to a CSV file with a single column that lists each term with its tf-idf score, sorted in non-decreasing order. How can this be done?
The above code works on small inputs but crashes on a large document collection.
To solve this problem, do not coerce the TDM to a dense matrix. With so many documents, that will most likely cause an integer overflow. The tm package uses the slam package to represent TDMs/DTMs as sparse matrices, and slam provides functions for row-wise and column-wise operations (such as row_sums and col_sums) that work without coercing to a dense matrix.
The following code should fix the problem:
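As a minimal sketch of the slam-based approach (not necessarily the exact code from the original answer), the example below sums each term's tf-idf weight across documents directly on the sparse matrix and writes the sorted result to CSV. The toy corpus, the file name tfidf.csv, and the choice to aggregate by summing over documents are assumptions made for illustration. It writes the term and its score as two fields; they could be joined into one column with paste() if a strict single column is needed.

```r
library(tm)
library(slam)

# Hypothetical toy corpus standing in for the large dataset.
docs <- c("apple banana apple",
          "banana cherry",
          "cherry cherry apple date")
corpus <- VCorpus(VectorSource(docs))

# Build a sparse term-document matrix with tf-idf weighting.
tdm <- TermDocumentMatrix(corpus, control = list(weighting = weightTfIdf))

# Sum each term's tf-idf over all documents without calling as.matrix();
# row_sums() operates on the sparse triplet representation.
term_scores <- slam::row_sums(tdm)

# sort() is increasing by default, which gives non-decreasing order.
term_scores <- sort(term_scores)

out <- data.frame(term = names(term_scores),
                  tfidf = as.numeric(term_scores),
                  stringsAsFactors = FALSE)

# Assumed output file name.
write.csv(out, "tfidf.csv", row.names = FALSE)
```

Because the matrix is never densified, memory use grows with the number of non-zero entries rather than with terms times documents.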