International Womens Day : Flat 30% off on live classes + 2 free self-paced courses - SCHEDULE CALL

Select Course
Blog
Corporate Training

+1 202 599 3842

(4.8/5 ) | 1.5K+ Ratings

- Data Science Blogs -

How to import Data into R using Excel, CSV, Text and XML

Importing Data into R

R packages provide some sample data to learn the tools of data science but in real-time you have to work on your custom data for real-world applications. The objective of this tutorial is to explain how to read plain-text rectangular files into R. We will mainly focus on data import packages, but many of the principles will translate to other forms of data. We’ll explore some concepts of packages that are useful for other types of data.

Importing external data in R environment means that we can read data from external files, write data to external files, and can access those files from outside the R environment. File formats like csv, xml, xlsx, json, and web data can be imported into the R environment to read the data and perform data analysis, data manipulations and after data analysis data in R can be exported to external files in the same file formats.

Importing TXT/CSV Files to R

Using Base Functions

The methods given below are the base functions of R for importing different types of external files, so no separate packages are required to install to use these functions.

Each of the functions come with default arguments which separates them from each other. These arguments are:-

header: logical Boolean value. If the value is TRUE, then function assumes that your file has a header row. If headers are not given in excel file, then give value as FALSE.
fill: logical value. For TRUE value, unequal length rows will be added with blank fields implicitly.
sep: the field separator character. For example, “\t” is used for a tab-delimited file and a comma is used for comma-separated files.
dec: the character used for decimal points.
stringsAsFactor This should be set to TRUE if you want your text data to be converted to factors while importing.

We also need to give filename with a complete path or complete URL of the file is on the web.

Data Science Training - Using R and Python

Detailed Coverage
Best-in-class Content
Prepared by Industry leaders
Latest Technology Covered

Download Curriculum

Reading a Local File:

To import a file located in your local machine, follow the below steps: -

Set your working directory to point to the folder path containing your file with the command setwd ("") and then provide the file name in the function.
Use choose() within the import function. This will ask to select a file from your local machine.

Using the readr Package

The functions in this package (more or less) are used in a similar way as of the base functions. The main thing is that readr package functions work almost 10 times much faster than the base functions. So in the case of bulk files, importing this package will give very good speed in importing data.

delim: separates character values in the data file.
col_names: can be either TRUE (default value), FALSE or a character vector specifying column names. If it is TRUE, then the first row of the file will be returned as column names.

Reading CSV Files

CSV file is a kind of text file in which the values in columns are separated by a comma.

First, we have to set our working directory with the setwd() function to the directory where the file is located locally.

For example:

setwd("C:\Users\sarvendra.singh\Desktop\Janbask\Blogs\New blogs_july19\Random forest in R”)

Then to read values from CSV, there is a built-inunction read.csv() that export the data from the file as a data frame.

For example:


read.data <- read.csv("file1.csv")
print(read.data)

Analyzing a CSV File


#For printing number of columns
print(ncol(read.data))
Output:
[1] 5

#For printing  number of rows
print(nrow(read.data))
Output:
[1] 8

Writing to a CSV File

Writing values of pandas dataframe to an external CSV file, we use the write.csv() function.


per.sal <- subset(read.data, empsalary >= "30000" & empsalary <= "40000")
# Writing data into a new CSV file
write.csv(per.sal,"output.csv")
new.data <- read.csv("output.csv")
print(new.data)

Reading XML Files

XML (Extensible Markup Language) file contains both data and file format using the ASCII text. It is a little similar to an HTML file that contains markup tags, but the tags in an XML file describe the meaning of the data contained in the file rather than the structure of the page.

To load data of XML file in R, we have to install the XML package, as described below-


install.packages("XML")
To read XML files, we use the in-built function xmlParse().
For example:
#To load required xml package to read XML files
library("XML")
#To load other required packages
library("methods")
#To give the input file name to the function
newfile <- xmlParse(file = "samplefile.xml")
print(newfile)

Converting an XML to a Data Frame

Read: How Satistical Inference Like Terms Helps In Analysis?

To perform data analysis effectively after importing data in R, we convert the data in an XML file to a Data Frame. After converting, we can perform data manipulation and other operations as performed in a data frame.

For example:


library("XML")
library("methods")
#To convert the data in xml file to a data frame
xmlRdataframe <- xmlToDataFrame("samplefile.xml")
print(xmlRdataframe)

Output:

	ID	NAME	SALARY	STARTDATE	DEPT
1	1	Sam	32000	01/01/2001	HR
2	2	Rob	36000	09/03/2006	IT
3	3	Max	42000	01/05/2011	Sales
4	4	Ivar	50000	25/01/2001	Tech
5	5	Robert	25000	13/07/2015	Sales
6	6	Leon	57000	05/01/2000	IT
7	7	Samuel	45000	27/03/2003	Operations
8	8	Jack	24000	06/01/2016	Sales

Reading JSON Files

To import JSON files data into R, we first need to install or load the rjson package. JSON (JavaScript Object Notation) file is generally used for data exchange between a web application and a server. They are text-based human-readable files and editable by a normal text editor.

install.packages("rjson")

Now to read json files, we use the in-built function from JSON() which stores the data as a list.

For example:


#To load rjson package
library("rjson")
#To give the file name to the function
newfile <- fromJSON(file = "samplefile1.json")
#To print the file
print(newfile)

Output:
$ID

[1] "1" "2" "3" "4" "5" "6" "7" "8"

$Name

[1] "Sam"          "Rob"   "Max"  "Robert" "Ivar"   "Leon"   "Samuel" "Ivar"

$Salary

[1] "32000" "27000" "35000" "25000" "37000" "41000" "36000" "51000"

$StartDate

[1] "1/1/2001"  "9/3/2003"  "1/5/2004"  "14/11/2007" "13/7/2015" "4/3/2007"
[7] "27/3/2013"  "25/7/2000"

$Dept
[1] "IT"               "HR"                   "Tech"               "HR"                   "Sales"                "HR"
[7] "Operations" "IT"

Converting a JSON File to a Data Frame

To convert JSON file to a Data Frame, we use the as.data.frame() function.

For example:


library("rjson")
newfile <- fromJSON(file = "samplefile1.json")
#To convert a JSON file to a data frame
jsonRdataframe <- as.data.frame(newfile)
print(jsonRdataframe)

Output:

	ID	NAME	SALARY	STARTDATE	DEPT
1	1	Sam	32000	01/01/2001	IT
2	2	Rob	27000	09/03/2003	HR
3	3	Max	35000	01/05/2004	Tech
4	4	Ivar	25000	14/11/2007	HR
5	5	Robert	37000	13/07/2015	Sales
6	6	Leon	41000	04/03/2007	HR
7	7	Samuel	36000	27/03/2013	Operations
8	8	Jack	51000	25/07/2000	IT

Data Science Training - Using R and Python

No cost for a Demo Class

Industry Expert as your Trainer

Available as per your schedule

Customer Support Available

Enrol For a Free Demo Class

Reading Excel Files

Microsoft Excel is a very popular file format that stores data in xls and xlsx format. We can read data from excel to R and write data from R back to Excel file using the readxl package in R.

To install the readxl package, run the following command


install.packages("readxl")
read_excel() function is used to import file into R and  stores it as a data frame.
newfile <- read_excel("sheet1.xlsx)
print(newfile)

Output:

	ID	NAME	DEPT	SALARY	AGE
1	1	SAM	SALES	32000	35
2	2	ROB	HR	36000	23
3	3	MAC	IT	37000	40
4	4	IVAR	IT	25000	37
5	5	MAX	R&D	30000	22
6	6	ROBERT	HR	27000	32
7	7	SAMUEL	FINANCE	50000	41
8	8	RAGNAR	SALES	45000	29

Reading HTML Tables

Reading and retrieving HTML tables from the web, we use the XML and RCurl packages in R programming.

Below scripts are used to install XML and Rcurl libraries.


install.packages("XML")
install.packages("RCurl")
For load the packages, use the following command:
library("XML")
library("RCurl")

For example, we will fetch the ‘Fiscal annual growth table from a url using the readHTMLTable() function which stores it as a Data Frame.


#To fetch a table from any website paste the url
url <- "https://en.wikipedia.org/wiki/Fiscal_Annuak_Growth#Ranking"
tabs <- getURL(url)
#To fetch the first table, if the webpage has more than one table, we use which = 1
tabs <- readHTMLTable(tabs,which = 1, stringsAsFactors = F)
head(tabs)

Output:

	V1	V2	V3	V4	V5	V6	V7	V8	V9	V10	V11	V12	V13
1	Classification	Jurisdiction	2019	2018	2017	2016	2015	2014	2013	2012	2011	2010	2009
2	Very High	New Zealand	1	1	1	2	2	3	3	3	3	2	2
3	Very Low	Singapore	2	2	2	1	1	1	1	1	1	1	1
4	Medium	Denmark	3	3	3	3	4	5	5	5	6	6	5
5	Outstanding	Hong Kong	4	5	4	4	3	2	2	2	2	3	4
6	Very Easy	South Korea	5	4	5	5	5	7	8	8	16	19	23

str() function to analyze the structure of the data frame.

Read: What is Data Science? Learn from This Data Science Tutorial

str(tabs)

Output:


'data.frame':  191 obs. of  16 variables:
$ V1 : chr  "Classification" "Very Easy" "Very Easy" "Very Easy" ...
$ V2 : chr  "Jurisdiction" "New Zealand" "Singapore" "Denmark" ...
$ V3 : chr  "2019" "1" "2" "3" ...
$ V4 : chr  "2018" "1" "2" "3" ...
$ V5 : chr  "2017" "1" "2" "3" ...
$ V6 : chr  "2016" "2" "1" "3" ...
$ V7 : chr  "2015" "2" "1" "4" ...
$ V8 : chr  "2014" "3" "1" "5" ...
$ V9 : chr  "2013" "3" "1" "5" ...
$ V10: chr  "2012" "3" "1" "5" ...
$ V11: chr  "2011" "3" "1" "6" ...
$ V12: chr  "2010" "2" "1" "6" ...
$ V13: chr  "2009" "2" "1" "5" ...
$ V14: chr  "2008" "2" "1" "5" ...
$ V15: chr  "2007" "2" "1" "7" ...
$ V16: chr  "2006" "1" "2" "8" ...

Reading from SPSS File

foreign package is required to install to read SPSS files in R. ”read” function is used to read and SPS files in R."to.data.frame" option is used to return a data frame from reading spss file.


library(foreign)  # load the foreign package
help(read.spss)  # documentation
mydata = read.spss("myfile", to.data.frame=TRUE)

Reading from Minitab File

To read Minitab Portable Worksheet format in R, function read.mtp can be used from the foreign package. It returns a list of components in the Minitab worksheet.


library(foreign)    # load the foreign package
help(read.mtp)     # documentation
mydata = read.mtp("mydata.mtp")  # read from .mtp file

Connecting to MYSQL Server from R

In Real-world applications data is mostly located in external databases, not in CSV files. So it is must to know how to make a connection and access data in R from SQL/NO SQL Databases.

Connecting R to MySQL is easy with the RMySQL package. First, install the package and load the library.


install.packages("RMySQL")
library(RMySQL)

Connecting to MySQL Database:

Once the RMySQL library is installed create a database connection object.credentials like username, password needs to be provided as arguments in dbconnect function.


mysqldb = dbConnect(MySQL(), user='user', password='password', dbname='database_name', host='host')

Listing Tables and columns:

After the connection has been established we list the tables and fields in the database we connected to.

dbListTables(mysqldb)

A list of the tables will be returned in our connection.

dbListFields(mysqldb, 'some_table')

This will return a list of the fields in some_table.

Executing Queries:

With dbSendQuery function, we can run queries on tables.

dbSendQuery(mydb, 'drop table if exists some_table, some_other_table')

Creating Tables:

We can create tables in the database using R data frames.

dbWriteTable(mydb, name='table_name', value=data.frame.name)

Retrieving data from MySQL:

To retrieve data from the database we need to save a results set object.

Read: PCA - A Simple & Easy Approach for Dimensionality Reduction

rs = dbSendQuery(mydb, "select * from some_table")

Query results will remain on the MySQL server. To access the results in R, we need to use the fetch function.

data = fetch(rs, n=-1)

fetch command will save the results of the query in a data frame object. The n in the function specifies the number of records to retrieve, using n=-1 retrieves all pending records.

Connecting R to MongoDB

MongoDB connection with R is quite straightforward. The Package required to install for this connection is RMongo package. New features of MongoDB which are implemented in the last few years have not been included in the community R drivers.


require(RMongo)
mongo <- mongoDbConnect('kenblog', 'localhost', 27017)

In the mongoDbConnect method, we need to give the name of the database, server name, and port number to which we want to connect.

Now to retrieve records, we create a query. For this example, let’s get only the examdata from our scores collection. We can use the dbGetQuery method for this which takes a connection object, the collection name, and the query.

examQuery <- dbGetQuery(mongo, 'scores', "{'type': 'exam'}")

This loads in all the records from our scores collection of type exam. Let’s take the values of our exam scores and create a vector from them.

exam_scores <- examQuery[c('score')]

Now we can get a simple summary of our data with command summary(exam_scores):


score
Min.   : 60.00
1st Qu.: 72.00
Median : 79.00
Mean   : 79.45
3rd Qu.: 86.00
Max.   :100.00

Data Science Training - Using R and Python

Personalized Free Consultation

Access to Our Learning Management System

Access to Our Course Curriculum

Be a Part of Our Free Demo Class

Sign Up Now

PostgreSQL Connection with R

If we are working with large datasets that may consume lots of memory of our system, its better to have the data loaded in own server with an SQL/PostgreSQL database on it, where you can query the data in smaller digestible chunks

Let’s learn how to connect the PostgreSQL database with R. We need to install RPostgreSQL package for this approach.

To connect, we need to enter the following commands in R:


# install.packages("RPostgreSQL")
require("RPostgreSQL")
# create a connection
# save the password that we can "hide" it as best as we can by collapsing it
pw <- {
"new_user_password"
}
# loads the PostgreSQL driver
drv <- dbDriver("PostgreSQL")
# creates a connection to the postgres database
# note that "con" will be used later in each connection to the database
con <- dbConnect(drv, dbname = "postgres",
host = "localhost", port = 5432,
user = "openpg", password = pw)
rm(pw) # removes the password
# check for the cartable
dbExistsTable(con, "cartable")

Library	Objective	Function
base	Export csv	write.csv()
xlsx	Export excel	write.xlsx()
haven	Export spss	write_sav()
haven	Export sas	write_sas()
haven	Export stata	write_dta()

base	Export R	save()
googledrive	Upload Google Drive	drive_upload()
googledrive	Open in Google Drive	drive_browse()
googledrive	Retrieve file ID	drive_get(as_id())
googledrive	Dowload from Google Drive	download_google()
googledrive	Remove file from Google Drive	drive_rm()
rdrop2	Authentification	drop_auth()
rdrop2	Create a folder	drop_create()
rdrop2	Upload to Dropbox	drop_upload()
rdrop2	Read csv from Dropbox	drop_read_csv
rdrop2	Delete file from Dropbox	drop_delete()

Conclusion

Reading data from different sources for analysis and exporting the results to some other system for report writing can be frustrating tasks because this can only take a lot of time compared to statistical analysis. Below is the summarised table having a library and corresponding functions name to import a particular file in R.

Were you able to understand how to import data into R? Now that you can import data in R, Login to Janbask Training and learn about more such techniques!

Read: An Easy Way to Understand Adaboost

Data Science Tutorial Overview

Introduction

Careers

Data Science Vs. Different Technologies

Tools

Useful Resources

Interview

FaceBook

Twitter

JanBask Training Team

The JanBask Training Team includes certified professionals and expert writers dedicated to helping learners navigate their career journeys in QA, Cybersecurity, Salesforce, and more. Each article is carefully researched and reviewed to ensure quality and relevance.

Comments

Data Science Course
Upcoming Batches

Aug

Mon - Fri

6 Weeks

Aug

Mon - Fri

6 Weeks

Aug

Mon - Fri

6 Weeks

Aug

Mon - Fri

6 Weeks

View Detail

Trending Courses

Cyber Security

Introduction to cybersecurity
Cryptography and Secure Communication
Cloud Computing Architectural Framework
Security Architectures and Models

Upcoming Class

6 days 25 Jul 2025

View Details

Introduction and Software Testing
Software Test Life Cycle
Automation Testing and API Testing
Selenium framework development using Testing

Upcoming Class

6 days 25 Jul 2025

View Details

Salesforce

Salesforce Configuration Introduction
Security & Automation Process
Sales & Service Cloud
Apex Programming, SOQL & SOSL

Upcoming Class

4 days 23 Jul 2025

View Details

Business Analyst

BA & Stakeholders Overview
BPMN, Requirement Elicitation
BA Tools & Design Documents
Enterprise Analysis, Agile & Scrum

Upcoming Class

6 days 25 Jul 2025

View Details

MS SQL Server

Introduction & Database Query
Programming, Indexes & System Functions
SSIS Package Development Procedures
SSRS Report Design

Upcoming Class

6 days 25 Jul 2025

View Details

Data Science

Data Science Introduction
Hadoop and Spark Overview
Python & Intro to R Programming
Machine Learning

Upcoming Class

13 days 01 Aug 2025

View Details

DevOps

Intro to DevOps
GIT and Maven
Jenkins & Ansible
Docker and Cloud Computing

Upcoming Class

-0 day 19 Jul 2025

View Details

Hadoop

Architecture, HDFS & MapReduce
Unix Shell & Apache Pig Installation
HIVE Installation & User-Defined Functions
SQOOP & Hbase Installation

Upcoming Class

7 days 26 Jul 2025

View Details

Python

Features of Python
Python Editors and IDEs
Data types and Variables
Python File Operation

Upcoming Class

6 days 25 Jul 2025

View Details

Artificial Intelligence

Components of AI
Categories of Machine Learning
Recurrent Neural Networks
Recurrent Neural Networks

Upcoming Class

9 days 28 Jul 2025

View Details

Machine Learning

Introduction to Machine Learning & Python
Machine Learning: Supervised Learning
Machine Learning: Unsupervised Learning

Upcoming Class

6 days 25 Jul 2025

View Details

Tableau

Introduction to Tableau Desktop
Data Transformation Methods
Configuring tableau server
Integration with R & Hadoop

Upcoming Class

7 days 26 Jul 2025

View Details

Browse Categories

Data Science vs Machine Learning- Career That is Right for You

Apr 08, 2024 eye-dark

3.8k

The Battle Between R and Python

Dec 22, 2019 eye-dark

4.2k

Logistic Regression is Easy to Understand

Jan 24, 2020 eye-dark

5.8k

Search Posts

Reset

Data Science vs Machine Learning- Career That is Right for You 3.8k

The Battle Between R and Python 4.2k

Logistic Regression is Easy to Understand 5.8k

Random Forest: An Easy Explanation of the Forest 5.4k

Difference Between Data Scientist and Data Analyst 430.9k

Data Science Course
Upcoming Batches

Aug

Mon - Fri

6 Weeks

Aug

Mon - Fri

6 Weeks

Aug

Mon - Fri

6 Weeks

Aug

Mon - Fri

6 Weeks

View Detail

Receive Latest Materials and Offers on Data Science Course

By submitting my contact details, I agree Privacy Policy ... and I consent to receiving SMS/call/email, including marketing and promotional SMS. Read More

Scroll

How to import Data into R using Excel, CSV, Text and XML

Contents Index

Importing Data into R

Importing TXT/CSV Files to R

Using Base Functions

Reading a Local File:

Using the readr Package

Reading CSV Files

Writing to a CSV File

Reading XML Files

Reading JSON Files

Data Science Training - Using R and Python

No cost for a Demo Class

Industry Expert as your Trainer

Available as per your schedule

Customer Support Available

Enrol For a Free Demo Class

Reading Excel Files

Reading HTML Tables

Reading from SPSS File

Reading from Minitab File

Connecting to MYSQL Server from R

Connecting to MySQL Database:

Retrieving data from MySQL:

Connecting R to MongoDB

Data Science Training - Using R and Python

Personalized Free Consultation

Access to Our Learning Management System

Access to Our Course Curriculum

Be a Part of Our Free Demo Class

Sign Up Now

PostgreSQL Connection with R

Data Science Tutorial Overview

Introduction

Careers

Data Science Vs. Different Technologies

Tools

Useful Resources

Interview

JanBask Training Team

Comments

Trending Courses

Browse Categories

Related Posts

How to import Data into R using Excel, CSV, Text and XML

Contents Index

Importing Data into R

Importing TXT/CSV Files to R

Using Base Functions

Reading a Local File:

Using the readr Package

Reading CSV Files

Writing to a CSV File

Reading XML Files

Reading JSON Files

Data Science Training - Using R and Python No cost for a Demo Class Industry Expert as your Trainer Available as per your schedule Customer Support Available Enrol For a Free Demo Class

Reading Excel Files

Reading HTML Tables

Reading from SPSS File

Reading from Minitab File

Connecting to MYSQL Server from R

Connecting to MySQL Database:

Retrieving data from MySQL:

Connecting R to MongoDB

Data Science Training - Using R and Python Personalized Free Consultation Access to Our Learning Management System Access to Our Course Curriculum Be a Part of Our Free Demo Class Sign Up Now

PostgreSQL Connection with R

Data Science Tutorial Overview

Introduction

Careers

Data Science Vs. Different Technologies

Tools

Useful Resources

Interview

JanBask Training Team

Comments

Trending Courses

Browse Categories

Related Posts

Data Science Training - Using R and Python

No cost for a Demo Class

Industry Expert as your Trainer

Available as per your schedule

Customer Support Available

Enrol For a Free Demo Class

Data Science Training - Using R and Python

Personalized Free Consultation

Access to Our Learning Management System

Access to Our Course Curriculum

Be a Part of Our Free Demo Class

Sign Up Now