Data Description

I went to Falls Creek for a ski trip before my Master's course commenced. During my three-day, two-night stay, the weather was pretty bad, so I wanted to understand more about the structure of the weather records from the Bureau of Meteorology. I downloaded the data file “IDCJDW3027.201907.csv” (which contains the July 2019 weather records for Falls Creek) and read it into R.

My data set includes:

  • numeric variables (the columns below are all numeric):
    • Min Temp
    • Max Temp
    • Rainfall (mm)
    • Speed of maximum wind gust (km/h)
    • 9am Temperature
    • 9am relative humidity
    • 3pm Temperature
    • 3pm relative humidity
  • qualitative (categorical) variables (the columns below are all qualitative):
    • Direction of maximum wind Gust
    • 9am wind direction
    • 3pm wind direction

Read/Import Data

I read and imported the data into R as a data frame named “fallsCreekJulyWeather”, using the readr package. The first 5 rows of the file are plain-text information, which is irrelevant and should not be read in. Row 6 is the column header. When read into R, the °C symbol garbled the “Min Temp” and “Max Temp” headers, so I decided to skip that row as well and simply specified the column names in the col_names argument.

Some columns, such as Evaporation (mm), Sunshine (hours) and 9am cloud amount (oktas), contain no data. I skipped those columns by specifying “-” in the col_types argument. I specified the following codes in the col_types argument to read in all the other fields properly:

  • “f” for qualitative variables
    • Direction of maximum wind Gust
    • 9am wind direction
    • 3pm wind direction
  • “i” for the following numeric variables which are integer
    • Speed of maximum wind gust (km/h)
    • 9am relative humidity
    • 3pm relative humidity
  • “d” for the below numeric variables which are listed with decimal places
    • Min Temp
    • Max Temp
    • Rainfall (mm)
    • 9am Temperature
    • 3pm Temperature
  • “c” for the date and time fields below, which I convert to proper Date and POSIX classes in the next section. 9am wind speed and 3pm wind speed are supposed to be numeric fields; however, when I scanned through the data, I found that “Calm” can be one of their values, so I have also read these variables as character.
    • Date
    • Time of maximum wind gust
    • 9am wind speed (km/h)
    • 3pm wind speed (km/h)
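
This report keeps the wind-speed columns as character. If a numeric version were needed later, one possible approach is sketched below on toy values. It assumes that “Calm” can be treated as 0 km/h; that is an assumption of this sketch, not something stated in the data file. The names wind_speed_chr, wind_speed_int and is_calm are hypothetical.

```r
# Toy values in the same shape as the wind-speed columns (not the real data).
wind_speed_chr <- c("7", "6", "Calm", "13")

# Keep the original "Calm" information in a separate logical flag.
is_calm <- wind_speed_chr == "Calm"

# Assumption: "Calm" means no measurable wind, so it is recoded as 0 here.
wind_speed_int <- as.integer(ifelse(is_calm, "0", wind_speed_chr))

wind_speed_int
is_calm
```

Keeping the flag alongside the integer vector means a recoded 0 can still be told apart from a measured 0.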

There are only 31 rows in the dataset, so I used the head function with 31 as the second argument to check the entire read-in output. Below is the associated R code:

# Load readr, then use read_csv to read in the file IDCJDW3027.201907.csv.
# The first 6 rows (5 rows of plain text plus the header row) are skipped.
library(readr)

fallsCreekJulyWeather <- read_csv("src/R/data/IDCJDW3027.201907.csv", skip = 6, col_types = "-cddd--ficdi-fc-di-fc-",
                                  col_names = c("Date", "Min Temp", "Max Temp", "Rainfall (mm)", "Direction of maximum wind Gust", "Speed of maximum wind gust (km/h)", "Time of maximum wind gust", "9am Temperature", "9am relative humidity", "9am wind direction", "9am wind speed (km/h)", "3pm Temperature", "3pm relative humidity", "3pm wind direction", "3pm wind speed (km/h)"))

# Check whether fallsCreekJulyWeather is a data frame

is.data.frame(fallsCreekJulyWeather)
## [1] TRUE
#View the read-in output 

head(fallsCreekJulyWeather, 31)
## # A tibble: 31 x 15
##    Date  `Min Temp` `Max Temp` `Rainfall (mm)` `Direction of m~ `Speed of maxim~
##    <chr>      <dbl>      <dbl>           <dbl> <fct>                       <int>
##  1 1/7/~       -3.7       -0.8             1.6 NW                             24
##  2 2/7/~       -4.2       -0.2             0   NNW                            11
##  3 3/7/~       -3          1               0.8 SE                             24
##  4 4/7/~       -2.3        4.3             0.2 ESE                            19
##  5 5/7/~       -0.7        6.1             0   NNE                            28
##  6 6/7/~        1.3        6               0   NNW                            37
##  7 7/7/~        1.5        3.5             0   N                              61
##  8 8/7/~        1.5        2              17.6 N                              52
##  9 9/7/~       -2.5       -0.6             3.4 NNW                            54
## 10 10/7~       -2.1       -0.1             8.6 NW                             43
## # ... with 21 more rows, and 9 more variables: `Time of maximum wind
## #   gust` <chr>, `9am Temperature` <dbl>, `9am relative humidity` <int>, `9am
## #   wind direction` <fct>, `9am wind speed (km/h)` <chr>, `3pm
## #   Temperature` <dbl>, `3pm relative humidity` <int>, `3pm wind
## #   direction` <fct>, `3pm wind speed (km/h)` <chr>

Inspect and Understand

Inspect the data frame and variables using R functions:

  • I used the dim function to check the dimensions of the data frame.

    • There are 31 observations [rows] and 15 variables [columns].
  • I used the str and attributes functions to check the structure and attributes of the data frame. Date and Time of maximum wind gust were still character, so I converted them:

    • I used the as.Date function to convert Date from character to Date.
    • I used the paste and strptime functions to convert Time of maximum wind gust from character to a POSIX date-time on the associated date.
  • There are 3 similar factor variables,

    • Direction of maximum wind Gust
    • 9am wind direction
    • 3pm wind direction

    All of these factors classify wind direction. To cover every possible direction, the levels should include the four cardinal directions <North - N, East - E, South - S, West - W>, the four intercardinal directions <NE, SE, SW, NW> and eight more divisions <NNE, ENE, ESE, SSE, SSW, WSW, WNW, NNW>. I specified all 16 of these as levels and ordered them clockwise, starting from North - N.

  • The column names in the data frame were already set in the previous section. The row names, 1 to 31, match the days of July, which makes them self-explanatory, so I have left them unchanged.
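
As a minimal sketch of the conversions described in the list above, the same steps can be tried on a few toy values (not the real data). The names date_chr, time_chr, gust_time, compass and wind_dir are hypothetical, and tz = "UTC" is an assumption made only to keep the example reproducible.

```r
# Toy inputs in the same format as the raw columns (not the real data).
date_chr <- c("1/7/2019", "2/7/2019")
time_chr <- c("0:03", "18:22")

# Character -> Date, reading day/month/four-digit year.
dates <- as.Date(date_chr, format = "%d/%m/%Y")

# Paste date and time together, then parse the result into a date-time.
# tz = "UTC" is an assumption used only for reproducibility.
gust_time <- strptime(paste(dates, time_chr),
                      format = "%Y-%m-%d %H:%M", tz = "UTC")

# Sixteen compass points, clockwise from north.
compass <- c("N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
             "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW")
wind_dir <- factor(c("NW", "NNW", NA, "SE"), levels = compass, ordered = TRUE)

format(gust_time, "%Y-%m-%d %H:%M")
nlevels(wind_dir)      # all 16 levels are kept, even those not observed
as.integer(wind_dir)   # positions within the clockwise ordering
```

Supplying the full set of levels keeps directions that never occurred in the month as valid, empty categories, while missing observations stay NA.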

Below is my R code with outputs.

# Dimension 

dim(fallsCreekJulyWeather)
## [1] 31 15
# Structure
str(fallsCreekJulyWeather)
## tibble [31 x 15] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Date                             : chr [1:31] "1/7/2019" "2/7/2019" "3/7/2019" "4/7/2019" ...
##  $ Min Temp                         : num [1:31] -3.7 -4.2 -3 -2.3 -0.7 1.3 1.5 1.5 -2.5 -2.1 ...
##  $ Max Temp                         : num [1:31] -0.8 -0.2 1 4.3 6.1 6 3.5 2 -0.6 -0.1 ...
##  $ Rainfall (mm)                    : num [1:31] 1.6 0 0.8 0.2 0 0 0 17.6 3.4 8.6 ...
##  $ Direction of maximum wind Gust   : Factor w/ 11 levels "NW","NNW","SE",..: 1 2 3 4 5 2 6 6 2 1 ...
##  $ Speed of maximum wind gust (km/h): int [1:31] 24 11 24 19 28 37 61 52 54 43 ...
##  $ Time of maximum wind gust        : chr [1:31] "0:03" "1:43" "18:22" "20:28" ...
##  $ 9am Temperature                  : num [1:31] -3.1 -3 -1.7 1 2.6 3.4 2.2 1.7 -2.1 -0.7 ...
##  $ 9am relative humidity            : int [1:31] 97 97 98 91 85 75 96 99 97 99 ...
##  $ 9am wind direction               : Factor w/ 7 levels "NW","NNW","WNW",..: 1 2 NA 3 4 4 2 3 5 2 ...
##  $ 9am wind speed (km/h)            : chr [1:31] "7" "6" "Calm" "6" ...
##  $ 3pm Temperature                  : num [1:31] -1.6 -0.7 0.1 3.3 5.6 5 2.2 0.9 -0.9 -1 ...
##  $ 3pm relative humidity            : int [1:31] 98 98 99 88 72 73 99 99 99 99 ...
##  $ 3pm wind direction               : Factor w/ 5 levels "NNW","NNE","WNW",..: 1 NA NA 1 1 2 1 3 1 1 ...
##  $ 3pm wind speed (km/h)            : chr [1:31] "7" "Calm" "Calm" "4" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   col_skip(),
##   ..   Date = col_character(),
##   ..   `Min Temp` = col_double(),
##   ..   `Max Temp` = col_double(),
##   ..   `Rainfall (mm)` = col_double(),
##   ..   col_skip(),
##   ..   col_skip(),
##   ..   `Direction of maximum wind Gust` = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   `Speed of maximum wind gust (km/h)` = col_integer(),
##   ..   `Time of maximum wind gust` = col_character(),
##   ..   `9am Temperature` = col_double(),
##   ..   `9am relative humidity` = col_integer(),
##   ..   col_skip(),
##   ..   `9am wind direction` = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   `9am wind speed (km/h)` = col_character(),
##   ..   col_skip(),
##   ..   `3pm Temperature` = col_double(),
##   ..   `3pm relative humidity` = col_integer(),
##   ..   col_skip(),
##   ..   `3pm wind direction` = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   `3pm wind speed (km/h)` = col_character(),
##   ..   col_skip()
##   .. )
#Attributes
attributes(fallsCreekJulyWeather)
## $names
##  [1] "Date"                              "Min Temp"                         
##  [3] "Max Temp"                          "Rainfall (mm)"                    
##  [5] "Direction of maximum wind Gust"    "Speed of maximum wind gust (km/h)"
##  [7] "Time of maximum wind gust"         "9am Temperature"                  
##  [9] "9am relative humidity"             "9am wind direction"               
## [11] "9am wind speed (km/h)"             "3pm Temperature"                  
## [13] "3pm relative humidity"             "3pm wind direction"               
## [15] "3pm wind speed (km/h)"            
## 
## $class
## [1] "spec_tbl_df" "tbl_df"      "tbl"         "data.frame" 
## 
## $row.names
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [26] 26 27 28 29 30 31
## 
## $spec
## cols(
##   col_skip(),
##   Date = col_character(),
##   `Min Temp` = col_double(),
##   `Max Temp` = col_double(),
##   `Rainfall (mm)` = col_double(),
##   col_skip(),
##   col_skip(),
##   `Direction of maximum wind Gust` = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   `Speed of maximum wind gust (km/h)` = col_integer(),
##   `Time of maximum wind gust` = col_character(),
##   `9am Temperature` = col_double(),
##   `9am relative humidity` = col_integer(),
##   col_skip(),
##   `9am wind direction` = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   `9am wind speed (km/h)` = col_character(),
##   col_skip(),
##   `3pm Temperature` = col_double(),
##   `3pm relative humidity` = col_integer(),
##   col_skip(),
##   `3pm wind direction` = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   `3pm wind speed (km/h)` = col_character(),
##   col_skip()
## )
#convert Date from character to Date type, check the class
fallsCreekJulyWeather$Date <- as.Date(fallsCreekJulyWeather$Date, "%d/%m/%Y")
class(fallsCreekJulyWeather$Date)
## [1] "Date"
#convert Time of maximum wind gust from character to a POSIX date-time on the associated date, check the class
# paste() already separates its arguments with a single space, so no extra " " is needed
fallsCreekJulyWeather$`Time of maximum wind gust` <- paste(fallsCreekJulyWeather$Date, fallsCreekJulyWeather$`Time of maximum wind gust`)

fallsCreekJulyWeather$`Time of maximum wind gust` <- strptime(fallsCreekJulyWeather$`Time of maximum wind gust`, "%Y-%m-%d %H:%M")
class(fallsCreekJulyWeather$`Time of maximum wind gust`)
## [1] "POSIXlt" "POSIXt"
# Order all the wind direction variables <Direction of maximum wind Gust, 9am wind direction, 3pm wind direction>
# and include all the possible directions: four cardinal directions <North - N, East - E, South - S, West - W>,
# four intercardinal directions <NE, SE, SW, NW> and eight more divisions <NNE, ENE, ESE, SSE, SSW, WSW, WNW, NNW>.

# The 16 levels, clockwise from North, are defined once and reused for all three variables
windDirectionLevels <- c("N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE", "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW")

fallsCreekJulyWeather$`Direction of maximum wind Gust` <- factor(fallsCreekJulyWeather$`Direction of maximum wind Gust`,
                                                                 levels = windDirectionLevels, ordered = TRUE)

fallsCreekJulyWeather$`9am wind direction` <- factor(fallsCreekJulyWeather$`9am wind direction`,
                                                     levels = windDirectionLevels, ordered = TRUE)

fallsCreekJulyWeather$`3pm wind direction` <- factor(fallsCreekJulyWeather$`3pm wind direction`,
                                                     levels = windDirectionLevels, ordered = TRUE)

# Check the structure again after executing the above conversion and reorganising the factors

str(fallsCreekJulyWeather)
## tibble [31 x 15] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Date                             : Date[1:31], format: "2019-07-01" "2019-07-02" ...
##  $ Min Temp                         : num [1:31] -3.7 -4.2 -3 -2.3 -0.7 1.3 1.5 1.5 -2.5 -2.1 ...
##  $ Max Temp                         : num [1:31] -0.8 -0.2 1 4.3 6.1 6 3.5 2 -0.6 -0.1 ...
##  $ Rainfall (mm)                    : num [1:31] 1.6 0 0.8 0.2 0 0 0 17.6 3.4 8.6 ...
##  $ Direction of maximum wind Gust   : Ord.factor w/ 16 levels "N"<"NNE"<"NE"<..: 15 16 7 6 2 16 1 1 16 15 ...
##  $ Speed of maximum wind gust (km/h): int [1:31] 24 11 24 19 28 37 61 52 54 43 ...
##  $ Time of maximum wind gust        : POSIXlt[1:31], format: "2019-07-01 00:03:00" "2019-07-02 01:43:00" ...
##  $ 9am Temperature                  : num [1:31] -3.1 -3 -1.7 1 2.6 3.4 2.2 1.7 -2.1 -0.7 ...
##  $ 9am relative humidity            : int [1:31] 97 97 98 91 85 75 96 99 97 99 ...
##  $ 9am wind direction               : Ord.factor w/ 16 levels "N"<"NNE"<"NE"<..: 15 16 NA 14 1 1 16 14 13 16 ...
##  $ 9am wind speed (km/h)            : chr [1:31] "7" "6" "Calm" "6" ...
##  $ 3pm Temperature                  : num [1:31] -1.6 -0.7 0.1 3.3 5.6 5 2.2 0.9 -0.9 -1 ...
##  $ 3pm relative humidity            : int [1:31] 98 98 99 88 72 73 99 99 99 99 ...
##  $ 3pm wind direction               : Ord.factor w/ 16 levels "N"<"NNE"<"NE"<..: 16 NA NA 16 16 2 16 14 16 16 ...
##  $ 3pm wind speed (km/h)            : chr [1:31] "7" "Calm" "Calm" "4" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   col_skip(),
##   ..   Date = col_character(),
##   ..   `Min Temp` = col_double(),
##   ..   `Max Temp` = col_double(),
##   ..   `Rainfall (mm)` = col_double(),
##   ..   col_skip(),
##   ..   col_skip(),
##   ..   `Direction of maximum wind Gust` = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   `Speed of maximum wind gust (km/h)` = col_integer(),
##   ..   `Time of maximum wind gust` = col_character(),
##   ..   `9am Temperature` = col_double(),
##   ..   `9am relative humidity` = col_integer(),
##   ..   col_skip(),
##   ..   `9am wind direction` = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   `9am wind speed (km/h)` = col_character(),
##   ..   col_skip(),
##   ..   `3pm Temperature` = col_double(),
##   ..   `3pm relative humidity` = col_integer(),
##   ..   col_skip(),
##   ..   `3pm wind direction` = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   `3pm wind speed (km/h)` = col_character(),
##   ..   col_skip()
##   .. )

Subsetting I

I subset the data frame to its first 10 observations (including all variables) and then converted the subset to a matrix.

I checked the structure of the matrix. Because a matrix is homogeneous (every element must have the same type), everything has been converted to character.

Among the types in this data set, character ranks highest in R's coercion order (integer, then double, then character), so all the other data types have been coerced to character in my matrix.
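
The coercion rule can be seen on a small made-up data frame. This is only a sketch on toy values; the names toy, m_all and m_num are hypothetical.

```r
# Toy data frame mixing Date, double, integer and character columns.
toy <- data.frame(
  day   = as.Date(c("2019-07-01", "2019-07-02")),
  temp  = c(-3.7, -4.2),
  gust  = c(24L, 11L),
  speed = c("7", "Calm"),
  stringsAsFactors = FALSE
)

m_all <- as.matrix(toy)                       # mixed columns -> character matrix
m_num <- as.matrix(toy[, c("temp", "gust")])  # numeric columns only -> double matrix

typeof(m_all)
typeof(m_num)
```

When only numeric columns are kept, the matrix stays numeric, so the conversion to character is driven by the non-numeric columns in the subset.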

# Subset the data set and convert it to a matrix 

subFallsCreekJulyWeather <- fallsCreekJulyWeather[1:10,] 


# View the subset: the first 10 rows are listed with all the variables.
head(subFallsCreekJulyWeather, 10)
## # A tibble: 10 x 15
##    Date       `Min Temp` `Max Temp` `Rainfall (mm)` `Direction of m~
##    <date>          <dbl>      <dbl>           <dbl> <ord>           
##  1 2019-07-01       -3.7       -0.8             1.6 NW              
##  2 2019-07-02       -4.2       -0.2             0   NNW             
##  3 2019-07-03       -3          1               0.8 SE              
##  4 2019-07-04       -2.3        4.3             0.2 ESE             
##  5 2019-07-05       -0.7        6.1             0   NNE             
##  6 2019-07-06        1.3        6               0   NNW             
##  7 2019-07-07        1.5        3.5             0   N               
##  8 2019-07-08        1.5        2              17.6 N               
##  9 2019-07-09       -2.5       -0.6             3.4 NNW             
## 10 2019-07-10       -2.1       -0.1             8.6 NW              
## # ... with 10 more variables: `Speed of maximum wind gust (km/h)` <int>, `Time
## #   of maximum wind gust` <dttm>, `9am Temperature` <dbl>, `9am relative
## #   humidity` <int>, `9am wind direction` <ord>, `9am wind speed (km/h)` <chr>,
## #   `3pm Temperature` <dbl>, `3pm relative humidity` <int>, `3pm wind
## #   direction` <ord>, `3pm wind speed (km/h)` <chr>
# Checking the structure of the subset, it's the same as whole data set. 
str(subFallsCreekJulyWeather)
## tibble [10 x 15] (S3: tbl_df/tbl/data.frame)
##  $ Date                             : Date[1:10], format: "2019-07-01" "2019-07-02" ...
##  $ Min Temp                         : num [1:10] -3.7 -4.2 -3 -2.3 -0.7 1.3 1.5 1.5 -2.5 -2.1
##  $ Max Temp                         : num [1:10] -0.8 -0.2 1 4.3 6.1 6 3.5 2 -0.6 -0.1
##  $ Rainfall (mm)                    : num [1:10] 1.6 0 0.8 0.2 0 0 0 17.6 3.4 8.6
##  $ Direction of maximum wind Gust   : Ord.factor w/ 16 levels "N"<"NNE"<"NE"<..: 15 16 7 6 2 16 1 1 16 15
##  $ Speed of maximum wind gust (km/h): int [1:10] 24 11 24 19 28 37 61 52 54 43
##  $ Time of maximum wind gust        : POSIXlt[1:10], format: "2019-07-01 00:03:00" "2019-07-02 01:43:00" ...
##  $ 9am Temperature                  : num [1:10] -3.1 -3 -1.7 1 2.6 3.4 2.2 1.7 -2.1 -0.7
##  $ 9am relative humidity            : int [1:10] 97 97 98 91 85 75 96 99 97 99
##  $ 9am wind direction               : Ord.factor w/ 16 levels "N"<"NNE"<"NE"<..: 15 16 NA 14 1 1 16 14 13 16
##  $ 9am wind speed (km/h)            : chr [1:10] "7" "6" "Calm" "6" ...
##  $ 3pm Temperature                  : num [1:10] -1.6 -0.7 0.1 3.3 5.6 5 2.2 0.9 -0.9 -1
##  $ 3pm relative humidity            : int [1:10] 98 98 99 88 72 73 99 99 99 99
##  $ 3pm wind direction               : Ord.factor w/ 16 levels "N"<"NNE"<"NE"<..: 16 NA NA 16 16 2 16 14 16 16
##  $ 3pm wind speed (km/h)            : chr [1:10] "7" "Calm" "Calm" "4" ...
# Checking the class of the subset, it's still a data frame
class(subFallsCreekJulyWeather)
## [1] "tbl_df"     "tbl"        "data.frame"
# convert the subset from data frame to Matrix
subFallsCreekJulyWeather <- as.matrix(subFallsCreekJulyWeather)

# Check the class, now the subset is a matrix
class(subFallsCreekJulyWeather)
## [1] "matrix" "array"
# Check the structure of the matrix, it is character
str(subFallsCreekJulyWeather)
##  chr [1:10, 1:15] "2019-07-01" "2019-07-02" "2019-07-03" "2019-07-04" ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr [1:15] "Date" "Min Temp" "Max Temp" "Rainfall (mm)" ...

Subsetting II

I subset the data frame to include only the first and the last variable in the data set, and saved the subset as an R object file (.RData). Below is the relevant R code with explanations and outputs:

# Subset the dataset to include only the first and last variable  
subFallsCreekJulyWeather1and15 <- fallsCreekJulyWeather[,c(1,15)] 

# View this new subset, there are only 2 variables <columns> and 31 observations <rows> 
head(subFallsCreekJulyWeather1and15, 31)
## # A tibble: 31 x 2
##    Date       `3pm wind speed (km/h)`
##    <date>     <chr>                  
##  1 2019-07-01 7                      
##  2 2019-07-02 Calm                   
##  3 2019-07-03 Calm                   
##  4 2019-07-04 4                      
##  5 2019-07-05 9                      
##  6 2019-07-06 7                      
##  7 2019-07-07 19                     
##  8 2019-07-08 15                     
##  9 2019-07-09 24                     
## 10 2019-07-10 26                     
## # ... with 21 more rows
# Check the structure of this subset, it is still a data frame with the same data type as the whole original data set.
str(subFallsCreekJulyWeather1and15)
## tibble [31 x 2] (S3: tbl_df/tbl/data.frame)
##  $ Date                 : Date[1:31], format: "2019-07-01" "2019-07-02" ...
##  $ 3pm wind speed (km/h): chr [1:31] "7" "Calm" "Calm" "4" ...
is.data.frame(subFallsCreekJulyWeather1and15)
## [1] TRUE
# Save this subset into an R object file
save(subFallsCreekJulyWeather1and15, file="output/R/2_Weather_1and15.Rdata")
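
One way to confirm that a file written with save can be read back is a round trip through load. The sketch below uses a toy data frame and a temporary file rather than the path above; the names toy, path, restored and loaded_names are hypothetical.

```r
# Toy two-column data frame standing in for the real subset.
toy <- data.frame(Date  = as.Date("2019-07-01") + 0:2,
                  speed = c("7", "Calm", "Calm"),
                  stringsAsFactors = FALSE)

# Hypothetical temporary location, used so the example leaves no files behind.
path <- tempfile(fileext = ".RData")
save(toy, file = path)

# Load into a separate environment so the original object is not overwritten.
restored <- new.env()
loaded_names <- load(path, envir = restored)

loaded_names                   # names of the objects stored in the file
identical(restored$toy, toy)   # compares the restored copy with the original
```

Note that load restores objects under the names they were saved with, which is why it returns those names rather than the objects themselves.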

Create a new Data Frame

I created a data frame with 2 variables and 4 observations. The data frame contains one integer variable, intVar, and one ordinal variable, ordVar.

ordVar has been converted to a factor with the proper order. I checked the structure of my variables and the levels of the ordinal variable as shown below. After that, I created a numeric vector, Avg_waiting_time, and used cbind() to add it to my data frame.

I then checked the attributes and the dimensions of my new data frame. Below is the relevant R code with explanations and outputs:

# A new data frame with 2 variables and 4 observations. It contains one integer variable and one ordinal variable.

# Integer Variable with 4 observations
intVar <- c(1L, 2L, 3L, 4L)

# Ordinal Variable with 4 observations.  It has been ordered properly 
ordVar <- c("slow", "fast", "normal", "express")
ordVar <- factor(ordVar, levels=c("slow","normal", "fast", "express"), ordered=TRUE)

# Print these 2 variables 
intVar
## [1] 1 2 3 4
ordVar
## [1] slow    fast    normal  express
## Levels: slow < normal < fast < express
# Combine these 2 variables into a data frame
df_q6 <- data.frame(col1=intVar, col2=ordVar)

# Assign these 2 variables with proper names and print the data frame
colnames(df_q6) <- c("Queue No", "Speed")
df_q6
##   Queue No   Speed
## 1        1    slow
## 2        2    fast
## 3        3  normal
## 4        4 express
# Check the structure of the data frame
str(df_q6)
## 'data.frame':    4 obs. of  2 variables:
##  $ Queue No: int  1 2 3 4
##  $ Speed   : Ord.factor w/ 4 levels "slow"<"normal"<..: 1 3 2 4
# Create a numeric vector 
Avg_waiting_time <- c(8.8, 3.6, 6.3, 2.1)

# Check the class of the numeric vector
class(Avg_waiting_time)
## [1] "numeric"
# Use cbind to add onto the dataframe
df_q6 <- cbind(df_q6, Avg_waiting_time)

# Check the attributes of the new dataframe, it shows the column and row names.  The class of df_q6 is still a data frame.
attributes(df_q6)
## $names
## [1] "Queue No"         "Speed"            "Avg_waiting_time"
## 
## $class
## [1] "data.frame"
## 
## $row.names
## [1] 1 2 3 4
# Check the structure of the new dataframe, there are now 3 variables, Queue No is integer, Speed is an ordered factor, Avg_waiting_time is numeric
str(df_q6)
## 'data.frame':    4 obs. of  3 variables:
##  $ Queue No        : int  1 2 3 4
##  $ Speed           : Ord.factor w/ 4 levels "slow"<"normal"<..: 1 3 2 4
##  $ Avg_waiting_time: num  8.8 3.6 6.3 2.1
# Check the dimension of the new dataframe, it's now 4 observations <rows> with 3 variables <columns>
dim(df_q6)
## [1] 4 3
# Finally print out the new dataframe
df_q6
##   Queue No   Speed Avg_waiting_time
## 1        1    slow              8.8
## 2        2    fast              3.6
## 3        3  normal              6.3
## 4        4 express              2.1
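
Two properties used in this section can be checked directly: an ordered factor supports comparisons that follow its level order, and cbind on a data frame keeps each column's type. The following is a minimal sketch with hypothetical names (speed, df, wait, via_cbind).

```r
# Ordered factor with the same levels as the Speed variable.
speed <- factor(c("slow", "fast", "normal", "express"),
                levels = c("slow", "normal", "fast", "express"),
                ordered = TRUE)

speed > "normal"   # comparisons follow the level order, not the alphabet
max(speed)         # the highest level present

# cbind() with a data frame as first argument returns a data frame,
# so the integer and ordered-factor columns keep their types.
df <- data.frame(queue = 1:4, speed = speed)
wait <- c(8.8, 3.6, 6.3, 2.1)
via_cbind <- cbind(df, wait)

str(via_cbind)
```

By contrast, calling cbind on bare vectors alone would build a matrix, which would coerce the columns to a single type.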