Title: | GIS Integration |
---|---|
Description: | Designed to facilitate the preprocessing and linking of GIS (Geographic Information System) databases <https://www.sciencedirect.com/topics/computer-science/gis-database>, the R package 'GISINTEGRATION' prepares GIS data for advanced spatial analyses. It simplifies intricate procedures such as data cleaning, normalization, and format conversion, so that the data are ready for precise and thorough analysis. |
Authors: | Hossein Hassani [aut], Leila Marvian Mashhad [aut, cre], Sara Stewart [aut], Steve Macfeelys [aut] |
Maintainer: | Leila Marvian Mashhad <[email protected]> |
License: | GPL-3 |
Version: | 1.0 |
Built: | 2024-11-18 03:55:06 UTC |
Source: | https://github.com/cran/GISINTEGRATION |
After the data sets have been preprocessed by the preproc function, some variable names in the two data sets will have been changed for uniformity. These synonym-based renames are not always wanted by the user. Based on the output of the preproc function, this function asks the user to indicate which variable names should not be changed.
chzInput(d1, d2, chz = "NULL")
d1 |
A data frame. |
d2 |
A data frame. |
chz |
the number of the name of the variable that the user does not want to change based on the output of the |
For more details about this function, refer to the manual for the preproc function.
A character vector. It contains the variable names of the second data set, with the names that the user asked not to change left as they were.
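The effect of chz described above can be pictured with a minimal base R sketch. The helper keep_original and the example names are hypothetical and are not part of the package; the sketch also assumes that chz can be read as positions in the vector of names, which may differ from how the package numbers the entries in the preproc output.

```r
# Hypothetical helper, not part of GISINTEGRATION: revert selected renames.
keep_original <- function(old_names, new_names, chz = "NULL") {
  # The string "NULL" mirrors the default shown in the usage above and is
  # taken here to mean "accept every proposed rename".
  if (identical(chz, "NULL")) {
    return(new_names)
  }
  # Assumption: chz holds positions of the names the user wants to keep.
  idx <- as.integer(chz)
  new_names[idx] <- old_names[idx]
  new_names
}

old_names <- c("sex", "yob", "city")            # illustrative original names
new_names <- c("gender", "birth_year", "city")  # illustrative proposed names

kept_all  <- keep_original(old_names, new_names)           # accept all renames
kept_some <- keep_original(old_names, new_names, chz = 2)  # keep "yob" as is
```

In this sketch the second name stays "yob" while the first is still renamed to "gender".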
Hossein Hassani, Leila Marvian Mashhad, Sara Stewart, and Steve Macfeelys.
d1 = RLdata500
d2 = RLdata10000
chzInput(d1, d2)
First, after the two data sets are loaded, preliminary data preprocessing is done with the preproc function. Based on its output, the user decides which variables should not be renamed. This function then performs complementary preprocessing, such as sorting the variable names and harmonizing a gender variable that is recorded in different formats, and produces two new data frames.
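The two complementary steps named above (harmonizing a gender variable and sorting variable names) can be sketched in base R. This is only an illustration under stated assumptions: the helper normalize_gender, the accepted codes, and the example data are hypothetical and do not reproduce the package's internal rules.

```r
# Hypothetical helper: map differently formatted gender codes to "m" / "f".
# The accepted spellings below are assumptions made for this sketch.
normalize_gender <- function(x) {
  x <- tolower(trimws(as.character(x)))
  out <- rep(NA_character_, length(x))
  out[x %in% c("m", "male")] <- "m"
  out[x %in% c("f", "female")] <- "f"
  out  # values that match neither set stay NA
}

d <- data.frame(
  name   = c("ann", "bob", "kim"),
  gender = c("Female", " M ", "unknown"),
  stringsAsFactors = FALSE
)

d$gender <- normalize_gender(d$gender)

# Put the columns in alphabetical order of their names.
d <- d[, sort(names(d)), drop = FALSE]
```

Applying the same two steps to both data frames gives them a comparable column layout and coding before linkage.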
create_new_data(d1, d2, chz = "NULL")
d1 |
A data frame. |
d2 |
A data frame. |
chz |
the number of the name of the variable that the user does not want to change based on the output of the |
Two data frames.
Hossein Hassani, Leila Marvian Mashhad, Sara Stewart, and Steve Macfeelys.
d1 = RLdata500
d2 = RLdata10000
create_new_data(d1, d2)
This function preprocesses the two data sets and standardizes their variable names using synonyms. It is written to handle a wide range of data sets.
preproc(d1, d2)
## S3 method for class 'explain'
print(x, ...)
d1 |
A data frame. |
d2 |
A data frame. |
x |
an object of class |
... |
further arguments passed to preproc function. |
Because users should be able to control how variable names are changed, the output of this function reports the names and classes that changed, in both their new and previous versions, as well as the number of changes in both data sets. It also returns the corresponding number to use for the chz argument of the chzInput function.
preproc returns an object of class 'explain'.
An object of class 'explain' is a list containing the following components:
Changed variable's names |
Character. |
Changed variable's classes |
Character. |
Initial variable's names |
Character. |
Initial variable's classes |
Character. |
Number of changed variable values for the first dataset |
Data frame. |
Number of changed variable values for the second dataset |
Data frame. |
Number of changed variable's names |
Vector. |
This function produces informative output only if variable names were changed to harmonize the two data sets; otherwise there is nothing specific to report and all counts are zero.
Note also that the variable names of the second data set are matched to those of the first data set, and the necessary changes are made on that basis.
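The idea of standardizing names through synonyms, and of counting how many names changed, can be illustrated with a minimal base R sketch. The synonym table and the helper standardize_names are hypothetical; the package's own synonym list and output structure are not reproduced here.

```r
# Hypothetical synonym table: canonical name -> accepted spellings (lower case).
synonyms <- list(
  gender     = c("gender", "sex"),
  birth_year = c("birth_year", "by", "yob")
)

# Replace each name that matches a known synonym with its canonical form.
# Names without a known synonym are returned unchanged.
standardize_names <- function(nms, synonyms) {
  out <- nms
  for (canon in names(synonyms)) {
    out[tolower(nms) %in% synonyms[[canon]]] <- canon
  }
  out
}

# Illustrative second data set whose names are matched to canonical ones.
d2 <- data.frame(Sex = c("m", "f"), yob = c(1980, 1975), city = c("a", "b"))

old_names <- names(d2)
new_names <- standardize_names(old_names, synonyms)

# Positions of the names that were changed, and how many changed.
changed   <- which(old_names != new_names)
n_changed <- length(changed)
```

When no name matches a synonym, changed is empty and n_changed is zero, which corresponds to the "everything is zero" case mentioned above.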
Hossein Hassani, Leila Marvian Mashhad, Sara Stewart, and Steve Macfeelys.
d1 = RLdata500
d2 = RLdata10000
preproc(d1, d2)
This function lets users preprocess and link GIS data before conducting complex spatial analyses. By automating steps such as data cleaning, normalization, and format transformation, it prepares the data for precise and reliable analysis.
preprocLinkageDBF(d1, d2, chz = "NULL", var = "area", threshold = 0.9)
d1 |
A data frame. |
d2 |
A data frame. |
chz |
the number of the name of the variable that the user does not want to change based on the output of the |
var |
The vector of the names of the blocked variables that the user chooses based on the output of the |
threshold |
A numeric value between 0 and 1. |
The results are stored in a .dbf file in the system default path.
A .dbf file.
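The roles of the var and threshold arguments can be pictured with a generic blocking-and-threshold sketch in base R. This is not the package's linkage method: the similarity measure (a normalized edit distance computed with utils::adist), the example data, and the reading of threshold as a cut-off on a match score are all assumptions made for illustration.

```r
# Illustrative data sets sharing a blocked variable `area`.
d1 <- data.frame(area = c(1, 1, 2), name = c("ashe", "surry", "wake"),
                 stringsAsFactors = FALSE)
d2 <- data.frame(area = c(1, 2, 2), name = c("ash", "wake", "wayne"),
                 stringsAsFactors = FALSE)

# Blocking: only compare records that have the same value of `area`.
pairs <- merge(d1, d2, by = "area", suffixes = c(".1", ".2"))

# Similarity in [0, 1] from the edit distance (1 means identical strings).
dist <- unname(mapply(function(a, b) utils::adist(a, b)[1, 1],
                      pairs$name.1, pairs$name.2))
longest <- pmax(nchar(pairs$name.1), nchar(pairs$name.2))
pairs$sim <- 1 - dist / longest

# Assumed meaning of `threshold`: keep pairs scoring at or above it.
threshold <- 0.9
links <- pairs[pairs$sim >= threshold, ]
```

Blocking reduces the number of comparisons from all record pairs to only those within the same block, and a higher threshold keeps fewer, more certain links.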
Hossein Hassani, Leila Marvian Mashhad, Sara Stewart, and Steve Macfeelys.
library(sf)
nc1 <- system.file("shape/nc.shp", package = "sf")
nc2 <- system.file("shape/nc.shp", package = "sf")
nc1 <- st_read(nc1, stringsAsFactors = FALSE)
nc2 <- st_read(nc2, stringsAsFactors = FALSE)
d1 <- data.frame(nc1)
d2 <- data.frame(nc2)
preprocLinkageDBF(d1, d2, var = 'area')
This function displays the names of the variables common to both data sets, based on the create_new_data function, so that the user can choose any of them as a blocked variable in the preprocLinkage function.
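Finding the variables that two data frames have in common can be sketched in base R with intersect. The example data frames are hypothetical, and the sketch does not include the renaming that create_new_data performs beforehand.

```r
# Illustrative data frames with partly overlapping columns.
d1 <- data.frame(area = 1:2, name = c("a", "b"), perimeter = c(1.5, 2.5))
d2 <- data.frame(name = c("a", "c"), area = 2:3, fips = c("37009", "37005"))

# Names present in both data frames, in the order they appear in d1.
common <- intersect(names(d1), names(d2))
```

Any element of common could then be considered as a candidate blocked variable.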
selVar(d1, d2, chz = "NULL")
d1 |
A data frame. |
d2 |
A data frame. |
chz |
the number of the name of the variable that the user does not want to change based on the output of the |
Character.
Hossein Hassani, Leila Marvian Mashhad, Sara Stewart, and Steve Macfeelys.
d1 = RLdata500
d2 = RLdata10000
selVar(d1, d2)