SparkR (version 3.1.2)

R Front End for 'Apache Spark'

Description

Provides an R front end for 'Apache Spark'.

Install

install.packages('SparkR')

Monthly Downloads

155

Version

3.1.2

License

Apache License (== 2.0)

Last Published

June 3rd, 2021
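
Getting Started

A minimal quick-start sketch after installing the package (hedged: it assumes Spark itself is available locally, e.g. fetched via install.spark, and the appName value is illustrative):

library(SparkR)

# Start (or connect to) a local SparkSession
sparkR.session(master = "local[*]", appName = "SparkR-quickstart")

# Create a SparkDataFrame from a local R data.frame and inspect it
df <- createDataFrame(faithful)
head(df)
printSchema(df)

# Stop the session when done
sparkR.session.stop()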

Functions in SparkR (3.1.2)

ALSModel-class

S4 class that represents an ALSModel
FMClassificationModel-class

S4 class that represents an FMClassificationModel
DecisionTreeClassificationModel-class

S4 class that represents a DecisionTreeClassificationModel
GBTRegressionModel-class

S4 class that represents a GBTRegressionModel
BisectingKMeansModel-class

S4 class that represents a BisectingKMeansModel
DecisionTreeRegressionModel-class

S4 class that represents a DecisionTreeRegressionModel
AFTSurvivalRegressionModel-class

S4 class that represents an AFTSurvivalRegressionModel
LinearSVCModel-class

S4 class that represents a LinearSVCModel
FMRegressionModel-class

S4 class that represents an FMRegressionModel
GBTClassificationModel-class

S4 class that represents a GBTClassificationModel
FPGrowthModel-class

S4 class that represents an FPGrowthModel
PrefixSpan-class

S4 class that represents a PrefixSpan
PowerIterationClustering-class

S4 class that represents a PowerIterationClustering
between

between
GeneralizedLinearRegressionModel-class

S4 class that represents a generalized linear model
NaiveBayesModel-class

S4 class that represents a NaiveBayesModel
GaussianMixtureModel-class

S4 class that represents a GaussianMixtureModel
MultilayerPerceptronClassificationModel-class

S4 class that represents a MultilayerPerceptronClassificationModel
as.data.frame

Download data from a SparkDataFrame into an R data.frame
broadcast

broadcast
GroupedData-class

S4 class that represents a GroupedData
checkpoint

checkpoint
KMeansModel-class

S4 class that represents a KMeansModel
clearCache

Clear Cache
corr

corr
column_collection_functions

Collection functions for Column operations
column_avro_functions

Avro processing functions for Column operations
count

Count
cube

cube
LDAModel-class

S4 class that represents an LDAModel
currentDatabase

Returns the current default database
LinearRegressionModel-class

S4 class that represents a LinearRegressionModel
dropFields

dropFields
dropDuplicates

dropDuplicates
LogisticRegressionModel-class

S4 class that represents a LogisticRegressionModel
RandomForestClassificationModel-class

S4 class that represents a RandomForestClassificationModel
RandomForestRegressionModel-class

S4 class that represents a RandomForestRegressionModel
attach,SparkDataFrame-method

Attach SparkDataFrame to R search path
column

S4 class that represents a SparkDataFrame column
approxQuantile

Calculates the approximate quantiles of numerical columns of a SparkDataFrame
glm,formula,ANY,SparkDataFrame-method

Generalized Linear Models (R-compliant)
KSTest-class

S4 class that represents a KSTest
cache

Cache
WindowSpec-class

S4 class that represents a WindowSpec
alias

alias
group_by

GroupBy
IsotonicRegressionModel-class

S4 class that represents an IsotonicRegressionModel
column_aggregate_functions

Aggregate functions for Column operations
SparkDataFrame-class

S4 class that represents a SparkDataFrame
avg

avg
awaitTermination

awaitTermination
StreamingQuery-class

S4 class that represents a StreamingQuery
cancelJobGroup

Cancel active jobs for the specified group
arrange

Arrange Rows by Variables
hashCode

Compute the hashCode of an object
column_ml_functions

ML functions for Column operations
head

Head
column_datetime_diff_functions

Date time arithmetic functions for Column operations
cast

Casts the column to a different data type.
join

Join
clearJobGroup

Clear current job group ID and its description
timestamp_seconds

Date time functions for Column operations
last

last
not

!
cov

cov
createTable

Creates a table based on the dataset in a data source
nrow

Returns the number of rows in a SparkDataFrame
printSchema

Print Schema of a SparkDataFrame
cacheTable

Cache Table
column_nonaggregate_functions

Non-aggregate functions for Column operations
collect

Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
coltypes

coltypes
distinct

Distinct
drop

drop
create_lambda

Create o.a.s.sql.expressions.LambdaFunction corresponding to the transformation described by func. Used by higher order functions.
dropTempTable

(Deprecated) Drop Temporary Table
createDataFrame

Create a SparkDataFrame
dropTempView

Drops the temporary view with the given view name in the catalog.
coalesce

Coalesce
describe

describe
crosstab

Computes a pair-wise frequency table of the given columns
first

Return the first row of a SparkDataFrame
crossJoin

CrossJoin
filter

Filter
install.spark

Download and Install Apache Spark to a Local Directory
insertInto

insertInto
dropna

A set of SparkDataFrame functions working with NA values
dim

Returns the dimensions of SparkDataFrame
exceptAll

exceptAll
read.jdbc

Create a SparkDataFrame representing the database table accessible via JDBC URL
queryName

queryName
read.text

Create a SparkDataFrame from a text file.
read.json

Create a SparkDataFrame from a JSON file.
select

Select
selectExpr

SelectExpr
recoverPartitions

Recovers all the partitions in the directory of a table and updates the catalog
explain

Explain
spark.fmClassifier

Factorization Machines Classification Model
spark.getSparkFilesRootDirectory

Get the root directory that contains files added through spark.addFile.
spark.logit

Logistic Regression Model
sparkR.version

Get version of Spark on which this application is running
column_math_functions

Math functions for Column operations
column_string_functions

String functions for Column operations
spark.fmRegressor

Factorization Machines Regression Model
spark.glm

Generalized Linear Models
spark.mlp

Multilayer Perceptron Classification Model
column_misc_functions

Miscellaneous functions for Column operations
ncol

Returns the number of columns in a SparkDataFrame
persist

Persist
column_window_functions

Window functions for Column operations
pivot

Pivot a column of the GroupedData and perform the specified aggregation.
createOrReplaceTempView

Creates a temporary view using the given name.
createExternalTable

(Deprecated) Create an external table
dtypes

DataTypes
getLocalProperty

Get a local property set in this thread, or NULL if it is missing. See setLocalProperty.
asc

A set of operations working with SparkDataFrame columns
colnames

Column Names of SparkDataFrame
setJobDescription

Set a human-readable description of the current job.
repartitionByRange

Repartition by range
repartition

Repartition
rbind

Union two or more SparkDataFrames
read.df

Load a SparkDataFrame
setJobGroup

Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
sparkRHive.init

(Deprecated) Initialize a new HiveContext
%<=>%

%<=>%
dapplyCollect

dapplyCollect
invoke_higher_order_function

Invokes a higher order function expression identified by name (relative to o.a.s.sql.catalyst.expressions)
isActive

isActive
getNumPartitions

getNumPartitions
startsWith

startsWith
take

Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame
gapply

gapply
status

status
listColumns

Returns a list of columns for the given table/view in the specified database
spark.isoreg

Isotonic Regression Model
listDatabases

Returns a list of databases available
except

except
spark.survreg

Accelerated Failure Time (AFT) Survival Regression Model
sparkR.callJMethod

Call Java Methods
spark.kmeans

K-Means Clustering Model
spark.svmLinear

Linear SVM Model
sparkR.callJStatic

Call Static Java Methods
endsWith

endsWith
read.ml

Load a fitted MLlib model from the input path.
gapplyCollect

gapplyCollect
hint

hint
histogram

Compute histogram statistics for given column
read.orc

Create a SparkDataFrame from an ORC file.
intersect

Intersect
intersectAll

intersectAll
toJSON

toJSON
with

Evaluate an R expression in an environment constructed from a SparkDataFrame
isLocal

isLocal
setCheckpointDir

Set checkpoint directory
setCurrentDatabase

Sets the current default database
fitted

Get fitted result from a k-means model
limit

Limit
lastProgress

lastProgress
localCheckpoint

localCheckpoint
freqItems

Finding frequent items for columns, possibly with false positives
%in%

Match a column with given values.
structField

structField
str

Compactly display the structure of a dataset
spark.fpGrowth

FP-growth
setLocalProperty

Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
spark.gaussianMixture

Multivariate Gaussian Mixture Model (GMM)
setLogLevel

Set new log level
write.stream

Write the streaming SparkDataFrame to a data source.
withColumn

WithColumn
write.text

Save the content of SparkDataFrame in a text file at the specified path.
substr

substr
listFunctions

Returns a list of functions registered in the specified database
unionByName

Return a new SparkDataFrame containing the union of rows, matched by column names
unionAll

Return a new SparkDataFrame containing the union of rows.
agg

summarize
print.structField

Print a Spark StructField.
listTables

Returns a list of tables or views in the specified database
orderBy

Ordering Columns in a WindowSpec
isStreaming

isStreaming
spark.naiveBayes

Naive Bayes Models
partitionBy

partitionBy
mutate

Mutate
over

over
merge

Merges two data frames
spark.assignClusters

PowerIterationClustering
otherwise

otherwise
print.structType

Print a Spark StructType.
sparkR.conf

Get Runtime Config from the current active SparkSession
sparkR.init

(Deprecated) Initialize a new Spark Context
subset

Subset
structType

structType
read.parquet

Create a SparkDataFrame from a Parquet file.
rowsBetween

rowsBetween
rollup

rollup
read.stream

Load a streaming SparkDataFrame
tableToDF

Create a SparkDataFrame from a SparkSQL table or view
tables

Tables
predict

Makes predictions from an MLlib model
schema

Get schema object
saveAsTable

Save the contents of the SparkDataFrame to a data source as a table
show

show
randomSplit

randomSplit
rangeBetween

rangeBetween
showDF

showDF
registerTempTable

(Deprecated) Register Temporary Table
sample

Sample
rename

rename
spark.bisectingKmeans

Bisecting K-Means Clustering Model
sampleBy

Returns a stratified sample without replacement
sparkR.uiWebUrl

Get the URL of the SparkUI instance for the current active SparkSession
sparkR.session.stop

Stop the Spark Session and Spark Context
sparkRSQL.init

(Deprecated) Initialize a new SQLContext
spark.lm

Linear Regression Model
spark.lda

Latent Dirichlet Allocation
sql

SQL Query
print.jobj

Print a JVM object reference.
spark.als

Alternating Least Squares (ALS) for Collaborative Filtering
spark.addFile

Add a file or directory to be downloaded with this Spark job on every node.
refreshByPath

Invalidates and refreshes all the cached data and metadata for any SparkDataFrame containing the given path
refreshTable

Invalidates and refreshes all the cached data and metadata of the given table
spark.decisionTree

Decision Tree Model for Regression and Classification
spark.kstest

(One-Sample) Kolmogorov-Smirnov Test
spark.lapply

Run a function over a list of elements, distributing the computations with Spark
sparkR.newJObject

Create Java Objects
spark.randomForest

Random Forest Model for Regression and Classification
storageLevel

StorageLevel
spark.getSparkFiles

Get the absolute path of a file added through spark.addFile.
spark.gbt

Gradient Boosted Tree Model for Regression and Classification
spark.findFrequentSequentialPatterns

PrefixSpan
stopQuery

stopQuery
unpersist

Unpersist
uncacheTable

Uncache Table
sparkR.session

Get the existing SparkSession or initialize a new SparkSession.
summary

summary
write.orc

Save the contents of SparkDataFrame as an ORC file, preserving the schema.
unresolved_named_lambda_var

Create o.a.s.sql.expressions.UnresolvedNamedLambdaVariable, convert it to o.a.s.sql.Column and wrap with R Column. Used by higher order functions.
write.parquet

Save the contents of SparkDataFrame as a Parquet file, preserving the schema.
windowPartitionBy

windowPartitionBy
write.json

Save the contents of SparkDataFrame as a JSON file
tableNames

Table Names
windowOrderBy

windowOrderBy
write.ml

Saves the MLlib model to the input path
withField

withField
write.df

Save the contents of SparkDataFrame to a data source.
write.jdbc

Save the content of SparkDataFrame to an external database table via JDBC.
withWatermark

withWatermark
union

Return a new SparkDataFrame containing the union of rows
dapply

dapply
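
To show how these pieces fit together, below is a hedged end-to-end sketch using functions from this index. It assumes a local Spark installation (see install.spark) and uses R's built-in iris data set; note that SparkR replaces dots in column names with underscores (Sepal.Length becomes Sepal_Length), and all variable names here are illustrative.

library(SparkR)
sparkR.session(master = "local[*]")

# Create a SparkDataFrame from a local R data.frame
df <- createDataFrame(iris)

# Transformations: filter rows, select columns, group and aggregate
long_sepals <- filter(df, df$Sepal_Length > 5)
slim <- select(long_sepals, "Species", "Sepal_Length")
by_species <- agg(group_by(slim, slim$Species),
                  avg_len = avg(slim$Sepal_Length))

# Bring the (small) aggregated result back as an R data.frame
collect(by_species)

# Fit a model with an MLlib wrapper and predict
model <- spark.glm(df, Sepal_Length ~ Sepal_Width + Species,
                   family = "gaussian")
preds <- predict(model, df)
head(select(preds, "Sepal_Length", "prediction"))

sparkR.session.stop()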