SparkR (version 3.1.2)
R Front End for 'Apache Spark'
Description
Provides an R front end for 'Apache Spark'.
Versions: 3.1.2, 2.4.6, 2.4.5, 2.4.4, 2.4.3, 2.4.2, 2.4.1, 2.3.0, 2.1.2
Install: install.packages('SparkR')
Monthly Downloads: 155
Version: 3.1.2
License: Apache License (== 2.0)
Maintainer: Felix Cheung
Last Published: June 3rd, 2021
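A minimal quick-start sketch (the master URL and app name are illustrative); the example snippets further down assume a session started this way:

library(SparkR)
# Start (or reuse) a SparkSession backed by a local Spark master
sparkR.session(master = "local[*]", appName = "SparkRExample")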
Functions in SparkR (3.1.2)
ALSModel-class
S4 class that represents an ALSModel
FMClassificationModel-class
S4 class that represents an FMClassificationModel
DecisionTreeClassificationModel-class
S4 class that represents a DecisionTreeClassificationModel
GBTRegressionModel-class
S4 class that represents a GBTRegressionModel
BisectingKMeansModel-class
S4 class that represents a BisectingKMeansModel
DecisionTreeRegressionModel-class
S4 class that represents a DecisionTreeRegressionModel
AFTSurvivalRegressionModel-class
S4 class that represents an AFTSurvivalRegressionModel
LinearSVCModel-class
S4 class that represents a LinearSVCModel
FMRegressionModel-class
S4 class that represents an FMRegressionModel
GBTClassificationModel-class
S4 class that represents a GBTClassificationModel
FPGrowthModel-class
S4 class that represents an FPGrowthModel
PrefixSpan-class
S4 class that represents a PrefixSpan
PowerIterationClustering-class
S4 class that represents a PowerIterationClustering
between
between
GeneralizedLinearRegressionModel-class
S4 class that represents a generalized linear model
NaiveBayesModel-class
S4 class that represents a NaiveBayesModel
GaussianMixtureModel-class
S4 class that represents a GaussianMixtureModel
MultilayerPerceptronClassificationModel-class
S4 class that represents a MultilayerPerceptronClassificationModel
as.data.frame
Download data from a SparkDataFrame into an R data.frame
broadcast
broadcast
GroupedData-class
S4 class that represents a GroupedData
checkpoint
checkpoint
KMeansModel-class
S4 class that represents a KMeansModel
clearCache
Clear Cache
corr
corr
column_collection_functions
Collection functions for Column operations
column_avro_functions
Avro processing functions for Column operations
count
Count
cube
cube
LDAModel-class
S4 class that represents an LDAModel
currentDatabase
Returns the current default database
LinearRegressionModel-class
S4 class that represents a LinearRegressionModel
dropFields
dropFields
dropDuplicates
dropDuplicates
LogisticRegressionModel-class
S4 class that represents a LogisticRegressionModel
RandomForestClassificationModel-class
S4 class that represents a RandomForestClassificationModel
RandomForestRegressionModel-class
S4 class that represents a RandomForestRegressionModel
attach,SparkDataFrame-method
Attach SparkDataFrame to R search path
column
S4 class that represents a SparkDataFrame column
approxQuantile
Calculates the approximate quantiles of numerical columns of a SparkDataFrame
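For example, a small sketch using R's built-in faithful data (the quantile probabilities and error tolerance are illustrative):

df <- createDataFrame(faithful)
# Approximate quartiles of 'waiting'; relativeError = 0 would compute exact quantiles
approxQuantile(df, "waiting", probabilities = c(0.25, 0.5, 0.75), relativeError = 0.01)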
glm,formula,ANY,SparkDataFrame-method
Generalized Linear Models (R-compliant)
KSTest-class
S4 class that represents a KSTest
cache
Cache
WindowSpec-class
S4 class that represents a WindowSpec
alias
alias
group_by
GroupBy
IsotonicRegressionModel-class
S4 class that represents an IsotonicRegressionModel
column_aggregate_functions
Aggregate functions for Column operations
SparkDataFrame-class
S4 class that represents a SparkDataFrame
avg
avg
awaitTermination
awaitTermination
StreamingQuery-class
S4 class that represents a StreamingQuery
cancelJobGroup
Cancel active jobs for the specified group
arrange
Arrange Rows by Variables
hashCode
Compute the hashCode of an object
column_ml_functions
ML functions for Column operations
head
Head
column_datetime_diff_functions
Date time arithmetic functions for Column operations
cast
Casts the column to a different data type.
join
Join
clearJobGroup
Clear current job group ID and its description
timestamp_seconds
Date time functions for Column operations
last
last
not
!
cov
cov
createTable
Creates a table based on the dataset in a data source
nrow
Returns the number of rows in a SparkDataFrame
printSchema
Print Schema of a SparkDataFrame
cacheTable
Cache Table
column_nonaggregate_functions
Non-aggregate functions for Column operations
collect
Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
coltypes
coltypes
distinct
Distinct
drop
drop
create_lambda
Create an o.a.s.sql.expressions.LambdaFunction corresponding to the transformation described by func. Used by higher-order functions.
dropTempTable
(Deprecated) Drop Temporary Table
createDataFrame
Create a SparkDataFrame
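A minimal sketch converting a local R data.frame into a distributed SparkDataFrame (assumes an active session as in the quick start above):

df <- createDataFrame(faithful)
# The schema is inferred from the R column types
printSchema(df)
head(df)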
dropTempView
Drops the temporary view with the given view name in the catalog.
coalesce
Coalesce
describe
describe
crosstab
Computes a pair-wise frequency table of the given columns
first
Return the first row of a SparkDataFrame
crossJoin
CrossJoin
filter
Filter
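A small sketch combining filter and select (the threshold and column names come from the faithful data used above):

df <- createDataFrame(faithful)
# Keep rows with waiting > 70, then project a single column
long_waits <- filter(df, df$waiting > 70)
head(select(long_waits, "eruptions"))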
install.spark
Download and Install Apache Spark to a Local Directory
insertInto
insertInto
dropna
A set of SparkDataFrame functions working with NA values
dim
Returns the dimensions of SparkDataFrame
exceptAll
exceptAll
read.jdbc
Create a SparkDataFrame representing the database table accessible via JDBC URL
queryName
queryName
read.text
Create a SparkDataFrame from a text file.
read.json
Create a SparkDataFrame from a JSON file.
select
Select
selectExpr
SelectExpr
recoverPartitions
Recovers all the partitions in the directory of a table and updates the catalog
explain
Explain
spark.fmClassifier
Factorization Machines Classification Model
spark.getSparkFilesRootDirectory
Get the root directory that contains files added through spark.addFile.
spark.logit
Logistic Regression Model
sparkR.version
Get version of Spark on which this application is running
column_math_functions
Math functions for Column operations
column_string_functions
String functions for Column operations
spark.fmRegressor
Factorization Machines Regression Model
spark.glm
Generalized Linear Models
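A sketch of fitting and applying a Gaussian GLM (the formula is illustrative):

df <- createDataFrame(faithful)
# Model eruption duration as a linear function of waiting time
model <- spark.glm(df, eruptions ~ waiting, family = "gaussian")
summary(model)
# predict() appends a 'prediction' column to the input SparkDataFrame
head(predict(model, df))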
spark.mlp
Multilayer Perceptron Classification Model
column_misc_functions
Miscellaneous functions for Column operations
ncol
Returns the number of columns in a SparkDataFrame
persist
Persist
column_window_functions
Window functions for Column operations
pivot
Pivot a column of the GroupedData and perform the specified aggregation.
createOrReplaceTempView
Creates a temporary view using the given name.
createExternalTable
(Deprecated) Create an external table
dtypes
DataTypes
getLocalProperty
Get a local property set in this thread, or
NULL
if it is missing. See
setLocalProperty
.
asc
A set of operations working with SparkDataFrame columns
colnames
Column Names of SparkDataFrame
setJobDescription
Set a human readable description of the current job.
repartitionByRange
Repartition by range
repartition
Repartition
rbind
Union two or more SparkDataFrames
read.df
Load a SparkDataFrame
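A sketch of loading a CSV file (the path is a placeholder; source-specific options such as header are passed as named arguments):

df <- read.df("path/to/data.csv", source = "csv", header = "true", inferSchema = "true")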
setJobGroup
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
sparkRHive.init
(Deprecated) Initialize a new HiveContext
%<=>%
%<=>%
dapplyCollect
dapplyCollect
invoke_higher_order_function
Invokes a higher-order function expression identified by name (relative to o.a.s.sql.catalyst.expressions)
isActive
isActive
getNumPartitions
getNumPartitions
startsWith
startsWith
take
Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame
gapply
gapply
status
status
listColumns
Returns a list of columns for the given table/view in the specified database
spark.isoreg
Isotonic Regression Model
listDatabases
Returns a list of databases available
except
except
spark.survreg
Accelerated Failure Time (AFT) Survival Regression Model
sparkR.callJMethod
Call Java Methods
spark.kmeans
K-Means Clustering Model
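A sketch of fitting k-means on two numeric columns (k = 2 is illustrative):

df <- createDataFrame(faithful)
model <- spark.kmeans(df, ~ waiting + eruptions, k = 2)
summary(model)
# fitted() returns the cluster assignment for each training row
head(fitted(model))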
spark.svmLinear
Linear SVM Model
sparkR.callJStatic
Call Static Java Methods
endsWith
endsWith
read.ml
Load a fitted MLlib model from the input path.
gapplyCollect
gapplyCollect
hint
hint
histogram
Compute histogram statistics for given column
read.orc
Create a SparkDataFrame from an ORC file.
intersect
Intersect
intersectAll
intersectAll
toJSON
toJSON
with
Evaluate a R expression in an environment constructed from a SparkDataFrame
isLocal
isLocal
setCheckpointDir
Set checkpoint directory
setCurrentDatabase
Sets the current default database
fitted
Get fitted result from a k-means model
limit
Limit
lastProgress
lastProgress
localCheckpoint
localCheckpoint
freqItems
Finding frequent items for columns, possibly with false positives
%in%
Match a column with given values.
structField
structField
str
Compactly display the structure of a dataset
spark.fpGrowth
FP-growth
setLocalProperty
Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
spark.gaussianMixture
Multivariate Gaussian Mixture Model (GMM)
setLogLevel
Set new log level
write.stream
Write the streaming SparkDataFrame to a data source.
withColumn
WithColumn
write.text
Save the content of SparkDataFrame in a text file at the specified path.
substr
substr
listFunctions
Returns a list of functions registered in the specified database
unionByName
Return a new SparkDataFrame containing the union of rows, matched by column names
unionAll
Return a new SparkDataFrame containing the union of rows.
agg
summarize
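A sketch of grouped aggregation (the grouping expression and aggregate names are illustrative):

df <- createDataFrame(faithful)
grouped <- group_by(df, df$waiting > 70)
# avg() and n() are aggregate Column functions from this package
head(agg(grouped, mean_eruptions = avg(df$eruptions), rows = n(df$eruptions)))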
print.structField
Print a Spark StructField.
listTables
Returns a list of tables or views in the specified database
orderBy
Ordering Columns in a WindowSpec
isStreaming
isStreaming
spark.naiveBayes
Naive Bayes Models
partitionBy
partitionBy
mutate
Mutate
over
over
merge
Merges two data frames
spark.assignClusters
PowerIterationClustering
otherwise
otherwise
print.structType
Print a Spark StructType.
sparkR.conf
Get Runtime Config from the current active SparkSession
sparkR.init
(Deprecated) Initialize a new Spark Context
subset
Subset
structType
structType
read.parquet
Create a SparkDataFrame from a Parquet file.
rowsBetween
rowsBetween
rollup
rollup
read.stream
Load a streaming SparkDataFrame
tableToDF
Create a SparkDataFrame from a SparkSQL table or view
tables
Tables
predict
Makes predictions from an MLlib model
schema
Get schema object
saveAsTable
Save the contents of the SparkDataFrame to a data source as a table
show
show
randomSplit
randomSplit
rangeBetween
rangeBetween
showDF
showDF
registerTempTable
(Deprecated) Register Temporary Table
sample
Sample
rename
rename
spark.bisectingKmeans
Bisecting K-Means Clustering Model
sampleBy
Returns a stratified sample without replacement
sparkR.uiWebUrl
Get the URL of the SparkUI instance for the current active SparkSession
sparkR.session.stop
Stop the Spark Session and Spark Context
sparkRSQL.init
(Deprecated) Initialize a new SQLContext
spark.lm
Linear Regression Model
spark.lda
Latent Dirichlet Allocation
sql
SQL Query
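A sketch of querying a SparkDataFrame with SQL via a temporary view (the view name is illustrative):

df <- createDataFrame(faithful)
createOrReplaceTempView(df, "faithful_tbl")
head(sql("SELECT waiting, eruptions FROM faithful_tbl WHERE waiting > 70"))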
print.jobj
Print a JVM object reference.
spark.als
Alternating Least Squares (ALS) for Collaborative Filtering
spark.addFile
Add a file or directory to be downloaded with this Spark job on every node.
refreshByPath
Invalidates and refreshes all the cached data and metadata for any SparkDataFrame that contains the given path
refreshTable
Invalidates and refreshes all the cached data and metadata of the given table
spark.decisionTree
Decision Tree Model for Regression and Classification
spark.kstest
(One-Sample) Kolmogorov-Smirnov Test
spark.lapply
Run a function over a list of elements, distributing the computations with Spark
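A sketch of distributing a simple computation (the function must be self-contained, since it is serialized and executed on Spark workers):

# Returns a local R list: list(1, 4, 9, ..., 100)
results <- spark.lapply(1:10, function(x) x^2)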
sparkR.newJObject
Create Java Objects
spark.randomForest
Random Forest Model for Regression and Classification
storageLevel
StorageLevel
spark.getSparkFiles
Get the absolute path of a file added through spark.addFile.
spark.gbt
Gradient Boosted Tree Model for Regression and Classification
spark.findFrequentSequentialPatterns
PrefixSpan
stopQuery
stopQuery
unpersist
Unpersist
uncacheTable
Uncache Table
sparkR.session
Get the existing SparkSession or initialize a new SparkSession.
summary
summary
write.orc
Save the contents of SparkDataFrame as an ORC file, preserving the schema.
unresolved_named_lambda_var
Create an o.a.s.sql.expressions.UnresolvedNamedLambdaVariable, convert it to an o.a.s.sql.Column, and wrap it with an R Column. Used by higher-order functions.
write.parquet
Save the contents of SparkDataFrame as a Parquet file, preserving the schema.
windowPartitionBy
windowPartitionBy
write.json
Save the contents of SparkDataFrame as a JSON file
tableNames
Table Names
windowOrderBy
windowOrderBy
write.ml
Saves the MLlib model to the input path
withField
withField
write.df
Save the contents of SparkDataFrame to a data source.
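A sketch of writing to Parquet, with df as created in the earlier examples (the output path is a placeholder):

write.df(df, path = "path/to/output", source = "parquet", mode = "overwrite")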
write.jdbc
Save the content of SparkDataFrame to an external database table via JDBC.
withWatermark
withWatermark
union
Return a new SparkDataFrame containing the union of rows
dapply
dapply
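A sketch of applying an R function to each partition of a SparkDataFrame (the output schema must be declared up front; the derived column is illustrative):

df <- createDataFrame(faithful)
schema <- structType(structField("eruptions", "double"),
                     structField("waiting_hours", "double"))
# func receives each partition as an R data.frame and must return one matching the schema
out <- dapply(df, function(part) {
  data.frame(eruptions = part$eruptions, waiting_hours = part$waiting / 60)
}, schema)
head(out)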