webreadr (version 0.4.0)

split_clf: split requests from a CLF-formatted file

Description

Files in the Common or Combined Log Format (CLF) store the HTTP method, the asset requested and the protocol in a single field. split_clf takes this field as a vector and returns a data.frame containing these elements in distinct columns. The function also works with the uri field from Amazon S3 log files (see read_s3).

Usage

split_clf(requests)

Arguments

requests
the "request" field from a CLF-formatted file, read in with read_clf or read_combined.

Value

a data.frame of three columns - "method", "asset" and "protocol" - representing, respectively, the HTTP method used (for example "GET"), the asset requested (for example "/favicon.ico") and the protocol used (for example "HTTP/1.0"). Where a request is not intact (containing, for example, just the protocol or just the asset), a row of empty strings is currently returned. This handling is expected to improve in a future version.
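
To make the shape of the returned value concrete, the following is a minimal base-R sketch of the behaviour described above. It is not the package's implementation: split_request_sketch is a hypothetical helper, and it assumes the three elements are separated by single spaces and that any request without exactly three pieces counts as "not intact".

```r
# Hypothetical helper, NOT part of webreadr: a base-R approximation of the
# documented behaviour, assuming single-space separators.
split_request_sketch <- function(requests) {
  # Split each request string on literal spaces
  parts <- strsplit(requests, " ", fixed = TRUE)

  # Keep requests with exactly three pieces; otherwise substitute a row of
  # empty strings, mirroring the "not intact" case described above
  rows <- lapply(parts, function(p) {
    if (length(p) == 3L) p else c("", "", "")
  })

  # Assemble a three-column character matrix, one row per request
  mat <- matrix(as.character(unlist(rows)), ncol = 3, byrow = TRUE)
  out <- as.data.frame(mat, stringsAsFactors = FALSE)
  names(out) <- c("method", "asset", "protocol")
  out
}

# One intact request and one that contains only the asset
example <- split_request_sketch(c("GET /favicon.ico HTTP/1.0", "/favicon.ico"))
print(example)
```

The output has one row per input element, so it can be bound column-wise to the data it was derived from.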

See Also

read_clf and read_combined for reading in these files.

Examples

library(webreadr)

# Grab CLF data and split out the request.
data <- read_combined(system.file("extdata/combined_log.clf", package = "webreadr"))
requests <- split_clf(data$request)

# An example using S3 files
s3_data <- read_s3(system.file("extdata/s3.log", package = "webreadr"))
s3_requests <- split_clf(s3_data$uri)