A histogram is a common way to visualize the distribution of a feature's
values across a series of bins. For categorical features, every bin represents
exactly one feature value together with the number of occurrences of that value.
For numeric features, every bin represents a range of values (low end inclusive,
high end exclusive) together with the total number of occurrences of all values in that range.
In addition, every bin for both categorical and numeric features includes the
average of the target feature for the values in that bin (though it can be missing
if the feature is deemed uninformative, if the project target has not been selected
yet using SetTarget, or if the project is a multiclass project).
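The bin semantics described above can be illustrated in base R. This is only a sketch with made-up values and hand-picked bin edges; it is not how DataRobot computes its histograms, and the variable names are chosen purely for illustration.

```r
# Made-up numeric feature values and a made-up binary target, for illustration only.
values <- c(0, 1, 2.5, 5, 5, 7.5, 9.99)
target <- c(0, 1, 1, 0, 1, 1, 1)

# Two bins, [0, 5) and [5, 10): low end inclusive, high end exclusive.
edges <- c(0, 5, 10)

# right = FALSE makes each interval closed on the left and open on the right,
# so the value 5 falls into the second bin, not the first.
bins <- cut(values, breaks = edges, right = FALSE)

# Number of occurrences per bin.
counts <- table(bins)

# Average of the target for the values that fall into each bin.
targetAvg <- tapply(target, bins, mean)

print(counts)
print(targetAvg)
```

Note that a value equal to a bin's high end is counted in the next bin, which is why both occurrences of 5 land in the second bin here.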
GetFeatureHistogram(project, featureName, binLimit = NULL)
list containing:
count numeric. The number of values in this bin's range. If a project is using weights, the value is equal to the sum of weights of all feature values in the bin's range.
target numeric. Average of the target feature for values in this bin. It may be NULL
if the feature is deemed uninformative, if the target has not yet been set
(see SetTarget), or if the project is multiclass.
label character. The value of the feature if categorical; otherwise the low end of the bin range, so that the difference between two consecutive bin labels is the width of the bin.
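As a rough sketch of how the returned bins might be handled, the example below flattens them into a data frame. It assumes the return value is a list with one element per bin, each holding count, target, and label; the bins here are hand-built with made-up numbers rather than fetched from a project, and HistogramToDataFrame is a hypothetical helper, not part of the datarobot package.

```r
# Hand-built stand-in for the return value; all numbers are made up.
# Assumption: one list element per bin, each with count, target, and label.
bins <- list(
  list(count = 12, target = 0.25, label = "10"),
  list(count = 30, target = 0.40, label = "20"),
  list(count = 8,  target = NULL, label = "30")  # target may be NULL
)

# Hypothetical helper: turn the list of bins into one data frame row per bin.
HistogramToDataFrame <- function(bins) {
  data.frame(
    label = vapply(bins, function(b) as.character(b$label), character(1)),
    count = vapply(bins, function(b) as.numeric(b$count), numeric(1)),
    # A NULL target becomes NA so that every row has the same columns.
    target = vapply(
      bins,
      function(b) if (is.null(b$target)) NA_real_ else as.numeric(b$target),
      numeric(1)
    ),
    stringsAsFactors = FALSE
  )
}

histogramDf <- HistogramToDataFrame(bins)

# For a numeric feature, consecutive labels differ by the bin width.
binWidths <- diff(as.numeric(histogramDf$label))

print(histogramDf)
print(binWidths)
```

With a live connection, `bins` would instead come from a call such as `GetFeatureHistogram(project, featureName)`. For a categorical feature the labels are feature values rather than numbers, so the bin-width step would not apply.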
project character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier.
featureName Name of the feature to retrieve. Note: DataRobot renames some features, so the feature name may not be the one from your original data. You can use ListFeatureInfo to list the features and check the name.
binLimit integer. Optional. Desired maximum number of histogram bins. The default is 60.