% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/booster2sql.R
\name{booster2sql}
\alias{booster2sql}
\title{Transform an XGBoost model object into a SQL query.}
\usage{
booster2sql(xgbModel, print_progress = FALSE, unique_id = NULL,
  output_file_name = NULL, input_table_name = NULL,
  input_onehot_query = NULL)
}
\arguments{
\item{xgbModel}{The trained model object of class \code{xgb.Booster}.
Currently the only supported booster is \code{booster="gbtree"}; the supported \code{objective} options are:
\itemize{
  \item \code{reg:squarederror}: regression with squared loss.
  \item \code{reg:logistic}: logistic regression, output probability.
  \item \code{binary:logistic}: logistic regression for binary classification, output probability.
  \item \code{binary:logitraw}: logistic regression for binary classification, output score before logistic transformation.
  \item \code{binary:hinge}: hinge loss for binary classification. This makes predictions of 0 or 1, rather than producing probabilities.
  \item \code{count:poisson}: Poisson regression for count data, output mean of the Poisson distribution.
  \item \code{reg:gamma}: gamma regression with log-link, output mean of the gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be gamma-distributed.
  \item \code{reg:tweedie}: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be Tweedie-distributed.
}}

\item{print_progress}{Logical flag controlling whether the progress of SQL generation is printed to the console.}

\item{unique_id}{Name of the column that uniquely identifies each row; a unique row identifier is crucial for in-database scoring of an XGBoost model. If not given, the SQL query is generated with the id name "ROW_KEY".}

\item{output_file_name}{Name of the file that the SQL query is written to. If empty, the query is printed to the console.}

\item{input_table_name}{Name of the raw data table in the database that the SQL query selects from. If not given, the SQL query is generated with the table name "MODREADY_TABLE".}

\item{input_onehot_query}{SQL query for one-hot encoding, as generated by \code{onehot2sql}. When \code{input_table_name} is empty and \code{input_onehot_query} is not, the final output query includes \code{input_onehot_query} as a sub-query.}
}
\value{
The SQL query is written to the file specified by \code{output_file_name}, or printed to the console if no file name is given.
}
\description{
This function generates a SQL query for in-database scoring of XGBoost models,
providing a robust and efficient way to deploy models. It takes as input the trained XGBoost model \code{xgbModel},
the name of the input database table \code{input_table_name},
and the name of a unique identifier within that table \code{unique_id},
and writes the SQL query to the file specified by \code{output_file_name}.
Note that the input database table should be generated from the raw table using the one-hot encoding query output by \code{onehot2sql()}.
Alternatively, pass the one-hot encoding query to this function as \code{input_onehot_query}, in which case it is used as a sub-query inside the final model scoring query.
}
\examples{
library(xgboost)
# load data (requires the ggplot2 package to be installed)
df <- data.frame(ggplot2::diamonds)
head(df)

# data processing
out <- onehot2sql(df)
x <- out$model.matrix[, colnames(out$model.matrix) != 'price']
y <- out$model.matrix[, colnames(out$model.matrix) == 'price']

# model training
bst <- xgboost(x = x,
               y = y,
               max_depth = 3,
               learning_rate = .3,
               nrounds = 5,
               nthreads = 1,
               objective = 'reg:squarederror')

# generate the model scoring SQL query with the default id name (ROW_KEY)
# and table name (MODREADY_TABLE); the query is printed to the console
booster2sql(bst)
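
# A minimal sketch of overriding the defaults and writing the query to a file.
# "DIAMOND_ID" and "DIAMONDS_ONEHOT" are hypothetical column and table names
# chosen for illustration; replace them with the names used in your database.
sql_file <- file.path(tempdir(), "xgb_scoring.sql")
booster2sql(bst,
            unique_id = "DIAMOND_ID",
            output_file_name = sql_file,
            input_table_name = "DIAMONDS_ONEHOT")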
}
