analyticsdf package
analyticsdf.analyticsdataframe module
- class analyticsdf.analyticsdataframe.AnalyticsDataframe(n, p, predictor_names=None, response_vector_name=None, seed=None)
- Bases: object
- Create an AnalyticsDataframe class.
- Creates a dataframe class which uses the n, p, predictor_names and response_vector_name arguments to initialize a dataframe.
- Args:
- n:
- Number of observations. 
- p:
- Number of predictors. 
- predictor_names:
- List of strings (default = [X1, X2, … Xp]). 
- response_vector_name:
- String (default = Y). 
 
- Returns:
- AnalyticsDataframe class:
- predictor_matrix: a pandas DataFrame filled with NaN.
- response_vector: a pandas Series filled with NaN.
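To make the documented defaults concrete, here is a minimal sketch that builds the default predictor names X1 to Xp and NaN-filled placeholders using only the standard library. The helper `default_layout` is hypothetical and is not part of analyticsdf; the real class holds a pandas DataFrame and Series, and the length check below is an assumption rather than documented behaviour.

```python
import math


def default_layout(n, p, predictor_names=None, response_vector_name=None):
    """Hypothetical helper mirroring the documented defaults (not library code)."""
    if predictor_names is None:
        predictor_names = [f"X{i}" for i in range(1, p + 1)]
    if len(predictor_names) != p:
        # Assumption: the number of names should match p.
        raise ValueError("predictor_names must contain exactly p names")
    if response_vector_name is None:
        response_vector_name = "Y"
    # One NaN-filled column per predictor, plus a NaN-filled response.
    predictors = {name: [math.nan] * n for name in predictor_names}
    response = [math.nan] * n
    return predictors, response_vector_name, response


predictors, response_name, response = default_layout(n=5, p=3)
print(list(predictors), response_name)
```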
 
 - generate_response_vector_linear(predictor_name_list: list = None, beta: list = None, epsilon_variance: float = None)
- Generates a response vector based on a linear regression generative model.
- Args:
- predictor_name_list:
- A list of predictor names in the initial AnalyticsDataframe. 
- beta:
- A list of coefficients of the linear model; the first coefficient is the intercept.
- epsilon_variance:
- A scalar variance specification. 
 
- Raises:
- KeyError: If the column does not exist.
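The generative model described above can be sketched in plain Python. This is an illustration, not the package's implementation: the function `linear_response` is hypothetical, and it assumes the error term is zero-mean Gaussian with variance `epsilon_variance`, which the documentation does not state explicitly.

```python
import random


def linear_response(columns, predictor_name_list, beta, epsilon_variance, seed=None):
    """Sketch: y_i = beta[0] + sum_j beta[j + 1] * x_ij + eps_i (not library code)."""
    if len(beta) != len(predictor_name_list) + 1:
        raise ValueError("beta needs one intercept plus one coefficient per predictor")
    rng = random.Random(seed)
    sigma = epsilon_variance ** 0.5  # random.gauss expects a standard deviation
    # A missing name raises KeyError, matching the documented error type.
    xs = [columns[name] for name in predictor_name_list]
    n = len(xs[0]) if xs else 0
    response = []
    for i in range(n):
        mean = beta[0] + sum(b * x[i] for b, x in zip(beta[1:], xs))
        response.append(mean + rng.gauss(0.0, sigma))  # assumed Gaussian noise
    return response


columns = {"X1": [0.0, 1.0, 2.0], "X2": [1.0, 1.0, 1.0]}
# With zero variance the response equals the noise-free linear predictor.
y = linear_response(columns, ["X1", "X2"], beta=[1.0, 2.0, 3.0], epsilon_variance=0.0)
print(y)
```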
 
 - generate_response_vector_polynomial(predictor_name_list: list, polynomial_order: list, beta: list, interaction_term_betas: array, epsilon_variance: float)
- Generates a response vector based on a linear regression generative model that contains polynomial terms for one or more of the predictors and interaction terms.
- Args:
- predictor_name_list:
- A list of predictor names in the initial AnalyticsDataframe. 
- polynomial_order:
- A list of integers that specify the order of the polynomial for each predictor with legal values of 1 to 4. 
- beta:
- A list of the betas (coefficients of the linear model).
- The first coefficient is the intercept.
- The next coefficients are the coefficients of the polynomial terms for the first predictor (as specified in the polynomial_order list).
- This continues in the same manner for all the predictors specified in the predictor_name_list parameter.
- The list length must equal the sum of the values in polynomial_order plus one.
 
- interaction_term_betas:
- A NumPy array-like lower triangular matrix, with both dimensions equal to the sum of the values in polynomial_order, containing the betas of any interaction terms.
- epsilon_variance:
- A scalar variance specification 
 
- Raises:
- KeyError: If the column does not exist.
- TypeError: If the column is not numeric.
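The layout of the coefficient list is easier to see on a single observation. The sketch below computes the noise-free mean for one row; `polynomial_mean` is a hypothetical helper, and the way the lower triangular matrix is read (entry `[i][j]` with `j < i` weights the product of expanded terms `i` and `j`) is an assumption, since the documentation does not spell out the indexing.

```python
def polynomial_mean(row, polynomial_order, beta, interaction_term_betas=None):
    """Sketch of the noise-free mean for one observation (not library code).

    row holds one value per predictor, in the same order as polynomial_order.
    """
    if any(order < 1 or order > 4 for order in polynomial_order):
        raise ValueError("polynomial orders must be between 1 and 4")
    n_terms = sum(polynomial_order)
    if len(beta) != n_terms + 1:
        raise ValueError("beta must have sum(polynomial_order) + 1 entries")
    # Expand each predictor into x, x**2, ..., x**order, predictor by predictor.
    terms = [x ** k for x, order in zip(row, polynomial_order) for k in range(1, order + 1)]
    mean = beta[0] + sum(b * t for b, t in zip(beta[1:], terms))
    if interaction_term_betas is not None:
        # Assumption: only entries strictly below the diagonal are used.
        for i in range(n_terms):
            for j in range(i):
                mean += interaction_term_betas[i][j] * terms[i] * terms[j]
    return mean


# Two predictors: X1 with order 2 (terms x1, x1**2) and X2 with order 1 (term x2).
beta = [1.0, 0.5, 0.25, 2.0]
plain = polynomial_mean([2.0, 3.0], [2, 1], beta)
interactions = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]  # weights x2 * x1
with_interaction = polynomial_mean([2.0, 3.0], [2, 1], beta, interactions)
print(plain, with_interaction)
```

Here the expanded terms are 2, 4 and 3, so the list needs 2 + 1 + 1 = 4 coefficients and the interaction matrix is 3 by 3.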
 
 - property predictor_names
 - property response_vector_name
 - update_predictor_beta(predictor_name_list, a, b)
- Update the predictors of the instance to be beta distributed.
- Args:
- predictor_name_list:
- A list of predictor names in the initial AnalyticsDataframe. 
- a:
- float or array_like of floats. Alpha, positive (>0). 
- b:
- float or array_like of floats. Beta, positive (>0). 
 
- Raises:
- KeyError: If the column does not exist.
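For intuition about the `a` and `b` shape parameters, a short standard-library sketch draws beta-distributed values and compares the sample mean with the theoretical mean a / (a + b). It does not call analyticsdf; the variable names are illustrative only.

```python
import random

rng = random.Random(0)
a, b = 2.0, 5.0  # both shape parameters must be positive

# Every draw lies in [0, 1]; the theoretical mean is a / (a + b).
sample = [rng.betavariate(a, b) for _ in range(1000)]
sample_mean = sum(sample) / len(sample)
expected_mean = a / (a + b)
print(round(sample_mean, 3), round(expected_mean, 3))
```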
 
 - update_predictor_categorical(predictor_name=None, category_names: list | None = None, prob_vector: array | None = None)
- Update a predictor with categorical values.
- Args:
- predictor_name:
- A predictor name in the initial AnalyticsDataframe. 
- category_names:
- A vector of strings that contains the names of the different category values.
- prob_vector:
- A vector of numerics of the same length as category_names that specifies the probability (frequency) of each category value. 
 
- Raises:
- KeyError: If the column does not exist.
- ValueError: If the sum of prob_vector is not equal to 1.
- ValueError: If the length of prob_vector is not equal to the length of category_names.
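The two documented `ValueError` conditions and the sampling step can be sketched as follows. `draw_categories` is a hypothetical stand-in written with the standard library, and the tolerance used when checking that the probabilities sum to 1 is an assumption.

```python
import math
import random


def draw_categories(n, category_names, prob_vector, seed=None):
    """Sketch: draw n category labels with the given probabilities (not library code)."""
    if len(prob_vector) != len(category_names):
        raise ValueError("prob_vector and category_names must have the same length")
    if not math.isclose(sum(prob_vector), 1.0):  # assumed floating-point tolerance
        raise ValueError("prob_vector must sum to 1")
    rng = random.Random(seed)
    return rng.choices(category_names, weights=prob_vector, k=n)


labels = draw_categories(200, ["Red", "Yellow", "Blue"], [0.3, 0.4, 0.3], seed=42)
print(labels[:5])
```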
 
 - update_predictor_multicollinear(target_predictor_name=None, dependent_predictors_list=None, beta: list | None = None, epsilon_variance: float | None = None)
- Update the predictor to be multicollinear with other predictors.
- Args:
- target_predictor_name:
- A string, the name of the target predictor in the initial AnalyticsDataframe.
- dependent_predictors_list:
- A list of the predictor names selected as dependents.
- beta:
- A list of coefficients of the linear model; the first coefficient is the intercept.
- epsilon_variance:
- A scalar variance specification. 
 
- Raises:
- KeyError: If the column does not exist.
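One way to picture the effect of `epsilon_variance` here: the smaller it is, the closer the target predictor sits to an exact linear combination of the others. The sketch below is a standard-library illustration under the assumption that the target is built as intercept plus weighted predictors plus Gaussian noise; all names in it are illustrative and none come from analyticsdf.

```python
import random

rng = random.Random(1)
n = 500
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]

beta = [0.5, 2.0, -1.0]   # intercept, weight for x1, weight for x2
epsilon_variance = 0.01   # small variance gives strong multicollinearity
sigma = epsilon_variance ** 0.5

# Assumed model: target = intercept + linear combination + Gaussian noise.
x3 = [beta[0] + beta[1] * u + beta[2] * v + rng.gauss(0.0, sigma) for u, v in zip(x1, x2)]

# The residual around the exact linear combination should have variance near epsilon_variance.
residuals = [t - (beta[0] + beta[1] * u + beta[2] * v) for t, u, v in zip(x3, x1, x2)]
mean_res = sum(residuals) / n
residual_variance = sum((r - mean_res) ** 2 for r in residuals) / (n - 1)
print(round(residual_variance, 4))
```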
 
 - update_predictor_normal(predictor_name_list: list = None, mean: ndarray = None, covariance_matrix: ndarray = None)
- Update the predictors of the instance to be normally distributed.
- Args:
- predictor_name_list:
- A list of predictor names in the initial AnalyticsDataframe. 
- mean:
- A numpy array or list, containing mean values. 
- covariance_matrix:
- A symmetric, positive semi-definite N * N matrix defining the covariance among the N variables.
 
- Raises:
- KeyError: If the column does not exist.
- ValueError: If mean and covariance_matrix do not have the same size.
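To show how a mean vector and covariance matrix together define correlated columns, here is a small standard-library sketch based on a Cholesky factor. It is not how the package does it, and both helper functions are hypothetical. It also assumes a strictly positive definite matrix, which is narrower than the positive semi-definite case allowed above. In practice NumPy's `Generator.multivariate_normal` covers this in one call.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L * L^T == cov; assumes cov is positive definite."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_normal(n, mean, cov, seed=None):
    """Sketch: n draws from N(mean, cov) as mean + L z with z standard normal."""
    if len(mean) != len(cov) or any(len(row) != len(mean) for row in cov):
        raise ValueError("mean and covariance_matrix must have matching sizes")
    rng = random.Random(seed)
    L = cholesky(cov)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in mean]
        rows.append([m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)])
    return rows


covariance_matrix = [[4.0, 2.0], [2.0, 3.0]]
rows = sample_normal(2000, [1.0, -1.0], covariance_matrix, seed=0)
print(rows[0])
```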
 
 - update_predictor_uniform(predictor_name=None, lower_bound=0, upper_bound=1.0)
- Update a predictor to be uniformly distributed.
- Args:
- predictor_name:
- String, a predictor name in AnalyticsDataframe object. 
- lower_bound:
- float, lower boundary of the output interval. All values generated will be greater than or equal to lower_bound. The default value is 0.
- upper_bound:
- float, upper boundary of the output interval. All values generated will be less than or equal to upper_bound. The default value is 1.0.
 
- Raises:
- KeyError: If the column does not exist.
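The bounds behave as in any uniform sampler. A brief standard-library sketch, independent of analyticsdf, with illustrative variable names:

```python
import random

rng = random.Random(7)
lower_bound, upper_bound = 2.0, 5.0

# Each draw falls inside the interval bounded by lower_bound and upper_bound.
values = [rng.uniform(lower_bound, upper_bound) for _ in range(1000)]
print(min(values) >= lower_bound, max(values) <= upper_bound)
```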
 
 - update_response_poly_categorical(predictor_name: str | None = None, betas: dict | None = None)
- Add a categorical factor to the response in a polynomial manner.
- Args:
- predictor_name:
- String, a predictor name in AnalyticsDataframe object. 
- betas:
- A dictionary.
- key: a categorical value in the current predictor.
- value: the beta value for this categorical type/value.
 
- Raises:
- KeyError: If the column does not exist.
- TypeError: If this is not a categorical predictor.
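The shape of the `betas` dictionary is shown in the sketch below, which shifts each response value by the beta of that row's category. This additive reading is an assumption about what the method does, and `add_categorical_effect` is a hypothetical helper rather than part of the package.

```python
def add_categorical_effect(response, categories, betas):
    """Sketch: add betas[category] to each response value (assumed behaviour)."""
    # A category missing from betas raises KeyError.
    return [y + betas[c] for y, c in zip(response, categories)]


response = [1.0, 2.0, 3.0]
categories = ["Red", "Blue", "Red"]
betas = {"Red": 0.5, "Blue": -1.0}  # one beta per category value
updated = add_categorical_effect(response, categories, betas)
print(updated)
```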