The use of various decision support tools
Assess the use of various decision support tools and explain why outliers are sometimes called influential observations.
Discuss what could happen to the slope of a regression of Y versus a single X when an outlier is included versus when it is not included.
Will this necessarily happen when a point is an outlier?
Sample Solution
Decision support tools help analyze data and generate candidate solutions for decision making. They range from simple spreadsheets to more complex software that applies machine learning or other artificial intelligence (AI) techniques. They can be used in data mining, forecasting, optimization, and other areas of decision making.
Outliers are sometimes called influential observations because a single such point can have a large impact on the results of an analysis or model. An outlier is a data point that lies far from the bulk of the other data points, which may indicate a measurement error or some external factor not captured by the rest of the dataset. A point is influential when including or excluding it noticeably changes the fitted results, such as the slope or intercept of a regression line.
When an outlier is included in a regression with one independent variable (X) and one dependent variable (Y), it can pull the fitted line toward itself. The pull on the slope is strongest when the point's X-value lies far from the other X-values (high leverage) and its Y-value does not follow the trend of the remaining points. The slope may then increase or decrease, depending on whether the point lies above or below the line suggested by the rest of the data. When the outlier is excluded, the line is fitted to the remaining points only, so the slope reflects their general trend. The difference between the two slopes is a direct measure of that point's influence.
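The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up (five points lying exactly on y = 2x), the outlier at (20, 5) is hypothetical, and `ols_slope` is a helper written for this example rather than a library function.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Made-up data: five points lying exactly on y = 2x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

# Hypothetical outlier: far from the other X-values and well below the trend
x_out, y_out = 20.0, 5.0

slope_without = ols_slope(xs, ys)
slope_with = ols_slope(xs + [x_out], ys + [y_out])

print("slope without outlier:", slope_without)
print("slope with outlier:   ", slope_with)
```

In this made-up case the slope is 2 without the outlier and close to zero with it, so a single high-leverage point is enough to flatten the line almost completely.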
However, this does not necessarily happen every time an outlier is present. How much an outlier affects the slope depends on what kind of outlier it is and where it lies relative to the rest of the data. For example, a point with an unusual Y-value whose X-value is close to the mean of X has little leverage: it shifts the intercept and inflates the residual variance but barely changes the slope. Likewise, a point with an extreme X-value that still falls on the trend of the other points leaves the slope almost unchanged. The effect also varies with the type of regression, which further complicates matters. For instance, removing extreme cases may raise or lower R-squared in a multiple linear regression, and the corresponding fit measures in a logit regression can respond differently because that model is nonlinear: it relates a binary outcome to continuous predictors. Each case must therefore be analyzed separately rather than assuming what will change on removal or inclusion based solely on the presence or absence of an outlier.
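A small Python sketch can illustrate an outlier that is not influential for the slope. The assumptions are the same kind as before: the data are made up (five points on y = 2x), the added point (3, 30) is hypothetical and sits exactly at the mean of X, and `ols_fit` is a helper written for this example.

```python
def ols_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Made-up data: five points lying exactly on y = 2x (mean of X is 3.0)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

# Hypothetical outlier: extreme in Y, but its X equals the mean of X
slope_without, intercept_without = ols_fit(xs, ys)
slope_with, intercept_with = ols_fit(xs + [3.0], ys + [30.0])

print("without outlier:", slope_without, intercept_without)
print("with outlier:   ", slope_with, intercept_with)
```

Here the slope stays at 2 in both fits, while the intercept moves from 0 to 4. The point is clearly an outlier in Y, yet it has no leverage on the slope because its deviation from the mean of X is zero.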