Describing scatterplots (form, direction, strength, outliers) (article) | Khan Academy
Form: Is the association linear or nonlinear? Direction: Is the association positive or negative? Strength: Does the association appear to be strong, moderately strong, or weak? It's also important to include the context of the two variables in the description.

In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables. RDC is invariant with respect to non-linear scalings of random variables and is capable of discovering a wide range of ... When we say correlation, we usually mean Pearson's correlation, which measures the strength of a linear relationship between two variables.
So, for example, in this one here, in the horizontal axis, we might have something like age, and then here it could be accident frequency. And I'm just making this up. And I could just show these data points, maybe for some kind of statistical survey, that, when the age is this, whatever number this is, maybe this is 20 years old, this is the accident frequency.
And it could be a number of accidents per hundred. And that, when the age is 21 years old, this is the frequency. And so, these data scientists, or statisticians, went and plotted all of these in this scatter plot.
This is often known as bivariate data, which is a very fancy way of saying, hey, you're plotting things that take two variables into consideration, and you're trying to see whether there's a pattern with how they relate. And what we're going to do in this video is think about, well, can we try to fit a line, does it look like there's a linear or non-linear relationship between the variables on the different axes?
How strong is that relationship? Is it a positive, is it a negative relationship? And then, we'll think about this idea of outliers. So let's just first think about whether there's a linear or non-linear relationship. And I'll get my little ruler tool out here. So, this data right over here, it looks like I could put a line through it that gets pretty close to the data. It's very unlikely you're gonna be able to go through all of the data points, but you can try to get a line, and I'm just doing this.
There's more numerical, more precise ways of doing this, but I'm just eyeballing it right over here.
And it looks like I could plot a line that looks something like that, that goes roughly through the data. So this looks pretty linear. And so I would call this a linear relationship. And since, as we increase one variable, it looks like the other variable decreases. This is a downward-sloping line. I would say this is a negative. This is a negative linear relationship.
But this one looks pretty strong. So, because the dots aren't that far from my line. This one gets a little bit further, but it's not, there's not some dots way out there.
And so, most of 'em are pretty close to the line. So I would call this a negative, reasonably strong linear relationship.
Negative, I'll just say strong, but reasonably strong, linear relationship between these two variables.
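The "more precise ways" of drawing the line mentioned earlier include a least-squares fit. Here is a minimal sketch of that idea using NumPy's `polyfit`. The age and accident numbers are made up for illustration (they are not the data in the video), and reading the direction off the sign of the slope is just one simple convention.

```python
import numpy as np

# Hypothetical data (made up): driver age vs. accidents per hundred drivers.
age = np.array([20, 21, 22, 23, 24, 25, 26, 27], dtype=float)
accidents = np.array([9.1, 8.6, 8.4, 7.7, 7.5, 6.8, 6.6, 6.0])

# Least-squares fit of a straight line: accidents ~ slope * age + intercept.
slope, intercept = np.polyfit(age, accidents, deg=1)

# A downward-sloping line means a negative relationship.
direction = "negative" if slope < 0 else "positive"
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, direction = {direction}")
```

With data like this, the fitted slope comes out below zero, which matches the "as one variable increases, the other decreases" reading you get from eyeballing the plot.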
Now, let's look at this one. And pause this video and think about what this one would be for you. I'll get my ruler tool out again. And it looks like I can try to put a line, it looks like, generally speaking, as one variable increases, the other variable increases as well, so something like this goes through the data and approximates the direction.
And this looks positive. As one variable increases, the other variable increases, roughly. So this is a positive relationship. But this is weak. A lot of the data is off, well off of the line. But I'd say this is still linear. It seems that, as we increase one, the other one increases at roughly the same rate, although these data points are all over the place.
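One common way to put a number on "strong" versus "weak" is Pearson's correlation coefficient r, which is close to 1 or -1 when the points hug a line and closer to 0 when they are scattered. The sketch below compares two made-up data sets with the same upward trend but different amounts of scatter. The values and variable names are assumptions for illustration only.

```python
import numpy as np

x = np.arange(1, 11, dtype=float)

# Made-up deviations from the line y = x: small scatter vs. large scatter.
small_scatter = np.array([0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.1])
large_scatter = np.array([4, -5, 6, -3, 5, -6, 3, -4, 6, -5], dtype=float)

y_strong = x + small_scatter
y_weak = x + large_scatter

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is r.
r_strong = np.corrcoef(x, y_strong)[0, 1]
r_weak = np.corrcoef(x, y_weak)[0, 1]

print(f"r (points close to the line)    = {r_strong:.3f}")
print(f"r (points well off of the line) = {r_weak:.3f}")
```

Both values are positive, since both data sets trend upward, but the one with points all over the place gives a much smaller r.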
So, I would still call this linear. Now, there's also this notion of outliers.
If I said, hey, this line is trying to describe the data, well, we have some data that is fairly off the line. So, for example, even though we're saying it's a positive, weak, linear relationship, this one over here is reasonably high on the vertical variable, but it's low on the horizontal variable.
And so, this one right over here is an outlier.
It's quite far away from the line. You could view that as an outlier. And this is a little bit subjective. Outliers, well, what looks pretty far from the rest of the data?
This could also be an outlier. Let me label these. Now, pause the video and see if you can think about this one. Is this positive or negative, is it linear, non-linear, is it strong or weak? I'll get my ruler tool out here. So, this goes here.

The points in Plot 2 follow the line closely, suggesting that the relationship between the variables is strong.

[Plot 3: Weak linear relationship. Plot 4: Nonlinear relationship.]

The data points in Plot 3 appear to be randomly distributed. They do not fall close to the line, indicating a very weak relationship, if one exists.
If a relationship between two variables is not linear, the rate of increase or decrease can change as one variable changes, causing a "curved pattern" in the data. This curved trend might be better modeled by a nonlinear function, such as a quadratic or cubic function, or be transformed to make it linear.
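As a rough illustration of modeling a curved pattern with a quadratic function, the sketch below fits both a straight line and a quadratic to made-up curved data and compares the sum of squared residuals. The data and the helper name `sum_squared_error` are hypothetical, and a polynomial fit is only one of the options the text mentions.

```python
import numpy as np

# Made-up curved data: roughly y = x^2 with small deterministic wiggles.
x = np.linspace(-3.0, 3.0, 13)
y = x**2 + 0.1 * np.cos(5 * x)

def sum_squared_error(degree):
    """Fit a polynomial of the given degree and return its squared-residual sum."""
    coeffs = np.polyfit(x, y, deg=degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.sum(residuals**2))

sse_linear = sum_squared_error(1)
sse_quadratic = sum_squared_error(2)

print(f"straight line: SSE = {sse_linear:.3f}")
print(f"quadratic:     SSE = {sse_quadratic:.3f}")
```

The quadratic leaves far less unexplained variation than the straight line here, which is the numerical counterpart of seeing a curve in the scatterplot.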
Plot 4 shows a strong relationship between two variables. This relationship illustrates why it is important to plot the data in order to explore any relationships that might exist.

Monotonic relationship

In a monotonic relationship, the variables tend to move in the same relative direction, but not necessarily at a constant rate.
In a linear relationship, the variables move in the same direction at a constant rate. Plot 5 shows both variables increasing concurrently, but not at the same rate. This relationship is monotonic, but not linear. The Pearson correlation coefficient for these data is 0.
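To see how a relationship can be monotonic without being linear, the sketch below uses made-up data where y always increases with x but at a changing rate. It compares Pearson's correlation with a rank-based correlation (Spearman's, computed here as Pearson's correlation of the ranks). Spearman's coefficient is not named in the text above and is brought in as an assumption; the data are hypothetical and are not the Plot 5 data.

```python
import numpy as np

# Made-up data: y increases whenever x increases, but not at a constant rate.
x = np.arange(1, 11, dtype=float)
y = np.exp(x)

def ranks(values):
    """Return 0-based ranks; valid when there are no tied values."""
    return np.argsort(np.argsort(values)).astype(float)

# Pearson measures how close the points are to a straight line.
pearson = np.corrcoef(x, y)[0, 1]

# Spearman is Pearson applied to ranks, so it measures monotonic association.
spearman = np.corrcoef(ranks(x), ranks(y))[0, 1]

print(f"Pearson  = {pearson:.3f}")
print(f"Spearman = {spearman:.3f}")
```

Because the ordering of the points is perfectly preserved, the rank-based value is essentially 1, while Pearson's value is lower because the points do not fall on a straight line.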