# Least Squares for Solving Linear Regression Equations

## Preface

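I wrote this little program because my math homework was driving me mad.
The underlying principle is the method of least squares: the slope $$\hat{b}$$ and intercept $$\hat{a}$$ of the regression line $$\hat{y}=\hat{b}x+\hat{a}$$ are given by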

$\hat{b}=\frac{\sum\limits^n_{i=1}x_iy_i-n\overline{x}\overline{y}}{\sum\limits^n_{i=1}x_i^2-n\overline{x}^2}$

$\hat{a}=\overline{y}-\hat{b}\overline{x}$
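The two formulas translate almost directly into code. The sketch below is not the program from this post: it uses the algebraically equivalent centered form $$\hat{b}=\frac{\sum(x_i-\overline{x})(y_i-\overline{y})}{\sum(x_i-\overline{x})^2}$$, which is generally less prone to cancellation error when the values are large. The names `Line` and `fit_line` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: slope b and intercept a of y = b*x + a.
struct Line {
    double b;
    double a;
};

// Least-squares fit using the centered sums.
// Assumes x and y have the same length, at least two points,
// and that the x values are not all identical.
Line fit_line(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const std::size_t n = x.size();

    double mean_x = 0.0, mean_y = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= n;
    mean_y /= n;

    double sxy = 0.0, sxx = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sxy += (x[i] - mean_x) * (y[i] - mean_y);
        sxx += (x[i] - mean_x) * (x[i] - mean_x);
    }
    assert(sxx > 0.0);  // all x equal would make the slope undefined

    Line line;
    line.b = sxy / sxx;
    line.a = mean_y - line.b * mean_x;
    return line;
}
```

Calling `fit_line(x, y)` on two equally long vectors returns both coefficients at once, so the caller does not need global variables.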

## Usage

The first line contains the number of data points $$n$$.
The second line contains the values of $$x$$, separated by spaces.
The third line contains the corresponding values of $$y$$, separated by spaces.
The number of data points cannot exceed 10000.
Note: the results may differ slightly from a hand calculation, depending on how many digits of $$\hat{b}$$ and $$\hat{a}$$ are kept. Six decimal places are printed by default.
UPD: Added the calculation of the residual sum of squares $$\sum\limits_{i=1}^n\hat{e}_i^2$$ and the coefficient of determination $$R^2$$.

$\sum\limits_{i=1}^n\hat{e}_i^2=\sum\limits_{i=1}^n(y_i-\hat{y}_i)^2$

$R^2=1-\frac{\sum\limits_{i=1}^n(y_i-\hat{y}_i)^2}{\sum\limits^n_{i=1}(y_i-\overline{y})^2}$
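As a rough illustration of these two quantities, the sketch below evaluates an already fitted line against the data. `FitQuality` and `evaluate_fit` are hypothetical names chosen for this example, and the coefficients $$\hat{a}$$ and $$\hat{b}$$ are assumed to have been computed beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for illustration.
struct FitQuality {
    double sse;  // residual sum of squares: sum of (y_i - yhat_i)^2
    double r2;   // 1 - sse / sum of (y_i - mean_y)^2
};

// Evaluates the line y = b*x + a against the data.
// Assumes equal-length, non-empty inputs and y values that are not all equal.
FitQuality evaluate_fit(const std::vector<double>& x,
                        const std::vector<double>& y,
                        double a, double b) {
    assert(x.size() == y.size() && !x.empty());
    const std::size_t n = x.size();

    double mean_y = 0.0;
    for (std::size_t i = 0; i < n; ++i) mean_y += y[i];
    mean_y /= n;

    double sse = 0.0, sst = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double residual = y[i] - (b * x[i] + a);
        sse += residual * residual;
        sst += (y[i] - mean_y) * (y[i] - mean_y);
    }
    assert(sst > 0.0);  // constant y would make R^2 undefined

    FitQuality q;
    q.sse = sse;
    q.r2 = 1.0 - sse / sst;
    return q;
}
```

An $$R^2$$ close to 1 means the line accounts for most of the variation in $$y$$; a value near 0 means it does little better than the mean $$\overline{y}$$.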

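UPD: Added system("pause"); so that the console window stays open and the results are easy to read. The pause command exists only on Windows; on other systems this line can be removed.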

## Sample input

5
15.0 25.8 30.0 36.6 44.4
39.4 42.9 42.9 43.1 49.2

## Sample output

b=0.291046
a=34.663848
e^2=8.426560
R^2=0.832073
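
As a check, the sample result can be reproduced by hand from the formulas in the preface. With $$n=5$$, the intermediate sums are:

$\overline{x}=30.36,\quad\overline{y}=43.5,\quad\sum\limits^n_{i=1}x_iy_i=6746.76,\quad\sum\limits^n_{i=1}x_i^2=5101.56$

$\hat{b}=\frac{6746.76-5\times30.36\times43.5}{5101.56-5\times30.36^2}=\frac{143.46}{492.912}\approx0.291046$

$\hat{a}=43.5-0.291046\times30.36\approx34.6638$

$\sum\limits^n_{i=1}(y_i-\overline{y})^2=50.18,\quad R^2=1-\frac{8.426560}{50.18}\approx0.832073$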

## Code

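#include <cstdio>
#include <cstdlib>
using namespace std;

const int maxn=1e4+10; // room for at most 10000 data points
int n;
double x[maxn],y[maxn];
double a,b,R2;

// Predicted value on the fitted line y = b*k + a
double cal(double k){
    return b*k+a;
}

int main(){
    scanf("%d",&n);

    // Read the data and accumulate the sums needed for the means
    double avex=0,avey=0;
    for(int i=1;i<=n;i++){
        scanf("%lf",&x[i]);
        avex+=x[i];
    }
    for(int i=1;i<=n;i++){
        scanf("%lf",&y[i]);
        avey+=y[i];
    }

    avex/=n;avey/=n;

    // sum1 = sum of x_i*y_i, sum2 = sum of x_i^2
    double sum1=0,sum2=0;
    for(int i=1;i<=n;i++){
        sum1+=x[i]*y[i];
        sum2+=x[i]*x[i];
    }

    // Slope and intercept from the least squares formulas
    b=(sum1-n*avex*avey)/(sum2-n*avex*avex);
    a=avey-b*avex;

    // Reuse sum1 for the residual sum of squares, sum2 for the total sum of squares
    sum1=0,sum2=0;
    for(int i=1;i<=n;i++){
        sum1+=(y[i]-cal(x[i]))*(y[i]-cal(x[i]));
        sum2+=(y[i]-avey)*(y[i]-avey);
    }

    R2=1-sum1/sum2;

    printf("b=%lf\na=%lf\ne^2=%lf\nR^2=%lf\n",b,a,sum1,R2);

    system("pause"); // Windows only: keeps the console window open
    return 0;
}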


## Linear regression equations obtained by transforming other function models

UPD: Added fitting of the quadratic and exponential function models, both of which can be transformed into a linear regression equation:

$\hat{y}=c_1x^2+c_2$

$\hat{y}=c_3e^{c_4x}$
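
Both models reduce to the linear case through a change of variable. The substitution symbols $$t$$ and $$z$$ are used here for illustration:

$t=x^2:\quad\hat{y}=c_1t+c_2,\quad\text{so}\quad\hat{b}=c_1,\ \hat{a}=c_2$

$z=\ln y:\quad\hat{z}=\ln c_3+c_4x,\quad\text{so}\quad\hat{b}=c_4,\ \hat{a}=\ln c_3,\ c_3=e^{\hat{a}}$

The exponential form requires every $$y_i>0$$, since $$\ln y_i$$ is undefined otherwise.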

#### Quadratic function model

#include <cstdio>
#include <cstdlib>
using namespace std;

const int maxn=1e4+10; // room for at most 10000 data points
int n;
double x[maxn],y[maxn];
double a,b,R2;

// Predicted value for an already squared input k = x^2
double cal(double k){
    return b*k+a;
}

int main(){
    scanf("%d",&n);

    double avex=0,avey=0;
    for(int i=1;i<=n;i++){
        scanf("%lf",&x[i]);
        x[i]*=x[i]; // the only change: fit against x^2 instead of x
        avex+=x[i];
    }
    for(int i=1;i<=n;i++){
        scanf("%lf",&y[i]);
        avey+=y[i];
    }

    avex/=n;avey/=n;

    // sum1 = sum of x_i^2*y_i, sum2 = sum of x_i^4 (x already holds x^2)
    double sum1=0,sum2=0;
    for(int i=1;i<=n;i++){
        sum1+=x[i]*y[i];
        sum2+=x[i]*x[i];
    }

    // b corresponds to c_1 and a to c_2
    b=(sum1-n*avex*avey)/(sum2-n*avex*avex);
    a=avey-b*avex;

    // Residual sum of squares and total sum of squares
    sum1=0,sum2=0;
    for(int i=1;i<=n;i++){
        sum1+=(y[i]-cal(x[i]))*(y[i]-cal(x[i]));
        sum2+=(y[i]-avey)*(y[i]-avey);
    }

    R2=1-sum1/sum2;

    printf("b=%lf\na=%lf\ne^2=%lf\nR^2=%lf\n",b,a,sum1,R2);

    system("pause"); // Windows only: keeps the console window open
    return 0;
}


#### Exponential function model

• Let $$z=\ln y$$. The values $$\hat{b}$$ and $$\hat{a}$$ printed by this code are the coefficients of $$\hat{z}=\hat{b}x+\hat{a}$$. The corresponding regression equation is $$\hat{y}=e^{\hat{b}x+\hat{a}}$$, that is, $$c_3=e^{\hat{a}}$$ and $$c_4=\hat{b}$$.
• Note: the value used for $$e$$ can also cause small discrepancies. This code uses 2.718281, while hand calculations usually take 2.7, which may lead to errors in $$[-0.01,0.01]$$.
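
#include <cstdio>
#include <cstdlib>
#include <cmath>
using namespace std;

const int maxn=1e4+10; // room for at most 10000 data points
int n;
double x[maxn],y[maxn],z[maxn];
double a,b,R2;

// Predicted y on the fitted curve y = e^(b*k+a), with e taken as 2.718281
double cal(double k){
    return pow(2.718281,b*k+a);
}

int main(){
    scanf("%d",&n);

    double avex=0,avey=0,avez=0;
    for(int i=1;i<=n;i++){
        scanf("%lf",&x[i]);
        avex+=x[i];
    }
    for(int i=1;i<=n;i++){
        scanf("%lf",&y[i]);
        z[i]=log(y[i]); // requires y[i] > 0
        avey+=y[i];
        avez+=z[i];
    }

    avex/=n;avey/=n;avez/=n;

    // Linear least squares on the transformed data (x, z)
    double sum1=0,sum2=0;
    for(int i=1;i<=n;i++){
        sum1+=x[i]*z[i];
        sum2+=x[i]*x[i];
    }

    b=(sum1-n*avex*avez)/(sum2-n*avex*avex);
    a=avez-b*avex;

    // Residuals are measured on the original y, so the total sum of
    // squares must use the mean of y, not the mean of z
    sum1=0,sum2=0;
    for(int i=1;i<=n;i++){
        sum1+=(y[i]-cal(x[i]))*(y[i]-cal(x[i]));
        sum2+=(y[i]-avey)*(y[i]-avey);
    }

    R2=1-sum1/sum2;

    printf("b=%lf\na=%lf\ne^2=%lf\nR^2=%lf\n",b,a,sum1,R2);

    system("pause"); // Windows only: keeps the console window open
    return 0;
}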
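
If the coefficients $$c_3$$ and $$c_4$$ of the exponential model are wanted directly, the same fit can be wrapped in a function. This is only a sketch under a few assumptions: `ExpModel` and `fit_exponential` are hypothetical names, every $$y_i$$ is assumed positive, and `std::exp` is used in place of the constant 2.718281, so the last digits may differ slightly from the program above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type: coefficients of y = c3 * exp(c4 * x).
struct ExpModel {
    double c3;
    double c4;
};

// Fits ln(y) = b*x + a by least squares, then maps back:
// c4 = b and c3 = exp(a). Assumes every y is positive.
ExpModel fit_exponential(const std::vector<double>& x,
                         const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const std::size_t n = x.size();

    double mean_x = 0.0, mean_z = 0.0;
    std::vector<double> z(n);
    for (std::size_t i = 0; i < n; ++i) {
        assert(y[i] > 0.0);  // the logarithm is undefined otherwise
        z[i] = std::log(y[i]);
        mean_x += x[i];
        mean_z += z[i];
    }
    mean_x /= n;
    mean_z /= n;

    double sum_xz = 0.0, sum_xx = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum_xz += x[i] * z[i];
        sum_xx += x[i] * x[i];
    }

    const double b = (sum_xz - n * mean_x * mean_z) /
                     (sum_xx - n * mean_x * mean_x);
    const double a = mean_z - b * mean_x;

    ExpModel m;
    m.c3 = std::exp(a);  // full-precision e instead of 2.718281
    m.c4 = b;
    return m;
}
```

Note that this minimizes the squared error of $$\ln y$$, not of $$y$$ itself, so the result is generally not the same as a direct nonlinear fit of the exponential curve.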