6. Gradient Descent with R code

Ch7. Gradient Descent with R Code

In this chapter we implement gradient descent in R code.

1. Generating sample data (random numbers)


x = runif(300,-10,10) # draw 300 uniform random numbers on [-10, 10]
Noise = rnorm(n = 300, mean = 0 , sd = 3)
y = x + Noise


DF = data.frame(x  = x, 
                y = y)
library(ggplot2)
ggplot(DF,aes(x= x, y= y)) +
  geom_point(col = 'royalblue') +
  theme_bw()
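
Because the data are drawn at random, every run of the code above produces different numbers, so the outputs printed later in this post will not be reproduced exactly. A minimal sketch of how the draw could be made repeatable is shown here; the helper name `make_data` and the seed value 42 are assumptions made for illustration and are not part of the original code.

```r
# Hypothetical helper: regenerate the sample data with a fixed seed.
make_data <- function(seed = 42, n = 300) {
  set.seed(seed)                       # fix the random number generator state
  x <- runif(n, -10, 10)               # uniform predictor on [-10, 10]
  noise <- rnorm(n, mean = 0, sd = 3)  # Gaussian noise with sd = 3
  data.frame(x = x, y = x + noise)     # true intercept 0, true slope 1
}

DF1 <- make_data()
DF2 <- make_data()
identical(DF1, DF2)  # the same seed yields the same data frame
```

Any integer seed would do; the only point is that the same seed is set before the random draws.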

2. Setting the learning rate


alpha = 0.01
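
The choice of learning rate matters here because x ranges over [-10, 10], which makes the loss surface steep in the slope direction. For a quadratic loss like this one, batch gradient descent is stable only when the learning rate is below 2 divided by the largest eigenvalue of X'X / n, which is roughly 33 for this data, so the limit is about 0.06. The sketch below compares 0.01 with 0.1 under that assumption; the helper `final_error`, the seed, and the iteration count of 50 are illustrative choices, not part of the original post.

```r
set.seed(1)  # arbitrary seed, only so this sketch is repeatable
x <- runif(300, -10, 10)
y <- x + rnorm(300, mean = 0, sd = 3)
X <- cbind(1, x)

# Hypothetical helper: run batch gradient descent and return the final loss
final_error <- function(alpha, n_iter = 50) {
  w <- matrix(c(0, 0), nrow = 2)
  for (i in seq_len(n_iter)) {
    grad <- t(X) %*% (X %*% w - y) / length(y)  # gradient of the mean squared error / 2
    w <- w - alpha * grad
  }
  sum((y - X %*% w)^2) / (2 * length(y))
}

err_small <- final_error(alpha = 0.01)  # the value used in this post
err_large <- final_error(alpha = 0.1)   # above the stability limit for this data
c(alpha_0.01 = err_small, alpha_0.1 = err_large)
```

With the larger rate each update overshoots the minimum by more than the previous one, so the loss grows instead of shrinking.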

3. Creating the initial weight matrix


Weights = matrix(c(0,0),nrow = 2)
Weights


     [,1]
[1,]    0
[2,]    0

4. Building the matrix for the regression equation

\begin{bmatrix} 1 & x_1\\ 1 & x_2\\ 1 & x_3\\ 1 & x_4\\ \vdots & \vdots \\ 1 & x_{n-1}\\ 1 & x_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} \beta_0+\beta_1x_1\\ \beta_0+\beta_1x_2\\ \beta_0+\beta_1x_3\\ \beta_0+\beta_1x_4\\ \vdots \\ \beta_0+\beta_1x_{n-1}\\ \beta_0+\beta_1x_{n} \end{bmatrix}

We need to build this expression in R. The code is as follows.


# Convert x to a matrix and prepend a column of 1s for the intercept
X = matrix(x)
X = cbind(1,X)
head(X)


     [,1]      [,2]
[1,]    1 5.3357724
[2,]    1 6.8523958
[3,]    1 8.6630348
[4,]    1 4.2440585
[5,]    1 3.2915996
[6,]    1 0.4789394
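
To see why the column of 1s is needed, the following small sketch checks that multiplying the design matrix by a coefficient vector gives the same values as computing the intercept plus slope times x element by element. The three x values and the coefficients 2 and 0.5 are made up for illustration.

```r
x <- c(5.3, 6.9, 8.7)                # three made-up x values
X <- cbind(1, x)                     # design matrix: intercept column, then x
beta <- matrix(c(2, 0.5), nrow = 2)  # hypothetical beta_0 = 2, beta_1 = 0.5

fitted_matrix <- X %*% beta          # (3 x 2) %*% (2 x 1) gives a (3 x 1) matrix
fitted_manual <- 2 + 0.5 * x         # the same values computed element-wise

all.equal(as.vector(fitted_matrix), fitted_manual)
```

The order of the product matters: the (n x 2) design matrix goes on the left and the (2 x 1) coefficient vector on the right, which is also the order used in the R code of this post.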

5. Defining the error (loss) function

L_2(M_w,X)=\frac{1}{2n}\sum^{n}_{i=1}(y_i-(w\cdot x_i))^2


# Compute the error
# %*% is the matrix multiplication operator.
Error = function(x, y, Weight){
    sum(( y - x %*% Weight )^2) / (2*length(y))
}

6. Training the algorithm

w[j] \leftarrow w[j] + \frac{\alpha}{n}\sum_{i=1}^{n}\left((y_i-M_w(x_i))\,x_i[j]\right)
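
The update rule follows from differentiating the loss with respect to each weight. The derivation below is a sketch that assumes the averaged loss computed by the `Error` function in this post (the sum of squared errors divided by 2n) and writes M_w(x_i) for the model prediction; dropping the 1/n factor only rescales the learning rate.

```latex
L(w) = \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - M_w(x_i)\bigr)^2,
\qquad M_w(x_i) = \sum_{j} w[j]\,x_i[j]

\frac{\partial L}{\partial w[j]}
  = -\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - M_w(x_i)\bigr)\,x_i[j]

w[j] \leftarrow w[j] - \alpha\,\frac{\partial L}{\partial w[j]}
  = w[j] + \frac{\alpha}{n}\sum_{i=1}^{n}\bigl(y_i - M_w(x_i)\bigr)\,x_i[j]
```

In matrix form the gradient is X^T (Xw - y) / n, which is what `t(X) %*% error / length(y)` computes in the training loop.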

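Before running the training loop, we need to create empty containers in which the error (cost) values and the weights (regression coefficients) will be stored.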


Error_Surface = c()
Weight_Value = list()


for (i in 1:300) {
  # X is a (300, 2) matrix
  # Weights is a (2, 1) matrix
  # X %*% Weights is a (300, 1) matrix; subtracting y gives the residual of each observation
  error = (X %*% Weights - y)
  # Gradient of the loss with respect to the weights
  Delta_function = t(X) %*% error / length(y)
  # Update the weights
  Weights = Weights - alpha * Delta_function
  # Record the error and the weights after this update
  Error_Surface[i] = Error(X, y, Weights)
  Weight_Value[[i]] = Weights
}
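
A common sanity check for a hand-written gradient is to compare it with a finite-difference approximation of the loss. The sketch below does this for the gradient expression used in the loop; the seed, the test point `w`, and the step size `h` are arbitrary choices made for illustration.

```r
set.seed(1)  # arbitrary seed for this sketch
x <- runif(300, -10, 10)
y <- x + rnorm(300, mean = 0, sd = 3)
X <- cbind(1, x)

# Same loss as the Error function defined in this post
Error <- function(x, y, Weight) sum((y - x %*% Weight)^2) / (2 * length(y))

w <- matrix(c(0.3, -0.7), nrow = 2)  # arbitrary point at which to test the gradient

# Analytic gradient, as used inside the training loop
analytic <- t(X) %*% (X %*% w - y) / length(y)

# Central finite differences, one coordinate at a time
h <- 1e-6
numeric_grad <- sapply(1:2, function(j) {
  e <- matrix(0, nrow = 2)
  e[j] <- h
  (Error(X, y, w + e) - Error(X, y, w - e)) / (2 * h)
})

max(abs(as.vector(analytic) - numeric_grad))  # should be close to zero
```

If the two gradients disagreed noticeably, the sign or the scaling in the update step would be the first thing to check.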

7. Visualization


p = ggplot(DF,aes(x = x, y = y)) +
  geom_point(col = 'royalblue', alpha = 0.4)+
  theme_bw()
for( i in 1:300){
  p = p +
    geom_abline(slope = Weight_Value[[i]][2], intercept = Weight_Value[[i]][1], col = 'red', alpha = 0.4)
}
p

As the plot shows, the weights (regression coefficients) shift a little at each iteration as the line moves toward the best-fitting regression line.


DF$num = 1:300
DF$Error_value = Error_Surface
ggplot(DF) +
  geom_line(aes(x = num , y = Error_value),group = 1) +
  geom_point(aes(x = num, y = Error_value )) +
  theme_bw() +
  ggtitle("Error Function") + xlab("Num of iterations")

The error value also decreases with each iteration, confirming that the weights (slope) converge toward their optimal values.

8. Comparison with linear regression fitted by least squares

Ordinary linear regression (least squares)


REG = lm(y ~ x )
summary(REG)


Call:
lm(formula = y ~ x)
Residuals:
    Min      1Q  Median      3Q     Max 
-8.7002 -1.9927  0.0004  1.8941  8.1390 
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   0.4661     0.1707    2.73   0.0067 ** 
x             1.0014     0.0301   33.27   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.957 on 298 degrees of freedom
Multiple R-squared:  0.7879,    Adjusted R-squared:  0.7872 
F-statistic:  1107 on 1 and 298 DF,  p-value: < 2.2e-16

When the regression equation is estimated by ordinary least squares, $\beta_0$ is 0.4661 and $\beta_1$ is 1.0014.
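
The least-squares estimate can also be obtained from the same design matrix used for gradient descent by solving the normal equations (X'X) beta = X'y. The sketch below assumes freshly generated data with an arbitrary seed, so its numbers will differ from the `summary(REG)` output above; it only illustrates that the closed-form solution and `lm` agree.

```r
set.seed(1)  # arbitrary seed for this sketch
x <- runif(300, -10, 10)
y <- x + rnorm(300, mean = 0, sd = 3)
X <- cbind(1, x)  # design matrix with an intercept column

# Solve the normal equations (X'X) beta = X'y directly
beta_normal <- solve(t(X) %*% X, t(X) %*% y)

# Fit the same model with lm for comparison
beta_lm <- coef(lm(y ~ x))

cbind(normal_equations = as.vector(beta_normal), lm = as.vector(beta_lm))
```

This closed-form solution is the point that gradient descent approaches as the number of iterations grows, provided the learning rate is small enough.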

Gradient descent


Weight_Value[[300]]
           [,1]
[1,] -0.3328329
[2,]  1.0313280

When the regression equation is estimated by gradient descent, $\beta_0$ is -0.3328 and $\beta_1$ is 1.0313. Computing $R^2$ for this fit gives the following.


GR_MODEL = -0.3328329  + 1.0313280 * x
actual = y
rss <- sum((GR_MODEL - actual) ^ 2)
tss <- sum((actual - mean(actual)) ^ 2)
rsq <- 1 - rss/tss
rsq


[1] 0.7716216
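
The calculation above types the coefficients in by hand. As a sketch of an alternative, $R^2$ can be computed directly from a weight vector, which makes it easy to compare gradient descent and least squares on the same data. The helper `r_squared` and the seed are assumptions for illustration, and the data are regenerated, so the values will not match the numbers printed above.

```r
set.seed(1)  # arbitrary seed for this sketch
x <- runif(300, -10, 10)
y <- x + rnorm(300, mean = 0, sd = 3)
X <- cbind(1, x)

# Hypothetical helper: R^2 for any (2 x 1) weight vector
r_squared <- function(w) {
  rss <- sum((y - X %*% w)^2)   # residual sum of squares
  tss <- sum((y - mean(y))^2)   # total sum of squares
  1 - rss / tss
}

# Batch gradient descent with the same settings as this post
w <- matrix(c(0, 0), nrow = 2)
for (i in 1:300) {
  w <- w - 0.01 * t(X) %*% (X %*% w - y) / length(y)
}

rsq_gd  <- r_squared(w)
rsq_ols <- r_squared(matrix(coef(lm(y ~ x)), nrow = 2))
c(gradient_descent = rsq_gd, least_squares = rsq_ols)
```

Least squares minimizes the residual sum of squares, so on the same data the gradient descent $R^2$ can approach the least-squares value but cannot exceed it.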
