Posts tagged 'linear regression'

linear regression

multi-variable linear regression(with matrix multiplication) + tensorflow file load 2018.12.18
Linear Regression in Practice (Placeholder) 2018.12.18
Linear Regression with TensorFlow (PyCharm environment) 2018.12.18

multi-variable linear regression(with matrix multiplication) + tensorflow file load

2018. 12. 18. 15:01

*Unlike the single-variable linear regression covered earlier, multi-variable linear regression takes several x values (features) as input.

1) one-variable regression Hypothesis

H(x) = Wx + b

2) two-variable regression Hypothesis

H(x1, x2) = w1x1 + w2x2 + b

3) multi-variable regression Hypothesis

H(x1, x2, ..., xn) = w1x1 + w2x2 + w3x3 + ... + wnxn + b

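Writing out every term like this does not scale, so the hypothesis is rewritten as a matrix multiplication (shown here for n = 3):

H(x1, x2, x3) = [w1 w2 w3] [x1 x2 x3]^T + b    (row vector W times column vector X)

H(x1, x2, x3) = [x1 x2 x3] [w1 w2 w3]^T + b    (row vector X times column vector W)

In matrix notation these two forms are, respectively:

H(X) = WX + b

H(X) = XW + b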
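
To make the equivalence concrete, here is a minimal sketch in plain Python (no TensorFlow) that checks that the written-out sum w1x1 + w2x2 + w3x3 + b and the matrix form XW + b give the same numbers. The function names and all sample values are made up for illustration.

```python
# Illustrative values only: 2 samples with 3 features each.
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
w = [0.5, -1.0, 2.0]   # hypothetical weights w1, w2, w3
b = 0.25               # hypothetical bias


def hypothesis_expanded(x, w, b):
    """Written-out form: w1*x1 + w2*x2 + w3*x3 + b for one sample."""
    return w[0] * x[0] + w[1] * x[1] + w[2] * x[2] + b


def hypothesis_matrix(X, w, b):
    """Matrix form H(X) = XW + b: one dot product per row of X."""
    return [sum(x_j * w_j for x_j, w_j in zip(row, w)) + b for row in X]


expanded = [hypothesis_expanded(row, w, b) for row in X]
matrix = hypothesis_matrix(X, w, b)
print(expanded)
print(matrix)
```

The matrix version does not change when the number of features changes, whereas the written-out version has to be edited term by term.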

The b term can be removed to get a simplified form: move b inside the weight vector and add a constant 1 to the inputs.

H(x1, x2, x3) = [b w1 w2 w3] [1 x1 x2 x3]^T

H(x1, x2, x3) = [1 x1 x2 x3] [b w1 w2 w3]^T

H(X) = WX

H(X) = XW
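
As a rough check of this simplification, the sketch below (plain Python, made-up numbers) compares the hypothesis with an explicit b against the version where b is placed inside the weight vector and a constant 1 is prepended to the input. The names `x_aug` and `w_aug` are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(a * c for a, c in zip(u, v))


x = [2.0, 3.0, 4.0]    # illustrative input x1, x2, x3
w = [0.5, -1.0, 2.0]   # illustrative weights w1, w2, w3
b = 0.25               # illustrative bias

with_bias = dot(x, w) + b   # H(X) = XW + b

x_aug = [1.0] + x           # [1, x1, x2, x3]
w_aug = [b] + w             # [b, w1, w2, w3]
folded = dot(x_aug, w_aug)  # H(X) = XW, with no separate b

print(with_bias, folded)
```

The constant 1 is what multiplies b, so without it the folded form would compute a different value.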

The same hypothesis can also be written with a transpose:

w = [w1 w2 w3]^T (column vector), x = [x1 x2 x3]^T (column vector)

H(X) = W^T X + b

*Cost function for multi-variable linear regression:

The cost function is the same as in the earlier linear regression, and it is minimized with the gradient descent algorithm.

cost(W, b) = (1/m) * sum over i = 1..m of (H(x_i) - y_i)^2
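
The cost formula can be sketched in plain Python as follows. The data is the two-variable data used in the TensorFlow examples of this post, arranged with one sample per row; the function name `cost` and the two weight settings are chosen only for illustration.

```python
# One row per sample: [x1, x2], taken from x1_data and x2_data in this post.
X = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 4.0], [5.0, 0.0]]
y = [1.0, 2.0, 3.0, 4.0, 5.0]


def cost(X, y, w, b):
    """Mean of the squared differences between H(x) = x.w + b and y."""
    m = len(X)
    total = 0.0
    for row, target in zip(X, y):
        prediction = sum(x_j * w_j for x_j, w_j in zip(row, w)) + b
        total += (prediction - target) ** 2
    return total / m


print(cost(X, y, [0.0, 0.0], 0.0))  # all-zero weights: every prediction is 0
print(cost(X, y, [1.0, 1.0], 0.0))  # w1 = w2 = 1, b = 0 fits this data exactly
```

With all-zero weights the cost is the mean of y squared (11.0 here), and with w1 = w2 = 1, b = 0 it drops to 0.0, which is the point gradient descent moves toward.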

*Multi-variable linear regression: hands-on implementation:

1) Two x variables (inefficient approach, not in matrix form):

import tensorflow as tf
 
x1_data = [1.,0.,3.,0.,5.]
x2_data = [0.,2.,0.,4.,0.]
y_data = [1,2,3,4,5]
 
W1 = tf.Variable(tf.random_uniform([1], -1, 1))
W2 = tf.Variable(tf.random_uniform([1], -1, 1))
b = tf.Variable(tf.random_uniform([1], -1, 1))
 
hypothesis = W1*x1_data + W2*x2_data + b
cost = tf.reduce_mean(tf.square(hypothesis - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(cost)
 
sess = tf.Session()
sess.run(tf.global_variables_initializer())
 
for step in range(2001) :
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W1), sess.run(W2), sess.run(b))
 


2) Two x variables (efficient approach, matrix form):

import tensorflow as tf
 
x_data = [[1.,0.,3.,0.,5.],[0.,2.,0.,4.,0.]]
y_data = [1,2,3,4,5]
 
W = tf.Variable(tf.random_uniform([1,2], -1, 1))
b = tf.Variable(tf.random_uniform([1], -1, 1))
 
hypothesis = tf.matmul(W, x_data) + b  # H(X) = WX + b
cost = tf.reduce_mean(tf.square(hypothesis - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(cost)
 
sess = tf.Session()
sess.run(tf.global_variables_initializer())
 
for step in range(2001) :
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W), sess.run(b))
 


Comparison of 1) and 2)

Dropping the b term and changing the weight matrix to shape [1,3]:

import tensorflow as tf
 
x_data = [[1., 1., 1., 1., 1.],   # row of ones: its weight takes over the role of b
          [1., 0., 3., 0., 5.],
          [0., 2., 0., 4., 0.]]
y_data = [1,2,3,4,5]
 
W = tf.Variable(tf.random_uniform([1,3], -1, 1))
 
 
hypothesis = tf.matmul(W, x_data)  # H(X) = WX
cost = tf.reduce_mean(tf.square(hypothesis - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(cost)
 
sess = tf.Session()
sess.run(tf.global_variables_initializer())
 
for step in range(2001) :
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W))
 


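*Loading data from a file and training on it:

=> Add the data to the PyCharm project.

data.zip

Place it in the project folder; the code below reads the file from ./data/.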

import tensorflow as tf
import numpy as np
 
xy = np.loadtxt('./data/03train.txt', dtype='float32')
print(xy)
 
x_data = xy[:, 0 : -1]
y_data = xy[:, [-1]]
print(x_data.shape, y_data.shape)
 
W = tf.Variable(tf.random_uniform([3,1], -1., 1.)) # 3 = number of columns in x_data
hypothesis = tf.matmul(x_data, W)
cost = tf.reduce_mean(tf.square(hypothesis - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(cost)
 
sess =tf.Session()
sess.run(tf.global_variables_initializer())
 
for step in range(2001) :
    sess.run(train)
    if step % 20 == 0:  # print once every 20 steps
        print(step, sess.run(cost), sess.run(W))

*Reading a CSV file and printing it (the delimiter is a comma):

import tensorflow as tf
import numpy as np
 
xy = np.loadtxt('./data/test-score.csv', delimiter=',', dtype='float32')
print(xy)
 
x_data = xy[:, 0 : -1]
y_data = xy[:, [-1]]
print(x_data.shape, y_data.shape)
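
The two slices are the key step: every column except the last becomes the input, and the last column becomes the target. Below is a minimal standard-library sketch of the same split, using Python's csv module in place of np.loadtxt. The CSV text is a made-up stand-in, since the contents of test-score.csv are not shown in this post.

```python
import csv
import io

# Made-up stand-in for a CSV file: three feature columns, then the label.
csv_text = """73,80,75,152
93,88,93,185
89,91,90,180
"""

rows = [[float(value) for value in record]
        for record in csv.reader(io.StringIO(csv_text))]

x_data = [row[:-1] for row in rows]    # like xy[:, 0:-1]: all but the last column
y_data = [[row[-1]] for row in rows]   # like xy[:, [-1]]: last column, kept as a column

print(len(x_data), len(x_data[0]))     # number of rows, number of feature columns
print(len(y_data), len(y_data[0]))     # number of rows, 1
```

Indexing with [-1] inside a list (rather than a bare -1) is what keeps y_data two-dimensional, so its shape lines up with the output of tf.matmul(x_data, W).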
 

*File input linear regression :

import tensorflow as tf
import numpy as np
 
tf.set_random_seed(777)  
 
xy = np.loadtxt('./data/test-score.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
 
print(x_data.shape, x_data, len(x_data))
print(y_data.shape, y_data)
 
X = tf.placeholder(tf.float32, shape=[None, 3])
Y = tf.placeholder(tf.float32, shape=[None, 1])
 
W = tf.Variable(tf.random_normal([3, 1]))
b = tf.Variable(tf.random_normal([1]))
 
hypothesis = tf.matmul(X, W) + b
 
cost = tf.reduce_mean(tf.square(hypothesis - Y))
 
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
 
sess = tf.Session()
sess.run(tf.global_variables_initializer())
 
for step in range(2001):
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data})
    if step % 10 == 0:   # print once every 10 steps
        print(step, "Cost: ", cost_val, sess.run(W), sess.run(b))
 
print("=====prediction=====")
print(sess.run(hypothesis, feed_dict={X: [[100, 70, 101]]}))
print(sess.run(hypothesis, feed_dict={X: [[60, 70, 110], [90, 100, 80]]}))
 
 
 

Linear Regression in Practice (Placeholder)

2018. 12. 18. 12:11

import tensorflow as tf
x_data = [1., 2., 3.]
y_data = [1., 2., 3.]
w = tf.Variable(tf.random_uniform([1], -1., 1.)) # random initial value: one number between -1 and 1
b = tf.Variable(tf.random_uniform([1], -1., 1.)) # random initial value: one number between -1 and 1
hypothesis = w * x_data + b
cost = tf.reduce_mean(tf.square(hypothesis - y_data)) # compute the cost (mean squared error)
optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(cost)  # op that runs one gradient descent step to reduce the cost
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for step in range(2001):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(w), sess.run(b))
 


The code above hard-codes its data, so we use placeholders to make the same graph reusable with any input:

import tensorflow as tf
 
x_data = [1., 2., 3.]
y_data = [1., 2., 3.]
 
w = tf.Variable(tf.random_uniform([1], -1., 1.)) # random initial value: one number between -1 and 1
b = tf.Variable(tf.random_uniform([1], -1., 1.)) # random initial value: one number between -1 and 1

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

hypothesis = w * X + b
cost = tf.reduce_mean(tf.square(hypothesis - Y)) # compute the cost (mean squared error)

optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(cost)  # op that runs one gradient descent step to reduce the cost
 
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
 
for step in range(2001):
    sess.run(train, feed_dict={X:x_data, Y:y_data})
    if step % 20 == 0:
        print(step, sess.run(cost, feed_dict={X:x_data, Y:y_data}), sess.run(w), sess.run(b))

Comparing the two versions:

To evaluate the hypothesis in the code above, only X (x_data) is needed (Y is not required).

Add the following line at the very end of the code:

print(sess.run(hypothesis, feed_dict={X:5}))

Full code:

import tensorflow as tf
 
x_data = [1., 2., 3.]
y_data = [1., 2., 3.]
 
w = tf.Variable(tf.random_uniform([1], -1., 1.)) # random initial value: one number between -1 and 1
b = tf.Variable(tf.random_uniform([1], -1., 1.)) # random initial value: one number between -1 and 1

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

hypothesis = w * X + b
cost = tf.reduce_mean(tf.square(hypothesis - Y)) # compute the cost (mean squared error)

optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(cost)  # op that runs one gradient descent step to reduce the cost
 
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
 
for step in range(2001):
    sess.run(train, feed_dict={X:x_data, Y:y_data})
    if step % 20 == 0:
        print(step, sess.run(cost, feed_dict={X:x_data, Y:y_data}), sess.run(w), sess.run(b))
 
print(sess.run(hypothesis, feed_dict={X:5}))


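*How to minimize the cost:

H(x) = Wx + b

Wx is almost the same as H(x) ( = the predicted value)

cost(W,b) = a function of W and b

cost(W) = b is dropped as a minor term, so the cost becomes a function of W alone

cost(W) = (1/m) * sum over i = 1..m of (W*x_i - y_i)^2

To find the values of W and b at which the cost is smallest,

we use the gradient descent algorithm.

The parameters are adjusted step by step until a local minimum is reached.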
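
Before looking at gradient descent itself, it can help to evaluate cost(W) on a grid of W values and see where it is smallest. The sketch below does this in plain Python for the data X = [1, 2, 3], Y = [1, 2, 3] and the same range of W (from -3.0 up to 4.9 in steps of 0.1) used in this post; the function name `cost` is chosen for illustration.

```python
X = [1, 2, 3]
Y = [1, 2, 3]


def cost(W):
    """cost(W) = (1/m) * sum of (W*x - y)^2 for the simplified hypothesis H(x) = W*x."""
    return sum((W * x - y) ** 2 for x, y in zip(X, Y)) / len(X)


W_vals = [i * 0.1 for i in range(-30, 50)]   # -3.0, -2.9, ..., 4.9
cost_vals = [cost(W) for W in W_vals]

best_cost, best_W = min(zip(cost_vals, W_vals))
print(best_W, best_cost)   # prints: 1.0 0.0
```

The cost is a parabola in W with its bottom at W = 1, which is the point gradient descent walks toward by following the slope downhill.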

import tensorflow as tf
import matplotlib.pyplot as plt
tf.set_random_seed(777)  # for reproducibility: a fixed seed makes the random values repeatable

X = [1, 2, 3]
Y = [1, 2, 3]
 
W = tf.placeholder(tf.float32)
 
# Our hypothesis for linear model X * W
hypothesis = X * W
 
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
 
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
 
# Variables for plotting cost function
W_vals = []
cost_vals = []
 
for i in range(-30, 50):
    curr_W = i * 0.1 # step W by 0.1 so the curve is sampled finely
    curr_cost = sess.run(cost, feed_dict={W: curr_W})
    W_vals.append(curr_W)
    cost_vals.append(curr_cost)
 
# Show the cost function
plt.plot(W_vals, cost_vals) # x-axis = W_vals, y-axis = cost_vals
plt.show() # display the plot
 
import tensorflow as tf
tf.set_random_seed(777)  # for reproducibility
 
x_data = [1, 2, 3]
y_data = [1, 2, 3]
 
# Try to find values for W and b to compute y_data = W * x_data + b
# We know that W should be 1 and b should be 0
# But let's use TensorFlow to figure it out
W = tf.Variable(tf.random_normal([1]), name='weight')
 
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
 
# Our hypothesis for linear model X * W
hypothesis = X * W # Simplified Hypothesis
 
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
 
# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
# Instead of GradientDescentOptimizer, apply the update formula by hand:
learning_rate = 0.1
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)
 
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
 
for step in range(21):
    sess.run(update, feed_dict={X: x_data, Y: y_data})
    print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W))
 
'''
0 5.81756 [ 1.64462376]
1 1.65477 [ 1.34379935]
2 0.470691 [ 1.18335962]
3 0.133885 [ 1.09779179]
4 0.0380829 [ 1.05215561]
5 0.0108324 [ 1.0278163]
6 0.00308123 [ 1.01483536]
7 0.000876432 [ 1.00791216]
8 0.00024929 [ 1.00421977]
9 7.09082e-05 [ 1.00225055]
10 2.01716e-05 [ 1.00120032]
11 5.73716e-06 [ 1.00064015]
12 1.6319e-06 [ 1.00034142]
13 4.63772e-07 [ 1.00018203]
14 1.31825e-07 [ 1.00009704]
15 3.74738e-08 [ 1.00005174]
16 1.05966e-08 [ 1.00002754]
17 2.99947e-09 [ 1.00001466]
18 8.66635e-10 [ 1.00000787]
19 2.40746e-10 [ 1.00000417]
20 7.02158e-11 [ 1.00000226]
'''
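
The logged values shrink toward W = 1 at a steady rate, and a short calculation shows why. Because y = x in this data, (W*x - y)*x = (W - 1)*x^2, so the gradient is (W - 1)*mean(x^2) and each update multiplies the distance W - 1 by 1 - learning_rate*mean(x^2) = 1 - 0.1*(14/3), about 0.5333. The log agrees: (1.34379935 - 1) / (1.64462376 - 1) is about 0.5333. The plain-Python sketch below repeats the same update rule; the starting value 5.0 and the names `update` and `shrink` are arbitrary choices for illustration.

```python
x_data = [1.0, 2.0, 3.0]
y_data = [1.0, 2.0, 3.0]
learning_rate = 0.1


def update(W):
    """One manual step: W - learning_rate * mean((W*x - y) * x)."""
    m = len(x_data)
    gradient = sum((W * x - y) * x for x, y in zip(x_data, y_data)) / m
    return W - learning_rate * gradient


mean_x_squared = sum(x * x for x in x_data) / len(x_data)   # 14/3
shrink = 1.0 - learning_rate * mean_x_squared               # about 0.5333

W = 5.0   # arbitrary starting value (the post starts from a random one)
for step in range(5):
    new_W = update(W)
    # The ratio of successive distances to 1 stays at the shrink factor.
    print(step, new_W, (new_W - 1.0) / (W - 1.0))
    W = new_W
```

The same factor also explains why a learning rate that is too large fails here: once learning_rate*mean(x^2) exceeds 2, the factor is below -1 and the distance to W = 1 grows at every step.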
 

Linear Regression with TensorFlow (PyCharm environment)

2018. 12. 18. 10:51

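*The general formula for a straight line (the linear hypothesis):

y = wx + b

H(x) = wx + b

In this formula, w and b are the unknowns.

The goal is to find w and b.

w = weight (the slope)

b = bias (the intercept)

Given the data points (x, y), (x1, y1), (x2, y2),

the hypothesis predicts (x, H(x)), (x1, H(x1)), (x2, H(x2)).

Cost = (1/N) * sum of (H(x) - y)^2 over the N data points

The cost is a quadratic function of w.

The smaller the cost, the better (a small cost means a small loss).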

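*The three basic ingredients of linear regression:

- Hypothesis

- Cost definition (the mean of the squared differences between the predictions and the targets)

- An algorithm for minimizing the cost (gradient descent, ...)

*Choosing the best hypothesis:
=> Find the line that minimizes the error H(x) - y.

- Cost = the mean of the squared differences between the hypothesis's predictions and the target values

(tf.reduce_mean(tf.square(hypothesis - y_data)))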
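
As a small illustration of "the best hypothesis is the one with the lowest cost", the plain-Python sketch below scores a few candidate lines against the data x = [1, 2, 3], y = [1, 2, 3] used in this post. The candidate (w, b) pairs are made up, and `cost` is a hypothetical helper that mirrors the tf.reduce_mean(tf.square(...)) expression.

```python
x_data = [1.0, 2.0, 3.0]
y_data = [1.0, 2.0, 3.0]


def cost(w, b):
    """Mean of the squared differences between H(x) = w*x + b and y."""
    predictions = [w * x + b for x in x_data]
    return sum((p - y) ** 2 for p, y in zip(predictions, y_data)) / len(x_data)


# Hypothetical candidate lines as (w, b) pairs.
candidates = [(0.5, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 1.0)]
for w, b in candidates:
    print(w, b, cost(w, b))

best = min(candidates, key=lambda wb: cost(*wb))
print("best (w, b):", best)
```

Trying candidates by hand only works for a toy case like this; gradient descent replaces the guessing by repeatedly nudging w and b in the direction that lowers the cost.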

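*Linear Regression Example (with PyCharm):

1) Install TensorFlow in the PyCharm virtual environment:

(In the Terminal, run 'pip install tensorflow'. The code below uses the TensorFlow 1.x API, such as tf.Session.)

2) Create a Python file in PyCharm:

Run it with Shift + F10, or right-click and choose Run linear_Regression.

3) Train with TensorFlow:

Source code: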

import tensorflow as tf
 
x_data = [1., 2., 3.]
y_data = [1., 2., 3.]
 
 
w = tf.Variable(tf.random_uniform([1], -1., 1.)) # random initial value: one number between -1 and 1
b = tf.Variable(tf.random_uniform([1], -1., 1.)) # random initial value: one number between -1 and 1

hypothesis = w * x_data + b
cost = tf.reduce_mean(tf.square(hypothesis - y_data)) # compute the cost (mean squared error)

optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(cost)  # op that runs one gradient descent step to reduce the cost
 
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
 
for step in range(2001):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(w), sess.run(b))


Penguin's Repository