cs231n & eecs498 Regularization and Optimization Lecture Notes Summary

2024-02-01, 2 min read

0. Last Time : Loss functions quantify preferences

We studied two classification loss functions.
However, a model can score well on the training data and still do poorly on the test data.
This lecture covers how to use the training data to obtain a model that also performs well on test data.

1. Overfitting

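The phenomenon described above is called overfitting.
All three of the cases shown can be regarded as overfit.
A lower training loss does not by itself mean a better model; a very low training loss can be a sign that the model is close to overfitting.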

2. Regularization : Beyond Training Error

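The loss is split into two parts.
Data loss: the loss actually computed from the training data during training.
Regularization: a term added to the data loss to prevent overfitting.
Common choices are L2 and L1 regularization.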
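
To make the two parts concrete, here is a minimal sketch that adds a weighted penalty to a data loss. The names `regularized_loss`, `reg_strength`, and `kind` are hypothetical and chosen only for this example. It assumes L2 is the sum of squared weights and L1 is the sum of absolute weights.

```python
import numpy as np


def regularized_loss(data_loss, W, reg_strength=0.1, kind="l2"):
    """Return data_loss + reg_strength * R(W) (hypothetical helper)."""
    if kind == "l2":
        penalty = np.sum(W * W)      # L2: sum of squared weights
    elif kind == "l1":
        penalty = np.sum(np.abs(W))  # L1: sum of absolute weights
    else:
        raise ValueError("kind must be 'l1' or 'l2'")
    return data_loss + reg_strength * penalty


# Toy weights and a made-up data loss of 1.5, for illustration only.
W = np.array([[1.0, -2.0], [0.5, 0.0]])
l2_total = regularized_loss(1.5, W, reg_strength=0.1, kind="l2")
l1_total = regularized_loss(1.5, W, reg_strength=0.1, kind="l1")
print(l2_total, l1_total)
```

A larger `reg_strength` penalizes large weights more heavily, trading some training loss for a simpler model.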

3. Finding a good W

The loss function consists of the data loss and the regularization term.
To reach the lowest loss, we therefore have to optimize W.

3.1 Random search

The first approach is to set every value of W at random and keep whichever guess gives the lowest loss.
This is a very poor method.
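
A minimal sketch of random search, assuming a toy quadratic loss in place of the real classifier loss. `toy_loss`, `random_search`, and their parameters are hypothetical names used only for this illustration.

```python
import numpy as np


def toy_loss(W):
    """Hypothetical stand-in for the real loss: minimized at W = 3."""
    return float(np.sum((W - 3.0) ** 2))


def random_search(loss_fn, shape, num_trials=1000, seed=0):
    """Try random W values and keep the one with the lowest loss."""
    rng = np.random.default_rng(seed)
    best_W, best_loss = None, float("inf")
    for _ in range(num_trials):
        W = rng.normal(scale=5.0, size=shape)  # guess W at random
        loss = loss_fn(W)
        if loss < best_loss:                   # remember the best guess so far
            best_W, best_loss = W, loss
    return best_W, best_loss


best_W, best_loss = random_search(toy_loss, shape=(4,))
print(best_loss)
```

Each guess ignores everything learned from earlier guesses, which is why the method scales so badly as the number of weights grows.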

3.2 Follow the slope

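Use differentiation to move W step by step toward a minimum of the loss.
Because W has many dimensions, this uses partial derivatives: the gradient is the vector of partial derivatives of the loss with respect to each entry of W.
Each step should go in the steepest downhill direction, which is the negative gradient.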
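
The idea of following the slope can be sketched as plain gradient descent. This is an illustration on a toy quadratic loss, not the lecture's classifier; `toy_loss_and_grad`, `gradient_descent`, `learning_rate`, and `num_steps` are hypothetical names and values.

```python
import numpy as np


def toy_loss_and_grad(W):
    """Hypothetical loss sum((W - 3)^2) and its gradient 2 * (W - 3)."""
    diff = W - 3.0
    return float(np.sum(diff ** 2)), 2.0 * diff


def gradient_descent(W, learning_rate=0.1, num_steps=100):
    """Repeatedly step against the gradient (the steepest-descent direction)."""
    W = W.astype(float).copy()
    for _ in range(num_steps):
        _, grad = toy_loss_and_grad(W)
        W -= learning_rate * grad  # move downhill
    return W


W_final = gradient_descent(np.zeros(4))
final_loss, _ = toy_loss_and_grad(W_final)
print(W_final, final_loss)
```

On this toy loss every entry of W moves toward 3, where the loss is zero.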

3.2.1 Numeric Gradient

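Perturb the current W by a small step h and measure how the loss changes; as h approaches 0, the difference quotient approaches the derivative.
It is slow and only approximate, but easy to implement.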
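
A small one-dimensional sketch of why the numeric value is only approximate: the difference quotient is compared with a derivative that is known exactly. `forward_difference` and `square` are hypothetical helpers for this example.

```python
def forward_difference(f, x, h):
    """Approximate f'(x) with the difference quotient (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def square(x):
    return x * x  # exact derivative is 2 * x


exact = 2 * 3.0
errors = [abs(forward_difference(square, 3.0, h) - exact) for h in (1e-1, 1e-3, 1e-5)]
print(errors)  # the error shrinks as h gets smaller
```

The slowness comes from needing at least one extra loss evaluation per entry of W.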

3.2.2 Analytic gradient

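It is fast and exact, but error-prone: the derivation and its implementation are easy to get wrong.
Even so, the analytic gradient is what is used in most cases.
The numeric gradient is used alongside it to check the analytic result (a gradient check).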

import numpy as np


def eval_numerical_gradient(f, x, verbose=True, h=0.00001):
    """
    A naive implementation of the numerical gradient of f at x.
    - f should be a function that takes a single argument
    - x is the point (a float numpy array) to evaluate the gradient at;
      it is modified in place during evaluation and restored afterwards
    """
    fx = f(x)  # evaluate function value at original point
    grad = np.zeros_like(x)
    # iterate over all indexes in x
    it = np.nditer(x, flags=["multi_index"], op_flags=["readwrite"])
    
    while not it.finished:
        # evaluate function at x+h
        ix = it.multi_index
        oldval = x[ix]
        x[ix] = oldval + h  # increment by h
        fxph = f(x)  # evaluate f(x + h)
        x[ix] = oldval - h
        fxmh = f(x)  # evaluate f(x - h)
        x[ix] = oldval  # restore

        # compute the partial derivative with centered formula
        grad[ix] = (fxph - fxmh) / (2 * h)  # the slope
        if verbose:
            print(ix, grad[ix])
        it.iternext()  # step to next dimension

    return grad
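
A gradient check along these lines might look like the sketch below. It uses a compact centered-difference helper so that the snippet runs on its own. The toy `loss`, `analytic_grad`, and `numeric_grad` are hypothetical and stand in for a real model's loss and hand-derived gradient.

```python
import numpy as np


def loss(W):
    """Hypothetical loss used only for this check: sum of squares."""
    return float(np.sum(W ** 2))


def analytic_grad(W):
    """Hand-derived gradient of the loss above."""
    return 2.0 * W


def numeric_grad(f, x, h=1e-5):
    """Compact centered-difference gradient of f at x (x is a float array)."""
    grad = np.zeros_like(x)
    for ix in np.ndindex(x.shape):
        oldval = x[ix]
        x[ix] = oldval + h
        fxph = f(x)      # f(x + h)
        x[ix] = oldval - h
        fxmh = f(x)      # f(x - h)
        x[ix] = oldval   # restore
        grad[ix] = (fxph - fxmh) / (2 * h)
    return grad


W = np.array([[0.5, -1.0], [2.0, 3.0]])
num = numeric_grad(loss, W)
ana = analytic_grad(W)
# Relative error: a small value suggests the analytic gradient is correct.
rel_error = np.max(np.abs(num - ana) / np.maximum(1e-8, np.abs(num) + np.abs(ana)))
print(rel_error)
```

If the relative error is large, the usual suspect is the analytic derivation or its implementation, since the numeric version is the simpler of the two.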